-
Notifications
You must be signed in to change notification settings - Fork 48
Convert negative frame indices in C++, convert slices in Python #746
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
test/test_decoders.py
Outdated
IndexError, | ||
match="Stop index \\(-\\d+\\) must not be less than the start index", | ||
): | ||
decoder.get_frames_in_range(start=0, stop=-1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I considered possible edge cases for stop
, I believe these tests should cover them.
We assert that start
is within 0 <= start < self._num_frames
, and that stop >= start
.
As a result, we do not need to explicitly check that stop
is within the valid range, since any higher value is reduced to self._num_frames
, and any lower value cannot be below the lowest start
value of 0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of precaution and to be pedantic, maybe also test when both start and stop are positive:
decoder.get_frames_in_range(start=10, stop=0)
so that stop
is fully independent of our negative index conversion logic
test/test_decoders.py
Outdated
with pytest.raises( | ||
RuntimeError, match=f"Invalid frame index={expected_converted_index}" | ||
): | ||
with pytest.raises(IndexError, match="Index -\\d+ is out of bounds"): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since pytest.raises accepts regex patterns, it is not necessary to calculate the index that will appear in the error.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR @Dan-Flores . Left a few comments below, this mostly looks good. There's a discussion point I'd like to dig into with you and @scotts , but this isn't blocking for this PR.
test/test_decoders.py
Outdated
@@ -772,6 +769,66 @@ def test_get_frames_in_range(self, stream_index, device, seek_mode): | |||
empty_frames.duration_seconds, NASA_VIDEO.empty_duration_seconds | |||
) | |||
|
|||
@pytest.mark.parametrize("device", cpu_and_cuda()) | |||
@pytest.mark.parametrize("stream_index", [3, None]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: I think we can remove the stream_index
parametrization, because what we're testing here is (or should be) completely orthogonal to the stream concept.
f"Index {index} is out of bounds; must be in the range [0, {self._num_frames})." | ||
) | ||
else: | ||
indices[i] = index |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In general, we should avoid modifying the user-provided input inplace, as this can be surprising and have unintended effects. Let's keep the list comprehension logic?
@@ -247,6 +252,8 @@ def get_frames_in_range(self, start: int, stop: int, step: int = 1) -> FrameBatc | |||
Returns: | |||
FrameBatch: The frames within the specified range. | |||
""" | |||
start = start if start >= 0 else start + self._num_frames | |||
stop = min(stop if stop >= 0 else stop + self._num_frames, self._num_frames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to note that eventually, we'll want to support stop=None
and make it the default. This is what we have in the audio decoder and that's useful to decode all the frames at once. We should also add it to the time-based API get_frames_played_in_range()
.
But we should do that separately.
test/test_decoders.py
Outdated
start=387, stop=390, stream_index=stream_index | ||
).to(device) | ||
frames387_389 = decoder.get_frames_in_range(start=387, stop=1000) | ||
print(f"{frames387_389.data.shape=}") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like a debugging left-over:
print(f"{frames387_389.data.shape=}") |
test/test_decoders.py
Outdated
ref_frames386_389 = NASA_VIDEO.get_frame_data_by_range( | ||
start=386, stop=390, stream_index=stream_index | ||
).to(device) | ||
frames386_389 = decoder.get_frames_in_range(start=-4, stop=1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's check for a negative stop
value as well. This should exclude the last frame so you'll have to exclude it from the ref above.
frames386_389 = decoder.get_frames_in_range(start=-4, stop=1000) | |
frames386_389 = decoder.get_frames_in_range(start=-4, stop=-1) |
test/test_decoders.py
Outdated
IndexError, | ||
match="Stop index \\(-\\d+\\) must not be less than the start index", | ||
): | ||
decoder.get_frames_in_range(start=0, stop=-1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Out of precaution and to be pedantic, maybe also test when both start and stop are positive:
decoder.get_frames_in_range(start=10, stop=0)
so that stop
is fully independent of our negative index conversion logic
indices = [ | ||
index if index >= 0 else index + self._num_frames for index in indices | ||
] | ||
for i, index in enumerate(indices): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I realize I'm sort of going back on the discussion from https://github.com/pytorch/torchcodec/pull/743/files#r2167398761, so I apologize for not bringing that up earlier. But I wonder if we actually want to do the validation and the conversion in Python. For performance reasons, it might be better to keep everything in C++, as the indices
list can potentially be quite long, especially when sampling a large number of frames for long videos. I suspect we can easily have len(indices) > Nx1000
, typically when using our samplers on long videos:
frames = decoder.get_frames_at(indices=all_clips_indices) |
As a somewhat related concern I think we'll eventually want to support indices
not just as a list, but also as a tensor, e.g. for users who create their own sampling strategy using torch
utilities. And it'd be great if we could avoid copying that input tensor before passing it to the underlying decoding ops, which we can only do if we run the validation/conversion in C++.
In any case, we can and should merge this PR as-is regardless of our conclusion, because we are already doing such validation in main
, and this PR doesn't add much on top of the existing one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed on both points - when I wrote the comment on #743, I was also wondering about performance, and figured let's go for consistency first, then deal with performance later. We can make the checking basically zero-cost on the C++ side if we do it as we iterate through the indices - but we'll want to not do a TORCH_CHECK()
, but throw a std::out_of_range
so it becomes the right thing on the Python side. We can do all of that on a follow-up PR.
d238cb0
to
d327b10
Compare
@@ -568,8 +574,6 @@ FrameBatchOutput SingleStreamDecoder::getFramesAtIndices( | |||
auto indexInOutput = indicesAreSorted ? f : argsort[f]; | |||
auto indexInVideo = frameIndices[indexInOutput]; | |||
|
|||
validateFrameIndex(streamMetadata, indexInVideo); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this call to validateFrameIndex
is not needed, since getFrameAtIndexInternal
will also call validateFrameIndex
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Yes, although I had to think about it. What gave me pause is the fact that we also use it in the if
branch, but if indexInVideo == previousIndexInVideo
then we must have checked it during the previous iteration so we're safe.
@@ -247,6 +236,8 @@ def get_frames_in_range(self, start: int, stop: int, step: int = 1) -> FrameBatc | |||
Returns: | |||
FrameBatch: The frames within the specified range. | |||
""" | |||
# Adjust start / stop indices to enable indexing semantics, ex. [-10, 1000] returns the last 10 frames | |||
start, stop, step = slice(start, stop, step).indices(self._num_frames) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is how slices are handled in VideoDecoder._getitem_slice
, using it here ensures consistent behavior.
with pytest.raises( | ||
IndexError, | ||
match="must be greater than or equal to 0, or be a valid negative index", | ||
): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The previous error check in get_frame_at
was removed from Python. Now, this error comes from validateFrameIndex
.
I removed the single index error check in Python for consistency between the get_frame(s)...
functions, but I am open to added them back if erroring earlier is preferred, please let me know if you have any preference @scotts @NicolasHug
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes sense to do the error checking once, on the C++ side. But we still have the error checking on the Python side for get_frames_in_range()
- is that now also redundant?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, I'll remove those checks since .indices()
will prevent most of them. It does not alter a negative step
, but we check and error on that case in SingleStreamDecoder::getFramesInRange
.
if (numFrames.has_value()) { | ||
// If the frameIndex is negative, we convert it to a positive index | ||
frameIndex = frameIndex >= 0 ? frameIndex : frameIndex + numFrames.value(); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If numFrames
does not have a value, then I think we need to assert that frameIndex
is positive, and throw a std::range_error
if it's not. Otherwise, we'll end up using a negative frameIndex
further down.
Ah, now I see that we actually do that in validateFrameIndex()
, which we call right after. 👍
"Invalid frame index=" + std::to_string(frameIndex) + | ||
" for streamIndex=" + std::to_string(streamMetadata.streamIndex) + | ||
"; must be greater than or equal to 0, or be a valid negative index"); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It could end up being wordy, but we might want to mention that we can't receive any negative index if we don't know the number of frames. Maybe something like:
throw std::out_of_range(
"Invalid frame index=" + std::to_string(frameIndex) +
" for streamIndex=" + std::to_string(streamMetadata.streamIndex) +
"; negative indices must have an absolute value that is less than the number of frames, "
"and the number of frames must be known.");
}
This PR adds the follow ups from #743:
getFrameAtIndexInternal
:getFrameAtIndex
,getFramesAtIndices
,getFramesInRange
,getFramesPlayedInRange
..indices()
function to convert negative or None indices inget_frames_in_range
.validateFrameIndex
are changed tostd::out_of_range
, to better reflect the IndexError.The relevant tests are updated as well:
test_get_frame_at_index
to test returning a negative index.test_get_frames_at_indices_negative_indices
to test returning a negative index.test_get_frames_at_indices_fail_on_invalid_negative_indices
to ensure negative indices fail when an invalid negative index is passed in.test_get_frames_in_range_slice_indices_syntax
to test negative indices, capping upper bound indices, and none as stop.