Use raft::copy in the inner loop of the host-side raft::gather #2464
+15
−10
Use `raft::copy` instead of an assignment-operator loop in the host-side gather operation. This should allow the gather to leverage the logic of `raft::copy` and use a fast memcpy when possible (especially for high-dimensional datasets).

As some very circumstantial evidence, this change is observed to reduce `raft::matrix::sample_rows(100000000, 128)` time from approximately 5 to 4 minutes (bigann-100M int8 dataset in RAM, 16-core machine).

Make the NVTX annotations a little bit more descriptive along the way.
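
For illustration, the idea behind the inner-loop change can be sketched in plain C++ as below. This is a minimal sketch and not the actual RAFT code: the function names `gather_rows_elementwise` and `gather_rows_bulk` are hypothetical, the data is assumed to be row-major, and `std::copy_n` is used as a stand-in for `raft::copy` so that the example builds without RAFT or CUDA.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical "before" version: gathers the selected rows of a row-major
// matrix by assigning one element at a time.
template <typename T, typename IdxT>
void gather_rows_elementwise(
  const T* src, std::size_t n_cols, const IdxT* indices, std::size_t n_out, T* dst)
{
  for (std::size_t i = 0; i < n_out; ++i) {
    const std::size_t row = static_cast<std::size_t>(indices[i]);
    for (std::size_t j = 0; j < n_cols; ++j) {
      dst[i * n_cols + j] = src[row * n_cols + j];
    }
  }
}

// Hypothetical "after" version: one bulk copy per gathered row.
// std::copy_n stands in for raft::copy here (an assumption made for this
// sketch); for trivially copyable types, common standard library
// implementations can lower it to a memmove-style copy.
template <typename T, typename IdxT>
void gather_rows_bulk(
  const T* src, std::size_t n_cols, const IdxT* indices, std::size_t n_out, T* dst)
{
  for (std::size_t i = 0; i < n_out; ++i) {
    const std::size_t row = static_cast<std::size_t>(indices[i]);
    std::copy_n(src + row * n_cols, n_cols, dst + i * n_cols);
  }
}
```

Both versions produce the same output; the bulk version simply hands each contiguous row to a copy routine, which matters more as the row length (the dataset dimensionality) grows.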