Use raft::copy in the inner loop of the host-side raft::gather #2464

Draft
wants to merge 3 commits into
base: branch-25.02

Conversation

achirkin
Contributor

@achirkin achirkin commented Oct 2, 2024

Use raft::copy instead of an element-by-element assignment loop in the host-side gather operation. This should let gather reuse the logic in raft::copy, which uses a fast memcpy when possible (especially helpful for high-dimensional datasets).

As anecdotal evidence, this change was observed to reduce raft::matrix::sample_rows(100000000, 128) time from approximately 5 minutes to 4 minutes (bigann-100M int8 dataset in RAM, 16-core machine).

Also make the NVTX annotations a bit more descriptive along the way.
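To illustrate the idea outside of RAFT, here is a minimal standalone sketch of the two inner-loop styles on a row-major host matrix. It is not the code in this PR: `gather_rows_elementwise` and `gather_rows_bulk` are hypothetical names, and `std::copy_n` stands in for raft::copy, on the assumption that a contiguous bulk copy of a trivially copyable type can be lowered to a memmove/memcpy by the standard library.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helpers for illustration only; not part of RAFT.
// `dataset` is a row-major matrix with `n_cols` columns,
// `indices` lists the rows to gather into the output.

// Old style: copy each row one element at a time.
template <typename T, typename IdxT>
std::vector<T> gather_rows_elementwise(const std::vector<T>& dataset,
                                       std::size_t n_cols,
                                       const std::vector<IdxT>& indices)
{
  std::vector<T> out(indices.size() * n_cols);
  for (std::size_t i = 0; i < indices.size(); i++) {
    const auto in_idx = static_cast<std::size_t>(indices[i]);
    assert((in_idx + 1) * n_cols <= dataset.size());  // row must be in range
    for (std::size_t k = 0; k < n_cols; k++) {
      out[i * n_cols + k] = dataset[in_idx * n_cols + k];
    }
  }
  return out;
}

// New style: copy each row as one contiguous block.
template <typename T, typename IdxT>
std::vector<T> gather_rows_bulk(const std::vector<T>& dataset,
                                std::size_t n_cols,
                                const std::vector<IdxT>& indices)
{
  std::vector<T> out(indices.size() * n_cols);
  for (std::size_t i = 0; i < indices.size(); i++) {
    const auto in_idx = static_cast<std::size_t>(indices[i]);
    assert((in_idx + 1) * n_cols <= dataset.size());  // row must be in range
    // One bulk copy per row; stand-in for the raft::copy call in this PR.
    std::copy_n(dataset.data() + in_idx * n_cols, n_cols, out.data() + i * n_cols);
  }
  return out;
}
```

Both helpers produce the same output; any speed difference depends on the row width, the element type, and the compiler, so it still needs to be measured.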

@achirkin achirkin added enhancement New feature or request 3 - Ready for Review non-breaking Non-breaking change labels Oct 2, 2024
@achirkin achirkin self-assigned this Oct 2, 2024
@achirkin achirkin requested a review from a team as a code owner October 2, 2024 09:07
@github-actions github-actions bot added the cpp label Oct 2, 2024
@achirkin achirkin added improvement Improvement / enhancement to an existing function and removed enhancement New feature or request labels Oct 2, 2024
@achirkin achirkin marked this pull request as draft October 8, 2024 07:28
@achirkin
Contributor Author

achirkin commented Oct 8, 2024

Converting this to a draft until we properly benchmark the change.

Contributor

@tarang-jain tarang-jain left a comment

Thanks for the change. LGTM! It would be nice to see a before/after performance comparison in your benchmarking results, though.

for (IdxT k = 0; k < buff.extent(1); k++) {
buff(i, k) = dataset(in_idx, k);
}
raft::copy(res,
Contributor

So the resource is actually unused since the src and dest are both in host mem? I wonder if we can create a resource on the fly and supply it to raft::copy() rather than adding the resource to gather_buff's signature.

@achirkin achirkin changed the base branch from branch-24.12 to branch-25.02 November 22, 2024 07:15
Labels
3 - Ready for Review cpp improvement Improvement / enhancement to an existing function non-breaking Non-breaking change
2 participants