Draft
173 commits
cce66e6
Add pq_len=8 instances
enp1s0 Aug 3, 2025
7d14a39
Merge branch 'branch-25.10' into cagra-q-pq_len-8
enp1s0 Aug 14, 2025
43e6145
Update CAGRA-Q test
enp1s0 Aug 14, 2025
16321bc
Update CAGRA-Q distance kernel
enp1s0 Sep 3, 2025
a4e5050
Merge branch 'branch-25.10' into cagra-q-pq_len-8
enp1s0 Sep 3, 2025
bfdc2d4
Add DatasetBlockDim check
enp1s0 Sep 3, 2025
23c02e1
Update VPQ compute distance kernel
enp1s0 Sep 3, 2025
5b3a832
Merge branch 'branch-25.12' into cagra-q-pq_len-8
enp1s0 Oct 21, 2025
e0f629c
Add fp_8bit4
enp1s0 Oct 21, 2025
0da1aa2
Fix compilation error
enp1s0 Oct 21, 2025
62ba0ad
Add as_u32
enp1s0 Oct 21, 2025
7f9f614
Update VPQ
enp1s0 Oct 21, 2025
058abbb
Fix fp_8bit4 constructor
enp1s0 Oct 21, 2025
e64f8c6
Add sts for u32
enp1s0 Oct 21, 2025
77492dd
Add f8
enp1s0 Oct 21, 2025
4638eb2
Fix a bug
enp1s0 Oct 22, 2025
a64f264
Add native f8 support
enp1s0 Oct 22, 2025
77dbe73
Merge branch 'branch-25.12' into cagra-q-pq_len-8
enp1s0 Oct 22, 2025
60ba5e9
Fix VPQ init
enp1s0 Oct 22, 2025
f37a131
Update clock measure
enp1s0 Aug 26, 2025
3b3c20b
Add fp8x8
enp1s0 Oct 23, 2025
bc572e0
Fix a bug
enp1s0 Oct 24, 2025
ec275d4
Update 2, 4, 8 configs
enp1s0 Oct 24, 2025
a85d8a3
Fix a bug
enp1s0 Oct 26, 2025
d1f628c
Add F8 query support
enp1s0 Oct 27, 2025
e05dfb0
Fix query vec id calc
enp1s0 Oct 30, 2025
5f2b78f
Improve performance
enp1s0 Oct 30, 2025
484f9e6
Improve performance
enp1s0 Oct 31, 2025
88871c9
Merge branch 'cagra-q-pq_len-8' into cagra-q-pq_len-8-query-f8
enp1s0 Oct 31, 2025
76d820f
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Nov 2, 2025
e710c62
Merge branch 'cagra-q-pq_len-8' into cagra-q-pq_len-8-query-f8
enp1s0 Nov 2, 2025
6bcb0e6
Fix template switch
enp1s0 Nov 2, 2025
581eba1
Fix pq_val_config
enp1s0 Nov 2, 2025
f8a7a74
Merge branch 'cagra-q-pq_len-8' into cagra-q-pq_len-8-query-f8
enp1s0 Nov 2, 2025
a088cfd
Improve smem index calculation
enp1s0 Nov 3, 2025
e244ac1
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Nov 11, 2025
8f00f22
Merge branch 'cagra-q-pq_len-8' into cagra-q-pq_len-8-query-f8
enp1s0 Nov 11, 2025
12809f3
Update fp8 pack dtype
enp1s0 Nov 11, 2025
f91d041
Refactoring
enp1s0 Nov 11, 2025
7c8ecd4
Fix a bug
enp1s0 Nov 11, 2025
7b46115
Add EnableFP8 flag
enp1s0 Nov 11, 2025
d49959c
Fix a bug
enp1s0 Nov 11, 2025
ed906f7
Fix a bug in compute_distance_00_generate.py
enp1s0 Nov 12, 2025
c0e9ddd
Update VPQ instances
enp1s0 Nov 12, 2025
35259e3
Add `smem_dtype` option
enp1s0 Nov 12, 2025
77fbcf6
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Nov 12, 2025
7639d02
Remove unnecessary include
enp1s0 Nov 12, 2025
710232a
Remove unnecessary files
enp1s0 Nov 12, 2025
daf6f89
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Nov 12, 2025
d183be6
Remove unnecessary file
enp1s0 Nov 12, 2025
e7d3d42
Revert "Remove unnecessary file"
enp1s0 Nov 12, 2025
24089bc
Fix Copyright
enp1s0 Nov 12, 2025
fd4530e
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Nov 17, 2025
a4bd0e9
Merge branch 'main' into cagra-q-pq_len-8
cjnolet Jan 5, 2026
13725b3
add a new max_node_id parameter to the CAGRA search API, allowing use…
irina-resh-nvda Feb 6, 2026
0746446
Changed the max node id parameter name to graph_size for clarity; rem…
irina-resh-nvda Feb 6, 2026
70a69d9
wrote test
irina-resh-nvda Feb 6, 2026
f428e54
minor pre-commit changes
irina-resh-nvda Feb 6, 2026
c69c067
Merge branch 'main' into add-max-node-id-parameter
irina-resh-nvda Feb 9, 2026
9b1dff8
started updating the index building for iterative cagra q build
irina-resh-nvda Jan 13, 2026
71a4a09
removed temp files
irina-resh-nvda Jan 13, 2026
ff8174d
started implementing cagra q index computation outside of the loop
irina-resh-nvda Jan 14, 2026
74f5894
removed stub file
irina-resh-nvda Jan 14, 2026
5d7c4f4
updated gitignore
irina-resh-nvda Jan 14, 2026
c6dd661
cagra q index now is only calculated once in the iterative build
irina-resh-nvda Jan 14, 2026
83580cd
fixed rebasing artefact
irina-resh-nvda Feb 9, 2026
7d3b52c
Merge branch 'main' into add-max-node-id-parameter
irina-resh-nvda Feb 16, 2026
13b1f77
addressed comments regarding type cast
irina-resh-nvda Feb 16, 2026
8d3103b
Merge branch 'add-max-node-id-parameter' into iterative_cagra_q
irina-resh-nvda Feb 16, 2026
609b0f3
prune kernel smem
mfoerste4 Feb 16, 2026
a320e0e
reduce copies within reverse graph compute
mfoerste4 Feb 18, 2026
6d1a618
optimize() draft move more compute to GPU
mfoerste4 Feb 19, 2026
77ab079
Merge branch 'rapidsai:main' into cagra_optimize
mfoerste4 Feb 19, 2026
3e9767c
Removed max_node_id
irina-resh-nvda Feb 20, 2026
008e0fb
Merge branch 'rapidsai:main' into cagra_optimize
mfoerste4 Feb 20, 2026
822faea
some fixes, cleanup
mfoerste4 Feb 20, 2026
8ed1497
Merge branch 'main' into cagra_optimize
mfoerste4 Feb 24, 2026
d5882e8
merged main in
irina-resh-nvda Feb 25, 2026
129ee4f
added the fix that also checks whether the kernel function pointer ha…
irina-resh-nvda Feb 25, 2026
03bafdc
Merge remote-tracking branch 'origin/main' into cuvsbench_smem_size_bug
irina-resh-nvda Feb 25, 2026
5ec3027
merged main in
irina-resh-nvda Feb 25, 2026
9b1f741
some fixes
mfoerste4 Feb 25, 2026
7499d42
merged main and the cuvsbench bug fix
irina-resh-nvda Feb 25, 2026
1864711
style fixes
irina-resh-nvda Feb 25, 2026
a92aa64
optimisation attempt: skip optimisation except last step
irina-resh-nvda Feb 26, 2026
6e42fb1
make optimize() accept device graph
irina-resh-nvda Feb 26, 2026
c540cbc
use reconstructed queries and free the original dataset in cagraq ite…
irina-resh-nvda Feb 26, 2026
099a8d7
moved edge selection to gpu
irina-resh-nvda Feb 26, 2026
764aa64
Moved reverse graph construction to GPU
irina-resh-nvda Feb 26, 2026
ecf3b1d
extract prune into separate function
mfoerste4 Feb 27, 2026
972d278
extract optimize components
mfoerste4 Mar 2, 2026
5e9ebc5
enable both host/device inout graphs for optimize
mfoerste4 Mar 2, 2026
8f24d9d
resolve conflicts
mfoerste4 Mar 2, 2026
40977e2
smaller fixes
mfoerste4 Mar 2, 2026
14e9f3e
bugfix
mfoerste4 Mar 3, 2026
416558d
fuse and simplify pruning, remove CPU path
mfoerste4 Mar 5, 2026
d8d8bd8
cleanup merge, remove CPU path
mfoerste4 Mar 5, 2026
00c4204
batch reverse creation
mfoerste4 Mar 6, 2026
9e63a7c
add prefetch view to handle managed & host
mfoerste4 Mar 6, 2026
a38ad52
fix batched iterator
mfoerste4 Mar 9, 2026
89b0d1c
implement fallback / simplify strategy
mfoerste4 Mar 9, 2026
d0e3dae
add logging / remove stats compute
mfoerste4 Mar 10, 2026
ec45fd2
add test, persist stream pool, cleanup
mfoerste4 Mar 10, 2026
e43b51b
Merge branch 'main' into cagra_optimize
mfoerste4 Mar 10, 2026
c412138
switch to cooperative groups as __reduce_min_sync causes issues
mfoerste4 Mar 11, 2026
b035ea0
Merge branch 'cagra_optimize' of github.com:mfoerste4/cuvs into cagra…
mfoerste4 Mar 11, 2026
ab01bab
back to column wise reverse graph creation to boost closer connections
mfoerste4 Mar 13, 2026
139774f
Merge branch 'main' into cagra_optimize
mfoerste4 Mar 13, 2026
68f7883
fix signness
mfoerste4 Mar 13, 2026
add206a
stupid me trusting cursor to fix this
mfoerste4 Mar 13, 2026
ce593fa
merged with main
irina-resh-nvda Mar 16, 2026
ab21766
leftover files
irina-resh-nvda Mar 16, 2026
6f450cd
Revert graph_core.cuh to merge base before merging PR 1830
irina-resh-nvda Mar 16, 2026
59efb20
Merge remote-tracking branch 'remotes/upstream/pull-request/1830' int…
irina-resh-nvda Mar 16, 2026
d2195fd
older api artefact
irina-resh-nvda Mar 16, 2026
9f315ce
put all optimize() steps onto device, no more extra copies d->h; also…
irina-resh-nvda Mar 17, 2026
e9331b4
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Mar 31, 2026
1fc5acb
Fix copyright
enp1s0 Mar 31, 2026
fa72748
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 1, 2026
4d7bcd4
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 6, 2026
b42c96a
Merge branch 'main' into cagra-q-pq_len-8
cjnolet Apr 6, 2026
44ef664
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 7, 2026
3e1fcbb
merged main in
irina-resh-nvda Apr 7, 2026
b930923
merged 1533 in
irina-resh-nvda Apr 7, 2026
9fb763a
cmake fix
irina-resh-nvda Apr 8, 2026
c91011f
Decoupled compression parameters used during iterative graph construc…
irina-resh-nvda Apr 8, 2026
a1ebe80
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 9, 2026
fbba6b2
Add enable_fp8
enp1s0 Apr 8, 2026
7ed0fe7
Fix smem_dtype validation
enp1s0 Apr 9, 2026
b7f5210
Fix params.smem_dtype set
enp1s0 Apr 9, 2026
beb0a47
Fix CAGRA VPQ instance list
enp1s0 Apr 9, 2026
d430c39
fixed the In this search mode, only AUTO or F16 are supported as the …
irina-resh-nvda Apr 9, 2026
471069b
Fix structured binding mismatch in calc_recall and add explicit retur…
irina-resh-nvda Apr 9, 2026
648ade4
Search parameters used during iterative cagra graph construction are …
irina-resh-nvda Apr 9, 2026
d780bc7
Remove unnecessary files
enp1s0 Apr 9, 2026
9d19f74
Remove unnecessary files (2)
enp1s0 Apr 9, 2026
d60da02
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 9, 2026
c3a3cd9
Remove unnecessary files (3)
enp1s0 Apr 9, 2026
7a96c3d
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 10, 2026
ec34959
Fix a compilation error
enp1s0 Apr 10, 2026
7f9a639
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 14, 2026
0810185
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 20, 2026
183082d
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Apr 26, 2026
2539833
Add pq_len=8
enp1s0 May 29, 2026
90a5970
Merged the updated pr1533 in
irina-resh-nvda Jun 3, 2026
19d5a0a
Update cagra-q test
enp1s0 Jun 4, 2026
09deae5
Update the compute distance kernel
enp1s0 Jun 4, 2026
c1a2ce6
Merge branch 'main' into cagra-q-pq_len-8-alpha
enp1s0 Jun 4, 2026
5fa5321
Add FP8 support
enp1s0 Jun 4, 2026
c323fa1
Update EnableFP8
enp1s0 Jun 4, 2026
a577563
Update vpq test
enp1s0 Jun 4, 2026
d05f552
Remove internal_dtype::AUTO
enp1s0 Jun 5, 2026
9020739
Update fp8xN to used SW emulated FP8 when FP8 is not natively supported
enp1s0 Jun 5, 2026
07d29d2
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 5, 2026
627ee0d
Fix VPQ test
enp1s0 Jun 5, 2026
e7e4205
Fix compilation error
enp1s0 Jun 5, 2026
b788dbb
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 7, 2026
1032ffb
Update VPQ test to use VpqMathT
enp1s0 Jun 8, 2026
02e3726
Add pq_bits assert
enp1s0 Jun 8, 2026
d8c8844
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 9, 2026
8f5895d
Merge remote-tracking branch 'upstream/pull-request/1533' into iterat…
irina-resh-nvda Jun 9, 2026
ba1e26a
Merge remote-tracking branch 'upstream/pull-request/1533' into iterat…
irina-resh-nvda Jun 9, 2026
c608bd1
Remove SW emulated FP8
enp1s0 Jun 9, 2026
f706baa
Update dispatch funcs
enp1s0 Jun 10, 2026
0eef38f
Fix ldg_cg use
enp1s0 Jun 10, 2026
f777809
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 10, 2026
0a74ac6
Merge branch 'cagra-q-pq_len-8' of github.com:enp1s0/cuvs into cagra-…
enp1s0 Jun 10, 2026
dd2500a
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 11, 2026
ba1a5cf
Merge branch 'main' into cagra-q-pq_len-8
enp1s0 Jun 12, 2026
b5355d9
Add shuffle_dataset option for iterative CAGRA-Q graph build
irina-resh-nvda Jun 15, 2026
e152280
Merge remote-tracking branch 'upstream/pull-request/1533' into iterat…
irina-resh-nvda Jun 15, 2026
23c811b
fix oob due to in place raft shuffle
irina-resh-nvda Jun 15, 2026
8fc10ac
made the growth-phase build search itopk parameter a tunable parameter
irina-resh-nvda Jul 1, 2026
7 changes: 6 additions & 1 deletion .gitignore
@@ -72,7 +72,9 @@ docs/source/_static/rust

# clang tooling
compile_commands.json
.clangd/




# serialized ann indexes
brute_force_index
@@ -86,5 +88,8 @@ ivf_pq_index
/datasets/
/*.json

# clangd
*/.clangd

# java
.classpath
65 changes: 0 additions & 65 deletions cpp/.clangd

This file was deleted.

14 changes: 7 additions & 7 deletions cpp/CMakeLists.txt
@@ -268,12 +268,12 @@ if(NOT BUILD_CPU_ONLY)
INPUT_FILE
"${CMAKE_CURRENT_SOURCE_DIR}/src/neighbors/detail/cagra/compute_distance_vpq_inst.cu.in"
OUTPUT_FILE_FORMAT
- "${CMAKE_CURRENT_BINARY_DIR}/src/neighbors/detail/cagra/compute_distance_vpq_inst_data_@data_abbrev@_index_@index_abbrev@_distance_@distance_abbrev@_codebook_@codebook_abbrev@_metric_@metric@_team_@team_size@_dim_@dim@_pq_bits_@pq_bits@_pq_len_@pq_len@.cu"
+ "${CMAKE_CURRENT_BINARY_DIR}/src/neighbors/detail/cagra/compute_distance_vpq_inst_data_@data_abbrev@_index_@index_abbrev@_distance_@distance_abbrev@_codebook_@codebook_abbrev@_metric_@metric@_team_@team_size@_dim_@dim@_pq_bits_@pq_bits@_pq_len_@pq_len@_smem_@smem_abbrev@.cu"
)
generate_string_matrix(
cagra_compute_distance_vpq_selector_template_params
ITEM_FORMAT
- "\nvpq_descriptor_spec<DistanceType::@metric@, @team_size@, @dim@, @pq_bits@, @pq_len@, @codebook_type@, @data_type@, @index_type@, @distance_type@>"
+ "\nvpq_descriptor_spec<DistanceType::@metric@, @team_size@, @dim@, @pq_bits@, @pq_len@, @codebook_type@, @data_type@, @index_type@, @distance_type@, @smem_dtype@>"
GLUE
","
MATRIX_JSON_FILE
@@ -282,7 +282,7 @@ if(NOT BUILD_CPU_ONLY)
generate_string_matrix(
cagra_compute_distance_vpq_template_inst
ITEM_FORMAT
- "extern template struct vpq_descriptor_spec<DistanceType::@metric@, @team_size@, @dim@, @pq_bits@, @pq_len@, @codebook_type@, @data_type@, @index_type@, @distance_type@>@semicolon@"
+ "extern template struct vpq_descriptor_spec<DistanceType::@metric@, @team_size@, @dim@, @pq_bits@, @pq_len@, @codebook_type@, @data_type@, @index_type@, @distance_type@, @smem_dtype@>@semicolon@"
GLUE
"\n"
MATRIX_JSON_FILE
@@ -688,13 +688,13 @@ if(NOT BUILD_CPU_ONLY)
generate_jit_lto_kernels(
jit_lto_files
NAME_FORMAT
- "cagra_setup_workspace@pq_prefix@_team_size_@team_size@_dataset_block_dim_@dataset_block_dim@_@pq_bits@pq_@pq_len@subd_data_@data_abbrev@_query_@query_abbrev@"
+ "cagra_setup_workspace@pq_prefix@_team_size_@team_size@_dataset_block_dim_@dataset_block_dim@_@pq_bits@pq_@pq_len@subd_data_@data_abbrev@_query_@query_abbrev@_smem_@smem_abbrev@"
MATRIX_JSON_FILE
"${CMAKE_CURRENT_SOURCE_DIR}/src/neighbors/detail/cagra/jit_lto_kernels/setup_workspace_matrix.json"
KERNEL_INPUT_FILE
"${CMAKE_CURRENT_SOURCE_DIR}/src/neighbors/detail/cagra/jit_lto_kernels/setup_workspace_kernel.cu.in"
FRAGMENT_TAG_FORMAT
- "${cagra_ns}::fragment_tag_setup_workspace<${neighbors_ns}::tag_@data_abbrev@, ${neighbors_ns}::tag_index_@index_abbrev@, ${cagra_ns}::tag_dist_@distance_abbrev@, ${neighbors_ns}::tag_@query_abbrev@, ${cagra_ns}::tag_codebook_@codebook_abbrev@, @team_size@, @dataset_block_dim@, @pq_bits@, @pq_len@>"
+ "${cagra_ns}::fragment_tag_setup_workspace<${neighbors_ns}::tag_@data_abbrev@, ${neighbors_ns}::tag_index_@index_abbrev@, ${cagra_ns}::tag_dist_@distance_abbrev@, ${neighbors_ns}::tag_@query_abbrev@, ${cagra_ns}::tag_codebook_@codebook_abbrev@, @team_size@, @dataset_block_dim@, @pq_bits@, @pq_len@, ${cagra_ns}::tag_smem_@smem_abbrev@>"
FRAGMENT_TAG_HEADER_FILES "<cuvs/detail/jit_lto/cagra/cagra_fragments.hpp>"
"<cuvs/detail/jit_lto/common_fragments.hpp>"
OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/generated_kernels/cagra/setup_workspace"
@@ -704,13 +704,13 @@ if(NOT BUILD_CPU_ONLY)
generate_jit_lto_kernels(
jit_lto_files
NAME_FORMAT
- "cagra_compute_distance@pq_prefix@_team_size_@team_size@_dataset_block_dim_@dataset_block_dim@_@pq_bits@pq_@pq_len@subd_data_@data_abbrev@_query_@query_abbrev@"
+ "cagra_compute_distance@pq_prefix@_team_size_@team_size@_dataset_block_dim_@dataset_block_dim@_@pq_bits@pq_@pq_len@subd_data_@data_abbrev@_query_@query_abbrev@_smem_@smem_abbrev@"
MATRIX_JSON_FILE
"${CMAKE_CURRENT_SOURCE_DIR}/src/neighbors/detail/cagra/jit_lto_kernels/compute_distance_matrix.json"
KERNEL_INPUT_FILE
"${CMAKE_CURRENT_SOURCE_DIR}/src/neighbors/detail/cagra/jit_lto_kernels/compute_distance_kernel.cu.in"
FRAGMENT_TAG_FORMAT
- "${cagra_ns}::fragment_tag_compute_distance<${neighbors_ns}::tag_@data_abbrev@, ${neighbors_ns}::tag_index_@index_abbrev@, ${cagra_ns}::tag_dist_@distance_abbrev@, ${neighbors_ns}::tag_@query_abbrev@, ${cagra_ns}::tag_codebook_@codebook_abbrev@, @team_size@, @dataset_block_dim@, @pq_bits@, @pq_len@>"
+ "${cagra_ns}::fragment_tag_compute_distance<${neighbors_ns}::tag_@data_abbrev@, ${neighbors_ns}::tag_index_@index_abbrev@, ${cagra_ns}::tag_dist_@distance_abbrev@, ${neighbors_ns}::tag_@query_abbrev@, ${cagra_ns}::tag_codebook_@codebook_abbrev@, @team_size@, @dataset_block_dim@, @pq_bits@, @pq_len@, ${cagra_ns}::tag_smem_@smem_abbrev@>"
FRAGMENT_TAG_HEADER_FILES "<cuvs/detail/jit_lto/cagra/cagra_fragments.hpp>"
"<cuvs/detail/jit_lto/common_fragments.hpp>"
OUTPUT_DIRECTORY "${CMAKE_CURRENT_BINARY_DIR}/generated_kernels/cagra/compute_distance"
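
Note on the CMake change: every name format above gains a `_smem_@smem_abbrev@` suffix, so each shared-memory dtype produces its own generated file and kernel name. The following is a minimal sketch of how such `@key@` placeholders expand. `expand_name_format` is a hypothetical helper written for illustration only; it is not part of cuVS or of the `generate_jit_lto_kernels` CMake function, and the placeholder values used with it are assumptions.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not a cuVS API): replaces every "@key@" placeholder in
// `format` with the matching value from `values`. Placeholders that have no
// entry in `values` are left untouched.
std::string expand_name_format(std::string format,
                               const std::map<std::string, std::string>& values)
{
  for (const auto& kv : values) {
    const std::string token = "@" + kv.first + "@";
    std::size_t pos         = 0;
    while ((pos = format.find(token, pos)) != std::string::npos) {
      format.replace(pos, token.size(), kv.second);
      pos += kv.second.size();  // continue searching after the inserted value
    }
  }
  return format;
}
```

Expanding a shortened form of the new `NAME_FORMAT` with `smem_abbrev` set to `f16` and then to `e5m2` yields two distinct names, which is what lets both variants coexist in the build.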
98 changes: 94 additions & 4 deletions cpp/bench/ann/src/cuvs/cuvs_ann_bench_param_parser.h
@@ -320,10 +320,12 @@ void parse_build_param(const nlohmann::json& conf, cuvs::neighbors::cagra::index
}

// Parse build-algo-specific parameters and use them to decide on the algo type
- nlohmann::json ivf_pq_build_conf = collect_conf_with_prefix(conf, "ivf_pq_build_");
- nlohmann::json ivf_pq_search_conf = collect_conf_with_prefix(conf, "ivf_pq_search_");
- nlohmann::json nn_descent_conf = collect_conf_with_prefix(conf, "nn_descent_");
- nlohmann::json ace_conf = collect_conf_with_prefix(conf, "ace_");
+ nlohmann::json ivf_pq_build_conf = collect_conf_with_prefix(conf, "ivf_pq_build_");
+ nlohmann::json ivf_pq_search_conf = collect_conf_with_prefix(conf, "ivf_pq_search_");
+ nlohmann::json nn_descent_conf = collect_conf_with_prefix(conf, "nn_descent_");
+ nlohmann::json ace_conf = collect_conf_with_prefix(conf, "ace_");
+ nlohmann::json build_compression_conf = collect_conf_with_prefix(conf, "build_compression_");
+ nlohmann::json build_search_conf = collect_conf_with_prefix(conf, "build_search_");

// When graph_build_algo is not specified, leave graph_build_params as monostate so the
// CAGRA build uses AUTO selection (NN_DESCENT or IVF_PQ based on dataset/heuristics).
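
Note on the two new prefixes: the later hunk looks up bare keys such as `width` and `itopk` in `build_search_conf`, which suggests that `collect_conf_with_prefix` strips the prefix from each matching key. The sketch below illustrates that behaviour under this assumption, using a flat `std::map` instead of `nlohmann::json` so it stays self-contained. `collect_with_prefix` and `flat_conf` are hypothetical names, not the benchmark's own helper.

```cpp
#include <map>
#include <string>

// Flat stand-in for a JSON object of scalar settings (illustration only).
using flat_conf = std::map<std::string, std::string>;

// Hypothetical stand-in for collect_conf_with_prefix: keeps the entries whose
// key starts with `prefix` and stores them under the key with the prefix removed.
flat_conf collect_with_prefix(const flat_conf& conf, const std::string& prefix)
{
  flat_conf out;
  for (const auto& kv : conf) {
    if (kv.first.compare(0, prefix.size(), prefix) == 0) {
      out[kv.first.substr(prefix.size())] = kv.second;
    }
  }
  return out;
}
```

Under this reading, a benchmark config entry named `build_search_itopk` reaches the parser as `itopk`, and `build_search_smem_dtype` as `smem_dtype`.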
@@ -354,6 +356,94 @@ void parse_build_param(const nlohmann::json& conf, cuvs::neighbors::cagra::index
} else if constexpr (std::is_same_v<U,
cuvs::neighbors::graph_build_params::nn_descent_params>) {
parse_build_param<T, IdxT>(nn_descent_conf, arg);
+ } else if constexpr (std::is_same_v<
+ U,
+ cuvs::neighbors::graph_build_params::iterative_search_params>) {
+ if (!build_compression_conf.empty()) {
+ auto vpq_pams = arg.build_compression.value_or(cuvs::neighbors::vpq_params{});
+ parse_build_param(build_compression_conf, vpq_pams);
+ arg.build_compression.emplace(vpq_pams);
+ }
+ if (build_search_conf.contains("width")) {
+ arg.search_width = build_search_conf.at("width");
+ }
+ if (build_search_conf.contains("max_iterations")) {
+ arg.max_iterations = build_search_conf.at("max_iterations");
+ }
+ if (build_search_conf.contains("min_iterations")) {
+ arg.min_iterations = build_search_conf.at("min_iterations");
+ }
+ if (build_search_conf.contains("itopk")) { arg.itopk_size = build_search_conf.at("itopk"); }
+ if (build_search_conf.contains("max_queries")) {
+ arg.max_queries = build_search_conf.at("max_queries");
+ }
+ if (build_search_conf.contains("team_size")) {
+ arg.team_size = build_search_conf.at("team_size");
+ }
+ if (build_search_conf.contains("thread_block_size")) {
+ arg.thread_block_size = build_search_conf.at("thread_block_size");
+ }
+ if (build_search_conf.contains("hashmap_min_bitlen")) {
+ arg.hashmap_min_bitlen = build_search_conf.at("hashmap_min_bitlen");
+ }
+ if (build_search_conf.contains("hashmap_max_fill_rate")) {
+ arg.hashmap_max_fill_rate = build_search_conf.at("hashmap_max_fill_rate");
+ }
+ if (build_search_conf.contains("num_random_samplings")) {
+ arg.num_random_samplings = build_search_conf.at("num_random_samplings");
+ }
+ if (build_search_conf.contains("persistent")) {
+ arg.persistent = build_search_conf.at("persistent");
+ }
+ if (build_search_conf.contains("persistent_lifetime")) {
+ arg.persistent_lifetime = build_search_conf.at("persistent_lifetime");
+ }
+ if (build_search_conf.contains("persistent_device_usage")) {
+ arg.persistent_device_usage = build_search_conf.at("persistent_device_usage");
+ }
+ if (build_search_conf.contains("algo")) {
+ std::string algo = build_search_conf.at("algo");
+ if (algo == "single_cta") {
+ arg.algo = cuvs::neighbors::cagra::search_algo::SINGLE_CTA;
+ } else if (algo == "multi_cta") {
+ arg.algo = cuvs::neighbors::cagra::search_algo::MULTI_CTA;
+ } else if (algo == "multi_kernel") {
+ arg.algo = cuvs::neighbors::cagra::search_algo::MULTI_KERNEL;
+ } else if (algo == "auto") {
+ arg.algo = cuvs::neighbors::cagra::search_algo::AUTO;
+ }
+ }
+ if (build_search_conf.contains("hashmap_mode")) {
+ std::string mode = build_search_conf.at("hashmap_mode");
+ if (mode == "hash") {
+ arg.hashmap_mode = cuvs::neighbors::cagra::hash_mode::HASH;
+ } else if (mode == "small") {
+ arg.hashmap_mode = cuvs::neighbors::cagra::hash_mode::SMALL;
+ } else if (mode == "auto") {
+ arg.hashmap_mode = cuvs::neighbors::cagra::hash_mode::AUTO;
+ }
+ }
+ // Whether to shuffle the (compressed) dataset before the iterative build loop.
+ if (build_search_conf.contains("shuffle_dataset")) {
+ arg.shuffle_dataset = build_search_conf.at("shuffle_dataset").get<bool>();
+ }
+ // Precision of the codebook/query in shared memory for the VPQ search used during
+ // the iterative build. Accepts an integer code (0=F16, 1=E5M2) or a string.
+ if (build_search_conf.contains("smem_dtype")) {
+ const auto& sd = build_search_conf.at("smem_dtype");
+ if (sd.is_number_integer()) {
+ arg.smem_dtype = static_cast<cuvs::neighbors::cagra::internal_dtype>(sd.get<int>());
+ } else {
+ std::string s = sd.get<std::string>();
+ if (s == "f16" || s == "F16" || s == "fp16" || s == "half") {
+ arg.smem_dtype = cuvs::neighbors::cagra::internal_dtype::F16;
+ } else if (s == "e5m2" || s == "E5M2" || s == "fp8") {
+ arg.smem_dtype = cuvs::neighbors::cagra::internal_dtype::E5M2;
+ } else {
+ throw std::runtime_error("invalid value for build_search smem_dtype: " + s);
+ }
+ }
+ }
}
},
params.graph_build_params);
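
Note on `smem_dtype` parsing: the hunk above accepts either an integer code or one of several string spellings. The sketch below isolates that mapping so it can be exercised without the benchmark harness. The enum is a local stand-in whose numeric values follow the comment in the diff (0=F16, 1=E5M2); the real `cuvs::neighbors::cagra::internal_dtype` layout is an assumption here. Unlike the `static_cast` in the diff, the integer overload in this sketch rejects codes other than 0 and 1, which is a deliberate difference and not what the PR does.

```cpp
#include <stdexcept>
#include <string>

// Local stand-in for cuvs::neighbors::cagra::internal_dtype. The values follow
// the comment in the diff (0=F16, 1=E5M2) and are otherwise an assumption.
enum class internal_dtype : int { F16 = 0, E5M2 = 1 };

// String form: accepts the same spellings as the diff and throws on anything else.
internal_dtype parse_smem_dtype(const std::string& s)
{
  if (s == "f16" || s == "F16" || s == "fp16" || s == "half") { return internal_dtype::F16; }
  if (s == "e5m2" || s == "E5M2" || s == "fp8") { return internal_dtype::E5M2; }
  throw std::runtime_error("invalid value for build_search smem_dtype: " + s);
}

// Integer form: this sketch validates the code instead of casting it blindly.
internal_dtype parse_smem_dtype(int code)
{
  if (code == 0) { return internal_dtype::F16; }
  if (code == 1) { return internal_dtype::E5M2; }
  throw std::runtime_error("invalid integer code for build_search smem_dtype: " +
                           std::to_string(code));
}
```

The `algo` and `hashmap_mode` branches in the same hunk silently ignore unrecognised strings, whereas `smem_dtype` throws; a validated mapping like the one sketched here is one way the three could be made consistent.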
8 changes: 6 additions & 2 deletions cpp/include/cuvs/detail/jit_lto/cagra/cagra_fragments.hpp
@@ -16,6 +16,8 @@ struct tag_metric_cosine {};
struct tag_metric_hamming {};
struct tag_codebook_none {};
struct tag_codebook_half {};
+ struct tag_smem_f16 {};
+ struct tag_smem_e5m2 {};
struct tag_metric_l1 {};
struct tag_norm_noop {};
struct tag_norm_cosine {};
@@ -33,7 +35,8 @@ template <typename DataTag,
uint32_t TeamSize,
uint32_t DatasetBlockDim,
uint32_t PqBits,
- uint32_t PqLen>
+ uint32_t PqLen,
+ typename SmemTag>
struct fragment_tag_setup_workspace {};

template <typename DataTag,
@@ -44,7 +47,8 @@ template <typename DataTag,
uint32_t TeamSize,
uint32_t DatasetBlockDim,
uint32_t PqBits,
- uint32_t PqLen>
+ uint32_t PqLen,
+ typename SmemTag>
struct fragment_tag_compute_distance {};

template <typename QueryTag, typename DistanceTag, typename MetricTag>
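
Note on the new `SmemTag` parameter: because the tag is part of the fragment type, the F16 and E5M2 variants of a kernel resolve to different fragment tags even when every other parameter matches. The sketch below shows this with a reduced stand-in template. `fragment_tag_sketch` and `smem_element_bytes` are hypothetical names introduced for illustration; only the two tag structs mirror the diff.

```cpp
#include <cstdint>
#include <type_traits>

// Local copies of the two tags added in the diff.
struct tag_smem_f16 {};
struct tag_smem_e5m2 {};

// Reduced stand-in for fragment_tag_compute_distance: only the parameters
// needed for the illustration are kept.
template <std::uint32_t PqBits, std::uint32_t PqLen, typename SmemTag>
struct fragment_tag_sketch {};

// Hypothetical trait: bytes per element for each shared-memory dtype
// (a 16-bit half versus an 8-bit E5M2 value).
template <typename SmemTag>
struct smem_element_bytes;
template <>
struct smem_element_bytes<tag_smem_f16> {
  static constexpr std::uint32_t value = 2;
};
template <>
struct smem_element_bytes<tag_smem_e5m2> {
  static constexpr std::uint32_t value = 1;
};

// Same PqBits and PqLen, different SmemTag: the two fragment types are distinct.
static_assert(!std::is_same<fragment_tag_sketch<8, 8, tag_smem_f16>,
                            fragment_tag_sketch<8, 8, tag_smem_e5m2>>::value,
              "SmemTag must distinguish otherwise identical fragments");
```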