perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696
Open
schochastics wants to merge 9 commits into
Open
perf: Speed up vertex/edge sequence construction (~10× for large result sets)#2696schochastics wants to merge 9 commits into
schochastics wants to merge 9 commits into
Conversation
Vertex/edge sequences eagerly materialized their `names` (and edge `vnames`) character vectors at construction time. For functions that return many sequences (e.g. max_cliques returning tens of thousands), this allocates a named character vector per object even when the names are never read. Add an `igraph_lazy_names` ALTREP string class (src/rinterface_extra.c) that wraps the graph's full name vector plus a 1-based index, and only materializes the strings when an element is touched (printing, named indexing, as_ids()). Subsetting stays lazy via an Extract_subset method that composes indices. V() and E() now attach names through the lazy_index_names() helper instead of subsetting eagerly; downstream constructors (create_vs/simple_vs_index/...) inherit laziness through the Extract_subset path with no changes. Names resolve identically to before; vertex/edge sequence printing, named indexing and as_ids() are unaffected. Note: this removes the name-materialization penalty (named sequences now cost about the same as unnamed), but it is not by itself sufficient to close the gap to the numeric (return.vs.es = FALSE) path -- that is dominated by per-object graph-reference attachment in add_vses_graph_ref(), addressed separately. cpp11::cpp_register() also drops two dead registrations (Rx_igraph_weak_ref_run_finalizer, UUID_gen) that no R code .Call()s. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the sequence-construction path on named graphs, where building the `names`/`vnames` attribute and attaching the graph reference dominate: max_cliques (tens of thousands of vertex sequences), head_of over every edge, and V()/E() on a large named graph. These guard the lazy-names work and any future construction-path changes. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Constructing a vertex/edge sequence attached a graph reference per object via add_vses_graph_ref(), which calls .Call(Rx_igraph_copy_env), .Call(Rx_igraph_make_weak_ref) and .Call(Rx_igraph_get_graph_id) for every object. For functions returning many sequences (e.g. max_cliques returning tens of thousands) this dominated construction time -- profiling showed it was ~75% of the cost, far more than name building. The weak reference's key is the graph's environment, which is identical for every sequence of a graph (Rf_duplicate() is a no-op on an environment, so get_vs_ref() returns the same env each call). A single shared weak reference is therefore semantically identical to one per object: while the graph is alive the reference resolves, and once the graph is released the (weak) reference reports it gone -- verified that get_vs_graph() still returns NULL after rm(graph); gc(). simple_es_index() already propagates env/graph from its input, so edge construction only needed the redundant per-object add_vses_graph_ref() call removed. simple_vs_index() now propagates env/graph the same way, and unsafe_create_vs()/unsafe_create_es() rely on that propagation. The existing `lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))` call sites then share the single weak reference built by V()/E() with no call-site or codegen changes. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~338ms -> ~200ms. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Two construction-cost reductions on top of the shared weak reference: * simple_vs_index() now sets names/class/env/graph in a single `attributes<-` call instead of separate `attr<-`/`class<-` assignments. Each incremental assignment shallow-copies the vector, and that copying dominated when building many sequences. * unsafe_create_vs() no longer routes through simple_vs_index(). Its `idx` are vertex IDs from C and `verts` is always the full V(graph), so `verts[idx]` just reproduces `idx`; we now use the IDs directly as the (integer) payload and take names lazily from `verts` via the ALTREP's O(1) subsetting, avoiding a full copy of V(graph) per object. Payload type (integer), names, NA handling and graph recovery are unchanged. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~200ms -> ~140ms (was ~338ms before this branch). Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Functions returning many vertex sequences used `lapply(res, unsafe_create_vs, graph = graph, verts = V(graph))`, which re-read the graph reference, graph id and name source from `verts` on every object. Add `create_vs_list(graph, idx_list)` which hoists all of that per-graph work out of the loop and builds each sequence with a single `attributes<-` and a directly-constructed lazy-names ALTREP, so per-object cost drops to ~an integer coercion plus attribute set. The `VERTEXSET_LIST` OUTCONV template in tools/stimulus/types-RR.yaml now emits `create_vs_list(graph, res)`; the generated R/aaa-*.R files are updated to match (27 sites), along with the hand-written call sites in cliques.R, cohesive.blocks.R, components.R, conversion.R, interface.R, paths.R and structural-properties.R (10 sites). `unsafe_create_vs()` stays as the single-object form. Edge sequences are left as-is: simple_es_index() already propagates the shared weak reference from a single E(graph), so unsafe_create_es() does not pay the per-object cost. A lazy `vnames` and an es batch form remain as follow-ups. max_cliques(sample_gnp(500, 0.15)) on a named graph: ~140ms -> ~100ms; ~338ms -> ~100ms over the whole branch (numeric floor ~25-30ms). Output, names, NA handling and graph recovery are unchanged; 0 test failures across the vertex-sequence-returning suites and aaa-auto. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Add a benchmark group exercising the three distinct wins on named graphs: * ego_order2_named -- batch construction of one vertex sequence per node (create_vs_list via neighborhood()). * max_cliques_sizes_named -- build many cliques but only read their sizes; with lazy names the name vectors are never materialized. * all_simple_paths_named -- another high-volume vertex-sequence-list path. * vs_subset_positional -- O(1) lazy subsetting of a large named vertex sequence (ALTREP Extract_subset) instead of copying a name subset. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
create_vs_list() drove the per-element work (lapply closure, as.integer, .Call to build the lazy-names ALTREP, and attributes<-) from R, repeated once per sequence -- the dominant cost when functions like max_cliques() return tens of thousands of igraph.vs objects. Move that loop into Rx_igraph_vs_list() in rinterface_extra.c, which reuses the file-static igraph_lazy_names ALTREP class directly. R now only builds V(graph) once (to mint the shared weak reference and graph id) and hands the pieces to C. Each element gets a fresh integer payload (guarded against coerceVector aliasing so the caller's vectors are never mutated), an optional lazy-names attribute, the shared env/graph attributes, and the igraph.vs class. max_cliques(sample_gnp(500, 0.15)) on a named graph (medians, 12 iters): vs: ~100ms -> 34.5ms (numeric floor ~30ms) Correctness verified: integer payloads with identical values/names, no names on unnamed graphs, NA/out-of-range IDs map to NA_STRING, inputs unmutated, and the shared weakref still releases the graph after rm()+gc(). Full suite passes (iterators, cliques, paths, components, flow, conversion, interface, structural-properties, cycles, attributes, aaa-auto); clean under gctorture. cpp11::cpp_register() added only the Rx_igraph_vs_list registration; it left the previously-noted unused entries untouched. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 21f3d4b is merged into main:
|
R CMD check on R 4.5 flagged a non-API call to `DATAPTR`, used by the `igraph_lazy_names` ALTREP Dataptr method. Switch to `DATAPTR_RO` (and keep `DATAPTR_OR_NULL`), both of which are part of the API and are the sanctioned replacements; `DATAPTR` is on tools:::nonAPI, these are not. The materialized name cache is read-only for our purposes (as.vector(), coercion, printing), so a read-only data pointer is sufficient. Verified the compiled object no longer references the `DATAPTR` entry point and that names/as_ids/as.vector/subsetting/printing still work. Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Contributor
|
This is how benchmark results would change (along with a 95% confidence interval in relative change) if 79fadf6 is merged into main:
|
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
part of #2695. I went down the ALTREP hole and while it is possible to get near the
return.vs.es = FALSEresult, it does add quite some complexity to the code. Lets discuss if we want this or not. maybe some stuff has to go to the C core?Summary
Functions that return many vertex sequences (
max_cliques(),cliques(),ego(), shortest-pathvpaths,all_simple_paths(), separators, cohesive blocks) were slow on named graphs because each returned sequence paid a fixed per-object tax. This PR removes that tax.max_cliques(sample_gnp(500, 0.15))on a named graph (37k cliques) locally:mainreturn.vs.es = FALSE)The default (vs objects) is now within ~4 ms of returning bare integer indices.
The finding (and how it differs from the issue)
The motivation in #1614 / #1652 assumed the cost was
create_vs()building thenamesattribute, and the fix was lazy/ALTREP names. Profiling says otherwise: lazy names was the smallest of the three contributors (~10%). The real costs were.Calls per sequence — ~75% of the time), andThe weak-reference turned out to be the same for every sequence of a graph (
Rf_duplicate()is a no-op on an environment), so one shared reference replaces one-per-object with no change in lifetime semantics. The rest came from building the payload directly, setting attributes in one pass, and finally constructing the whole list in a single C call.This reframes #1652. The performance gap that blocked deprecating
return.vs.es(the worst case stibu81 called "unusable" was an 8× gap) is now ~1.15×. For typical workloads it's negligible. The option's main remaining justification is gone, so it can be deprecated and left inert rather than urgently removed.What changed
names/edge-names are a lazy ALTREP string vector (igraph_lazy_names) that materializes only when read, and subsets in O(1).create_vs_list(), backed by a C constructor (Rx_igraph_vs_list) that builds the whole list in one pass.VERTEXSET_LISTstimulus template emitscreate_vs_list(); the generatedaaa-*.Rfiles were regenerated to match (verified by re-running stimulus).max_cliques_named,head_of_named,V_named/E_named) and the specific wins (ego_order2_named,max_cliques_sizes_named,all_simple_paths_named,vs_subset_positional).Correctness
Output is unchanged: integer payloads with the same values and names, no
namesattribute on unnamed graphs, NA/out-of-range IDs map toNA_STRING, and inputs are never mutated. The shared weak reference still releases the graph:get_vs_graph(seq)returnsNULLafterrm(graph); gc(). The full suite passes (iterators, cliques, paths, components, flow, conversion, interface, structural-properties, cycles, attributes, aaa-auto), and the C code is clean undergctorture(TRUE).Scope and follow-ups
Vertex sequences only. Edge sequences already share the weak reference via
simple_es_index(), but theirvnames("tail|head") are still built eagerly; making those lazy and adding a Ccreate_es_list()is left as a follow-up. Thereturn.vs.esoption and its codegen guards are untouched here — its deprecation is a separate decision, now much better supported.Relates to #1652; complements the consistency fix in #1614 / #2439.