chore: remove ms2features_range modes and IDRipper#706
Conversation
|
Important Review skippedAuto reviews are disabled on base/target branches other than the default branch. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Up to standards ✅🟢 Issues
|
Drop the `by_sample` and `by_project` branches of `ms2features_range` together with the `IDRipper` Nextflow module. These modes exist only to enable a merged Percolator run that is then split back per-run for ConsensusID, which is not the direction the pipeline is heading. With the default `independent_run` path being the only supported topology, the three-way switch in `psm_rescoring`, `dda_id`, and `peptide_database_search` collapses to a single Percolator call per file. IDRipper remains available as a binary inside the OpenMS container (Sage's module still uses it internally for multi-file search output splitting), but the Nextflow module is deleted as it has no remaining caller. GET_SAMPLE / `extract_sample` is also removed — it existed solely to key the by_sample merge and has no other use. IDMerger's `ms2features_range`-keyed prefix logic is stripped (the module is kept for future reuse). Config surface shrinks: `ms2features_range` is removed from `nextflow.config` and `nextflow_schema.json`. `ms2features_enable` remains as the sole rescoring gate. No behaviour change for users on the default (`independent_run`). Anyone running with `by_sample` or `by_project` today will see a schema error on the removed parameter and will need to drop it from their config.
03785ce to
fb116ad
Compare
There was a problem hiding this comment.
Pull request overview
This PR removes the ms2features_range modes (by_sample, by_project) and the associated merge/split rescoring topology, standardizing rescoring to a single per-file Percolator path, and deletes now-unused Nextflow modules (IDRipper, GET_SAMPLE / extract_sample).
Changes:
- Collapse rescoring logic in
psm_rescoring,dda_id, andpeptide_database_searchto always use per-file Percolator outputs. - Remove
ms2features_rangefromnextflow.configandnextflow_schema.json(intentional breaking change). - Delete unused modules (
modules/local/openms/id_ripper/*,modules/local/utils/extract_sample/*) and simplifyIDMergerprefix logic.
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| subworkflows/local/psm_rescoring/main.nf | Removes ms2features_range branching and ID_RIPPER usage; always emits Percolator results. |
| subworkflows/local/peptide_database_search/main.nf | Removes sample/project grouping paths (GET_SAMPLE, ID_MERGER); always mixes per-engine per-file IDs. |
| subworkflows/local/dda_id/main.nf | Removes ID_RIPPER branching; always uses per-file Percolator results prior to downstream steps. |
| nextflow_schema.json | Deletes ms2features_range parameter from schema to enforce the breaking change. |
| nextflow.config | Removes default ms2features_range parameter. |
| modules/local/utils/extract_sample/meta.yml | Deletes metadata for removed GET_SAMPLE / extract_sample module. |
| modules/local/utils/extract_sample/main.nf | Deletes the GET_SAMPLE process implementation. |
| modules/local/openms/id_ripper/meta.yml | Deletes metadata for removed ID_RIPPER module. |
| modules/local/openms/id_ripper/main.nf | Deletes the ID_RIPPER process implementation. |
| modules/local/openms/id_merger/main.nf | Removes ms2features_range-keyed prefix branching. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Two comments from the Copilot reviewer: 1. PSM_RESCORING still declared ch_file_preparation_results and _ch_expdesign in its take: block, but neither is used after the rescoring collapse. Dropped both from the workflow signature and updated the caller in subworkflows/local/id/main.nf. 2. The ms2features_range schema removal is a hard break, but nf-schema 2.5.1 only emits a warning for unknown parameters by default — so a config that still sets ms2features_range would be silently ignored rather than rejected. Added an explicit check in validateInputParameters() that errors with a migration message, and a new tests/removed_params.nf.test nf-test that asserts the pipeline fails with 'ms2features_range ... has been removed' when the parameter is provided.
The pipeline-test CI checker requires every nf.test to snapshot a versions.yml file, which a negative test (pipeline expected to fail) doesn't produce. The explicit rejection in validateInputParameters() still catches the removed ms2features_range parameter at runtime with a migration message; the nf-test wrapper isn't needed for that to work.
Summary
by_sampleandby_projectbranches ofms2features_rangeand collapse the three-way switch inpsm_rescoring,dda_id, andpeptide_database_searchto a single per-file Percolator path.IDRipperNextflow module (no remaining caller). TheIDRipperbinary stays available inside the OpenMS container — Sage's module still invokes it internally to split multi-file search output.GET_SAMPLE/extract_samplemodule (only used by the removedby_samplebranch).ms2features_rangefromnextflow.configandnextflow_schema.json.ms2features_enableremains as the sole rescoring gate.IDMerger'sms2features_range-keyed prefix logic (module kept for future reuse).Net:
10 files changed, 7 insertions(+), 296 deletions(-).Why
by_sample/by_projectmerge per-file idXMLs into one and run a single Percolator over the merged pool, thenIDRippersplits the result back per-run forCONSENSUSID. The pooled-Percolator step has two problems: (a) it is not backed by a project-specific benchmark at 1000+ file scale, and (b)PercolatorAdapterrefuses multi-engine input natively, so the whole point of the merge-then-split dance depends onCONSENSUSID— which is itself under discussion for replacement (see follow-up issue).Dropping these modes now leaves only the
independent_runtopology (per-file Percolator), which is both the default today and the conservative choice for correctness. Users of the removed modes see a schema error and must adjust their config; this is an intentional hard break rather than a silent behaviour change.The
CONSENSUSIDreplacement strategy is a bigger design decision with its own trade-offs and is tracked separately. This PR is the pure-deletion prerequisite that removes code those discussions would otherwise have to reason around.Test plan
independent_run,ms2features_enable=false).ms2features_enable=true(single-engine rescoring path) still passes.CONSENSUSID— unchanged).ms2features_range(parameter no longer exists).IDRipperinvocation (modules/local/openms/sage/main.nf:63) still works — the binary stays in the OpenMS container.Breaking change
Yes, intentional. Any config setting
ms2features_range = by_sampleorby_projectmust be updated (or removed —independent_runis the only supported value, and it's now implicit).