Skip to content

chore: remove ms2features_range modes and IDRipper#706

Merged
ypriverol merged 3 commits intobigbio:devfrom
ypriverol:feat/remove-ms2features-range
Apr 21, 2026
Merged

chore: remove ms2features_range modes and IDRipper#706
ypriverol merged 3 commits intobigbio:devfrom
ypriverol:feat/remove-ms2features-range

Conversation

@ypriverol
Copy link
Copy Markdown
Member

Summary

  • Drop the by_sample and by_project branches of ms2features_range and collapse the three-way switch in psm_rescoring, dda_id, and peptide_database_search to a single per-file Percolator path.
  • Delete the IDRipper Nextflow module (no remaining caller). The IDRipper binary stays available inside the OpenMS container — Sage's module still invokes it internally to split multi-file search output.
  • Delete the GET_SAMPLE / extract_sample module (only used by the removed by_sample branch).
  • Remove ms2features_range from nextflow.config and nextflow_schema.json. ms2features_enable remains as the sole rescoring gate.
  • Tidy IDMerger's ms2features_range-keyed prefix logic (module kept for future reuse).

Net: 10 files changed, 7 insertions(+), 296 deletions(-).

Why

by_sample / by_project merge per-file idXMLs into one and run a single Percolator over the merged pool, then IDRipper splits the result back per-run for CONSENSUSID. The pooled-Percolator step has two problems: (a) it is not backed by a project-specific benchmark at 1000+ file scale, and (b) PercolatorAdapter refuses multi-engine input natively, so the whole point of the merge-then-split dance depends on CONSENSUSID — which is itself under discussion for replacement (see follow-up issue).

Dropping these modes now leaves only the independent_run topology (per-file Percolator), which is both the default today and the conservative choice for correctness. Users of the removed modes see a schema error and must adjust their config; this is an intentional hard break rather than a silent behaviour change.

The CONSENSUSID replacement strategy is a bigger design decision with its own trade-offs and is tracked separately. This PR is the pure-deletion prerequisite that removes code those discussions would otherwise have to reason around.

Test plan

  • nf-test LFQ default run still passes (single-engine, independent_run, ms2features_enable=false).
  • nf-test LFQ with ms2features_enable=true (single-engine rescoring path) still passes.
  • nf-test TMT default run still passes.
  • nf-test multi-engine (Comet + MSGF+) path still passes (goes through CONSENSUSID — unchanged).
  • Schema validation rejects any config that still sets ms2features_range (parameter no longer exists).
  • Verify Sage's internal IDRipper invocation (modules/local/openms/sage/main.nf:63) still works — the binary stays in the OpenMS container.

Breaking change

Yes, intentional. Any config setting ms2features_range = by_sample or by_project must be updated (or removed — independent_run is the only supported value, and it's now implicit).

@coderabbitai
Copy link
Copy Markdown
Contributor

coderabbitai Bot commented Apr 20, 2026

Important

Review skipped

Auto reviews are disabled on base/target branches other than the default branch.

Please check the settings in the CodeRabbit UI or the .coderabbit.yaml file in this repository. To trigger a single review, invoke the @coderabbitai review command.

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3c8fe393-cf4a-43ee-9a45-e0dd403fd6f2

You can disable this status message by setting the reviews.review_status to false in the CodeRabbit configuration file.

Use the checkbox below for a quick retry:

  • 🔍 Trigger review
✨ Finishing Touches
🧪 Generate unit tests (beta)
  • Create PR with unit tests

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@codacy-production
Copy link
Copy Markdown

codacy-production Bot commented Apr 20, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

View in Codacy

TIP This summary will be updated as you push new changes. Give us feedback

Drop the `by_sample` and `by_project` branches of `ms2features_range` together
with the `IDRipper` Nextflow module. These modes exist only to enable a
merged Percolator run that is then split back per-run for ConsensusID, which
is not the direction the pipeline is heading. With the default
`independent_run` path being the only supported topology, the three-way
switch in `psm_rescoring`, `dda_id`, and `peptide_database_search` collapses
to a single Percolator call per file.

IDRipper remains available as a binary inside the OpenMS container (Sage's
module still uses it internally for multi-file search output splitting), but
the Nextflow module is deleted as it has no remaining caller.

GET_SAMPLE / `extract_sample` is also removed — it existed solely to key the
by_sample merge and has no other use. IDMerger's `ms2features_range`-keyed
prefix logic is stripped (the module is kept for future reuse).

Config surface shrinks: `ms2features_range` is removed from `nextflow.config`
and `nextflow_schema.json`. `ms2features_enable` remains as the sole
rescoring gate.

No behaviour change for users on the default (`independent_run`). Anyone
running with `by_sample` or `by_project` today will see a schema error on
the removed parameter and will need to drop it from their config.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR removes the ms2features_range modes (by_sample, by_project) and the associated merge/split rescoring topology, standardizing rescoring to a single per-file Percolator path, and deletes now-unused Nextflow modules (IDRipper, GET_SAMPLE / extract_sample).

Changes:

  • Collapse rescoring logic in psm_rescoring, dda_id, and peptide_database_search to always use per-file Percolator outputs.
  • Remove ms2features_range from nextflow.config and nextflow_schema.json (intentional breaking change).
  • Delete unused modules (modules/local/openms/id_ripper/*, modules/local/utils/extract_sample/*) and simplify IDMerger prefix logic.

Reviewed changes

Copilot reviewed 10 out of 10 changed files in this pull request and generated 2 comments.

Show a summary per file
File Description
subworkflows/local/psm_rescoring/main.nf Removes ms2features_range branching and ID_RIPPER usage; always emits Percolator results.
subworkflows/local/peptide_database_search/main.nf Removes sample/project grouping paths (GET_SAMPLE, ID_MERGER); always mixes per-engine per-file IDs.
subworkflows/local/dda_id/main.nf Removes ID_RIPPER branching; always uses per-file Percolator results prior to downstream steps.
nextflow_schema.json Deletes ms2features_range parameter from schema to enforce the breaking change.
nextflow.config Removes default ms2features_range parameter.
modules/local/utils/extract_sample/meta.yml Deletes metadata for removed GET_SAMPLE / extract_sample module.
modules/local/utils/extract_sample/main.nf Deletes the GET_SAMPLE process implementation.
modules/local/openms/id_ripper/meta.yml Deletes metadata for removed ID_RIPPER module.
modules/local/openms/id_ripper/main.nf Deletes the ID_RIPPER process implementation.
modules/local/openms/id_merger/main.nf Removes ms2features_range-keyed prefix branching.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread subworkflows/local/psm_rescoring/main.nf
Comment thread nextflow_schema.json
Two comments from the Copilot reviewer:

1. PSM_RESCORING still declared ch_file_preparation_results and
   _ch_expdesign in its take: block, but neither is used after the
   rescoring collapse. Dropped both from the workflow signature and
   updated the caller in subworkflows/local/id/main.nf.

2. The ms2features_range schema removal is a hard break, but nf-schema
   2.5.1 only emits a warning for unknown parameters by default — so a
   config that still sets ms2features_range would be silently ignored
   rather than rejected. Added an explicit check in
   validateInputParameters() that errors with a migration message, and a
   new tests/removed_params.nf.test nf-test that asserts the pipeline
   fails with 'ms2features_range ... has been removed' when the
   parameter is provided.
The pipeline-test CI checker requires every nf.test to snapshot a
versions.yml file, which a negative test (pipeline expected to fail)
doesn't produce. The explicit rejection in validateInputParameters()
still catches the removed ms2features_range parameter at runtime with
a migration message; the nf-test wrapper isn't needed for that to work.
@ypriverol ypriverol requested a review from daichengxin April 21, 2026 04:52
@ypriverol ypriverol merged commit 9e8840a into bigbio:dev Apr 21, 2026
39 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Development

Successfully merging this pull request may close these issues.

3 participants