Test Qwen2.5-Coder-32B-Instruct pass@10

Hi,

I’m currently testing Qwen2.5-Coder-32B-Instruct and trying to compute pass@10, but the evaluation stage runs almost indefinitely and never seems to finish, particularly during the test execution phase.
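For reference, here is a minimal sketch of the unbiased pass@k estimator commonly used for this metric, assuming `n` samples are generated per problem and `c` of them pass all tests. The function name `pass_at_k` is illustrative only; I have not confirmed that the evaluator computes the metric this way.

```python
import math


def pass_at_k(n: int, c: int, k: int) -> float:
    """Unbiased pass@k estimate for one problem: 1 - C(n - c, k) / C(n, k).

    n: number of generated samples, c: number that passed, k: budget (e.g. 10).
    """
    if not 0 <= c <= n:
        raise ValueError("c must satisfy 0 <= c <= n")
    if not 1 <= k <= n:
        raise ValueError("k must satisfy 1 <= k <= n")
    # If fewer than k samples failed, every size-k subset contains a pass.
    if n - c < k:
        return 1.0
    return 1.0 - math.comb(n - c, k) / math.comb(n, k)
```

The benchmark score is then the mean of this value over all problems. Note that `n` must be at least 10 for pass@10, so the number of test executions grows with both the sample count and the number of test cases per problem.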

In addition, I’m hitting out-of-memory (OOM) errors even on fairly large CPU clusters. From the observed behavior, worker processes may be recursively spawning more workers, which would explain both the extremely long runtime and the memory blow-up.

I’m wondering:

* Is this a known issue with the evaluator?
* Are there recommended settings for pass@10 evaluation to avoid runaway execution?
* Should the number of workers be restricted manually, or is there a safer evaluation mode for large-scale runs?
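To make the last two questions concrete, below is a rough sketch of the kind of bounded execution I have in mind: a fixed-size pool where each candidate runs in its own interpreter with a hard timeout. This is not the evaluator's code. The names `run_candidate` and `evaluate`, the default limits, and the assumption that a candidate plus its tests can be passed as a single Python source string are all hypothetical.

```python
from __future__ import annotations

import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_candidate(code: str, timeout_s: float = 10.0) -> bool:
    """Run one candidate (solution plus tests) in a fresh interpreter.

    Returns True only if the process exits with status 0 within timeout_s.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", code],
            capture_output=True,
            timeout=timeout_s,  # subprocess.run kills the child on timeout
        )
    except subprocess.TimeoutExpired:
        return False
    return proc.returncode == 0


def evaluate(
    candidates: list[str], max_workers: int = 4, timeout_s: float = 10.0
) -> list[bool]:
    """Check all candidates with at most max_workers child processes alive."""
    # Threads only wait on subprocesses, so the pool size caps concurrency
    # without nesting process pools inside worker processes.
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(lambda c: run_candidate(c, timeout_s), candidates))
```

For example, `evaluate(samples, max_workers=8, timeout_s=5.0)` would return one boolean per sample in input order. The sketch does not cap memory per child, and the timeout kills only the direct child, not any processes that child has spawned, so it would not fully contain a candidate that forks.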

Any advice would be appreciated. Thanks.


Test Qwen2.5-Coder-32B-Instruct pass@10 #120
