WIP watch diff with upstream main branch #6
Open
Jeronymous wants to merge 68 commits into upstream-main from
…ion of the dataset)
…q len (131072) is larger than the maximum number of tokens that can be stored in KV cache (130944). Try increasing `gpu_memory_utilization` or decreasing `max_model_len` when initializing the engine"
…and new version of the dataset is different)
…t it has eos_token_id)
…o avoid failures or NaN
…it (>= 0.15).
Unfortunately, it currently fails with vLLM 0.15.1 in our environment:
  File ".../vllm/v1/worker/gpu_worker.py", line 412, in initialize_from_config
    self.model_runner.initialize_kv_cache(kv_cache_config)
  File ".../vllm/v1/worker/gpu_model_runner.py", line 5874, in initialize_kv_cache
    self.initialize_attn_backend(kv_cache_config)
  File ".../vllm/v1/worker/gpu_model_runner.py", line 5225, in initialize_attn_backend
    check_attention_cp_compatibility(self.vllm_config)
  File ".../vllm/v1/worker/cp_utils.py", line 39, in check_attention_cp_compatibility
    assert layer_impl.supports_pcp, (
AssertionError: PCP requires attention impls' support, but the impl FlashAttentionImpl does not support PCP.
Fix multi-parallelism (TP+DP or PP+DP)
Add MathAlea French math MCQ community task
Add Red Teaming benchmark based on AdvBench
Upstream refactor splits src/lighteval/tasks into per-task files under src/lighteval/tasks/tasks/ and src/lighteval/tasks/multilingual/tasks/, drops default_tasks.py / default_prompts.py / multilingual/tasks.py, and removes the suite field from LightevalTaskConfig.

Port our edits to the new structure:
- tasks/gsm_plus.py: generation_size 16384
- tasks/gsm8k.py: generation_size 2048
- tasks/mgsm.py: hf_revision, suffix exact_match + expr_gold_metric, language-specific stop sequences for all 11 subsets
- tasks/piqa.py: switch to lighteval/piqa mirror
- tasks/siqa.py: pin hf_revision
- tasks/mmlu_pro.py: fix upstream's hardcoded ABCD letters so the prompt uses dynamic letters based on the number of options; add a parallel mmlu_pro_raw task exposing the handmade prompt (no inspect_ai)
- tasks/ruler.py: new home for the ruler prompt helper
- tasks/advbench.py: move here from community_tasks/
- multilingual/tasks/mathalea.py: move here from community_tasks/
- multilingual/tasks/french.py: keep jzhang86/fr_ifeval fallback and the generative GPQA-fr-diamond variant with prompt_gpqa_fr_instruct

Other conflict resolutions:
- pyproject.toml: take upstream unpinned transformers, vllm>=0.11.0, new inspect-ai and openai deps
- vllm_model.py: keep max_seq_len_to_capture fallback, Mistral eos_token guard, prefix-cache None-skip in logprob loop, and skip_reading_prefix_cache via guarded attribute assignment; adopt upstream's build_vllm_token_prompts helper
- llm_as_judge.py: keep max_model_len=65536, adopt upstream's api_key/base_url litellm pass-through
- lighteval_task.py: preserve name/data_dir fallback in load_dataset while picking up upstream's data_files support; keep partial args detail in __str__ for deterministic cache hashing
- cache_management.py: adopt name-only task_to_configs lookup; keep regex that strips function memory addresses for hash determinism
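
For reviewers, two of the items above (the dynamic option letters in tasks/mmlu_pro.py and the address-stripping regex in cache_management.py) can be illustrated with a minimal standalone sketch. This is not the code from this branch: the helper names `option_letters`, `format_mcq_prompt` and `stable_config_hash`, the prompt wording, and the exact regex are assumptions chosen only to show the idea.

```python
import hashlib
import re
from string import ascii_uppercase


def option_letters(num_options: int) -> list[str]:
    """Return one letter per option (A, B, C, ...) instead of a fixed ABCD."""
    if not 1 <= num_options <= len(ascii_uppercase):
        raise ValueError(f"unsupported number of options: {num_options}")
    return list(ascii_uppercase[:num_options])


def format_mcq_prompt(question: str, options: list[str]) -> str:
    """Build a multiple-choice prompt whose letters follow len(options).

    The wording of the prompt is hypothetical, not the lighteval template.
    """
    letters = option_letters(len(options))
    lines = [question]
    lines += [f"{letter}. {option}" for letter, option in zip(letters, options)]
    lines.append(f"Answer with one letter from {letters[0]} to {letters[-1]}.")
    return "\n".join(lines)


# Assumed pattern: matches the " at 0x7f..." part of a default function repr,
# which changes from one Python process to the next.
_ADDRESS_RE = re.compile(r" at 0x[0-9a-fA-F]+")


def stable_config_hash(config_repr: str) -> str:
    """Hash a config string after removing process-specific memory addresses."""
    normalized = _ADDRESS_RE.sub("", config_repr)
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    print(format_mcq_prompt("Which number is prime?", ["4", "6", "7", "8", "9"]))
    print(stable_config_hash("prompt_function=<function mmlu_pro at 0x7f3a2c1b4d30>"))
```

With a question that has ten options, as MMLU-Pro questions can, the first helper yields letters A through J rather than stopping at D. The second helper returns the same digest for two reprs of the same function that differ only in their address, which is the property a cache key needs to be reusable across runs.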
No description provided.