Fix VLM position_ids packing in Megatron strategy with sequence packing enabled by sanmuf · Pull Request #452 · alibaba/ROLL

sanmuf · 2026-05-26T11:36:02Z

Summary

Fix Megatron VLM sequence packing by packing 3D mRoPE position_ids together with input_ids.

Details

For Qwen2-VL / Qwen3-VL style vision-language models, position_ids are 3D mRoPE tensors rather than a single 1D index per token. Previously, when sequence_packing was enabled, only token tensors such as input_ids and labels were packed, while the VLM position_ids remained unpacked.

This caused packed token sequences and mRoPE position ids to have inconsistent shapes, leading to Megatron RoPE runtime errors during reference/train forward:

RuntimeError: Sizes of tensors must match except in dimension 3.
Expected size 2 but got size 1 for tensor number 1 in the list.

This happens in Megatron's RoPE application path:

megatron/core/models/common/embeddings/rope_utils.py
_apply_rotary_pos_emb_bshd -> torch.cat((t, t_pass), dim=-1)

Changes

Pack position_ids together with input_ids when sequence_packing is enabled.

Adjust padding logic in _pack_sequences to support non-1D tensors such as VLM position ids.

For non-packing mode, preserve the existing VLM position id layout.
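The first two changes can be sketched as follows. This is a minimal illustration and not the code from this PR: plain nested Python lists stand in for tensors, the helper names (concat_last_dim, pad_last_dim, pack), the [3, seq_len] per-sample layout, and the padding multiple of 4 are all assumptions made for the example. The point it shows is that padding along the innermost axis works for both 1D input_ids and multi-axis position_ids, so both end up with the same packed length.

```python
# Hypothetical sketch: pack token ids and mRoPE-style position ids together.
# Nested lists stand in for tensors; samples are assumed to be non-empty.

def concat_last_dim(samples):
    """Concatenate equally nested lists along their innermost axis."""
    if not isinstance(samples[0][0], list):
        # 1D case: simply chain the sequences one after another.
        return [value for sample in samples for value in sample]
    # N-D case: recurse into each leading index (e.g. the 3 mRoPE axes).
    return [
        concat_last_dim([sample[i] for sample in samples])
        for i in range(len(samples[0]))
    ]

def last_dim_len(x):
    """Length of the innermost axis of a nested list."""
    while isinstance(x[0], list):
        x = x[0]
    return len(x)

def pad_last_dim(x, target_len, pad_value=0):
    """Right-pad the innermost axis to target_len, for any nesting depth."""
    if not isinstance(x[0], list):
        return x + [pad_value] * (target_len - len(x))
    return [pad_last_dim(row, target_len, pad_value) for row in x]

def pack(samples, multiple=4, pad_value=0):
    """Concatenate samples, then pad the packed length up to a multiple."""
    packed = concat_last_dim(samples)
    total = last_dim_len(packed)
    target = -(-total // multiple) * multiple  # ceiling to the next multiple
    return pad_last_dim(packed, target, pad_value)

# Two samples of length 3 and 2.
input_ids = [[11, 12, 13], [21, 22]]
# Assumed per-sample layout [3, seq_len]: one row per mRoPE axis.
position_ids = [
    [[0, 1, 2], [0, 1, 2], [0, 1, 2]],
    [[0, 1], [0, 1], [0, 1]],
]

packed_ids = pack(input_ids)
packed_pos = pack(position_ids)

print(packed_ids)  # [11, 12, 13, 21, 22, 0, 0, 0]
print(packed_pos)  # three rows, each [0, 1, 2, 0, 1, 0, 0, 0]
```

With real tensors the same idea usually reduces to concatenating and padding along the last dimension, which is why padding logic written only for 1D tensors has to be generalized before position_ids can go through it.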

Validation

After this fix, VLM position_ids are packed consistently with tokens and the Megatron forward path no longer hits the RoPE shape mismatch.

fix positions_ids no pack when megatron vlm seq_packing

d1483c3

sanmuf wants to merge 1 commit into alibaba:main from sanmuf:vlm-pos
