@none0663

Checklist Before Starting

  • Searched for similar PR(s).

What does this PR do?

This PR fixes a TypeError raised when running run_deepseek671b_math_megatron.sh. The parameters moe_router_bias_update_r and moe_aux_loss_coeff are already passed to _get_mla_transformer_config inside the hf_to_mcore_config_dpskv3 function. Supplying them again from the script leads to the error:
TypeError: verl.models.mcore.config_converter._get_mla_transformer_config() got multiple values for keyword argument 'moe_router_bias_update_r' and 'moe_aux_loss_coeff'.
https://github.com/volcengine/verl/blob/13475caaa9bb1b89b1a29f850f0b54253a8c2d38/verl/models/mcore/config_converter.py#L252C19-L252C46

To resolve this issue, this PR removes the duplicate arguments from the script, so each keyword reaches the function only once.
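For reviewers who want to see the failure mode in isolation, here is a minimal sketch of how this kind of error can arise. The names `get_transformer_config`, `convert_config`, and the numeric values are hypothetical stand-ins, not the actual verl code; the sketch assumes the converter sets a keyword explicitly and also forwards caller-supplied overrides via `**kwargs`.

```python
# Hypothetical stand-in for a config builder that accepts arbitrary keywords.
def get_transformer_config(base, **kwargs):
    return {"base": base, **kwargs}


# Hypothetical stand-in for a converter that sets a keyword itself
# and also forwards user-supplied overrides.
def convert_config(base, **override_kwargs):
    return get_transformer_config(
        base,
        moe_aux_loss_coeff=0.001,  # set explicitly by the converter (illustrative value)
        **override_kwargs,         # overrides coming from the launch script
    )


# Without a duplicate override, the call succeeds.
ok_config = convert_config("hf_config")

# Passing the same keyword again as an override raises TypeError.
try:
    convert_config("hf_config", moe_aux_loss_coeff=0.01)
    error_message = ""
except TypeError as exc:
    error_message = str(exc)

print(error_message)
```

Python reports the first conflicting keyword it encounters, so with two duplicated arguments the error may name only one of them per run. Dropping the duplicates from the caller, as this PR does, avoids the conflict without touching the converter.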

@vermouth1992 vermouth1992 merged commit 4a3881b into volcengine:main Jun 12, 2025
3 of 4 checks passed
@none0663 none0663 deleted the fix_multiple_argument_bug_for_671b branch June 12, 2025 13:53
yellowbee686 pushed a commit to yellowbee686/verl that referenced this pull request Jun 18, 2025
Tyizhanshen pushed a commit to HyperdriveHustle/verl that referenced this pull request Jul 1, 2025
whatadayG pushed a commit to whatadayG/verl that referenced this pull request Sep 5, 2025
chenjiaoAngel added a commit to chenjiaoAngel/verl that referenced this pull request Nov 14, 2025
TimurTaepov pushed a commit to giorgossideris/verl that referenced this pull request Dec 20, 2025