Using a saved Reward Model for inference #4379

### Reminder

- [X] I have read the README and searched the existing issues.

### System Info

I load the exported reward model like this:
```
model = AutoModelForCausalLMWithValueHead.from_pretrained('...')
```
This produces a warning: `no v_head weight is found. This IS expected if you are not resuming PPO training.`

Is this normal and safe to ignore? I want to use the saved reward model for inference and have it output values.
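
One way to judge whether the warning matters for inference is to check whether the saved checkpoint contains any value head parameters at all. The following is a minimal sketch, not a confirmed diagnosis. It assumes the value head parameters are stored under a `v_head.` prefix and that a sharded checkpoint ships a JSON index with a `weight_map` entry. The helper names `value_head_keys` and `index_has_value_head` are hypothetical.

```python
import json
from pathlib import Path

# Assumption: value head parameters are saved under this prefix.
V_HEAD_PREFIX = "v_head."


def value_head_keys(param_names):
    """Return the parameter names that belong to the value head."""
    return [name for name in param_names if name.startswith(V_HEAD_PREFIX)]


def index_has_value_head(index_path):
    """Check a sharded-checkpoint index (JSON with a "weight_map") for value head weights."""
    weight_map = json.loads(Path(index_path).read_text())["weight_map"]
    return bool(value_head_keys(weight_map))


# Usage with illustrative parameter names (not taken from a real checkpoint).
example_names = ["model.embed_tokens.weight", "v_head.summary.weight", "v_head.summary.bias"]
print(value_head_keys(example_names))
```

If no such keys are found in the files that `from_pretrained` reads, the value head has presumably been freshly initialized, and the values it outputs would not reflect the trained reward model.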

### Reproduction

```
from trl import AutoModelForCausalLMWithValueHead  # assumed source of this class
model = AutoModelForCausalLMWithValueHead.from_pretrained('...')
```

### Expected behavior

_No response_

### Others

_No response_
