sharegpt dataset convert error

### Reminder

- [x] I have read the above rules and searched the existing issues.

### System Info

- `llamafactory` version: 0.9.2.dev0
- Platform: Linux-5.15.0-131-generic-x86_64-with-glibc2.35
- Python version: 3.10.16
- PyTorch version: 2.6.0+cu126 (GPU)
- Transformers version: 4.49.0.dev0
- Datasets version: 2.21.0
- Accelerate version: 1.0.1
- PEFT version: 0.12.0
- TRL version: 0.9.6
- GPU type: NVIDIA H100 80GB HBM3
- GPU number: 8
- GPU memory: 79.19GB

### Reproduction

Just use `sharegpt_hyper` dataset would cause the error.

```
[rank0]:   File "/home/xxx/LLaMA-Factory/src/llamafactory/data/aligner.py", line 15
3, in convert_sharegpt                                                                  
[rank0]:     {"role": tag_mapping[message[dataset_attr.role_tag]], "content": message[da
taset_attr.content_tag]}                                                                
[rank0]: KeyError: 'user'
```

code [here](https://github.com/hiyouga/LLaMA-Factory/blob/b68199db274a53d5916179e1aaf9722fd94fa2dc/src/llamafactory/data/aligner.py#L145-L151), when `broken_data = True`, it doesn't break and cause key error finally.

### Others

_No response_

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

sharegpt dataset convert error #6878

Reminder

System Info

Reproduction

Others

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

sharegpt dataset convert error #6878

Description

Reminder

System Info

Reproduction

Others

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions