```python
{"enable_lora": True, "max_loras": 1, "max_lora_rank": model_config.lora_rank}
```
In an attempt to follow the Thinking Machines "LoRA Without Regret" blog post, I tried to use a LoRA of rank 1 with vLLM rollouts. However, this is not possible because vLLM rejects ranks that small. I'm wondering if it would be appropriate here to override the LoRA kwargs we send to vLLM (e.g. send max(lora_rank, 8)). I'm a bit naive here, so let me know if this actually wouldn't help with the memory footprint at all.
Edit: Provide Links
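A minimal sketch of the proposed workaround, assuming vLLM's minimum supported LoRA rank is 8 (the helper name and the constant are hypothetical, not from the codebase): clamp the rank passed to vLLM while still training the adapter at the lower rank.

```python
# Assumption: vLLM rejects max_lora_rank values below some floor (taken
# here to be 8). Training can still use a smaller rank; only the value
# advertised to the vLLM engine is clamped upward.
MIN_VLLM_LORA_RANK = 8  # hypothetical floor, not a documented vLLM constant

def make_vllm_lora_kwargs(lora_rank: int) -> dict:
    """Build LoRA kwargs for vLLM, overriding too-small ranks."""
    return {
        "enable_lora": True,
        "max_loras": 1,
        "max_lora_rank": max(lora_rank, MIN_VLLM_LORA_RANK),
    }

# A rank-1 training config would then request rank 8 from vLLM,
# while larger ranks pass through unchanged.
print(make_vllm_lora_kwargs(1))
print(make_vllm_lora_kwargs(16))
```

Whether this actually reduces the rollout memory footprint depends on how vLLM preallocates its LoRA slots from `max_lora_rank`, which is the open question above.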