llama-swap/examples/speculative-decoding
Benson Wong 2fceb78e8d Add examples
2024-11-28 22:05:41 -08:00

Qwen 2.5 Coder with a Draft Model

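Using a small draft model such as qwen-2.5-coder-0.5B for speculative decoding can substantially speed up token generation for the larger 32-billion-parameter model. The small model cheaply proposes several tokens ahead, and the large model only has to verify them.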
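To make the idea concrete, here is a minimal toy sketch of greedy speculative decoding in Python. It is not how llama.cpp or llama-swap implement it; the "models" are hypothetical stand-in functions over integer tokens, and names such as `speculative_decode`, `toy_target`, and `toy_draft` are made up for illustration. The sketch assumes greedy (deterministic) sampling, in which case the output matches what the large model would produce on its own, just with fewer large-model passes.

```python
from typing import Callable, List, Tuple

# A "model" here is a hypothetical stand-in: it maps a token prefix
# to the single next token it would pick greedily.
Model = Callable[[List[int]], int]


def toy_target(prefix: List[int]) -> int:
    """Stand-in for the large model (assumption: a simple deterministic rule)."""
    return (prefix[-1] * 3 + 1) % 17


def toy_draft(prefix: List[int]) -> int:
    """Stand-in for the small draft model: agrees with the target most of
    the time, but deliberately guesses wrong at every 7th position."""
    guess = toy_target(prefix)
    return (guess + 1) % 17 if len(prefix) % 7 == 0 else guess


def greedy_decode(model: Model, prompt: List[int], n_new: int) -> List[int]:
    """Baseline: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(n_new):
        tokens.append(model(tokens))
    return tokens


def speculative_decode(
    target: Model, draft: Model, prompt: List[int], n_new: int, k: int = 4
) -> Tuple[List[int], int]:
    """Greedy speculative decoding. Returns the tokens and the number of
    target verification passes used."""
    tokens = list(prompt)
    goal = len(prompt) + n_new
    target_passes = 0
    while len(tokens) < goal:
        # 1. The draft model cheaply proposes k tokens ahead.
        proposal: List[int] = []
        for _ in range(k):
            proposal.append(draft(tokens + proposal))

        # 2. The target checks the proposal. A real engine scores all
        #    positions in one batched forward pass; we count it as one pass.
        target_passes += 1
        accepted: List[int] = []
        for guess in proposal:
            expected = target(tokens + accepted)
            if guess != expected:
                # First mismatch: keep the target's token, drop the rest.
                accepted.append(expected)
                break
            accepted.append(guess)
        else:
            # Every guess matched, so the target's next token comes for free.
            accepted.append(target(tokens + accepted))
        tokens.extend(accepted)
    return tokens[:goal], target_passes


if __name__ == "__main__":
    baseline = greedy_decode(toy_target, [1], 20)
    fast, passes = speculative_decode(toy_target, toy_draft, [1], 20)
    print("same output:", fast == baseline)
    print("target passes:", passes, "vs", 20, "for plain decoding")
```

The speedup depends on how often the draft model agrees with the large one: every accepted guess is a token the large model did not have to generate in its own sequential step, while a poor draft model wastes the work spent on rejected guesses.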