Add examples
examples/README.md (new file)

# Example Configurations

Learning by example is best.

The `examples/` folder contains llama-swap configurations that you can use on your local LLM server.

## List

* [Speculative Decoding](speculative-decoding/README.md) - using a small draft model can increase inference speeds by 20% to 40%. This example includes configurations for Qwen2.5-Coder-32B (2.5x speedup) and Llama-3.1-70B (1.4x speedup) in the best cases.
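
An entry for an example like the one above might be sketched as follows. This is only an illustration: the `models`/`cmd`/`proxy` keys follow llama-swap's general config shape, but the model name, file paths, and the `--model-draft`/`--draft-max` llama-server flags here are assumptions; the linked example README is the authoritative reference.

```yaml
# Hypothetical sketch of a llama-swap model entry pairing a large target
# model with a small draft model for speculative decoding.
# All paths, the model name, and the draft flags are placeholders.
models:
  "qwen2.5-coder-32b-spec":
    cmd: >
      llama-server --port ${PORT}
      -m /models/Qwen2.5-Coder-32B-Instruct-Q4_K_M.gguf
      --model-draft /models/Qwen2.5-Coder-0.5B-Instruct-Q8_0.gguf
      --draft-max 16
    proxy: http://127.0.0.1:${PORT}
```

The idea is that the small draft model proposes several tokens cheaply and the 32B model verifies them in one pass, which is where the speedup comes from.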