diff --git a/README.md b/README.md
index 45a389d..eeb7a36 100644
--- a/README.md
+++ b/README.md
@@ -126,11 +126,16 @@ profiles:
     - "llama"
 ```
 
-### Advanced Examples
+### Use Case Examples
 
 - [config.example.yaml](config.example.yaml) includes example for supporting `v1/embeddings` and `v1/rerank` endpoints
 - [Speculative Decoding](examples/speculative-decoding/README.md) - using a small draft model can increase inference speeds from 20% to 40%. This example includes a configurations Qwen2.5-Coder-32B (2.5x increase) and Llama-3.1-70B (1.4x increase) in the best cases.
 - [Optimizing Code Generation](examples/benchmark-snakegame/README.md) - find the optimal settings for your machine. This example demonstrates defining multiple configurations and testing which one is fastest.
+- [Restart on Config Change](examples/restart-on-config-change/README.md) - automatically restart llama-swap when trying out different configurations.
+
+## Configuration
+
+llama-s
diff --git a/examples/restart-on-config-change/README.md b/examples/restart-on-config-change/README.md
new file mode 100644
index 0000000..1f67aea
--- /dev/null
+++ b/examples/restart-on-config-change/README.md
@@ -0,0 +1,51 @@
+# Restart llama-swap on config change
+
+Sometimes editing the configuration file can take a bit of trial and error to get a model configuration tuned just right. The `watch-and-restart.sh` script can be used to watch `config.yaml` for changes and restart `llama-swap` when it detects a change.
+
+```bash
+#!/bin/bash
+#
+# A simple script to watch and restart llama-swap when its configuration
+# file changes. Useful for trying out configuration changes
+# without manually restarting the server each time.
+if [ -z "$1" ]; then
+    echo "Usage: $0 "
+    exit 1
+fi
+
+while true; do
+    # Start the process again
+    ./llama-swap-linux-amd64 -config "$1" -listen :1867 &
+    PID=$!
+    echo "Started llama-swap with PID $PID"
+
+    # Wait for modifications in the specified directory or file
+    inotifywait -e modify "$1"
+
+    # Check if process exists before sending signal
+    if kill -0 $PID 2>/dev/null; then
+        echo "Sending SIGTERM to $PID"
+        kill -SIGTERM $PID
+        wait $PID
+    else
+        echo "Process $PID no longer exists"
+    fi
+    sleep 1
+done
+```
+
+## Usage and output example
+
+```bash
+$ ./watch-and-restart.sh config.yaml
+Started llama-swap with PID 495455
+Setting up watches.
+Watches established.
+llama-swap listening on :1867
+Sending SIGTERM to 495455
+Shutting down llama-swap
+Started llama-swap with PID 495486
+Setting up watches.
+Watches established.
+llama-swap listening on :1867
+```