update readme

Author: Benson Wong
Date: 2024-12-08 21:34:16 -08:00
Parent: cb978f760f
Commit: 97dae50dc4

@@ -9,14 +9,25 @@ Features:
 - ✅ Easy to deploy: single binary with no dependencies
 - ✅ Single yaml configuration file
-- ✅ Automatic switching between models
-- ✅ Full control over llama.cpp server settings per model
+- ✅ On-demand model switching
+- ✅ Full control over server settings per model
 - ✅ OpenAI API support (`v1/completions` and `v1/chat/completions`)
 - ✅ Multiple GPU support
 - ✅ Run multiple models at once with `profiles`
 - ✅ Remote log monitoring at `/log`
 - ✅ Automatic unloading of models from GPUs after timeout
+
+## Releases
+
+Builds for Linux and OSX are available on the [Releases](https://github.com/mostlygeek/llama-swap/releases) page.
+
+### Building from source
+
+1. Install golang for your system
+1. `git clone git@github.com:mostlygeek/llama-swap.git`
+1. `make clean all`
+1. Binaries will be in the `build/` subdirectory
 
 ## config.yaml
 
 llama-swap's configuration is purposefully simple.
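To make the `config.yaml` section this hunk touches concrete, here is a minimal sketch of what a configuration might look like. The keys (`models`, `cmd`, `proxy`, `ttl`, `profiles`) mirror the features listed above, but the model name, file path, and ports are hypothetical; treat the repository's own documentation as the authoritative schema.

```yaml
# Minimal llama-swap configuration sketch (hypothetical values).
models:
  "llama":
    # Command llama-swap runs to start the upstream server for this model.
    cmd: llama-server --port 8999 -m /models/Llama-3.2-1B-Instruct-Q4_K_M.gguf
    # Where llama-swap proxies requests once the server is up.
    proxy: http://127.0.0.1:8999
    # Unload after this many idle seconds (the "automatic unloading"
    # feature in the list above).
    ttl: 60

# A profile runs several models at once (the `profiles` feature above).
profiles:
  mixed:
    - "llama"
```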
@@ -126,9 +137,3 @@ StartLimitInterval=30
 [Install]
 WantedBy=multi-user.target
 ```
-
-## Building from Source
-
-1. Install golang for your system
-1. run `make clean all`
-1. binaries will be built into `build/` directory
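As a usage sketch for the OpenAI API support mentioned in the features list: once llama-swap is running, requests to `v1/chat/completions` name the model to activate. The listen port and model name below are assumptions, not taken from this commit.

```sh
# Hypothetical request; assumes llama-swap is listening on port 8080 and a
# model named "llama" is defined in config.yaml. llama-swap starts the
# matching upstream server on demand and proxies the request to it.
curl http://localhost:8080/v1/chat/completions \
  -H "Content-Type: application/json" \
  -d '{"model": "llama", "messages": [{"role": "user", "content": "Hello!"}]}'
```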