Support llama-server's /infill endpoint (#272)

Add support for llama-server's /infill endpoint and metrics gathering on the Activities page.
Benson Wong
2025-08-27 08:36:05 -07:00
committed by GitHub
parent c55d0cc842
commit 57803fd3aa
3 changed files with 39 additions and 25 deletions

@@ -18,9 +18,11 @@ Written in golang, it is very easy to install (single binary with no dependencie
   - `v1/completions`
   - `v1/chat/completions`
   - `v1/embeddings`
-  - `v1/rerank`, `v1/reranking`, `rerank`
   - `v1/audio/speech` ([#36](https://github.com/mostlygeek/llama-swap/issues/36))
   - `v1/audio/transcriptions` ([docs](https://github.com/mostlygeek/llama-swap/issues/41#issuecomment-2722637867))
+- ✅ llama-server (llama.cpp) supported endpoints:
+  - `v1/rerank`, `v1/reranking`, `/rerank`
+  - `/infill` - for code infilling
 - ✅ llama-swap custom API endpoints
   - `/ui` - web UI
   - `/log` - remote log monitoring
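
For readers unfamiliar with the `/infill` endpoint listed in the hunk above, here is a minimal Go sketch of what a client request through llama-swap might look like. It is an illustration, not code from this commit. `input_prefix` and `input_suffix` are the fields llama-server's `/infill` accepts. The `model` field is included on the assumption that llama-swap routes `/infill` by model name the same way it does for its other proxied endpoints. The base URL `http://localhost:8080`, the model name `qwen-coder`, and the helper names `infillRequest` and `newInfillRequest` are all hypothetical.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// infillRequest is a hypothetical helper type for this sketch.
// input_prefix and input_suffix are llama-server's /infill fields;
// model is assumed to be what llama-swap uses to choose the upstream.
type infillRequest struct {
	Model       string `json:"model"`
	InputPrefix string `json:"input_prefix"`
	InputSuffix string `json:"input_suffix"`
}

// newInfillRequest builds (but does not send) a POST to <baseURL>/infill.
// It returns the request together with the encoded JSON body for inspection.
func newInfillRequest(baseURL, model, prefix, suffix string) (*http.Request, []byte, error) {
	body, err := json.Marshal(infillRequest{
		Model:       model,
		InputPrefix: prefix,
		InputSuffix: suffix,
	})
	if err != nil {
		return nil, nil, err
	}
	req, err := http.NewRequest(http.MethodPost, baseURL+"/infill", bytes.NewReader(body))
	if err != nil {
		return nil, nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	return req, body, nil
}

func main() {
	// Assumed values: a local llama-swap instance and a made-up model name.
	req, body, err := newInfillRequest(
		"http://localhost:8080",
		"qwen-coder",
		"def add(a, b):\n    return ", // code before the cursor
		"\n",                          // code after the cursor
	)
	if err != nil {
		panic(err)
	}
	// Print what would be sent instead of contacting a server.
	fmt.Println(req.Method, req.URL.String())
	fmt.Println(string(body))
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

The sketch only constructs and prints the request, so it runs without a server; sending it with `http.DefaultClient.Do(req)` requires a running llama-swap instance with a matching model configured.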