llama-swap

Benson Wong 01d4838fb3 Fix token metrics parsing (#199)

Fix #198

- use llama-server's `timings` info if available in response body
- send "-1" for token/sec when not able to accurately calculate
  performance
- optimize streaming body search for metrics information
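
The first two points above can be illustrated with a minimal sketch. It is not the code from this commit: the type and function names (`responseBody`, `tokenMetrics`, `parseTokenMetrics`) are hypothetical, and the `timings` field names (`prompt_n`, `predicted_n`, `predicted_per_second`) are assumed to match what llama-server includes in its response body. The idea is to prefer `timings` when present and fall back to -1 for tokens/sec when the rate cannot be determined.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// timings mirrors a subset of the `timings` object that llama-server is
// assumed to add to its response body (field names are an assumption).
type timings struct {
	PromptN            int     `json:"prompt_n"`
	PredictedN         int     `json:"predicted_n"`
	PredictedPerSecond float64 `json:"predicted_per_second"`
}

// usage is the OpenAI-style token usage object.
type usage struct {
	PromptTokens     int `json:"prompt_tokens"`
	CompletionTokens int `json:"completion_tokens"`
}

// responseBody holds only the fields needed for metrics (hypothetical type).
type responseBody struct {
	Usage   *usage   `json:"usage"`
	Timings *timings `json:"timings"`
}

// tokenMetrics is a hypothetical result type for this sketch.
type tokenMetrics struct {
	InputTokens     int
	OutputTokens    int
	TokensPerSecond float64 // -1 means "could not be calculated accurately"
}

// parseTokenMetrics prefers the server-reported timings and reports -1
// tokens/sec when no reliable rate is available.
func parseTokenMetrics(body []byte) (tokenMetrics, error) {
	var resp responseBody
	if err := json.Unmarshal(body, &resp); err != nil {
		return tokenMetrics{}, err
	}

	m := tokenMetrics{TokensPerSecond: -1}
	if resp.Usage != nil {
		m.InputTokens = resp.Usage.PromptTokens
		m.OutputTokens = resp.Usage.CompletionTokens
	}
	if resp.Timings != nil {
		// Server-side counts and rate take precedence over usage.
		m.InputTokens = resp.Timings.PromptN
		m.OutputTokens = resp.Timings.PredictedN
		if resp.Timings.PredictedPerSecond > 0 {
			m.TokensPerSecond = resp.Timings.PredictedPerSecond
		}
	}
	return m, nil
}

func main() {
	body := []byte(`{"usage":{"prompt_tokens":10,"completion_tokens":20},
		"timings":{"prompt_n":10,"predicted_n":20,"predicted_per_second":42.5}}`)
	m, err := parseTokenMetrics(body)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%+v\n", m)
}
```

Reporting -1 rather than 0 lets a consumer distinguish "rate unknown" from a genuinely measured rate.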

2025-07-22 23:10:14 -07:00

assets

Add /upstream endpoint (#30)

2024-12-17 14:37:44 -08:00

process-cmd-test

improve llama-swap upstream process recovery and restarts (#155)

2025-06-05 16:24:55 -07:00

simple-responder

Fix token metrics parsing (#199)

2025-07-22 23:10:14 -07:00

test-rerank

support v1/rerank endpoint

2024-12-17 21:22:25 -08:00