Commit Graph

91 Commits

Author SHA1 Message Date
Benson Wong
7ff50631e0 Update README for setup instructions clarity [skip ci] 2025-10-19 14:55:23 -07:00
Benson Wong
9fc0431531 Clean up and Documentation (#347) [skip ci]
* cmd,misc: move misc binaries to cmd/
* docs: add docs and move examples/ there
* misc: remove unused misc/assets dir
* docs: add configuration.md
* update README with better structure

Updates: #334
2025-10-19 14:53:13 -07:00
Artur Podsiadły
558801db1a Fix nginx proxy buffering for streaming endpoints (#295)
* Fix nginx proxy buffering for streaming endpoints

- Add X-Accel-Buffering: no header to SSE endpoints (/api/events, /logs/stream)
- Add X-Accel-Buffering: no header to proxied text/event-stream responses
- Add nginx reverse proxy configuration section to README
- Add tests for X-Accel-Buffering header on streaming endpoints

Fixes #236

* Fix goroutine cleanup in streaming endpoints test

Add context cancellation to TestProxyManager_StreamingEndpointsReturnNoBufferingHeader
to ensure the goroutine is properly cleaned up when the test completes.
2025-09-09 16:07:46 -07:00
Benson Wong
954e2dee73 Remove cmdStart from README [skip ci]
cmdStart was in the README but it doesn't exist. Fixed the typo. Oops.
2025-09-04 11:57:28 -07:00
Benson Wong
2457840698 Update README.md [skip ci] 2025-08-28 23:44:37 -07:00
Benson Wong
7f55494151 Update README.md [skip ci] 2025-08-28 22:47:28 -07:00
Yandrik
977f1856bb add /completion endpoint (#275)
* feat: add /completion endpoint
* chore: reformat using gofmt
2025-08-28 21:41:02 -07:00
Benson Wong
57803fd3aa Support llama-server's /infill endpoint (#272)
Add support for llama-server's /infill endpoint and metrics gathering on the Activities page.
2025-08-27 08:36:05 -07:00
Benson Wong
5dc6b3e6d9 Add barebones but working implementation of model preload (#209, #235)
Add barebones but working implementation of model preload

* add config test for Preload hook
* improve TestProxyManager_StartupHooks
* docs for new hook configuration
* add a .dev to .gitignore
2025-08-14 10:27:28 -07:00
Benson Wong
a186318892 Update Readme, Add screenshot for Activities page [skip ci] 2025-08-08 13:39:46 -07:00
Benson Wong
c4e4d5e1e9 Update Readme UI Screenshot [skip ci] 2025-08-08 13:33:47 -07:00
Benson Wong
701476c0c4 Update README.md - remove contributor block [skip ci]
Contributor information available on the Github page's sidebar. Redundant.
2025-08-06 11:11:47 -07:00
Martin Garton
8be5073c51 Fix typo (#223) [skip ci]
Fix typo `lama-swap` -> `llama-swap`
2025-08-06 10:02:38 -07:00
Ryein Goddard
ba0a81937a Update README.md (#216)
Update git clone protocol to https
2025-08-01 19:48:09 -07:00
Benson Wong
5172cb2e12 Update docs in Readme [skip ci] 2025-07-30 11:51:14 -07:00
Ian Sebastian Mathew
bbaf172956 add trigger to rebuild homebrew formula (#210) 2025-07-30 10:12:21 -07:00
Gaël James
8c693e7fcf Add endpoint aliases for reranking models (#201)
* Add endpoint aliases for reranking models
* Add MetricsMiddleware to the previous reranking endpoint
* Fix the embeddings endpoint not having model set
2025-07-24 08:32:47 -07:00
Benson Wong
accd65294b add contributors to README [skip ci] 2025-07-21 23:16:48 -07:00
Benson Wong
7472a25864 Update README.md [skip ci]
update screenshot for web UI
2025-07-21 23:08:19 -07:00
Benson Wong
717d64e336 update GUI image in README [skip ci] 2025-06-24 10:38:28 -07:00
Benson Wong
a6b2e930d8 Update README.md [skip ci] 2025-06-18 11:47:08 -07:00
Benson Wong
3fce9ee0e9 Update README.md [skip ci] 2025-06-17 09:53:22 -07:00
Benson Wong
5899ae7966 Update README.md [skip ci] 2025-06-17 09:52:47 -07:00
Benson Wong
4d02ccd26a Update README.md [skip ci] 2025-05-30 09:38:45 -07:00
Benson Wong
dfd47eeac4 Readme updates [skip ci] 2025-05-30 09:19:08 -07:00
Benson Wong
1ac6499c08 Add macros to Configuration schema (#149)
* Add macros to Configuration schema
* update docs
2025-05-29 21:51:25 -07:00
Benson Wong
25f3dc25e7 small doc update [skip ci] 2025-05-26 16:03:27 -07:00
Benson Wong
8422e4e6a1 move some docs to the wiki [no-ci] 2025-05-26 15:46:08 -07:00
Benson Wong
02ee29d881 increase default healthCheckTimeout to 120s 2025-05-26 09:57:53 -07:00
Yuta Hayashibe
fb44cf4e08 Fix typos (#143) 2025-05-23 08:40:15 -07:00
Benson Wong
6e2ff28d59 improve cmdStop docs [no ci] 2025-05-16 13:52:04 -07:00
Benson Wong
a8b81f2799 Add stopCmd for custom stopping instructions (#136)
Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration will override the default behaviour and enables better integration with other process/container managers like docker or podman.
2025-05-16 13:48:42 -07:00
Benson Wong
f9ee7156dc update configuration examples for multiline yaml commands #133 2025-05-16 11:45:39 -07:00
Sam
bc652709a5 Add config hot-reload (#106)
introduce --watch-config command line option to reload ProxyManager when configuration changes.
2025-05-11 17:37:00 -07:00
Benson Wong
5c5a5da664 Update README.md
removed extra section.
2025-05-06 06:59:15 -07:00
Benson Wong
09e52c0500 Automatic Port Numbers (#105)
Add automatic port numbers assignment in configuration file. The string `${PORT}` will be substituted in model.cmd and model.proxy for an actual port number. This also allows model.proxy to be omitted from the configuration.
2025-05-05 17:07:43 -07:00
Benson Wong
448ccae959 Introduce Groups Feature (#107)
Groups allows more control over swapping behaviour when a model is requested. The new groups feature provides three ways to control swapping: within the group, swapping out other groups or keep the models in the group loaded persistently (never swapped out). 

Closes #96, #99 and #106.
2025-05-02 22:35:38 -07:00
Benson Wong
1f7aa359b1 Update header image
AI has finally made my dreams of llamas in funny clothing and stuck in
a claw machine waiting to be picked come true!
2025-04-23 13:02:12 -07:00
Benson Wong
b138d6cf25 fix starhistory in README 2025-04-15 20:23:46 -07:00
Benson Wong
b8f888f864 Logging Improvements (#88)
This change revamps the internal logging architecture to be more flexible and descriptive. Previously all logs from both llama-swap and upstream services were mixed together. This makes it harder to troubleshoot and identify problems. This PR adds these new endpoints: 

- `/logs/stream/proxy` - just llama-swap's logs
- `/logs/stream/upstream` - stdout output from the upstream server
2025-04-04 21:01:33 -07:00
Benson Wong
5565fca3ac add some badges to README 2025-03-19 11:25:06 -07:00
Benson Wong
a3f82c140b tidy up config examples in README 2025-03-15 10:36:45 -07:00
Benson Wong
5c97299e7b Add support for sending a custom model name to upstream (#69) (#71)
* add test for splitRequestedModel()
* Add `useModelName` parameter to model configuration
* add docs to README
2025-03-14 21:07:52 -07:00
Benson Wong
52c0196e0f clean up feature list in readme 2025-03-13 13:55:20 -07:00
Benson Wong
3201a68a04 Add /v1/audio/transcriptions support (#41)
* add support for /v1/audio/transcriptions
2025-03-13 13:49:39 -07:00
Florin-Gabriel Dumitru
3ac94ad20e Adds an endpoint '/running' (#61)
* Adds an endpoint '/running' that returns either an empty JSON object if no model has been loaded so far, or the last model loaded (model key) and it's current state (state key). Possible state values are: stopped, starting, ready and stopping.

* Improves the `/running` endpoint by allowing multiple entries under the `running` key within the JSON response.
Refactors the `/running` method name (listRunningProcessesHandler).
Removes the unlisted filter implementation.

* Adds tests for:
- no model loaded
- one model loaded
- multiple models loaded

* Adds simple comments.

* Simplified code structure as per 250313 comments on PR #65.

---------

Co-authored-by: FGDumitru|B <xelotx@gmail.com>
2025-03-13 13:42:59 -07:00
Benson Wong
62275e078d add examples to restart on config change #59 2025-03-06 10:50:29 -08:00
Benson Wong
88916059e1 add /unload to docs 2025-03-03 10:44:16 -08:00
Benson Wong
af653347ae Update README.md w/ starhistory graph 2025-02-27 16:43:34 -08:00
daschiller
7187cfe52e add Windows build support to Makefile (#54) 2025-02-18 17:24:31 -08:00