72 Commits

Author SHA1 Message Date
Benson Wong
717d64e336 update GUI image in README [skip ci] 2025-06-24 10:38:28 -07:00
Benson Wong
a6b2e930d8 Update README.md [skip ci] 2025-06-18 11:47:08 -07:00
Benson Wong
3fce9ee0e9 Update README.md [skip ci] 2025-06-17 09:53:22 -07:00
Benson Wong
5899ae7966 Update README.md [skip ci] 2025-06-17 09:52:47 -07:00
Benson Wong
4d02ccd26a Update README.md [skip ci] 2025-05-30 09:38:45 -07:00
Benson Wong
dfd47eeac4 Readme updates [skip ci] 2025-05-30 09:19:08 -07:00
Benson Wong
1ac6499c08 Add macros to Configuration schema (#149)
* Add macros to Configuration schema
* update docs
2025-05-29 21:51:25 -07:00
Benson Wong
25f3dc25e7 small doc update [skip ci] 2025-05-26 16:03:27 -07:00
Benson Wong
8422e4e6a1 move some docs to the wiki [no-ci] 2025-05-26 15:46:08 -07:00
Benson Wong
02ee29d881 increase default healthCheckTimeout to 120s 2025-05-26 09:57:53 -07:00
Yuta Hayashibe
fb44cf4e08 Fix typos (#143) 2025-05-23 08:40:15 -07:00
Benson Wong
6e2ff28d59 improve cmdStop docs [no ci] 2025-05-16 13:52:04 -07:00
Benson Wong
a8b81f2799 Add stopCmd for custom stopping instructions (#136)
Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration overrides the default behaviour and enables better integration with other process or container managers such as Docker or Podman.
2025-05-16 13:48:42 -07:00
Benson Wong
f9ee7156dc update configuration examples for multiline yaml commands #133 2025-05-16 11:45:39 -07:00
Sam
bc652709a5 Add config hot-reload (#106)
Introduce the `--watch-config` command line option to reload ProxyManager when the configuration changes.
2025-05-11 17:37:00 -07:00
Benson Wong
5c5a5da664 Update README.md
removed extra section.
2025-05-06 06:59:15 -07:00
Benson Wong
09e52c0500 Automatic Port Numbers (#105)
Add automatic port number assignment in the configuration file. The string `${PORT}` will be replaced in model.cmd and model.proxy with an actual port number. This also allows model.proxy to be omitted from the configuration.
2025-05-05 17:07:43 -07:00
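The `${PORT}` substitution described in the commit above can be sketched roughly as follows. This is a minimal illustration and not llama-swap's actual implementation: the `assign_ports` helper, the starting port 5800, the fallback proxy URL used when `proxy` is omitted, and the model names and files are all assumptions made for the example.

```python
PORT_MACRO = "${PORT}"
# Assumption: fallback used when a model omits `proxy`; the commit message
# only says that proxy may be omitted, not what it defaults to.
DEFAULT_PROXY = "http://localhost:${PORT}"


def assign_ports(models, start_port):
    """Return a copy of models with ${PORT} replaced in cmd and proxy.

    Each model whose cmd or proxy mentions ${PORT} receives the next
    number, counting up from start_port. Other models are left unchanged.
    """
    resolved = {}
    next_port = start_port
    for name, model in models.items():
        cmd = model["cmd"]
        proxy = model.get("proxy", DEFAULT_PROXY)
        if PORT_MACRO in cmd or PORT_MACRO in proxy:
            port = str(next_port)
            next_port += 1
            cmd = cmd.replace(PORT_MACRO, port)
            proxy = proxy.replace(PORT_MACRO, port)
        resolved[name] = {"cmd": cmd, "proxy": proxy}
    return resolved


# Hypothetical configuration, for illustration only.
models = {
    "llama": {"cmd": "llama-server --port ${PORT} -m llama.gguf"},
    "qwen": {
        "cmd": "llama-server --port ${PORT} -m qwen.gguf",
        "proxy": "http://127.0.0.1:${PORT}",
    },
    "fixed": {
        "cmd": "llama-server --port 9001 -m other.gguf",
        "proxy": "http://127.0.0.1:9001",
    },
}
resolved = assign_ports(models, start_port=5800)
for name, model in resolved.items():
    print(name, model["cmd"], model["proxy"])
```

In this sketch the same number is used for a model's cmd and proxy, so the proxy always points at the port the command was told to listen on.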
Benson Wong
448ccae959 Introduce Groups Feature (#107)
Groups allow more control over swapping behaviour when a model is requested. The new groups feature provides three ways to control swapping: swapping within the group, swapping out other groups, or keeping the models in the group loaded persistently (never swapped out).

Closes #96, #99 and #106.
2025-05-02 22:35:38 -07:00
Benson Wong
1f7aa359b1 Update header image
AI has finally made my dreams of llamas in funny clothing and stuck in
a claw machine waiting to be picked come true!
2025-04-23 13:02:12 -07:00
Benson Wong
b138d6cf25 fix starhistory in README 2025-04-15 20:23:46 -07:00
Benson Wong
b8f888f864 Logging Improvements (#88)
This change revamps the internal logging architecture to be more flexible and descriptive. Previously, all logs from both llama-swap and upstream services were mixed together, which made it harder to troubleshoot and identify problems. This PR adds these new endpoints:

- `/logs/stream/proxy` - just llama-swap's logs
- `/logs/stream/upstream` - stdout output from the upstream server
2025-04-04 21:01:33 -07:00
Benson Wong
5565fca3ac add some badges to README 2025-03-19 11:25:06 -07:00
Benson Wong
a3f82c140b tidy up config examples in README 2025-03-15 10:36:45 -07:00
Benson Wong
5c97299e7b Add support for sending a custom model name to upstream (#69) (#71)
* add test for splitRequestedModel()
* Add `useModelName` parameter to model configuration
* add docs to README
2025-03-14 21:07:52 -07:00
Benson Wong
52c0196e0f clean up feature list in readme 2025-03-13 13:55:20 -07:00
Benson Wong
3201a68a04 Add /v1/audio/transcriptions support (#41)
* add support for /v1/audio/transcriptions
2025-03-13 13:49:39 -07:00
Florin-Gabriel Dumitru
3ac94ad20e Adds an endpoint '/running' (#61)
* Adds an endpoint '/running' that returns either an empty JSON object if no model has been loaded so far, or the last model loaded (model key) and its current state (state key). Possible state values are: stopped, starting, ready and stopping.

* Improves the `/running` endpoint by allowing multiple entries under the `running` key within the JSON response.
Refactors the `/running` method name (listRunningProcessesHandler).
Removes the unlisted filter implementation.

* Adds tests for:
- no model loaded
- one model loaded
- multiple models loaded

* Adds simple comments.

* Simplified code structure as per 250313 comments on PR #65.

---------

Co-authored-by: FGDumitru|B <xelotx@gmail.com>
2025-03-13 13:42:59 -07:00
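The response shape described for `/running` in the commit above might look roughly like the sketch below. The layout (a `running` list whose entries carry `model` and `state` keys) is inferred from the commit message only and is not copied from llama-swap's source; the `running_response` helper and the model names are hypothetical.

```python
import json

# States named in the commit message.
VALID_STATES = ("stopped", "starting", "ready", "stopping")


def running_response(processes):
    """Build a JSON body listing each (model, state) pair under "running".

    Assumption: the exact layout is inferred from the commit message,
    not taken from the real handler.
    """
    entries = []
    for model, state in processes:
        if state not in VALID_STATES:
            raise ValueError(f"unknown state: {state!r}")
        entries.append({"model": model, "state": state})
    return json.dumps({"running": entries})


# Hypothetical model names, for illustration only.
print(running_response([]))
print(running_response([("llama", "ready"), ("qwen", "starting")]))
```

What the real endpoint returns when nothing is loaded is not settled by the message (the first bullet mentions an empty JSON object), so the empty list here is only one plausible reading.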
Benson Wong
62275e078d add examples to restart on config change #59 2025-03-06 10:50:29 -08:00
Benson Wong
88916059e1 add /unload to docs 2025-03-03 10:44:16 -08:00
Benson Wong
af653347ae Update README.md w/ starhistory graph 2025-02-27 16:43:34 -08:00
daschiller
7187cfe52e add Windows build support to Makefile (#54) 2025-02-18 17:24:31 -08:00
Benson Wong
24089d2d9c remove "no musa container" note from README 2025-02-18 16:38:48 -08:00
Benson Wong
48bd766536 Update README.md 2025-02-14 22:05:52 -08:00
Benson Wong
8d319da4dd improve README organization (i think...) 2025-02-14 15:59:12 -08:00
Benson Wong
be7c502448 improve docs 2025-02-14 15:47:31 -08:00
Benson Wong
96a8ea0241 add cpu docker container build 2025-02-14 15:25:45 -08:00
Benson Wong
f20f2c9b7a add docs and container build improvements #43 2025-02-14 12:20:07 -08:00
Benson Wong
6667e307a2 Update README.md 2025-02-08 10:28:35 -08:00
Benson Wong
7ac446e6a9 Update README.md 2025-02-08 10:26:11 -08:00
Benson Wong
314d2f2212 remove cmd_stop configuration and functionality from PR #40 (#44)
* remove cmd_stop functionality from #40
2025-01-31 12:42:44 -08:00
Benson Wong
baeb0c4e7f Add cmd_stop configuration to better support docker (#35)
Add `cmd_stop` to model configuration to run a command instead of sending a SIGTERM to shut down a process before swapping.
2025-01-30 16:59:57 -08:00
Benson Wong
c3b834737f Update README.md 2025-01-13 22:37:30 -08:00
Benson Wong
3c8e727b73 Update README.md 2025-01-12 19:48:35 -08:00
Benson Wong
3a1e9f81f1 support TTS /v1/audio/speech (#36) 2025-01-12 16:27:01 -08:00
Benson Wong
72c883f36c Update README.md 2025-01-02 09:01:51 -08:00
Benson Wong
1b04d034cf Update README.md 2025-01-02 08:59:11 -08:00
Benson Wong
2e45f5692a Update README.md
Improve README documentation.
2025-01-01 12:51:24 -08:00
Benson Wong
c97b80bdfe Update README.md 2025-01-01 12:25:45 -08:00
Benson Wong
84b667ca7a improve logging and error reporting for troubleshooting 2024-12-20 10:46:56 -08:00
Benson Wong
29657106fc add more OpenAI API supported in README 2024-12-20 10:08:20 -08:00