Yuta Hayashibe
8d2b568897
Improve install script ( #144 )
...
* Use `python3` instead of `curl` and `jq`
* Use quotes to prevent word splitting
* Remove undefined `local` in POSIX sh
* Added `LLAMA_SWAP_DEFAULT_ADDRESS` to customize the server address
* Added `mktemp` to `NEEDS`
2025-05-23 09:39:55 -07:00
Yuta Hayashibe
fb44cf4e08
Fix typos ( #143 )
2025-05-23 08:40:15 -07:00
Benson Wong
02aee4e86d
remove noisy debug print message
2025-05-20 10:43:10 -07:00
Benson Wong
f45896d395
add guard to avoid unnecessary logic in Process.Shutdown
2025-05-20 10:43:09 -07:00
choyuansu
f7e46a359f
Add link to unload endpoint in upstream list ( #140 )
...
* Add link to open /unload
2025-05-20 08:31:44 -07:00
choyuansu
c260907415
Add linux install and uninstall shell scripts ( #139 )
...
Contributes scripts to install and uninstall llama-swap on Linux.
2025-05-19 12:03:33 -07:00
Benson Wong
b83a5fa291
make Failed state recoverable ( #137 )
...
A process in the failed state can transition to stopped either by calling /unload or swapping to another model.
2025-05-16 19:54:44 -07:00
Benson Wong
6e2ff28d59
improve cmdStop docs [no ci]
2025-05-16 13:52:04 -07:00
Benson Wong
a8b81f2799
Add cmdStop for custom stopping instructions ( #136 )
...
Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration overrides the default behaviour and enables better integration with other process/container managers such as Docker or Podman.
2025-05-16 13:48:42 -07:00
Benson Wong
f9ee7156dc
update configuration examples for multiline yaml commands #133
2025-05-16 11:45:39 -07:00
fakezeta
2d00120781
Update proxymanager.go ( #135 )
2025-05-16 06:45:09 -07:00
Benson Wong
afc9aef058
Fix #133 SanitizeCommand removes comments ( #134 )
2025-05-15 15:28:50 -07:00
Benson Wong
d7b390df74
Add GH Action for Testing on Windows ( #132 )
...
* Add Windows-specific test changes
* Change the command line parsing library - possible breaking changes for Windows users!
2025-05-14 21:51:53 -07:00
Benson Wong
5025c2f1f3
Add GH windows tests (not working yet)
2025-05-14 19:58:22 -07:00
Benson Wong
e3a0b013c1
add content length test for #131
2025-05-14 19:50:01 -07:00
Fadenfire
f5763a94a0
Fix content length being incorrect when useModelName is used ( #131 )
...
* Fix content length being incorrect when useModelName is used
* Update c.Request.ContentLength as well
2025-05-14 19:37:54 -07:00
Benson Wong
8ada72eb57
Update issue templates
2025-05-14 16:36:32 -07:00
Benson Wong
2441b383d3
Make checking for process killed status more robust
2025-05-14 16:26:56 -07:00
Benson Wong
25f251699c
Prevent StateFailed after SIGKILL ( #129 )
...
Closes #125
2025-05-14 10:47:35 -07:00
Benson Wong
7f37bcc6eb
Improve testing around using SIGKILL ( #127 )
...
* Add test for SIGKILL of process
* silence TestProxyManager_RunningEndpoint debug output
* Ref #125
2025-05-13 21:21:52 -07:00
Benson Wong
519c3a4d22
Change /unload to not wait for inflight requests ( #125 )
...
Sometimes an upstream accepts HTTP connections but never responds, causing
requests to build up while waiting for a response. This can block
Process.Stop(), which waits for inflight requests to finish. This change
refactors the code to not wait when shutting down the process.
2025-05-13 11:39:19 -07:00
Benson Wong
9dc4bcb46c
Add a concurrency limit to Process.ProxyRequest ( #123 )
2025-05-12 18:12:52 -07:00
Benson Wong
cb876c143b
update example config
2025-05-12 10:20:18 -07:00
Sam
bc652709a5
Add config hot-reload ( #106 )
...
Introduce the `--watch-config` command line option, which reloads ProxyManager when the configuration changes.
2025-05-11 17:37:00 -07:00
Thammachart Chinvarapon
9548931258
ci: re-enabled intel build pipeline ( #121 )
2025-05-11 00:19:57 -07:00
Benson Wong
5c5a5da664
Update README.md
...
removed extra section.
2025-05-06 06:59:15 -07:00
Benson Wong
aa9ef59aa5
Create .coderabbit.yaml
2025-05-05 19:47:23 -07:00
Benson Wong
09e52c0500
Automatic Port Numbers ( #105 )
...
Add automatic port number assignment to the configuration file. The string `${PORT}` in model.cmd and model.proxy is replaced with an actual port number. This also allows model.proxy to be omitted from the configuration.
2025-05-05 17:07:43 -07:00
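The `${PORT}` substitution described in #105 can be illustrated with a short sketch. The helper `substitutePort`, the starting port value, and the example command string are assumptions made for illustration and are not taken from the commit.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// substitutePort replaces every ${PORT} placeholder in s with port.
// Hypothetical helper for illustration; not llama-swap's actual code.
func substitutePort(s string, port int) string {
	return strings.ReplaceAll(s, "${PORT}", strconv.Itoa(port))
}

func main() {
	port := 5800 // assumed port chosen for this model

	// Example values standing in for model.cmd and model.proxy.
	cmd := "llama-server --port ${PORT} -m model.gguf"
	proxy := "http://localhost:${PORT}"

	fmt.Println(substitutePort(cmd, port))
	fmt.Println(substitutePort(proxy, port))
}
```

Because the same port is substituted into both strings, the proxy URL can be derived from the port alone, which is consistent with model.proxy becoming optional.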
Benson Wong
ca9063ffbe
ensure aliases are unique ( #116 )
2025-05-05 15:34:18 -07:00
Benson Wong
21d7973d11
Improve content-length handling ( #115 )
...
ref: See #114
* Improve content-length handling
- Content length was not always being sent
- Add tests for content-length
2025-05-05 10:46:26 -07:00
Yi Hong Ang
cc450e9c5f
fix issue where proxy is still proxying with chunked transfer-encoding ( #114 )
2025-05-05 10:00:03 -07:00
Benson Wong
27465fe053
fix bug caused by missing early return statements, fixes #112
2025-05-05 09:32:44 -07:00
Benson Wong
9667989727
Disabling intel container build since it's been broken for weeks.
2025-05-04 21:39:42 -07:00
Benson Wong
d9a1ddea0d
Truncate web logs to 100K characters ( #111 )
...
* set log limit to 100K in browser
2025-05-02 23:43:21 -07:00
Benson Wong
e7ab024ca0
small locking optimization
2025-05-02 23:18:07 -07:00
Benson Wong
448ccae959
Introduce Groups Feature ( #107 )
...
Groups allow more control over swapping behaviour when a model is requested. The new groups feature provides three ways to control swapping: swapping within the group, swapping out other groups, or keeping the models in the group loaded persistently (never swapped out).
Closes #96 , #99 and #106 .
2025-05-02 22:35:38 -07:00
Benson Wong
ec0348e431
Reduce stale time for issues
2025-04-29 21:16:34 -07:00
Benson Wong
06eda7f591
tag all process logs with its ID ( #103 )
...
Makes it easier to identify which Process a log message came from
2025-04-25 12:58:25 -07:00
Benson Wong
5fad24c16f
Make checkHealthTimeout Interruptible during startup ( #102 )
...
interrupt and exit Process.start() early if the upstream process exits prematurely or unexpectedly.
2025-04-24 14:39:33 -07:00
Benson Wong
8404244fab
Moderate security update for golang/x/net -> v0.38.0
2025-04-24 09:58:40 -07:00
Benson Wong
712cd01081
fix confusing INFO message [no ci]
2025-04-24 09:56:20 -07:00
Benson Wong
1f7aa359b1
Update header image
...
AI has finally made my dreams of llamas in funny clothing and stuck in
a claw machine waiting to be picked come true!
2025-04-23 13:02:12 -07:00
Benson Wong
b138d6cf25
fix starhistory in README
2025-04-15 20:23:46 -07:00
Benson Wong
fb7c808082
add timing for Process start, stop, total request time ( #91 )
2025-04-14 14:34:59 -07:00
Benson Wong
a7e640b0f7
add aider example
2025-04-07 12:37:14 -07:00
Benson Wong
593604dfdc
Show proxy and upstream logs in separate columns in logs UI
2025-04-05 10:36:54 -07:00
Benson Wong
b8f888f864
Logging Improvements ( #88 )
...
This change revamps the internal logging architecture to be more flexible and descriptive. Previously all logs from both llama-swap and upstream services were mixed together, which made it harder to troubleshoot and identify problems. This PR adds these new endpoints:
- `/logs/stream/proxy` - just llama-swap's logs
- `/logs/stream/upstream` - stdout output from the upstream server
2025-04-04 21:01:33 -07:00
Benson Wong
192b2ae621
Remove no longer needed test
2025-04-04 14:46:01 -07:00
Benson Wong
b7f8cb5094
Limit Access-Control-Allow-Origin to OPTIONS preflight requests #85
2025-04-04 14:44:35 -07:00
Benson Wong
a23da6eb57
Sanitize CORS headers ( #85 )
...
Add sanitization step for `Access-Control-Allow-Headers` when echoing back user-supplied headers
2025-04-01 08:43:53 -07:00
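One way to sanitize a user-supplied header list before echoing it back in `Access-Control-Allow-Headers`, as #85 describes, is to keep only entries that are valid HTTP header names. The sketch below takes that approach. The function names are hypothetical and the commit's actual filtering rules may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// isTokenChar reports whether c may appear in an HTTP header field name
// (the "token" characters from RFC 7230).
func isTokenChar(c rune) bool {
	if (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || (c >= '0' && c <= '9') {
		return true
	}
	return strings.ContainsRune("!#$%&'*+-.^_`|~", c)
}

// sanitizeHeaderList drops any comma-separated entry that is not a valid
// header name, so control characters or injected lines are never echoed.
// Hypothetical helper for illustration; not llama-swap's actual code.
func sanitizeHeaderList(v string) string {
	var kept []string
	for _, name := range strings.Split(v, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue
		}
		valid := true
		for _, c := range name {
			if !isTokenChar(c) {
				valid = false
				break
			}
		}
		if valid {
			kept = append(kept, name)
		}
	}
	return strings.Join(kept, ", ")
}

func main() {
	requested := "Content-Type, X-Bad\r\nInjected: 1, Authorization"
	fmt.Println(sanitizeHeaderList(requested))
}
```

Rejecting whole entries, rather than stripping bad characters from them, avoids turning a malformed value into a different but valid-looking header name.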