Benson Wong
717d64e336
update GUI image in README [skip ci]
2025-06-24 10:38:28 -07:00
Benson Wong
285191e655
Various UI improvements ( #176 )
...
* add retry/backoff to reconnecting log streams
* update favicons
2025-06-23 16:17:21 -07:00
Benson Wong
4236cec03a
Add Filters to Model Configuration ( #174 )
...
llama-swap can strip specific keys in JSON requests. This is useful for removing the ability for clients to set sampling parameters like temperature, top_k, top_p, etc.
2025-06-23 10:52:29 -07:00
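Note on 4236cec03a: the key-stripping idea described in that commit can be sketched in a few lines of Go. This is only an illustration, not llama-swap's actual code. The `stripKeys` helper and the sample request body are hypothetical, and the sketch assumes only top-level keys of a JSON object are removed.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// stripKeys removes the named top-level keys from a JSON object body.
// Hypothetical helper for illustration; not taken from llama-swap.
func stripKeys(body []byte, keys []string) ([]byte, error) {
	// RawMessage keeps every value untouched, so only the key set changes.
	var obj map[string]json.RawMessage
	if err := json.Unmarshal(body, &obj); err != nil {
		return nil, err
	}
	for _, k := range keys {
		delete(obj, k) // deleting a missing key is a no-op
	}
	return json.Marshal(obj)
}

func main() {
	in := []byte(`{"model":"m1","temperature":0.2,"top_p":0.9,"messages":[]}`)
	out, err := stripKeys(in, []string{"temperature", "top_k", "top_p"})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(out))
}
```

Because the body is re-encoded, key order and whitespace may differ from the client's original request; the remaining values are preserved as sent.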
Alex O'Connell
756193d0dd
Load models in the UI without navigating the page ( #173 )
...
* Load models in the UI without navigating the page
* fix table layout for mobile
2025-06-19 14:39:07 -07:00
Benson Wong
a6b2e930d8
Update README.md [skip ci]
2025-06-18 11:47:08 -07:00
Benson Wong
9e02c22ff8
stopCmd should use same environment as p.cmd.Env ( #171 , #172 )
2025-06-18 11:36:59 -07:00
Benson Wong
0bdbf2fdc1
fix more goreleaser deprecation warnings [skip ci]
2025-06-18 11:15:12 -07:00
Benson Wong
49035e2e8e
Append custom env vars instead of replace in Process ( #171 )
...
Append custom env vars instead of replace in Process (#168 , #169 )
PR #162 refactored the default configuration code. This
introduced a subtle bug where `env` became `[]string{}` instead of the
default of `nil`.
In Go, `exec.Cmd.Env == nil` means "use the current process's
environment". Setting it to `[]string{}` by default emptied the Process's
environment, which caused an array of strange and
difficult-to-troubleshoot behaviour. See issues #168 and #169.
This commit changes the behaviour to append model configured environment
variables to the default list rather than replace them.
2025-06-18 11:09:13 -07:00
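Note on 49035e2e8e: the append-instead-of-replace behaviour described in that commit can be sketched as follows. This is a minimal illustration, not the project's code. `buildEnv`, the `env` command, and the sample variable are assumptions; `os.Environ`, `exec.Command`, and `exec.Cmd.Env` are standard library.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// buildEnv returns the parent process environment with the model's extra
// entries appended. Hypothetical helper for illustration.
func buildEnv(extra []string) []string {
	// os.Environ returns a copy, so appending does not alter the parent's env.
	return append(os.Environ(), extra...)
}

func main() {
	cmd := exec.Command("env") // placeholder command; it is never started here

	// Leaving cmd.Env nil would inherit the parent environment.
	// Assigning []string{} would give the child an empty environment.
	cmd.Env = buildEnv([]string{"CUDA_VISIBLE_DEVICES=0"})

	fmt.Println("entries passed to child:", len(cmd.Env))
}
```

When `Env` contains duplicate keys, `os/exec` uses the last value for each key, so appended entries take precedence over inherited ones.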
Benson Wong
9963ae18bf
fix? deprecation warning in .goreleaser.yaml [skip-ci]
2025-06-18 07:49:33 -07:00
Benson Wong
2ae48c713b
add debug output for start command
2025-06-18 07:43:23 -07:00
Benson Wong
54c519e365
update Makefile to install ui deps
2025-06-17 09:54:01 -07:00
Benson Wong
3fce9ee0e9
Update README.md [skip ci]
2025-06-17 09:53:22 -07:00
Benson Wong
5899ae7966
Update README.md [skip ci]
2025-06-17 09:52:47 -07:00
Benson Wong
591a9cdf4d
update release.yml
2025-06-16 16:50:25 -07:00
Benson Wong
9a3c656738
New UI ( #157 , #164 )
...
- Add a react UI to replace the plain HTML one.
- Serve as a foundation for better GUI interactions
2025-06-16 16:45:19 -07:00
Benson Wong
75015f82ea
fix bug caused by macro replacement order ( #166 )
...
User-defined macros should be applied before checking for the ${PORT} constraint in model.cmd and model.proxy.
2025-06-16 15:32:09 -07:00
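Note on 75015f82ea: the ordering problem is easiest to see when a macro's value itself contains `${PORT}`. The sketch below is illustrative only. `expandMacros`, `usesPort`, the `${name}` macro syntax, and the sample macro are assumptions, not the project's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// expandMacros replaces every ${name} with its value. Hypothetical helper:
// a single pass, so macros that reference other macros are not handled.
func expandMacros(s string, macros map[string]string) string {
	for name, value := range macros {
		s = strings.ReplaceAll(s, "${"+name+"}", value)
	}
	return s
}

// usesPort reports whether the command references ${PORT} once user-defined
// macros have been expanded.
func usesPort(cmd string, macros map[string]string) bool {
	return strings.Contains(expandMacros(cmd, macros), "${PORT}")
}

func main() {
	macros := map[string]string{"server": "llama-server --port ${PORT}"}
	cmd := "${server} -m model.gguf"

	// Checking the raw string misses the ${PORT} hidden inside the macro.
	fmt.Println("before expansion:", strings.Contains(cmd, "${PORT}"))
	fmt.Println("after expansion: ", usesPort(cmd, macros))
}
```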
Thammachart Chinvarapon
cc33b6c270
restore intel docker builds ( #163 )
2025-06-16 11:13:49 -07:00
Benson Wong
4fa12a429c
Refactor all default config values into config.go ( #162 )
...
- Move all default values into one place.
- Update tests to be more cross platform
2025-06-15 12:32:00 -07:00
Benson Wong
2dc0ca0663
improve llama-swap upstream process recovery and restarts ( #155 )
...
Refactor internal upstream process life cycle management to recover better from unexpected situations. With this change llama-swap should never need to be restarted due to a crashed upstream child process. The `StateFailed` state was removed in favour of always trying to start/restart a process.
2025-06-05 16:24:55 -07:00
Daniel Hofer
a84098d3b4
Add missing object type to /v1/models endpoint ( #154 )
2025-06-02 09:25:45 -07:00
Benson Wong
4d02ccd26a
Update README.md [skip ci]
2025-05-30 09:38:45 -07:00
Benson Wong
dfd47eeac4
Readme updates [skip ci]
2025-05-30 09:19:08 -07:00
Benson Wong
1ac6499c08
Add macros to Configuration schema ( #149 )
...
* Add macros to Configuration schema
* update docs
2025-05-29 21:51:25 -07:00
Benson Wong
25f3dc25e7
small doc update [skip ci]
2025-05-26 16:03:27 -07:00
Benson Wong
8422e4e6a1
move some docs to the wiki [no-ci]
2025-05-26 15:46:08 -07:00
Benson Wong
02ee29d881
increase default healthCheckTimeout to 120s
2025-05-26 09:57:53 -07:00
Benson Wong
b2a891f8f4
Disable building of intel container until it's fixed upstream
2025-05-23 22:54:43 -07:00
Yuta Hayashibe
8d2b568897
Improve install script ( #144 )
...
* Use `python3` instead of `curl` and `jq`
* Use quotes to prevent word splitting
* Remove undefined `local` in POSIX sh
* Added `LLAMA_SWAP_DEFAULT_ADDRESS` to customize the server address
* Added `mktemp` to `NEEDS`
2025-05-23 09:39:55 -07:00
Yuta Hayashibe
fb44cf4e08
Fix typos ( #143 )
2025-05-23 08:40:15 -07:00
Benson Wong
02aee4e86d
remove noisy debug print message
2025-05-20 10:43:10 -07:00
Benson Wong
f45896d395
add guard to avoid unnecessary logic in Process.Shutdown
2025-05-20 10:43:09 -07:00
choyuansu
f7e46a359f
Add link to unload endpoint in upstream list ( #140 )
...
* Add link to open /unload
2025-05-20 08:31:44 -07:00
choyuansu
c260907415
Add linux install and uninstall shell scripts ( #139 )
...
Contributes scripts to install and uninstall llama-swap on Linux.
2025-05-19 12:03:33 -07:00
Benson Wong
b83a5fa291
make Failed state recoverable ( #137 )
...
A process in the failed state can transition to stopped either by calling /unload or swapping to another model.
2025-05-16 19:54:44 -07:00
Benson Wong
6e2ff28d59
improve cmdStop docs [no ci]
2025-05-16 13:52:04 -07:00
Benson Wong
a8b81f2799
Add stopCmd for custom stopping instructions ( #136 )
...
Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration overrides the default behaviour and enables better integration with other process/container managers like docker or podman.
2025-05-16 13:48:42 -07:00
Benson Wong
f9ee7156dc
update configuration examples for multiline yaml commands #133
2025-05-16 11:45:39 -07:00
fakezeta
2d00120781
Update proxymanager.go ( #135 )
2025-05-16 06:45:09 -07:00
Benson Wong
afc9aef058
Fix #133 SanitizeCommand removes comments ( #134 )
2025-05-15 15:28:50 -07:00
Benson Wong
d7b390df74
Add GH Action for Testing on Windows ( #132 )
...
* Add windows specific test changes
* Change the command line parsing library - Possible breaking changes for windows users!
2025-05-14 21:51:53 -07:00
Benson Wong
5025c2f1f3
Add GH windows tests (not working yet)
2025-05-14 19:58:22 -07:00
Benson Wong
e3a0b013c1
add content length test for #131
2025-05-14 19:50:01 -07:00
Fadenfire
f5763a94a0
Fix content length being incorrect when useModelName is used ( #131 )
...
* Fix content length being incorrect when useModelName is used
* Update c.Request.ContentLength as well
2025-05-14 19:37:54 -07:00
Benson Wong
8ada72eb57
Update issue templates
2025-05-14 16:36:32 -07:00
Benson Wong
2441b383d3
Make checking for process killed status more robust
2025-05-14 16:26:56 -07:00
Benson Wong
25f251699c
Prevent StateFailed after SIGKILL ( #129 )
...
Closes #125
2025-05-14 10:47:35 -07:00
Benson Wong
7f37bcc6eb
Improve testing around using SIGKILL ( #127 )
...
* Add test for SIGKILL of process
* silence TestProxyManager_RunningEndpoint debug output
* Ref #125
2025-05-13 21:21:52 -07:00
Benson Wong
519c3a4d22
Change /unload to not wait for inflight requests ( #125 )
...
Sometimes upstreams accept HTTP connections but never respond, causing
requests to build up waiting for a response. This can block Process.Stop(),
which waits for inflight requests to finish. This change refactors the
code to not wait when attempting to shut down the process.
2025-05-13 11:39:19 -07:00
Benson Wong
9dc4bcb46c
Add a concurrency limit to Process.ProxyRequest ( #123 )
2025-05-12 18:12:52 -07:00
Benson Wong
cb876c143b
update example config
2025-05-12 10:20:18 -07:00