Commit Graph

250 Commits

Author SHA1 Message Date
Benson Wong
9a54273d15 Update UI with new Activity event stream from #195
- use new metrics data instead of log parsing
- auto-start events connection to server, improves responsiveness
- remove unnecessary libraries and code
2025-07-21 22:42:30 -07:00
g2mt
87dce5f8f6 Add metrics logging for chat completion requests (#195)
- Add token and performance metrics  for v1/chat/completions 
- Add Activity Page in UI
- Add /api/metrics endpoint

Contributed by @g2mt
2025-07-21 22:19:55 -07:00
Benson Wong
307e619521 remove old eventsources from UI 2025-07-19 15:36:40 -07:00
Benson Wong
6299c1b874 Fix High CPU (#189)
* vendor in kelindar/event lib and refactor to remove time.Ticker
2025-07-15 18:04:30 -07:00
Yathi
a906cd459b Strip comments before macro expansion in config (#193)
A bug fix that ensures comments don't interfere with macro expansion by
removing them first. This prevents unwanted comment text from appearing
in the final expanded command.

Co-authored-by: Yathiraj Bollimbala G <yathi@yStudio.localdomain>
2025-07-15 10:14:16 -07:00
Benson Wong
78b2bc3dbc add toggle to hide/show unlisted models (#187) 2025-07-02 16:14:20 -07:00
Benson Wong
6a058e4191 Change fsnotify to watch config directory instead of file
The fsnotify library suggests watching a directory and checking that the
name matches the configuration file.
2025-07-02 10:23:52 -07:00
Benson Wong
1921e570d7 Add Event Bus (#184)
Major internal refactor to use an event bus to pass event/messages along. These changes are largely invisible user facing but sets up internal design for real time stats and information.

- `--watch-config` logic refactored for events
- remove multiple SSE api endpoints, replaced with /api/events
- keep all functionality essentially the same
- UI/backend sync is in near real time now
2025-07-01 22:17:35 -07:00
Benson Wong
c867a6c9a2 Add name and description to v1/models list (#179)
* Add support for name and description in v1/models list
* add configuration example for name and description
2025-06-30 23:02:44 -07:00
Leoyzen
3bd1b23ce0 fix config hot-reload on k8s (#181)
Co-authored-by: Leoyzen <leoyzen@gmial.com>
2025-06-27 11:49:31 -07:00
srevn
10606abf89 fix config hot-reload on macos (#180)
Co-authored-by: srevn <srevn@github>
2025-06-26 09:20:50 -07:00
Benson Wong
fefd14903d improve log display and add a small stats table in ui (#178) 2025-06-25 12:27:49 -07:00
Benson Wong
717d64e336 update GUI image in README [skip ci] 2025-06-24 10:38:28 -07:00
Benson Wong
285191e655 Various UI improvements (#176)
* add retry/backoff to reconnecting log streams
* update favicons
2025-06-23 16:17:21 -07:00
Benson Wong
4236cec03a Add Filters to Model Configuration (#174)
llama-swap can strip specific keys in JSON requests. This is useful for removing the ability for clients to set sampling parameters like temperature, top_k, top_p, etc.
2025-06-23 10:52:29 -07:00
Alex O'Connell
756193d0dd Load models in the UI without navigating the page (#173)
* Load models in the UI without navigating the page

* fix table layout for mobile
2025-06-19 14:39:07 -07:00
Benson Wong
a6b2e930d8 Update README.md [skip ci] 2025-06-18 11:47:08 -07:00
Benson Wong
9e02c22ff8 stopCmd should use same environment as p.cmd.Env (#171, #172) 2025-06-18 11:36:59 -07:00
Benson Wong
0bdbf2fdc1 fix more goreleaser deprecation warnings [skip ci] 2025-06-18 11:15:12 -07:00
Benson Wong
49035e2e8e Append custom env vars instead of replace in Process (#171)
Append custom env vars instead of replace in Process (#168, #169)

PR #162 refactored the default configuration code. This
introduced a subtle bug where `env` became `[]string{}` instead of the
default of `nil`.

In golang, `exec.Cmd.Env == nil` means to use the "current process's
environment". By setting it to `[]string{}` as a default the Process's
environment was emptied out which caused an array of strange and
difficult to troubleshoot behaviour. See issues #168 and #169

This commit changes the behaviour to append model configured environment
variables to the default list rather than replace them.
2025-06-18 11:09:13 -07:00
Benson Wong
9963ae18bf fix? deprecation warning in .goreleaser.yaml [skip-ci] 2025-06-18 07:49:33 -07:00
Benson Wong
2ae48c713b add debug output for start command 2025-06-18 07:43:23 -07:00
Benson Wong
54c519e365 update Makefile to install ui deps 2025-06-17 09:54:01 -07:00
Benson Wong
3fce9ee0e9 Update README.md [skip ci] 2025-06-17 09:53:22 -07:00
Benson Wong
5899ae7966 Update README.md [skip ci] 2025-06-17 09:52:47 -07:00
Benson Wong
591a9cdf4d update release.yml 2025-06-16 16:50:25 -07:00
Benson Wong
9a3c656738 New UI (#157, #164)
- Add a react UI to replace the plain HTML one. 
- Serve as a foundation for better GUI interactions
2025-06-16 16:45:19 -07:00
Benson Wong
75015f82ea fix bug caused by macro replacement order (#166)
User defined macros should be applied before checking for ${PORT} constraint in model.cmd and model.proxy.
2025-06-16 15:32:09 -07:00
Thammachart Chinvarapon
cc33b6c270 restore intel docker builds (#163) 2025-06-16 11:13:49 -07:00
Benson Wong
4fa12a429c Refactor all default config values into config.go (#162)
- Move all default values into one place.
- Update tests to be more cross platform
2025-06-15 12:32:00 -07:00
Benson Wong
2dc0ca0663 improve llama-swap upstream process recovery and restarts (#155)
Refactor internal upstream process life cycle management to recover better from unexpected situations. With this change llama-swap should never need to be restarted due to a crashed upstream child process.  The `StateFailed` state was removed in favour of always trying to start/restart a process.
2025-06-05 16:24:55 -07:00
Daniel Hofer
a84098d3b4 Add missing object type to /v1/models endpoint (#154) 2025-06-02 09:25:45 -07:00
Benson Wong
4d02ccd26a Update README.md [skip ci] 2025-05-30 09:38:45 -07:00
Benson Wong
dfd47eeac4 Readme updates [skip ci] 2025-05-30 09:19:08 -07:00
Benson Wong
1ac6499c08 Add macros to Configuration schema (#149)
* Add macros to Configuration schema
* update docs
2025-05-29 21:51:25 -07:00
Benson Wong
25f3dc25e7 small doc update [skip ci] 2025-05-26 16:03:27 -07:00
Benson Wong
8422e4e6a1 move some docs to the wiki [no-ci] 2025-05-26 15:46:08 -07:00
Benson Wong
02ee29d881 increase default healthCheckTimeout to 120s 2025-05-26 09:57:53 -07:00
Benson Wong
b2a891f8f4 Disable building of intel container until it's fixed upstream 2025-05-23 22:54:43 -07:00
Yuta Hayashibe
8d2b568897 Improve install script (#144)
* Use `python3` instead of `curl` and `jq`

* Use quote to word splitting

* Remove undefined `local` in POSIX sh

* Added `LLAMA_SWAP_DEFAULT_ADDRESS` to customize the server address

* Added `mktemp` to `NEEDS`
2025-05-23 09:39:55 -07:00
Yuta Hayashibe
fb44cf4e08 Fix typos (#143) 2025-05-23 08:40:15 -07:00
Benson Wong
02aee4e86d remove noisy debug print message 2025-05-20 10:43:10 -07:00
Benson Wong
f45896d395 add guard to avoid unnecessary logic in Process.Shutdown 2025-05-20 10:43:09 -07:00
choyuansu
f7e46a359f Add link to unload endpoint in upstream list (#140)
* Add link to open /unload
2025-05-20 08:31:44 -07:00
choyuansu
c260907415 Add linux install and uninstall shell scripts (#139)
Contribution for install, and uninstall llama-swap in linux.
2025-05-19 12:03:33 -07:00
Benson Wong
b83a5fa291 make Failed stated recoverable (#137)
A process in the failed state can transition to stopped either by calling /unload or swapping to another model.
2025-05-16 19:54:44 -07:00
Benson Wong
6e2ff28d59 improve cmdStop docs [no ci] 2025-05-16 13:52:04 -07:00
Benson Wong
a8b81f2799 Add stopCmd for custom stopping instructions (#136)
Allow configuration of how a model is stopped before swapping. Setting `cmdStop` in the configuration will override the default behaviour and enables better integration with other process/container managers like docker or podman.
2025-05-16 13:48:42 -07:00
Benson Wong
f9ee7156dc update configuration examples for multiline yaml commands #133 2025-05-16 11:45:39 -07:00
fakezeta
2d00120781 Update proxymanager.go (#135) 2025-05-16 06:45:09 -07:00