Commit Graph

294 Commits

Author SHA1 Message Date
Benson Wong
2457840698 Update README.md [skip ci] 2025-08-28 23:44:37 -07:00
Benson Wong
7f55494151 Update README.md [skip ci] 2025-08-28 22:47:28 -07:00
Benson Wong
831a90d3b0 Add different timeout scenarios to Process.checkHealthEndpoint #276 (#278)
- add a TCP connection timeout of 500ms
- increase HTTP client timeout to 5000ms

In this new behaviour the upstream has 500ms to accept a tcp connection
and 5000ms to respond to the HTTP request.
2025-08-28 22:03:14 -07:00
Yandrik
977f1856bb add /completion endpoint (#275)
* feat: add /completion endpoint
* chore: reformat using gofmt
2025-08-28 21:41:02 -07:00
Benson Wong
52b329f7bc Fix #277 race condition in ProcessGroup.ProxyRequest when swap=true 2025-08-28 21:38:40 -07:00
Benson Wong
57803fd3aa Support llama-server's /infill endpoint (#272)
Add support for llama-server's /infill endpoint and metrics gathering on the Activities page.
2025-08-27 08:36:05 -07:00
Benson Wong
c55d0cc842 Add docs for model.concurrencyLimit #263 [skip ci] 2025-08-22 16:08:37 -07:00
Benson Wong
7acbaf4712 Add connection status indicator in UI (#260)
* show connection status as icon in UI title
* make connection status event driven
2025-08-20 13:58:24 -07:00
Benson Wong
fcc5ad135a UI: Allow editing of title (#246)
- make <h1> title contentEditable
- title setting persists across reloads in localStorage
2025-08-17 09:42:06 -07:00
Benson Wong
305e5a0031 improve example config [skip ci] 2025-08-17 09:19:04 -07:00
Benson Wong
04fc67354a Improve Activity event handling in the UI (#254)
Improve Activity event handling in the UI

- fixes #252 found that the Activity page showed activity inconsistent
  with /api/metrics
- Change data structure for event metrics to array.
- Add Event stream connections status indicator
2025-08-15 21:44:08 -07:00
Benson Wong
4662cf7699 add 'unconfirmed bug' as default label in bug-report.md 2025-08-15 15:38:12 -07:00
Benson Wong
5dc6b3e6d9 Add barebones but working implementation of model preload (#209, #235)
Add barebones but working implementation of model preload

* add config test for Preload hook
* improve TestProxyManager_StartupHooks
* docs for new hook configuration
* add a .dev to .gitignore
2025-08-14 10:27:28 -07:00
Benson Wong
74c69f39ef Add prompt processing metrics (#250)
- capture prompt processing metrics
- display prompt processing metrics on UI Activity page
2025-08-14 10:02:16 -07:00
Benson Wong
a186318892 Update Readme, Add screenshot for Activities page [skip ci] 2025-08-08 13:39:46 -07:00
Benson Wong
c4e4d5e1e9 Update Readme UI Screenshot [skip ci] 2025-08-08 13:33:47 -07:00
Benson Wong
7985e94ba4 add tokens processed to ui models page 2025-08-08 13:28:39 -07:00
Benson Wong
74556c3a36 Update bug-report.md [skip ci] 2025-08-08 09:52:05 -07:00
Benson Wong
5c381e4b30 Add gofmt linting to ci 2025-08-07 20:29:18 -07:00
Benson Wong
10569ed546 Fix model alias usage in upstream path (#230)
Model alias values are not properly resolved and work in upstream/ path.

Related to #229.
2025-08-07 20:16:56 -07:00
Benson Wong
5b10b3c23f UI Tweaks (#228)
* sort model names in UI

* add toggle to show model id/name on UI model page
2025-08-07 11:07:03 -07:00
Benson Wong
45ea792a3a Fix UI panel not saving position correctly 2025-08-06 14:02:22 -07:00
Benson Wong
1bc2802353 fix panels not saving sizing state 2025-08-06 14:00:21 -07:00
Benson Wong
701476c0c4 Update README.md - remove contributor block [skip ci]
Contributor information available on the Github page's sidebar. Redundant.
2025-08-06 11:11:47 -07:00
Ben Greene
5c63e0066c return models sorted by id in /v1/models (#222) 2025-08-06 10:04:52 -07:00
Martin Garton
8be5073c51 Fix typo (#223) [skip ci]
Fix typo `lama-swap` -> `llama-swap`
2025-08-06 10:02:38 -07:00
Aaron Ang
6307bd3205 Add support for building Linux ARM64 binary in Makefile (#221) 2025-08-05 16:26:06 -07:00
Benson Wong
558a72de17 UI Improvements (#219)
- use react-resizable-panels for UI
- improve icons for buttons
- improve mobile layout with drag/resize panels
2025-08-03 17:49:13 -07:00
Leoyzen
dc42cf366d Add config monitor support for k8s configmap. (#217) 2025-08-03 08:05:48 -07:00
Ryein Goddard
ba0a81937a Update README.md (#216)
Update git clone protocol to https
2025-08-01 19:48:09 -07:00
Benson Wong
574fdfabb4 UI improvements (#213)
* use two column for logs view on wider screens

* hide log controls when panel is minimized
2025-07-31 11:59:21 -07:00
Benson Wong
5172cb2e12 Update docs in Readme [skip ci] 2025-07-30 11:51:14 -07:00
Benson Wong
5672cb03fd Update github actions for notifying homebrew build (#212)
Combine homebrew-llama-swap event with the release action
2025-07-30 11:29:03 -07:00
Benson Wong
0f583163f7 add /health (#211) 2025-07-30 10:37:10 -07:00
Benson Wong
7905fa9ea3 Update trigger-homebrew-update.yml [skip ci] 2025-07-30 10:13:49 -07:00
Ian Sebastian Mathew
bbaf172956 add trigger to rebuild homebrew formula (#210) 2025-07-30 10:12:21 -07:00
Benson Wong
fd50932dbc Decouple MetricsMiddleware from downstream handlers (#206)
* Decouple MetricsMiddleware from downstream handlers

Remove ls-real-model-name optimization. Within proxyOAIHandler the
request body's bytes are required for various rewriting features
anyways. This negated any benefits from trying not to parse it twice.
2025-07-27 10:36:06 -07:00
Gaël James
8c693e7fcf Add endpoint aliases for reranking models (#201)
* Add endpoint aliases for reranking models
* Add MetricsMiddleware to the previous reranking endpoint
* Fix the embeddings endpoint not having model set
2025-07-24 08:32:47 -07:00
Benson Wong
8f2af26a41 fix stats on model page 2025-07-23 13:57:33 -07:00
Benson Wong
01d4838fb3 Fix token metrics parsing (#199)
Fix #198

- use llama-server's `timings` info if available in response body
- send "-1" for token/sec when not able to accurately calculate
  performance
- optimize streaming body search for metrics information
2025-07-22 23:10:14 -07:00
Benson Wong
accd65294b add contributors to README [skip ci] 2025-07-21 23:16:48 -07:00
Benson Wong
7472a25864 Update README.md [skip ci]
update screenshot for web UI
2025-07-21 23:08:19 -07:00
Benson Wong
cce0bc6aa1 add guard to ensure ls-real-model-name is set in context 2025-07-21 22:59:41 -07:00
Benson Wong
36e25125e8 UI tidy [skip ci] 2025-07-21 22:47:55 -07:00
Benson Wong
9a54273d15 Update UI with new Activity event stream from #195
- use new metrics data instead of log parsing
- auto-start events connection to server, improves responsiveness
- remove unnecessary libraries and code
2025-07-21 22:42:30 -07:00
g2mt
87dce5f8f6 Add metrics logging for chat completion requests (#195)
- Add token and performance metrics  for v1/chat/completions 
- Add Activity Page in UI
- Add /api/metrics endpoint

Contributed by @g2mt
2025-07-21 22:19:55 -07:00
Benson Wong
307e619521 remove old eventsources from UI 2025-07-19 15:36:40 -07:00
Benson Wong
6299c1b874 Fix High CPU (#189)
* vendor in kelindar/event lib and refactor to remove time.Ticker
2025-07-15 18:04:30 -07:00
Yathi
a906cd459b Strip comments before macro expansion in config (#193)
A bug fix that ensures comments don't interfere with macro expansion by
removing them first. This prevents unwanted comment text from appearing
in the final expanded command.

Co-authored-by: Yathiraj Bollimbala G <yathi@yStudio.localdomain>
2025-07-15 10:14:16 -07:00
Benson Wong
78b2bc3dbc add toggle to hide/show unlisted models (#187) 2025-07-02 16:14:20 -07:00