Benson Wong
f45469f7ff
Merge pull request #8 from mostlygeek/improve-upstream-monitoring-issue5
...
Improve handling of the upstream process so that a request fails as soon as either of these happens, whichever comes first:
- the health check timeout is reached while waiting for the upstream process to become ready
- the upstream process exits unexpectedly
With this change, llama-swap is more compatible with use cases like containerized upstream services (#5), which pull the container before their HTTP endpoints are ready.
2024-11-01 15:28:06 -07:00
Benson Wong
34f9fd7340
Improve timeout and exit handling of child processes. fix #3 and #5
...
llama-swap previously waited a maximum of 5 seconds for an upstream
HTTP server to become available. If startup took longer than that, the
request failed with an error. Now it waits up to the configured
healthCheckTimeout, or until the upstream process exits unexpectedly.
2024-11-01 14:32:39 -07:00
Benson Wong
8448efa7fc
revise health check logic to not error on 5 second timeout
2024-11-01 09:42:37 -07:00
Benson Wong
8cf2a389d8
Refactor log implementation
...
- use []byte instead of unnecessary string conversions
- make LogManager.Broadcast private
- make LogManager.GetHistory public
- add tests
2024-10-31 12:16:54 -07:00
Benson Wong
0f133f5b74
Add /logs endpoint to monitor upstream processes
...
- outputs last 10KB of logs from upstream processes
- supports streaming
2024-10-30 21:02:30 -07:00
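One way to keep only the most recent 10KB of upstream output, as 0f133f5b74 describes, is a size-capped byte buffer. The sketch below is an assumption about the general technique, not the project's implementation: `logHistory`, `newLogHistory`, and `Snapshot` are hypothetical names, and streaming to connected clients is left out.

```go
package main

import (
	"fmt"
	"sync"
)

// logHistory is a hypothetical buffer that retains at most max bytes,
// discarding the oldest data first.
type logHistory struct {
	mu   sync.Mutex
	max  int
	data []byte
}

func newLogHistory(max int) *logHistory {
	return &logHistory{max: max}
}

// Write appends p and trims the buffer to its last max bytes.
// It implements io.Writer, so it can receive a child process's output.
func (h *logHistory) Write(p []byte) (int, error) {
	h.mu.Lock()
	defer h.mu.Unlock()

	h.data = append(h.data, p...)
	if over := len(h.data) - h.max; over > 0 {
		// Copy the tail so the discarded prefix can be garbage collected.
		h.data = append([]byte(nil), h.data[over:]...)
	}
	return len(p), nil
}

// Snapshot returns a copy of the retained bytes.
func (h *logHistory) Snapshot() []byte {
	h.mu.Lock()
	defer h.mu.Unlock()
	return append([]byte(nil), h.data...)
}

func main() {
	h := newLogHistory(10 * 1024) // keep the last 10KB
	fmt.Fprintln(h, "upstream: listening")
	fmt.Printf("%s", h.Snapshot())
}
```

Because the type satisfies `io.Writer`, it could be assigned to the `Stdout` and `Stderr` fields of an `exec.Cmd`; whether llama-swap wires it up that way is not stated in the log.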
Benson Wong
1510b3fbd9
clean up README
2024-10-22 10:37:45 -07:00
Benson Wong
0f8a8e70f1
add header image
2024-10-22 10:30:30 -07:00
Benson Wong
6c3819022c
Add compatibility with OpenAI /v1/models endpoint to list models
2024-10-21 15:38:12 -07:00
Benson Wong
8580f0f733
Merge pull request #6 from mostlygeek/multiline-config
...
Support multiline cmds in YAML configuration
2024-10-19 20:07:36 -07:00
Benson Wong
be82d1a6a0
Support multiline cmds in YAML configuration
...
Add support for multiline `cmd` configurations, allowing for nicer-looking YAML configuration files.
2024-10-19 20:06:59 -07:00
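A multiline `cmd` value has to be turned back into a flat argument list before the process is started. The sketch below shows one simple way to do that, assuming whitespace-separated arguments; `splitMultilineCmd` is a hypothetical helper, the port and model file are illustrative values, and quoted arguments containing spaces are not handled. The commit may well do this differently.

```go
package main

import (
	"fmt"
	"strings"
)

// splitMultilineCmd is a hypothetical helper that flattens a command
// written across several lines into an argument slice.
// strings.Fields splits on any run of whitespace, newlines included.
func splitMultilineCmd(cmd string) []string {
	return strings.Fields(cmd)
}

func main() {
	// Illustrative value, as it might look after the YAML block scalar is parsed.
	cmd := `
llama-server
  --port 8999
  -m model.gguf
`
	args := splitMultilineCmd(cmd)
	fmt.Printf("%q\n", args)
}
```

The first element of the result would be the program name and the rest its arguments, which matches what `exec.Command(args[0], args[1:]...)` expects.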
Benson Wong
6cf0962807
Add custom check endpoint
...
Replace the previously hardcoded /health path, used to check when the upstream server is ready to serve traffic, with a custom check endpoint. With this, llama-swap can support any server that provides an OpenAI-compatible inference endpoint.
2024-10-11 22:02:14 -07:00
Benson Wong
8eb5b7b6c4
Add custom check endpoint
...
Replace the previously hardcoded `/health` path, used to check when the
upstream server is ready to serve traffic, with a custom check endpoint.
With this, llama-swap can support any server that provides an
OpenAI-compatible inference endpoint.
2024-10-11 21:59:21 -07:00
Benson Wong
5a57688aa8
add .vscode to .gitignore
2024-10-05 19:37:00 -07:00
Benson Wong
b79b7ef3d9
add goreleaser config to limit GOOS and GOARCH builds
2024-10-04 21:46:55 -07:00
Benson Wong
476086c066
Add Cmd.Wait() to prevent creation of zombie child processes (see #1)
2024-10-04 21:38:29 -07:00
Benson Wong
4fae7cf946
update docs
2024-10-04 21:11:08 -07:00
Benson Wong
cc944251df
update README
2024-10-04 20:43:48 -07:00
Benson Wong
ef05c05f9c
renaming to llama-swap
2024-10-04 20:21:11 -07:00
Benson Wong
ef8d0020f3
release works?
2024-10-04 12:54:14 -07:00
Benson Wong
5a4a41c015
add release thing
2024-10-04 12:53:10 -07:00
Benson Wong
f992f7f52f
Create go.yml
2024-10-04 12:45:13 -07:00
Benson Wong
85743ad914
remove the v1/models endpoint, needs improvement
2024-10-04 12:33:41 -07:00
Benson Wong
3e90f8328d
add /v1/models endpoint and proxy everything to llama-server
2024-10-04 12:28:50 -07:00
Benson Wong
e0103d1884
build simple-responder with make all
2024-10-04 12:14:10 -07:00
Benson Wong
d682589fb1
support environment variables
2024-10-04 11:55:27 -07:00
Benson Wong
43119e807f
add README
2024-10-04 11:37:51 -07:00
Benson Wong
844615bfcc
rename to llamagate
2024-10-04 11:09:36 -07:00
Benson Wong
aaca9d889b
add Makefile
2024-10-04 11:07:00 -07:00
Benson Wong
bfdba43bd8
improve error handling
2024-10-04 10:55:02 -07:00
Benson Wong
2d387cf373
rename proxy.go to manager.go
2024-10-04 09:39:10 -07:00
Benson Wong
d061819fb1
moved config into proxy package
2024-10-04 09:38:30 -07:00
Benson Wong
7475bf0fff
.
2024-10-04 09:31:08 -07:00
Benson Wong
4c2cc1cf57
add license
2024-10-03 21:51:10 -07:00
Benson Wong
83415430ba
move proxy logic into the proxy package
2024-10-03 21:35:33 -07:00
Benson Wong
f44faf5a93
move config to its own package
2024-10-03 21:08:11 -07:00
Benson Wong
cb576fb178
replace io.Copy to improve performance sending data to client
2024-10-03 20:33:55 -07:00
Benson Wong
b63b81b121
first commit
2024-10-03 20:20:01 -07:00