294 Commits

Author SHA1 Message Date
Benson Wong
e9e88fd229 rename proxy.go to proxymanager.go 2024-11-18 15:30:34 -08:00
Benson Wong
c3b4bb1684 use gin for http server 2024-11-18 15:30:16 -08:00
Benson Wong
e5c909ddf7 add tests for proxy.Process 2024-11-17 20:49:14 -08:00
Benson Wong
36a31f450f add proxy.Process to manage upstream proxy logic 2024-11-17 16:41:15 -08:00
Benson Wong
a8e5ee13b9 Add logging with pipes example to README 2024-11-15 09:10:43 -08:00
Benson Wong
5944a86e86 fix early timeout bug 2024-11-09 20:08:40 -08:00
Benson Wong
63d4a7d0eb Improve LogMonitor to handle empty writes and ensure buffer immutability
- Add a check to return immediately if the write buffer is empty
- Create a copy of new history data to ensure it is immutable
- Update the `GetHistory` method to use the `any` type for the buffer interface
- Add a test case to verify that the buffer remains unchanged
  even if the original message is modified after writing
2024-11-02 10:41:23 -07:00
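The two behaviours named in the commit above (return early on an empty write, and copy incoming bytes before storing them) can be sketched as follows. This is a minimal illustration only: `historyWriter`, its `chunks` field, and the `History` method are hypothetical names, not llama-swap's actual `LogMonitor` code.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// historyWriter is a hypothetical stand-in for a log monitor.
// It is NOT the project's LogMonitor; names and layout are assumptions.
type historyWriter struct {
	mu     sync.Mutex
	chunks [][]byte
}

// Write implements io.Writer. It ignores empty writes and stores a private
// copy of p, so later changes to the caller's slice cannot alter the history.
func (h *historyWriter) Write(p []byte) (int, error) {
	if len(p) == 0 {
		return 0, nil
	}
	c := make([]byte, len(p))
	copy(c, p)

	h.mu.Lock()
	h.chunks = append(h.chunks, c)
	h.mu.Unlock()
	return len(p), nil
}

// History returns all stored bytes joined into one freshly allocated slice.
func (h *historyWriter) History() []byte {
	h.mu.Lock()
	defer h.mu.Unlock()
	return bytes.Join(h.chunks, nil)
}

func main() {
	h := &historyWriter{}
	msg := []byte("hello")
	h.Write(msg)
	msg[0] = 'j' // mutate the caller's buffer after writing
	fmt.Printf("%s\n", h.History())
}
```

Storing `p` directly in `chunks` would alias the caller's buffer, which is the situation the commit's test case guards against.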
Benson Wong
f45469f7ff Merge pull request #8 from mostlygeek/improve-upstream-monitoring-issue5
Improvements to handling of the upstream process so that an error is returned as soon as either of the following occurs:

    the health check timeout is reached waiting for the upstream process to be ready
    the upstream process exits unexpectedly

With this change llama-swap is more compatible with use cases like containerized upstream services (#5) which pull the container before HTTP endpoints are ready.
2024-11-01 15:28:06 -07:00
Benson Wong
34f9fd7340 Improve timeout and exit handling of child processes. fix #3 and #5
Previously, llama-swap waited a maximum of 5 seconds for an upstream
HTTP server to become available; if it took longer than that, the
request failed with an error. Now it waits until the configured
healthCheckTimeout elapses or the upstream process exits unexpectedly.
2024-11-01 14:32:39 -07:00
Benson Wong
8448efa7fc revise health check logic to not error on 5 second timeout 2024-11-01 09:42:37 -07:00
Benson Wong
8cf2a389d8 Refactor log implementation
- use []byte instead of unnecessary string conversions
- make LogManager.Broadcast private
- make LogManager.GetHistory public
- add tests
2024-10-31 12:16:54 -07:00
Benson Wong
0f133f5b74 Add /logs endpoint to monitor upstream processes
- outputs last 10KB of logs from upstream processes
- supports streaming
2024-10-30 21:02:30 -07:00
Benson Wong
1510b3fbd9 clean up README 2024-10-22 10:37:45 -07:00
Benson Wong
0f8a8e70f1 add header image 2024-10-22 10:30:30 -07:00
Benson Wong
6c3819022c Add compatibility with OpenAI /v1/models endpoint to list models 2024-10-21 15:38:12 -07:00
Benson Wong
8580f0f733 Merge pull request #6 from mostlygeek/multiline-config
Support multiline cmds in YAML configuration
2024-10-19 20:07:36 -07:00
Benson Wong
be82d1a6a0 Support multiline cmds in YAML configuration
Add support for multiline `cmd` configurations allowing for nicer looking configuration YAML files.
2024-10-19 20:06:59 -07:00
Benson Wong
6cf0962807 Add custom check endpoint
Replace previously hardcoded value for /health to check when the server became ready to serve traffic. With this the server can support any server that provides an OpenAI-compatible inference endpoint.
2024-10-11 22:02:14 -07:00
Benson Wong
8eb5b7b6c4 Add custom check endpoint
Replace previously hardcoded value for `/health` to check when the
server became ready to serve traffic. With this the server can support
any server that provides an OpenAI-compatible inference endpoint.
2024-10-11 21:59:21 -07:00
Benson Wong
5a57688aa8 add .vscode to .gitignore 2024-10-05 19:37:00 -07:00
Benson Wong
b79b7ef3d9 add goreleaser config to limit GOOS and GOARCH builds 2024-10-04 21:46:55 -07:00
Benson Wong
476086c066 Add Cmd.Wait() to prevent creation of zombie child processes see: #1 2024-10-04 21:38:29 -07:00
Benson Wong
4fae7cf946 update docs 2024-10-04 21:11:08 -07:00
Benson Wong
cc944251df update README 2024-10-04 20:43:48 -07:00
Benson Wong
ef05c05f9c renaming to llama-swap 2024-10-04 20:21:11 -07:00
Benson Wong
ef8d0020f3 release works? 2024-10-04 12:54:14 -07:00
Benson Wong
5a4a41c015 add release thing 2024-10-04 12:53:10 -07:00
Benson Wong
f992f7f52f Create go.yml 2024-10-04 12:45:13 -07:00
Benson Wong
85743ad914 remove the v1/models endpoint, needs improvement 2024-10-04 12:33:41 -07:00
Benson Wong
3e90f8328d add /v1/models endpoint and proxy everything to llama-server 2024-10-04 12:28:50 -07:00
Benson Wong
e0103d1884 build simple-responder with make all 2024-10-04 12:14:10 -07:00
Benson Wong
d682589fb1 support environment variables 2024-10-04 11:55:27 -07:00
Benson Wong
43119e807f add README 2024-10-04 11:37:51 -07:00
Benson Wong
844615bfcc rename to llamagate 2024-10-04 11:09:36 -07:00
Benson Wong
aaca9d889b add Makefile 2024-10-04 11:07:00 -07:00
Benson Wong
bfdba43bd8 improve error handling 2024-10-04 10:55:02 -07:00
Benson Wong
2d387cf373 rename proxy.go to manager.go 2024-10-04 09:39:10 -07:00
Benson Wong
d061819fb1 moved config into proxy package 2024-10-04 09:38:30 -07:00
Benson Wong
7475bf0fff . 2024-10-04 09:31:08 -07:00
Benson Wong
4c2cc1cf57 add license 2024-10-03 21:51:10 -07:00
Benson Wong
83415430ba move proxy logic into the proxy package 2024-10-03 21:35:33 -07:00
Benson Wong
f44faf5a93 move config to its own package 2024-10-03 21:08:11 -07:00
Benson Wong
cb576fb178 replace io.Copy to improve performance sending data to client 2024-10-03 20:33:55 -07:00
Benson Wong
b63b81b121 first commit 2024-10-03 20:20:01 -07:00