llama-swap

Author	SHA1	Message	Date
Benson Wong	5944a86e86	fix early timeout bug	2024-11-09 20:08:40 -08:00
Benson Wong	63d4a7d0eb	Improve LogMonitor to handle empty writes and ensure buffer immutability - Add a check to return immediately if the write buffer is empty - Create a copy of new history data to ensure it is immutable - Update the `GetHistory` method to use the `any` type for the buffer interface - Add a test case to verify that the buffer remains unchanged even if the original message is modified after writing	2024-11-02 10:41:23 -07:00
Benson Wong	f45469f7ff	Merge pull request #8 from mostlygeek/improve-upstream-monitoring-issue5 Improvements to handling of the upstream process so that an error is raised as soon as either of these happens: the health check timeout is reached while waiting for the upstream process to be ready, or the upstream process exits unexpectedly. With this change llama-swap is more compatible with use cases like containerized upstream services (#5), which pull the container before HTTP endpoints are ready.	2024-11-01 15:28:06 -07:00
Benson Wong	34f9fd7340	Improve timeout and exit handling of child processes. Fixes #3 and #5. llama-swap previously waited a maximum of 5 seconds for an upstream HTTP server to become available; if startup took longer, the request failed with an error. Now it waits until the configured healthCheckTimeout is reached or the upstream process exits unexpectedly.	2024-11-01 14:32:39 -07:00
Benson Wong	8448efa7fc	revise health check logic to not error on 5 second timeout	2024-11-01 09:42:37 -07:00
Benson Wong	8cf2a389d8	Refactor log implementation - use []byte instead of unnecessary string conversions - make LogManager.Broadcast private - make LogManager.GetHistory public - add tests	2024-10-31 12:16:54 -07:00
Benson Wong	0f133f5b74	Add /logs endpoint to monitor upstream processes - outputs last 10KB of logs from upstream processes - supports streaming	2024-10-30 21:02:30 -07:00
Benson Wong	1510b3fbd9	clean up README	2024-10-22 10:37:45 -07:00
Benson Wong	0f8a8e70f1	add header image	2024-10-22 10:30:30 -07:00
Benson Wong	6c3819022c	Add compatibility with OpenAI /v1/models endpoint to list models	2024-10-21 15:38:12 -07:00
Benson Wong	8580f0f733	Merge pull request #6 from mostlygeek/multiline-config Support multiline cmds in YAML configuration	2024-10-19 20:07:36 -07:00
Benson Wong	be82d1a6a0	Support multiline cmds in YAML configuration Add support for multiline `cmd` configurations allowing for nicer looking configuration YAML files.	2024-10-19 20:06:59 -07:00
Benson Wong	6cf0962807	Add custom check endpoint Replace previously hardcoded value for /health to check when the server became ready to serve traffic. With this the server can support any server that provides an OpenAI-compatible inference endpoint.	2024-10-11 22:02:14 -07:00
Benson Wong	8eb5b7b6c4	Add custom check endpoint Replace previously hardcoded value for `/health` to check when the server became ready to serve traffic. With this the server can support any server that provides an OpenAI-compatible inference endpoint.	2024-10-11 21:59:21 -07:00
Benson Wong	5a57688aa8	add .vscode to .gitignore	2024-10-05 19:37:00 -07:00
Benson Wong	b79b7ef3d9	add goreleaser config to limit GOOS and GOARCH builds	2024-10-04 21:46:55 -07:00
Benson Wong	476086c066	Add Cmd.Wait() to prevent creation of zombie child processes see: #1	2024-10-04 21:38:29 -07:00
Benson Wong	4fae7cf946	update docs	2024-10-04 21:11:08 -07:00
Benson Wong	cc944251df	update README	2024-10-04 20:43:48 -07:00
Benson Wong	ef05c05f9c	renaming to llama-swap	2024-10-04 20:21:11 -07:00
Benson Wong	ef8d0020f3	release works?	2024-10-04 12:54:14 -07:00
Benson Wong	5a4a41c015	add release thing	2024-10-04 12:53:10 -07:00
Benson Wong	f992f7f52f	Create go.yml	2024-10-04 12:45:13 -07:00
Benson Wong	85743ad914	remove the v1/models endpoint, needs improvement	2024-10-04 12:33:41 -07:00
Benson Wong	3e90f8328d	add /v1/models endpoint and proxy everything to llama-server	2024-10-04 12:28:50 -07:00
Benson Wong	e0103d1884	build simple-responder with make all	2024-10-04 12:14:10 -07:00
Benson Wong	d682589fb1	support environment variables	2024-10-04 11:55:27 -07:00
Benson Wong	43119e807f	add README	2024-10-04 11:37:51 -07:00
Benson Wong	844615bfcc	rename to llamagate	2024-10-04 11:09:36 -07:00
Benson Wong	aaca9d889b	add Makefile	2024-10-04 11:07:00 -07:00
Benson Wong	bfdba43bd8	improve error handling	2024-10-04 10:55:02 -07:00
Benson Wong	2d387cf373	rename proxy.go to manager.go	2024-10-04 09:39:10 -07:00
Benson Wong	d061819fb1	moved config into proxy package	2024-10-04 09:38:30 -07:00
Benson Wong	7475bf0fff	.	2024-10-04 09:31:08 -07:00
Benson Wong	4c2cc1cf57	add license	2024-10-03 21:51:10 -07:00
Benson Wong	83415430ba	move proxy logic into the proxy package	2024-10-03 21:35:33 -07:00
Benson Wong	f44faf5a93	move config to its own package	2024-10-03 21:08:11 -07:00
Benson Wong	cb576fb178	replace io.Copy to improve performance sending data to client	2024-10-03 20:33:55 -07:00
Benson Wong	b63b81b121	first commit	2024-10-03 20:20:01 -07:00

289 Commits