Benson Wong
b7f8cb5094
Limit Access-Control-Allow-Origin to OPTIONS preflight requests #85
2025-04-04 14:44:35 -07:00
Benson Wong
a23da6eb57
Sanitize CORS headers ( #85 )
...
Add sanitization step for `Access-Control-Allow-Headers` when echoing back user-supplied headers
2025-04-01 08:43:53 -07:00
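A minimal sketch of what sanitizing an echoed `Access-Control-Allow-Headers` value could look like. The function names and the filtering rule (keep only names made of RFC 9110 token characters) are assumptions for illustration; the commit message does not say how the project implements this step.

```go
package main

import (
	"fmt"
	"strings"
)

// isTokenChar reports whether r is allowed in an HTTP header field name
// (the "tchar" set from RFC 9110).
func isTokenChar(r rune) bool {
	switch {
	case r >= 'a' && r <= 'z', r >= 'A' && r <= 'Z', r >= '0' && r <= '9':
		return true
	}
	return strings.ContainsRune("!#$%&'*+-.^_`|~", r)
}

// sanitizeAllowHeaders (hypothetical name) splits a comma-separated list of
// header names, drops any entry that is not a well-formed field name, and
// rejoins the rest.
func sanitizeAllowHeaders(raw string) string {
	var kept []string
	for _, name := range strings.Split(raw, ",") {
		name = strings.TrimSpace(name)
		if name == "" {
			continue
		}
		valid := true
		for _, r := range name {
			if !isTokenChar(r) {
				valid = false
				break
			}
		}
		if valid {
			kept = append(kept, name)
		}
	}
	return strings.Join(kept, ", ")
}

func main() {
	// The middle entry carries a CR/LF injection attempt and is dropped.
	fmt.Println(sanitizeAllowHeaders("Content-Type, X-Evil\r\nSet-Cookie: a=b, Authorization"))
}
```

Dropping malformed entries, rather than rejecting the whole request, keeps well-behaved clients working while refusing to reflect control characters into the response.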
Grigorii Khvatskii
4c3aa40564
add graceful process termination on windows ( #82 )
2025-03-25 15:26:33 -07:00
Benson Wong
84e2c07a7e
Refactor wildcard out of CORS headers ( #81 )
...
Changes to CORS functionality:
- `Access-Control-Allow-Origin: *` is set for all requests
- for pre-flight OPTIONS requests
- specify methods: `Access-Control-Allow-Methods: GET, POST, PUT, PATCH, DELETE, OPTIONS`
- if the client sent `Access-Control-Request-Headers` then echo back the same value in `Access-Control-Allow-Headers`. If no `Access-Control-Request-Headers` were sent, then send back a default set
- set `Access-Control-Max-Age: 86400`, which may improve performance
- Add CORS tests to the proxy-manager
2025-03-25 15:24:43 -07:00
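The CORS behaviour listed in the commit above can be sketched as a small `net/http` middleware. This is an illustration only: `withCORS`, `preflightHeaders`, the default header set, and the 204 status for preflight responses are assumptions, not the project's actual code. It also reflects the state as of #81; a later commit in this log restricts `Access-Control-Allow-Origin` to preflight requests.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// defaultAllowHeaders is an assumed default; the commit does not list the set.
const defaultAllowHeaders = "Content-Type, Authorization"

// withCORS (hypothetical name) wraps a handler with the described behaviour.
func withCORS(next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		h := w.Header()
		// Set for every request, as described for #81.
		h.Set("Access-Control-Allow-Origin", "*")
		if r.Method != http.MethodOptions {
			next.ServeHTTP(w, r)
			return
		}
		// Preflight: advertise methods, echo or default the allowed headers.
		h.Set("Access-Control-Allow-Methods", "GET, POST, PUT, PATCH, DELETE, OPTIONS")
		if requested := r.Header.Get("Access-Control-Request-Headers"); requested != "" {
			h.Set("Access-Control-Allow-Headers", requested)
		} else {
			h.Set("Access-Control-Allow-Headers", defaultAllowHeaders)
		}
		h.Set("Access-Control-Max-Age", "86400")
		w.WriteHeader(http.StatusNoContent) // assumed status code
	})
}

// preflightHeaders sends a synthetic OPTIONS request through withCORS and
// returns the response headers.
func preflightHeaders(requested string) http.Header {
	req := httptest.NewRequest(http.MethodOptions, "/v1/audio/transcriptions", nil)
	if requested != "" {
		req.Header.Set("Access-Control-Request-Headers", requested)
	}
	rec := httptest.NewRecorder()
	withCORS(http.NotFoundHandler()).ServeHTTP(rec, req)
	return rec.Result().Header
}

func main() {
	fmt.Println(preflightHeaders("X-Custom").Get("Access-Control-Allow-Headers"))
}
```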
Benson Wong
680af28bcc
Allow very permissive CORS headers ( #77 )
2025-03-20 15:50:21 -07:00
Benson Wong
d94db42ffe
fix bug checking incorrect error
2025-03-20 15:49:36 -07:00
Benson Wong
93cd83c55c
add override for windows ( #76 )
2025-03-20 13:23:04 -07:00
Benson Wong
5565fca3ac
add some badges to README
2025-03-19 11:25:06 -07:00
Benson Wong
d625ab8d92
Refactor process state management ( #70 ) ( #73 )
...
* add isValidStateTransition helper function
* Replace Process.setState() with Process.swapState()
* Refactor locking logic in Process
2025-03-15 17:14:03 -07:00
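A rough sketch of how a compare-and-swap style `swapState()` guarded by an `isValidStateTransition` helper might fit together. The state names come from the `/running` commit elsewhere in this log; the transition table, the error values, and the method signature are assumptions made for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

type ProcessState string

const (
	StateStopped  ProcessState = "stopped"
	StateStarting ProcessState = "starting"
	StateReady    ProcessState = "ready"
	StateStopping ProcessState = "stopping"
)

// Hypothetical error values.
var (
	ErrExpectedStateMismatch  = errors.New("current state does not match expected state")
	ErrInvalidStateTransition = errors.New("invalid state transition")
)

// isValidStateTransition encodes an assumed transition table.
func isValidStateTransition(from, to ProcessState) bool {
	switch from {
	case StateStopped:
		return to == StateStarting
	case StateStarting:
		return to == StateReady || to == StateStopping || to == StateStopped
	case StateReady:
		return to == StateStopping
	case StateStopping:
		return to == StateStopped
	}
	return false
}

type Process struct {
	mu    sync.Mutex
	state ProcessState
}

// swapState moves to next only if the process is currently in expected and
// the transition is allowed. Check and update happen under one lock, so two
// goroutines cannot both win the same transition.
func (p *Process) swapState(expected, next ProcessState) error {
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.state != expected {
		return ErrExpectedStateMismatch
	}
	if !isValidStateTransition(expected, next) {
		return ErrInvalidStateTransition
	}
	p.state = next
	return nil
}

func main() {
	p := &Process{state: StateStopped}
	fmt.Println(p.swapState(StateStopped, StateStarting))
	fmt.Println(p.swapState(StateStopped, StateReady))
}
```

Compared with an unconditional setter, requiring the caller to name the expected state turns a lost race into an explicit error instead of a silent overwrite.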
Benson Wong
a3f82c140b
tidy up config examples in README
2025-03-15 10:36:45 -07:00
Benson Wong
5c97299e7b
Add support for sending a custom model name to upstream ( #69 ) ( #71 )
...
* add test for splitRequestedModel()
* Add `useModelName` parameter to model configuration
* add docs to README
2025-03-14 21:07:52 -07:00
Benson Wong
671c1a5a7b
update deps
2025-03-13 14:00:15 -07:00
Benson Wong
52c0196e0f
clean up feature list in readme
2025-03-13 13:55:20 -07:00
Benson Wong
3201a68a04
Add /v1/audio/transcriptions support ( #41 )
...
* add support for /v1/audio/transcriptions
2025-03-13 13:49:39 -07:00
Florin-Gabriel Dumitru
3ac94ad20e
Adds an endpoint '/running' ( #61 )
...
* Adds an endpoint '/running' that returns either an empty JSON object if no model has been loaded so far, or the last model loaded (model key) and its current state (state key). Possible state values are: stopped, starting, ready and stopping.
* Improves the `/running` endpoint by allowing multiple entries under the `running` key within the JSON response.
Refactors the `/running` method name (listRunningProcessesHandler).
Removes the unlisted filter implementation.
* Adds tests for:
- no model loaded
- one model loaded
- multiple models loaded
* Adds simple comments.
* Simplified code structure as per 250313 comments on PR #65.
---------
Co-authored-by: FGDumitru|B <xelotx@gmail.com>
2025-03-13 13:42:59 -07:00
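To illustrate the response shape described in the `/running` commit above, here is a small sketch. Only the `running`, `model`, and `state` keys and the handler name come from the commit message; the handler signature, the `runningJSON` helper, and returning an empty list when nothing is loaded are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"strings"
)

// runningEntry is one element of the "running" list.
type runningEntry struct {
	Model string `json:"model"`
	State string `json:"state"` // stopped, starting, ready or stopping
}

// listRunningProcessesHandler returns a handler that reports the given
// entries. A real implementation would read them from the process manager.
func listRunningProcessesHandler(entries []runningEntry) http.HandlerFunc {
	if entries == nil {
		entries = []runningEntry{} // encode as [] rather than null
	}
	return func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string][]runningEntry{"running": entries})
	}
}

// runningJSON (hypothetical helper) calls the handler and returns the body.
func runningJSON(entries []runningEntry) string {
	rec := httptest.NewRecorder()
	req := httptest.NewRequest(http.MethodGet, "/running", nil)
	listRunningProcessesHandler(entries).ServeHTTP(rec, req)
	return strings.TrimSpace(rec.Body.String())
}

func main() {
	fmt.Println(runningJSON([]runningEntry{{Model: "llama", State: "ready"}}))
}
```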
Benson Wong
60355bf74a
fix some potentially confusing Process.start() comment
2025-03-11 11:00:45 -07:00
Benson Wong
9b2ed244e2
Improve Continuous integration and fix concurrency bugs ( #66 )
...
- improvements to the continuous GH actions
- fix edge-case concurrency bugs in Process.start() and state transitions, discovered while setting up CI.
2025-03-11 10:39:14 -07:00
Benson Wong
eeb72297f7
add first version of CI for go
2025-03-11 08:45:56 -07:00
Benson Wong
eabfe70cc6
add GH action to close inactive issues
2025-03-09 19:51:48 -07:00
Benson Wong
29cd98878d
better container build logic when upstream containers do not exist
2025-03-09 13:02:06 -07:00
Benson Wong
b3d331da0d
Properly strip profile name slug from model names (fixes #62)
...
The profile slug in a model name, `profile:model`, is specific to
llama-swap. This strips `profile:` out of the model name request so
upstreams that expect just `model` work and do not require knowing about
the profile slug.
2025-03-09 12:41:52 -07:00
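The slug stripping described in the commit above amounts to splitting the requested name at the first colon. A minimal sketch, assuming a hypothetical helper name and return shape (the project's own function may differ):

```go
package main

import (
	"fmt"
	"strings"
)

// stripProfileSlug (hypothetical name) splits "profile:model" into its two
// parts. A name without a colon is returned unchanged with an empty profile.
func stripProfileSlug(requested string) (profile, model string) {
	if before, after, found := strings.Cut(requested, ":"); found {
		return before, after
	}
	return "", requested
}

func main() {
	profile, model := stripProfileSlug("coding:qwen")
	fmt.Println(profile, model)
}
```

Note that this simple split treats any colon as a profile separator, so a model name that itself contains a colon would need additional handling.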
Benson Wong
62275e078d
add examples to restart on config change #59
2025-03-06 10:50:29 -08:00
Benson Wong
88916059e1
add /unload to docs
2025-03-03 10:44:16 -08:00
Benson Wong
082d5d0fc5
Add /unload endpoint ( #58 ) to unload all currently running models
2025-03-03 10:33:36 -08:00
Benson Wong
53338938bd
increase health check to a minimum of 5 seconds
2025-03-03 10:04:08 -08:00
Benson Wong
af653347ae
Update README.md w/ starhistory graph
2025-02-27 16:43:34 -08:00
Benson Wong
1e25b44a06
add workflow_dispatch to release action
2025-02-18 17:27:43 -08:00
Benson Wong
0815bb4cc3
Add windows to goreleaser #54
2025-02-18 17:26:43 -08:00
daschiller
7187cfe52e
add Windows build support to Makefile ( #54 )
2025-02-18 17:24:31 -08:00
Benson Wong
24089d2d9c
remove "no musa container" note from README
2025-02-18 16:38:48 -08:00
Benson Wong
ebabe55ff3
Delete untagged packages after build and push ( #55 )
2025-02-18 10:32:32 -08:00
Benson Wong
41a338297c
deletion of untagged containers happens after build-and-push
2025-02-18 10:11:59 -08:00
Benson Wong
7e3353efeb
add action step to remove untagged containers
2025-02-18 10:08:41 -08:00
Benson Wong
4ed58fb173
update container build action
2025-02-18 09:59:06 -08:00
Benson Wong
f5a2be698d
revert package src until new ggml-org has them
2025-02-15 18:23:58 -08:00
Benson Wong
f5e6ec3b7a
fix package src in containerfile
2025-02-15 18:20:35 -08:00
Benson Wong
3f462da146
switch package source from ggerganov to ggml-org
2025-02-15 18:18:49 -08:00
Benson Wong
48bd766536
Update README.md
2025-02-14 22:05:52 -08:00
Benson Wong
8d319da4dd
improve README organization (i think...)
2025-02-14 15:59:12 -08:00
Benson Wong
be7c502448
improve docs
2025-02-14 15:47:31 -08:00
Benson Wong
92336f00bf
more container build fixes
2025-02-14 15:34:38 -08:00
Benson Wong
ed2a50d9a6
fix bug in build-container.sh
2025-02-14 15:27:56 -08:00
Benson Wong
0acfdb9f78
update workflow to build cpu and disable musa
2025-02-14 15:26:59 -08:00
Benson Wong
96a8ea0241
add cpu docker container build
2025-02-14 15:25:45 -08:00
Benson Wong
f20f2c9b7a
add docs and container build improvements #43
2025-02-14 12:20:07 -08:00
Benson Wong
7a97c38828
enable parallel container build #46
2025-02-14 11:04:33 -08:00
Benson Wong
4885132565
more permissions futzing
2025-02-14 11:02:15 -08:00
Benson Wong
8b46a0b7f1
grant package:write to container workflow #46
2025-02-14 10:55:30 -08:00
Benson Wong
1b6736ec6f
rename workflow for containers
2025-02-14 10:50:15 -08:00
Benson Wong
ddc1ce031e
fix container file name #46
2025-02-14 10:49:44 -08:00