Commit Graph

  • 7187cfe52e add Windows build support to Makefile (#54) daschiller 2025-02-19 02:24:31 +01:00
  • 24089d2d9c remove "no musa container" note from README Benson Wong 2025-02-18 16:38:48 -08:00
  • ebabe55ff3 Delete untagged packages after build and push (#55) Benson Wong 2025-02-18 10:32:32 -08:00
  • 41a338297c deletion of untagged containers happen after build-and-push Benson Wong 2025-02-18 10:11:59 -08:00
  • 7e3353efeb add action step to remove untagged containers Benson Wong 2025-02-18 10:08:41 -08:00
  • 4ed58fb173 update container build action Benson Wong 2025-02-18 09:59:06 -08:00
  • f5a2be698d revert package src until new ggml-org has them Benson Wong 2025-02-15 18:23:58 -08:00
  • f5e6ec3b7a fix package src in containerfile Benson Wong 2025-02-15 18:20:35 -08:00
  • 3f462da146 switch package source from ggerganov to ggml-org Benson Wong 2025-02-15 18:18:37 -08:00
  • 48bd766536 Update README.md Benson Wong 2025-02-14 22:05:52 -08:00
  • 8d319da4dd improve README organization (i think...) Benson Wong 2025-02-14 15:59:12 -08:00
  • be7c502448 improve docs Benson Wong 2025-02-14 15:47:31 -08:00
  • 92336f00bf more container build fixes Benson Wong 2025-02-14 15:34:35 -08:00
  • ed2a50d9a6 fix bug in build-container.sh Benson Wong 2025-02-14 15:27:56 -08:00
  • 0acfdb9f78 update workflow to build cpu and disable musa Benson Wong 2025-02-14 15:26:59 -08:00
  • 96a8ea0241 add cpu docker container build Benson Wong 2025-02-14 15:25:45 -08:00
  • f20f2c9b7a add docs and container build improvements #43 Benson Wong 2025-02-14 12:20:07 -08:00
  • 7a97c38828 enable parallel container built #46 Benson Wong 2025-02-14 11:04:33 -08:00
  • 4885132565 more permissions futzing Benson Wong 2025-02-14 11:02:15 -08:00
  • 8b46a0b7f1 grant package:write to container workflow #46 Benson Wong 2025-02-14 10:55:30 -08:00
  • 1b6736ec6f rename workflow for containers Benson Wong 2025-02-14 10:50:15 -08:00
  • ddc1ce031e fix container file name #46 Benson Wong 2025-02-14 10:49:44 -08:00
  • 11d024bbaa just build cuda while debugging Benson Wong 2025-02-14 10:48:06 -08:00
  • 43e23c16dc add check for GITHUB_TOKEN #46 Benson Wong 2025-02-14 10:47:25 -08:00
  • f9c8e763ba add execute bit on build-container.sh Benson Wong 2025-02-14 10:44:53 -08:00
  • d7e1bb9f7c add GITHUB_TOKEN to container build env Benson Wong 2025-02-14 10:43:44 -08:00
  • ab93460a8b first container code (#52) Benson Wong 2025-02-14 10:39:25 -08:00
  • 13d4552edc Add FreeBSD/amd64 to auto built releases (#51) Benson Wong 2025-02-13 16:43:51 -08:00
  • 6667e307a2 Update README.md Benson Wong 2025-02-08 10:28:35 -08:00
  • 7ac446e6a9 Update README.md Benson Wong 2025-02-08 10:26:11 -08:00
  • eab9795bcc remove panic() when cmd or process is nil Benson Wong 2025-02-07 14:00:32 -08:00
  • 09bdd86b54 Improve shutdown behaviour (#47) (#49) Benson Wong 2025-02-05 17:19:59 -08:00
  • 85cd74a51c Improve process start and stop reliability (#38) Benson Wong 2025-02-03 11:50:38 -08:00
  • 314d2f2212 remove cmd_stop configuration and functionality from PR #40 (#44) Benson Wong 2025-01-31 12:42:44 -08:00
  • fad25f3e11 Use client request context in proxy request (#43) Benson Wong 2025-01-31 10:21:49 -08:00
  • 2c3e3e27f7 Support OPTIONS requests (#42) Benson Wong 2025-01-31 10:09:00 -08:00
  • baeb0c4e7f Add cmd_stop configuration to better support docker (#35) Benson Wong 2025-01-30 16:59:57 -08:00
  • 2833517eef Improve handling of process that do not handle SIGTERM (#38) Benson Wong 2025-01-20 14:39:52 -08:00
  • abdc2bfdb3 Fix panic when requesting non-members of profiles Benson Wong 2025-01-16 12:06:38 -08:00
  • c3b834737f Update README.md Benson Wong 2025-01-13 22:37:30 -08:00
  • 3c8e727b73 Update README.md Benson Wong 2025-01-12 19:48:35 -08:00
  • 3a1e9f81f1 support TTS /v1/audio/speech (#36) Benson Wong 2025-01-12 16:27:01 -08:00
  • 72c883f36c Update README.md Benson Wong 2025-01-02 09:01:51 -08:00
  • 1b04d034cf Update README.md Benson Wong 2025-01-02 08:59:11 -08:00
  • 2e45f5692a Update README.md Benson Wong 2025-01-01 12:51:24 -08:00
  • c97b80bdfe Update README.md Benson Wong 2025-01-01 12:25:45 -08:00
  • ae3ef9bc39 Refactor UI (#33) Benson Wong 2024-12-23 19:48:59 -08:00
  • db6715bec3 update golang.org/x/net -> v0.33.0 for dependabot Benson Wong 2024-12-20 11:28:32 -08:00
  • da5d9e8a6a fix HTTP logging so true path is printed Benson Wong 2024-12-20 11:25:01 -08:00
  • 84b667ca7a improve logging and error reporting for troubleshooting Benson Wong 2024-12-20 10:46:56 -08:00
  • 29657106fc add more OpenAI API supported in README Benson Wong 2024-12-20 10:08:20 -08:00
  • 9c8860471e support v1/rerank endpoint Benson Wong 2024-12-17 21:22:25 -08:00
  • 9b4e3f307e rename proxy handler Benson Wong 2024-12-17 17:24:41 -08:00
  • 6fe37c3abf support /v1/embeddings (#4) Benson Wong 2024-12-17 17:23:26 -08:00
  • 7f45493a37 Update README.md Benson Wong 2024-12-17 14:45:41 -08:00
  • 891f6a5b5a Add /upstream endpoint (#30) Benson Wong 2024-12-17 14:37:44 -08:00
  • 7183f6b43d fix bad logging due to wrong []byte used #28 Benson Wong 2024-12-16 16:22:14 -08:00
  • d89bfeb441 add .DS_Store to .gitignore Benson Wong 2024-12-16 12:30:31 -08:00
  • 9a0c6bed40 Improve stop exceptions (#28) (#29) Benson Wong 2024-12-16 12:29:25 -08:00
  • d6ca535939 tweak release tagging so it is not based on number of commits Benson Wong 2024-12-14 15:46:10 -08:00
  • 27302c0c02 change llama-swap to use goreleaser default ldflag values Benson Wong 2024-12-14 10:29:57 -08:00
  • d4e22cceaa Fix security vulnerability with golang.org/x/crypto Benson Wong 2024-12-14 10:20:22 -08:00
  • 4c94927658 Move release to Makefile out of goreleaser Benson Wong 2024-12-14 10:16:46 -08:00
  • a955a4a5c0 create tag to release Benson Wong 2024-12-14 10:07:20 -08:00
  • 22d3f1a4f9 Change versioning to use git commits counts instead of semver Benson Wong 2024-12-14 09:53:13 -08:00
  • e2443251ad update readme Benson Wong 2024-12-09 19:14:49 -08:00
  • 5fbd53c616 delay TTL check until after all requests are complete (#25) Benson Wong 2024-12-09 19:08:03 -08:00
  • 97dae50dc4 update readme Benson Wong 2024-12-08 21:34:16 -08:00
  • cb978f760f add web interface to /logs Benson Wong 2024-12-08 21:26:22 -08:00
  • 387f0ef6c4 use new timings data in server response in run-benchmark.sh Benson Wong 2024-12-03 20:48:31 -08:00
  • 18c134624d Add Access-Control-Allow-Origin CORS header to /v1/models endpoint Benson Wong 2024-12-03 15:53:59 -08:00
  • da2326bdc7 add example: optimizing code generation Benson Wong 2024-12-03 10:25:43 -08:00
  • da46545630 fix profile example in README Benson Wong 2024-12-01 10:13:31 -08:00
  • 04b4760e7e change profile split character to : (colon) (#21) Benson Wong 2024-12-01 09:10:50 -08:00
  • 9fc5d5b5eb improve cmd parsing (#22) Benson Wong 2024-12-01 09:02:58 -08:00
  • cf82b3c633 Improve Concurrency and Parallel Request Handling (#19) Benson Wong 2024-11-30 15:24:42 -08:00
  • e363f8f498 clean up writing with AI :b Benson Wong 2024-11-28 22:12:44 -08:00
  • c9629cf3a2 add speculative decoding example Benson Wong 2024-11-28 22:07:22 -08:00
  • 50426935a4 . Benson Wong 2024-11-28 22:06:29 -08:00
  • 2fceb78e8d Add examples Benson Wong 2024-11-28 22:05:15 -08:00
  • 9a81c53664 chore: update process_test.go (#17) Ikko Eltociear Ashimine 2024-11-27 03:20:16 +09:00
  • 716d37de82 Update README.md Benson Wong 2024-11-25 12:35:00 -08:00
  • 73ad85ea69 Implement Multi-Process Handling (#7) Benson Wong 2024-11-23 19:45:13 -08:00
  • 533162ce6a add support for automatically unloading a model (#10) (#14) Benson Wong 2024-11-19 16:32:51 -08:00
  • ba39ed4c18 Add support for legacy v1/completions API (#12) Benson Wong 2024-11-19 09:56:41 -08:00
  • 21f54f96c2 Merge pull request #13 from mostlygeek/set-content-length Benson Wong 2024-11-19 09:46:03 -08:00
  • 7eec51f3f2 Dechunk HTTP requests by default (#11) Benson Wong 2024-11-19 09:40:44 -08:00
  • 5021e0f299 remove the process handler override Benson Wong 2024-11-18 21:26:39 -08:00
  • c9233d2c9a use gin instead of standard http lib in main Benson Wong 2024-11-18 15:58:28 -08:00
  • a33ac6f8fb update README Benson Wong 2024-11-18 15:37:50 -08:00
  • 401aa88949 move log handlers to separate file Benson Wong 2024-11-18 15:33:06 -08:00
  • e9e88fd229 rename proxy.go to proxymanager.go Benson Wong 2024-11-18 15:30:34 -08:00
  • c3b4bb1684 use gin for http server Benson Wong 2024-11-18 15:30:16 -08:00
  • e5c909ddf7 add tests for proxy.Process Benson Wong 2024-11-17 20:49:14 -08:00
  • 36a31f450f add proxy.Process to manage upstream proxy logic Benson Wong 2024-11-17 16:41:15 -08:00
  • a8e5ee13b9 Add logging with pipes example to README Benson Wong 2024-11-15 09:10:43 -08:00
  • 5944a86e86 fix early timeout bug Benson Wong 2024-11-09 20:08:40 -08:00
  • 63d4a7d0eb Improve LogMonitor to handle empty writes and ensure buffer immutability Benson Wong 2024-11-02 10:41:23 -07:00
  • f45469f7ff Merge pull request #8 from mostlygeek/improve-upstream-monitoring-issue5 Benson Wong 2024-11-01 15:28:06 -07:00
  • 34f9fd7340 Improve timeout and exit handling of child processes. fix #3 and #5 Benson Wong 2024-11-01 14:32:39 -07:00