Add custom check endpoint

Replace the previously hardcoded `/health` path with a configurable
endpoint used to check when the server is ready to serve traffic. With
this change, any server that provides an OpenAI-compatible inference
endpoint can be supported.
Author: Benson Wong
Date: 2024-10-11 21:59:21 -07:00
parent 5a57688aa8
commit 8eb5b7b6c4
5 changed files with 40 additions and 11 deletions

@@ -30,6 +30,13 @@ models:
- "gpt-4o-mini"
- "gpt-3.5-turbo"
# wait for this path to return an HTTP 200 before serving requests
# defaults to /health to match llama.cpp
#
# use "none" to skip endpoint checking. This may cause requests to fail
# until the server is ready
checkEndpoint: "/custom-endpoint"
"qwen":
# environment variables to pass to the command
env: