move some docs to the wiki [no-ci]

README.md (26 lines removed)
@@ -299,32 +299,6 @@ Any OpenAI compatible server would work. llama-swap was originally designed for
 
 For Python based inference servers like vllm or tabbyAPI it is recommended to run them via podman or docker. This provides clean environment isolation as well as responding correctly to `SIGTERM` signals to shutdown.
 
-## Systemd Unit Files
-
-Use this unit file to start llama-swap on boot. This is only tested on Ubuntu.
-
-`/etc/systemd/system/llama-swap.service`
-
-```
-[Unit]
-Description=llama-swap
-After=network.target
-
-[Service]
-User=nobody
-
-# set this to match your environment
-ExecStart=/path/to/llama-swap --config /path/to/llama-swap.config.yml
-
-Restart=on-failure
-RestartSec=3
-StartLimitBurst=3
-StartLimitInterval=30
-
-[Install]
-WantedBy=multi-user.target
-```
-
 ## Star History
 
 [![Star History Chart](https://api.star-history.com/svg?repos=mostlygeek/llama-swap&type=Date)](https://www.star-history.com/#mostlygeek/llama-swap&Date)
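
Until the wiki page lands, the container recommendation in the paragraph above is easiest to see with a short sketch. The image tag and model name below are illustrative assumptions, not taken from the README:

```
# Run vllm's OpenAI-compatible server in a container.
# Assumptions: the official vllm/vllm-openai image and a placeholder model.
# Stopping the container sends SIGTERM to the server process, giving the
# clean shutdown behaviour the README paragraph describes.
docker run --rm --gpus all -p 8000:8000 \
  vllm/vllm-openai:latest \
  --model Qwen/Qwen2.5-7B-Instruct
```

The same flags generally carry over to `podman run`, depending on your GPU setup.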
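
Similarly, for anyone referencing the removed systemd section before it reappears on the wiki: the unit is activated with the standard systemctl steps, assuming the file was saved to `/etc/systemd/system/llama-swap.service` as shown in the diff:

```
# Pick up the new unit file, then enable it at boot and start it now.
sudo systemctl daemon-reload
sudo systemctl enable --now llama-swap.service

# Tail the service logs to confirm llama-swap started.
journalctl -u llama-swap.service -f
```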