andreas/llama-swap

Andreas b984f5ca08

Containerized build and socket activation docs

2025-11-20 00:18:58 +01:00

Rootless Podman container with systemd socket activation

Idea

With socket activation, systemd holds the listening socket and starts the container only when the first connection arrives, so nothing runs before the service is actually used. Since no other network access is required for operation, we can configure the container with network=none, which reduces the risk of the AI escaping.
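To illustrate what "passing in the socket" means: systemd opens the listening socket itself and hands it to the started process as inherited file descriptors. It announces them through the LISTEN_FDS environment variable, and the first passed descriptor is always number 3. The sketch below only lists which descriptors were passed. The function name is made up for illustration; in this setup the receiving process is the one inside the container, not a shell script.

```shell
#!/bin/sh
# Sketch: list the file descriptors handed over by systemd socket activation.
# LISTEN_FDS holds the number of passed sockets; numbering starts at fd 3.
# A full implementation would also check that LISTEN_PID matches its own PID.
# list_passed_fds is a hypothetical helper, not part of systemd or llama-swap.
list_passed_fds() {
    count="${LISTEN_FDS:-0}"
    first_fd=3
    fds=""
    i=0
    while [ "$i" -lt "$count" ]; do
        # Append the next descriptor number, space separated.
        fds="${fds:+$fds }$((first_fd + i))"
        i=$((i + 1))
    done
    printf '%s\n' "$fds"
}
```

Called without LISTEN_FDS set, the function prints an empty line, which is what a process sees when it was started by hand rather than through the socket unit.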

Set up

Optional: if you want to run this as a separate user, create the account (it needs a home directory, hence -m) and open a shell as that user:

sudo useradd -m llama
sudo machinectl shell llama@

Check out this repository, navigate to its root directory, and build the llama.cpp/llama-swap container with

podman build -t localhost/lamaswap:latest -f Build.Containerfile

Place llama.socket in ~/.config/systemd/user and adjust ports and interfaces if needed. Place llama.container in ~/.config/containers/systemd and adjust the paths for models and config if desired. Both files are in docs/socket_activation, next to this README.

Put the model files into the models directory (~/models). Create a llama-swap config.yaml (by default in ~) according to the docs.
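Before starting the socket, it can help to confirm that everything is where the steps above put it. The following is a minimal sketch that assumes the default locations named in this README; the function name is hypothetical, and the paths need adjusting if you changed them in llama.container.

```shell
#!/bin/sh
# Sketch: check that the unit files, config and models directory exist.
# check_llama_setup is a hypothetical helper; the optional argument replaces
# the home directory, which is mainly useful for trying it out.
check_llama_setup() {
    base="${1:-$HOME}"
    missing=0
    for f in \
        "$base/.config/systemd/user/llama.socket" \
        "$base/.config/containers/systemd/llama.container" \
        "$base/config.yaml"; do
        if [ ! -f "$f" ]; then
            echo "missing file: $f"
            missing=1
        fi
    done
    if [ ! -d "$base/models" ]; then
        echo "missing directory: $base/models"
        missing=1
    fi
    # Exit status 0 means everything was found, 1 means something is missing.
    return "$missing"
}
```

Running check_llama_setup without arguments checks the current user's home directory and prints one line per missing item.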

Start the socket:

systemctl --user daemon-reload
systemctl --user enable --now llama.socket
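To confirm that the socket unit is actually listening after this step, you can ask systemd for its state. A small sketch, with a hypothetical function name; it relies only on systemctl --user is-active, which prints "active" for a listening socket unit.

```shell
#!/bin/sh
# Sketch: succeed if the user-level socket unit is active (listening).
# socket_is_listening is a hypothetical helper name.
socket_is_listening() {
    unit="${1:-llama.socket}"
    state="$(systemctl --user is-active "$unit" 2>/dev/null)"
    [ "$state" = "active" ]
}
```

For example, socket_is_listening && echo "llama.socket is listening". To see the address the socket is bound to, systemctl --user list-sockets shows all user sockets together with the service each one activates.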

If the service should also run when the user is not logged in, enable lingering:

sudo loginctl enable-linger <user>
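Whether lingering is in effect can be read back from logind. This sketch parses the Linger property from loginctl show-user; the function name is hypothetical, and a user that logind does not know about (not logged in and not lingering) is treated as "not enabled".

```shell
#!/bin/sh
# Sketch: succeed if lingering is enabled for the given user (default: current user).
# linger_enabled is a hypothetical helper name.
linger_enabled() {
    user="${1:-$(id -un)}"
    # loginctl prints "Linger=yes" or "Linger=no" for this property.
    [ "$(loginctl show-user "$user" --property=Linger 2>/dev/null)" = "Linger=yes" ]
}
```

For example, linger_enabled llama || echo "lingering is off for llama".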

Check that you can access the llama-swap control panel in a browser. For troubleshooting, use, e.g., journalctl -xe.
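When the general journal is too noisy, it can be narrowed to the two units involved. The sketch assumes that the service generated from llama.container is named llama.service (the usual naming for a .container file) and that the user journal is readable on your system; the function name is hypothetical.

```shell
#!/bin/sh
# Sketch: show recent user-journal entries for the socket and the generated service.
# llama_logs is a hypothetical helper; llama.service is assumed to be the unit
# generated from llama.container.
llama_logs() {
    lines="${1:-50}"
    journalctl --user -u llama.service -u llama.socket -n "$lines" --no-pager
}
```

llama_logs 200 prints the last 200 entries; systemctl --user status llama.socket llama.service gives a quick overview of both units.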
