Commit Graph

5 Commits

Author SHA1 Message Date
Ikko Eltociear Ashimine
9a81c53664 chore: update process_test.go (#17)
nonexistant -> nonexistent
2024-11-26 10:20:16 -08:00
Benson Wong
73ad85ea69 Implement Multi-Process Handling (#7)
Refactor code to support starting of multiple back end llama.cpp servers. This functionality is exposed as `profiles` to create a simple configuration format. 

Changes: 

* refactor proxy tests to get ready for multi-process support
* update proxy/ProxyManager to support multiple processes (#7)
* Add support for Groups in configuration
* improve handling of Model alias configs
* implement multi-model swapping
* improve code clarity for swapModel
* improve docs, rename groups to profiles in config
2024-11-23 19:45:13 -08:00
Benson Wong
533162ce6a add support for automatically unloading a model (#10) (#14)
* Make starting upstream process on-demand (#10)
* Add automatic unload of model after TTL is reached
* add `ttl` configuration parameter to models in seconds, default is 0 (never unload)
2024-11-19 16:32:51 -08:00
Benson Wong
a33ac6f8fb update README 2024-11-18 15:37:50 -08:00
Benson Wong
e5c909ddf7 add tests for proxy.Process 2024-11-17 20:49:14 -08:00