Swapping models can take a long time, leaving a long stretch of silence while the new model loads. Rather than loading the model silently in the background, this PR lets llama-swap send status updates in the `reasoning_content` field of a streaming chat response.
Fixes: #366
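Roughly, a status update could appear as an ordinary OpenAI-style streaming chunk whose delta carries the text in `reasoning_content`; the IDs, model name, and status wording below are invented for illustration:

```
data: {"id":"chatcmpl-123","object":"chat.completion.chunk","created":1700000000,"model":"llama-8b","choices":[{"index":0,"delta":{"reasoning_content":"llama-swap: loading model llama-8b ..."},"finish_reason":null}]}
```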
* proxy/config: add model-level macros
Add macros to the model configuration. Model-level macros override macros
defined at the global configuration level, and they follow the same naming
and value rules as global macros.
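A minimal sketch of the config shape (the model name and flag values are hypothetical; `${PORT}` is the reserved macro llama-swap fills in itself):

```yaml
macros:
  gpu_flags: "--n-gpu-layers 99"

models:
  llama-8b:
    # model-level macro: overrides the global gpu_flags for this model only
    macros:
      gpu_flags: "--n-gpu-layers 32"
    cmd: llama-server --port ${PORT} ${gpu_flags} -m /models/llama-8b.gguf
```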
* proxy/config: fix a bug in macro reserved-name checking
The reserved name PORT was not being checked properly.
* proxy/config: add tests around model.filters.stripParams
- add a check that model.filters.stripParams contains no invalid macros
- rename strip_params to stripParams for camelCase consistency
- add legacy compatibility so model.filters.strip_params continues to work (see the config sketch below)
* proxy/config: add duplicate removal to model.filters.stripParams
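A hedged sketch of how these filters look in config (the model name and parameter list are invented for the example):

```yaml
models:
  llama-8b:
    cmd: llama-server --port ${PORT} -m /models/llama-8b.gguf
    filters:
      # preferred camelCase key; the duplicated top_p is removed by dedup
      stripParams: "temperature, top_p, top_p"
      # the legacy snake_case key still parses for backwards compatibility:
      # strip_params: "temperature, top_p"
```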
* clean up some doc nits