proxy: add support for user defined metadata in model configs (#333)
Changes:

- add Metadata key to ModelConfig
- include metadata in /v1/models under meta.llamaswap key
- add recursive macro substitution into Metadata
- change macros at global and model level to be any scalar type

Note: This is the first mostly AI generated change to llama-swap. See #333 for notes about the workflow and approach to AI going forward.
43
CLAUDE.md
Normal file
@@ -0,0 +1,43 @@
# Project: llama-swap

## Project Description:

llama-swap is a lightweight, transparent proxy server that provides automatic model swapping to llama.cpp's server.

## Tech stack

- golang
- typescript, vite and react for UI (ui/)

## Testing

- `make test-dev` - Use this when making iterative changes. Runs `go test` and `staticcheck`. Fix any static checking errors.
- `make test-all` - Run this at the end, before completing work. Includes long-running concurrency tests.

## Workflow Tasks

### Plan Improvements

Work plans are located in ai-plans/. Plans written by the user may be incomplete, contain inconsistencies or errors.

When the user asks to improve a plan, follow these guidelines for expanding and improving it.

- Identify any inconsistencies.
- Expand plans out to be a detailed specification of requirements and changes to be made.
- Plans should have at least these sections:
  - Title - very short, describes changes
  - Overview: A more detailed summary of goal and outcomes desired
  - Design Requirements: Detailed descriptions of what needs to be done
  - Testing Plan: Tests to be implemented
  - Checklist: A detailed list of changes to be made

Look for "plan expansion" as explicit instructions to improve a plan.

### Implementation of plans

When the user says "paint it", respond with "commencing automated assembly". Then implement the changes as described by the plan. Update the checklist as you complete items.

## General Rules

- when summarizing changes only include details that require further action (action items)
- when there are no action items, just say "Done."
7
Makefile
@@ -23,6 +23,11 @@ proxy/ui_dist/placeholder.txt:
	mkdir -p proxy/ui_dist
	touch $@

# use cached test results while developing
test-dev: proxy/ui_dist/placeholder.txt
	go test -short ./proxy/...
	staticcheck ./proxy/... || true

test: proxy/ui_dist/placeholder.txt
	go test -short -count=1 ./proxy/...

@@ -82,4 +87,4 @@ release:
	git tag "$$new_tag";

# Phony targets
.PHONY: all clean ui mac linux windows simple-responder test test-all
.PHONY: all clean ui mac linux windows simple-responder test test-all test-dev
292
ai-plans/issue-264-add-metadata.md
Normal file
@@ -0,0 +1,292 @@
# Add Model Metadata Support with Typed Macros

## Overview

Implement support for arbitrary metadata on model configurations that can be exposed through the `/v1/models` API endpoint. This feature extends the existing macro system to support scalar types (string, int, float, bool) instead of only strings, enabling type-safe metadata values.

The metadata will be schemaless, allowing users to define any key-value pairs they need. Macro substitution will work within metadata values, preserving types when macros are used directly and converting to strings when macros are interpolated within strings.

## Design Requirements

### 1. Enhanced Macro System

**Current State:**

- Macros are defined as `map[string]string` at both global and model levels
- Only string substitution is supported
- Macros are replaced in: `cmd`, `cmdStop`, `proxy`, `checkEndpoint`, `filters.stripParams`

**Required Changes:**

- Change `MacroList` type from `map[string]string` to `map[string]any`
- Support scalar types: `string`, `int`, `float64`, `bool`
- Implement type-preserving macro substitution:
  - Direct macro usage (`key: ${macro}`) preserves the macro's type
  - Interpolated usage (`key: "text ${macro}"`) converts to string
- Add validation to ensure macro values are scalar types only
- Update existing macro substitution logic in [proxy/config/config.go](proxy/config/config.go) to handle `any` types

**Implementation Details:**

- Create a generic helper function to perform macro substitution that:
  - Takes a value of type `any`
  - Recursively processes maps, slices, and scalar values
  - Replaces `${macro_name}` patterns with macro values
  - Preserves types for direct substitution
  - Converts to strings for interpolated substitution
- Update `validateMacro()` function to accept `any` type and validate scalar types
- Maintain backward compatibility with existing string-only macros
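
The helper described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the project's implementation: the names `substitute` and `macroRef` are hypothetical, and treating "direct usage" as "the whole string is exactly one `${name}` reference" is one possible reading of the rule.

```go
package main

import (
	"fmt"
	"regexp"
)

// macroRef matches ${name}; the character class follows the macro-name rule in this plan.
var macroRef = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)

// substitute is a hypothetical helper (not the project's function). It walks
// maps and slices and replaces macro references found inside strings.
func substitute(value any, macros map[string]any) (any, error) {
	switch v := value.(type) {
	case string:
		// Direct usage: the whole string is one macro reference, so the
		// macro's own Go type is returned unchanged.
		if m := macroRef.FindStringSubmatch(v); m != nil && m[0] == v {
			mv, ok := macros[m[1]]
			if !ok {
				return nil, fmt.Errorf("unknown macro %q", m[1])
			}
			return mv, nil
		}
		// Interpolated usage: every reference is rendered as text.
		var err error
		out := macroRef.ReplaceAllStringFunc(v, func(ref string) string {
			name := ref[2 : len(ref)-1]
			mv, ok := macros[name]
			if !ok {
				if err == nil {
					err = fmt.Errorf("unknown macro %q", name)
				}
				return ref
			}
			return fmt.Sprint(mv)
		})
		if err != nil {
			return nil, err
		}
		return out, nil
	case map[string]any:
		result := make(map[string]any, len(v))
		for key, item := range v {
			sub, err := substitute(item, macros)
			if err != nil {
				return nil, err
			}
			result[key] = sub
		}
		return result, nil
	case []any:
		result := make([]any, len(v))
		for i, item := range v {
			sub, err := substitute(item, macros)
			if err != nil {
				return nil, err
			}
			result[i] = sub
		}
		return result, nil
	default:
		// Non-string scalars carry no macro references.
		return value, nil
	}
}

func main() {
	macros := map[string]any{"PORT": 10001, "temp": 0.7}
	meta := map[string]any{
		"port": "${PORT}",
		"note": "port ${PORT}, temp ${temp}",
		"list": []any{"${temp}", false},
	}
	out, err := substitute(meta, macros)
	if err != nil {
		panic(err)
	}
	fmt.Printf("%#v\n", out)
}
```

Returning new maps and slices instead of mutating the input keeps the loaded configuration untouched when an unknown macro aborts the walk partway through.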

### 2. Metadata Field in ModelConfig

**Location:** [proxy/config/model_config.go](proxy/config/model_config.go)

**Required Changes:**

- Add `Metadata map[string]any` field to `ModelConfig` struct
- Support YAML unmarshaling of arbitrary structures (maps, arrays, scalars)
- Apply macro substitution to metadata values during config loading

**Schema Requirements:**

- Metadata is optional (default: empty/nil map)
- Supports nested structures (objects within objects, arrays, etc.)
- All string values within metadata undergo macro substitution
- Type preservation rules apply as described above

### 3. Macro Substitution in Metadata

**Location:** [proxy/config/config.go](proxy/config/config.go) in `LoadConfigFromReader()`

**Process Flow:**

1. After loading YAML configuration
2. After model-level and global macro merging
3. Apply macro substitution to `ModelConfig.Metadata` field
4. Use the same merged macros available to `cmd`, `proxy`, etc.
5. Process recursively through all nested structures

**Substitution Rules:**

- `port: ${PORT}` → keeps integer type from PORT macro
- `temperature: ${temp}` → keeps float type from temp macro
- `note: "Running on ${PORT}"` → converts to string `"Running on 10001"`
- Arrays and nested objects are processed recursively
- Unknown macros should cause configuration load error (consistent with existing behavior)
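
For the interpolated case, it may help to see how typed values turn into text. The sketch below assumes the conversion uses Go's default `%v` formatting; `render` is a hypothetical name used only for illustration.

```go
package main

import "fmt"

// render is a hypothetical helper: it converts a typed macro value to the
// text that would be spliced into an interpolated string, assuming %v formatting.
func render(v any) string {
	return fmt.Sprintf("%v", v)
}

func main() {
	// A mix of the scalar types the plan allows for macro values.
	for _, v := range []any{10001, 0.7, true, "llama", 1000000.0} {
		fmt.Printf("%T -> %q\n", v, render(v))
	}
}
```

One consequence worth keeping in mind: with `%v`, a `float64` of one million renders as `1e+06` rather than `1000000`, so large float macros interpolated into `cmd` or a metadata string may not look the way they were written in the YAML.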

### 4. API Response Updates

**Location:** [proxy/proxymanager.go:350](proxy/proxymanager.go#L350) `listModelsHandler()`

**Current Behavior:**

- Returns model records with: `id`, `object`, `created`, `owned_by`
- Optionally includes: `name`, `description`

**Required Changes:**

- Add metadata to each model record under the key `llamaswap_meta`
- Only include `llamaswap_meta` if metadata is non-empty
- Preserve all types when marshaling to JSON
- Maintain existing sorting by model ID
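
The "only include when non-empty" rule and JSON type preservation can be sketched with the standard library alone. `modelRecord` and `toJSON` are hypothetical names, and the key name is taken from this plan's text; the real handler builds its records differently.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// modelRecord is a hypothetical stand-in for the record built per model.
// The metadata key is attached only when there is something to report.
func modelRecord(id string, metadata map[string]any) map[string]any {
	record := map[string]any{"id": id, "object": "model"}
	if len(metadata) > 0 {
		record["llamaswap_meta"] = metadata
	}
	return record
}

// toJSON marshals a record; encoding/json emits map keys in sorted order.
func toJSON(record map[string]any) string {
	b, err := json.Marshal(record)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	meta := map[string]any{"port": 10001, "temperature": 0.7, "enabled": true}
	fmt.Println(toJSON(modelRecord("llama", meta)))
	fmt.Println(toJSON(modelRecord("plain", nil)))
}
```

Because `len` of a nil map is zero, the same check covers both a missing `metadata` block and an explicitly empty one.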

**Example Response:**

```json
{
  "object": "list",
  "data": [
    {
      "id": "llama",
      "object": "model",
      "created": 1234567890,
      "owned_by": "llama-swap",
      "name": "llama 3.1 8B",
      "description": "A small but capable model",
      "llamaswap_meta": {
        "port": 10001,
        "temperature": 0.7,
        "note": "The llama is running on port 10001 temp=0.7, context=16384",
        "a_list": [1, 1.23, "macros are OK in list and dictionary types: llama"],
        "an_obj": {
          "a": "1",
          "b": 2,
          "c": [0.7, false, "model: llama"]
        }
      }
    }
  ]
}
```

### 5. Validation and Error Handling

**Macro Validation:**

- Extend `validateMacro()` to accept values of type `any`
- Verify macro values are scalar types: `string`, `int`, `float64`, `bool`
- Reject complex types (maps, slices, structs) as macro values
- Maintain existing validation for macro names and lengths
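
The scalar check listed above amounts to a type switch. The sketch below is a reduced illustration: `validateScalar` is a hypothetical name, and it covers only the four types named in this plan, leaving out the name and length checks.

```go
package main

import "fmt"

// validateScalar is a hypothetical, reduced version of the type check:
// it accepts the four scalar types named in the plan and rejects the rest.
func validateScalar(name string, value any) error {
	switch value.(type) {
	case string, int, float64, bool:
		return nil
	default:
		// %T reports the dynamic type, e.g. []interface {} or map[string]interface {}.
		return fmt.Errorf("macro %q has invalid type %T, must be a scalar type", name, value)
	}
}

func main() {
	values := map[string]any{
		"STR": "test",
		"NUM": 42,
		"ARR": []any{1, 2, 3},
		"MAP": map[string]any{"key": "value"},
	}
	for name, v := range values {
		fmt.Println(name, validateScalar(name, v))
	}
}
```

Including the rejected type in the message via `%T` gives the user enough to locate the offending YAML value without reading the loader code.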

**Configuration Loading:**

- Fail fast if unknown macros are found in metadata
- Provide clear error messages indicating which model and field contains errors
- Ensure macros in metadata follow same rules as macros in cmd/proxy fields

## Testing Plan

### Test 1: Model-Level Macros with Different Types

**File:** [proxy/config/model_config_test.go](proxy/config/model_config_test.go)

**Test Cases:**

- Define model with macros of each scalar type
- Verify metadata correctly substitutes and preserves types
- Test direct substitution (`port: ${PORT}`)
- Test string interpolation (`note: "Port is ${PORT}"`)
- Verify nested objects and arrays work correctly

### Test 2: Global and Model Macro Precedence

**File:** [proxy/config/config_test.go](proxy/config/config_test.go)

**Test Cases:**

- Define same macro at global and model level with different types
- Verify model-level macro takes precedence
- Test metadata uses correct macro value
- Verify type is preserved from the winning macro
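
The precedence rule under test can be pictured as a two-pass map merge in which the model-level pass runs last. `mergeMacros` is a hypothetical name for this sketch; the loader's own merge code may be structured differently.

```go
package main

import "fmt"

// mergeMacros is a hypothetical helper: global macros are copied first and
// model-level macros second, so a model-level entry replaces both the value
// and the type of a global entry with the same name.
func mergeMacros(global, model map[string]any) map[string]any {
	merged := make(map[string]any, len(global)+len(model))
	for name, value := range global {
		merged[name] = value
	}
	for name, value := range model {
		merged[name] = value
	}
	return merged
}

func main() {
	global := map[string]any{"TEMP": "0.5", "GLOBAL_VAL": "global"}
	model := map[string]any{"TEMP": 0.9, "LOCAL_VAL": "local"}
	merged := mergeMacros(global, model)
	fmt.Printf("TEMP=%v (%T)\n", merged["TEMP"], merged["TEMP"])
}
```

Here the global `TEMP` is a string and the model-level `TEMP` is a float, which mirrors the "different types" case: the merged entry is the float from the model level.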

### Test 3: Macro Validation

**File:** [proxy/config/config_test.go](proxy/config/config_test.go)

**Test Cases:**

- Test that complex types (maps, arrays) are rejected as macro values
  - Verify error message includes: macro name and type that was rejected
- Test that scalar types (string, int, float, bool) are accepted
  - Each type should load without error
- Test macro name validation still works with `any` types
  - Invalid characters, reserved names, length limits should still be enforced

### Test 4: Metadata in API Response

**File:** [proxy/proxymanager_test.go](proxy/proxymanager_test.go)

**Existing Test:** `TestProxyManager_ListModelsHandler`

**Test Cases:**

- Model with metadata → verify `llamaswap_meta` key appears
- Model without metadata → verify `llamaswap_meta` key is absent
- Verify all types are correctly marshaled to JSON
- Verify nested structures are preserved
- Verify macro substitution has occurred before serialization

### Test 5: Unknown Macros in Metadata

**File:** [proxy/config/config_test.go](proxy/config/config_test.go)

**Test Cases:**

- Use undefined macro in metadata
- Verify configuration loading fails with clear error
- Error should indicate model name and that macro is undefined

### Test 6: Recursive Substitution

**File:** [proxy/config/config_test.go](proxy/config/config_test.go)

**Test Cases:**

- Metadata with deeply nested structures
- Arrays containing objects with macros
- Objects containing arrays with macros
- Mixed string interpolation and direct substitution at various nesting levels

## Checklist

### Configuration Schema Changes

- [x] Change `MacroList` type from `map[string]string` to `map[string]any` in [proxy/config/config.go:19](proxy/config/config.go#L19)
- [x] Add `Metadata map[string]any` field to `ModelConfig` struct in [proxy/config/model_config.go:37](proxy/config/model_config.go#L37)
- [x] Update `validateMacro()` function signature to accept `any` type for values
- [x] Add validation logic to ensure macro values are scalar types only

### Macro Substitution Logic

- [x] Create generic recursive function `substituteMetadataMacros()` to handle `any` types
- [x] Implement type-preserving direct substitution logic
- [x] Implement string interpolation with type conversion
- [x] Handle maps: recursively process all values
- [x] Handle slices: recursively process all elements
- [x] Handle scalar types: perform string-based macro substitution if value is string
- [x] Integrate macro substitution into `LoadConfigFromReader()` after existing macro expansion
- [x] Update existing macro substitution calls to use merged macros with correct types

### API Response Changes

- [x] Modify `listModelsHandler()` in [proxy/proxymanager.go:350](proxy/proxymanager.go#L350)
- [x] Add `llamaswap_meta` field to model records when metadata exists
- [x] Ensure empty metadata results in omitted `llamaswap_meta` key
- [x] Verify JSON marshaling preserves all types correctly

### Testing - Config Package

- [x] Add test for string macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for int macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for float macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for bool macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for string interpolation in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for model-level macro precedence: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for nested structures in metadata: [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for unknown macro in metadata (should error): [proxy/config/config_test.go](proxy/config/config_test.go)
- [x] Add test for invalid macro type validation: [proxy/config/config_test.go](proxy/config/config_test.go)

### Testing - Model Config Package

- [x] Add test cases to [proxy/config/model_config_test.go](proxy/config/model_config_test.go) for metadata unmarshaling
- [x] Test metadata with various scalar types
- [x] Test metadata with nested objects and arrays

### Testing - Proxy Manager

- [x] Update `TestProxyManager_ListModelsHandler` in [proxy/proxymanager_test.go](proxy/proxymanager_test.go)
- [x] Add test case for model with metadata
- [x] Add test case for model without metadata
- [x] Verify `llamaswap_meta` key presence/absence
- [x] Verify type preservation in JSON output
- [x] Verify macro substitution has occurred

### Documentation

- [x] Verify [config.example.yaml](config.example.yaml) already has complete metadata examples (lines 149-171)
- [x] No additional documentation needed per project instructions

## Known Issues and Considerations

### Inconsistencies

None identified. The plan references the correct existing example in [config.example.yaml:149-171](config.example.yaml#L149-L171).

### Design Decisions

1. **Why `llamaswap_meta` instead of merging into record?**

   - Avoids potential collisions with OpenAI API standard fields
   - Makes it clear this is llama-swap specific metadata
   - Easier for clients to distinguish standard vs. custom fields

2. **Why support nested structures?**

   - Provides maximum flexibility for users
   - Aligns with the schemaless design principle
   - Example config already demonstrates this capability

3. **Why validate macro types?**
   - Prevents confusing behavior (e.g., substituting a map)
   - Makes configuration errors explicit at load time
   - Simpler implementation and testing
@@ -67,7 +67,8 @@ models:
|
||||
# - macros defined here override macros defined in the global macros section
|
||||
# - model level macros follow the same rules as global macros
|
||||
macros:
|
||||
"default_ctx": "16384"
|
||||
"default_ctx": 16384
|
||||
"temp": 0.7
|
||||
|
||||
# cmd: the command to run to start the inference server.
|
||||
# - required
|
||||
@@ -79,6 +80,7 @@ models:
|
||||
${latest-llama}
|
||||
--model path/to/llama-8B-Q4_K_M.gguf
|
||||
--ctx-size ${default_ctx}
|
||||
--temperature ${temp}
|
||||
|
||||
# name: a display name for the model
|
||||
# - optional, default: empty string
|
||||
@@ -144,6 +146,30 @@ models:
|
||||
# - recommended to stick to sampling parameters
|
||||
stripParams: "temperature, top_p, top_k"
|
||||
|
||||
# metadata: a dictionary of arbitrary values that are included in /v1/models
|
||||
# - optional, default: empty dictionary
|
||||
# - while metadata can contain complex types, it is recommended to keep it simple
|
||||
# - metadata is only passed through in /v1/models responses
|
||||
metadata:
|
||||
# port will remain an integer
|
||||
port: ${PORT}
|
||||
|
||||
# the ${temp} macro will remain a float
|
||||
temperature: ${temp}
|
||||
note: "The ${MODEL_ID} is running on port ${PORT} temp=${temp}, context=${default_ctx}"
|
||||
|
||||
a_list:
|
||||
- 1
|
||||
- 1.23
|
||||
- "macros are OK in list and dictionary types: ${MODEL_ID}"
|
||||
|
||||
an_obj:
|
||||
a: "1"
|
||||
b: 2
|
||||
# objects can contain complex types with macro substitution
|
||||
# becomes: c: [0.7, false, "model: llama"]
|
||||
c: ["${temp}", false, "model: ${MODEL_ID}"]
|
||||
|
||||
# concurrencyLimit: overrides the allowed number of active parallel requests to a model
|
||||
# - optional, default: 0
|
||||
# - useful for limiting the number of active parallel requests a model can process
|
||||
|
||||
@@ -16,7 +16,7 @@ import (
|
||||
|
||||
const DEFAULT_GROUP_ID = "(default)"
|
||||
|
||||
type MacroList map[string]string
|
||||
type MacroList map[string]any
|
||||
|
||||
type GroupConfig struct {
|
||||
Swap bool `yaml:"swap"`
|
||||
@@ -25,6 +25,11 @@ type GroupConfig struct {
|
||||
Members []string `yaml:"members"`
|
||||
}
|
||||
|
||||
var (
|
||||
macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)
|
||||
macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)
|
||||
)
|
||||
|
||||
// set default values for GroupConfig
|
||||
func (c *GroupConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {
|
||||
type rawGroupConfig GroupConfig
|
||||
@@ -182,14 +187,18 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
|
||||
mergedMacros[k] = v
|
||||
}
|
||||
|
||||
mergedMacros["MODEL_ID"] = modelId
|
||||
|
||||
// go through model config fields: cmd, cmdStop, proxy, checkEndPoint and replace macros with macro values
|
||||
for macroName, macroValue := range mergedMacros {
|
||||
macroSlug := fmt.Sprintf("${%s}", macroName)
|
||||
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroValue)
|
||||
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroValue)
|
||||
modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroValue)
|
||||
modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroValue)
|
||||
modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroValue)
|
||||
// Convert macro value to string for command/string field substitution
|
||||
macroStr := fmt.Sprintf("%v", macroValue)
|
||||
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroStr)
|
||||
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr)
|
||||
modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr)
|
||||
modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroStr)
|
||||
modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroStr)
|
||||
}
|
||||
|
||||
// enforce ${PORT} used in both cmd and proxy
|
||||
@@ -203,16 +212,14 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
|
||||
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, "${PORT}", nextPortStr)
|
||||
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${PORT}", nextPortStr)
|
||||
modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, "${PORT}", nextPortStr)
|
||||
|
||||
// add port to merged macros so it can be used in metadata
|
||||
mergedMacros["PORT"] = nextPort
|
||||
|
||||
nextPort++
|
||||
}
|
||||
|
||||
if strings.Contains(modelConfig.Cmd, "${MODEL_ID}") || strings.Contains(modelConfig.CmdStop, "${MODEL_ID}") {
|
||||
modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, "${MODEL_ID}", modelId)
|
||||
modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${MODEL_ID}", modelId)
|
||||
}
|
||||
|
||||
// make sure there are no unknown macros that have not been replaced
|
||||
macroPattern := regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`)
|
||||
fieldMap := map[string]string{
|
||||
"cmd": modelConfig.Cmd,
|
||||
"cmdStop": modelConfig.CmdStop,
|
||||
@@ -222,7 +229,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
|
||||
}
|
||||
|
||||
for fieldName, fieldValue := range fieldMap {
|
||||
matches := macroPattern.FindAllStringSubmatch(fieldValue, -1)
|
||||
matches := macroPatternRegex.FindAllStringSubmatch(fieldValue, -1)
|
||||
for _, match := range matches {
|
||||
macroName := match[1]
|
||||
if macroName == "PID" && fieldName == "cmdStop" {
|
||||
@@ -234,6 +241,15 @@ func LoadConfigFromReader(r io.Reader) (Config, error) {
|
||||
}
|
||||
}
|
||||
|
||||
// Apply macro substitution to metadata
|
||||
if len(modelConfig.Metadata) > 0 {
|
||||
substitutedMetadata, err := substituteMetadataMacros(modelConfig.Metadata, mergedMacros)
|
||||
if err != nil {
|
||||
return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error())
|
||||
}
|
||||
modelConfig.Metadata = substitutedMetadata.(map[string]any)
|
||||
}
|
||||
|
||||
config.Models[modelId] = modelConfig
|
||||
}
|
||||
|
||||
@@ -296,7 +312,7 @@ func AddDefaultGroupToConfig(config Config) Config {
|
||||
}
|
||||
} else {
|
||||
// iterate over existing group members and add non-grouped models into the default group
|
||||
for modelName, _ := range config.Models {
|
||||
for modelName := range config.Models {
|
||||
foundModel := false
|
||||
found:
|
||||
// search for the model in existing groups
|
||||
@@ -369,21 +385,26 @@ func StripComments(cmdStr string) string {
|
||||
return strings.Join(cleanedLines, "\n")
|
||||
}
|
||||
|
||||
var (
|
||||
macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`)
|
||||
)
|
||||
|
||||
// validateMacro validates macro name and value constraints
|
||||
func validateMacro(name, value string) error {
|
||||
func validateMacro(name string, value any) error {
|
||||
if len(name) >= 64 {
|
||||
return fmt.Errorf("macro name '%s' exceeds maximum length of 63 characters", name)
|
||||
}
|
||||
if !macroNameRegex.MatchString(name) {
|
||||
return fmt.Errorf("macro name '%s' contains invalid characters, must match pattern ^[a-zA-Z0-9_-]+$", name)
|
||||
}
|
||||
if len(value) >= 1024 {
|
||||
|
||||
// Validate that value is a scalar type
|
||||
switch v := value.(type) {
|
||||
case string:
|
||||
if len(v) >= 1024 {
|
||||
return fmt.Errorf("macro value for '%s' exceeds maximum length of 1024 characters", name)
|
||||
}
|
||||
case int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, float32, float64, bool:
|
||||
// These types are allowed
|
||||
default:
|
||||
return fmt.Errorf("macro '%s' has invalid type %T, must be a scalar type (string, int, float, or bool)", name, value)
|
||||
}
|
||||
|
||||
switch name {
|
||||
case "PORT", "MODEL_ID":
|
||||
@@ -392,3 +413,63 @@ func validateMacro(name, value string) error {
|
||||
|
||||
return nil
|
||||
}
|
||||
|
||||
// substituteMetadataMacros recursively substitutes macros in metadata structures
|
||||
// Direct substitution (key: ${macro}) preserves the macro's type
|
||||
// Interpolated substitution (key: "text ${macro}") converts to string
|
||||
func substituteMetadataMacros(value any, macros MacroList) (any, error) {
|
||||
switch v := value.(type) {
|
||||
case string:
|
||||
// Check if this is a direct macro substitution
|
||||
if strings.HasPrefix(v, "${") && strings.HasSuffix(v, "}") && strings.Count(v, "${") == 1 {
|
||||
macroName := v[2 : len(v)-1]
|
||||
if macroValue, exists := macros[macroName]; exists {
|
||||
return macroValue, nil
|
||||
}
|
||||
return nil, fmt.Errorf("unknown macro '${%s}' in metadata", macroName)
|
||||
}
|
||||
|
||||
// Handle string interpolation
|
||||
matches := macroPatternRegex.FindAllStringSubmatch(v, -1)
|
||||
result := v
|
||||
for _, match := range matches {
|
||||
macroName := match[1]
|
||||
macroValue, exists := macros[macroName]
|
||||
if !exists {
|
||||
return nil, fmt.Errorf("unknown macro '${%s}' in metadata", macroName)
|
||||
}
|
||||
// Convert macro value to string for interpolation
|
||||
macroStr := fmt.Sprintf("%v", macroValue)
|
||||
result = strings.ReplaceAll(result, match[0], macroStr)
|
||||
}
|
||||
return result, nil
|
||||
|
||||
case map[string]any:
|
||||
// Recursively process map values
|
||||
newMap := make(map[string]any)
|
||||
for key, val := range v {
|
||||
newVal, err := substituteMetadataMacros(val, macros)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
newMap[key] = newVal
|
||||
}
|
||||
return newMap, nil
|
||||
|
||||
case []any:
|
||||
// Recursively process slice elements
|
||||
newSlice := make([]any, len(v))
|
||||
for i, val := range v {
|
||||
newVal, err := substituteMetadataMacros(val, macros)
|
||||
if err != nil {
|
||||
return nil, err
|
||||
}
|
||||
newSlice[i] = newVal
|
||||
}
|
||||
return newSlice, nil
|
||||
|
||||
default:
|
||||
// Return scalar types as-is
|
||||
return value, nil
|
||||
}
|
||||
}
|
||||
|
||||
@@ -163,7 +163,7 @@ groups:
|
||||
expected := Config{
|
||||
LogLevel: "info",
|
||||
StartPort: 5800,
|
||||
Macros: map[string]string{
|
||||
Macros: MacroList{
|
||||
"svr-path": "path/to/server",
|
||||
},
|
||||
Hooks: HooksConfig{
|
||||
|
||||
@@ -517,3 +517,243 @@ models:
|
||||
assert.NoError(t, err)
|
||||
assert.Equal(t, "/path/to/server -p 9000 -hf author/model:F16", strings.Join(sanitizedCmd3, " "))
|
||||
}
|
||||
|
||||
func TestConfig_TypedMacrosInMetadata(t *testing.T) {
|
||||
content := `
|
||||
startPort: 10000
|
||||
macros:
|
||||
PORT_NUM: 10001
|
||||
TEMP: 0.7
|
||||
ENABLED: true
|
||||
NAME: "llama model"
|
||||
CTX: 16384
|
||||
|
||||
models:
|
||||
test-model:
|
||||
cmd: /path/to/server -p ${PORT}
|
||||
metadata:
|
||||
port: ${PORT_NUM}
|
||||
temperature: ${TEMP}
|
||||
enabled: ${ENABLED}
|
||||
model_name: ${NAME}
|
||||
context: ${CTX}
|
||||
note: "Running on port ${PORT_NUM} with temp ${TEMP} and context ${CTX}"
|
||||
`
|
||||
|
||||
config, err := LoadConfigFromReader(strings.NewReader(content))
|
||||
assert.NoError(t, err)
|
||||
|
||||
meta := config.Models["test-model"].Metadata
|
||||
assert.NotNil(t, meta)
|
||||
|
||||
// Verify direct substitution preserves types
|
||||
assert.Equal(t, 10001, meta["port"])
|
||||
assert.Equal(t, 0.7, meta["temperature"])
|
||||
assert.Equal(t, true, meta["enabled"])
|
||||
assert.Equal(t, "llama model", meta["model_name"])
|
||||
assert.Equal(t, 16384, meta["context"])
|
||||
|
||||
// Verify string interpolation converts to string
|
||||
assert.Equal(t, "Running on port 10001 with temp 0.7 and context 16384", meta["note"])
|
||||
}
|
||||
|
||||
func TestConfig_NestedStructuresInMetadata(t *testing.T) {
|
||||
content := `
|
||||
startPort: 10000
|
||||
macros:
|
||||
TEMP: 0.7
|
||||
|
||||
models:
|
||||
test-model:
|
||||
cmd: /path/to/server -p ${PORT}
|
||||
metadata:
|
||||
config:
|
||||
port: ${PORT}
|
||||
temperature: ${TEMP}
|
||||
tags: ["model:${MODEL_ID}", "port:${PORT}"]
|
||||
nested:
|
||||
deep:
|
||||
value: ${TEMP}
|
||||
`
|
||||
|
||||
config, err := LoadConfigFromReader(strings.NewReader(content))
|
||||
assert.NoError(t, err)
|
||||
|
||||
meta := config.Models["test-model"].Metadata
|
||||
assert.NotNil(t, meta)
|
||||
|
||||
// Verify nested objects
|
||||
configMap := meta["config"].(map[string]any)
|
||||
assert.Equal(t, 10000, configMap["port"])
|
||||
assert.Equal(t, 0.7, configMap["temperature"])
|
||||
|
||||
// Verify arrays
|
||||
tags := meta["tags"].([]any)
|
||||
assert.Equal(t, "model:test-model", tags[0])
|
||||
assert.Equal(t, "port:10000", tags[1])
|
||||
|
||||
// Verify deeply nested structures
|
||||
nested := meta["nested"].(map[string]any)
|
||||
deep := nested["deep"].(map[string]any)
|
||||
assert.Equal(t, 0.7, deep["value"])
|
||||
}
|
||||
|
||||
func TestConfig_ModelLevelMacroPrecedenceInMetadata(t *testing.T) {
|
||||
content := `
|
||||
startPort: 10000
|
||||
macros:
|
||||
TEMP: 0.5
|
||||
GLOBAL_VAL: "global"
|
||||
|
||||
models:
|
||||
test-model:
|
||||
cmd: /path/to/server -p ${PORT}
|
||||
macros:
|
||||
TEMP: 0.9
|
||||
LOCAL_VAL: "local"
|
||||
metadata:
|
||||
temperature: ${TEMP}
|
||||
global: ${GLOBAL_VAL}
|
||||
local: ${LOCAL_VAL}
|
||||
`
|
||||
|
||||
config, err := LoadConfigFromReader(strings.NewReader(content))
|
||||
assert.NoError(t, err)
|
||||
|
||||
meta := config.Models["test-model"].Metadata
|
||||
assert.NotNil(t, meta)
|
||||
|
||||
// Model-level macro should override global
|
||||
assert.Equal(t, 0.9, meta["temperature"])
|
||||
// Global macro should be accessible
|
||||
assert.Equal(t, "global", meta["global"])
|
||||
// Model-level macro should be accessible
|
||||
assert.Equal(t, "local", meta["local"])
|
||||
}
|
||||
|
||||
func TestConfig_UnknownMacroInMetadata(t *testing.T) {
|
||||
content := `
|
||||
startPort: 10000
|
||||
models:
|
||||
test-model:
|
||||
cmd: /path/to/server -p ${PORT}
|
||||
metadata:
|
||||
value: ${UNKNOWN_MACRO}
|
||||
`
|
||||
|
||||
_, err := LoadConfigFromReader(strings.NewReader(content))
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "test-model")
|
||||
assert.Contains(t, err.Error(), "UNKNOWN_MACRO")
|
||||
}
|
||||
|
||||
func TestConfig_InvalidMacroType(t *testing.T) {
|
||||
content := `
|
||||
startPort: 10000
|
||||
macros:
|
||||
INVALID:
|
||||
nested: value
|
||||
|
||||
models:
|
||||
test-model:
|
||||
cmd: /path/to/server -p ${PORT}
|
||||
`
|
||||
|
||||
_, err := LoadConfigFromReader(strings.NewReader(content))
|
||||
assert.Error(t, err)
|
||||
assert.Contains(t, err.Error(), "INVALID")
|
||||
assert.Contains(t, err.Error(), "must be a scalar type")
|
||||
}

func TestConfig_MacroTypeValidation(t *testing.T) {
	tests := []struct {
		name      string
		yaml      string
		shouldErr bool
	}{
		{
			name: "string macro",
			yaml: `
startPort: 10000
macros:
  STR: "test"
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: false,
		},
		{
			name: "int macro",
			yaml: `
startPort: 10000
macros:
  NUM: 42
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: false,
		},
		{
			name: "float macro",
			yaml: `
startPort: 10000
macros:
  FLOAT: 3.14
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: false,
		},
		{
			name: "bool macro",
			yaml: `
startPort: 10000
macros:
  BOOL: true
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: false,
		},
		{
			name: "array macro (invalid)",
			yaml: `
startPort: 10000
macros:
  ARR: [1, 2, 3]
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: true,
		},
		{
			name: "map macro (invalid)",
			yaml: `
startPort: 10000
macros:
  MAP:
    key: value
models:
  test-model:
    cmd: /path/to/server -p ${PORT}
`,
			shouldErr: true,
		},
	}

	for _, tt := range tests {
		t.Run(tt.name, func(t *testing.T) {
			_, err := LoadConfigFromReader(strings.NewReader(tt.yaml))
			if tt.shouldErr {
				assert.Error(t, err)
			} else {
				assert.NoError(t, err)
			}
		})
	}
}
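
The tests above only pin down the observable behaviour: string, int, float, and bool macros load, while arrays and maps are rejected with an error containing "must be a scalar type". A minimal sketch of a check that would satisfy them is shown below. The helper name `validateMacroValue` and the exact set of accepted Go types are assumptions for illustration, not the code from this commit.

```go
package main

import "fmt"

// validateMacroValue is a hypothetical helper (not taken from llama-swap).
// It accepts the scalar types a YAML decoder typically produces and
// rejects everything else, such as slices and maps.
func validateMacroValue(name string, value any) error {
	switch value.(type) {
	case string, bool, int, int64, uint64, float64:
		return nil
	default:
		return fmt.Errorf("macro %q must be a scalar type, got %T", name, value)
	}
}

func main() {
	fmt.Println(validateMacroValue("NUM", 42))
	fmt.Println(validateMacroValue("ARR", []any{1, 2, 3}))
}
```

Reporting the macro name in the error matches what `TestConfig_InvalidMacroType` asserts on, so a user can tell which entry to fix.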

@@ -155,7 +155,7 @@ groups:
 	expected := Config{
 		LogLevel: "info",
 		StartPort: 5800,
-		Macros: map[string]string{
+		Macros: MacroList{
 			"svr-path": "path/to/server",
 		},
 		Models: map[string]ModelConfig{

@@ -31,6 +31,10 @@ type ModelConfig struct {
	// Macros: see #264
	// Model level macros take precedence over the global macros
	Macros MacroList `yaml:"macros"`

	// Metadata: see #264
	// Arbitrary metadata that can be exposed through the API
	Metadata map[string]any `yaml:"metadata"`
}

func (m *ModelConfig) UnmarshalYAML(unmarshal func(interface{}) error) error {

@@ -370,6 +370,13 @@ func (pm *ProxyManager) listModelsHandler(c *gin.Context) {
			record["description"] = desc
		}

		// Add metadata if present
		if len(modelConfig.Metadata) > 0 {
			record["meta"] = gin.H{
				"llamaswap": modelConfig.Metadata,
			}
		}

		data = append(data, record)
	}

@@ -282,6 +282,90 @@ func TestProxyManager_ListModelsHandler(t *testing.T) {
	assert.Empty(t, expectedModels, "not all expected models were returned")
}

func TestProxyManager_ListModelsHandler_WithMetadata(t *testing.T) {
	// Process config through LoadConfigFromReader to apply macro substitution
	configYaml := `
healthCheckTimeout: 15
logLevel: error
startPort: 10000
models:
  model1:
    cmd: /path/to/server -p ${PORT}
    macros:
      PORT_NUM: 10001
      TEMP: 0.7
      NAME: "llama"
    metadata:
      port: ${PORT_NUM}
      temperature: ${TEMP}
      enabled: true
      note: "Running on port ${PORT_NUM}"
      nested:
        value: ${TEMP}
  model2:
    cmd: /path/to/server -p ${PORT}
`
	processedConfig, err := config.LoadConfigFromReader(strings.NewReader(configYaml))
	assert.NoError(t, err)

	proxy := New(processedConfig)

	req := httptest.NewRequest("GET", "/v1/models", nil)
	w := httptest.NewRecorder()
	proxy.ServeHTTP(w, req)

	assert.Equal(t, http.StatusOK, w.Code)

	var response struct {
		Data []map[string]any `json:"data"`
	}

	err = json.Unmarshal(w.Body.Bytes(), &response)
	assert.NoError(t, err)
	assert.Len(t, response.Data, 2)

	// Find model1 and model2 in response
	var model1Data, model2Data map[string]any
	for _, model := range response.Data {
		if model["id"] == "model1" {
			model1Data = model
		} else if model["id"] == "model2" {
			model2Data = model
		}
	}

	// Verify model1 has meta.llamaswap
	assert.NotNil(t, model1Data)
	meta, exists := model1Data["meta"]
	if !assert.True(t, exists, "model1 should have meta key") {
		t.FailNow()
	}

	metaMap := meta.(map[string]any)

	lsmeta, exists := metaMap["llamaswap"]
	if !assert.True(t, exists, "model1 should have meta.llamaswap key") {
		t.FailNow()
	}

	lsmetamap := lsmeta.(map[string]any)

	// Verify type preservation
	assert.Equal(t, float64(10001), lsmetamap["port"]) // JSON numbers are float64
	assert.Equal(t, 0.7, lsmetamap["temperature"])
	assert.Equal(t, true, lsmetamap["enabled"])
	// Verify string interpolation
	assert.Equal(t, "Running on port 10001", lsmetamap["note"])
	// Verify nested structure
	nested := lsmetamap["nested"].(map[string]any)
	assert.Equal(t, 0.7, nested["value"])

	// Verify model2 does NOT have a meta key. The handler writes metadata
	// under "meta", so checking the old "llamaswap_meta" key would always pass.
	assert.NotNil(t, model2Data)
	_, exists = model2Data["meta"]
	assert.False(t, exists, "model2 should not have meta key")
}
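
The assertions in this test imply two substitution modes: a metadata value that is exactly one macro reference (`${PORT_NUM}`) keeps the macro's type, while a reference embedded in a longer string (`"Running on port ${PORT_NUM}"`) is interpolated as text, and both apply at any nesting depth. The sketch below illustrates that behaviour under stated assumptions. The function name `substituteMacros` and its treatment of any leftover `${` as an unknown macro are hypothetical and are not taken from the commit.

```go
package main

import (
	"fmt"
	"strings"
)

// substituteMacros is a hypothetical sketch, not llama-swap's implementation.
// It walks maps and slices recursively and replaces ${NAME} references.
func substituteMacros(value any, macros map[string]any) (any, error) {
	switch v := value.(type) {
	case string:
		// A string that is exactly one macro reference keeps the macro's type.
		for name, macroValue := range macros {
			if v == "${"+name+"}" {
				return macroValue, nil
			}
		}
		// Otherwise interpolate each reference as text.
		out := v
		for name, macroValue := range macros {
			out = strings.ReplaceAll(out, "${"+name+"}", fmt.Sprint(macroValue))
		}
		// Assumption: any remaining "${" means an undefined macro.
		if strings.Contains(out, "${") {
			return nil, fmt.Errorf("unknown macro in %q", out)
		}
		return out, nil
	case map[string]any:
		result := make(map[string]any, len(v))
		for key, item := range v {
			replaced, err := substituteMacros(item, macros)
			if err != nil {
				return nil, err
			}
			result[key] = replaced
		}
		return result, nil
	case []any:
		result := make([]any, len(v))
		for i, item := range v {
			replaced, err := substituteMacros(item, macros)
			if err != nil {
				return nil, err
			}
			result[i] = replaced
		}
		return result, nil
	default:
		// Non-string scalars (numbers, booleans) pass through unchanged.
		return v, nil
	}
}

func main() {
	macros := map[string]any{"PORT_NUM": 10001, "TEMP": 0.7}
	metadata := map[string]any{
		"port":   "${PORT_NUM}",
		"note":   "Running on port ${PORT_NUM}",
		"nested": map[string]any{"value": "${TEMP}"},
	}
	fmt.Println(substituteMacros(metadata, macros))
}
```

Because the result is later serialized to JSON, an `int` such as 10001 comes back as `float64` when the response is decoded, which is why the test compares against `float64(10001)`.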

func TestProxyManager_ListModelsHandler_SortedByID(t *testing.T) {
	// Intentionally add models in non-sorted order and with an unlisted model
	config := config.Config{