diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..a271fd0 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,43 @@ +# Project: llama-swap + +## Project Description: + +llama-swap is a lightweight, transparent proxy server that provides automatic model swapping to llama.cpp's server. + +## Tech stack + +- golang +- typescript, vite and react for UI (ui/) + +## Testing + +- `make test-dev` - Use this when making iterative changes. Runs `go test` and `staticcheck`. Fix any static checking errors. +- `make test-all` - Run this at the end before completing work. Includes long-running concurrency tests. + +## Workflow Tasks + +### Plan Improvements + +Work plans are located in ai-plans/. Plans written by the user may be incomplete, contain inconsistencies or errors. + +When the user asks to improve a plan follow these guidelines for expanding and improving it. + +- Identify any inconsistencies. +- Expand plans out into a detailed specification of requirements and changes to be made. +- Plans should have at least these sections: + - Title - very short, describes changes + - Overview: A more detailed summary of goal and outcomes desired + - Design Requirements: Detailed descriptions of what needs to be done + - Testing Plan: Tests to be implemented + - Checklist: A detailed list of changes to be made + +Look for "plan expansion" as explicit instructions to improve a plan. + +### Implementation of plans + +When the user says "paint it", respond with "commencing automated assembly". Then implement the changes as described by the plan. Update the checklist as you complete items. + +## General Rules + +- when summarizing changes only include details that require further action (action items) +- when there are no action items, just say "Done." 
diff --git a/Makefile b/Makefile index 330c434..2fc8e09 100644 --- a/Makefile +++ b/Makefile @@ -23,6 +23,11 @@ proxy/ui_dist/placeholder.txt: mkdir -p proxy/ui_dist touch $@ +# use cached test results while developing +test-dev: proxy/ui_dist/placeholder.txt + go test -short ./proxy/... + staticcheck ./proxy/... || true + test: proxy/ui_dist/placeholder.txt go test -short -count=1 ./proxy/... @@ -82,4 +87,4 @@ release: git tag "$$new_tag"; # Phony targets -.PHONY: all clean ui mac linux windows simple-responder test test-all +.PHONY: all clean ui mac linux windows simple-responder test test-all test-dev diff --git a/ai-plans/issue-264-add-metadata.md b/ai-plans/issue-264-add-metadata.md new file mode 100644 index 0000000..215c695 --- /dev/null +++ b/ai-plans/issue-264-add-metadata.md @@ -0,0 +1,292 @@ +# Add Model Metadata Support with Typed Macros + +## Overview + +Implement support for arbitrary metadata on model configurations that can be exposed through the `/v1/models` API endpoint. This feature extends the existing macro system to support scalar types (string, int, float, bool) instead of only strings, enabling type-safe metadata values. + +The metadata will be schemaless, allowing users to define any key-value pairs they need. Macro substitution will work within metadata values, preserving types when macros are used directly and converting to strings when macros are interpolated within strings. + +## Design Requirements + +### 1. 
Enhanced Macro System + +**Current State:** + +- Macros are defined as `map[string]string` at both global and model levels +- Only string substitution is supported +- Macros are replaced in: `cmd`, `cmdStop`, `proxy`, `checkEndpoint`, `filters.stripParams` + +**Required Changes:** + +- Change `MacroList` type from `map[string]string` to `map[string]any` +- Support scalar types: `string`, `int`, `float64`, `bool` +- Implement type-preserving macro substitution: + - Direct macro usage (`key: ${macro}`) preserves the macro's type + - Interpolated usage (`key: "text ${macro}"`) converts to string +- Add validation to ensure macro values are scalar types only +- Update existing macro substitution logic in [proxy/config/config.go](proxy/config/config.go) to handle `any` types + +**Implementation Details:** + +- Create a generic helper function to perform macro substitution that: + - Takes a value of type `any` + - Recursively processes maps, slices, and scalar values + - Replaces `${macro_name}` patterns with macro values + - Preserves types for direct substitution + - Converts to strings for interpolated substitution +- Update `validateMacro()` function to accept `any` type and validate scalar types +- Maintain backward compatibility with existing string-only macros + +### 2. Metadata Field in ModelConfig + +**Location:** [proxy/config/model_config.go](proxy/config/model_config.go) + +**Required Changes:** + +- Add `Metadata map[string]any` field to `ModelConfig` struct +- Support YAML unmarshaling of arbitrary structures (maps, arrays, scalars) +- Apply macro substitution to metadata values during config loading + +**Schema Requirements:** + +- Metadata is optional (default: empty/nil map) +- Supports nested structures (objects within objects, arrays, etc.) +- All string values within metadata undergo macro substitution +- Type preservation rules apply as described above + +### 3. 
Macro Substitution in Metadata + +**Location:** [proxy/config/config.go](proxy/config/config.go) in `LoadConfigFromReader()` + +**Process Flow:** + +1. After loading YAML configuration +2. After model-level and global macro merging +3. Apply macro substitution to `ModelConfig.Metadata` field +4. Use the same merged macros available to `cmd`, `proxy`, etc. +5. Process recursively through all nested structures + +**Substitution Rules:** + +- `port: ${PORT}` → keeps integer type from PORT macro +- `temperature: ${temp}` → keeps float type from temp macro +- `note: "Running on ${PORT}"` → converts to string `"Running on 10001"` +- Arrays and nested objects are processed recursively +- Unknown macros should cause configuration load error (consistent with existing behavior) + +### 4. API Response Updates + +**Location:** [proxy/proxymanager.go:350](proxy/proxymanager.go#L350) `listModelsHandler()` + +**Current Behavior:** + +- Returns model records with: `id`, `object`, `created`, `owned_by` +- Optionally includes: `name`, `description` + +**Required Changes:** + +- Add metadata to each model record under the key `llamaswap_meta` +- Only include `llamaswap_meta` if metadata is non-empty +- Preserve all types when marshaling to JSON +- Maintain existing sorting by model ID + +**Example Response:** + +```json +{ + "object": "list", + "data": [ + { + "id": "llama", + "object": "model", + "created": 1234567890, + "owned_by": "llama-swap", + "name": "llama 3.1 8B", + "description": "A small but capable model", + "llamaswap_meta": { + "port": 10001, + "temperature": 0.7, + "note": "The llama is running on port 10001 temp=0.7, context=16384", + "a_list": [1, 1.23, "macros are OK in list and dictionary types: llama"], + "an_obj": { + "a": "1", + "b": 2, + "c": [0.7, false, "model: llama"] + } + } + } + ] +} +``` + +### 5. 
Validation and Error Handling + +**Macro Validation:** + +- Extend `validateMacro()` to accept values of type `any` +- Verify macro values are scalar types: `string`, `int`, `float64`, `bool` +- Reject complex types (maps, slices, structs) as macro values +- Maintain existing validation for macro names and lengths + +**Configuration Loading:** + +- Fail fast if unknown macros are found in metadata +- Provide clear error messages indicating which model and field contains errors +- Ensure macros in metadata follow same rules as macros in cmd/proxy fields + +## Testing Plan + +### Test 1: Model-Level Macros with Different Types + +**File:** [proxy/config/model_config_test.go](proxy/config/model_config_test.go) + +**Test Cases:** + +- Define model with macros of each scalar type +- Verify metadata correctly substitutes and preserves types +- Test direct substitution (`port: ${PORT}`) +- Test string interpolation (`note: "Port is ${PORT}"`) +- Verify nested objects and arrays work correctly + +### Test 2: Global and Model Macro Precedence + +**File:** [proxy/config/config_test.go](proxy/config/config_test.go) + +**Test Cases:** + +- Define same macro at global and model level with different types +- Verify model-level macro takes precedence +- Test metadata uses correct macro value +- Verify type is preserved from the winning macro + +### Test 3: Macro Validation + +**File:** [proxy/config/config_test.go](proxy/config/config_test.go) + +**Test Cases:** + +- Test that complex types (maps, arrays) are rejected as macro values + - Verify error message includes: macro name and type that was rejected +- Test that scalar types (string, int, float, bool) are accepted + - Each type should load without error +- Test macro name validation still works with `any` types + - Invalid characters, reserved names, length limits should still be enforced + +### Test 4: Metadata in API Response + +**File:** [proxy/proxymanager_test.go](proxy/proxymanager_test.go) + +**Existing Test:** 
`TestProxyManager_ListModelsHandler` + +**Test Cases:** + +- Model with metadata → verify `llamaswap_meta` key appears +- Model without metadata → verify `llamaswap_meta` key is absent +- Verify all types are correctly marshaled to JSON +- Verify nested structures are preserved +- Verify macro substitution has occurred before serialization + +### Test 5: Unknown Macros in Metadata + +**File:** [proxy/config/config_test.go](proxy/config/config_test.go) + +**Test Cases:** + +- Use undefined macro in metadata +- Verify configuration loading fails with clear error +- Error should indicate model name and that macro is undefined + +### Test 6: Recursive Substitution + +**File:** [proxy/config/config_test.go](proxy/config/config_test.go) + +**Test Cases:** + +- Metadata with deeply nested structures +- Arrays containing objects with macros +- Objects containing arrays with macros +- Mixed string interpolation and direct substitution at various nesting levels + +## Checklist + +### Configuration Schema Changes + +- [x] Change `MacroList` type from `map[string]string` to `map[string]any` in [proxy/config/config.go:19](proxy/config/config.go#L19) +- [x] Add `Metadata map[string]any` field to `ModelConfig` struct in [proxy/config/model_config.go:37](proxy/config/model_config.go#L37) +- [x] Update `validateMacro()` function signature to accept `any` type for values +- [x] Add validation logic to ensure macro values are scalar types only + +### Macro Substitution Logic + +- [x] Create generic recursive function `substituteMetadataMacros()` to handle `any` types +- [x] Implement type-preserving direct substitution logic +- [x] Implement string interpolation with type conversion +- [x] Handle maps: recursively process all values +- [x] Handle slices: recursively process all elements +- [x] Handle scalar types: perform string-based macro substitution if value is string +- [x] Integrate macro substitution into `LoadConfigFromReader()` after existing macro expansion +- [x] Update 
existing macro substitution calls to use merged macros with correct types + +### API Response Changes + +- [x] Modify `listModelsHandler()` in [proxy/proxymanager.go:350](proxy/proxymanager.go#L350) +- [x] Add `llamaswap_meta` field to model records when metadata exists +- [x] Ensure empty metadata results in omitted `llamaswap_meta` key +- [x] Verify JSON marshaling preserves all types correctly + +### Testing - Config Package + +- [x] Add test for string macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for int macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for float macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for bool macros in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for string interpolation in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for model-level macro precedence: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for nested structures in metadata: [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for unknown macro in metadata (should error): [proxy/config/config_test.go](proxy/config/config_test.go) +- [x] Add test for invalid macro type validation: [proxy/config/config_test.go](proxy/config/config_test.go) + +### Testing - Model Config Package + +- [x] Add test cases to [proxy/config/model_config_test.go](proxy/config/model_config_test.go) for metadata unmarshaling +- [x] Test metadata with various scalar types +- [x] Test metadata with nested objects and arrays + +### Testing - Proxy Manager + +- [x] Update `TestProxyManager_ListModelsHandler` in [proxy/proxymanager_test.go](proxy/proxymanager_test.go) +- [x] Add test case for model with metadata +- [x] Add test case for model without metadata +- [x] Verify `llamaswap_meta` key presence/absence +- [x] Verify type preservation in 
JSON output +- [x] Verify macro substitution has occurred + +### Documentation + +- [x] Verify [config.example.yaml](config.example.yaml) already has complete metadata examples (lines 149-171) +- [x] No additional documentation needed per project instructions + +## Known Issues and Considerations + +### Inconsistencies + +None identified. The plan references the correct existing example in [config.example.yaml:149-171](config.example.yaml#L149-L171). + +### Design Decisions + +1. **Why `llamaswap_meta` instead of merging into record?** + + - Avoids potential collisions with OpenAI API standard fields + - Makes it clear this is llama-swap specific metadata + - Easier for clients to distinguish standard vs. custom fields + +2. **Why support nested structures?** + + - Provides maximum flexibility for users + - Aligns with the schemaless design principle + - Example config already demonstrates this capability + +3. **Why validate macro types?** + - Prevents confusing behavior (e.g., substituting a map) + - Makes configuration errors explicit at load time + - Simpler implementation and testing diff --git a/config.example.yaml b/config.example.yaml index 8699a03..711e427 100644 --- a/config.example.yaml +++ b/config.example.yaml @@ -67,7 +67,8 @@ models: # - macros defined here override macros defined in the global macros section # - model level macros follow the same rules as global macros macros: - "default_ctx": "16384" + "default_ctx": 16384 + "temp": 0.7 # cmd: the command to run to start the inference server. 
# - required @@ -79,6 +80,7 @@ models: ${latest-llama} --model path/to/llama-8B-Q4_K_M.gguf --ctx-size ${default_ctx} + --temperature ${temp} # name: a display name for the model # - optional, default: empty string @@ -144,6 +146,30 @@ models: # - recommended to stick to sampling parameters stripParams: "temperature, top_p, top_k" + # metadata: a dictionary of arbitrary values that are included in /v1/models + # - optional, default: empty dictionary + # - while metadata can contain complex types, it is recommended to keep it simple + # - metadata is only passed through in /v1/models responses + metadata: + # port will remain an integer + port: ${PORT} + + # the ${temp} macro will remain a float + temperature: ${temp} + note: "The ${MODEL_ID} is running on port ${PORT} temp=${temp}, context=${default_ctx}" + + a_list: + - 1 + - 1.23 + - "macros are OK in list and dictionary types: ${MODEL_ID}" + + an_obj: + a: "1" + b: 2 + # objects can contain complex types with macro substitution + # becomes: c: [0.7, false, "model: llama"] + c: ["${temp}", false, "model: ${MODEL_ID}"] + # concurrencyLimit: overrides the allowed number of active parallel requests to a model # - optional, default: 0 # - useful for limiting the number of active parallel requests a model can process diff --git a/proxy/config/config.go b/proxy/config/config.go index e043202..2bf4067 100644 --- a/proxy/config/config.go +++ b/proxy/config/config.go @@ -16,7 +16,7 @@ import ( const DEFAULT_GROUP_ID = "(default)" -type MacroList map[string]string +type MacroList map[string]any type GroupConfig struct { Swap bool `yaml:"swap"` @@ -25,6 +25,11 @@ type GroupConfig struct { Members []string `yaml:"members"` } +var ( + macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`) + macroPatternRegex = regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`) +) + // set default values for GroupConfig func (c *GroupConfig) UnmarshalYAML(unmarshal func(interface{}) error) error { type rawGroupConfig GroupConfig @@ -182,14 +187,18
@@ func LoadConfigFromReader(r io.Reader) (Config, error) { mergedMacros[k] = v } + mergedMacros["MODEL_ID"] = modelId + // go through model config fields: cmd, cmdStop, proxy, checkEndPoint and replace macros with macro values for macroName, macroValue := range mergedMacros { macroSlug := fmt.Sprintf("${%s}", macroName) - modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroValue) - modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroValue) - modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroValue) - modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroValue) - modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroValue) + // Convert macro value to string for command/string field substitution + macroStr := fmt.Sprintf("%v", macroValue) + modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, macroSlug, macroStr) + modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, macroSlug, macroStr) + modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, macroSlug, macroStr) + modelConfig.CheckEndpoint = strings.ReplaceAll(modelConfig.CheckEndpoint, macroSlug, macroStr) + modelConfig.Filters.StripParams = strings.ReplaceAll(modelConfig.Filters.StripParams, macroSlug, macroStr) } // enforce ${PORT} used in both cmd and proxy @@ -203,16 +212,14 @@ func LoadConfigFromReader(r io.Reader) (Config, error) { modelConfig.Cmd = strings.ReplaceAll(modelConfig.Cmd, "${PORT}", nextPortStr) modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${PORT}", nextPortStr) modelConfig.Proxy = strings.ReplaceAll(modelConfig.Proxy, "${PORT}", nextPortStr) + + // add port to merged macros so it can be used in metadata + mergedMacros["PORT"] = nextPort + nextPort++ } - if strings.Contains(modelConfig.Cmd, "${MODEL_ID}") || strings.Contains(modelConfig.CmdStop, "${MODEL_ID}") { - modelConfig.Cmd = 
strings.ReplaceAll(modelConfig.Cmd, "${MODEL_ID}", modelId) - modelConfig.CmdStop = strings.ReplaceAll(modelConfig.CmdStop, "${MODEL_ID}", modelId) - } - // make sure there are no unknown macros that have not been replaced - macroPattern := regexp.MustCompile(`\$\{([a-zA-Z0-9_-]+)\}`) fieldMap := map[string]string{ "cmd": modelConfig.Cmd, "cmdStop": modelConfig.CmdStop, @@ -222,7 +229,7 @@ func LoadConfigFromReader(r io.Reader) (Config, error) { } for fieldName, fieldValue := range fieldMap { - matches := macroPattern.FindAllStringSubmatch(fieldValue, -1) + matches := macroPatternRegex.FindAllStringSubmatch(fieldValue, -1) for _, match := range matches { macroName := match[1] if macroName == "PID" && fieldName == "cmdStop" { @@ -234,6 +241,15 @@ func LoadConfigFromReader(r io.Reader) (Config, error) { } } + // Apply macro substitution to metadata + if len(modelConfig.Metadata) > 0 { + substitutedMetadata, err := substituteMetadataMacros(modelConfig.Metadata, mergedMacros) + if err != nil { + return Config{}, fmt.Errorf("model %s metadata: %s", modelId, err.Error()) + } + modelConfig.Metadata = substitutedMetadata.(map[string]any) + } + config.Models[modelId] = modelConfig } @@ -296,7 +312,7 @@ func AddDefaultGroupToConfig(config Config) Config { } } else { // iterate over existing group members and add non-grouped models into the default group - for modelName, _ := range config.Models { + for modelName := range config.Models { foundModel := false found: // search for the model in existing groups @@ -369,20 +385,25 @@ func StripComments(cmdStr string) string { return strings.Join(cleanedLines, "\n") } -var ( - macroNameRegex = regexp.MustCompile(`^[a-zA-Z0-9_-]+$`) -) - // validateMacro validates macro name and value constraints -func validateMacro(name, value string) error { +func validateMacro(name string, value any) error { if len(name) >= 64 { return fmt.Errorf("macro name '%s' exceeds maximum length of 63 characters", name) } if 
!macroNameRegex.MatchString(name) { return fmt.Errorf("macro name '%s' contains invalid characters, must match pattern ^[a-zA-Z0-9_-]+$", name) } - if len(value) >= 1024 { - return fmt.Errorf("macro value for '%s' exceeds maximum length of 1024 characters", name) + + // Validate that value is a scalar type + switch v := value.(type) { + case string: + if len(v) >= 1024 { + return fmt.Errorf("macro value for '%s' exceeds maximum length of 1024 characters", name) + } + case int, int8, int16, int32, int64, uint, uint8, uint16, uint32, uint64, float32, float64, bool: + // These types are allowed + default: + return fmt.Errorf("macro '%s' has invalid type %T, must be a scalar type (string, int, float, or bool)", name, value) } switch name { @@ -392,3 +413,63 @@ func validateMacro(name, value string) error { return nil } + +// substituteMetadataMacros recursively substitutes macros in metadata structures +// Direct substitution (key: ${macro}) preserves the macro's type +// Interpolated substitution (key: "text ${macro}") converts to string +func substituteMetadataMacros(value any, macros MacroList) (any, error) { + switch v := value.(type) { + case string: + // Check if this is a direct macro substitution + if strings.HasPrefix(v, "${") && strings.HasSuffix(v, "}") && strings.Count(v, "${") == 1 { + macroName := v[2 : len(v)-1] + if macroValue, exists := macros[macroName]; exists { + return macroValue, nil + } + return nil, fmt.Errorf("unknown macro '${%s}' in metadata", macroName) + } + + // Handle string interpolation + matches := macroPatternRegex.FindAllStringSubmatch(v, -1) + result := v + for _, match := range matches { + macroName := match[1] + macroValue, exists := macros[macroName] + if !exists { + return nil, fmt.Errorf("unknown macro '${%s}' in metadata", macroName) + } + // Convert macro value to string for interpolation + macroStr := fmt.Sprintf("%v", macroValue) + result = strings.ReplaceAll(result, match[0], macroStr) + } + return result, nil + + case 
map[string]any: + // Recursively process map values + newMap := make(map[string]any) + for key, val := range v { + newVal, err := substituteMetadataMacros(val, macros) + if err != nil { + return nil, err + } + newMap[key] = newVal + } + return newMap, nil + + case []any: + // Recursively process slice elements + newSlice := make([]any, len(v)) + for i, val := range v { + newVal, err := substituteMetadataMacros(val, macros) + if err != nil { + return nil, err + } + newSlice[i] = newVal + } + return newSlice, nil + + default: + // Return scalar types as-is + return value, nil + } +} diff --git a/proxy/config/config_posix_test.go b/proxy/config/config_posix_test.go index 9be46e1..036c92f 100644 --- a/proxy/config/config_posix_test.go +++ b/proxy/config/config_posix_test.go @@ -163,7 +163,7 @@ groups: expected := Config{ LogLevel: "info", StartPort: 5800, - Macros: map[string]string{ + Macros: MacroList{ "svr-path": "path/to/server", }, Hooks: HooksConfig{ diff --git a/proxy/config/config_test.go b/proxy/config/config_test.go index fdb87ee..ec2aa05 100644 --- a/proxy/config/config_test.go +++ b/proxy/config/config_test.go @@ -517,3 +517,243 @@ models: assert.NoError(t, err) assert.Equal(t, "/path/to/server -p 9000 -hf author/model:F16", strings.Join(sanitizedCmd3, " ")) } + +func TestConfig_TypedMacrosInMetadata(t *testing.T) { + content := ` +startPort: 10000 +macros: + PORT_NUM: 10001 + TEMP: 0.7 + ENABLED: true + NAME: "llama model" + CTX: 16384 + +models: + test-model: + cmd: /path/to/server -p ${PORT} + metadata: + port: ${PORT_NUM} + temperature: ${TEMP} + enabled: ${ENABLED} + model_name: ${NAME} + context: ${CTX} + note: "Running on port ${PORT_NUM} with temp ${TEMP} and context ${CTX}" +` + + config, err := LoadConfigFromReader(strings.NewReader(content)) + assert.NoError(t, err) + + meta := config.Models["test-model"].Metadata + assert.NotNil(t, meta) + + // Verify direct substitution preserves types + assert.Equal(t, 10001, meta["port"]) + assert.Equal(t, 
0.7, meta["temperature"]) + assert.Equal(t, true, meta["enabled"]) + assert.Equal(t, "llama model", meta["model_name"]) + assert.Equal(t, 16384, meta["context"]) + + // Verify string interpolation converts to string + assert.Equal(t, "Running on port 10001 with temp 0.7 and context 16384", meta["note"]) +} + +func TestConfig_NestedStructuresInMetadata(t *testing.T) { + content := ` +startPort: 10000 +macros: + TEMP: 0.7 + +models: + test-model: + cmd: /path/to/server -p ${PORT} + metadata: + config: + port: ${PORT} + temperature: ${TEMP} + tags: ["model:${MODEL_ID}", "port:${PORT}"] + nested: + deep: + value: ${TEMP} +` + + config, err := LoadConfigFromReader(strings.NewReader(content)) + assert.NoError(t, err) + + meta := config.Models["test-model"].Metadata + assert.NotNil(t, meta) + + // Verify nested objects + configMap := meta["config"].(map[string]any) + assert.Equal(t, 10000, configMap["port"]) + assert.Equal(t, 0.7, configMap["temperature"]) + + // Verify arrays + tags := meta["tags"].([]any) + assert.Equal(t, "model:test-model", tags[0]) + assert.Equal(t, "port:10000", tags[1]) + + // Verify deeply nested structures + nested := meta["nested"].(map[string]any) + deep := nested["deep"].(map[string]any) + assert.Equal(t, 0.7, deep["value"]) +} + +func TestConfig_ModelLevelMacroPrecedenceInMetadata(t *testing.T) { + content := ` +startPort: 10000 +macros: + TEMP: 0.5 + GLOBAL_VAL: "global" + +models: + test-model: + cmd: /path/to/server -p ${PORT} + macros: + TEMP: 0.9 + LOCAL_VAL: "local" + metadata: + temperature: ${TEMP} + global: ${GLOBAL_VAL} + local: ${LOCAL_VAL} +` + + config, err := LoadConfigFromReader(strings.NewReader(content)) + assert.NoError(t, err) + + meta := config.Models["test-model"].Metadata + assert.NotNil(t, meta) + + // Model-level macro should override global + assert.Equal(t, 0.9, meta["temperature"]) + // Global macro should be accessible + assert.Equal(t, "global", meta["global"]) + // Model-level macro should be accessible + 
assert.Equal(t, "local", meta["local"]) +} + +func TestConfig_UnknownMacroInMetadata(t *testing.T) { + content := ` +startPort: 10000 +models: + test-model: + cmd: /path/to/server -p ${PORT} + metadata: + value: ${UNKNOWN_MACRO} +` + + _, err := LoadConfigFromReader(strings.NewReader(content)) + assert.Error(t, err) + assert.Contains(t, err.Error(), "test-model") + assert.Contains(t, err.Error(), "UNKNOWN_MACRO") +} + +func TestConfig_InvalidMacroType(t *testing.T) { + content := ` +startPort: 10000 +macros: + INVALID: + nested: value + +models: + test-model: + cmd: /path/to/server -p ${PORT} +` + + _, err := LoadConfigFromReader(strings.NewReader(content)) + assert.Error(t, err) + assert.Contains(t, err.Error(), "INVALID") + assert.Contains(t, err.Error(), "must be a scalar type") +} + +func TestConfig_MacroTypeValidation(t *testing.T) { + tests := []struct { + name string + yaml string + shouldErr bool + }{ + { + name: "string macro", + yaml: ` +startPort: 10000 +macros: + STR: "test" +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: false, + }, + { + name: "int macro", + yaml: ` +startPort: 10000 +macros: + NUM: 42 +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: false, + }, + { + name: "float macro", + yaml: ` +startPort: 10000 +macros: + FLOAT: 3.14 +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: false, + }, + { + name: "bool macro", + yaml: ` +startPort: 10000 +macros: + BOOL: true +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: false, + }, + { + name: "array macro (invalid)", + yaml: ` +startPort: 10000 +macros: + ARR: [1, 2, 3] +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: true, + }, + { + name: "map macro (invalid)", + yaml: ` +startPort: 10000 +macros: + MAP: + key: value +models: + test-model: + cmd: /path/to/server -p ${PORT} +`, + shouldErr: true, + }, + } + + for _, tt := range tests { + t.Run(tt.name, func(t *testing.T) { + 
_, err := LoadConfigFromReader(strings.NewReader(tt.yaml)) + if tt.shouldErr { + assert.Error(t, err) + } else { + assert.NoError(t, err) + } + }) + } +} diff --git a/proxy/config/config_windows_test.go b/proxy/config/config_windows_test.go index e89bfcf..293f41a 100644 --- a/proxy/config/config_windows_test.go +++ b/proxy/config/config_windows_test.go @@ -155,7 +155,7 @@ groups: expected := Config{ LogLevel: "info", StartPort: 5800, - Macros: map[string]string{ + Macros: MacroList{ "svr-path": "path/to/server", }, Models: map[string]ModelConfig{ diff --git a/proxy/config/model_config.go b/proxy/config/model_config.go index 40386f6..49e78e9 100644 --- a/proxy/config/model_config.go +++ b/proxy/config/model_config.go @@ -31,6 +31,10 @@ type ModelConfig struct { // Macros: see #264 // Model level macros take precedence over the global macros Macros MacroList `yaml:"macros"` + + // Metadata: see #264 + // Arbitrary metadata that can be exposed through the API + Metadata map[string]any `yaml:"metadata"` } func (m *ModelConfig) UnmarshalYAML(unmarshal func(interface{}) error) error { diff --git a/proxy/proxymanager.go b/proxy/proxymanager.go index 50b677d..9ecef92 100644 --- a/proxy/proxymanager.go +++ b/proxy/proxymanager.go @@ -370,6 +370,13 @@ func (pm *ProxyManager) listModelsHandler(c *gin.Context) { record["description"] = desc } + // Add metadata if present + if len(modelConfig.Metadata) > 0 { + record["meta"] = gin.H{ + "llamaswap": modelConfig.Metadata, + } + } + data = append(data, record) } diff --git a/proxy/proxymanager_test.go b/proxy/proxymanager_test.go index 0a99f9d..3c8a158 100644 --- a/proxy/proxymanager_test.go +++ b/proxy/proxymanager_test.go @@ -282,6 +282,90 @@ func TestProxyManager_ListModelsHandler(t *testing.T) { assert.Empty(t, expectedModels, "not all expected models were returned") } +func TestProxyManager_ListModelsHandler_WithMetadata(t *testing.T) { + // Process config through LoadConfigFromReader to apply macro substitution + configYaml 
:= ` +healthCheckTimeout: 15 +logLevel: error +startPort: 10000 +models: + model1: + cmd: /path/to/server -p ${PORT} + macros: + PORT_NUM: 10001 + TEMP: 0.7 + NAME: "llama" + metadata: + port: ${PORT_NUM} + temperature: ${TEMP} + enabled: true + note: "Running on port ${PORT_NUM}" + nested: + value: ${TEMP} + model2: + cmd: /path/to/server -p ${PORT} +` + processedConfig, err := config.LoadConfigFromReader(strings.NewReader(configYaml)) + assert.NoError(t, err) + + proxy := New(processedConfig) + + req := httptest.NewRequest("GET", "/v1/models", nil) + w := httptest.NewRecorder() + proxy.ServeHTTP(w, req) + + assert.Equal(t, http.StatusOK, w.Code) + + var response struct { + Data []map[string]any `json:"data"` + } + + err = json.Unmarshal(w.Body.Bytes(), &response) + assert.NoError(t, err) + assert.Len(t, response.Data, 2) + + // Find model1 and model2 in response + var model1Data, model2Data map[string]any + for _, model := range response.Data { + if model["id"] == "model1" { + model1Data = model + } else if model["id"] == "model2" { + model2Data = model + } + } + + // Verify model1 has meta.llamaswap + assert.NotNil(t, model1Data) + meta, exists := model1Data["meta"] + if !assert.True(t, exists, "model1 should have meta key") { + t.FailNow() + } + + metaMap := meta.(map[string]any) + + lsmeta, exists := metaMap["llamaswap"] + if !assert.True(t, exists, "model1 should have meta.llamaswap key") { + t.FailNow() + } + + lsmetamap := lsmeta.(map[string]any) + + // Verify type preservation + assert.Equal(t, float64(10001), lsmetamap["port"]) // JSON numbers are float64 + assert.Equal(t, 0.7, lsmetamap["temperature"]) + assert.Equal(t, true, lsmetamap["enabled"]) + // Verify string interpolation + assert.Equal(t, "Running on port 10001", lsmetamap["note"]) + // Verify nested structure + nested := lsmetamap["nested"].(map[string]any) + assert.Equal(t, 0.7, nested["value"]) + + // Verify model2 does NOT expose llama-swap metadata + assert.NotNil(t, model2Data) + _, exists =
model2Data["meta"] + assert.False(t, exists, "model2 should not have meta key") +} + func TestProxyManager_ListModelsHandler_SortedByID(t *testing.T) { // Intentionally add models in non-sorted order and with an unlisted model config := config.Config{