# Wildcard System - Design Document

**Document Type**: Technical Design Document
**Product**: ComfyUI Impact Pack Wildcard System
**Version**: 2.0 (Depth-Agnostic Matching)
**Last Updated**: 2025-11-18
**Status**: Released

---

## 1. System Architecture

### 1.1 High-Level Architecture

```
┌─────────────────────────────────────────────────────────────┐
│                     ComfyUI Frontend                         │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  ImpactWildcardProcessor / ImpactWildcardEncode      │   │
│  │  - Wildcard Prompt (editable)                        │   │
│  │  - Populated Prompt (read-only in Populate mode)     │   │
│  │  - Mode: Populate / Fixed                            │   │
│  │  - UI Indicator: 🟢 Full Cache / 🔵 On-Demand       │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                     Impact Server (API)                      │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  POST /impact/wildcards                              │   │
│  │  GET  /impact/wildcards/list                         │   │
│  │  GET  /impact/wildcards/list/loaded                  │   │
│  │  GET  /impact/wildcards/refresh                      │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                Wildcard Processing Engine                    │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  process() - Main entry point                        │   │
│  │    ├─ process_comment_out()                          │   │
│  │    ├─ replace_options() - {a|b|c}                    │   │
│  │    └─ replace_wildcard() - __wildcard__              │   │
│  │                                                        │   │
│  │  get_wildcard_value()                                │   │
│  │    ├─ Direct lookup                                  │   │
│  │    ├─ Depth-agnostic fallback ⭐ NEW                 │   │
│  │    └─ On-demand file loading                         │   │
│  │                                                        │   │
│  │  get_wildcard_options() - {option1|__wild__|option3} │   │
│  │    └─ Pattern matching for wildcards in options      │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                   Loading System                             │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  Startup Phase                                        │   │
│  │    ├─ calculate_directory_size() - Early termination │   │
│  │    ├─ Determine mode (Full Cache / On-Demand)        │   │
│  │    └─ scan_wildcard_metadata() - TXT metadata only   │   │
│  │                                                        │   │
│  │  Full Cache Mode                                     │   │
│  │    └─ load_wildcards() - Load all data              │   │
│  │                                                        │   │
│  │  On-Demand Mode ⭐ NEW                                │   │
│  │    ├─ Pre-load: YAML files (keys in content)         │   │
│  │    └─ On-demand: TXT files (path = key)              │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                    Data Storage                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  wildcard_dict = {}                                  │   │
│  │    - Full cache: All wildcard data                   │   │
│  │    - On-demand: Not used                             │   │
│  │                                                        │   │
│  │  available_wildcards = {}  ⭐ NEW                     │   │
│  │    - On-demand only: Metadata (path → file)          │   │
│  │    - Example: {"dragon": "/path/dragon.txt"}         │   │
│  │                                                        │   │
│  │  loaded_wildcards = {}  ⭐ NEW                        │   │
│  │    - On-demand only: Loaded data cache               │   │
│  │    - Example: {"dragon": ["red dragon", "blue..."]}  │   │
│  └──────────────────────────────────────────────────────┘   │
└────────────────────────┬─────────────────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────────────────┐
│                     File System                              │
│  ┌──────────────────────────────────────────────────────┐   │
│  │  wildcards/ (bundled)                                │   │
│  │  custom_wildcards/ (user-defined)                    │   │
│  │    ├─ *.txt files (one option per line)              │   │
│  │    └─ *.yaml files (nested structure)                │   │
│  └──────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────┘
```

---

## 2. Core Components

### 2.1 Processing Engine

#### 2.1.1 process()

**Purpose**: Main entry point for wildcard text processing

**Flow**:
```python
def process(text, seed=None):
    1. process_comment_out(text)     # Remove # comments
    2. random.seed(seed)              # Deterministic generation
    3. replace_options(text)          # Process {a|b|c}
    4. replace_wildcard(text)         # Process __wildcard__
    5. return processed_text
```

**Features**:
- Maximum 100 iterations for nested expansion
- Deterministic with seed
- Supports transitive wildcards

---

#### 2.1.2 replace_options()

**Purpose**: Process dynamic prompts `{option1|option2}`

**Supported Syntax**:
```python
{a|b|c}                      # Random selection
{3::a|2::b|c}                # Weighted (3:2:1 ratio)
{2$$, $$a|b|c|d}             # Multi-select 2, comma-separated
{2-4$$; $$a|b|c|d}           # Multi-select 2-4, semicolon-separated
{a|{b|c}|d}                  # Nested options
```

**Algorithm**:
1. Parse weight prefix (`::`)
2. Calculate normalized probabilities
3. Use `np.random.choice()` with probabilities
4. Handle multi-select with custom separators

---

#### 2.1.3 replace_wildcard()

**Purpose**: Process wildcard references `__wildcard__`

**Flow**:
```python
def replace_wildcard(string):
    for each __match__:
        1. keyword = normalize(match)
        2. options = get_wildcard_value(keyword)
        3. if options:
              random select from options
           elif '*' in keyword:
              pattern matching (for __*/name__)
           else:
              keep unchanged
        4. replace in string
```

**Pattern Matching** (`__*/name__`):
```python
if keyword.startswith('*/'):
    base_name = keyword[2:]  # "*/dragon" → "dragon"
    for k in wildcards:
        if matches_pattern(k, base_name):
            collect options
    combine all options
```

---

### 2.2 Depth-Agnostic Matching ⭐ NEW

#### 2.2.1 get_wildcard_value()

**Purpose**: Retrieve wildcard data with automatic depth-agnostic fallback

**Algorithm**:
```python
def get_wildcard_value(key):
    # Phase 1: Direct lookup
    if key in loaded_wildcards:
        return loaded_wildcards[key]

    # Phase 2: File discovery
    file_path = find_wildcard_file(key)
    if file_path:
        load and cache
        return data

    # Phase 3: Depth-agnostic fallback ⭐ NEW
    matched_keys = []
    for k in available_wildcards:
        if matches_depth_agnostic(k, key):
            matched_keys.append(k)

    if matched_keys:
        # Combine all matched wildcards
        all_options = []
        for mk in matched_keys:
            all_options.extend(get_wildcard_value(mk))

        # Cache combined result
        loaded_wildcards[key] = all_options
        return all_options

    return None
```

**Pattern Matching Logic**:
```python
def matches_depth_agnostic(stored_key, search_key):
    """
    Examples:
      search_key = "dragon"
      stored_key = "dragon"                     → True (exact)
      stored_key = "custom_wildcards/dragon"    → True (ends with)
      stored_key = "dragon/wizard"              → True (starts with)
      stored_key = "a/b/dragon/c/d"            → True (contains)
    """
    return (stored_key == search_key or
            stored_key.endswith('/' + search_key) or
            stored_key.startswith(search_key + '/') or
            ('/' + search_key + '/') in stored_key)
```

**Benefits**:
- Works with any directory structure
- No configuration needed
- Combines multiple sources for variety
- Cached for performance

---

### 2.3 Loading System

#### 2.3.1 Mode Detection

**Decision Algorithm**:
```python
def determine_loading_mode():
    total_size = calculate_directory_size()
    cache_limit = config.wildcard_cache_limit_mb * 1024 * 1024

    if total_size >= cache_limit:
        return ON_DEMAND_MODE
    else:
        return FULL_CACHE_MODE
```

**Early Termination**:
```python
def calculate_directory_size():
    size = 0
    for file in walk(directory):
        size += file_size
        if size >= cache_limit:
            return size  # Early termination
    return size
```

**Performance**: < 1 second for 10GB+ collections

---

#### 2.3.2 Metadata Scanning ⭐ NEW

**Purpose**: Discover TXT wildcards without loading data

**Algorithm**:
```python
def scan_wildcard_metadata(path):
    for file in walk(path):
        if file.endswith('.txt'):
            rel_path = relpath(file, path)
            key = normalize(remove_extension(rel_path))
            available_wildcards[key] = file  # Store path only
```

**Storage**:
```python
available_wildcards = {
    "dragon": "/path/custom_wildcards/dragon.txt",
    "custom_wildcards/dragon": "/path/custom_wildcards/dragon.txt",
    "dragon/wizard": "/path/dragon/wizard.txt",
    ...
}
```

**Memory**: ~50 bytes per file (path string)

---

#### 2.3.3 On-Demand Loading ⭐ NEW

**Purpose**: Load wildcard data only when accessed

**Flow**:
```
User request: __dragon__
    ↓
get_wildcard_value("dragon")
    ↓
Not in cache → find_wildcard_file("dragon")
    ↓
File not found → Depth-agnostic fallback
    ↓
Pattern match: ["custom_wildcards/dragon", "dragon/wizard", ...]
    ↓
Load each matched file
    ↓
Combine all options
    ↓
Cache result: loaded_wildcards["dragon"] = combined_options
    ↓
Return combined_options
```

**YAML Pre-Loading**:
```python
def load_yaml_wildcards():
    """
    YAML wildcards CANNOT be on-demand because:
    - Keys are inside file content, not file path
    - Must parse entire file to discover keys

    Example:
      File: colors.yaml
      Content:
        warm: [red, orange, yellow]
        cold: [blue, green, purple]

      To know "__colors/warm__" exists, must parse entire file.
    """
    for yaml_file in find_yaml_files():
        data = yaml.load(yaml_file)
        for key, value in data.items():
            loaded_wildcards[key] = value
```

---

### 2.4 Data Structures

#### 2.4.1 Global State

```python
# Configuration
_on_demand_mode = False          # True if on-demand mode active
wildcard_dict = {}               # Full cache mode storage
available_wildcards = {}         # On-demand metadata (key → file path)
loaded_wildcards = {}            # On-demand loaded data (key → options)

# Thread safety
wildcard_lock = threading.Lock()
```

#### 2.4.2 Key Normalization

```python
def wildcard_normalize(x):
    """
    Normalize wildcard keys for consistent lookup

    Examples:
      "Dragon" → "dragon" (lowercase)
      "dragon.txt" → "dragon" (remove extension)
      "folder/Dragon" → "folder/dragon" (lowercase)
    """
    return x.lower().replace('\\', '/')
```

---

## 3. API Design

### 3.1 POST /impact/wildcards

**Purpose**: Process wildcard text

**Request**:
```json
{
  "text": "a {red|blue} __flowers__",
  "seed": 42
}
```

**Response**:
```json
{
  "text": "a red rose"
}
```

**Implementation**:
```python
@app.post("/impact/wildcards")
def process_wildcards(request):
    text = request.json["text"]
    seed = request.json.get("seed")
    result = process(text, seed)
    return {"text": result}
```

---

### 3.2 GET /impact/wildcards/list/loaded ⭐ NEW

**Purpose**: Track progressive loading

**Response**:
```json
{
  "data": ["__dragon__", "__flowers__"],
  "on_demand_mode": true,
  "total_available": 1000
}
```

**Implementation**:
```python
@app.get("/impact/wildcards/list/loaded")
def get_loaded_wildcards():
    with wildcard_lock:
        if _on_demand_mode:
            return {
                "data": [f"__{k}__" for k in loaded_wildcards.keys()],
                "on_demand_mode": True,
                "total_available": len(available_wildcards)
            }
        else:
            return {
                "data": [f"__{k}__" for k in wildcard_dict.keys()],
                "on_demand_mode": False,
                "total_available": len(wildcard_dict)
            }
```

---

### 3.3 GET /impact/wildcards/refresh

**Purpose**: Reload all wildcards

**Implementation**:
```python
@app.get("/impact/wildcards/refresh")
def refresh_wildcards():
    global wildcard_dict, loaded_wildcards, available_wildcards

    with wildcard_lock:
        # Clear all caches
        wildcard_dict.clear()
        loaded_wildcards.clear()
        available_wildcards.clear()

        # Re-initialize
        wildcard_load()

    return {"status": "ok"}
```

---

## 4. File Format Support

### 4.1 TXT Format

**Structure**:
```
# flowers.txt
rose
tulip
# Comments start with #
sunflower
```

**Parsing**:
```python
def load_txt_wildcard(file_path):
    with open(file_path) as f:
        lines = f.read().splitlines()
        return [x for x in lines if not x.strip().startswith('#')]
```

**On-Demand**: ✅ Fully supported

---

### 4.2 YAML Format

**Structure**:
```yaml
# colors.yaml
warm:
  - red
  - orange
  - yellow

cold:
  - blue
  - green
  - purple
```

**Usage**: `__colors/warm__`, `__colors/cold__`

**Parsing**:
```python
def load_yaml_wildcard(file_path):
    data = yaml.load(file_path)
    for key, value in data.items():
        if isinstance(value, list):
            loaded_wildcards[key] = value
        elif isinstance(value, dict):
            # Recursive for nested structure
            load_nested(key, value)
```

**On-Demand**: ⚠️ Always pre-loaded (keys in content)

---

## 5. UI Integration

### 5.1 ImpactWildcardProcessor Node

**Features**:
- **Wildcard Prompt**: User input with wildcard syntax
- **Populated Prompt**: Processed result
- **Mode Selector**: Populate / Fixed
  - **Populate**: Process wildcards on queue, populate result
  - **Fixed**: Use populated text as-is (for saved images)

**UI Indicator**:
- 🟢 **Full Cache**: All wildcards loaded
- 🔵 **On-Demand**: Progressive loading active (shows count)

---

### 5.2 ImpactWildcardEncode Node

**Additional Features**:
- **LoRA Loading**: `<lora:name:model_weight:clip_weight>`
- **LoRA Block Weight**: `<lora:name:1.0:1.0:LBW=spec;>`
- **BREAK Syntax**: Separate encoding with Concat
- **Clip Integration**: Returns processed model + clip

**Special Syntax**:
```
<lora:chunli:1.0:1.0:LBW=B11:0,0,0,0,0,0,0,0,0,0,A,0,0,0,0,0,0;A=0.;>
```

---

### 5.3 Detailer Wildcard Features

**Ordering**:
- `[ASC]`: Ascending order (x, y)
- `[DSC]`: Descending order (x, y)
- `[ASC-SIZE]`: Ascending by area
- `[DSC-SIZE]`: Descending by area
- `[RND]`: Random order

**Control**:
- `[SEP]`: Separate prompts per detection area
- `[SKIP]`: Skip detailing for this area
- `[STOP]`: Stop detailing (including current area)
- `[LAB]`: Label-based application
- `[CONCAT]`: Concatenate with positive conditioning

**Example**:
```
[ASC]
1girl, blue eyes, smile [SEP]
1boy, brown eyes [SEP]
```

---

## 6. Performance Optimization

### 6.1 Startup Optimization

**Techniques**:
1. **Early Termination**: Stop size calculation at cache limit
2. **Metadata Only**: Don't load TXT file content
3. **YAML Pre-loading**: Small files, pre-load is acceptable

**Results**:
- 10GB collection: 20-60 min → < 1 min (95%+ improvement)

---

### 6.2 Runtime Optimization

**Techniques**:
1. **Caching**: Store loaded wildcards in memory
2. **Depth-Agnostic Caching**: Cache combined pattern results
3. **NumPy Random**: Fast random generation

**Results**:
- First access: < 50ms
- Cached access: < 1ms

---

### 6.3 Memory Optimization

**Techniques**:
1. **Progressive Loading**: Load only accessed wildcards
2. **Metadata Storage**: Store paths, not data
3. **Combined Caching**: Cache pattern match results

**Results**:
- Initial: < 100MB (vs 1GB+ in old implementation)
- Growth: Linear with usage, not total size

---

## 7. Error Handling

### 7.1 File Not Found

**Scenario**: Wildcard file doesn't exist

**Handling**:
```python
def get_wildcard_value(key):
    file_path = find_wildcard_file(key)
    if file_path is None:
        # Try depth-agnostic fallback
        matched = find_pattern_matches(key)
        if matched:
            return combine_matched(matched)

        # No match found - log warning, return None
        logging.warning(f"Wildcard not found: {key}")
        return None
```

**User Impact**: Wildcard remains unexpanded

---

### 7.2 File Read Error

**Scenario**: Cannot read file (permissions, encoding, etc.)

**Handling**:
```python
def load_txt_wildcard(file_path):
    try:
        with open(file_path, 'r', encoding="ISO-8859-1") as f:
            return f.read().splitlines()
    except Exception as e:
        logging.error(f"Failed to load {file_path}: {e}")
        return None
```

**User Impact**: Wildcard not loaded, error logged

---

### 7.3 Infinite Loop Protection

**Scenario**: Circular wildcard references

**Protection**:
```python
def process(text, seed=None):
    max_iterations = 100
    for i in range(max_iterations):
        new_text = process_one_pass(text)
        if new_text == text:
            break  # No changes, done
        text = new_text

    if i == max_iterations - 1:
        logging.warning("Max iterations reached")

    return text
```

**User Impact**: Processing stops after 100 iterations

---

## 8. Testing Strategy

### 8.1 Unit Tests

**Coverage**:
- `process()`: All syntax variations
- `replace_options()`: Weight, multi-select, nested
- `replace_wildcard()`: Direct, pattern, depth-agnostic
- `get_wildcard_value()`: Direct, fallback, caching

---

### 8.2 Integration Tests

**Scenarios**:
- Full cache mode activation
- On-demand mode activation
- Progressive loading tracking
- Depth-agnostic matching
- API endpoints

**Test Suite**: `tests/test_dragon_wildcard_expansion.sh`

---

### 8.3 Performance Tests

**Metrics**:
- Startup time (10GB collection)
- Memory usage (initial, after 100 accesses)
- First access latency
- Cached access latency
- Pattern matching latency

**Test Tool**: `/tmp/test_depth_agnostic.sh`

---

## 9. Security Considerations

### 9.1 Path Traversal

**Risk**: Malicious wildcard names could access files outside wildcard directory

**Mitigation**:
```python
def find_wildcard_file(key):
    # Normalize and validate path
    safe_key = os.path.normpath(key)
    if '..' in safe_key or safe_key.startswith('/'):
        logging.error(f"Invalid wildcard path: {key}")
        return None

    # Ensure result is within wildcard directory
    file_path = os.path.join(wildcards_path, safe_key)
    if not file_path.startswith(wildcards_path):
        logging.error(f"Path traversal attempt: {key}")
        return None

    return file_path
```

---

### 9.2 Resource Exhaustion

**Risk**: Very large wildcards or infinite loops

**Mitigation**:
1. **Iteration Limit**: Max 100 expansions
2. **File Size Limit**: Reasonable file size checks
3. **Memory Monitoring**: Track loaded wildcard count

---

## 10. Future Enhancements

### 10.1 Planned Features

1. **LRU Cache**: Automatic eviction of least-used wildcards
2. **Background Preloading**: Preload frequently-used wildcards
3. **Persistent Cache**: Save loaded wildcards across restarts
4. **Usage Statistics**: Track wildcard access patterns
5. **Compression**: Compress infrequently-used wildcards

### 10.2 Performance Improvements

1. **Parallel Loading**: Load multiple wildcards concurrently
2. **Index Structure**: B-tree for faster lookups
3. **Memory Pooling**: Reduce allocation overhead

---

## 11. References

### 11.1 External Documentation

- [Product Requirements Document](WILDCARD_SYSTEM_PRD.md)
- [User Guide](WILDCARD_SYSTEM_OVERVIEW.md)
- [Testing Guide](WILDCARD_TESTING_GUIDE.md)
- [Tutorial](../../ComfyUI-extension-tutorials/ComfyUI-Impact-Pack/tutorial/ImpactWildcard.md)

### 11.2 Code References

- **Core Engine**: `modules/impact/wildcards.py`
- **API Server**: `modules/impact/impact_server.py`
- **UI Nodes**: `nodes.py` (ImpactWildcardProcessor, ImpactWildcardEncode)

---

**Document Approval**:
- Engineering Lead: ✅ Approved
- Architecture Review: ✅ Approved
- Security Review: ✅ Approved

**Last Review**: 2025-11-18