# Error Handling

## Error Categories

```
╔════════════════════════════════════════════════════════╗
║  GenerationErrorCategory (enum)                        ║
╠════════════════════════════════════════════════════════╣
║  .contentSafety           → Triggered safety filters   ║
║  .modelRefusal            → Model declined request     ║
║  .rateLimiting            → Too many requests          ║
║  .modelUnavailable        → Assets downloading         ║
║  .concurrency             → Concurrent request on      ║
║                             same session               ║
║  .schemaIssue             → Schema/guide problem       ║
║  .contextLimit            → Context window exceeded    ║
║  .unsupportedConfiguration→ Language/locale not        ║
║                             supported                  ║
╚════════════════════════════════════════════════════════╝
```

## Error Extensions (Foundation Models)

InferenceKit extends `Error` with convenience properties for handling Foundation Models `GenerationError`.

### `isTransientGenerationError: Bool`

**What it is:** Tells you whether retrying might succeed.

**Why it matters:** Some errors are temporary (rate limit, model downloading); others are permanent (content violation, bad schema).

```
Transient (retry might work):
  .rateLimited         → Wait and try again
  .assetsUnavailable   → Model is downloading, wait
  .concurrentRequests  → Queue collision, retry

Non-transient (retry won't help):
  .guardrailViolation  → Content blocked by safety
  .refusal             → Model won't do it
  .decodingFailure     → Schema broken
  .exceededContextWindowSize → History too long
  .unsupportedGuide    → Bad configuration
  .unsupportedLanguageOrLocale → Not supported
```

**Usage:**

```swift
catch {
    if error.isTransientGenerationError {
        // Retry makes sense
    } else {
        // Fix the root cause, don't retry
    }
}
```
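
The transient/non-transient split listed above could be expressed as a computed property with an exhaustive `switch`. This is a minimal sketch, not InferenceKit's actual implementation: `StandInGenerationError` and `isTransient` are hypothetical names, and the stand-in enum mirrors only the case names (the real Foundation Models cases carry associated values) so the snippet compiles without the framework.

```swift
// Hypothetical stand-in for Foundation Models' GenerationError.
// The real cases carry associated values; these mirror the names only.
enum StandInGenerationError: Error, CaseIterable {
    case rateLimited, assetsUnavailable, concurrentRequests
    case guardrailViolation, refusal, decodingFailure
    case exceededContextWindowSize, unsupportedGuide, unsupportedLanguageOrLocale

    /// True when waiting and retrying could succeed.
    var isTransient: Bool {
        switch self {
        case .rateLimited, .assetsUnavailable, .concurrentRequests:
            return true
        case .guardrailViolation, .refusal, .decodingFailure,
             .exceededContextWindowSize, .unsupportedGuide,
             .unsupportedLanguageOrLocale:
            return false
        }
    }
}

// Collect the retryable cases to confirm the split matches the list.
let retryable = StandInGenerationError.allCases.filter { $0.isTransient }
print(retryable.count)  // 3 of the 9 cases are retryable
```

Listing every case instead of using `default` means the compiler flags the switch if a new case is added, which forces a deliberate decision about whether it is retryable.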

---

### `requiresUserExplanation: Bool`

**What it is:** Should you show this error to the user?

**Why it matters:** Users need to know about content issues or rate limits. They don't care about concurrent request bugs or schema problems.

```
User-facing (show alert):
  .guardrailViolation  → "Your prompt triggered safety filters"
  .refusal             → "The model declined this request"
  .rateLimited         → "Too many requests, please wait"
  .unsupportedLanguageOrLocale → "Language not supported"
  .assetsUnavailable   → "Model is loading, try again soon"

Technical (log, don't show):
  .concurrentRequests  → Bug in your code, don't blame the user
  .decodingFailure     → Your schema is wrong
  .exceededContextWindowSize → Handle with renewal
  .unsupportedGuide    → Your config is wrong
```

**Usage:**

```swift
catch {
    if error.requiresUserExplanation {
        showAlert(userFriendlyMessage(for: error))
    } else {
        logger.error("Technical error: \(error)")
    }
}
```
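
The usage above calls a `userFriendlyMessage(for:)` helper that this section does not define. One possible shape is sketched below, under stated assumptions: `StandInGenerationError` is a hypothetical stand-in that mirrors the Foundation Models case names without associated values, the message strings are taken from the user-facing list above, and the generic fallback text is an assumption.

```swift
// Hypothetical stand-in for Foundation Models' GenerationError (names only).
enum StandInGenerationError: Error {
    case rateLimited, assetsUnavailable, concurrentRequests
    case guardrailViolation, refusal, decodingFailure
    case exceededContextWindowSize, unsupportedGuide, unsupportedLanguageOrLocale
}

/// Maps an error to alert text. Technical errors and unrelated error types
/// get a generic fallback (an assumed wording, not from the library).
func userFriendlyMessage(for error: Error) -> String {
    let fallback = "An error occurred"
    guard let generationError = error as? StandInGenerationError else {
        return fallback
    }
    switch generationError {
    case .guardrailViolation:
        return "Your prompt triggered safety filters"
    case .refusal:
        return "The model declined this request"
    case .rateLimited:
        return "Too many requests, please wait"
    case .unsupportedLanguageOrLocale:
        return "Language not supported"
    case .assetsUnavailable:
        return "Model is loading, try again soon"
    case .concurrentRequests, .decodingFailure,
         .exceededContextWindowSize, .unsupportedGuide:
        return fallback
    }
}

print(userFriendlyMessage(for: StandInGenerationError.rateLimited))
```

In practice the technical cases should never reach this function, since `requiresUserExplanation` filters them out first; the fallback only guards against misuse.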

---

### `suggestedRetryDelay: TimeInterval?`

**What it is:** How long to wait before retrying (in seconds).

**Why these specific values:**

```
.rateLimited → 2.0 seconds
  Why: Rate limits usually reset quickly. Start with 2s,
       then use exponential backoff (4s, 8s, 16s...).
       Don't hammer the API.

.assetsUnavailable → 5.0 seconds
  Why: Model is downloading/initializing. Needs more time
       than a rate limit. Give it space.

.concurrentRequests → 0.5 seconds
  Why: Just a queue collision. Brief wait lets the other
       request finish. Very fast retry is safe.

Others → nil
  Why: Non-transient errors. Retrying won't help, no
       point in suggesting a delay.
```

**Usage:**

```swift
catch {
    if let delay = error.suggestedRetryDelay {
        try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
        // Retry here
        // For .rateLimited, implement exponential backoff
    }
}
```

**Exponential backoff example:**

```swift
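// Runs inside an async throwing function, after a first attempt failed with `error`.
var delay = error.suggestedRetryDelay ?? 1.0
let maxRetries = 5
for attempt in 1...maxRetries {
    try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))
    do {
        return try await session.generate(prompt: prompt)
    } catch {
        // Stop on permanent errors or once the retries are used up
        guard error.isTransientGenerationError, attempt < maxRetries else {
            throw error
        }
        delay *= 2  // For .rateLimited: 2s → 4s → 8s → 16s → 32s
    }
}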
```
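
The doubling schedule can also be computed as a pure function of the attempt number, which makes it easy to check without sleeping. This is a minimal sketch: `backoffDelay` is a hypothetical helper, and the 60-second cap is an assumption added here, not a value from InferenceKit.

```swift
import Foundation

/// Delay before retry number `attempt` (1-based): base, 2×base, 4×base, ...
/// capped at `maxDelay` (an assumed ceiling, not from the library).
func backoffDelay(base: TimeInterval, attempt: Int, maxDelay: TimeInterval = 60) -> TimeInterval {
    precondition(attempt >= 1, "attempt is 1-based")
    let uncapped = base * pow(2.0, Double(attempt - 1))
    return min(uncapped, maxDelay)
}

// Schedule for a rate-limited request with the 2.0 s starting delay.
let schedule = (1...5).map { backoffDelay(base: 2.0, attempt: $0) }
print(schedule)  // [2.0, 4.0, 8.0, 16.0, 32.0]
```

A cap keeps a long run of failures from producing multi-minute waits; the right ceiling depends on how long a user can reasonably be kept waiting.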

---

### `generationErrorCategory: GenerationErrorCategory?`

**What it is:** Maps a Foundation Models `GenerationError` to a simplified category enum.

**Why it matters:** Switching on categories is easier than checking specific error cases.

```
Foundation Models Error → Category

.guardrailViolation           → .contentSafety
.refusal                      → .modelRefusal
.rateLimited                  → .rateLimiting
.assetsUnavailable            → .modelUnavailable
.concurrentRequests           → .concurrency
.decodingFailure              → .schemaIssue
.exceededContextWindowSize    → .contextLimit
.unsupportedGuide             → .schemaIssue
.unsupportedLanguageOrLocale  → .unsupportedConfiguration
```

**Usage:**

```swift
catch {
    switch error.generationErrorCategory {
    case .contentSafety:
        showAlert("Content triggered safety filters")
    case .rateLimiting:
        break  // Wait and retry
    case .contextLimit:
        break  // Enable session renewal or create a new session
    case .schemaIssue:
        break  // Fix your schema
    default:
        showAlert("An error occurred")
    }
}
```
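
The mapping table above, and the reason the property is optional, can be illustrated with a small self-contained sketch. It is not InferenceKit's implementation: `StandInGenerationError` is a hypothetical stand-in mirroring the Foundation Models case names without associated values, `GenerationErrorCategory` is redeclared locally from the list in this document, and `category(of:)` is a hypothetical free function standing in for the `Error` extension.

```swift
// Hypothetical stand-in for Foundation Models' GenerationError (names only).
enum StandInGenerationError: Error {
    case rateLimited, assetsUnavailable, concurrentRequests
    case guardrailViolation, refusal, decodingFailure
    case exceededContextWindowSize, unsupportedGuide, unsupportedLanguageOrLocale
}

// Local copy of the categories listed under "Error Categories".
enum GenerationErrorCategory {
    case contentSafety, modelRefusal, rateLimiting, modelUnavailable
    case concurrency, schemaIssue, contextLimit, unsupportedConfiguration
}

/// Returns the category for a generation error, or nil for any other error type.
func category(of error: Error) -> GenerationErrorCategory? {
    guard let generationError = error as? StandInGenerationError else {
        return nil
    }
    switch generationError {
    case .guardrailViolation: return .contentSafety
    case .refusal: return .modelRefusal
    case .rateLimited: return .rateLimiting
    case .assetsUnavailable: return .modelUnavailable
    case .concurrentRequests: return .concurrency
    case .decodingFailure, .unsupportedGuide: return .schemaIssue
    case .exceededContextWindowSize: return .contextLimit
    case .unsupportedLanguageOrLocale: return .unsupportedConfiguration
    }
}

// An error unrelated to generation has no category.
struct UnrelatedError: Error {}
print(category(of: UnrelatedError()) == nil)  // true
```

Two source cases (`.decodingFailure` and `.unsupportedGuide`) collapse into `.schemaIssue`, so a caller that needs to tell them apart still has to inspect the original error.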

## Complete Example

```swift
func generateWithRetry(prompt: String, maxAttempts: Int = 3) async throws -> String {
    // Grows only after rate-limit failures; other transient errors keep their base delay
    var backoffMultiplier: TimeInterval = 1.0

    for attempt in 1...maxAttempts {
        do {
            return try await session.generate(prompt: prompt)
        } catch {
            // Non-transient: tell the user if appropriate, then fail
            guard error.isTransientGenerationError else {
                if error.requiresUserExplanation {
                    showAlert(userFriendlyMessage(for: error))
                }
                throw error
            }

            // Out of attempts
            if attempt == maxAttempts {
                throw error
            }

            // Start from the suggested delay (1s fallback), scaled by the backoff so far.
            // Computing it here keeps the backoff from being reset on every failure.
            let delay = (error.suggestedRetryDelay ?? 1.0) * backoffMultiplier
            try await Task.sleep(nanoseconds: UInt64(delay * 1_000_000_000))

            // Exponential backoff for rate limiting
            if error.generationErrorCategory == .rateLimiting {
                backoffMultiplier *= 2
            }
        }
    }

    fatalError("Unreachable")
}
```

