Overview

Every deep investigation automatically produces two outputs:
  1. Raw markdown — human-readable report saved to ~/.siclaw/reports/
  2. Structured record — root cause category, causal chain, affected entities → SQLite
This dual-output system means Siclaw accumulates diagnostic experience across sessions.

How It Works

Writing Memory

After Phase 4 (Conclusion), the investigation engine extracts structured data:
{
  "root_cause_category": "resource_exhaustion",
  "affected_entities": ["pod/payment-service", "node/worker-03"],
  "environment_tags": ["cluster-prod", "namespace-payments"],
  "causal_chain": [
    "v2.3 added caching layer",
    "Memory usage increased to 310Mi",
    "Exceeded 256Mi limit",
    "OOMKilled → CrashLoopBackOff"
  ],
  "confidence": 92
}
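
The record above maps naturally onto a single SQLite row. The following is a minimal sketch using Python's standard `sqlite3` module, not Siclaw's actual schema: the `investigations` table, its column names, and the `save_record` helper are all assumptions made for illustration, and list-valued fields are simply stored as JSON text.

```python
import json
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS investigations (
    id                  INTEGER PRIMARY KEY,
    root_cause_category TEXT    NOT NULL,
    affected_entities   TEXT    NOT NULL,  -- JSON array
    environment_tags    TEXT    NOT NULL,  -- JSON array
    causal_chain        TEXT    NOT NULL,  -- JSON array, ordered cause to effect
    confidence          INTEGER NOT NULL
)
"""


def save_record(conn: sqlite3.Connection, record: dict) -> int:
    """Insert one structured record and return its row id."""
    conn.execute(SCHEMA)
    cursor = conn.execute(
        "INSERT INTO investigations (root_cause_category, affected_entities,"
        " environment_tags, causal_chain, confidence) VALUES (?, ?, ?, ?, ?)",
        (
            record["root_cause_category"],
            json.dumps(record["affected_entities"]),
            json.dumps(record["environment_tags"]),
            json.dumps(record["causal_chain"]),
            record["confidence"],
        ),
    )
    conn.commit()
    return cursor.lastrowid


record = {
    "root_cause_category": "resource_exhaustion",
    "affected_entities": ["pod/payment-service", "node/worker-03"],
    "environment_tags": ["cluster-prod", "namespace-payments"],
    "causal_chain": [
        "v2.3 added caching layer",
        "Memory usage increased to 310Mi",
        "Exceeded 256Mi limit",
        "OOMKilled → CrashLoopBackOff",
    ],
    "confidence": 92,
}

conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
row_id = save_record(conn, record)
```

Keeping `root_cause_category` as its own column is what would make exact-match lookups on that field cheap; the JSON-encoded lists trade query convenience for a simpler schema.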

Reading Memory

When a new investigation starts, Phase 2 (Hypothesis Generation) retrieves relevant past investigations using hybrid search:
  1. Structured query — matches on root cause category, affected entities, environment tags
  2. Semantic search — vector similarity over investigation chunks (cosine similarity with BM25 boost)
finalScore = 0.70 × cosineSimilarity + 0.30 × bm25Score
Only results whose finalScore exceeds the 0.35 threshold are injected into the hypothesis generation prompt.
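
The weighting and threshold above can be sketched in a few lines of Python. The weights (0.70, 0.30) and the 0.35 cutoff come from the text; the `Candidate` type, the function names, and the assumption that the BM25 score has already been normalized to the 0..1 range are illustrative and not confirmed by the source.

```python
from dataclasses import dataclass

COSINE_WEIGHT = 0.70
BM25_WEIGHT = 0.30
SCORE_THRESHOLD = 0.35


@dataclass
class Candidate:
    chunk_id: str
    cosine_similarity: float
    bm25_score: float  # assumed to be normalized to [0, 1] before blending


def final_score(candidate: Candidate) -> float:
    """Blend vector similarity with the keyword (BM25) score."""
    return (COSINE_WEIGHT * candidate.cosine_similarity
            + BM25_WEIGHT * candidate.bm25_score)


def select_for_prompt(candidates: list) -> list:
    """Keep candidates scoring above the threshold, best first."""
    scored = [(final_score(c), c) for c in candidates]
    kept = [(score, c) for score, c in scored if score > SCORE_THRESHOLD]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return [c for _, c in kept]


candidates = [
    Candidate("a", cosine_similarity=0.9, bm25_score=0.4),  # 0.63 + 0.12 = 0.75
    Candidate("b", cosine_similarity=0.2, bm25_score=0.1),  # 0.14 + 0.03 = 0.17
    Candidate("c", cosine_similarity=0.4, bm25_score=0.6),  # 0.28 + 0.18 = 0.46
]
selected = select_for_prompt(candidates)  # "b" falls below the threshold
```

With these inputs, candidates `a` and `c` pass the cutoff and `b` is dropped, which shows how a strong keyword match can lift a chunk with only moderate vector similarity over the threshold.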

Root Cause Categories

Structured records use a fixed category taxonomy:
Category             Examples
resource_exhaustion  OOMKilled, CPU throttling, disk full
config_error         Missing configmap, wrong env var, bad YAML
network_partition    DNS failure, CNI issues, firewall rules
hardware_failure     Disk failure, GPU error, NIC flap
pcie_error           PCIe bus error, GPU fallen off bus
driver_issue         NVIDIA driver mismatch, kernel module missing
software_bug         Application crash, deadlock, race condition
permission_denied    RBAC, PodSecurityPolicy, filesystem permissions
scheduling_failure   Insufficient resources, affinity conflicts
unknown              No clear root cause identified
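
A fixed taxonomy like this is commonly modeled as an enumeration. The sketch below lists the ten categories from the table; the `RootCauseCategory` and `parse_category` names are hypothetical, and mapping unrecognized labels to `unknown` is an assumption about reasonable behavior, not something the source states Siclaw does.

```python
from enum import Enum


class RootCauseCategory(str, Enum):
    RESOURCE_EXHAUSTION = "resource_exhaustion"
    CONFIG_ERROR = "config_error"
    NETWORK_PARTITION = "network_partition"
    HARDWARE_FAILURE = "hardware_failure"
    PCIE_ERROR = "pcie_error"
    DRIVER_ISSUE = "driver_issue"
    SOFTWARE_BUG = "software_bug"
    PERMISSION_DENIED = "permission_denied"
    SCHEDULING_FAILURE = "scheduling_failure"
    UNKNOWN = "unknown"


def parse_category(value: str) -> RootCauseCategory:
    """Map a raw label onto the taxonomy.

    Falling back to UNKNOWN for unrecognized labels is an assumption
    made for this sketch.
    """
    try:
        return RootCauseCategory(value)
    except ValueError:
        return RootCauseCategory.UNKNOWN


category = parse_category("pcie_error")
fallback = parse_category("cosmic_rays")  # not in the taxonomy
```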

Storage

Component           Engine                Location
Structured records  node:sqlite (native)  ~/.siclaw/memory/.memory.db
Embeddings          node:sqlite + FTS5    Same database
Raw reports         Markdown files        ~/.siclaw/reports/

Memory search requires an embedding provider to be configured. Without it, structured field matching still works but semantic similarity search is disabled.
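
The fallback described above can be illustrated with a small Python sketch. Everything here is hypothetical: the `search_memory` function, the `embed` callable standing in for an embedding provider, the `text` key on the query, and the rule that a record matches when it shares the category or at least one entity or tag. The only behavior taken from the source is that structured matching keeps working when no provider is configured, while the semantic pass is skipped.

```python
import math
from typing import Callable, Optional


def cosine(a: list, b: list) -> float:
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def structured_match(query: dict, record: dict) -> bool:
    """Assumed rule: same category, or any shared entity or environment tag."""
    return (
        record["root_cause_category"] == query.get("root_cause_category")
        or bool(set(record["affected_entities"]) & set(query.get("affected_entities", [])))
        or bool(set(record["environment_tags"]) & set(query.get("environment_tags", [])))
    )


def search_memory(query: dict, records: list,
                  embed: Optional[Callable[[str], list]] = None) -> dict:
    """Run structured matching always; run the semantic pass only with a provider."""
    structured = [r for r in records if structured_match(query, r)]
    semantic = []
    if embed is not None:
        query_vec = embed(query["text"])
        semantic = sorted(
            ((cosine(query_vec, embed(" ".join(r["causal_chain"]))), r) for r in records),
            key=lambda pair: pair[0],
            reverse=True,
        )
    return {"structured": structured, "semantic": semantic}


records = [
    {"root_cause_category": "resource_exhaustion",
     "affected_entities": ["pod/payment-service"],
     "environment_tags": ["cluster-prod"],
     "causal_chain": ["Exceeded 256Mi limit", "OOMKilled"]},
    {"root_cause_category": "network_partition",
     "affected_entities": ["pod/dns"],
     "environment_tags": ["cluster-dev"],
     "causal_chain": ["DNS failure"]},
]
query = {"text": "pod OOMKilled", "root_cause_category": "resource_exhaustion"}

without_provider = search_memory(query, records)  # semantic list stays empty
```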

Chunking

Memory files are split on heading boundaries for optimal retrieval:
  • Max chunk size: ~400 tokens (~1600 bytes)
  • Overlap: ~80 tokens between adjacent chunks
  • Each chunk tracks: file path, heading breadcrumb, start/end line
  • CJK queries use OR for bigrams; Latin queries use AND
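
The chunking rules and the query rule above can be sketched as follows. This is an approximation under stated assumptions, not Siclaw's implementation: sizes are measured in UTF-8 bytes using the rough 4-bytes-per-token ratio implied by the list, headings are detected as Markdown `#` lines (including ones inside code fences, which a real chunker would likely skip), overlap is carried only within a heading section, and the output of `build_fts_query` follows SQLite FTS5's boolean query syntax. All names (`Chunk`, `chunk_markdown`, `build_fts_query`) are hypothetical.

```python
import re
from dataclasses import dataclass

MAX_CHUNK_BYTES = 1600  # ~400 tokens, assuming roughly 4 bytes per token
OVERLAP_BYTES = 320     # ~80 tokens under the same assumption
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")
CJK = re.compile(r"[\u3040-\u30ff\u3400-\u4dbf\u4e00-\u9fff\uac00-\ud7af]")


@dataclass
class Chunk:
    path: str
    breadcrumb: str  # enclosing headings, e.g. "Report > Root Cause"
    start_line: int  # 1-based, inclusive
    end_line: int
    text: str


def chunk_markdown(path: str, text: str) -> list:
    # trail holds (level, title) of enclosing headings; buf holds (line no, line)
    chunks, trail, buf, size = [], [], [], 0

    def flush(keep_overlap: bool) -> None:
        nonlocal buf, size
        if buf:
            crumb = " > ".join(title for _, title in trail)
            body = "\n".join(line for _, line in buf)
            chunks.append(Chunk(path, crumb, buf[0][0], buf[-1][0], body))
        tail, tail_size = [], 0
        if keep_overlap:  # carry trailing lines into the next chunk
            for number, line in reversed(buf):
                cost = len(line.encode("utf-8")) + 1
                if tail_size + cost > OVERLAP_BYTES:
                    break
                tail.insert(0, (number, line))
                tail_size += cost
        buf, size = tail, tail_size

    for number, line in enumerate(text.splitlines(), start=1):
        heading = HEADING.match(line)
        if heading:
            flush(keep_overlap=False)  # a heading always starts a new chunk
            level = len(heading.group(1))
            trail[:] = [h for h in trail if h[0] < level] + [(level, heading.group(2).strip())]
        cost = len(line.encode("utf-8")) + 1
        if buf and size + cost > MAX_CHUNK_BYTES:
            flush(keep_overlap=True)
        buf.append((number, line))
        size += cost
    flush(keep_overlap=False)
    return chunks


def build_fts_query(query: str) -> str:
    """CJK text: overlapping bigrams joined with OR. Otherwise: words joined with AND."""
    if CJK.search(query):
        chars = [ch for ch in query if CJK.match(ch)]
        grams = ["".join(pair) for pair in zip(chars, chars[1:])] or chars
        return " OR ".join(f'"{gram}"' for gram in grams)
    return " AND ".join(f'"{term}"' for term in re.findall(r"\w+", query))
```

The OR/AND split reflects a common trade-off: CJK text has no whitespace word boundaries, so requiring every bigram to match would be too strict, while Latin queries already consist of whole words and benefit from the precision of AND.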