Overview
Every deep investigation automatically produces two outputs:
- Raw markdown: human-readable report saved to ~/.siclaw/reports/
- Structured record: root cause category, causal chain, affected entities, stored in SQLite
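As a rough illustration of the structured record, the sketch below stores one record in SQLite. It is not Siclaw's actual schema: the `InvestigationRecord` class, the `investigations` table, its column names, and the report file name are all hypothetical, and list fields are assumed to be serialized as JSON text.

```python
import json
import sqlite3
from dataclasses import dataclass, field
from typing import List


@dataclass
class InvestigationRecord:
    """Hypothetical shape of one structured record."""
    report_path: str            # location of the raw markdown report
    root_cause_category: str    # one value from the category taxonomy
    causal_chain: List[str] = field(default_factory=list)
    affected_entities: List[str] = field(default_factory=list)


def save_record(conn: sqlite3.Connection, rec: InvestigationRecord) -> int:
    """Insert one record and return its row id (assumed table layout)."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS investigations (
               id INTEGER PRIMARY KEY,
               report_path TEXT NOT NULL,
               root_cause_category TEXT NOT NULL,
               causal_chain TEXT NOT NULL,
               affected_entities TEXT NOT NULL
           )"""
    )
    cur = conn.execute(
        "INSERT INTO investigations "
        "(report_path, root_cause_category, causal_chain, affected_entities) "
        "VALUES (?, ?, ?, ?)",
        (
            rec.report_path,
            rec.root_cause_category,
            json.dumps(rec.causal_chain),        # lists stored as JSON text
            json.dumps(rec.affected_entities),
        ),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")  # in-memory database, for illustration only
record = InvestigationRecord(
    report_path="~/.siclaw/reports/example.md",  # hypothetical file name
    root_cause_category="resource_exhaustion",
    causal_chain=["memory leak", "container exceeds limit", "OOMKilled"],
    affected_entities=["pod/api-server"],
)
row_id = save_record(conn, record)
```

Keeping the causal chain and entity list as JSON text keeps the sketch to a single table; a real schema might normalize affected entities into their own table so that structured queries can match on them directly.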
How It Works
Writing Memory
After Phase 4 (Conclusion), the investigation engine extracts structured data (root cause category, causal chain, affected entities) and stores it as a structured record.
Reading Memory
When a new investigation starts, Phase 2 (Hypothesis Generation) retrieves relevant past investigations using hybrid search:
- Structured query: matches on root cause category, affected entities, environment tags
- Semantic search: vector similarity over investigation chunks (cosine similarity with BM25 boost)
Results scoring above the 0.35 threshold are injected into the hypothesis generation prompt.
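The semantic half of this retrieval can be sketched as follows. Only the cosine similarity, the BM25 boost, and the 0.35 threshold come from the description above. How the boost is combined is an assumption here: the sketch adds a weighted, normalized keyword score to the cosine similarity, and the `boost` weight of 0.2, the `bm25_norm` input, and the strict `>` comparison are all illustrative choices.

```python
import math
from typing import List, Sequence, Tuple

THRESHOLD = 0.35  # relevance cutoff stated above

Candidate = Tuple[str, Sequence[float], float]  # (chunk id, embedding, bm25_norm)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(cosine: float, bm25_norm: float, boost: float = 0.2) -> float:
    """ASSUMPTION: additive boost, with bm25_norm already scaled to [0, 1]."""
    return cosine + boost * bm25_norm


def select_chunks(
    query_vec: Sequence[float],
    candidates: List[Candidate],
    threshold: float = THRESHOLD,
) -> List[Tuple[str, float]]:
    """Score every candidate, keep those above the threshold, best first."""
    scored = [
        (chunk_id, hybrid_score(cosine_similarity(query_vec, emb), bm25))
        for chunk_id, emb, bm25 in candidates
    ]
    kept = [(chunk_id, score) for chunk_id, score in scored if score > threshold]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


results = select_chunks(
    [1.0, 0.0],
    [
        ("a", [1.0, 0.0], 0.5),  # same direction as the query
        ("b", [0.0, 1.0], 1.0),  # orthogonal: keyword match alone is not enough
        ("c", [1.0, 1.0], 0.0),  # partial semantic match, no keyword match
    ],
)
```

With these toy vectors, chunk `b` scores 0.2 and is dropped, while `a` and `c` pass the threshold and would be handed to the hypothesis generation prompt.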
Root Cause Categories
Structured records use a fixed category taxonomy:
| Category | Examples |
|---|---|
| resource_exhaustion | OOMKilled, CPU throttling, disk full |
| config_error | Missing configmap, wrong env var, bad YAML |
| network_partition | DNS failure, CNI issues, firewall rules |
| hardware_failure | Disk failure, GPU error, NIC flap |
| pcie_error | PCIe bus error, GPU fallen off bus |
| driver_issue | NVIDIA driver mismatch, kernel module missing |
| software_bug | Application crash, deadlock, race condition |
| permission_denied | RBAC, PodSecurityPolicy, filesystem permissions |
| scheduling_failure | Insufficient resources, affinity conflicts |
| unknown | No clear root cause identified |
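A fixed taxonomy like this maps naturally onto an enumeration. The sketch below is illustrative only: the `RootCauseCategory` and `parse_category` names are hypothetical, and falling back to `unknown` for unrecognized input is an assumption rather than documented behavior.

```python
from enum import Enum


class RootCauseCategory(str, Enum):
    """The ten categories listed in the table above."""
    RESOURCE_EXHAUSTION = "resource_exhaustion"
    CONFIG_ERROR = "config_error"
    NETWORK_PARTITION = "network_partition"
    HARDWARE_FAILURE = "hardware_failure"
    PCIE_ERROR = "pcie_error"
    DRIVER_ISSUE = "driver_issue"
    SOFTWARE_BUG = "software_bug"
    PERMISSION_DENIED = "permission_denied"
    SCHEDULING_FAILURE = "scheduling_failure"
    UNKNOWN = "unknown"


def parse_category(raw: str) -> RootCauseCategory:
    """Normalize free-form text to a taxonomy value.

    ASSUMPTION: anything outside the taxonomy is recorded as UNKNOWN.
    """
    normalized = raw.strip().lower().replace("-", "_").replace(" ", "_")
    try:
        return RootCauseCategory(normalized)
    except ValueError:
        return RootCauseCategory.UNKNOWN
```

For example, `parse_category("PCIe Error")` resolves to `RootCauseCategory.PCIE_ERROR`, while a label outside the table resolves to `RootCauseCategory.UNKNOWN`.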
Storage
| Component | Engine | Location |
|---|---|---|
| Structured records | node:sqlite (native) | ~/.siclaw/memory/.memory.db |
| Embeddings | node:sqlite + FTS5 | Same database |
| Raw reports | Markdown files | ~/.siclaw/reports/ |
Chunking
Memory files are split on heading boundaries for optimal retrieval:
- Max chunk size: ~400 tokens (~1600 bytes)
- Overlap: ~80 tokens between adjacent chunks
- Each chunk tracks: file path, heading breadcrumb, start/end line
- Keyword queries: CJK text is split into bigrams joined with OR; Latin terms are joined with AND
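The chunking rules above can be approximated with the sketch below. It is a simplified illustration, not the actual chunker: it assumes ATX-style (`#`) markdown headings, measures size in bytes rather than tokens, treats the ~80-token overlap as ~320 bytes, and carries whole lines over as overlap. All names (`Chunk`, `chunk_markdown`, `build_keyword_query`) are hypothetical, and the query builder treats any query containing CJK characters as a CJK query.

```python
import re
from dataclasses import dataclass
from typing import List, Tuple

MAX_CHUNK_BYTES = 1600  # ~400 tokens, per the limit above
OVERLAP_BYTES = 320     # ~80 tokens, ASSUMING the same ~4 bytes per token
HEADING = re.compile(r"^(#{1,6})\s+(.*)$")  # ASSUMPTION: ATX-style headings
CJK_RUN = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff\uac00-\ud7af]+")


@dataclass
class Chunk:
    path: str
    breadcrumb: str  # heading trail, e.g. "Report > Root Cause"
    start_line: int  # 1-indexed, inclusive
    end_line: int
    text: str


def _size(pairs: List[Tuple[int, str]]) -> int:
    """Byte size of the lines, counting one newline per line."""
    return sum(len(line.encode("utf-8")) + 1 for _, line in pairs)


def chunk_markdown(path: str, text: str) -> List[Chunk]:
    chunks: List[Chunk] = []
    stack: List[str] = []            # open heading titles, outermost first
    buf: List[Tuple[int, str]] = []  # (line number, line) pairs of the open chunk
    crumb = ""

    def flush() -> None:
        if any(line.strip() for _, line in buf):
            body = "\n".join(line for _, line in buf)
            chunks.append(Chunk(path, crumb, buf[0][0], buf[-1][0], body))

    for number, line in enumerate(text.splitlines(), start=1):
        match = HEADING.match(line)
        if match:
            # A heading always closes the open chunk and starts a new one.
            flush()
            buf = []
            level = len(match.group(1))
            stack = stack[: level - 1] + [match.group(2).strip()]
            crumb = " > ".join(stack)
        elif buf and _size(buf) + _size([(number, line)]) > MAX_CHUNK_BYTES:
            # Size limit reached: close the chunk, carry its tail over as overlap.
            flush()
            tail: List[Tuple[int, str]] = []
            for pair in reversed(buf):
                if _size(tail) + _size([pair]) > OVERLAP_BYTES:
                    break
                tail.insert(0, pair)
            buf = tail
        buf.append((number, line))
    flush()
    return chunks


def build_keyword_query(query: str) -> str:
    """OR over CJK bigrams, AND over Latin terms.

    ASSUMPTION: a query containing any CJK run is treated as a CJK query.
    """
    runs = CJK_RUN.findall(query)
    if runs:
        bigrams = [run[i:i + 2] for run in runs for i in range(max(len(run) - 1, 1))]
        return " OR ".join(f'"{b}"' for b in bigrams)
    return " AND ".join(f'"{word}"' for word in re.findall(r"\w+", query))
```

Under these assumptions, `build_keyword_query("disk full")` yields `"disk" AND "full"`, and `build_keyword_query("内存泄漏")` yields `"内存" OR "存泄" OR "泄漏"`. OR suits bigrams because overlapping two-character windows rarely all match in a single chunk, whereas requiring every Latin term keeps results precise.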