Today was the first design session for Metis. No code was written. Instead, we started from first principles: what is knowledge, and how should it be represented for machines to reason with?
The question
Before building anything, we need a fundamental data unit — the atom that everything else composes from. Get this wrong and the entire system inherits the mistake.
Epistemological survey
We started with philosophy. Knowledge isn’t one thing:
- Propositional knowledge (knowing-that): claims about the world.
- Procedural knowledge (knowing-how): how to do things, not reducible to propositions.
- Knowledge by acquaintance (knowing-of): direct experiential familiarity, mental models.
Cognitive science added structure: semantic networks (knowledge defined by relationships), schemas (structured templates with slots and defaults), chunks (expertise compresses complexity), and mental models (runnable internal simulations).
A system that only stores propositions is fundamentally incomplete.
First attempt: the triple
We started with a knowledge graph triple — (subject) --[relation]--> (object) plus metadata (conditions, confidence, source, domain). Eleven relation types: is-a, causes, enables, inhibits, precedes, part-of, example-of, contradicts, has-property, correlates-with, modulates.
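To make the first attempt concrete, here is a minimal sketch of the triple representation in Python. Field and class names are illustrative, not the project's actual schema; the relation vocabulary is the eleven types listed above.

```python
from dataclasses import dataclass, field

# The eleven relation types from the first design attempt.
RELATIONS = {
    "is-a", "causes", "enables", "inhibits", "precedes", "part-of",
    "example-of", "contradicts", "has-property", "correlates-with",
    "modulates",
}

@dataclass
class Triple:
    subject: str
    relation: str                                  # one of RELATIONS
    obj: str
    conditions: list[str] = field(default_factory=list)
    confidence: float = 1.0                        # 0.0 - 1.0
    source: str = ""
    domain: list[str] = field(default_factory=list)

    def __post_init__(self):
        if self.relation not in RELATIONS:
            raise ValueError(f"unknown relation: {self.relation}")

# An example claim (content invented for illustration):
t = Triple("penetration rate", "correlates-with", "lifecycle stage",
           confidence=0.8, domain=["industry-analysis"])
```

The closed relation set is what makes triples easy to query systematically, and also what the stress test below pushed against.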
We stress-tested this against two dense chapters from an industry analysis textbook. Simple claims worked perfectly. Taxonomic relationships worked well. But we hit problems:
- Multi-dimensional concepts (a 2x2 matrix of frequency vs. elasticity) required cramming compound concepts into single fields.
- Rich examples didn’t fit cleanly as first-class atoms.
- Procedures with branching logic needed more than simple precedes chains.
The linguistic insight
Charles Fillmore’s Frame Semantics provided the breakthrough. Fillmore argued that meaning is organized in frames — structured situations with defined roles. The word “buy” evokes a Commercial Transaction frame with roles: Buyer, Seller, Goods, Money.
Our triple was forcing everything into two roles (subject, object). Many knowledge structures naturally have three, four, or more roles.
The atom: a micro-frame
The atom became:
Atom {
id: unique identifier
frame: frame type (from a taxonomy)
roles: { role_name: entity, ... }
conditions: [when this holds]
confidence: 0.0 - 1.0
source: { title, author, location }
domain: [topic tags]
examples: [optional illustrations]
}
Simple binary relations are just frames with two roles — nothing is lost. But a demand evaluation matrix naturally becomes one atom with four roles instead of four awkward triples.
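As a sketch, the schema above translates directly into a Python dataclass. The four role names in the example atom are hypothetical, chosen to mirror the demand evaluation matrix; they are not the book's own terms.

```python
from dataclasses import dataclass, field

@dataclass
class Atom:
    id: str
    frame: str                                 # frame type from the taxonomy
    roles: dict[str, str]                      # role_name -> entity
    conditions: list[str] = field(default_factory=list)
    confidence: float = 1.0                    # 0.0 - 1.0
    source: dict = field(default_factory=dict) # title, author, location
    domain: list[str] = field(default_factory=list)
    examples: list[str] = field(default_factory=list)

# A four-role demand-matrix atom (frame and role names hypothetical):
demand_matrix = Atom(
    id="atom-042",
    frame="classification-matrix",
    roles={
        "dimension-1": "purchase frequency",
        "dimension-2": "demand elasticity",
        "classified": "product categories",
        "output": "demand attractiveness",
    },
    confidence=0.9,
    domain=["industry-analysis"],
)

# One atom, four roles -- no reification into awkward triples needed.
assert len(demand_matrix.roles) == 4
```

A binary relation is the degenerate case: the same class with exactly two roles, so the triple design survives as a subset.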
Stress test results
We extracted 63 atoms across the two chapters using 23 frame types (17 core + 6 domain-specific). The extraction was systematic, and every atom was independently queryable.
Biggest win: the deviation frame type (roles: theory, reality, implication). The book’s core value-add is “here’s what textbooks say, here’s why reality differs” — this frame captured it perfectly every time.
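For illustration, a deviation-frame atom looks like this. The three role names come from the session notes; the filler text is invented, not quoted from the book.

```python
# Sketch of a deviation-frame atom (filler text is illustrative).
deviation_atom = {
    "id": "atom-017",
    "frame": "deviation",
    "roles": {
        "theory": "what the textbook model predicts",
        "reality": "what practitioners actually observe",
        "implication": "how to adjust the analysis",
    },
    "confidence": 0.85,
}
```

The fixed theory/reality/implication slot structure is what makes extraction repeatable: every "textbooks say X, but in practice Y" passage lands in the same shape.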
Application test
We then asked: “Is it a good time to start working as a YouTube influencer?” — a question nowhere in the source material.
The atoms told the system what to analyze (lifecycle stage via penetration rate, demand feasibility via time/space benchmarking, profit feasibility via demand matrix and unit economics) and how to interpret what it found. The resulting answer had a structured reasoning chain that test users preferred over direct LLM responses.
Key finding: atoms are reasoning templates, not data warehouses. They tell the system what questions to ask and how to interpret answers, but current data must be fetched separately.
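A minimal sketch of the reasoning-template idea, assuming atoms are plain dicts as above (function name and role fillers are hypothetical): each role filler becomes a data-gathering question, and the answers come from live sources rather than from the atom store.

```python
def questions_from_atom(atom: dict) -> list[str]:
    """Turn each role filler into a question the system must answer
    with current data fetched separately from the atom store."""
    return [f"What is the current {entity}?"
            for entity in atom["roles"].values()]

# Illustrative atom: lifecycle stage measured via penetration rate.
lifecycle_atom = {
    "frame": "measurement",
    "roles": {"indicator": "penetration rate",
              "target": "lifecycle stage"},
}

questions = questions_from_atom(lifecycle_atom)
# -> ["What is the current penetration rate?",
#     "What is the current lifecycle stage?"]
```

The atom never stores the penetration rate itself; it only encodes that the rate is the thing worth measuring and what measuring it tells you.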
What’s next
- Define the core frame type taxonomy
- Design the learning pipeline architecture (raw content → atoms)
- Design the retrieval engine (query → relevant atoms → structured context)
- Test with a niche domain where LLMs are weak