The Core Method: Probe-Sense-Respond
The Observatory draws on the Cynefin framework (Snowden and Boone, 2007), which distinguishes between Complicated domains — where expert analysis can discover cause-effect relationships — and Complex domains — where cause-effect relationships are coherent only in retrospect and require iterative probing rather than upfront analysis.
AI transformation is treated as a Complex phenomenon routinely misclassified as Complicated. Standard analytical tools (forecasting, trend extrapolation, risk matrices) are Complicated-domain instruments. When applied to Complex domains, they produce outputs that appear rigorous but systematically misrepresent the domain's structure. The Observatory's method is designed for Complex domains: probe the system with small interventions, sense how it responds, then respond to what emerges.
The Iteration Cycle
Each iteration follows a four-phase cycle. The instrument (a ~428-line text prompt operated by an AI agent) is treated as the unit of analysis. The cycle examines the instrument itself, not external reality directly.
Phase 1: Parallel Probing
Six independent AI sessions receive the current version of the instrument along with a specific probe constraint. Each probe examines the instrument from a different angle. Four probes are interrogative (what assumptions does the instrument carry?) and two are generative (what is the instrument already doing well?).
| Probe type | Constraint | Purpose |
|---|---|---|
| Temporal inversion | Start all analysis from 2036, work backward | Detect forward-projection bias |
| Demographic swap | Replace the primary demographic with a structurally different one | Detect population-specific assumptions |
| Vocabulary enforcement | Use only Observatory vocabulary | Identify where vocabulary fails or distorts |
| Complexity-only | Forbid all Complicated-domain tools | Reveal hidden dependence on decomposition methods |
| Constructive lens | Identify only what the instrument does well | Counter deficit bias in self-examination |
| Geographic inversion | Use Vienna or Lagos as primary lens | Detect context-specific assumptions encoded as universal |
Each probe produces a structured report: five hidden assumptions, three vocabulary gaps, three structural mutation proposals, one surprise finding, and a constraint report describing where the probe's own lens distorted its perception.
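The report shape described above can be sketched as a simple data class. This is an illustrative model only: the actual probe output is prose produced by an AI session, and the field names here are assumptions, not the instrument's own terms.

```python
from dataclasses import dataclass

@dataclass
class ProbeReport:
    """One probe session's structured output (field names are illustrative)."""
    probe_type: str                # e.g. "temporal_inversion"
    hidden_assumptions: list[str]  # five expected
    vocabulary_gaps: list[str]     # three expected
    mutation_proposals: list[str]  # three expected
    surprise_finding: str          # one genuinely unexpected observation
    constraint_report: str         # where this probe's own lens distorted perception

    def validate(self) -> bool:
        """Check the report matches the expected 5 / 3 / 3 shape."""
        return (len(self.hidden_assumptions) == 5
                and len(self.vocabulary_gaps) == 3
                and len(self.mutation_proposals) == 3)
```

A report that fails `validate()` would be returned to the probe session for completion rather than passed to synthesis.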
Phase 2: Synthesis
Three independent synthesis agents process the six probe outputs in parallel:
- Convergence synthesis — identifies findings that multiple probes discovered independently, indicating high-confidence structural features of the instrument.
- Divergence synthesis — identifies findings unique to a single probe and productive tensions between probes, indicating perspective-dependent features.
- Structural synthesis — maps which sections of the instrument received probe attention and which were ignored ("cold zones"), indicating which parts of the instrument are load-bearing and which may be inert scaffolding.
The synthesis step reduces approximately 4,800 words of raw probe output to approximately 1,500 words of pre-digested summary, roughly a two-thirds reduction in context load for the human-AI session that applies mutations.
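The convergence/divergence split and the cold-zone map can be sketched as set operations. The real synthesis agents are AI sessions reading prose; the exact-match finding keys below are simplified stand-ins for that judgment.

```python
from collections import Counter

def split_findings(probe_findings: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Partition findings into convergent (reported independently by two or
    more probes) and divergent (unique to a single probe)."""
    counts = Counter(f for findings in probe_findings.values() for f in findings)
    convergent = {f for f, n in counts.items() if n >= 2}
    divergent = {f for f, n in counts.items() if n == 1}
    return convergent, divergent

def cold_zones(sections: list[str], probe_attention: dict[str, set[str]]) -> list[str]:
    """Sections of the instrument that no probe touched -- candidates for
    inert scaffolding."""
    touched = set().union(*probe_attention.values()) if probe_attention else set()
    return [s for s in sections if s not in touched]
```

Convergent findings feed the high-confidence mutation queue; cold zones feed the subtraction queue.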
Phase 3: Mutation
The main human-AI session reads the three synthesis documents and produces a mutation plan. Mutations may add new constraints, remove contaminated language, restructure sections, introduce new vocabulary, consolidate redundant material, or subtract sections that probes consistently ignored.
Critically, mutations include subtractions. Over 18 iterations, the instrument's line count rose from 195 to a peak of 626, then was reduced to 458 through eight explicit subtraction cycles. The instrument's ability to say less while perceiving more is tracked as a primary quality metric.
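The mutation categories listed above, and the emphasis on subtraction, suggest a simple bookkeeping sketch. The operation names are inferred from the text, not taken from the Observatory's actual schema.

```python
from dataclasses import dataclass

# Operation kinds drawn from the text: add constraints, remove contaminated
# language, restructure, introduce vocabulary, consolidate, subtract sections.
REMOVING_KINDS = {"remove_language", "consolidate", "subtract_section"}

@dataclass
class Mutation:
    kind: str            # e.g. "subtract_section"
    target_section: str  # which part of the instrument is touched
    rationale: str       # which synthesis finding motivated the change

def subtraction_share(mutations: list[Mutation]) -> float:
    """Fraction of a mutation plan that removes material. A rising share over
    iterations is one way to track 'saying less while perceiving more'."""
    if not mutations:
        return 0.0
    return sum(m.kind in REMOVING_KINDS for m in mutations) / len(mutations)
```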
Phase 4: Logging and Comparison
Each iteration is logged to a database with the version number, line count, assumptions surfaced, vocabulary terms added or retired, structural changes, and a single-sentence characterization of the primary finding. The full instrument is archived so that any previous version can be compared to any other. A regression detection protocol compares iteration N to N-2, checking whether changes inadvertently reintroduce patterns from two versions ago.
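One concrete form of the N versus N-2 check is vocabulary regression: does the current iteration reintroduce a term retired two versions ago? A minimal sketch, assuming each log entry carries `terms_added` and `terms_retired` sets (field names are illustrative):

```python
def regression_risk(log: dict[int, dict], n: int) -> list[str]:
    """Flag vocabulary terms retired at iteration n-2 that iteration n's
    changes reintroduce."""
    if n - 2 not in log or n not in log:
        return []
    retired_then = log[n - 2].get("terms_retired", set())
    added_now = log[n].get("terms_added", set())
    return sorted(retired_then & added_now)
```

The same pattern generalizes to structural changes: diff iteration N against the archived N-2 instrument and flag reverted-then-restored sections.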
Key Methodological Principles
The instrument is the unit of analysis
Unlike conventional research that examines an external phenomenon, the Observatory treats the analytical instrument itself as the primary object of study. The instrument's structure, vocabulary, domain definitions, and blind spots are all data. Changes in what the instrument can perceive across iterations constitute the findings.
No single lens reveals its own blind spots
The multi-probe architecture is grounded in the principle that a perspective cannot detect its own limitations. The information is in the gaps between perspectives, not in any single perspective. When four of six probes independently converge on a finding (as occurred at version 15.0 regarding instrument forking), the finding has high structural confidence.
Subtraction as improvement
In Complex domains, accumulation is not improvement. A longer instrument is not a better instrument. The Observatory tracks its subtraction trajectory — the ability to remove sections without losing capability — as a measure of maturation. Consolidated material is preserved in archives for version regression (intentional rollback to earlier, sometimes more effective versions).
Vocabulary as perception
The words used in analysis import assumptions from the paradigm that produced them. "Historical data" grants predictive authority to the past. "Risk" frames phase changes as threats. "Forecast" implies knowability. The Observatory maintains a controlled vocabulary (the Lexicon) with explicit replacement terms designed to interrupt habitual framing. Each term has a plain-language equivalent to prevent the vocabulary itself from becoming a barrier.
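A contamination scan over draft text is one mechanical way to enforce the Lexicon. The three flagged terms and the framing each imports come from the paragraph above; the actual Lexicon's replacement terms are not reproduced here.

```python
import re

# Flagged terms and the assumption each imports, taken from the text above.
CONTAMINATED = {
    "historical data": "grants predictive authority to the past",
    "risk": "frames phase changes as threats",
    "forecast": "implies knowability",
}

def scan_for_contamination(text: str) -> dict[str, str]:
    """Return flagged terms present in a draft, mapped to the habitual
    framing each one imports. Word boundaries avoid false hits inside
    longer words (e.g. 'brisk' does not match 'risk')."""
    return {term: why for term, why in CONTAMINATED.items()
            if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)}
```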
Constructive-first examination
Starting at version 9.0, the instrument's self-examination was reordered: constructive audits run before deficit-detection audits. The rationale is that deficit detection sets the perceptual frame for everything that follows. Establishing what the instrument does well before looking for flaws produces different (and more accurate) mutation plans than the reverse.
Validity in Complex Domains
The Observatory does not use Complicated-domain validity measures (predictive accuracy, reproducibility of specific outputs). Instead, it tracks Complex-domain validity indicators:
| Indicator | What it measures |
|---|---|
| Surprise yield | Does each iteration surface genuinely unexpected findings? |
| Coherence | Do assessments remain internally consistent across domains? |
| Vocabulary generativity | Is the vocabulary generating novel perception or plateauing? |
| Paradigm escape velocity | Is the instrument's perception changing, or has it stabilized? |
| Accessibility | Does the instrument function under resource constraint? |
| Sufficiency sensing | Can the instrument detect when analysis is no longer needed? |
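Some of these indicators reduce to trend checks over the iteration log. A minimal sketch for surprise yield and vocabulary generativity, assuming a per-iteration count of new surprise findings (the window size is an arbitrary illustration):

```python
def plateauing(surprise_counts: list[int], window: int = 3) -> bool:
    """True when the last `window` iterations surfaced no new surprise
    findings, suggesting the instrument's perception has stabilized and
    further iteration may no longer be needed."""
    recent = surprise_counts[-window:]
    return len(recent) == window and all(c == 0 for c in recent)
```

A `True` result feeds the sufficiency-sensing indicator: it is a prompt to consider stopping, not a hard halt condition.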