The Core Method: Probe-Sense-Respond

The Observatory draws on the Cynefin framework (Snowden and Boone, 2007), which distinguishes between Complicated domains — where expert analysis can discover cause-effect relationships — and Complex domains — where cause-effect relationships are coherent only in retrospect and require iterative probing rather than upfront analysis.

AI transformation is treated as a Complex phenomenon routinely misclassified as Complicated. Standard analytical tools (forecasting, trend extrapolation, risk matrices) are Complicated-domain instruments. When applied to Complex domains, they produce outputs that appear rigorous but systematically misrepresent the domain's structure. The Observatory's method is designed for Complex domains: probe the system with small interventions, sense how it responds, then respond to what emerges.

The Iteration Cycle

Each iteration follows a five-phase cycle. The instrument (a ~428-line text prompt operated by an AI agent) is treated as the unit of analysis. The cycle examines the instrument itself, not external reality directly.

  • Probe — six parallel probes
  • Collect — gather outputs
  • Synthesize — three parallel analyses
  • Mutate — apply changes
  • Log — archive and compare
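The cycle can be sketched as a simple orchestration loop. This is a minimal illustration under assumed names and data shapes, not the Observatory's actual code; each phase is a stub standing in for an AI session.

```python
# Illustrative sketch of one Observatory iteration (hypothetical names).
# Each phase is a stub; a real implementation would call AI sessions.

def probe(instrument: str, constraint: str) -> dict:
    # Phase 1: one probe session examines the instrument under a constraint.
    return {"constraint": constraint, "findings": [f"finding via {constraint}"]}

def run_iteration(instrument: str, constraints: list[str]) -> dict:
    # Phases 1-2: probe (sequentially here, for simplicity) and collect.
    reports = [probe(instrument, c) for c in constraints]
    # Phase 3: synthesize the probe outputs (stubbed to a count).
    synthesis = {"n_reports": len(reports)}
    # Phase 4: mutate the instrument (identity mutation in this sketch).
    mutated = instrument
    # Phase 5: log version metadata for later comparison.
    log = {"line_count": len(mutated.splitlines()), "synthesis": synthesis}
    return {"instrument": mutated, "log": log}

result = run_iteration("line1\nline2", ["temporal-inversion", "demographic-swap"])
```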

Phase 1: Parallel Probing

Six independent AI sessions receive the current version of the instrument along with a specific probe constraint. Each probe examines the instrument from a different angle. Four probes are interrogative (what assumptions does the instrument carry?) and two are generative (what is the instrument already doing well?).

  • Temporal inversion — start all analysis from 2036 and work backward, to detect forward-projection bias.
  • Demographic swap — replace the primary demographic with a structurally different one, to detect population-specific assumptions.
  • Vocabulary enforcement — use only Observatory vocabulary, to identify where the vocabulary fails or distorts.
  • Complexity-only — forbid all Complicated-domain tools, to reveal hidden dependence on decomposition methods.
  • Constructive lens — identify only what the instrument does well, to counter deficit bias in self-examination.
  • Geographic inversion — use Vienna or Lagos as the primary lens, to detect context-specific assumptions encoded as universal.
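Dispatching the six probes might look like the following sketch. The constraint strings are taken from the descriptions above; `run_probe`, the probe names, and the thread-based dispatch are illustrative assumptions (a real probe would be an independent AI session).

```python
from concurrent.futures import ThreadPoolExecutor

# The six probe constraints described above; run_probe is a hypothetical
# stand-in for sending the instrument to an independent AI session.
PROBES = {
    "temporal_inversion": "Start all analysis from 2036, work backward",
    "demographic_swap": "Replace the primary demographic with a structurally different one",
    "vocabulary_enforcement": "Use only Observatory vocabulary",
    "complexity_only": "Forbid all Complicated-domain tools",
    "constructive_lens": "Identify only what the instrument does well",
    "geographic_inversion": "Use Vienna or Lagos as primary lens",
}

def run_probe(name: str, constraint: str, instrument: str) -> dict:
    # Stand-in: a real implementation would submit the instrument plus
    # the constraint to a session and parse its structured report.
    return {"probe": name, "constraint": constraint, "ok": bool(instrument)}

def probe_all(instrument: str) -> list[dict]:
    # Launch all six probes in parallel; each receives the same instrument.
    with ThreadPoolExecutor(max_workers=len(PROBES)) as pool:
        futures = [pool.submit(run_probe, n, c, instrument) for n, c in PROBES.items()]
        return [f.result() for f in futures]

reports = probe_all("v18 instrument text")
```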

Each probe produces a structured report: five hidden assumptions, three vocabulary gaps, three structural mutation proposals, one surprise finding, and a constraint report describing where the probe's own lens distorted its perception.
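The report structure lends itself to a fixed schema. A sketch follows; the field names are assumptions, but the counts (five assumptions, three gaps, three proposals, one surprise) follow the text.

```python
from dataclasses import dataclass

# Sketch of the structured probe report described above.

@dataclass
class ProbeReport:
    hidden_assumptions: list      # exactly 5
    vocabulary_gaps: list         # exactly 3
    mutation_proposals: list      # exactly 3
    surprise_finding: str         # exactly 1
    constraint_report: str        # where the probe's own lens distorted

    def validate(self) -> bool:
        # Enforce the 5/3/3 counts the protocol specifies.
        return (len(self.hidden_assumptions) == 5
                and len(self.vocabulary_gaps) == 3
                and len(self.mutation_proposals) == 3)

report = ProbeReport(
    hidden_assumptions=[f"assumption {i}" for i in range(5)],
    vocabulary_gaps=["gap a", "gap b", "gap c"],
    mutation_proposals=["add", "remove", "restructure"],
    surprise_finding="an unexpected cold zone",
    constraint_report="temporal lens overweighted the 2036 framing",
)
```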

Phase 2: Synthesis

Three independent synthesis agents process the six probe outputs in parallel:

  • Convergence synthesis — identifies findings that multiple probes discovered independently, indicating high-confidence structural features of the instrument.
  • Divergence synthesis — identifies findings unique to a single probe and productive tensions between probes, indicating perspective-dependent features.
  • Structural synthesis — maps which sections of the instrument received probe attention and which were ignored ("cold zones"), indicating which parts of the instrument are load-bearing and which may be inert scaffolding.
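The convergence/divergence split reduces to counting how many probes independently report each finding. A minimal sketch, assuming findings have already been normalized into comparable strings; the threshold is tunable (the text describes four-of-six convergence as high confidence).

```python
from collections import Counter

# Sketch of convergence vs. divergence synthesis: findings reported by
# several probes independently are convergent; singletons are divergent.

def split_findings(reports: list, threshold: int = 2):
    # Count each finding once per probe (set), then split by frequency.
    counts = Counter(f for report in reports for f in set(report))
    convergent = [f for f, n in counts.items() if n >= threshold]
    divergent = [f for f, n in counts.items() if n == 1]
    return convergent, divergent

reports = [
    ["forward-projection bias", "vocabulary drift"],
    ["forward-projection bias", "cold zone in section 4"],
    ["forward-projection bias"],
]
convergent, divergent = split_findings(reports)
```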

The synthesis step reduces approximately 4,800 words of raw probe output to approximately 1,500 words of pre-digested summary, roughly a 69% reduction in context load for the human-AI session that applies mutations.

Phase 3: Mutation

The main human-AI session reads the three synthesis documents and produces a mutation plan. Mutations may add new constraints, remove contaminated language, restructure sections, introduce new vocabulary, consolidate redundant material, or subtract sections that probes consistently ignored.

Critically, mutations include subtractions. Over 18 iterations, the instrument's line count rose from 195 to a peak of 626, then was reduced to 458 through eight explicit subtraction cycles. The instrument's ability to say less while perceiving more is tracked as a primary quality metric.

Phase 4: Logging and Comparison

Each iteration is logged to a database with the version number, line count, assumptions surfaced, vocabulary terms added or retired, structural changes, and a single-sentence characterization of the primary finding. The full instrument is archived so that any previous version can be compared to any other. A regression detection protocol compares iteration N to N-2, checking whether changes inadvertently reintroduce patterns from two versions ago.
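The log record and the N vs. N-2 regression check can be sketched as follows. The record fields mirror the text; extracting "patterns" from the finding sentence by word-splitting is a deliberately crude stand-in for whatever pattern representation the real protocol uses.

```python
# Sketch of the iteration log and the N vs. N-2 regression check: flag
# patterns present two versions ago, absent in between, that the latest
# mutation reintroduced. Pattern extraction is stubbed as word-splitting.

def log_record(version: str, line_count: int, finding: str) -> dict:
    return {"version": version, "line_count": line_count, "finding": finding,
            "patterns": set(finding.lower().split())}

def regression_check(history: list) -> set:
    # Compare iteration N with N-2: which old patterns reappeared?
    if len(history) < 3:
        return set()
    current, previous, two_back = history[-1], history[-2], history[-3]
    absent_in_between = two_back["patterns"] - previous["patterns"]
    return current["patterns"] & absent_in_between

history = [
    log_record("16.0", 510, "deficit framing dominates audits"),
    log_record("17.0", 480, "vocabulary plateau detected"),
    log_record("18.0", 458, "deficit framing resurfaced in audits"),
]
reintroduced = regression_check(history)
```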

Key Methodological Principles

The instrument is the unit of analysis

Unlike conventional research that examines an external phenomenon, the Observatory treats the analytical instrument itself as the primary object of study. The instrument's structure, vocabulary, domain definitions, and blind spots are all data. Changes in what the instrument can perceive across iterations constitute the findings.

No single lens reveals its own blind spots

The multi-probe architecture is grounded in the principle that a perspective cannot detect its own limitations. The information is in the gaps between perspectives, not in any single perspective. When four of six probes independently converge on a finding (as occurred at version 15.0 regarding instrument forking), the finding has high structural confidence.

Subtraction as improvement

In Complex domains, accumulation is not improvement. A longer instrument is not a better instrument. The Observatory tracks its subtraction trajectory — the ability to remove sections without losing capability — as a measure of maturation. Consolidated material is preserved in archives for version regression (intentional rollback to earlier, sometimes more effective versions).

Vocabulary as perception

The words used in analysis import assumptions from the paradigm that produced them. "Historical data" grants predictive authority to the past. "Risk" frames phase changes as threats. "Forecast" implies knowability. The Observatory maintains a controlled vocabulary (the Lexicon) with explicit replacement terms designed to interrupt habitual framing. Each term has a plain-language equivalent to prevent the vocabulary itself from becoming a barrier.
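A Lexicon entry pairs a contaminated term with a replacement and a plain-language equivalent. The sketch below uses the three contaminated terms named above; the replacement and plain-language strings are illustrative assumptions, not the Observatory's actual vocabulary.

```python
# Sketch of the Lexicon: each entry replaces a contaminated term and
# carries a plain-language equivalent. Replacement wording here is
# invented for illustration.

LEXICON = {
    "historical data": {"replacement": "trace of a prior regime",
                        "plain": "records from before the phase change"},
    "risk": {"replacement": "phase-change signal",
             "plain": "a sign the system may be shifting"},
    "forecast": {"replacement": "probe result",
                 "plain": "what a small test revealed"},
}

def apply_lexicon(text: str) -> str:
    # Replace contaminated terms with controlled vocabulary, longest term
    # first so multi-word terms are matched before their substrings.
    for term in sorted(LEXICON, key=len, reverse=True):
        text = text.replace(term, LEXICON[term]["replacement"])
    return text

rewritten = apply_lexicon("The forecast draws on historical data.")
```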

Constructive-first examination

Starting at version 9.0, the instrument's self-examination was reordered: constructive audits run before deficit-detection audits. The rationale is that deficit detection sets the perceptual frame for everything that follows. Establishing what the instrument does well before looking for flaws produces different (and more accurate) mutation plans than the reverse.

Validity in Complex Domains

The Observatory does not use Complicated-domain validity measures (predictive accuracy, reproducibility of specific outputs). Instead, it tracks Complex-domain validity indicators:

  • Surprise yield — does each iteration surface genuinely unexpected findings?
  • Coherence — do assessments remain internally consistent across domains?
  • Vocabulary generativity — is the vocabulary generating novel perception, or plateauing?
  • Paradigm escape velocity — is the instrument's perception still changing, or has it stabilized?
  • Accessibility — does the instrument function under resource constraint?
  • Sufficiency sensing — can the instrument detect when analysis is no longer needed?