The Core Method: Probe-Sense-Respond
The Observatory draws on the Cynefin framework (Snowden and Boone, 2007), which distinguishes between Complicated domains — where expert analysis can discover cause-effect relationships — and Complex domains — where cause-effect relationships are coherent only in retrospect and require iterative probing rather than upfront analysis.
AI transformation is treated as a Complex phenomenon routinely misclassified as Complicated. Standard analytical tools (forecasting, trend extrapolation, risk matrices) are Complicated-domain instruments. When applied to Complex domains, they produce outputs that appear rigorous but systematically misrepresent the domain's structure. The Observatory's method is designed for Complex domains: probe the system with small interventions, sense how it responds, then respond to what emerges.
The Iteration Cycle
Each iteration follows a four-phase cycle. The instrument (a ~428-line text prompt operated by an AI agent) is treated as the unit of analysis. The cycle examines the instrument itself, not external reality directly.
Phase 1: Parallel Probing
Six independent AI sessions receive the current version of the instrument along with a specific probe constraint. Each probe examines the instrument from a different angle. Four probes are interrogative (what assumptions does the instrument carry?) and two are generative (what is the instrument already doing well?).
| Probe type | Constraint | Purpose |
|---|---|---|
| Temporal inversion | Start all analysis from 2036, work backward | Detect forward-projection bias |
| Demographic swap | Replace the primary demographic with a structurally different one | Detect population-specific assumptions |
| Vocabulary enforcement | Use only Observatory vocabulary | Identify where vocabulary fails or distorts |
| Complexity-only | Forbid all Complicated-domain tools | Reveal hidden dependence on decomposition methods |
| Constructive lens | Identify only what the instrument does well | Counter deficit bias in self-examination |
| Geographic inversion | Use Vienna or Lagos as primary lens | Detect context-specific assumptions encoded as universal |
Each probe produces a structured report: five hidden assumptions, three vocabulary gaps, three structural mutation proposals, one surprise finding, and a constraint report describing where the probe's own lens distorted its perception.
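The report shape described above can be sketched as a simple data class. This is an illustrative model only: the actual probe output is prose produced by an AI session, and the field names here are assumptions, not the instrument's own terms.

```python
from dataclasses import dataclass

@dataclass
class ProbeReport:
    """One probe session's structured output (field names are illustrative)."""
    probe_type: str                # e.g. "temporal_inversion"
    hidden_assumptions: list[str]  # five expected
    vocabulary_gaps: list[str]     # three expected
    mutation_proposals: list[str]  # three expected
    surprise_finding: str          # one genuinely unexpected observation
    constraint_report: str         # where this probe's own lens distorted perception

    def validate(self) -> bool:
        """Check the report matches the expected 5 / 3 / 3 shape."""
        return (len(self.hidden_assumptions) == 5
                and len(self.vocabulary_gaps) == 3
                and len(self.mutation_proposals) == 3)
```

A report that fails `validate()` would be returned to the probe session for completion rather than passed to synthesis.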
Phase 2: Synthesis
Three independent synthesis agents process the six probe outputs in parallel:
- Convergence synthesis — identifies findings that multiple probes discovered independently, indicating high-confidence structural features of the instrument.
- Divergence synthesis — identifies findings unique to a single probe and productive tensions between probes, indicating perspective-dependent features.
- Structural synthesis — maps which sections of the instrument received probe attention and which were ignored ("cold zones"), indicating which parts of the instrument are load-bearing and which may be inert scaffolding.
The synthesis step reduces approximately 4,800 words of raw probe output to approximately 1,500 words of pre-digested summary, roughly a two-thirds reduction in context load for the human-AI session that applies mutations.
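The convergence/divergence split and the cold-zone map can be sketched as set operations. The real synthesis agents are AI sessions reading prose; the exact-match finding keys below are simplified stand-ins for that judgment.

```python
from collections import Counter

def split_findings(probe_findings: dict[str, set[str]]) -> tuple[set[str], set[str]]:
    """Partition findings into convergent (reported independently by two or
    more probes) and divergent (unique to a single probe)."""
    counts = Counter(f for findings in probe_findings.values() for f in findings)
    convergent = {f for f, n in counts.items() if n >= 2}
    divergent = {f for f, n in counts.items() if n == 1}
    return convergent, divergent

def cold_zones(sections: list[str], probe_attention: dict[str, set[str]]) -> list[str]:
    """Sections of the instrument that no probe touched -- candidates for
    inert scaffolding."""
    touched = set().union(*probe_attention.values()) if probe_attention else set()
    return [s for s in sections if s not in touched]
```

Convergent findings feed the high-confidence mutation queue; cold zones feed the subtraction queue.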
Phase 3: Mutation
The main human-AI session reads the three synthesis documents and produces a mutation plan. Mutations may add new constraints, remove contaminated language, restructure sections, introduce new vocabulary, consolidate redundant material, or subtract sections that probes consistently ignored.
Critically, mutations include subtractions. Over 18 iterations, the instrument's line count rose from 195 to a peak of 626, then was reduced to 458 through eight explicit subtraction cycles. The instrument's ability to say less while perceiving more is tracked as a primary quality metric.
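The mutation categories listed above, and the emphasis on subtraction, suggest a simple bookkeeping sketch. The operation names are inferred from the text, not taken from the Observatory's actual schema.

```python
from dataclasses import dataclass

# Operation kinds drawn from the text: add constraints, remove contaminated
# language, restructure, introduce vocabulary, consolidate, subtract sections.
REMOVING_KINDS = {"remove_language", "consolidate", "subtract_section"}

@dataclass
class Mutation:
    kind: str            # e.g. "subtract_section"
    target_section: str  # which part of the instrument is touched
    rationale: str       # which synthesis finding motivated the change

def subtraction_share(mutations: list[Mutation]) -> float:
    """Fraction of a mutation plan that removes material. A rising share over
    iterations is one way to track 'saying less while perceiving more'."""
    if not mutations:
        return 0.0
    return sum(m.kind in REMOVING_KINDS for m in mutations) / len(mutations)
```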
Phase 4: Logging and Comparison
Each iteration is logged to a database with the version number, line count, assumptions surfaced, vocabulary terms added or retired, structural changes, and a single-sentence characterization of the primary finding. The full instrument is archived so that any previous version can be compared to any other. A regression detection protocol compares iteration N to N-2, checking whether changes inadvertently reintroduce patterns from two versions ago.
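One concrete form of the N versus N-2 check is vocabulary regression: does the current iteration reintroduce a term retired two versions ago? A minimal sketch, assuming each log entry carries `terms_added` and `terms_retired` sets (field names are illustrative):

```python
def regression_risk(log: dict[int, dict], n: int) -> list[str]:
    """Flag vocabulary terms retired at iteration n-2 that iteration n's
    changes reintroduce."""
    if n - 2 not in log or n not in log:
        return []
    retired_then = log[n - 2].get("terms_retired", set())
    added_now = log[n].get("terms_added", set())
    return sorted(retired_then & added_now)
```

The same pattern generalizes to structural changes: diff iteration N against the archived N-2 instrument and flag reverted-then-restored sections.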
Key Methodological Principles
The instrument is the unit of analysis
Unlike conventional research that examines an external phenomenon, the Observatory treats the analytical instrument itself as the primary object of study. The instrument's structure, vocabulary, domain definitions, and blind spots are all data. Changes in what the instrument can perceive across iterations constitute the findings.
No single lens reveals its own blind spots
The multi-probe architecture is grounded in the principle that a perspective cannot detect its own limitations. The information is in the gaps between perspectives, not in any single perspective. When four of six probes independently converge on a finding (as occurred at version 15.0 regarding instrument forking), the finding has high structural confidence.
Subtraction as improvement
In Complex domains, accumulation is not improvement. A longer instrument is not a better instrument. The Observatory tracks its subtraction trajectory — the ability to remove sections without losing capability — as a measure of maturation. Consolidated material is preserved in archives for version regression (intentional rollback to earlier, sometimes more effective versions).
Vocabulary as perception
The words used in analysis import assumptions from the paradigm that produced them. "Historical data" grants predictive authority to the past. "Risk" frames phase changes as threats. "Forecast" implies knowability. The Observatory maintains a controlled vocabulary (the Lexicon) with explicit replacement terms designed to interrupt habitual framing. Each term has a plain-language equivalent to prevent the vocabulary itself from becoming a barrier.
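A contamination scan over draft text is one mechanical way to enforce the Lexicon. The three flagged terms and the framing each imports come from the paragraph above; the actual Lexicon's replacement terms are not reproduced here.

```python
import re

# Flagged terms and the assumption each imports, taken from the text above.
CONTAMINATED = {
    "historical data": "grants predictive authority to the past",
    "risk": "frames phase changes as threats",
    "forecast": "implies knowability",
}

def scan_for_contamination(text: str) -> dict[str, str]:
    """Return flagged terms present in a draft, mapped to the habitual
    framing each one imports. Word boundaries avoid false hits inside
    longer words (e.g. 'brisk' does not match 'risk')."""
    return {term: why for term, why in CONTAMINATED.items()
            if re.search(rf"\b{re.escape(term)}\b", text, re.IGNORECASE)}
```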
Constructive-first examination
Starting at version 9.0, the instrument's self-examination was reordered: constructive audits run before deficit-detection audits. The rationale is that deficit detection sets the perceptual frame for everything that follows. Establishing what the instrument does well before looking for flaws produces different (and more accurate) mutation plans than the reverse.
Validity in Complex Domains
The Observatory does not use Complicated-domain validity measures (predictive accuracy, reproducibility of specific outputs). Instead, it tracks Complex-domain validity indicators:
| Indicator | What it measures |
|---|---|
| Surprise yield | Does each iteration surface genuinely unexpected findings? |
| Coherence | Do assessments remain internally consistent across domains? |
| Vocabulary generativity | Is the vocabulary generating novel perception or plateauing? |
| Paradigm escape velocity | Is the instrument's perception changing, or has it stabilized? |
| Accessibility | Does the instrument function under resource constraint? |
| Sufficiency sensing | Can the instrument detect when analysis is no longer needed? |
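Some of these indicators reduce to trend checks over the iteration log. A minimal sketch for surprise yield and vocabulary generativity, assuming a per-iteration count of new surprise findings (the window size is an arbitrary illustration):

```python
def plateauing(surprise_counts: list[int], window: int = 3) -> bool:
    """True when the last `window` iterations surfaced no new surprise
    findings, suggesting the instrument's perception has stabilized and
    further iteration may no longer be needed."""
    recent = surprise_counts[-window:]
    return len(recent) == window and all(c == 0 for c in recent)
```

A `True` result feeds the sufficiency-sensing indicator: it is a prompt to consider stopping, not a hard halt condition.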