6D Amplifying Analysis
Amplifying · Software Engineering · Methodology / Governance

The Upstream Migration: Fifty Years of Moving the Definition of Correct

The history of software is not a list of fads. It is one unbroken move: each time the cost of a wrong assumption arrived earlier and compounded harder, the discipline that survived was the one that *defined “correct” earlier*. The 1968 software crisis forced structure onto code. Test-Driven Development (2002–03) pinned correctness at implementation; Domain-Driven Design (2003) at the domain model; Behaviour-Driven Development (2006, built on DDD) at user behaviour.[1][2][3][4] Each felt like overhead until the teams that skipped it hit the wall. AI agents collapse the gap between “designed” and “in production” to near zero — so the next move upstream is governance, before the agent runs. That is GDD: not a new idea, but the latest instance of the oldest pattern in the field — and one already running in production.[5][6]

1968 · Software crisis named (NATO Garmisch)
2002–06 · TDD · DDD · BDD pin correctness earlier
≈ 0 · Design-to-production gap, in the AI era
2026 · GDD pins correctness at governance
4 / 0 · Clean cases since the GDD gate was expli
6 of 6 · Dimensions in the cascade

6D Foraging Methodology™

01

The Insight

Read as a list of methodologies, TDD, DDD, BDD and GDD look like a stack of acronyms a generation of engineers argued about. Read structurally, they are one move repeated: each pushes the definition of “correct” one layer earlier in the lifecycle. The pattern is older than any of them. In 1968, at the NATO conference that named the “software crisis,” the field admitted software had outpaced the methods to build it; Dijkstra's structured programming was the first response — discipline the control flow before it becomes unmanageable.[1][7]

The early-2000s cluster moved correctness upstream again, three ways at once. Kent Beck's “Test-Driven Development: By Example” (2002–03) said: prove behaviour at the unit before you write the code.[2] Eric Evans' “Domain-Driven Design” (2003) said: model the business and agree the language before you write a line.[3] Dan North's Behaviour-Driven Development (2006), built directly on DDD's ubiquitous language, said: express what the system should do for the user in a shared, executable vocabulary.[4] Note the real order: DDD preceded and shaped BDD. These were not a relay but a cluster of responses to the same complexity pressure.

Each shift felt like overhead until the teams who skipped it hit the wall. A green unit test cannot catch a misunderstood requirement; a well-modelled domain cannot catch a misunderstood intent handed to a machine that executes before anyone reviews it. Every discipline solved the failure of the layer below it — and, in doing so, exposed a new class of assumption one layer up. The constant is the shape of the move, not a strict order: each pinned correctness earlier than the gap that had just burned someone.
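The claim that a green unit test cannot catch a misunderstood requirement can be made concrete with a small sketch. Everything in it is hypothetical: the shipping rule, the threshold of 50, and the function names are invented for illustration and do not come from any of the cited sources. The point is only that a test written from the same misreading as the code passes, while the behaviour still disagrees with what the business meant.

```python
# Hypothetical example: the rule, the numbers and the names are invented.

def shipping_fee(order_total: float) -> float:
    """Implementation as the developer understood the requirement:
    shipping is free when the order total is AT LEAST 50."""
    return 0.0 if order_total >= 50 else 5.0


def unit_test_is_green() -> bool:
    """Unit test written from the same understanding, so it passes."""
    return shipping_fee(50) == 0.0 and shipping_fee(49.99) == 5.0


def intended_fee(order_total: float) -> float:
    """What the business actually meant (assumed for this sketch):
    shipping is free only ABOVE 50."""
    return 0.0 if order_total > 50 else 5.0


if __name__ == "__main__":
    print("unit test green:", unit_test_is_green())
    print("matches intent at the boundary:", shipping_fee(50) == intended_fee(50))
```

The unit test verifies the code against the developer's reading of the requirement, not against the requirement itself. Closing that gap takes a definition of correct that is agreed one layer earlier than the test.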

There is a new layer now. AI agents execute on inferred intent — fast, autonomous, and wrong in ways that compound before the sprint closes. The gap between “designed” and “in production” has collapsed to near zero. The question none of the prior disciplines had to answer is now the expensive one: do we know what correct means here — before any agent runs — and can we prove it? That is Governance-Driven Design. Not a replacement for TDD, DDD or BDD, and not their equal in standing — their upstream completion, for the era where the machine no longer waits. The honest claim is modest and structural: GDD is the next instance of a 50-year pattern, earned by a working implementation rather than asserted.[5]

Fifty years
Of moving the definition of 'correct' one layer upstream

1968 software crisis → TDD → DDD → BDD → GDD. The same move, repeated, every time the cost of being wrong arrived sooner.[1][2][3] One direction for half a century.

02

The Timeline

From the software crisis to governance before the agent runs — the same move, six times.

1968

The software crisis is named

At the NATO Software Engineering Conference in Garmisch, the field admits software has outpaced the methods to build it. Dijkstra's “Go To Statement Considered Harmful” launches structured programming — the first move to discipline code before it becomes unmanageable.[1][7]

The Crisis
1999–2001

Extreme Programming and the Agile Manifesto

Kent Beck's “Extreme Programming Explained” (1999) builds a test-first culture on the SUnit groundwork (1994); the Agile Manifesto (2001) formalizes working software and short feedback loops over heavy up-front documentation.[2][8]

Agile
2002–03

TDD — correctness at implementation

Beck's “Test-Driven Development: By Example” makes test-first mainstream: prove the behaviour of a unit before you write the code that satisfies it. Correctness is pinned at the implementation layer.[2]

TDD
2003

DDD — correctness at the domain model

Eric Evans' “Domain-Driven Design” (Addison-Wesley, August 2003): model the business and agree a ubiquitous language before writing a line. Correctness is pinned at the domain layer — and predates BDD.[3]

DDD
2006

BDD — correctness at behaviour

Dan North's “Introducing BDD” (Better Software), built on DDD's ubiquitous language: express what the system should do for the user in a shared, executable vocabulary (Given/When/Then). Correctness is pinned at the behaviour layer.[4]

BDD
2026

GDD — correctness at governance

Governance-Driven Design moves the definition of correct before any artifact exists, before any agent runs — the upstream completion of the family for AI-first delivery. Not a replacement; the next instance of the pattern.[5]

GDD
2026

Running in production

The migration's newest layer is live: the CAL publishing pipeline is governed by GDD. After UC-236 shipped with errors that every prior check missed (“cited” ≠ “correct”), an explicit verification gate was built; the next four cases shipped with zero post-publish corrections.[6]

In Production

The discipline that survives is the one that defines 'correct' earlier — because that is exactly where the cost of being wrong keeps moving.

Dimension · Evidence

Quality (D5) · Origin · 90 · The Definition of Correct
The migrating variable is the definition of “correct” itself: what counts, and when it is decided. The whole lineage is the relocation of that definition, earlier and earlier: implementation, domain, behaviour, governance.[1][2] D5 is the origin because every discipline in the family changes the same thing: not the code, but where and when correctness gets pinned down. It is the one dimension the entire fifty-year story is about.

Operational (D6) · L1 · 84 · TDD: Implementation
Test-Driven Development (Beck, 2002–03; roots in SUnit 1994 and XP 1999) pinned correctness at the implementation layer: prove the behaviour of a unit before writing the code that satisfies it.[2] D6 is the first modern instance of the upstream move: correctness defined operationally, at the point of construction, rather than discovered after the fact in QA.

Customer (D1) · L1 · 82 · BDD: Behaviour
Behaviour-Driven Development (North, 2006), built on DDD's ubiquitous language, pinned correctness at the user-behaviour layer: express what the system should do for the user in a shared, executable vocabulary (Given/When/Then).[4] D1 is the move that connected correctness to the customer's intent: does it do the right thing, not merely run without error.

Employee (D3) · L2 · 84 · DDD: Domain & Shared Language
Domain-Driven Design (Evans, 2003) pinned correctness at the domain model and the ubiquitous language the team internalizes: agree what the business means before writing a line.[3] D3 sits with the people and the shared understanding: the discipline a team adopts so that code and conversation describe the same reality. It predated and shaped BDD.

Revenue (D2) · L2 · 80
The engine of the whole migration is the cost of a wrong assumption, arriving earlier and compounding harder each decade. Every shift happened because the discipline that caught the error sooner was cheaper than the one that caught it later; the teams that skipped a shift paid at the wall.[1] D2 is the economic pressure that makes the upstream move inevitable rather than optional, and AI agents, fast and autonomous, sharpen it to a point.

Regulatory (D4) · 86 · GDD: Governance (the terminus)
Governance-Driven Design pins correctness before any artifact exists, before any agent runs: the terminus of the fifty-year migration.[5] D4 is where the cascade arrives because, when a machine executes inferred intent, no check downstream of governance is early enough to catch a wrong assumption. The layer is not theoretical: the CAL publishing pipeline runs it in production, and the UC-236 incident (“cited” ≠ “correct”) is the documented case that built the gate.[6]

03

6D Cascade Analysis

The cascade originates in D5 (Quality) because the migrating variable is the definition of “correct” itself: what counts, and when it is decided. From D5 the definition is pinned first at D6 (implementation, TDD) and D1 (user behaviour, BDD), then at D3 (the domain model and shared language the team internalizes, DDD), driven throughout by D2 (the cost of a wrong assumption, arriving earlier and compounding harder each decade). The terminus is D4 (Regulatory, read here as governance), where GDD pins correctness before any artifact exists. The structural payoff is the route: across fifty years the definition of correct migrates from D6 (implementation) all the way to D4 (governance), and D4 is precisely where the AI era forces it, because a wrong assumption executed by a machine acting on inferred intent cannot be caught by any check downstream of governance.

The cross-references are deliberate: [UC-235] traced a design philosophy propagating across decades, the same mould; [UC-241] and [UC-240] showed AI commoditizing the surface signal while the upstream layer survives, which is exactly GDD's bet; [UC-247] is the adjacent fight over who owns a foundational software layer.

FETCH Score Breakdown

Chirp: 84.3
|DRIFT|: 42
Confidence: 0.88
FETCH = 84.3 × 42 × 0.88 = 3,116  →  EXECUTE — HIGH PRIORITY (threshold: 1,000)
Calibration: FETCH 3,116 calibrates near UC-235 (The Hopper Cascade, 3,155) — the same lineage-propagation mould — and reflects a foundational, primary-sourced case (Beck, Evans, North, Dijkstra; the 1968 NATO conference). DRIFT 42: the pattern's methodology is proven across fifty years (high), but the newest instance, GDD, is nascent — v0.1.0 with a single documented live implementation. The case earns its claim through that implementation, not assertion. Confidence 0.88.
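The arithmetic behind the breakdown can be checked with a short sketch. It rests on two assumptions that this case does not state outright but that reproduce its figures: Chirp is taken as the mean of the six dimension scores in the dimension table (rounded to one decimal), and |DRIFT| as the absolute gap between the METHODOLOGY and PERFORMANCE values in the CAL source. The function names are illustrative, not part of CAL.

```python
# Scores as printed in this case's dimension table.
DIMENSION_SCORES = {"D5": 90, "D6": 84, "D1": 82, "D3": 84, "D2": 80, "D4": 86}

# Values as printed in the DRIFT block of the CAL source.
METHODOLOGY = 90
PERFORMANCE = 48

CONFIDENCE = 0.88
THRESHOLD = 1000


def chirp(scores: dict) -> float:
    """Assumption: Chirp is the mean dimension score, to one decimal."""
    return round(sum(scores.values()) / len(scores), 1)


def drift(methodology: int, performance: int) -> int:
    """Assumption: |DRIFT| is the absolute methodology/performance gap."""
    return abs(methodology - performance)


def fetch(chirp_value: float, drift_value: int, confidence: float) -> float:
    """FETCH = Chirp x |DRIFT| x Confidence, as in the breakdown above."""
    return chirp_value * drift_value * confidence


if __name__ == "__main__":
    c = chirp(DIMENSION_SCORES)
    d = drift(METHODOLOGY, PERFORMANCE)
    score = fetch(c, d, CONFIDENCE)
    print(f"Chirp={c}, |DRIFT|={d}, FETCH={round(score):,}")
    print("EXECUTE" if score > THRESHOLD else "below threshold")
```

Under these assumptions the sketch reproduces the published values: the six scores sum to 506, giving a mean of 84.3; 90 minus 48 gives 42; and the product rounds to 3,116, above the threshold of 1,000.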
Dimensions Hit: 6 of 6
Multiplier: The definition of co
FETCH Score: 3,116
Origin: D5 Quality
L1: D6 Operational + D1 Customer
L2: D3 Employee + D2 Revenue
L3: D4 Regulatory

CAL Source: upstream-migration.cal
upstream-migration · amplifying · D5 origin · correctness, defined earlier every time
-- UC-249: The Upstream Migration: 6D Amplifying Cascade
-- Correctness, defined earlier every time (connects UC-235/241/240/247)
FORAGE upstream_migration
WHERE definition_of_correct = migrating_upstream
  AND cost_of_being_wrong = arriving_earlier
  AND each_discipline_completes_the_last = true
ACROSS D5, D6, D1, D3, D2, D4
DEPTH 3
SURFACE upstream_migration

DIVE INTO correctness_locus
WHEN agent_executes_inferred_intent = true
  AND surface_check_cannot_catch_it = true
TRACE crisis_to_governance_cascade
EMIT upstream_migration_signal

DRIFT upstream_migration
METHODOLOGY 90
PERFORMANCE 48

FETCH upstream_migration
THRESHOLD 1000
ON EXECUTE CHIRP high 'for 50 years software discipline has moved the definition of correct one layer earlier - TDD, DDD, BDD - and now GDD, governance before any AI agent runs; the latest instance of the oldest pattern in the field'

SURFACE analysis AS json

SENSE: FORAGE: 1968 software crisis (NATO Garmisch) + Dijkstra structured programming; TDD (Beck, 2002–03), DDD (Evans, 2003), BDD (North, 2006, built on DDD); GDD (2026). Each discipline moves the definition of 'correct' one layer earlier. Signal: a 50-year, one-directional migration (implementation → domain/behaviour → governance) driven by the cost of a wrong assumption arriving sooner. AI agents collapse the design-to-production gap to near zero, forcing the next move to governance, before execution.
ANALYZE: DRIFT 42. The methodology is proven across five decades (high), but the newest instance, GDD, is nascent (v0.1.0, one documented live implementation). D5 origin (the definition of correct) pins at D6 (implementation) + D1 (behaviour), then D3 (domain/shared language), driven by D2 (cost of being wrong), terminating at D4 (governance). Honest framing: GDD is a candidate completion of the pattern, earned by implementation, not a coronation beside TDD/DDD.
DECIDE: FETCH 3,116 exceeds threshold 1,000. EXECUTE, HIGH PRIORITY. The lineage is primary-sourced (Beck, Evans, North, Dijkstra, the 1968 conference); the corrections (DDD precedes BDD; not a strict relay) are honoured. The proof is the production: the CAL publishing pipeline that produced this case is itself governed by GDD. After the UC-236 incident, an explicit verification gate was built, and the next four cases shipped with zero post-publish corrections. WATCH: whether GDD adoption spreads beyond the author's ecosystem.

04

Key Insights

One move, repeated for fifty years

Every shift pushed the definition of 'correct' one layer earlier, each time the cost of being wrong arrived sooner. TDD → DDD → BDD → GDD is not four ideas — it is one pattern, four instances. The history of software is a single migration upstream.[1][2]

It was never a clean relay

DDD (2003) predates and shaped BDD (2006); these were contemporaneous responses to one complexity pressure, not a succession. The honest lineage is a family, not a line — and saying so is what makes the pattern credible rather than tidy.[3][4]

Cited is not correct

The failure mode that defines the AI era: an agent can cite a source accurately and still be wrong; the audit passes; the artifact compiles. The surface check survives; the truth does not. Governance is the only layer early enough to catch what the surface check cannot.[6]

The proof is the production

GDD's claim is earned, not asserted: the pipeline that produced this very case is governed by it. After the UC-236 incident, an explicit verification gate was built; the next four cases shipped with zero post-publish corrections. A discipline describing itself, while running.[6]
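The “cited is not correct” failure mode can be sketched in a few lines. This is an illustration under stated assumptions, not the cal_verify_case gate itself: the Claim type, the SOURCE_FACTS table and both check functions are hypothetical, and the only facts used are the publication years given in this case (DDD 2003, BDD 2006). A surface check asks whether each claim points at a known source; a verification gate asks whether the source supports what the claim asserts.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Claim:
    text: str                # the sentence as drafted
    asserted_year: str       # the specific fact the sentence asserts
    citation: Optional[str]  # key of the source the draft points at


# Hypothetical reference data: what each source actually supports.
SOURCE_FACTS = {
    "evans-2003": "2003",
    "north-2006": "2006",
}


def surface_check(claims: List[Claim]) -> bool:
    """'Cited': passes when every claim points at a known source."""
    return all(c.citation in SOURCE_FACTS for c in claims)


def verification_gate(claims: List[Claim]) -> List[Claim]:
    """'Correct': returns the claims the cited source does not support."""
    return [c for c in claims if SOURCE_FACTS.get(c.citation) != c.asserted_year]


DRAFT = [
    Claim("DDD was published in 2003", "2003", "evans-2003"),
    Claim("BDD was introduced in 2003", "2003", "north-2006"),  # cited, but wrong
]

if __name__ == "__main__":
    print("surface check passes:", surface_check(DRAFT))
    blocked = verification_gate(DRAFT)
    print("gate blocks publish:", bool(blocked))
    for claim in blocked:
        print("  unsupported:", claim.text)
```

In this sketch the second claim carries a real citation and still asserts the wrong year, so the surface check passes while the gate holds the draft back. The design choice is that the gate compares the asserted fact against the source rather than checking that a source is present.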

Sources

Eight sources spanning the canonical software-engineering record — the 1968 NATO conference, Beck, Evans, North, Dijkstra, the Agile Manifesto — plus the GDD framework and its documented live implementation.

Tier 1 — Official & Structural Data
[5] Governance-Driven Design (GDD), Shatny, 2026. DOI 10.5281/zenodo.20938777; gdd.semanticintent.dev. A pre-execution discipline asking one question before any agent runs (“do we know what correct means, and can we prove it?”), operationalized through the ICR cycle (FORMALIZE → STRESS → CHECK → SURFACE → GATE → CONVERGE) producing a Governed Constraint Set. Positioned as the upstream completion of TDD/BDD/DDD, not a replacement.
gdd.semanticintent.dev · 2026
[6] GDD Example 4: AI-Driven Publishing Pipeline (gdd.semanticintent.dev/examples). The CAL case-study pipeline as a live GDD implementation: the UC-236 incident (three factual errors that passed every check; “cited” ≠ “correct”) produced the cal_verify_case gate (Step 4.5) and the permanent human gate (“does this thesis earn a DOI?”). Documented dividend: subsequent cases shipped end-to-end with zero post-publish corrections.
gdd.semanticintent.dev · Examp
Tier 1 — Primary Source
[3] Eric Evans, “Domain-Driven Design: Tackling Complexity in the Heart of Software” (Addison-Wesley, August 2003). Introduced ubiquitous language and bounded contexts: model the domain and agree the language before writing code. Predates BDD and directly influenced it.
martinfowler.com · DDD
[4] Dan North, “Introducing BDD” (Better Software magazine, March 2006). JBehave was begun in late 2003 as a replacement for JUnit that removed the word “test.” Influenced by DDD's ubiquitous language, the Given/When/Then template captures acceptance criteria in executable form.
dannorth.net · 2006
[8] The Agile Manifesto (2001). Formalized values of working software and short feedback loops over comprehensive up-front documentation, the cultural context in which TDD, DDD and BDD took hold.
agilemanifesto.org · 2001
Tier 2 — Scientific Reference
[1] Software crisis / NATO Software Engineering Conference (Garmisch, 1968). The term “software crisis” was coined at the conference attended by Dijkstra, Hoare and Wirth, confronting projects that were over budget, overdue and unreliable as systems outpaced the era's methods. Popularized further by Dijkstra's 1972 Turing Award lecture.
wikipedia.org · software crisi
[2] Kent Beck, Test-Driven Development. SUnit (1994) laid the test-first groundwork; “Extreme Programming Explained” (1999) built the culture; “Test-Driven Development: By Example” (Addison-Wesley, 2002–03) made TDD mainstream. Beck described his role as “rediscovering” a technique with ancient roots.
wikipedia.org · TDD
[7] Edsger Dijkstra, “Go To Statement Considered Harmful” (Communications of the ACM, 1968) and the case for structured programming as the disciplined response to unmanageable control flow. The first modern instance of moving correctness upstream.
wikipedia.org · structured pro

Every discipline that survived moved one question earlier: do we know what correct means?

AI agents don't wait for the answer. So you decide it before they run.