
Why I Stopped Writing Prompts and Started Writing Specs (with OpenSpec)

A year of writing increasingly clever prompts ended with the same problem on every feature. I could not reliably reproduce what the AI built yesterday. Specs solved it. Prompts didn't.

I had a folder called prompts/ with 47 files in it. A couple of them had names like "v3-final-actually-final," which is exactly as embarrassing as it sounds. None of them survived a model upgrade. The day GPT-4o-mini changed its default behaviour was the day I admitted that prompts are not source code. They're vibes I keep talking myself into trusting.

What changed in my head

A prompt describes how to ask. A spec describes what the answer must look like. Those are different artifacts. Prompts decay every time the model changes. Specs don't. They're contracts on the output, model-independent.

Once I started writing specs first, the prompts I still write got shorter. The spec carries the contract. The prompt just orients the model. Most of mine are now three sentences. Almost all of my specs are 40 to 100 lines of structured YAML.

What the workflow actually looks like

# specs/features/classify_ticket.openspec.yaml
feature: classify_ticket
input:
  subject: string
  body: string
output:
  category: enum[billing, technical, account, spam, other]
  confidence: float  # 0..1
  reasoning: string  # 1-2 sentences, internal-only
invariants:
  - confidence in [0, 1]
  - "spam" only when body contains promotional or unsolicited content
  - reasoning never exposes internal user IDs
acceptance_examples:
  - input:
      subject: "Where's my refund?"
      body: "I was charged twice last Tuesday..."
    expect:
      - output.category == "billing"
      - output.confidence > 0.8

// The prompt is now this small:
const prompt = `
Classify the support ticket below according to the contract in ${specPath}.
Subject: ${subject}
Body: ${body}
`;
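The "conforms" check can live in ordinary code. Here is a minimal sketch of a validator that mirrors the spec's invariants; the names (`validateTicketOutput`, the `user_123`-style ID pattern) are illustrative assumptions, not part of OpenSpec itself:

```typescript
// Hypothetical validator enforcing the classify_ticket contract above.
// Each check mirrors one line of the spec's output schema or invariants.
type TicketOutput = {
  category: "billing" | "technical" | "account" | "spam" | "other";
  confidence: number; // 0..1
  reasoning: string;  // 1-2 sentences, internal-only
};

const CATEGORIES = new Set(["billing", "technical", "account", "spam", "other"]);

function validateTicketOutput(raw: unknown): string[] {
  const errors: string[] = [];
  const o = raw as Partial<TicketOutput>;
  if (typeof o?.category !== "string" || !CATEGORIES.has(o.category)) {
    errors.push("category must be one of the spec's enum values");
  }
  if (typeof o?.confidence !== "number" || o.confidence < 0 || o.confidence > 1) {
    errors.push("confidence must be a float in [0, 1]");
  }
  // Assumed internal ID format for illustration: user_<digits>.
  if (typeof o?.reasoning !== "string" || /\buser_\d+\b/.test(o.reasoning)) {
    errors.push("reasoning missing or leaks an internal user ID");
  }
  return errors; // empty array means the output conforms
}
```

An empty return means the output conforms; a non-empty one is exactly the feedback you feed into a retry.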

Why this works

The spec encodes intent in a form the model, the linter, and the human reviewer all agree on. The prompt drops to a pointer. "Follow the contract at this path." Model upgrades stop requiring prompt rewrites because the model isn't leaning on cleverness. It's leaning on the contract. The contract lives in version control. The model doesn't get to vote on it.

When to switch

Anything you've prompted more than twice and gotten meaningfully different results. Anything where downstream code depends on the structure of the output. Anything you have to explain to a colleague the third time. The pattern is "I keep tweaking this prompt" → "write a spec for it."

When prompts are still right

Throwaway tasks where writing a spec costs more than just trying. Brainstorming and exploration where there is no contract yet. You are finding the contract there, not enforcing one. Creative work where there is no objective shape to the output. Prompts haven't died. Their job has just gotten smaller.

How the artifact shifts

flowchart LR
    subgraph Before
        P1[prompt-v3-final.txt] --> M1[Model]
        M1 --> O1[Output]
    end
    subgraph After
        S[(OpenSpec)] --> P2["Tiny prompt:<br/>'follow the spec'"]
        P2 --> M2[Model]
        S --> V[Validator]
        M2 --> O2[Output]
        O2 --> V
        V -- conforms --> Done[Use output]
        V -- fails --> Retry[Re-run with feedback]
    end
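The fails → re-run edge in the diagram is a small loop in practice. A sketch, assuming a `callModel` function that accepts the previous attempt's violations as feedback and a `validate` function that returns a list of violations (both names are placeholders for your own client and validator):

```typescript
// Hypothetical conforms/fails loop from the "After" diagram.
// validate returns [] when the output conforms to the spec.
type Validator<T> = (output: T) => string[];

async function runWithSpec<T>(
  callModel: (feedback: string[]) => Promise<T>,
  validate: Validator<T>,
  maxAttempts = 3,
): Promise<T> {
  let feedback: string[] = [];
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const output = await callModel(feedback);
    feedback = validate(output);
    if (feedback.length === 0) return output; // conforms: use the output
    // fails: loop again, feeding the violations back to the model
  }
  throw new Error(`Output never conformed after ${maxAttempts} attempts: ${feedback.join("; ")}`);
}
```

The feedback is the validator's own error list, so the retry prompt stays as mechanical as the first one: contract plus violations, no cleverness.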

Conclusion

Open the prompt file you've tweaked the most this month. Spend 30 minutes writing a spec for what it's supposed to produce. You won't go back. And the prompt will shrink to three lines on its own, which is the part that finally got me to commit.