The Model Is Only 10%: The Real Lesson of the New SDLC

TL;DR

A recent Google whitepaper argues that the model accounts for only about 10% of an AI agent's behavior. The rest comes from harness design and context engineering, which is where performance and cost efficiency are won or lost.

A new Google whitepaper titled The New SDLC With Vibe Coding asserts that the AI model represents only about 10% of what determines system behavior, emphasizing the critical role of harness design and context engineering. This challenges common assumptions that upgrading models alone delivers the best results and highlights a strategic shift in AI development.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that most AI agent failures are configuration failures, such as a missing tool, a vague rule, or a noisy context, rather than model failures. It illustrates this with experiments showing that changing only the harness (prompts, tools, context policies) can dramatically improve performance with the same underlying model. One cited result moved an agent from outside the Top 30 to the Top 5 on Terminal Bench 2.0 without changing the model.
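
To make the "configuration failure" idea concrete, here is a minimal sketch, not taken from the whitepaper. The names `Harness`, `stub_model`, and `run_agent` are hypothetical, and the "model" is a stub that stands in for a real LLM call. The same model runs against two harnesses; only the one with the tool registered can complete the task.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict


def stub_model(system_prompt: str, task: str) -> str:
    """Hypothetical stand-in for an LLM: it 'decides' to call the tool
    named before the colon in the task, e.g. 'run_tests: ./pkg'."""
    return task.split(":", 1)[0]


@dataclass
class Harness:
    """Everything around the model: here just a prompt and a tool registry."""
    system_prompt: str
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)


def run_agent(model: Callable[[str, str], str], harness: Harness, task: str) -> str:
    tool_name = model(harness.system_prompt, task)
    tool = harness.tools.get(tool_name)
    if tool is None:
        # The model asked for a tool the harness never provided.
        return f"config failure: tool '{tool_name}' is not registered"
    argument = task.split(":", 1)[1].strip()
    return tool(argument)


# Two harnesses, one model.
bare = Harness(system_prompt="You are a coding agent.")
equipped = Harness(
    system_prompt="You are a coding agent. Run tests before answering.",
    tools={"run_tests": lambda path: f"ran tests in {path}: ok"},
)

task = "run_tests: ./pkg"
result_bare = run_agent(stub_model, bare, task)
result_equipped = run_agent(stub_model, equipped, task)
print(result_bare)
print(result_equipped)
```

The failure in the first run is not a model failure: the model made the same decision both times. Only the harness differed.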

Furthermore, the whitepaper advocates for a disciplined approach called agentic engineering, which involves structured verification, testing, and context management, contrasting with the more casual vibe coding style. It also emphasizes that the economics of AI development favor investing in harness design and context engineering, as these yield lower marginal costs over time compared to ad-hoc prompting.

At a glance
Report
When: published March 2026
The development: The new SDLC framework shifts focus from AI models to harnesses and context engineering, emphasizing that the model itself is only a small part of successful AI systems.
Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary: the differentiator is how outputs get verified

Vibe Coding: casual prompts · “does it seem to work?” · disposable code · high risk
Structured AI-Assisted: detailed prompts + constraints · manual testing · features in real codebases
Agentic Engineering: formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding, however clever the prompt.
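
The distinction between tests and evals can be sketched as a single CI gate. This is an illustrative assumption, not the whitepaper's implementation: `slugify`, the sample outputs, the grader, and the 0.75 threshold are all hypothetical. The test makes an exact assertion on deterministic code; the eval scores a batch of model-style outputs and reports a pass rate.

```python
from typing import Callable, List


def slugify(title: str) -> str:
    """Deterministic code under test."""
    return "-".join(title.lower().split())


def test_slugify() -> bool:
    # A test: an exact, repeatable assertion on deterministic behavior.
    return slugify("The New SDLC") == "the-new-sdlc"


def eval_pass_rate(outputs: List[str], grader: Callable[[str], bool]) -> float:
    # An eval: grade many non-deterministic outputs and report the pass rate.
    if not outputs:
        return 0.0
    return sum(grader(o) for o in outputs) / len(outputs)


def ci_gate(tests_ok: bool, pass_rate: float, threshold: float = 0.75) -> bool:
    # The gate opens only when the tests pass AND the eval clears the threshold.
    return tests_ok and pass_rate >= threshold


# Hand-written sample outputs standing in for sampled agent responses.
samples = [
    "Summary: the harness matters more than the model.",
    "Summary: invest in context engineering.",
    "I cannot help with that.",
    "Summary: route easy work to cheap models.",
    "Summary: verify with tests and evals.",
]
rate = eval_pass_rate(samples, lambda s: s.startswith("Summary:"))
gate_open = ci_gate(test_slugify(), rate)
print(rate, gate_open)
```

A real grader would be richer than a prefix check (a rubric, a reference comparison, or a second model), but the gate logic stays the same.
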
The idea worth building your strategy around

Agent = Model + Harness
Model: ~10%
Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability)
The ~90% is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.
“Most agent failures, examined honestly, are configuration failures”: a missing tool, a vague rule, a noisy context.
The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding: low CapEx · high OpEx
Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering: high CapEx · low OpEx
Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing (cheap models for the easy work).
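
Model routing can be as simple as a rule that escalates only when a task looks hard. The sketch below is a naive heuristic under stated assumptions: the tier names, per-token prices, keyword list, and 8,000-token cutoff are all hypothetical and not drawn from the whitepaper or any provider's price list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str
    usd_per_1k_tokens: float  # hypothetical price, for illustration only


CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.02)

# Hypothetical signals that a task needs the stronger model.
HARD_KEYWORDS = ("refactor", "architecture", "migrate", "debug")
MAX_CHEAP_CONTEXT_TOKENS = 8000


def route(task: str, context_tokens: int) -> ModelTier:
    """Send easy work to the cheap tier; escalate on hard keywords or large context."""
    text = task.lower()
    if context_tokens > MAX_CHEAP_CONTEXT_TOKENS or any(k in text for k in HARD_KEYWORDS):
        return STRONG
    return CHEAP


def estimated_cost(task: str, context_tokens: int) -> float:
    """Rough input-side cost estimate for one call under the routing rule."""
    tier = route(task, context_tokens)
    return tier.usd_per_1k_tokens * context_tokens / 1000


for task, tokens in [
    ("rename a variable", 500),
    ("refactor the billing module", 500),
    ("rename a variable", 20000),
]:
    print(task, tokens, route(task, tokens).name)
```

In practice the routing signal would more likely come from a classifier or from past eval results than from keywords, but the cost lever is the same: the expensive model is reserved for the work that needs it.
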
85% of devs use AI coding agents (51% daily)
41% of all new code is AI-generated
~90% of agent behavior is the harness, not the model
+19% longer on some tasks (METR): verification is the cost
The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

How Harness Design Transforms AI Effectiveness

This shift in focus from models to harnesses and context engineering has significant implications for AI strategy. Organizations can achieve better performance and cost savings by investing in configuration, tooling, and structured verification rather than solely chasing the latest model improvements. It also redefines the skill set needed for effective AI deployment, emphasizing system design over model selection.

Evolution of AI Development Strategies

Prior to this, the industry often prioritized acquiring or upgrading large language models, assuming that better models translated directly into better results. The whitepaper challenges this by arguing that the surrounding system, the harness and its context, has a far greater impact. Experiments cited in the paper show that harness changes alone can move an agent into the top tier on a benchmark while the model stays the same.

This represents a maturation in AI engineering, moving from vibe coding towards a disciplined, engineering-driven approach that emphasizes system architecture, tooling, and verification processes.

“The model is only 10% of what determines an AI system’s behavior; the harness and context are the other 90%.”

— Addy Osmani

Unresolved Questions About Implementation and Impact

While the whitepaper presents strong experimental evidence, it is not yet clear how broadly these findings will translate across different AI applications and industries. The exact cost-benefit dynamics of investing in harness design versus model upgrades remain to be fully quantified, and practical guidelines for organizations are still emerging.

Additionally, it is uncertain how quickly organizations will adopt this disciplined approach and how it will influence the competitive landscape of AI development.

Next Steps for AI Practitioners and Organizations

Organizations are likely to begin reevaluating their AI development strategies, emphasizing system configuration, tooling, and verification processes. Future research and case studies will clarify best practices for harness design and context engineering, potentially leading to new standards in AI engineering. Monitoring industry shifts and early adopters’ results will be key to understanding the full impact of this paradigm change.

Key Questions

Why is the model only 10% of AI system behavior?

The whitepaper argues that most of an AI system’s behavior is determined by how the model is integrated and controlled through harness design, context, and verification processes, not just the underlying model itself.

What is harness design in AI systems?

Harness design involves creating prompts, tools, rules, context policies, and observability mechanisms that shape and control how the AI model behaves within a system.

How does this shift affect AI development costs?

Investing in harness and context engineering can reduce long-term costs by improving efficiency, reducing errors, and lowering the need for frequent model upgrades.
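
The trade-off can be written as a break-even calculation: total cost is upfront cost plus per-feature cost times the number of features. The figures below are hypothetical and chosen only to illustrate the arithmetic. The 3× per-feature multiplier for the ad-hoc approach is the low end of the 3–10× range the paper cites; the upfront and baseline per-feature costs are assumptions.

```python
import math
from typing import Optional


def total_cost(upfront: float, per_feature: float, features: int) -> float:
    """Linear cost model: one-time investment plus a fixed cost per feature."""
    return upfront + per_feature * features


def break_even_features(
    upfront_a: float, per_feature_a: float,
    upfront_b: float, per_feature_b: float,
) -> Optional[int]:
    """Smallest whole number of features at which approach B costs no more
    than approach A, or None if B never catches up."""
    if per_feature_b >= per_feature_a:
        return None
    gap = upfront_b - upfront_a
    return max(0, math.ceil(gap / (per_feature_a - per_feature_b)))


# Hypothetical figures, in arbitrary cost units.
VIBE_UPFRONT, VIBE_PER_FEATURE = 0, 300          # no upfront work, 3x per feature
AGENTIC_UPFRONT, AGENTIC_PER_FEATURE = 2000, 100  # specs, evals, context first

n = break_even_features(
    VIBE_UPFRONT, VIBE_PER_FEATURE, AGENTIC_UPFRONT, AGENTIC_PER_FEATURE
)
print(n)
print(total_cost(VIBE_UPFRONT, VIBE_PER_FEATURE, n))
print(total_cost(AGENTIC_UPFRONT, AGENTIC_PER_FEATURE, n))
```

Under these assumed numbers the two approaches cost the same at ten features, and the disciplined approach is cheaper for every feature after that. A higher per-feature multiplier moves the break-even point earlier.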

Is this approach applicable to all AI applications?

While the whitepaper provides strong evidence, the applicability varies by use case. More research is needed to determine how broadly these principles can be implemented effectively across different domains.

What should organizations do now?

Organizations should reassess their AI development processes, focus on system configuration, tooling, and verification, and prepare to adopt more disciplined, engineering-driven approaches.

Source: ThorstenMeyerAI.com
