TL;DR

A recent Google whitepaper argues that the model accounts for only about 10% of an AI agent's behavior. The rest comes from harness design and context engineering, which is where performance and cost efficiency are won.

A new Google whitepaper, The New SDLC With Vibe Coding, asserts that the AI model accounts for only about 10% of what determines an agent's behavior, and that harness design and context engineering account for the rest. This challenges the common assumption that upgrading the model alone delivers the best results, and it marks a strategic shift in how AI systems get built.

The paper, authored by Addy Osmani, Shubham Saboo, and Sokratis Kartakis, states that most AI agent failures are configuration failures, such as a missing tool, a vague rule, or a noisy context, rather than failures of the model itself. It illustrates this with results showing that changing only the harness (prompts, tools, context policies) can dramatically improve performance with the same underlying model.

The whitepaper also advocates a disciplined approach it calls agentic engineering: formal specs, automated tests, evals, and context management, in contrast to the more casual vibe coding style. It frames the economics as a trade of upfront cost for lower marginal cost: investing in harness design and context engineering costs more at the start but makes each later feature cheaper than ad-hoc prompting does.

At a glance

When: published May 2026

The development: The new SDLC framework shifts focus from AI models to harnesses and context engineering, emphasizing that the model itself is only a small part of successful AI systems.

The Model Is Only 10% — The New SDLC With Vibe Coding

AI Dispatch · Field Notes

Google · Osmani, Saboo & Kartakis · May 2026

The model is only 10%

A Google whitepaper argues software’s biggest shift is from writing code to expressing intent. Its sharpest claim: the model you obsess over is the smallest part of the system — the scaffolding around it does the real work.

A spectrum, not a binary — the differentiator is how outputs get verified

Vibe Coding

Casual prompts · “does it seem to work?” · disposable code · high risk

Structured AI-Assisted

Detailed prompts + constraints · manual testing · features in real codebases

Agentic Engineering

Formal specs · automated tests + evals + CI gates · production scale · low risk

Tests verify the deterministic; evals verify the rest. Without both, it’s vibe coding — however clever the prompt.
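The split between tests and evals can be sketched in a few lines. This is a minimal illustration, not the paper's tooling: slugify, fake_model, and the 0.8 pass-rate threshold are all hypothetical, and the "model" is a seeded random stub standing in for a real, non-deterministic call.

```python
import random


def slugify(title: str) -> str:
    """Deterministic helper: the same input always yields the same output."""
    return "-".join(title.lower().split())


def fake_model(question: str, rng: random.Random) -> str:
    """Hypothetical stand-in for a non-deterministic model call."""
    # Answers correctly about 90% of the time to mimic sampling variance.
    return "4" if rng.random() < 0.9 else "5"


def run_test() -> bool:
    # A test asserts one exact, repeatable outcome.
    return slugify("The New SDLC") == "the-new-sdlc"


def run_eval(samples: int = 200, threshold: float = 0.8, seed: int = 0) -> tuple[float, bool]:
    # An eval scores many sampled outputs and gates on a pass rate.
    rng = random.Random(seed)
    passed = sum(fake_model("What is 2 + 2?", rng) == "4" for _ in range(samples))
    rate = passed / samples
    return rate, rate >= threshold
```

The test either passes or fails on a single exact value. The eval tolerates individual misses and only fails the gate when the pass rate drops below the chosen threshold, which is the kind of check a CI pipeline would run for model-dependent behavior.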

The idea worth building your strategy around

Agent = Model + Harness

Model: ~10%

Harness: ~90% (prompts · tools · context · hooks · sandboxes · observability). This is your surface area, not the provider’s.

Outside Top 30 → Top 5 on Terminal Bench 2.0 by changing only the harness — same model.

“Most agent failures, examined honestly, are configuration failures” — a missing tool, a vague rule, a noisy context.
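To make "Agent = Model + Harness" concrete, here is a toy sketch in which the model function never changes and only the harness does. Every name in it (Harness, stub_model, run_agent, calculator) is hypothetical and invented for illustration; it does not reflect any real agent framework or the paper's own setup.

```python
from dataclasses import dataclass, field
from typing import Callable

Tool = Callable[[str], str]


@dataclass
class Harness:
    """Everything around the model: rules, tools, and context policy."""
    system_prompt: str
    tools: dict[str, Tool] = field(default_factory=dict)
    max_context_chars: int = 2000


def stub_model(prompt: str, tool_names: list[str]) -> str:
    """Hypothetical fixed 'model': asks for a calculator if one is offered."""
    return "CALL calculator" if "calculator" in tool_names else "ANSWER unknown"


def run_agent(task: str, harness: Harness, model=stub_model) -> str:
    # The harness assembles and trims the context before the model sees it.
    prompt = f"{harness.system_prompt}\n{task}"[-harness.max_context_chars:]
    decision = model(prompt, list(harness.tools))
    if decision.startswith("CALL "):
        tool = harness.tools[decision.removeprefix("CALL ")]
        return tool(task)
    return decision.removeprefix("ANSWER ")


def calculator(task: str) -> str:
    # Toy tool: sums the integers that appear in the task text.
    return str(sum(int(tok) for tok in task.split() if tok.isdigit()))


bare = Harness(system_prompt="Be helpful.")
equipped = Harness(system_prompt="Use tools for arithmetic.",
                   tools={"calculator": calculator})
```

With the same stub_model, run_agent("add 19 and 23", bare) returns "unknown" while run_agent("add 19 and 23", equipped) returns "42". The failure in the first case is a missing tool, which is the kind of configuration failure the quote describes.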

The economics: it’s a token-cost problem (CapEx vs OpEx)

Vibe Coding

Low CapEx · High OpEx

Looks free, hides debt: token burn (fix-it loops), maintenance tax (AI spaghetti), security remediation. Crosses over to 3–10× more per feature.

Agentic Engineering

High CapEx · Low OpEx

Pay upfront (specs, evals, context), then ship cheaply. Levers: context engineering for first-pass success + intelligent model routing — cheap models for the easy work.
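The model-routing lever can be sketched as a small dispatch function. This is an assumption-heavy illustration: the tier names, prices, keyword list, and 8000-token cutoff are made up, and a production router would likely use richer signals than keywords.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelTier:
    name: str                  # hypothetical tier label, not a real product
    usd_per_1k_tokens: float   # made-up price for illustration


CHEAP = ModelTier("small-model", 0.001)
STRONG = ModelTier("large-model", 0.02)

# Hypothetical keywords that suggest a task needs the stronger tier.
HARD_HINTS = ("refactor", "architecture", "migrate", "security")


def route(task: str, context_tokens: int) -> ModelTier:
    """Send easy work to the cheap tier; escalate on size or hard keywords."""
    text = task.lower()
    if context_tokens > 8000 or any(hint in text for hint in HARD_HINTS):
        return STRONG
    return CHEAP


def estimated_cost(task: str, context_tokens: int) -> float:
    """Rough input-token cost in USD for the tier the router picks."""
    tier = route(task, context_tokens)
    return tier.usd_per_1k_tokens * context_tokens / 1000
```

Under these assumed prices, a rename task with 1,000 tokens of context goes to the cheap tier, while a migration task with the same context is escalated and costs twenty times as much per token. The point of the sketch is that the routing rule lives in the harness, so it can be tuned without changing any model.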

85% of devs use AI coding agents (51% daily)

41% of all new code is AI-generated

~90% of agent behavior is the harness, not the model

+19% longer on some tasks (METR); verification is the cost

The read

The clearest map yet of how serious AI development works — and mostly tool-agnostic. But it’s a Google funnel: the concepts are neutral, the on-ramps point to Gemini, Jules & the ADK. If the harness is 90% and it’s yours, your moat and your costs both live there — so own your scaffolding, route across models, and remember: AI amplifies whatever engineering culture it lands in.

Source: Osmani, Saboo & Kartakis, “The New SDLC With Vibe Coding,” Google (May 2026). Figures are the paper’s own, incl. METR & LangChain. Analysis is the author’s.

How Harness Design Transforms AI Effectiveness

This shift in focus from models to harnesses and context engineering has significant implications for AI strategy. Organizations can achieve better performance and cost savings by investing in configuration, tooling, and structured verification rather than solely chasing the latest model improvements. It also redefines the skill set needed for effective AI deployment, emphasizing system design over model selection.

Evolution of AI Development Strategies

Until now, the industry has often prioritized acquiring or upgrading large language models, assuming that better models translate directly into better results. The whitepaper challenges this by arguing that the surrounding system, the harness and context, has a far greater impact. One result cited in the paper: an agent moved from outside the Top 30 to the Top 5 on Terminal Bench 2.0 by changing only the harness, with the same model.

This represents a maturation in AI engineering, moving from vibe coding towards a disciplined, engineering-driven approach that emphasizes system architecture, tooling, and verification processes.

“The model is only 10% of what determines an AI system’s behavior; the harness and context are the other 90%.”
— Addy Osmani

Unresolved Questions About Implementation and Impact

While the whitepaper presents strong experimental evidence, it is not yet clear how broadly these findings will translate across different AI applications and industries. The exact cost-benefit dynamics of investing in harness design versus model upgrades remain to be fully quantified, and practical guidelines for organizations are still emerging.

Additionally, it is uncertain how quickly organizations will adopt this disciplined approach and how it will influence the competitive landscape of AI development.

Next Steps for AI Practitioners and Organizations

Organizations are likely to begin reevaluating their AI development strategies, emphasizing system configuration, tooling, and verification processes. Future research and case studies will clarify best practices for harness design and context engineering, potentially leading to new standards in AI engineering. Monitoring industry shifts and early adopters’ results will be key to understanding the full impact of this paradigm change.

Key Questions

Why is the model only 10% of AI system behavior?

The whitepaper argues that most of an AI system’s behavior is determined by how the model is integrated and controlled through harness design, context, and verification processes, not just the underlying model itself.

What is harness design in AI systems?

Harness design involves creating prompts, tools, rules, context policies, and observability mechanisms that shape and control how the AI model behaves within a system.

How does this shift affect AI development costs?

Investing in harness and context engineering can reduce long-term costs by improving efficiency, reducing errors, and lowering the need for frequent model upgrades.
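A back-of-the-envelope break-even calculation shows how an upfront investment can pay back. The dollar figures below are hypothetical and chosen only so that the ad-hoc per-feature cost is five times the engineered one, inside the 3 to 10 times range the paper cites; they are not measurements.

```python
import math
from typing import Optional


def cumulative_cost(upfront: float, per_feature: float, features: int) -> float:
    """Total spend after shipping `features` features."""
    return upfront + per_feature * features


def break_even_features(vibe_upfront: float, vibe_per_feature: float,
                        agentic_upfront: float, agentic_per_feature: float) -> Optional[int]:
    """Smallest feature count at which the agentic approach is strictly cheaper."""
    saving = vibe_per_feature - agentic_per_feature
    if saving <= 0:
        return None  # the upfront investment never pays back
    gap = agentic_upfront - vibe_upfront
    # Solve agentic_upfront + n * agentic < vibe_upfront + n * vibe for integer n.
    return max(0, math.floor(gap / saving) + 1)


# Hypothetical inputs: no upfront cost and $500 per feature for ad-hoc prompting,
# versus $6,000 upfront and $100 per feature for the engineered approach.
n = break_even_features(0, 500, 6000, 100)
```

With these assumed numbers the two approaches cost the same at 15 features ($7,500 each), and from the 16th feature onward the engineered approach is cheaper ($7,600 versus $8,000). If the per-feature saving is zero or negative, the function returns None, meaning the investment never recovers.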

Is this approach applicable to all AI applications?

While the whitepaper provides strong evidence, the applicability varies by use case. More research is needed to determine how broadly these principles can be implemented effectively across different domains.

What should organizations do now?

Organizations should reassess their AI development processes, focus on system configuration, tooling, and verification, and prepare to adopt more disciplined, engineering-driven approaches.

Source: ThorstenMeyerAI.com

The Model Is Only 10%: The Real Lesson of the New SDLC

Author

Cornford and Cross Team
