Why 95 % of AI Projects Fail — and What Data Quality Has to Do with It
MIT 2025: 95 % of GenAI pilots deliver no measurable P&L impact. The most common reason isn't the technology — it's the data behind it.

The Phone Call
Last week a CFO from a mid-sized German engineering firm called me. Mechanical engineering, around 700 employees, healthy margin, steady growth. His company had bought AI licenses for roughly half the workforce seven months earlier, with a clean rollout plan. Workshop, champions program, monthly newsletter, the works.
His question on the call: "Can you tell us where the return went?"
I knew the honest answer before he finished his story. In about nineteen out of twenty comparable projects it comes down to the same point, and this isn't a gut feeling: MIT put numbers on it last summer.
The Number That Can Ruin Your Day
In summer 2025, MIT (specifically the NANDA Initiative) published a study titled "The GenAI Divide: State of AI in Business 2025." It analyzed 300 documented AI projects, 52 board-level interviews, and 153 leadership surveys. This isn't a glossy marketing white paper. It's a fairly sober inventory.
The central finding: across 95 % of generative AI pilots, no measurable contribution to the P&L was detectable. Ninety-five. And "measurable" here doesn't mean the return the IT lead estimated, or the impression from one-on-ones with employees. It means something that shows up in the controlling report at quarter-end. In those pilots, nothing did.
Before anyone says "so AI is overhyped": it isn't. The remaining 5 % show that it works. The honest question is why so few companies get there.
Picture an Intern
Picture an intern who has read every document in your company. Every contract. Every complaint. Every project report from the last five years. You could ask them:
"Which customer segment complained the most last quarter, and is it connected to our supplier switch in February?"
Answer in 30 seconds, with source citations. That's essentially what we mean when we say "Corporate LLM."
The difference from ChatGPT: ChatGPT knows a great deal about the open internet, but nothing about you. A Corporate LLM knows your company. With access rights that respect your organization, so finance sees finance and sales sees sales. For many companies it's the next step after a broad Copilot rollout, which speeds up meetings and writing but doesn't unlock the actual internal knowledge.
Why the Intern Still Fails
An intern with memory can only be as smart as the files they get. And in most companies, the files look like this:
Your contract documentation exists in five versions, spread across four SharePoint folders. There are ten comments in the margin of a Word document from 2022. The latest Excel sits on the laptop of a unit head who's currently on vacation. And in one of those folders is a PDF version nobody maintains anymore, but everyone pulls it up when the contract is briefly needed.
In a setup like that, even the best intern gives contradictory answers. That's the most common reason behind the 95 % failure rate in the MIT study. Not the model. What it gets to see.
In practice, three things are usually missing.
First, a clean data state. The AI needs to know which version is current and which is long obsolete. Sounds trivial. Isn't.
Second, permissions. An AI that bypasses your organization's access rights rather than respecting them creates compliance risk that your data protection officer and the EU AI Act don't appreciate. More on that in our EU AI Act roadmap.
Third, what we call "structured ingestion." Concretely: a scanned invoice as a JPG, a contract with handwritten notes, a project report in an old Word file. All of that has to be properly prepared before AI can work with it productively. In nearly every project, the effort for this gets underestimated. Sometimes by a factor of three.
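To make the first two points less abstract, here is a minimal sketch of what "clean data state" and "permissions" can look like before anything reaches the model. It is an illustration, not how any particular product does it. The Doc record, the field names, and the sample paths are all hypothetical, and it assumes the most recently modified copy is the current one, which in real projects often needs an explicit status field instead.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Doc:
    doc_id: str        # logical document, e.g. one contract (hypothetical field)
    path: str          # where this particular copy lives
    modified: date     # last change to this copy
    groups: frozenset  # groups allowed to read this copy


def current_versions(docs):
    """Keep only the most recently modified copy of each logical document.

    Assumption: "latest modified" means "current". Real archives may need
    an explicit status or owner sign-off instead.
    """
    latest = {}
    for d in docs:
        if d.doc_id not in latest or d.modified > latest[d.doc_id].modified:
            latest[d.doc_id] = d
    return list(latest.values())


def visible_to(docs, user_groups):
    """Drop every document the user's groups are not allowed to read."""
    return [d for d in docs if d.groups & set(user_groups)]


# Made-up sample data: two copies of one contract plus a finance report.
docs = [
    Doc("contract-acme", "sharepoint/legal/acme_v3.docx",
        date(2022, 5, 1), frozenset({"legal"})),
    Doc("contract-acme", "sharepoint/sales/acme_final.pdf",
        date(2021, 11, 3), frozenset({"legal", "sales"})),
    Doc("margin-report", "finance/margins.xlsx",
        date(2025, 1, 15), frozenset({"finance"})),
]

index = current_versions(docs)               # one copy per logical document
sales_view = visible_to(index, {"sales"})    # what a sales user may query
finance_view = visible_to(index, {"finance"})
```

Note the order: versions are resolved first, permissions second. In this sample the sales user sees nothing for the contract, because the only copy sales could read is the outdated PDF. That is the behavior you want, and it is exactly the kind of decision that has to be made deliberately.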
Where Cross-Department Access Suddenly Becomes Worth Money
Three examples from actual client engagements, lightly anonymized.
Sales meets service. A major customer in healthcare asks for a proposal. Instead of three days of research, the proposal team pulls together in two minutes what would otherwise be scattered across departments: every past order, open support tickets, contract terms, current margin. The sales lead signs the offer with a clearer picture and a more realistic floor price.
Legal meets procurement. Supplier negotiation. The AI knows instantly which clauses we've already negotiated in comparable contracts, which became problematic later, what management rejected the last time around. Negotiations that used to take two weeks now take two days. Realistic, not from the sales deck.
Leadership meets data. Instead of a PDF report from controlling, the board asks the question directly: "Which division contributed most to EBIT improvement last half-year, and why?" The answer arrives with sources, in 20 seconds, mid-meeting. We've noticed a side effect that surprised us: boards start asking more concrete questions. Because they know concrete answers will come back.
Watch out: deploying cross-department AI without clean access rights creates exactly the compliance risk from Shadow AI it was supposed to eliminate.
Build or Buy?
The MIT study is more direct here than most expect. In-house builds fail twice as often as purchased solutions: 67 % success rate with specialized partnerships, 33 % with purely internal builds.
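The "twice as often" follows directly from the two success rates quoted above. A quick check of the arithmetic, using only those two figures:

```python
# Success rates as quoted above, in percent.
success_partner = 67
success_internal = 33

# Failure is simply the complement of success.
failure_partner = 100 - success_partner    # 33
failure_internal = 100 - success_internal  # 67

# How much more often do internal builds fail?
failure_ratio = failure_internal / failure_partner
print(round(failure_ratio, 2))  # prints 2.03
```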
This isn't ideology and it isn't a sales pitch from service providers (myself included). It's a question of specialization. Data integration, model selection, security, ongoing maintenance. None of that is a side project for the existing IT team, and in the first quarter it's routinely talked down. We hear the same sentence in every other initial conversation: "We wanted to build it ourselves. Now we're looking for a partner."
It's a pattern.
What the C-Suite Actually Does Differently
We see the companies that end up in the 5 % doing three things differently.
First: starting with a clear business problem, not a technology. "We want to accelerate proposals by 40 %" is a good starting point. "We want a Corporate LLM" isn't a starting point. It's a wish.
Second: checking the data foundation before picking the model. Which sources need to be connected, how current are they, who's allowed to see what. This is the most boring phase in the project. It's also the most important.
Third, if you're setting something like this up for the first time: find a partner who has walked this path ten times before. The first six months are expensive in learning curves, not in license fees.
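The data foundation check from the second point can start as something very small. The sketch below is one hypothetical way to do it: list the sources you plan to connect and flag the obvious gaps. The field names, the 90-day freshness threshold, and the sample entries are assumptions for illustration, not a standard.

```python
from datetime import date


def readiness_gaps(source, today, max_age_days=90):
    """Return the reasons a source is not yet ready to be connected.

    The three checks mirror the questions above: how current is it,
    who is responsible, and who is allowed to see what.
    """
    gaps = []
    if (today - source["last_updated"]).days > max_age_days:
        gaps.append("stale")
    if not source["owner"]:
        gaps.append("no owner")
    if not source["access_groups"]:
        gaps.append("no access rules")
    return gaps


# Made-up inventory of two candidate sources.
sources = [
    {"name": "CRM", "last_updated": date(2025, 6, 1),
     "owner": "sales-ops", "access_groups": ["sales"]},
    {"name": "proposal library", "last_updated": date(2024, 1, 10),
     "owner": "", "access_groups": ["sales"]},
]

today = date(2025, 6, 30)
report = {s["name"]: readiness_gaps(s, today) for s in sources}
```

An empty list means the source passes these three checks. Anything else is a conversation to have before the model is chosen, not after.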
Get Started
The return on a Corporate LLM rarely sits with the model. It sits with the question of whether your internal data is AI-ready. That question can be answered cleanly in a two-day workshop before the first project begins. If that's on your plate right now, get in touch. We do this often.
Frequently Asked Questions
What's the difference between a Corporate LLM and ChatGPT Enterprise?
It's not a different AI model. It's a controlled connection between an AI model and your organization's knowledge. ChatGPT Enterprise knows a great deal about the open internet but nothing about your supplier contracts, complaint history, or project status. A Corporate LLM makes exactly that internal knowledge queryable, in a way that finance sees only finance data and sales sees only sales data.
What data do I need to prepare before starting?
Less than most people think. But cleaner than most have. In pilot mode, two or three clearly scoped knowledge sources are usually enough, for example the CRM, the proposal library, and two years of project documentation. What matters more than volume is currency, clean permissions, and a sharp separation between current and outdated versions.
Should we build or buy?
The MIT 2025 study is unambiguous: in-house builds fail twice as often as purchased solutions. 67 % success rate with specialized partnerships, 33 % with purely internal builds. Data integration, security, and ongoing maintenance are heavier work than they appear in a pilot. Without a dedicated AI team in-house, a partner gets you to ROI faster.



