Welcome back to our second edition of Rxn. Benchling dropped their 2026 Biotech AI Report just ahead of JPM week. 100 biotech and pharma organizations, independent research firm, real adoption data. I highly recommend you read the full report; here’s what stood out to me.

The LinkedIn echo chamber is doing its thing. Sajith (CEO) posted. Ashu (Co-founder) posted. Christopher Li from BioBox called it a must-read. The same talking points are everywhere: "Killer apps are here." "Data is the bottleneck." "Real progress, real friction."

The headlines from the report:

  • 76% adoption for literature review

  • 71% for protein structure prediction

  • 66% for scientific reporting

  • 55% of pilots fail due to data quality

  • 50% already report faster time-to-target

  • 80% plan to increase AI budgets

They're not wrong. The methodology is solid. Independent research firm, 100 organizations already using AI, split between large pharma and smaller biotech. This isn't a sentiment survey. It's a snapshot of the companies ahead of the curve.

But scroll through the commentary and you'll notice something. Everyone is watching the same scoreboard. Waiting for the same headline: "First AI-designed drug enters clinic."

That's the wrong metric. And the Benchling report, read carefully, tells you why.

The question changed.

In 2021, the industry asked: "Can AI design a drug?"
In 2026, with multiple AI-generated molecules in Phase 2 and 3 trials, the question has shifted: "Can AI reduce how many drugs fail?"

That's a different scoreboard. And it changes what you should pay attention to.

The "first AI-designed drug" framing treats AI success as a single breakthrough event. A molecule enters the clinic. Headlines are written. The technology is validated. But drug development doesn't work like that.

"Those waiting for the 'first AI-designed drug' as proof that AI works are focused on the wrong signal."

— Sajith Wickramasekara, CEO, Benchling

A drug takes 10 to 12 years to reach market. Along the way, thousands of decisions get made. Which targets to pursue. Which compounds to synthesize. Which experiments to run. Which candidates to advance. Each decision narrows your options. Pick a bad target and you've wasted years. Miss a toxicity signal early and it kills you in Phase 2.

The Benchling data shows AI is already improving these decisions:

  • 50% report faster time-to-target

  • 42% see better hit rates and prediction accuracy

  • 56% expect meaningful cost reductions within two years

These aren't future promises. They're current operational gains. But they don't make headlines because "20% improvement in target prioritization" isn't a story anyone shares.

Here's the math that matters. If AI improves target selection quality by 20%, and you make dozens of target decisions across a pipeline's lifetime, that compounds dramatically. The value is accruing now, invisibly, in the operational layer of R&D.
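To make the compounding concrete, here's a back-of-the-envelope sketch. The per-decision probabilities are my own illustrative assumptions, not figures from the report:

```python
# Back-of-the-envelope: compounding a 20% relative improvement in
# per-decision quality across a chain of pipeline decisions.
# Both probabilities below are illustrative assumptions.
baseline_p = 0.50               # chance a single decision (e.g. a target call) is right
improved_p = baseline_p * 1.20  # 20% relative improvement -> 0.60

for n_decisions in (5, 10, 20):
    base_chain = baseline_p ** n_decisions      # whole chain holds up, baseline
    improved_chain = improved_p ** n_decisions  # whole chain holds up, with AI
    print(f"{n_decisions} decisions: {improved_chain / base_chain:.1f}x "
          f"more likely the whole chain holds up")
```

A 20% edge per decision looks small; raised to the power of dozens of decisions, it isn't.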

The report confirms this. The "killer apps" that broke out of pilot are literature review, structure prediction, and scientific reporting. Not glamorous. Cognitive infrastructure.

Meanwhile, the sexy stuff is stuck. Generative drug design sits at 42% adoption with a big "piloting" population. De novo molecule creation, same story. Not because the models are bad. Because the data environment can't support them.

The real metric isn't "who files the first AI-designed IND." It's "who has the data-to-decision loop that gets better every quarter."

The proof is in the trials.

Two clinical stories from 2025. One win. One loss. Both instructive.

Insilico Medicine announced positive Phase 2a results for ISM001-055, a drug for idiopathic pulmonary fibrosis. That's a lung scarring disease with limited treatment options.

What makes this notable: AI did both jobs. Their PandaOmics platform identified TNIK as a novel anti-fibrotic target. Their Chemistry42 platform designed the molecule. End-to-end generative AI, from target to drug.

The data: Patients on the 60mg dose improved lung function (forced vital capacity) by 98.4 mL. Patients on placebo declined by 20.3 mL. A clinically meaningful reversal of disease progression.

The timeline: Target discovery to preclinical candidate in roughly 18 months. Industry average is 4 to 5 years. That's 50 to 70% compression of early-stage development.

This is the first real clinical validation that generative AI can produce a molecule that engages a target in humans. Not a cell model. Not a mouse. Humans.

Then there's Recursion Pharmaceuticals.

Recursion built its platform on phenomics, using AI to analyze millions of cellular images to find drug candidates. Their lead asset, REC-994, targeted Cerebral Cavernous Malformation, a rare brain disease.

The AI worked. It identified a molecule that rescued the diseased phenotype in cellular models. The cells looked healthier.

The trial failed. In the Phase 2 SYCAMORE study, the drug was safe but showed no improvement in brain lesions or functional outcomes compared to natural disease progression. Program discontinued May 2025.

The lesson: AI solved the cell model. It didn't solve the disease. The cellular phenotype wasn't a reliable proxy for human pathology.

Broader data confirms this pattern. AI-designed drugs show higher Phase 1 success rates, around 80 to 90% versus the historical 60%. That's because ADMET optimization (absorption, distribution, metabolism, excretion, toxicity) is a chemistry problem AI handles well.

But Phase 2 efficacy still fails at roughly 40%. That's the industry average. Phase 2 is where you find out if you picked the right target. That's a biology problem. AI hasn't solved it yet.

Phase 2 failure rate for AI-designed drugs: ~40%. Same as the industry average.
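To see why Phase 2 is the bottleneck, here's a rough sketch using the approximate rates above. The Phase 3 rate is my own illustrative assumption, not a figure from the report:

```python
# Rough overall clinical success under the approximate phase rates quoted above.
p1_historical = 0.60  # historical Phase 1 success
p1_ai         = 0.85  # AI-designed drugs: roughly 80-90%
p2            = 0.60  # Phase 2: ~40% failure, AI or not
p3            = 0.55  # Phase 3: illustrative assumption only

historical = p1_historical * p2 * p3
ai_designed = p1_ai * p2 * p3
print(f"historical: {historical:.1%}, AI-designed: {ai_designed:.1%}")
# Better Phase 1 chemistry lifts the overall odds, but the gain is capped
# by the Phase 2 biology bottleneck, which AI hasn't moved.
```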

The scoreboard everyone watches obscures what's actually happening. AI is compressing timelines and improving chemistry. It's not yet cracking the target validation problem that causes most drugs to fail.

The real moat isn't the model.

Everyone has access to the same algorithms now. AlphaFold is open source. GPT-4 and Claude are API calls. Diffusion models for molecule generation are published. The algorithm isn't the differentiator.

The Benchling report shows what actually separates leaders from laggards: data infrastructure.

The numbers are stark:

  • High AI adopters are 2x more likely to have integrated data infrastructure

  • 55% cite data quality as the top reason AI pilots fail

  • 50% cite IP, security, and compliance friction

Companies with fully integrated data infrastructure: 6%. Everyone else is playing catch-up.

This is the "Clean Data Ceiling." Models trained on public datasets perform beautifully. Protein databases, published chemical libraries, all the curated stuff. Fine-tune them on internal company data and performance collapses.

Why? The internal data is fragmented across dozens of systems. Critical metadata is missing. Assay conditions weren't recorded in machine-readable formats: temperature, buffer, protein construct, incubation time. If that context isn't there, the data is useless to an AI model.
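As an illustration of what "machine-readable" means here, a structured assay record might look like the sketch below. The field names and values are hypothetical, not any vendor's schema:

```python
from dataclasses import dataclass, asdict
import json

# Hypothetical structured assay record (illustrative fields, not a real schema).
# The point: capture the experimental context an AI model needs -- conditions,
# construct, units -- at the time the experiment is run, not years later.
@dataclass
class AssayRecord:
    assay_id: str
    target: str
    protein_construct: str
    buffer: str
    temperature_c: float
    incubation_min: int
    readout: str
    value: float
    units: str

rec = AssayRecord(
    assay_id="BND-0042",
    target="TNIK",
    protein_construct="TNIK kinase domain, His-tagged",
    buffer="50 mM HEPES pH 7.4, 150 mM NaCl",
    temperature_c=25.0,
    incubation_min=60,
    readout="IC50",
    value=12.5,
    units="nM",
)
print(json.dumps(asdict(rec), indent=2))
```

A free-text notebook entry that says "ran the binding assay, looked good" carries none of this, and no amount of retroactive parsing reliably recovers it.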

You can't fix this retroactively. Historical experiments without proper context are essentially lost. Companies are facing a choice: expensive "data archaeology" or discarding years of experiments and starting fresh.

The companies capturing clean experimental data today are building an appreciating asset. Every time foundation models improve, that proprietary data becomes more valuable. The gap between data-rich and data-poor organizations widens with each model generation.

The investment community has noticed. The "platform premium" that drove TechBio valuations in 2020 and 2021 is dead. After high-profile clinical failures from BenevolentAI (Phase 2a miss in eczema, 30% layoffs) and Recursion, investors now demand "product-proven platforms."

Marketing claims about generative capabilities get discounted. The only currency is clinical data. Molecules that work in humans. But the leading indicator of who will generate that clinical data isn't model sophistication. It's data infrastructure.

Xaira Therapeutics raised over $1 billion in 2024. The thesis wasn't "better algorithms." It was the combination of world-class data engineering with clinical development capability. That's the new bar.

What to watch instead.

If you're evaluating a biotech's AI claims, whether as an investor, a partner, or a competitor, the Benchling report suggests a different set of questions.

Stop asking: "Who will produce the first AI-designed drug?"

Start asking:

  • How is their experimental data captured and connected?

  • Can their computational team access wet lab results in real time?

  • What percentage of their historical data is machine-readable?

  • Do they have the wet-dry lab integration that creates tight feedback loops?

The talent signal matters too. 67% of biotech companies are growing AI capability through internal upskilling, not by hiring ML engineers from tech companies. The scarce resource isn't algorithm expertise. It's scientific translators who understand both the biology and the computation.

"The race isn't to the best model; it's to the best data-to-decision loop."

— Yves Fomekong Nanfack, Takeda

The regulatory excuse is gone. FDA's January 2025 guidance established a clear framework for AI in drug development. High-risk applications require rigorous validation. Low-risk operational uses get lighter scrutiny. The rules exist. The bottleneck is infrastructure, not uncertainty.

The divergence is already happening. High AI adopters have better data integration, tighter feedback loops, faster iteration cycles.

The first AI-designed drug will make headlines when it happens. But by then, the companies with clean data and tight feedback loops will already be years ahead. That's the real story in this report.

— Chris

If this was useful, forward it to someone building in biotech. More stories like this every week at Thinking Folds.
