Agents for Drug Discovery: From Target to Molecule

The pharmaceutical industry faces a productivity crisis. Developing a new drug costs an average of $2.3 billion and takes 10-15 years, with over 90% of candidates failing in clinical trials. Traditional drug discovery is a sequential, labor-intensive process: identify a target, validate it, screen millions of compounds, optimize leads, test safety, run clinical trials. Each stage can take years.

Agentic AI — autonomous systems that reason, plan, and execute multi-step workflows — promises to compress this timeline dramatically. By orchestrating domain-specific models (AlphaFold for structure, ESM for protein embeddings, generative models for molecule design) with LLM reasoning, agents can automate the entire pipeline from target identification to clinical candidate selection.

This post examines the state of agentic drug discovery in 2026. We cover the drug discovery pipeline and where agents plug in, real-world implementations from Isomorphic Labs and Insilico Medicine, the capabilities and limitations of current systems, and an honest assessment of how many “AI-discovered drugs” are actually reaching patients.

The Drug Discovery Pipeline: Where Agents Plug In

To understand where agentic AI creates value, we need a clear map of the drug discovery pipeline. The process breaks into six major stages:

1. Target Identification

Goal: Find a biological molecule (usually a protein) whose modulation could treat a disease.

Traditional approach: Literature review, genetic association studies (GWAS), analysis of disease pathways, expert hypothesis generation. Timeline: 2-5 years.

Agentic approach: An agent can autonomously:

Search PubMed, ClinicalTrials.gov, and genomic databases (GWAS Catalog, ClinVar) for disease-associated genes
Analyze multi-omics data (TCGA, GTEx) to identify dysregulated pathways
Use network analysis (STRING, Reactome) to find central nodes in disease networks
Score targets by “druggability” (presence of binding pockets, known ligand chemistry)
Generate a ranked target report with supporting evidence
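
The ranking step in the list above can be sketched as a weighted sum over evidence channels. This is a minimal illustration under stated assumptions, not any vendor's method: the class name, the weights, the gene labels, and the scores are all hypothetical, and each score is assumed to be pre-normalized to the range 0 to 1.

```python
from dataclasses import dataclass

# Hypothetical weights; a real pipeline would calibrate these against known targets.
WEIGHTS = {"genetic": 0.4, "expression": 0.3, "network": 0.2, "druggability": 0.1}

@dataclass
class TargetEvidence:
    gene: str
    genetic: float       # strength of genetic association, scaled to [0, 1]
    expression: float    # dysregulation in disease vs. control tissue
    network: float       # centrality in the disease network
    druggability: float  # binding-pocket / known-ligand evidence

def score(t: TargetEvidence) -> float:
    """Weighted sum of the four evidence channels."""
    return sum(w * getattr(t, name) for name, w in WEIGHTS.items())

def rank_targets(targets):
    """Return targets sorted from highest to lowest combined score."""
    return sorted(targets, key=score, reverse=True)

# Placeholder genes and scores, for illustration only.
candidates = [
    TargetEvidence("GENE_A", 0.9, 0.4, 0.5, 0.2),
    TargetEvidence("GENE_B", 0.5, 0.8, 0.7, 0.9),
    TargetEvidence("GENE_C", 0.2, 0.3, 0.9, 0.6),
]
for t in rank_targets(candidates):
    print(f"{t.gene}: {score(t):.2f}")
```

A weighted sum is the simplest possible aggregation; it lets a strong score in one channel compensate for a weak one, which may or may not be desirable for target selection.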

Real example: In 2025, a multi-agent system from Insilico Medicine identified a novel fibrosis target by integrating transcriptomics from 50,000+ patient samples with pathway analysis and literature mining. The agent evaluated 347 candidate genes and prioritized one with no prior fibrosis association — now in preclinical validation.

Key tools: PubMed API, GWAS Catalog, TCGA/GTEx, STRING, KEGG, custom druggability predictors

2. Target Validation

Goal: Confirm that modulating the target produces the desired biological effect.

Traditional approach: CRISPR knockout/knockdown experiments, small molecule inhibitors, animal models. Timeline: 1-3 years.

Agentic approach: Agents can design validation experiments and, in self-driving lab configurations, execute them:

Design CRISPR guides for target knockout
Predict off-target effects using deep learning models
Plan dose-response experiments
Analyze resulting phenotypic data (imaging, transcriptomics)
Iterate experimental design based on results
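
As a rough idea of what the guide-design step involves, the sketch below enumerates candidate SpCas9 guides: 20-nucleotide protospacers immediately followed by an NGG PAM. It is deliberately simplified and makes several assumptions: it scans the forward strand only, the example sequence is synthetic, and it does no off-target or efficiency scoring, which is the hard part that dedicated tools such as CHOPCHOP or CRISPOR handle.

```python
import re

def find_guides(seq: str, guide_len: int = 20):
    """Return (start, protospacer, pam) for every NGG PAM site on the forward strand."""
    seq = seq.upper()
    # A lookahead is used so that overlapping candidate sites are all reported.
    pattern = re.compile(rf"(?=([ACGT]{{{guide_len}}})([ACGT]GG))")
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]

def gc_fraction(s: str) -> float:
    """Fraction of G/C bases, a common first-pass guide quality filter."""
    return (s.count("G") + s.count("C")) / len(s)

# Synthetic sequence, for illustration only.
example = "ATGCATGCATGCATGCATGC" + "AGG" + "TTTT"
for start, protospacer, pam in find_guides(example):
    print(start, protospacer, pam, f"GC={gc_fraction(protospacer):.2f}")
```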

Current reality: Fully autonomous target validation remains aspirational. However, agents can design experiments that humans execute. The Acceleration Consortium’s self-driving lab demonstrated automated synthesis and testing of 150+ compounds per day, with the agent redesigning experiments overnight based on results.

Key tools: CRISPR design tools (CHOPCHOP, CRISPOR), phenomics analysis, robotic lab platforms

3. Hit Finding

Goal: Identify molecules that bind to the target with measurable affinity.

Traditional approach: High-throughput screening (HTS) of compound libraries (100,000-1,000,000+ compounds), fragment-based screening, virtual screening. Timeline: 1-2 years. Cost: $1-5 million.

Agentic approach: This is where agentic AI shows the most immediate impact:

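AlphaFold 3 + Molecular Docking Agents: AlphaFold 3 (released May 2024) predicts protein-ligand complexes directly from a protein sequence and a ligand description, and its authors report better pose accuracy than classical docking tools on their benchmark. An agent can: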

Generate the target structure (or retrieve from AlphaFold DB)
Prepare the binding site (add hydrogens, assign protonation states)
Run virtual screening on millions of compounds using docking software (AutoDock Vina, Glide, Gold)
Rank compounds by predicted binding affinity and drug-likeness
Select top 100-500 for experimental testing
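
The last two steps (rank, then select) might look like the sketch below. It assumes docking scores and molecular descriptors have already been computed by upstream tools; the `Compound` class, the compound IDs, and all numbers are hypothetical. The filter is Lipinski's Rule of 5, tolerating at most one violation.

```python
from dataclasses import dataclass

@dataclass
class Compound:
    cid: str
    docking_score: float  # kcal/mol; more negative means stronger predicted binding
    mw: float             # molecular weight (Da)
    logp: float           # octanol-water partition coefficient
    hbd: int              # hydrogen-bond donors
    hba: int              # hydrogen-bond acceptors

def lipinski_violations(c: Compound) -> int:
    """Count Rule-of-5 violations."""
    return sum([c.mw > 500, c.logp > 5, c.hbd > 5, c.hba > 10])

def triage(compounds, top_n, max_violations=1):
    """Keep drug-like compounds, then return the top_n best docking scores."""
    drug_like = [c for c in compounds if lipinski_violations(c) <= max_violations]
    return sorted(drug_like, key=lambda c: c.docking_score)[:top_n]

# Made-up library entries, for illustration only.
library = [
    Compound("cpd-1", -9.8, 412.5, 3.1, 2, 6),
    Compound("cpd-2", -11.2, 687.9, 6.4, 4, 12),  # best score, but three violations
    Compound("cpd-3", -8.1, 350.4, 2.2, 1, 5),
    Compound("cpd-4", -10.4, 498.6, 4.7, 3, 9),
]
selected = triage(library, top_n=2)
print([c.cid for c in selected])
```

Filtering before ranking means a compound with an excellent docking score but poor drug-likeness never reaches the shortlist, which matches the order of the steps above.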

Performance: A 2025 benchmark in Journal of Chemical Information and Modeling found that AlphaFold 3-guided virtual screening achieved 35% hit rates (compounds with measurable binding) compared to 5-10% for traditional HTS, a 3.5-7× improvement. However, the same study noted that AF3’s ligand pose predictions had an average RMSD of 2.8 Å, limiting their utility for structure-based optimization without experimental validation.
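
For readers unfamiliar with the pose metric, here is a minimal sketch of how ligand RMSD is computed. It assumes the predicted and reference poses list the same atoms in the same order and already share a coordinate frame (for example, after superimposing the protein); the coordinates are invented. For context, a cutoff of 2 Å is a widely used convention for calling a predicted pose correct.

```python
import math

def pose_rmsd(pred, ref):
    """Root-mean-square deviation between two matched lists of (x, y, z) coordinates in Å."""
    if len(pred) != len(ref) or not pred:
        raise ValueError("poses must be non-empty and contain the same number of atoms")
    squared = sum(
        (px - rx) ** 2 + (py - ry) ** 2 + (pz - rz) ** 2
        for (px, py, pz), (rx, ry, rz) in zip(pred, ref)
    )
    return math.sqrt(squared / len(pred))

# Invented coordinates: the predicted pose is the reference shifted by (1, 2, 2) Å.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
predicted = [(x + 1.0, y + 2.0, z + 2.0) for x, y, z in reference]
print(f"RMSD = {pose_rmsd(predicted, reference):.1f} Å")
```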

Generative Chemistry Agents: Instead of screening existing compounds, agents can design novel molecules:

Use reinforcement learning to optimize molecules for binding affinity, synthesizability, and drug-likeness
Employ diffusion models (like RFdiffusion for proteins, but for small molecules) to generate novel scaffolds
Leverage LLMs trained on SMILES strings to propose chemically valid structures
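
Optimizing for several properties at once requires collapsing them into a single reward. One common choice, sketched here as an assumption rather than as any specific platform's implementation, is a weighted geometric mean of component scores, each scaled to the range 0 to 1. The component names, weights, and scores are hypothetical.

```python
import math

def weighted_geometric_mean(scores: dict, weights: dict) -> float:
    """Combine per-property scores in [0, 1] into one reward in [0, 1]."""
    if any(scores[name] <= 0 for name in weights):
        return 0.0  # any failed component zeroes the whole reward
    total_weight = sum(weights.values())
    log_sum = sum(w * math.log(scores[name]) for name, w in weights.items())
    return math.exp(log_sum / total_weight)

# Hypothetical components: affinity counts twice as much as the other two.
weights = {"affinity": 2.0, "synthesizability": 1.0, "drug_likeness": 1.0}
candidate = {"affinity": 0.8, "synthesizability": 0.5, "drug_likeness": 0.9}
unsynthesizable = {"affinity": 0.95, "synthesizability": 0.0, "drug_likeness": 0.9}

print(round(weighted_geometric_mean(candidate, weights), 3))
print(weighted_geometric_mean(unsynthesizable, weights))
```

Unlike a weighted sum, the geometric mean does not let a very high predicted affinity compensate for a molecule that cannot be made, which is usually the behavior a chemist wants from the reward.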

Real example: Insilico Medicine’s INSILICO-AGENT generated a novel DDR1 inhibitor in 21 days (target identification to preclinical candidate), compared to the industry average of 4-6 years. The molecule entered Phase I trials in 2024.

Key tools: AlphaFold 3, AutoDock Vina, Schrödinger Glide, REINVENT (generative chemistry), ChemCrow (agentic chemistry)

4. Lead Optimization

Goal: Improve hit compounds for potency, selectivity, solubility, metabolic stability, and safety.

Traditional approach: Medicinal chemists synthesize analogs, test in assays, analyze structure-activity relationships (SAR), iterate. Timeline: 2-4 years.

Agentic approach: Agents can orchestrate the design-make-test-analyze (DMTA) cycle:

Design: Propose analogs using generative models, SAR analysis, and free-energy perturbation (FEP) calculations
Make: Generate synthetic routes using AI retrosynthesis (ASKCOS, AiZynthFinder)
Test: In self-driving labs, robots synthesize and test compounds
Analyze: Agents analyze results, update SAR models, propose next iteration

Real example: Recursion Pharmaceuticals operates a “ClinTech” platform that applies AI across the DMTA cycle. Their platform has generated one of the world’s largest biological datasets (50+ petabytes of cellular imaging data) and uses machine learning to identify unexpected drug-target relationships. As of early 2025, Recursion had at least seven programs expected to begin human trials or read out clinical data.

Key challenge: The “make” step remains a bottleneck. AI can design molecules faster than chemists can synthesize them. Self-driving labs address this but require significant capital investment ($10-50 million for full automation).

Key tools: ASKCOS, AiZynthFinder, FEP+ (Schrödinger), RDKit, self-driving lab platforms

5. ADMET Prediction

Goal: Predict Absorption, Distribution, Metabolism, Excretion, and Toxicity properties before clinical trials.

Traditional approach: In vitro assays (Caco-2 permeability, liver microsome stability, hERG binding), animal studies. Timeline: 1-2 years.

Agentic approach: Machine learning models can predict ADMET properties from molecular structure:

Absorption: Predict intestinal permeability, solubility
Distribution: Predict blood-brain barrier penetration, plasma protein binding
Metabolism: Predict CYP450 substrate/inhibition, metabolic stability
Excretion: Predict renal clearance, half-life
Toxicity: Predict hERG liability, hepatotoxicity, mutagenicity (Ames test)

Performance: A 2025 review in Drug Discovery Today found that ML-based ADMET prediction achieves 70-85% accuracy for most endpoints, which is sufficient for triaging compounds but not for replacing experimental testing. False liability flags (predictions that wrongly reject good compounds) remain a concern.

Agentic integration: An agent can run a panel of ADMET predictions on hundreds of compounds, flag those with liabilities, and propose structural modifications to address issues (e.g., “reduce hERG liability by removing basic amine”).
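
The flagging step can be sketched as a small rule table applied to predicted properties. This assumes the predictions come from some upstream ADMET model and arrive as plain dictionaries; the key names, compound IDs, and values are hypothetical. The two thresholds are the ones used in the case study later in this post.

```python
# Each rule returns True when the predicted property indicates a liability.
LIABILITY_RULES = {
    "hERG liability": lambda p: p["herg_pic50"] > 5.0,
    "poor solubility": lambda p: p["solubility_um"] < 10.0,
}

def flag_liabilities(prediction: dict) -> list:
    """Return the names of all rules this compound trips."""
    return [name for name, rule in LIABILITY_RULES.items() if rule(prediction)]

def triage_panel(predictions: dict) -> dict:
    """Map each compound ID to its flagged liabilities (an empty list means it passes)."""
    return {cid: flag_liabilities(p) for cid, p in predictions.items()}

# Invented predictions, for illustration only.
predictions = {
    "cpd-1": {"herg_pic50": 4.2, "solubility_um": 85.0},
    "cpd-2": {"herg_pic50": 6.1, "solubility_um": 40.0},
    "cpd-3": {"herg_pic50": 4.8, "solubility_um": 3.5},
}
report = triage_panel(predictions)
passing = [cid for cid, flags in report.items() if not flags]
print(report)
print("pass:", passing)
```

Given the 70-85% accuracy quoted above, flags like these are best treated as prompts for an experiment or a redesign, not as final rejections.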

Key tools: ADMETlab 2.0, DeepADMET, pkCSM, SwissADME

6. Clinical Candidate Selection

Goal: Choose the best compound to advance to human trials.

Traditional approach: Multi-parameter optimization, expert review, go/no-go decisions. Timeline: 6-12 months.

Agentic approach: Agents can compile comprehensive candidate reports:

Integrate all data (potency, selectivity, ADMET, synthetic feasibility, IP landscape)
Score compounds against target product profile (TPP)
Generate regulatory submission documents (IND-enabling studies plan)
Identify potential clinical trial sites based on patient populations
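
Scoring against a target product profile can be as simple as checking each measured or predicted property against its limit. The sketch below assumes a TPP expressed as property limits; the property names, the selectivity threshold, and the two example compounds are hypothetical.

```python
# Hypothetical target product profile: property -> (direction, limit).
TPP = {
    "ic50_nm": ("below", 100.0),
    "selectivity_fold": ("above", 50.0),
    "herg_pic50": ("below", 5.0),
}

def assess(compound: dict):
    """Return (fraction of TPP criteria met, list of failed properties)."""
    failed = []
    for prop, (direction, limit) in TPP.items():
        value = compound[prop]
        ok = value < limit if direction == "below" else value > limit
        if not ok:
            failed.append(prop)
    return 1 - len(failed) / len(TPP), failed

# Invented compounds, for illustration only.
strong = {"ic50_nm": 35.0, "selectivity_fold": 120.0, "herg_pic50": 4.1}
weak = {"ic50_nm": 250.0, "selectivity_fold": 80.0, "herg_pic50": 5.6}
print(assess(strong))
print(assess(weak))
```

Returning the list of failed criteria alongside the score keeps the report auditable, which matters when a human makes the final go/no-go call.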

Current reality: Final candidate selection remains a human decision, but agents can dramatically accelerate the data compilation and analysis.

Real-World Implementations: Who’s Actually Doing This?

Isomorphic Labs (Google DeepMind/Alphabet)

Background: Spun out of Google DeepMind in 2021, Isomorphic Labs is applying AlphaFold and related AI models to drug discovery. In March 2025, they raised $600 million in external funding led by Thrive Capital, valuing the company at over $2 billion.

Approach: Isomorphic has built an internal “Drug Design Engine” that integrates:

AlphaFold 3 for target structure and protein-ligand complex prediction
Proprietary models for ligand binding pose refinement (reportedly outperforming AF3 on the “Runs N’ Poses” benchmark)
Generative chemistry for de novo molecule design
ADMET prediction models trained on pharmaceutical data

Pipeline: As of July 2025, Isomorphic is preparing to launch its first human trials for AI-designed drugs. Colin Murdoch, President of Isomorphic Labs, stated in a Fortune interview: “We have multiple programs advancing toward the clinic. The question is not whether AI can design drugs — it’s whether those drugs will work in patients.”

Focus areas: Oncology and immunology (internal pipeline), plus partnerships with Novartis and Eli Lilly.

Transparency: Isomorphic has been criticized for limited public disclosure of results. Unlike AlphaFold 2 (open-source), AlphaFold 3 launched in May 2024 as a web server only; its code was released in November 2024 for non-commercial academic use, with model weights available on request.

Insilico Medicine

Background: Founded in 2014, Insilico is one of the longest-running AI drug discovery companies. They gained prominence with their AI-discovered fibrosis drug, ISM001-055.

ISM001-055 Status: This drug, designed by Insilico’s AI platform for idiopathic pulmonary fibrosis (IPF), entered Phase II trials in 2024. As of early 2026, results are pending. This is widely considered the most advanced “AI-discovered drug” in clinical development.

Platform: Insilico’s “Pharma.AI” platform includes:

PandaOmics: Target identification engine (multi-omics integration)
Chemistry42: Generative chemistry for molecule design
InClinico: Clinical trial outcome prediction

Claimed performance: Insilico reports that their platform reduced target-to-candidate time from 4-6 years to 18-24 months for internal programs. Independent verification is limited.

Pipeline: 8 programs in clinical trials as of 2025, spanning fibrosis, oncology, and neurodegeneration.

Recursion Pharmaceuticals

Background: Founded in 2013, Recursion takes a phenomics-driven approach. Rather than starting with a specific target, they image cells treated with thousands of compounds and use AI to find unexpected therapeutic effects.

Platform: Recursion OS includes:

BioHive-2: A supercomputer (ranked #76 globally in 2025) built with NVIDIA
50+ petabytes of biological data (cellular imaging, transcriptomics, patient data)
Machine learning models for phenotype-to-target mapping

Unique approach: Recursion’s “phenomapping” identifies drugs by their cellular effects, not by target binding. This can uncover drugs for “undruggable” targets or reveal new indications for existing compounds.

Pipeline: As of early 2025, Recursion had at least seven programs expected to begin human trials or read out clinical data. Their most advanced candidate, REC-3565 (a MALT1 inhibitor for B-cell lymphomas), entered Phase I trials in 2025.

Partnership: $500 million collaboration with Bayer (2021) for oncology and CNS diseases.

ChemCrow: Open-Source Agentic Chemistry

Background: ChemCrow, published in 2024 by the EPFL laboratory of Philippe Schwaller, is an open-source agentic system for chemistry. It demonstrates what’s possible with publicly available tools.

Capabilities: ChemCrow integrates 18 expert tools, including:

Literature search (PubMed, Google Scholar)
Reaction prediction and retrosynthesis
Property prediction (solubility, toxicity)
Experimental protocol generation

Demonstration: In the original paper, ChemCrow autonomously planned the syntheses of an insect repellent and three organocatalysts, which were then carried out on a cloud-connected robotic synthesis platform (IBM’s RoboRXN). Outside such automated setups, the agent’s protocols still need a human chemist to execute them in the lab.

Significance: ChemCrow shows that agentic chemistry is feasible with openly available tooling. It’s a template for building drug discovery agents without proprietary infrastructure.

Access: Available on GitHub (MIT license).

Honest Assessment: How Many AI-Discovered Drugs Are Actually in Clinic?

This is the critical question. After billions in investment and a decade of AI drug discovery hype, where are the approved drugs?

The short answer: As of December 2025, zero AI-discovered drugs have received FDA approval.

The nuanced answer: Several are in clinical trials, with first approvals plausible in 2027-2028:

Company	Drug	Indication	Stage (2025)	Expected Decision
Insilico Medicine	ISM001-055	Idiopathic pulmonary fibrosis	Phase II	2027-2028
Recursion	REC-3565	B-cell lymphoma	Phase I	2028-2029
Isomorphic Labs	Undisclosed	Oncology/immunology	Preclinical/Phase I prep	2028+
Exscientia	DSP-1181 (with Sumitomo)	OCD	Phase I (terminated)	N/A
BenevolentAI	BEN-2293	Solid tumors	Phase I	2027+

Why the gap? Several factors:

Clinical trials take time. Even if AI compresses preclinical work from 5 years to 1 year, Phase I-III trials still take 6-8 years. The first wave of AI-designed drugs entered trials in 2021-2023; decisions come in 2027-2030.
AI doesn’t eliminate biological risk. A drug can be perfectly designed but fail because the target wasn’t actually causal in disease, or because of unexpected toxicity. AI improves the odds but doesn’t eliminate the 90% failure rate.
Attribution is murky. Many “AI-discovered drugs” are AI-assisted. A human identified the target; AI optimized the molecule. Or AI screened compounds; humans did the rest. Pure end-to-end AI discovery is rare.
Publication bias. Companies announce AI-designed drugs entering trials but don’t report failures. A 2025 Drug Target Review analysis noted: “Failed AI programmes from 2025 included multiple deprioritised candidates, shelved drugs after Phase II and compounds showing no efficacy signal.”

Realistic timeline: First FDA approval of an AI-discovered drug: 2027-2028, assuming ISM001-055 or similar candidates succeed in Phase II/III.

Technical Challenges: What’s Still Hard

Despite the hype, agentic drug discovery faces significant technical hurdles:

1. Hallucination in Biological Contexts

LLMs can generate plausible-sounding but incorrect biological claims. An agent that confidently asserts “Protein X interacts with Y” based on a hallucinated paper citation could send a drug discovery program down a costly wrong path.

Mitigation: Verification chains — requiring agents to cite sources, cross-check claims against databases, and flag uncertainty. Post 15 covered this in detail.
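
A verification chain can start with very simple checks. The sketch below classifies each agent claim by whether it is cited, whether the citation resolves, and whether the cited text mentions the entities in the claim. Everything here is an assumption for illustration: the `records` dictionary stands in for a real literature database lookup, the identifiers are placeholders, and substring matching is a crude proxy for actually reading the source.

```python
def verify_claim(claim: dict, records: dict) -> str:
    """Classify a claim as uncited, unresolved citation, supported, or needs review."""
    citations = claim.get("citations", [])
    if not citations:
        return "uncited"
    if any(c not in records for c in citations):
        return "unresolved citation"  # possibly a hallucinated reference
    entities = [e.lower() for e in claim["entities"]]
    for c in citations:
        text = records[c].lower()
        if all(e in text for e in entities):
            return "supported"
    return "needs review"  # the citation exists but does not mention the entities

# Stand-in for a literature database; the ID is a placeholder.
records = {"PMID:0000001": "TGFB1 signalling activates SMAD3 in fibroblasts."}

claims = [
    {"entities": ["TGFB1", "SMAD3"], "citations": ["PMID:0000001"]},
    {"entities": ["Protein X", "Protein Y"], "citations": ["PMID:9999999"]},
    {"entities": ["TGFB1", "EGFR"], "citations": ["PMID:0000001"]},
    {"entities": ["TGFB1", "SMAD3"], "citations": []},
]
statuses = [verify_claim(c, records) for c in claims]
print(statuses)
```

Only the "supported" status lets a claim pass without a human look; the other three are exactly the cases where the agent should flag uncertainty.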

2. API Orchestration Complexity

A drug discovery agent needs to coordinate dozens of tools: structure prediction, docking, synthesis planning, property prediction, literature search. Each has different APIs, rate limits, input/output formats, and failure modes.

Mitigation: Standardized tool interfaces (like LangChain’s tool abstraction), robust error handling, and caching layers.
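
To illustrate the three mitigations together, here is a minimal sketch of a uniform tool wrapper with retries, exponential backoff, and a result cache. It is plain Python written for this post and does not reproduce any framework's API; the `Tool` class, the flaky docking function, and its return value are all hypothetical.

```python
import time

class Tool:
    """Uniform wrapper: keyword-argument call, retries with backoff, result cache."""

    def __init__(self, name, func, max_retries=3, backoff_s=0.0):
        self.name = name
        self.func = func
        self.max_retries = max_retries
        self.backoff_s = backoff_s
        self._cache = {}

    def __call__(self, **kwargs):
        key = tuple(sorted(kwargs.items()))  # assumes hashable argument values
        if key in self._cache:
            return self._cache[key]
        last_error = None
        for attempt in range(self.max_retries):
            try:
                result = self.func(**kwargs)
            except Exception as exc:  # a real agent would catch narrower error types
                last_error = exc
                time.sleep(self.backoff_s * 2 ** attempt)
            else:
                self._cache[key] = result
                return result
        raise RuntimeError(f"{self.name} failed after {self.max_retries} attempts") from last_error

calls = {"n": 0}

def flaky_docking(smiles, target):
    """Stand-in for a remote docking service that fails on its first two calls."""
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("service busy")
    return {"smiles": smiles, "target": target, "score": -8.4}

dock = Tool("docking", flaky_docking)
first = dock(smiles="CCO", target="CDK20")
second = dock(smiles="CCO", target="CDK20")  # served from the cache
print(first["score"], calls["n"])
```

Caching matters more than it might seem: agents often re-request the same structure or docking result while replanning, and each repeat can cost minutes of compute.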

3. Computational Cost

Running AlphaFold 3 on 100 targets, docking 1 million compounds, and generating 10,000 novel molecules requires significant compute. A single AlphaFold 3 prediction can take hours on a GPU; virtual screening of millions of compounds requires cluster-scale resources.

Mitigation: Hierarchical screening (quick filters first, expensive models only on top candidates), cloud bursting, and model distillation (training smaller, faster models that approximate larger ones).
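
Hierarchical screening is easy to express in code: run the cheapest scorer on everything, keep the top slice, and only then run the expensive scorer. The sketch below uses toy scoring functions and counts how often each is called; the function names and the numbers are illustrative assumptions.

```python
def hierarchical_screen(items, stages):
    """Apply (score_fn, keep_n) stages in order, cheapest first; higher score is better."""
    survivors = list(items)
    for score_fn, keep_n in stages:
        survivors = sorted(survivors, key=score_fn, reverse=True)[:keep_n]
    return survivors

evaluations = {"cheap": 0, "expensive": 0}

def cheap_filter(x):
    """Toy stand-in for a fast filter such as a property check."""
    evaluations["cheap"] += 1
    return x

def expensive_model(x):
    """Toy stand-in for a slow model such as docking or free-energy calculation."""
    evaluations["expensive"] += 1
    return -abs(x - 950)

hits = hierarchical_screen(range(1000), [(cheap_filter, 100), (expensive_model, 5)])
print(sorted(hits), evaluations)
```

Here the expensive model runs on 100 items instead of 1,000. The trade-off is that anything the cheap filter wrongly discards is never seen by the better model.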

4. Data Heterogeneity

Biological data comes in incompatible formats: sequences (FASTA), structures (PDB, mmCIF), expression matrices (CSV, H5AD), images (TIFF), clinical data (FHIR, OMOP). Agents must normalize and integrate these.

Mitigation: Data lakes with standardized schemas (e.g., OMOP for clinical data), ETL pipelines, and ontology mapping (SNOMED, HPO, GO).
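
At its simplest, normalization means parsing each format into a shared record shape so that downstream code can join on a common identifier. The sketch below does this for two of the formats listed above, FASTA and CSV. The record schema (`entity_id`, `modality`, `data`), the `gene` column name, and the example inputs are assumptions made for illustration.

```python
import csv
import io

def _sequence_record(header, chunks):
    entity_id = (header.split() or ["unknown"])[0]
    return {"entity_id": entity_id, "modality": "sequence", "data": "".join(chunks)}

def parse_fasta(text: str):
    """Return one normalized record per FASTA entry."""
    records, header, chunks = [], None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if header is not None:
                records.append(_sequence_record(header, chunks))
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        records.append(_sequence_record(header, chunks))
    return records

def parse_expression_csv(text: str):
    """Return one normalized record per row of a gene-by-sample CSV with a 'gene' column."""
    records = []
    for row in csv.DictReader(io.StringIO(text)):
        gene = row.pop("gene")
        values = {sample: float(v) for sample, v in row.items()}
        records.append({"entity_id": gene, "modality": "expression", "data": values})
    return records

# Invented inputs, for illustration only.
fasta_text = ">GENE1 example protein\nMKT\nAYI\n>GENE2\nGAV\n"
csv_text = "gene,sample_1,sample_2\nGENE1,1.5,2.0\n"

by_entity = {}
for record in parse_fasta(fasta_text) + parse_expression_csv(csv_text):
    by_entity.setdefault(record["entity_id"], {})[record["modality"]] = record["data"]
print(by_entity)
```

The hard part in practice is the identifier itself: gene symbols, UniProt accessions, and Ensembl IDs all refer to overlapping entities, which is where ontology mapping comes in.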

5. Regulatory Uncertainty

FDA has not issued specific guidance on AI-discovered drugs. Questions remain:

Does the AI design process need validation?
How do you document “AI reasoning” in an IND submission?
Who is liable if an AI-designed drug causes harm?

Current state: FDA regulates drugs, not the tools used to discover them. But this may change as AI plays a larger role.

The Human Scientist in the Loop

A critical question: should drug discovery be fully autonomous?

Our view: No, at least not end to end.

Full autonomy makes sense for well-defined, low-risk tasks (virtual screening, ADMET prediction). But high-stakes decisions (target selection, clinical candidate choice) require human judgment for several reasons:

Accountability: When a drug fails or causes harm, someone must be responsible. An autonomous agent cannot be held liable.
Creativity: The best drug discoveries often come from unexpected insights — a clinician observing an unusual side effect, a researcher connecting disparate fields. Current AI lacks this creative spark.
Ethics: Decisions about which diseases to target, which patient populations to include in trials, and how to price drugs are ethical, not technical. These require human deliberation.

The right model: “Human on the loop” — AI handles routine tasks, humans oversee and make strategic decisions. The agent proposes; the human disposes.

Case Study: An Agentic Drug Discovery Workflow

To make this concrete, let’s walk through a hypothetical but realistic agentic workflow for discovering a kinase inhibitor:

User prompt: "Design a selective inhibitor for CDK20, a kinase implicated 
in ovarian cancer. Target profile: IC50 < 100 nM, selective over CDK1/2/4/6, 
oral bioavailability, no hERG liability."

Agent workflow:

1. TARGET ANALYSIS
   - Query UniProt: CDK20 sequence retrieved (320 amino acids)
   - Query AlphaFold DB: Structure available (pLDDT = 94)
   - Query PubMed: 147 papers on CDK20; key role in Wnt signaling
   - Query ChEMBL: 23 known inhibitors, but none selective over CDK1/2
   
2. STRUCTURE-BASED DESIGN
   - Run AlphaFold 3: Predict CDK20 + ATP complex
   - Identify binding pocket: ATP-binding cleft, DFG motif in active conformation
   - Analyze selectivity determinants: CDK20 has unique gatekeeper residue (Met95)
   
3. VIRTUAL SCREENING
   - Screen a drug-like subset of ZINC15 (750,000 compounds)
   - Docking with Glide: 12,000 compounds with predicted Ki < 1 μM
   - Filter by drug-likeness (Lipinski rules): 8,500 remain
   - Cluster by scaffold: 347 unique chemotypes
   
4. GENERATIVE OPTIMIZATION
   - Select top 10 scaffolds
   - Use REINVENT to generate 1,000 analogs per scaffold
   - Optimize for: predicted CDK20 affinity, CDK1/2/4/6 selectivity, 
     solubility, synthetic accessibility
   - Output: 50 prioritized compounds
   
5. ADMET PREDICTION
   - Run ADMETlab 2.0 on 50 compounds
   - Flag 12 with hERG liability (pIC50 > 5)
   - Flag 8 with poor predicted solubility (< 10 μM)
   - 30 compounds pass all filters
   
6. SYNTHESIS PLANNING
   - Run ASKCOS retrosynthesis on 30 compounds
   - 22 have feasible routes (< 6 steps, commercially available starting materials)
   - Generate synthetic protocols with conditions and purification methods
   
7. REPORT GENERATION
   - Compile 30-page report with:
     * Target rationale (10 key references)
     * Structural analysis (figures from PyMOL)
     * 30 recommended compounds (structures, predicted properties; synthetic routes for the 22 with a feasible route)
     * Proposed assay cascade (binding assay → kinase assay → cell assay)
   - Flag uncertainty: "Selectivity predictions based on homology models; 
     experimental validation required"
   
Total time: 4 hours (vs. 3-6 months for traditional approach)
Recommended next step: Human review, order top 10 compounds for testing

This workflow is achievable today with existing tools. The bottleneck is not the AI — it’s the experimental validation, which still requires human lab work (or expensive self-driving lab infrastructure).
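
To see how steep the funnel in this workflow is, the short script below recomputes the retention at each filtering step from the counts given above. The generative step is left out of the first funnel because it creates new molecules rather than filtering existing ones. The step labels are paraphrased; the counts are copied from the hypothetical workflow.

```python
# Counts copied from the hypothetical workflow above.
screening_steps = [
    ("library screened", 750_000),
    ("docking hits", 12_000),
    ("pass Lipinski filter", 8_500),
]
late_steps = [
    ("prioritized after optimization", 50),
    ("pass ADMET filters", 30),
    ("feasible synthetic route", 22),
]

def retention(steps):
    """Return (label, count, percent retained from the previous step) for each step after the first."""
    return [
        (label, count, 100.0 * count / previous)
        for (_, previous), (label, count) in zip(steps, steps[1:])
    ]

for label, count, pct in retention(screening_steps) + retention(late_steps):
    print(f"{label:<32} {count:>7,}  {pct:5.1f}% of previous step")

# The ADMET step reaches 30 only if the two flagged sets (12 and 8) do not overlap.
admet_passing = 50 - 12 - 8
print("ADMET passing:", admet_passing)
```

Docking is by far the most selective step, keeping under 2% of the library, which is why the quality of the docking score dominates what reaches the chemist.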

Conclusion: Cautious Optimism

Agentic drug discovery is real, but it’s not magic.

What’s proven:

AI can compress target-to-candidate timelines from years to months
Virtual screening with AlphaFold 3 achieves higher hit rates than traditional HTS
Generative chemistry can design novel, synthesizable molecules
ADMET prediction is accurate enough for triaging compounds

What’s aspirational:

Fully autonomous target validation (requires self-driving labs)
End-to-end discovery without human oversight (undesirable for safety reasons)
Guaranteed clinical success (AI doesn’t eliminate biological risk)

What’s next:

2026-2027: First Phase II results from AI-discovered drugs (ISM001-055 readout)
2027-2028: First FDA approval of an AI-discovered drug (plausible but not guaranteed)
2028+: Agentic systems become standard in pharma R&D, but human oversight remains

The agents are coming to drug discovery. They won’t replace medicinal chemists or pharmacologists — but they will make them dramatically more productive. The question is not whether AI will transform drug discovery, but whether the first AI-discovered drugs will actually work in patients. We’ll have answers soon.

Glossary

Term	Definition
ADMET	Absorption, Distribution, Metabolism, Excretion, Toxicity — properties that determine whether a drug candidate is viable
AlphaFold 3	AI model from Google DeepMind/Isomorphic Labs (2024) that predicts structures of proteins and their complexes with DNA, RNA, small-molecule ligands, and ions
DFG motif	Conserved Asp-Phe-Gly sequence in kinase activation loops; conformation determines whether kinase is active or inactive
Docking	Computational prediction of how a small molecule binds to a protein target
Drug-likeness	Heuristic criteria (e.g., Lipinski’s Rule of 5) that predict whether a compound is likely to be orally bioavailable
Gatekeeper residue	Amino acid in kinase ATP-binding pocket that controls access to a hydrophobic pocket; key determinant of inhibitor selectivity
Generative chemistry	AI methods that design novel molecules rather than screening existing compounds
hERG	Human Ether-à-go-go-Related Gene; encodes a cardiac potassium channel whose blockade can cause fatal arrhythmias. A major cause of drug failures
Hit	A compound that shows measurable binding to a target (typically IC50 or Ki < 10 μM)
IND	Investigational New Drug application; submission to FDA to begin clinical trials
Lead	An optimized hit with improved potency, selectivity, and drug-like properties
REINVENT	Open-source generative chemistry platform using reinforcement learning to design molecules
Retrosynthesis	Planning a synthetic route backward from target molecule to available starting materials
SAR	Structure-Activity Relationship — how chemical modifications affect biological activity
Self-driving lab	Automated laboratory where AI agents design experiments and robots execute them
SMILES	Simplified Molecular Input Line Entry System — text representation of chemical structures
Target	A biological molecule (usually a protein) whose modulation is hypothesized to treat a disease
Virtual screening	Computational screening of compound libraries to identify potential binders

References

Seal, S., Huynh, D.L., Chelbi, M., et al. “AI Agents in Drug Discovery.” arXiv preprint arXiv:2510.27130, October 2025.
Abramson, J., Adler, J., Dunger, J., et al. “Accurate structure prediction of biomolecular interactions with AlphaFold 3.” Nature 630, 493-500 (2024). https://doi.org/10.1038/s41586-024-07487-w
“Isomorphic Labs announces $600 million funding to further develop its next-generation AI drug design engine.” PR Newswire, March 31, 2025.
“Isomorphic Labs prepares to launch trials for AI-designed drugs.” Clinical Trials Arena, July 7, 2025.
“As Pipeline Advances, Recursion Expands AI Focus to Clinical Trials.” GEN Engineering News, February 3, 2025.
“AI in drug discovery: 2025 in review.” Drug Target Review, December 2025. https://www.drugtargetreview.com/article/192951/ai-in-drug-discovery-2025-in-review/
Bran, A.M., Cox, S., Schilter, O., et al. “ChemCrow: Augmenting large language models with chemistry tools.” Nature Machine Intelligence 6, 365-377 (2024).
“Leading artificial intelligence–driven drug discovery platforms: 2025 landscape and global outlook.” Pharmacology & Therapeutics (ScienceDirect), November 2025.
Insilico Medicine. “ISM001-055 Phase II Trial for Idiopathic Pulmonary Fibrosis.” ClinicalTrials.gov identifier NCT05761790.
Recursion Pharmaceuticals. “Our Unique Approach to AI Drug Discovery.” https://www.recursion.com/mission (accessed March 2026).
“AI-driven multi-omics integration in precision oncology.” PMC, 2025.
“Democratising real-world drug discovery through agentic AI.” ScienceDirect, January 2026.

This post is part of the “Agentic Omics” series. Next: Post 17 examines agentic systems for cancer genomics and precision oncology.
