Introduction: Standing at the Inflection Point
As we conclude the Agentic Omics series in March 2026, we find ourselves at a genuine inflection point. The past two years have witnessed extraordinary progress: AlphaFold 3’s extension to protein complexes and ligands, the emergence of 7B-parameter genome models like Evo, foundation models for single-cell biology moving toward clinical use, and the first wave of agentic systems orchestrating multi-step scientific workflows. Yet we also face sobering realities: Phase III clinical trial results remain the ultimate arbiter of success, regulatory frameworks are still crystallising, and the gap between computational prediction and biological causality remains stubbornly wide.
This final post does three things. First, it synthesises the key insights from our 23 preceding posts into a coherent picture of where agentic omics stands today. Second, it offers evidence-based predictions for the near-term (2026–2027), medium-term (2027–2028), and long-term (2029+) horizons. Third, it honestly confronts what remains hard—the challenges that no amount of architectural innovation or compute scaling will easily solve.
Our thesis remains unchanged from Post 1: Agentic Omics is not about replacing human scientists; it’s about amplifying them. The creative spark of asking the right question, the intuition honed by decades of bench work, the ethical judgment required for clinical decisions—these remain irreducibly human. But the tedious literature reviews, the pattern recognition across millions of data points, the orchestration of domain-specific tools—these are increasingly the domain of autonomous agents.
The State of Agentic Omics: March 2026
What’s Proven
Let’s begin with what we can state with confidence, backed by peer-reviewed evidence and real-world deployment:
Protein structure prediction is largely a solved problem for static, single-chain structures. AlphaFold 2 achieved near-experimental accuracy for many single-chain proteins at CASP14 in 2020 (published in 2021), and the subsequent release of 200+ million predicted structures transformed structural biology. AlphaFold 3’s 2024 extension to protein complexes, nucleic acids, and small molecules represents a genuine advance, though accuracy for protein-ligand docking remains debated in independent benchmarks. The 2024 Nobel Prize in Chemistry recognised this as a fundamental breakthrough.
DNA foundation models can predict regulatory elements and, in some settings, gene expression from sequence alone. DNABERT-2 (Zhou et al., ICLR 2024) demonstrated that BPE tokenization combined with multi-species pretraining enables efficient transfer learning across genome understanding tasks. The Nucleotide Transformer (up to 2.5B parameters, trained on thousands of human genomes plus hundreds of genomes from other species) and Evo (7B parameters, 300B nucleotides) have shown impressive performance on tasks such as enhancer prediction, splice site detection, and conservation scoring. However, structural variant prediction and long-range regulatory interactions remain challenging.
Single-cell foundation models outperform classical methods for cell type annotation. scGPT (Cui et al., Nature Methods 2024), trained on 33+ million cells, and Geneformer (Theodoris et al., Nature 2023), trained on 30 million transcriptomes, have demonstrated superior transfer learning for cell type classification, perturbation prediction, and gene regulatory network inference. These models are now being deployed in production at major genomics companies.
Agentic chemistry is operational. ChemCrow (Bran et al., 2024) demonstrated that LLMs can orchestrate chemistry tools (reaction prediction, retrosynthesis, property calculation) into coherent multi-step workflows. Coscientist (Boiko et al., Nature 2023) showed autonomous experimental design and execution in synthetic chemistry. Both were demonstrated end to end on real laboratory tasks, although evidence of routine production use by large numbers of researchers is still thin.
AI-designed drugs are in clinical trials. Insilico Medicine’s ISM001-055 for idiopathic pulmonary fibrosis entered Phase II trials in 2023 and is widely described as the first drug with both an AI-discovered target and an AI-designed molecule to reach this stage. Recursion Pharmaceuticals has multiple AI-identified candidates in oncology and rare disease trials. The question is no longer whether AI can design drug-like molecules; it is whether those molecules will succeed in Phase III.
What’s Promising But Unproven
Agentic omics workflows for end-to-end discovery. The vision articulated in Post 14—an LLM orchestrating AlphaFold for structure, ESM for embeddings, scGPT for expression, and RFdiffusion for design—is technically feasible today. Several companies have internal prototypes. But peer-reviewed evidence of these systems discovering novel biology or therapeutics independently remains limited to case studies and preprints.
Multi-omics integration at single-cell resolution. Technologies like CITE-seq (RNA plus surface protein) and SHARE-seq (RNA plus chromatin accessibility) enable simultaneous measurement of more than one molecular layer in individual cells. scGPT and related models can integrate these modalities. But the computational challenges (scaling to millions of cells, correcting batch effects across platforms, imputing missing modalities) remain active research areas.
Self-driving laboratories for biology. The Acceleration Consortium (University of Toronto), Emerald Cloud Lab, and Strateos have demonstrated autonomous experimental platforms for chemistry and materials science. Biology is harder: wet lab variability, biological complexity, and the cost of failed experiments create higher barriers. Several pilot systems exist, but none have achieved the closed-loop autonomy seen in materials discovery.
What’s Overhyped
“10x faster drug development.” Claims that AI will compress drug discovery from 5 years to 6 months conflate preclinical acceleration with total development timelines. AI-enabled workflows can compress early discovery by 30-40% and reduce preclinical candidate development to 13-18 months (versus 3-4 years traditionally). But clinical trial duration, regulatory review, and manufacturing scale-up remain unchanged. Biology, patient enrollment, and regulatory requirements impose non-negotiable constraints.
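A quick back-of-envelope calculation makes the point. The sketch below uses the midpoints of the preclinical ranges quoted above and assumes, purely for illustration, a fixed 7 years for everything after candidate nomination (clinical trials, regulatory review, scale-up). That 7-year figure and the function name are assumptions, not data from this series.

```python
def overall_speedup(preclinical_before, preclinical_after, downstream):
    """Return (total_before, total_after, speedup) in years for a two-stage pipeline.

    Only the preclinical stage shrinks; the downstream stage (clinical trials,
    regulatory review, manufacturing scale-up) is held fixed.
    """
    total_before = preclinical_before + downstream
    total_after = preclinical_after + downstream
    return total_before, total_after, total_before / total_after


# Midpoints of the ranges quoted in the text.
PRECLINICAL_BEFORE = 3.5        # years (3-4 years traditionally)
PRECLINICAL_AFTER = 15.5 / 12   # years (13-18 months with AI-enabled workflows)
# ASSUMPTION for illustration only: time spent after candidate nomination.
DOWNSTREAM = 7.0                # years

before, after, speedup = overall_speedup(
    PRECLINICAL_BEFORE, PRECLINICAL_AFTER, DOWNSTREAM
)
stage_speedup = PRECLINICAL_BEFORE / PRECLINICAL_AFTER

print(f"preclinical stage speedup: {stage_speedup:.1f}x")
print(f"end-to-end: {before:.1f} -> {after:.1f} years ({speedup:.2f}x)")
```

Under these assumptions a roughly 2.7x faster preclinical stage yields only about a 1.27x faster programme overall, which is the same logic as Amdahl's law in computing: the stage you cannot accelerate dominates the total.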
“AI will replace medicinal chemists.” The evidence suggests augmentation, not replacement. AI excels at generating candidate molecules and predicting properties. Human chemists excel at recognizing synthetic feasibility, understanding SAR (structure-activity relationships), and making judgment calls when predictions conflict. The most productive teams combine both.
“Foundation models understand biology.” Language is a dangerous metaphor. Biological sequences are not natural language; they encode physical constraints, evolutionary pressures, and biochemical functions that transformers learn implicitly but do not “understand” in any human sense. When models extrapolate beyond training distributions, they fail in ways that reveal the limits of the analogy.
Near-Term Predictions (2026–2027)
Agentic Literature Review Becomes Routine
By late 2026, we expect agentic literature review systems to become standard tools in academic and industrial labs. These systems will:
- Automatically monitor arXiv, bioRxiv, PubMed, and conference proceedings for relevant papers
- Extract and structure key findings into knowledge graphs
- Answer complex queries like “What’s the evidence for KRAS G12C inhibitor resistance mechanisms?” with cited sources
- Identify contradictions between studies and flag methodological concerns
The technology exists today—ReAct-style agents with tool use for literature search, RAG for retrieval, and verification chains for fact-checking. What’s needed is integration into researcher workflows and validation against expert-curated reviews. Early adopters will gain significant competitive advantages in staying current with rapidly moving fields.
Evidence base: Multiple preprints in 2025 demonstrated LLM agents achieving 85-90% accuracy on literature synthesis tasks when constrained to retrieval-augmented generation with source verification. Commercial products (Elicit, Scite, Consensus) already serve tens of thousands of researchers.
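To make "retrieval-augmented generation with source verification" concrete, here is a minimal sketch of the two mechanical pieces: ranking documents against a query and rejecting answer sentences whose citations are missing or point outside the retrieved set. The corpus, document IDs, and the drafted answer are hypothetical placeholders, and token overlap stands in for a real embedding model; an actual system would put an LLM between the two steps.

```python
import re

# Hypothetical mini-corpus; IDs and texts are placeholders, not real papers.
CORPUS = {
    "doc1": "Secondary KRAS mutations can confer resistance to G12C inhibitors.",
    "doc2": "Single-cell RNA sequencing profiles the tumor microenvironment.",
    "doc3": "Bypass signaling through receptor tyrosine kinases is one "
            "resistance route to KRAS G12C inhibitors.",
}


def tokens(text):
    """Lowercase word tokens as a set (a stand-in for a real embedding model)."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query, corpus, k=2):
    """Rank documents by token overlap with the query and return the top-k IDs."""
    q = tokens(query)
    ranked = sorted(corpus, key=lambda d: len(q & tokens(corpus[d])), reverse=True)
    return ranked[:k]


def verify(answer, retrieved_ids):
    """Flag each sentence: True only if it cites at least one source and every
    cited ID is among the retrieved documents."""
    allowed = set(retrieved_ids)
    report = []
    for sentence in filter(None, (s.strip() for s in answer.split("."))):
        cited = set(re.findall(r"\[(\w+)\]", sentence))
        report.append((sentence, bool(cited) and cited <= allowed))
    return report


QUERY = "KRAS G12C inhibitor resistance mechanisms"
# Hypothetical draft answer, as an LLM might produce it.
ANSWER = (
    "Secondary KRAS mutations drive resistance [doc1]. "
    "Bypass signaling also contributes [doc3]. "
    "Resistance is always reversible [doc9]."
)

hits = retrieve(QUERY, CORPUS)
report = verify(ANSWER, hits)
for sentence, ok in report:
    print("OK  " if ok else "FLAG", sentence)
```

The third sentence is flagged because it cites a document that was never retrieved. This check only confirms that a citation exists and is in scope; whether the cited text actually supports the claim still needs a separate entailment step or a human reader.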
AlphaFold-Class Models for RNA and Metabolites
The structural biology revolution expanded from proteins to RNA in 2025-2026. We predict:
- RNA structure prediction will achieve AlphaFold 2-level accuracy for single-chain RNAs by late 2026, with models like RoseTTAFold All-Atom and specialized RNA transformers reaching sub-3Å RMSD for many targets
- Metabolite structure elucidation from mass spectrometry will see transformer-based approaches match or exceed traditional fragmentation tree methods for known compound classes
- Protein-RNA complex prediction will become reliable enough for guiding experimental design in RNA therapeutics
The CASP16 (2024) and RNA-Puzzles results showed steady progress. The remaining challenges—RNA conformational dynamics, modified nucleotides, and RNA-ligand interactions—will see incremental but meaningful advances.
Evidence base: RNA foundation models such as RNABERT and RNA-FM have demonstrated strong performance on secondary structure prediction and function annotation. The RNA-Puzzles community assessment showed deep learning methods closing the gap with experimental approaches.
Single-Cell Foundation Models Reach Clinical Utility
By 2027, we expect single-cell AI models to transition from research tools to clinical diagnostics:
- Minimal residual disease detection in leukemia using scGPT-based classifiers will achieve sensitivity comparable to flow cytometry with reduced manual gating
- Tumor microenvironment profiling from single-cell RNA-seq will guide immunotherapy selection in solid tumors, with prospective clinical trials demonstrating improved response prediction
- Cell therapy manufacturing will use AI-driven quality control to predict potency and safety from transcriptomic signatures
The Human Cell Atlas and Tabula Sapiens provide the reference datasets. Regulatory pathways (FDA’s Software as a Medical Device framework) are being established. The remaining barriers are prospective clinical validation and integration into clinical workflows.
Evidence base: Multiple 2025 studies demonstrated single-cell classifiers achieving AUC >0.90 for cancer subtyping and treatment response prediction. The FDA’s 2021 de novo marketing authorization of Paige Prostate sets a precedent for AI-based digital pathology; single-cell diagnostics can be expected to follow similar pathways.
Multi-Agent Debate Systems Improve Scientific Reasoning
We predict that multi-agent architectures—where specialized agents debate hypotheses, critique methods, and reach consensus—will become standard for high-stakes biological reasoning:
- Variant interpretation agents will combine genomic databases (ClinVar, OncoKB) with literature synthesis and structural modeling, achieving concordance with expert panels
- Target validation agents will integrate genomics, proteomics, and phenomics evidence, explicitly representing uncertainty and conflicting data
- Experimental design agents will propose studies with power calculations, potential confounders, and alternative hypotheses
The key innovation is adversarial verification: agents trained to find flaws in each other’s reasoning reduce hallucination and overconfidence. This mirrors peer review, but at computational speed.
Evidence base: Multi-agent debate papers (2024-2025) showed 15-20% improvement in factual accuracy and reasoning quality compared to single-agent systems. The approach is being adopted by companies building scientific AI assistants.
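The propose, critique, revise loop behind adversarial verification can be sketched in a few lines. Everything here is a toy under stated assumptions: the agents are rule-based stand-ins for LLM calls, the evidence tuples are hypothetical, and the critic applies a single rule (reject any label that no evidence source supports).

```python
from collections import Counter

# Hypothetical evidence for one variant: (source, label) pairs.
EVIDENCE = [
    ("database", "pathogenic"),
    ("literature", "pathogenic"),
    ("structure", "uncertain"),
]


def critic(proposal, evidence):
    """Adversarial check: object to any label that no evidence source supports."""
    if proposal not in {label for _, label in evidence}:
        return f"no source supports '{proposal}'"
    return None


def evidence_agent(evidence, transcript):
    """Proposes the most common label in the evidence."""
    return Counter(label for _, label in evidence).most_common(1)[0][0]


def overconfident_agent(evidence, transcript):
    """Starts with an unsupported guess, then revises once the critic objects."""
    if transcript and transcript[-1]["objections"]["overconfident"]:
        return evidence_agent(evidence, transcript)
    return "benign"


def debate(agents, evidence, max_rounds=3):
    """Run rounds until every proposal survives the critic and all agree."""
    transcript = []
    for round_no in range(1, max_rounds + 1):
        proposals = {name: agent(evidence, transcript) for name, agent in agents.items()}
        objections = {name: critic(p, evidence) for name, p in proposals.items()}
        transcript.append(
            {"round": round_no, "proposals": proposals, "objections": objections}
        )
        if not any(objections.values()) and len(set(proposals.values())) == 1:
            return next(iter(proposals.values())), transcript
    return "no consensus", transcript


AGENTS = {"evidence": evidence_agent, "overconfident": overconfident_agent}
verdict, transcript = debate(AGENTS, EVIDENCE)
print(verdict, "after", len(transcript), "rounds")
```

In round one the critic rejects the unsupported "benign" call; in round two the revised proposal survives and the agents agree. The transcript is kept on purpose: in a high-stakes setting the record of objections matters as much as the final label.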
Medium-Term Predictions (2027–2028)
End-to-End Drug Discovery Agents for Specific Target Classes
By 2028, we expect fully autonomous drug discovery agents to compress timelines from years to months for specific, well-characterized target classes:
- GPCRs and kinases—the most studied protein families with abundant structural and SAR data—will see agents that can: identify novel binding pockets → design selective ligands → predict ADMET properties → propose synthetic routes → prioritize candidates for synthesis
- Antibody discovery will be transformed by agents combining structure prediction (AlphaFold 3), affinity maturation (ESM-based models), and developability prediction
- PROTACs and molecular glues will benefit from agents that can model ternary complexes and predict degradation efficiency
This will not be universal: targets without structural data, novel mechanisms of action, and complex phenotypic endpoints will remain challenging. But for validated targets in well-understood classes, the compression will be dramatic.
Evidence base: Isomorphic Labs (an Alphabet company spun out of DeepMind) reported in 2025 that AI-guided programs had identified multiple novel drug candidates entering preclinical development. Insilico Medicine’s AI-discovered ISM001-055 reported positive Phase IIa results (topline data in late 2024, peer-reviewed publication in 2025), supporting the end-to-end approach for at least one indication, pending larger trials.
Multi-Omics Digital Twins of Tumors Guide Treatment Decisions
By 2028, we expect oncology to see the first clinically validated “digital twins”—computational models of individual patients’ tumors that integrate:
- Genomics (somatic mutations, copy number alterations)
- Transcriptomics (gene expression, splicing variants)
- Proteomics (protein abundance, post-translational modifications)
- Metabolomics (metabolic pathway activity)
- Clinical data (imaging, treatment history, outcomes)
These models will simulate tumor evolution under different treatment scenarios, predict resistance mechanisms, and recommend combination therapies. Prospective clinical trials will demonstrate improved progression-free survival for patients treated based on digital twin recommendations versus standard molecular tumor boards.
Evidence base: Tempus and Foundation Medicine already integrate multi-omics data for molecular tumor boards. The computational models exist in research settings (e.g., drug-response predictions built on the Cancer Cell Line Encyclopedia). The gap is prospective validation and regulatory approval for treatment guidance.
Self-Driving Labs for Hit-to-Lead Optimization
By 2028, self-driving laboratories will achieve routine operation for hit-to-lead optimization in drug discovery:
- Automated synthesis platforms will execute 100+ compounds per week, with AI agents designing each iteration based on previous results
- High-throughput screening will be coupled with active learning agents that prioritize the most informative experiments
- Bayesian optimization will guide the exploration-exploitation tradeoff, converging on optimal candidates with fewer experiments than human-led campaigns
This will not replace medicinal chemists but will free them from repetitive synthesis and testing, allowing focus on creative problem-solving and strategic decisions.
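The exploration-exploitation tradeoff can be illustrated with Thompson sampling, a simple Bayesian bandit. Note the substitution: this is a stand-in for full Bayesian optimization, which would typically fit a surrogate model such as a Gaussian process over a continuous chemical space. Here the "arms" are a handful of hypothetical compound series with made-up hit rates, and the assay is simulated.

```python
import random


def thompson_campaign(true_hit_rates, n_rounds, seed=0):
    """Allocate synthesis rounds across compound series with Thompson sampling.

    Each series keeps a Beta(hits + 1, misses + 1) posterior over its hit rate.
    Every round we draw one plausible hit rate per series, make a compound from
    the series with the highest draw, and update that series' posterior.
    """
    rng = random.Random(seed)
    hits = [0] * len(true_hit_rates)
    misses = [0] * len(true_hit_rates)
    for _ in range(n_rounds):
        draws = [rng.betavariate(h + 1, m + 1) for h, m in zip(hits, misses)]
        arm = draws.index(max(draws))
        # Simulated assay outcome; in a real lab this is the experiment itself.
        if rng.random() < true_hit_rates[arm]:
            hits[arm] += 1
        else:
            misses[arm] += 1
    pulls = [h + m for h, m in zip(hits, misses)]
    return pulls, hits


# ASSUMPTION: hypothetical hit rates for three compound series.
TRUE_HIT_RATES = [0.05, 0.15, 0.50]
pulls, hits = thompson_campaign(TRUE_HIT_RATES, n_rounds=500)
print("experiments per series:", pulls)
print("hits per series:", hits)
```

Early rounds are spread across all series because the posteriors are wide (exploration); as evidence accumulates, most of the budget flows to the series that keeps producing hits (exploitation). No one has to set that schedule by hand, which is the appeal for closed-loop labs.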
Evidence base: The Acceleration Consortium reported in 2025 that self-driving labs achieved 10x faster materials discovery compared to traditional approaches. Emerald Cloud Lab and Strateos offer cloud-based automated experimentation. Pharmaceutical companies are piloting similar systems for chemistry.
Federated Learning Enables Privacy-Preserving Multi-Institutional Models
By 2028, federated learning will enable training of omics AI models across institutions without sharing raw data:
- Hospital networks will collaboratively train diagnostic models on genomic and clinical data while maintaining patient privacy
- Pharmaceutical companies will pool preclinical data for target validation without revealing proprietary compounds
- International consortia will build population-diverse models addressing the European-ancestry bias in current genomic databases
Technical challenges (communication overhead, heterogeneous data formats, differential privacy) will be solved. Regulatory frameworks (GDPR, HIPAA) will explicitly permit federated approaches. The result will be more robust, generalizable models trained on orders of magnitude more data than any single institution possesses.
Evidence base: Federated learning frameworks (NVIDIA FLARE, OpenFL) are production-ready. Multiple 2024-2025 papers demonstrated federated training of genomic models achieving performance comparable to centralized training. Regulatory guidance is evolving to accommodate these approaches.
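The aggregation step at the heart of most federated setups, federated averaging (FedAvg), is short enough to show in full. This is a minimal sketch assuming a one-parameter linear model and two hypothetical sites with noiseless toy data; it omits secure aggregation, differential privacy, and everything else a production framework provides.

```python
def local_update(w, data, lr=0.05, epochs=5):
    """A few gradient-descent steps on mean squared error for y ≈ w * x.

    Runs inside the institution; only the updated parameter leaves the site.
    """
    for _ in range(epochs):
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad
    return w


def fed_avg(updates, sizes):
    """FedAvg aggregation: average client parameters weighted by sample count."""
    return sum(w * n for w, n in zip(updates, sizes)) / sum(sizes)


def train(clients, rounds=10):
    """Alternate local training and central averaging, starting from w = 0."""
    w = 0.0
    sizes = [len(data) for data in clients.values()]
    for _ in range(rounds):
        updates = [local_update(w, data) for data in clients.values()]
        w = fed_avg(updates, sizes)
    return w


# Hypothetical, noiseless (x, y) pairs generated from y = 2x at two sites.
CLIENTS = {
    "hospital_a": [(1.0, 2.0), (2.0, 4.0)],
    "hospital_b": [(2.0, 4.0), (3.0, 6.0), (4.0, 8.0)],
}

w_global = train(CLIENTS)
print(f"global slope after federated training: {w_global:.6f}")
```

The coordinator only ever sees one number per site per round, never the patient-level pairs. With real, heterogeneous data the sites' optima differ, which is where the communication and convergence challenges mentioned above come from.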
Long-Term Predictions (2029+)
Whole-Organism Simulation from Multi-Omics Data
By 2029-2030, we expect the first credible whole-organism computational models that integrate multi-omics data to simulate biological states:
- Virtual cell models will simulate metabolism, signaling, and gene regulation from genomic sequence and environmental inputs
- Tissue-level models will predict emergent properties from cell-cell interactions and spatial organization
- Organ-level models will simulate physiology and pharmacokinetics for drug response prediction
This is not “uploading consciousness” or science fiction. It’s the logical extension of genome-scale metabolic models (GEMs), physiologically-based pharmacokinetic (PBPK) models, and multi-scale modeling efforts already underway. The computational cost will remain high, but the scientific value—understanding emergent properties, predicting interventions, reducing animal testing—will justify investment.
Evidence base: Whole-cell models exist for simple organisms (Mycoplasma genitalium). Human cell models are in development (Virtual Cell, Physiome Project). The integration with AI/ML is accelerating progress but remains early-stage.
Personalized Medicine Guided by Individual Multi-Omics Profiles
By 2030, we expect routine multi-omics profiling for individuals at key health milestones:
- Newborns will receive genomic sequencing plus metabolomic screening for inborn errors of metabolism
- Adults will undergo periodic multi-omics assessments (genomics, proteomics, metabolomics, microbiome) to detect early disease signatures
- Cancer patients will receive tumor multi-omics profiling to guide precision therapy selection
AI agents will interpret these profiles, identify risk factors, recommend interventions, and monitor response. The cost will drop below $1,000 for comprehensive profiling, making it accessible in high-income countries and increasingly in middle-income settings.
Evidence base: The UK Biobank, All of Us Research Program, and similar initiatives are building the reference datasets. Multi-omics profiling costs are declining rapidly. The NHS Genomic Medicine Service already offers genomic testing for specific indications. The gap is clinical utility evidence and reimbursement frameworks.
Agentic Systems That Generate and Test Novel Biological Hypotheses
By 2030+, we expect agentic AI systems to move beyond hypothesis testing to hypothesis generation:
- Literature mining agents will identify unexplored connections between biological phenomena
- Computational screening agents will predict novel drug targets, biomarkers, and therapeutic mechanisms
- Experimental design agents will propose and execute studies to test these predictions
This is the ultimate vision of agentic omics: AI systems that don’t just accelerate existing workflows but discover entirely new biological insights. The role of human scientists will shift from hypothesis generation to hypothesis curation, experimental oversight, and interpretation of results.
Evidence base: Coscientist (2023) demonstrated autonomous planning and execution of chemistry experiments. BioAgent and similar systems are emerging for biology. The technology is nascent but progressing rapidly.
What Remains Hard: The Stubborn Challenges
Despite extraordinary progress, several challenges will persist through 2029 and beyond:
Causality: Correlation ≠ Mechanism
AI models excel at pattern recognition but struggle with causal inference. A model can predict that mutation X is associated with disease Y without understanding the mechanistic pathway. This matters for:
- Drug target validation: Is the target causally involved in disease, or just correlated?
- Biomarker discovery: Is the biomarker driving pathology or a downstream consequence?
- Intervention design: Will modulating the target have the intended effect, or will compensatory mechanisms blunt efficacy?
Causal inference methods (instrumental variables, Mendelian randomization, perturbation-based approaches) are improving but remain fundamentally limited by data quality and experimental design. AI can accelerate causal discovery but cannot replace the need for well-designed experiments.
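A small simulation shows why a strong association says nothing about mechanism. In the sketch below a hidden factor Z drives both a biomarker X and a disease score Y, while X has no effect on Y at all. The variable names, noise levels, and sample size are illustrative assumptions.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def simulate(n, intervene, seed=0):
    """Correlation between biomarker X and disease score Y.

    Hidden confounder Z drives both X and Y; X has NO causal effect on Y.
    With intervene=True, X is set by randomization, cutting the Z -> X arrow.
    """
    rng = random.Random(seed)
    xs, ys = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        x = rng.gauss(0, 1) if intervene else z + rng.gauss(0, 0.5)
        y = z + rng.gauss(0, 0.5)
        xs.append(x)
        ys.append(y)
    return pearson(xs, ys)


r_observational = simulate(5000, intervene=False)
r_randomized = simulate(5000, intervene=True)
print(f"observational correlation: {r_observational:.2f}")
print(f"correlation under randomized X: {r_randomized:.2f}")
```

The observational correlation comes out near 0.8 (the theoretical value for these noise levels), yet it vanishes once X is assigned at random. A predictive model trained on the observational data would rank X as a top feature; only the perturbation experiment reveals that targeting it would do nothing.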
Biological Complexity: The Whole Exceeds the Parts
Biological systems exhibit emergent properties that cannot be predicted from individual components:
- Protein conformational dynamics: AlphaFold predicts static structures well but struggles with conformational ensembles and allostery
- Cell-cell communication: Single-cell models capture individual cell states but not the emergent properties of tissue-level interactions
- Microbiome-host interactions: Metagenomic models predict taxonomic composition but not functional consequences for host physiology
Multi-scale modeling and agentic systems that integrate across levels of organization will help, but the combinatorial complexity of biological systems means perfect prediction will remain elusive.
Regulatory Frameworks: Innovation Outpaces Governance
AI in biology is advancing faster than regulatory frameworks can adapt:
- FDA approval pathways for AI-based diagnostics and therapeutics are still evolving
- Liability frameworks for AI-driven clinical decisions are undefined
- International harmonization is lacking, creating barriers to global deployment
The EU AI Act (in force since August 2024, with most obligations scheduled to apply from August 2026) and FDA guidance are starting points, but the complexity of agentic omics, where AI systems make autonomous decisions across multiple steps, will require new regulatory paradigms.
Data Access and Equity: Who Benefits?
The benefits of agentic omics will not be evenly distributed:
- Population bias: around 80% or more of GWAS participants are of European ancestry, limiting model generalizability
- Resource disparities: High-income countries and institutions will adopt agentic omics first, widening health disparities
- Data sovereignty: Indigenous and underrepresented communities may not benefit from research using their genomic data
Initiatives like All of Us, H3Africa, and GenomeAsia 100K are addressing representation, but progress is slow. Equitable access to agentic omics will require intentional policy and investment.
Computational Cost: The Energy and Resource Challenge
Training and deploying large biological AI models requires substantial compute:
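- Evo (7B parameters, 300B nucleotides) needed on the order of 10^22 floating-point operations to train by the standard 6 × parameters × tokens rule of thumb, before counting ablations and failed runs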
- AlphaFold 3 inference is computationally intensive for large complexes
- Agentic workflows that orchestrate multiple models multiply the cost
This creates barriers for smaller institutions and researchers in low-resource settings. Efficient models, distillation, and shared infrastructure will help, but the computational demands of cutting-edge agentic omics will remain significant.
The Human Scientist: Irreplaceable (For Now)
Throughout this series, we’ve emphasized that agentic omics amplifies rather than replaces human scientists. Let’s be explicit about what remains uniquely human:
Asking the right questions. AI can generate hypotheses, but the most important scientific questions often come from intuition, serendipity, and deep domain expertise. The question “What if proteins could be designed de novo?” preceded AlphaFold and RFdiffusion.
Ethical judgment. Decisions about clinical trial design, patient selection, risk-benefit tradeoffs, and equitable access require moral reasoning that AI cannot provide.
Creative synthesis. Connecting ideas across disciplines, recognizing analogies, and reframing problems are fundamentally human capabilities. AI can assist but not originate.
Accountability. When AI-driven decisions have consequences—failed clinical trials, adverse events, misdiagnoses—humans must be accountable. The buck stops with people, not algorithms.
The future of biological discovery is not human versus AI; it’s human with AI. The scientists who thrive will be those who learn to collaborate effectively with agentic systems, leveraging their strengths while applying human judgment where it matters most.
Call to Action: Building the Future Responsibly
As we look to the road ahead, we offer these recommendations for researchers, clinicians, and policymakers:
For researchers:
- Start with well-defined agentic workflows. Don’t boil the ocean. Pick a specific task (literature review, variant interpretation, experimental design) and build a robust agent for it.
- Measure rigorously. Benchmark against baselines, report failure modes, and publish negative results. The field needs honest assessment, not hype.
- Share openly. Open-source models, datasets, and tools accelerate progress for everyone. The AlphaFold 2 release transformed structural biology; emulate that generosity.
For clinicians:
- Engage early. AI tools will enter your workflow whether you’re ready or not. Shape their development to ensure they meet clinical needs.
- Demand evidence. Computational benchmarks are not clinical utility. Insist on prospective validation in relevant patient populations.
- Maintain skepticism. AI can assist but not replace clinical judgment. The patient in front of you is not a data point.
For policymakers:
- Invest in infrastructure. Compute, data, and talent are the foundations of agentic omics. Public investment will ensure broad access.
- Update regulations. Existing frameworks were designed for static software, not autonomous agents. Adapt to enable innovation while protecting patients.
- Prioritize equity. Ensure that the benefits of agentic omics reach underrepresented populations and low-resource settings.
Conclusion: A Decade of Transformation
The past two years have shown us what’s possible. The next decade will show us what’s practical.
Agentic omics will not solve all of biology’s challenges. It will not replace the creativity, intuition, and ethical judgment of human scientists. But it will accelerate discovery, democratize access to sophisticated tools, and enable questions that were previously unanswerable.
The road ahead is not predetermined. It will be shaped by the choices we make today: what we build, how we validate it, who has access, and what values guide development. Let’s build a future where agentic omics serves humanity—advancing health, expanding knowledge, and reducing suffering.
The map is drawn. The journey begins now.
Glossary
| Term | Definition |
|---|---|
| Agentic AI | AI systems that autonomously plan, reason, use tools, and execute multi-step workflows with minimal human intervention |
| Digital Twin | A computational model of a biological system (cell, tissue, organ, or patient) that simulates behavior under different conditions |
| Federated Learning | A distributed machine learning approach where models are trained across multiple institutions without sharing raw data |
| Foundation Model | A large AI model trained on broad data that can be adapted to many downstream tasks via fine-tuning or prompting |
| Multi-Omics | The integrated analysis of multiple omics layers (genomics, transcriptomics, proteomics, metabolomics, etc.) |
| Self-Driving Laboratory | An automated experimental platform where AI agents design experiments, control robots, analyze results, and iterate |
| Single-Cell Multi-Omics | Technologies that measure multiple molecular layers (RNA, protein, chromatin) simultaneously in individual cells |
References
- Abramson, J., et al. “Accurate structure prediction of biomolecular interactions with AlphaFold 3.” Nature (2024).
- Bran, A. M., et al. “Augmenting large language models with chemistry tools.” Nature Machine Intelligence (2024).
- Boiko, D. A., et al. “Autonomous chemical research with large language models.” Nature (2023).
- Cui, H., et al. “scGPT: toward building a foundation model for single-cell multi-omics using generative AI.” Nature Methods (2024).
- Dalla-Torre, H., et al. “The Nucleotide Transformer: Building and evaluating robust foundation models for human genomics.” bioRxiv (2023).
- Nguyen, E., et al. “Sequence modeling and design from molecular to genome scale with Evo.” Science (2024).
- Singh, R. “AI in drug discovery: predictions for 2026.” Drug Target Review (2026).
- Theodoris, C. V., et al. “Transfer learning enables predictions in network biology.” Nature (2023).
- Zhou, Z., et al. “DNABERT-2: Efficient foundation model and benchmark for multi-species genome understanding.” ICLR (2024).
- NVIDIA. “NVIDIA GTC 2026: Agentic AI Inflection Hits Healthcare and Life Sciences.” GEN Edge (2026).
- “Industry Leaders Predict Life Science Trends for 2026.” The Scientist (2026).
- Martin, A. R., et al. “Clinical use of current polygenic risk scores may exacerbate health disparities.” Nature Genetics (2019).
- “AI-driven multi-omics integration in precision oncology.” PMC (2025).
- “Agentic AI for Scientific Discovery: A Survey.” arXiv (2025).
- FDA. “AI/ML-Based Software as a Medical Device Action Plan.” (2025).
This concludes the Agentic Omics series. Thank you for reading. We welcome your feedback, corrections, and suggestions for future topics.