Introduction: The Closed Loop of Discovery

For centuries, the scientific method has followed a familiar rhythm: a human scientist observes a phenomenon, formulates a hypothesis, designs an experiment, executes it manually or with basic automation, analyses the results, and iterates. This cycle — hypothesis, experiment, analysis, refinement — is the engine of scientific progress. But it’s also a bottleneck. Each iteration takes days, weeks, or months. Human bandwidth limits the search space we can explore. And crucially, the loop is open: the scientist must close it manually, bringing their intuition and experience to bear at every step.

Self-driving laboratories (SDLs) promise to close this loop. By integrating AI agents with robotic laboratory automation, SDLs can autonomously design experiments, execute them on robotic platforms, analyse the results, and use those results to inform the next round of experimentation — all with minimal human intervention. The human provides the goal (“find a catalyst for reaction X,” “optimize this enzyme for thermostability,” “identify compounds that inhibit this target”), and the SDL handles the rest.

This is not science fiction. Self-driving labs are operational today in chemistry, materials science, and increasingly in biology. The Acceleration Consortium at the University of Toronto has deployed multiple SDLs that have discovered novel materials and optimized chemical processes. Emerald Cloud Lab and Strateos offer cloud-accessible robotic laboratories that can be programmed remotely. In 2025, a comprehensive review in Royal Society Open Science documented that “today’s most capable SDLs automate nearly the entire scientific method, from hypothesis generation, experimental design, experiment execution and data analysis, to drawing conclusions and updating hypotheses for subsequent rounds of optimization or discovery” (Royal Society Open Science, 2025).

But biology is harder than chemistry. Biological systems are noisier, more complex, and less predictable. The wet lab variability that biochemists accept as routine — pipetting errors, cell culture contamination, reagent batch effects — poses significant challenges for autonomous systems. This post examines where self-driving labs stand today, why biological SDLs lag behind their chemistry counterparts, and what it will take to bring agentic omics into the physical laboratory.


What Is a Self-Driving Laboratory?

A self-driving laboratory is an integrated system combining three core components:

  1. AI Planning and Decision-Making: An intelligent agent (often using Bayesian optimization, active learning, or reinforcement learning) that decides which experiments to run next based on current knowledge and uncertainty.

  2. Laboratory Automation: Robotic hardware — liquid handlers, plate readers, reactors, sequencers, microscopes — that can execute experiments without human intervention.

  3. Data Integration and Analysis: Pipelines that automatically process experimental results, extract relevant features, and feed them back to the AI planner to close the loop.

The architecture forms a closed loop: the AI proposes an experiment → the robot executes it → sensors measure the outcome → data is processed → the AI updates its model and proposes the next experiment. This cycle can run 24/7, exploring the experimental space far more rapidly than a human-led lab.
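
A minimal sketch of this loop in Python, with a simulated experiment standing in for the robot and sensors. The function names, the single temperature parameter, and the 37 °C optimum are illustrative assumptions, and the planner is a deliberately simple "perturb the best result so far" heuristic rather than any particular SDL's algorithm:

```python
import random


def run_experiment(temperature_c, rng):
    """Stand-in for robot + sensors: a noisy yield that peaks near 37 °C (made up)."""
    return -(temperature_c - 37.0) ** 2 + rng.gauss(0.0, 0.5)


def propose(history, rng, low, high, step):
    """Stand-in planner: sample randomly at first, then perturb the best condition."""
    if len(history) < 3:
        return rng.uniform(low, high)
    best_x, _ = max(history, key=lambda pair: pair[1])
    candidate = best_x + rng.uniform(-step, step)
    return min(high, max(low, candidate))  # keep the proposal inside the allowed range


def closed_loop(n_rounds, seed=0):
    """Run propose -> execute -> measure -> update for a fixed number of rounds."""
    rng = random.Random(seed)
    history = []  # the planner's "model" here is just the list of past results
    for _ in range(n_rounds):
        x = propose(history, rng, low=20.0, high=50.0, step=2.0)  # AI proposes
        y = run_experiment(x, rng)                                # robot executes, sensors measure
        history.append((x, y))                                    # result fed back to the planner
    return max(history, key=lambda pair: pair[1])


best_temperature, best_yield = closed_loop(n_rounds=40)
print(f"best condition found: {best_temperature:.1f} °C")
```

In a real SDL, `run_experiment` would dispatch a protocol to hardware and wait for processed data, and `propose` would be backed by a surrogate model rather than a list of past results.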

The term “self-driving” is borrowed from autonomous vehicles, and the analogy is instructive. Just as a self-driving car perceives its environment, plans a route, and executes driving actions, an SDL perceives experimental state, plans the next measurement, and executes laboratory actions. Both require robust sensing, reliable actuation, and intelligent decision-making under uncertainty. Both also raise questions about safety, oversight, and the appropriate level of human involvement.

In the literature, these systems are also called “robot scientists,” “AI scientists,” or “autonomous laboratories.” The Acceleration Consortium, a leading global network devoted to autonomous science, uses “self-driving labs” and “SDLs” throughout its communications, and this terminology is now widely used (Royal Society Open Science, 2025; Acceleration Consortium, 2025).


Self-Driving Labs Today: Chemistry and Materials Science Lead

The most mature self-driving laboratories are in chemistry and materials science, where experimental conditions are more controlled and outcomes are more predictable.

The Acceleration Consortium

The Acceleration Consortium at the University of Toronto operates multiple self-driving labs focused on materials discovery and chemical synthesis. In January 2025, Nature highlighted SDLs as one of the “technologies to watch in 2025,” specifically featuring the Acceleration Consortium’s work (Nature, 2025). Their systems combine:

  • Bayesian optimization engines that efficiently navigate high-dimensional experimental spaces
  • Robotic platforms for chemical synthesis, purification, and characterization
  • Automated analytics including HPLC, mass spectrometry, and spectroscopy

In 2025, the Consortium announced new funding from the Natural Sciences and Engineering Research Council of Canada to transform chemical purification processes using self-driving labs. Two SDLs are specifically targeting liquid-liquid extraction in chemical synthesis — a ubiquitous but notoriously difficult optimization problem (Acceleration Consortium, 2025).

A widely used tool in this space is BayBE (Bayesian Back End), an open-source Bayesian optimization engine developed by Merck KGaA, which has collaborated with the Acceleration Consortium on its further development. BayBE assists in identifying the most promising experiments to conduct next, reducing the number of experiments needed to reach an optimal solution (Scispot, 2025).

Success Stories

Self-driving labs have achieved notable successes:

  • Materials Discovery: SDLs have discovered novel photocatalysts, battery materials, and metal-organic frameworks (MOFs) with properties superior to those found through traditional screening.

  • Chemical Synthesis Optimization: Automated systems have optimized reaction conditions (temperature, pressure, catalyst loading, solvent ratios) for complex organic syntheses, achieving higher yields and selectivity than manual optimization.

  • Process Scale-Up: Some SDLs can transition from milligram-scale discovery to gram-scale synthesis automatically, bridging the gap between laboratory discovery and practical application.

A 2025 review in Nature Communications emphasized that “collaborative self-driving research is crucial to research acceleration amidst ever more complex problems,” identifying key challenges in scaling SDLs from single-lab curiosities to networked research infrastructure (Nature Communications, 2025).


The Biology Gap: Why Self-Driving Bio Labs Are Harder

If self-driving labs are succeeding in chemistry, why aren’t they everywhere in biology? The answer lies in the fundamental differences between chemical and biological systems.

Biological Complexity

Chemical reactions are governed by well-understood physical laws. Given reactants, conditions, and a mechanism, a chemist can often predict the outcome with reasonable accuracy. Biological systems are vastly more complex:

  • Non-linearity: Small changes in conditions can have disproportionate effects. A shift of a few degrees that only modestly changes the rate of a chemical reaction can trigger a heat-shock response in, or kill, a mammalian cell culture.

  • Emergent Properties: Cellular behavior emerges from networks of interactions that are not fully understood. Predicting how a genetic perturbation will affect a cell remains challenging even with sophisticated models.

  • Context Dependence: The same molecule can have different effects in different cell types, different organisms, or even the same organism under different conditions.

Wet Lab Variability

Biological experiments are inherently noisy:

  • Pipetting Errors: Manual liquid handling introduces variability that compounds across multi-step protocols.

  • Cell Culture Contamination: Mycoplasma, bacteria, or fungi can ruin weeks of work. Automated systems must detect and respond to contamination in real-time.

  • Reagent Batch Effects: Different lots of serum, enzymes, or antibodies can behave differently. Biological reagents are not as standardized as chemical reagents.

  • Biological Variability: Even genetically identical cells show heterogeneity in gene expression, metabolism, and behavior. This “biological noise” is a feature, not a bug — but it complicates automation.

A 2025 review in Royal Society Open Science noted these challenges explicitly: “The biology gap: why self-driving bio labs are harder than self-driving chemistry labs” remains a central challenge, citing “biological complexity, reproducibility, and wet lab variability” as key factors (Royal Society Open Science, 2025).

Measurement Challenges

In chemistry, analytical instruments like HPLC and mass spectrometry provide quantitative, reproducible measurements. In biology:

  • Assay Complexity: Many biological assays are multi-step, requiring incubation, washing, and detection steps that are difficult to automate reliably.

  • Endpoint vs. Kinetic: Some biological processes must be measured at specific timepoints; others require continuous monitoring. Scheduling these measurements autonomously is non-trivial.

  • Destructive Measurements: Many assays consume the sample, preventing re-measurement. This places a premium on getting the measurement right the first time.


Emerging Biological Self-Driving Labs

Despite these challenges, biological SDLs are emerging. The most advanced applications are in areas where the experimental space is well-defined and assays are robust.

Protein Engineering and Directed Evolution

Protein engineering is a natural fit for self-driving labs. The experimental workflow is relatively standardized: design variants → express proteins → assay function → iterate. Several groups have demonstrated closed-loop systems:

Active Learning-Assisted Directed Evolution: A 2025 study in Nature Communications demonstrated an active learning workflow using machine learning to efficiently explore protein fitness landscapes (Nature Communications, 2025). The system:

  1. Starts with a parent protein sequence
  2. Uses a machine learning model to predict which mutations are most likely to improve the target property
  3. Synthesizes and assays those variants
  4. Updates the model with the results
  5. Repeats until the target is achieved

The key insight is that active learning allows the system to identify functional mutations from minimal data, dramatically reducing the number of variants that must be experimentally tested. As one 2025 review noted: “The integration of predictive modeling, active learning, and automated experimentation yields three interlocking effects. First, the exploration of protein fitness landscapes is dramatically accelerated: active learning allows the efficient identification of functional mutations from minimal data” (Frontiers in Microbiology, 2025).
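
The five steps above can be sketched as a toy loop. Everything here is a simplified assumption for illustration: the assay is simulated against a made-up hidden optimum, and the "machine learning model" is a crude additive score per (position, residue) with an optimistic default for unseen residues. It is not the method used in the cited study.

```python
import random
from collections import defaultdict

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
HIDDEN_OPTIMUM = "MKVLAG"  # made-up target, known only to the simulated assay


def assay(seq):
    """Simulated wet-lab measurement: fraction of positions matching the hidden optimum."""
    return sum(a == b for a, b in zip(seq, HIDDEN_OPTIMUM)) / len(HIDDEN_OPTIMUM)


def single_mutants(seq):
    """Yield every sequence that differs from seq at exactly one position."""
    for i in range(len(seq)):
        for aa in AMINO_ACIDS:
            if aa != seq[i]:
                yield seq[:i] + aa + seq[i + 1:]


def predict(seq, effects):
    """Additive surrogate: sum of mean observed fitness for each (position, residue)."""
    score = 0.0
    for i, aa in enumerate(seq):
        observed = effects[(i, aa)]
        # Unseen residues get an optimistic 1.0 so the loop keeps exploring them.
        score += sum(observed) / len(observed) if observed else 1.0
    return score


def evolve(parent, rounds, batch_size, seed=0):
    rng = random.Random(seed)
    effects = defaultdict(list)  # (position, residue) -> fitness values seen so far
    tested = {}                  # sequence -> measured fitness

    def record(seq):
        fitness = assay(seq)                      # step 3: synthesize and assay
        tested[seq] = fitness
        for i, aa in enumerate(seq):
            effects[(i, aa)].append(fitness)      # step 4: update the model

    record(parent)                                # step 1: start from the parent
    best = parent
    for _ in range(rounds):                       # step 5: repeat
        pool = [s for s in single_mutants(best) if s not in tested]
        if not pool:
            break
        rng.shuffle(pool)                         # random tie-breaking
        pool.sort(key=lambda s: predict(s, effects), reverse=True)  # step 2: rank variants
        for seq in pool[:batch_size]:
            record(seq)
        best = max(tested, key=tested.get)
        if tested[best] == 1.0:
            break
    return best, tested[best], len(tested)


best, fitness, n_tested = evolve("AAAAAA", rounds=10, batch_size=8)
print(f"best variant {best} with fitness {fitness:.2f} after {n_tested} assays")
```

The point of the sketch is the budget: each round assays only `batch_size` variants chosen by the model, instead of every possible single mutant of the current best sequence.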

AI-Native Biofoundries: A February 2026 preprint described “An AI-Native Biofoundry for Autonomous Enzyme Engineering: Integrating Active Learning with Automated Experimentation” (bioRxiv, 2026). This system integrates:

  • Computational design tools for generating variant libraries
  • Automated cloning and protein expression
  • High-throughput screening assays
  • Machine learning models that guide the next round of design

The authors report successful optimization of enzymes for industrial applications, with the autonomous system identifying superior variants in fewer iterations than traditional directed evolution.

Metabolic Engineering

Metabolic engineering — redesigning cellular metabolism to produce desired compounds — is another area where SDLs are gaining traction:

Automated Strain Construction: Systems that automatically assemble genetic constructs, transform cells, and screen colonies are becoming more common. Companies like Transcriptic (now part of Strateos) and Ginkgo Bioworks have built platforms that automate much of the strain engineering workflow.

Closed-Loop Optimization: A 2025 PubMed-indexed review noted that “synthetic biology is rapidly evolving through the integration of artificial intelligence (AI) and automated biofoundries. This convergence accelerates the design-build-test-learn cycle, shifting protein engineering and metabolic engineering” toward autonomous operation (PubMed, 2025).

High-Throughput Screening

Pharmaceutical and biotech companies have long used automated systems for high-throughput screening (HTS) of compound libraries. The next generation adds AI-driven decision-making:

Adaptive Screening: Instead of screening all compounds in a library, AI models can prioritize compounds likely to be active, then adaptively refine the search based on initial results. This “smart screening” approach can reduce costs and time while maintaining or improving hit rates.

Phenotypic Screening with AI Analysis: High-content imaging systems generate massive datasets of cellular phenotypes. AI models can automatically classify phenotypes, identify subtle effects, and cluster compounds by mechanism of action — all without human intervention.

Cloud Laboratories

Cloud laboratories such as Emerald Cloud Lab and Strateos, mentioned in the introduction, let researchers design experiments through web interfaces or APIs and have robotic systems execute them remotely. While not fully autonomous (humans still design the experiments), these platforms represent a step toward SDLs by:

  • Standardizing experimental protocols
  • Providing reliable, reproducible execution
  • Generating structured data suitable for AI analysis
  • Enabling remote operation 24/7

A 2025 article on autonomous labs for biotech research made a related point about scale, describing two automated platforms, AMP2 and M2PC: “in both AMP2 and M2PC, the level of modular and R&D-focused automation and integration with AI will allow the DoE and the scientific community to perform experiments to learn more in ways, and on a scale, that we simply cannot do using today’s methods” (Society of Chemical Industry, 2025).


The Agentic Layer: AI Orchestration for Self-Driving Labs

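What transforms an automated laboratory into a self-driving laboratory is the AI agent that orchestrates the workflow. This agent has four main responsibilities: designing experiments, handling errors, representing what it has learned, and working with human supervisors.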

Experimental Design

The agent must decide which experiments to run next. Common approaches include:

Bayesian Optimization: This probabilistic approach models the relationship between experimental parameters and outcomes, then uses acquisition functions to balance exploration (trying uncertain conditions) and exploitation (refining known good conditions). BayBE, mentioned earlier, is an example of this approach.
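
As a hedged illustration of how an acquisition function trades off exploration and exploitation, the sketch below implements expected improvement (one common choice, for a maximization problem). It assumes the surrogate model returns a Gaussian prediction (mean and standard deviation) for each candidate. The candidate names and numbers are made up, and this is not a description of BayBE's internals.

```python
import math


def expected_improvement(mean, std, best_so_far):
    """Expected improvement over best_so_far for a prediction N(mean, std^2)."""
    if std <= 0.0:
        # No uncertainty: the improvement is known exactly.
        return max(mean - best_so_far, 0.0)
    z = (mean - best_so_far) / std
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)  # standard normal density
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))         # standard normal CDF
    return (mean - best_so_far) * cdf + std * pdf


best_so_far = 0.80  # best yield observed so far (illustrative)
candidates = {
    "refine known optimum": (0.82, 0.01),  # slightly better mean, almost no uncertainty
    "untested region": (0.70, 0.20),       # worse mean, large uncertainty
}
scores = {
    name: expected_improvement(mean, std, best_so_far)
    for name, (mean, std) in candidates.items()
}
next_experiment = max(scores, key=scores.get)
print(next_experiment)
```

With these illustrative numbers the uncertain candidate scores higher even though its predicted mean is lower, which is the exploration side of the trade-off. Shrinking its standard deviation would tip the choice back toward exploitation.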

Active Learning: The agent identifies experiments that will most reduce uncertainty in its model. This is particularly effective when experiments are expensive and the search space is large.

Reinforcement Learning: The agent learns a policy for selecting experiments through trial and error, maximizing a reward signal (e.g., improvement in the target property).

Multi-Objective Optimization: Many biological problems involve trade-offs (e.g., enzyme activity vs. stability, drug potency vs. toxicity). Multi-objective optimization algorithms can identify Pareto-optimal solutions that balance competing objectives.
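
To make Pareto-optimality concrete, here is a small sketch that filters a set of enzyme variants down to the non-dominated ones, assuming both objectives are to be maximized. The variant names and measurements are invented for illustration.

```python
def dominates(a, b):
    """True if a is at least as good as b on every objective and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(scores):
    """Return the names of all variants that no other variant dominates."""
    return [
        name
        for name, point in scores.items()
        if not any(dominates(other, point) for other in scores.values())
    ]


# (relative activity, melting temperature in °C); both are maximized. Made-up values.
variants = {
    "WT": (1.0, 50.0),
    "V1": (1.8, 46.0),
    "V2": (1.5, 44.0),
    "V3": (0.9, 58.0),
    "V4": (1.2, 55.0),
}
print(pareto_front(variants))
```

Here V2 is dominated by V1, and the wild type is dominated by V4. The remaining variants (V1, V3, V4) each represent a different activity/stability trade-off, and choosing among them is a judgment call rather than an optimization result.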

Error Handling and Recovery

Biological experiments fail. A self-driving lab must detect failures and respond appropriately:

  • Anomaly Detection: Statistical methods and machine learning can identify experiments that deviate from expected patterns, flagging potential errors.

  • Automatic Retries: For transient failures (e.g., a pipetting error), the system can automatically repeat the experiment.

  • Protocol Adaptation: If a particular step consistently fails, the agent might modify the protocol (e.g., change incubation times, adjust reagent concentrations).

  • Human Escalation: For failures the system cannot resolve, it should alert human operators with sufficient context to diagnose the problem.
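
A minimal sketch of how anomaly detection, automatic retries, and human escalation might fit together (protocol adaptation is omitted for brevity). The `TransientError` class, function names, and result format are hypothetical. The anomaly check uses the modified z-score based on the median and median absolute deviation, which is one common robust statistic, not the only option.

```python
import statistics


class TransientError(Exception):
    """Hypothetical error type for failures worth retrying (e.g., a failed tip pickup)."""


def is_anomalous(value, reference, threshold=3.5):
    """Flag a reading whose modified z-score against reference readings exceeds threshold."""
    median = statistics.median(reference)
    mad = statistics.median([abs(v - median) for v in reference])
    if mad == 0:
        return value != median
    return abs(0.6745 * (value - median) / mad) > threshold


def run_with_recovery(step, reference, max_retries=2):
    """Run a step, retrying on transient errors or anomalous readings, then escalate."""
    problems = []
    for attempt in range(1, max_retries + 2):
        try:
            reading = step()
        except TransientError as exc:
            problems.append(f"attempt {attempt}: {exc}")
            continue
        if not is_anomalous(reading, reference):
            return {"status": "ok", "reading": reading, "attempts": attempt}
        problems.append(f"attempt {attempt}: anomalous reading {reading}")
    # Out of retries: hand the collected context to a human operator.
    return {"status": "escalate", "problems": problems}


reference = [0.98, 1.00, 1.02, 1.01, 0.99]  # e.g., control-well readings (made up)

outcomes = iter([TransientError("tip pickup failed"), 1.0])


def flaky_read():
    """Simulated step that fails once, then returns a normal reading."""
    outcome = next(outcomes)
    if isinstance(outcome, Exception):
        raise outcome
    return outcome


recovered = run_with_recovery(flaky_read, reference)
escalated = run_with_recovery(lambda: 5.0, reference)
print(recovered["status"], escalated["status"])
```

The escalation result carries the list of problems seen on each attempt, so the operator gets the context needed for diagnosis instead of a bare failure flag.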

Knowledge Representation

The agent must maintain a representation of what it has learned:

  • Experimental Provenance: Every result must be traceable to specific conditions, reagents, and instruments. This is critical for reproducibility and for learning from failures.

  • Literature Integration: Agents can incorporate knowledge from the scientific literature, using it to inform experimental design or interpret results.

  • Cross-Experiment Learning: Insights from one project can inform others. A model of protein expression learned in one context might transfer to related proteins.
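
One way to make provenance concrete is an immutable record attached to every result. The sketch below is an assumption about what such a record could contain. The field names and example values are hypothetical and do not follow any particular LIMS or SDL schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class ExperimentRecord:
    """A result plus the conditions, reagent lots, and instruments that produced it."""
    experiment_id: str
    protocol_version: str
    conditions: dict      # e.g., {"temperature_c": 37.0}
    reagent_lots: dict    # reagent name -> lot number
    instrument_ids: list  # instruments that touched the sample
    result: dict          # processed measurements
    recorded_at: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self):
        return json.dumps(asdict(self), sort_keys=True)

    @classmethod
    def from_json(cls, text):
        return cls(**json.loads(text))


record = ExperimentRecord(
    experiment_id="exp-0001",
    protocol_version="v3",
    conditions={"temperature_c": 37.0, "incubation_min": 20},
    reagent_lots={"serum": "LOT-A17"},
    instrument_ids=["liquid-handler-1", "plate-reader-2"],
    result={"viability": 0.91},
)
restored = ExperimentRecord.from_json(record.to_json())
print(restored == record)
```

Recording reagent lots and instrument identifiers alongside each result is what lets a later analysis separate a real biological effect from a batch or instrument effect.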

Human-in-the-Loop

Full autonomy is not always desirable. Self-driving labs should support various levels of human involvement:

  • Human-on-the-Loop: Humans monitor the system and can intervene if needed, but the system operates autonomously by default.

  • Human-in-the-Loop: The system proposes experiments, but humans approve them before execution. This is appropriate for high-stakes or novel applications.

  • Human-Guided: Humans set goals and constraints, and the system operates autonomously within those boundaries.

The appropriate level of human involvement depends on the application, the maturity of the technology, and regulatory requirements.
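
The three oversight modes can be expressed as a simple gate in front of the robot. This is a sketch under stated assumptions: the enum, function names, and parameters are hypothetical, and applying the human-set constraints in every mode is a design choice made here, not something the modes themselves require.

```python
from enum import Enum


class Oversight(Enum):
    HUMAN_IN_THE_LOOP = "a human approves each experiment before execution"
    HUMAN_ON_THE_LOOP = "runs by default; a human may halt the system"
    HUMAN_GUIDED = "runs autonomously inside human-set constraints"


def within_constraints(proposal, constraints):
    """Check every constrained parameter against its (low, high) bounds."""
    return all(low <= proposal[name] <= high for name, (low, high) in constraints.items())


def may_execute(mode, proposal, constraints, approved=False, halted=False):
    """Decide whether a proposed experiment may be sent to the robot."""
    if not within_constraints(proposal, constraints):
        return False  # assumption: hard limits apply regardless of mode
    if mode is Oversight.HUMAN_IN_THE_LOOP:
        return approved
    if mode is Oversight.HUMAN_ON_THE_LOOP:
        return not halted
    return True  # HUMAN_GUIDED: constraints were the only gate


constraints = {"temperature_c": (30.0, 42.0), "incubation_min": (5.0, 60.0)}
proposal = {"temperature_c": 37.0, "incubation_min": 20.0}

print(may_execute(Oversight.HUMAN_IN_THE_LOOP, proposal, constraints))
print(may_execute(Oversight.HUMAN_GUIDED, proposal, constraints))
```

Keeping the mode as an explicit setting makes it straightforward to start a new application in the most conservative mode and relax it as the system earns trust.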


Cost and Accessibility: Who Can Afford Self-Driving Labs?

Self-driving laboratories are expensive. The robotic hardware alone can cost hundreds of thousands to millions of dollars. Adding AI software, data infrastructure, and maintenance pushes the total cost higher. This raises questions about equity and access.

Current Landscape

Large Institutions and Companies: Most SDLs are in well-funded academic institutions (like the Acceleration Consortium) or large pharmaceutical and biotech companies. These organizations can afford the capital investment and have the technical expertise to operate and maintain the systems.

Cloud Laboratories: Services like Emerald Cloud Lab and Strateos offer a different model: instead of buying hardware, researchers pay per experiment. This reduces the barrier to entry but still requires significant budgets for sustained use.

Open-Source Efforts: Some groups are developing open-source hardware and software for laboratory automation. The Opentrons platform, for example, offers relatively affordable liquid handling robots that can be integrated into SDLs. The Acceleration Consortium maintains an “awesome-self-driving-labs” repository on GitHub, curating community resources (GitHub, 2025).

The Accessibility Challenge

A 2025 review in Royal Society Open Science raised concerns about accessibility: “Cost and accessibility: who can afford self-driving labs?” remains an open question. If SDLs dramatically accelerate discovery, but only wealthy institutions can afford them, the technology could exacerbate existing inequalities in scientific research.

Potential solutions include:

  • Shared Facilities: Centralized SDL facilities that multiple institutions can access, similar to synchrotrons or supercomputing centers.

  • Cloud-Based Access: Expanding the cloud laboratory model to make SDL capabilities available on a pay-per-use basis.

  • Modular, Affordable Systems: Developing lower-cost SDL components that smaller labs can afford and integrate incrementally.

  • Open-Source Software: Making the AI orchestration software freely available, even if the hardware remains expensive.


Case Study: A Self-Driving Omics Workflow

To make the vision concrete, consider a self-driving laboratory for single-cell multi-omics optimization:

Goal: Optimize a protocol for simultaneous measurement of RNA, protein, and chromatin accessibility in the same single cells (a challenging multi-omics assay).

Initial State: The lab has a starting protocol based on published methods, but it yields low cell viability and poor signal-to-noise for some modalities.

The SDL Workflow:

  1. Parameter Space Definition: The agent defines the experimental parameters to optimize: cell lysis buffer composition, incubation times, antibody concentrations, transposase loading, etc. — perhaps 20 parameters total.

  2. Initial Experiments: The agent runs a designed experiment (e.g., a fractional factorial design) to explore the parameter space and build an initial model.

  3. Bayesian Optimization Loop:

    • The agent uses the results to build a probabilistic model of how parameters affect outcomes (cell viability, RNA counts, protein signal, chromatin peaks).
    • It identifies the next set of conditions most likely to improve the overall protocol quality.
    • The robotic system prepares samples, runs the assay, and performs single-cell sequencing.
    • Automated pipelines process the sequencing data, extracting quality metrics.
    • The agent updates its model and repeats.

  4. Convergence: After 50-100 iterations (which might take 2-3 weeks of continuous operation), the agent converges on an optimized protocol that outperforms the starting point.

  5. Validation: The final protocol is validated on held-out cell types to ensure generalizability.
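
The designed experiment in step 2 can be illustrated with a two-level half-fraction factorial design, sketched below for four factors. The factor names are hypothetical examples drawn from the parameter types listed in step 1. A real 20-parameter study would need a more heavily fractionated or space-filling design, which this sketch does not attempt.

```python
import math
from itertools import product


def half_fraction(factors):
    """Two-level 2^(k-1) design: the last factor's level is the product of the others."""
    *base, last = factors
    runs = []
    for levels in product((-1, 1), repeat=len(base)):
        run = dict(zip(base, levels))
        run[last] = math.prod(levels)  # defining relation: last = product of base factors
        runs.append(run)
    return runs


factors = ["detergent_pct", "incubation_min", "antibody_dilution", "transposase_load"]
design = half_fraction(factors)

for run in design:
    print(run)
```

Each -1 or +1 is a coded level that would be mapped to a concrete low or high setting. The design needs 8 runs instead of the 16 of a full factorial, at the cost of confounding some interaction effects with each other.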

Human Involvement: Throughout this process, humans:

  • Set the initial goal and constraints
  • Monitor system health and intervene if errors occur
  • Review the final protocol and decide whether to adopt it
  • Publish the results

This workflow exemplifies the “agentic omics” vision: AI agents orchestrating domain-specific tools (in this case, laboratory robots and sequencing platforms) to achieve a biological objective with minimal human intervention.


Challenges and Limitations

Despite the promise, significant challenges remain:

Reproducibility

Paradoxically, automation can both improve and undermine reproducibility:

  • Improved: Robots don’t get tired, distracted, or inconsistent. Automated protocols execute the same way every time.

  • Undermined: If there’s a systematic error in the automation (e.g., a miscalibrated pipette), it will be replicated across all experiments. And if the AI agent makes decisions based on flawed assumptions, it can lead the entire project astray.

Rigorous validation and quality control are essential.

Interpretability

AI agents that design experiments can be opaque. When an SDL discovers an optimal condition, scientists want to understand why it works. Black-box optimization can find solutions, but it doesn’t necessarily provide insight. Integrating interpretable models and explainable AI techniques is an active area of research.

Regulatory Compliance

For applications in drug discovery or clinical diagnostics, regulatory compliance is critical:

  • Data Integrity: FDA and other regulators require rigorous data integrity controls (ALCOA+ principles: Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available).

  • Validation: Automated systems must be validated to demonstrate they perform as intended.

  • Audit Trails: Every action must be logged and traceable.

SDLs designed for regulated applications must be built with these requirements in mind from the start.
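
As one illustration of the audit-trail requirement, the sketch below builds a hash-chained, append-only log in which each entry commits to the one before it, so later edits become detectable. The entry fields and function names are assumptions. This demonstrates tamper-evidence only and is not a claim about what any regulator accepts as compliant.

```python
import hashlib
import json

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body):
    """SHA-256 over a canonical JSON encoding of the entry body."""
    return hashlib.sha256(json.dumps(body, sort_keys=True).encode("utf-8")).hexdigest()


def append_entry(log, actor, action, timestamp):
    """Append an entry that records who did what, when, and the previous entry's hash."""
    body = {
        "actor": actor,
        "action": action,
        "timestamp": timestamp,
        "previous_hash": log[-1]["hash"] if log else GENESIS_HASH,
    }
    log.append({**body, "hash": _digest(body)})


def verify(log):
    """Return True if no entry has been altered, removed, or reordered."""
    previous_hash = GENESIS_HASH
    for entry in log:
        body = {key: value for key, value in entry.items() if key != "hash"}
        if body["previous_hash"] != previous_hash or _digest(body) != entry["hash"]:
            return False
        previous_hash = entry["hash"]
    return True


log = []
append_entry(log, "planner-agent", "proposed run 12", "2026-01-05T09:00:00Z")
append_entry(log, "liquid-handler-1", "dispensed plate 12", "2026-01-05T09:04:10Z")
append_entry(log, "j.doe", "approved protocol v4", "2026-01-05T09:30:00Z")
intact = verify(log)

log[1]["action"] = "dispensed plate 13"  # simulate tampering with a past entry
tampered_detected = not verify(log)
print(intact, tampered_detected)
```

Timestamps are passed in explicitly here to keep the example deterministic. A production system would take them from a trusted clock at the moment the action occurs, in line with the "Contemporaneous" principle.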

The Creativity Question

Can an AI agent formulate truly novel hypotheses, or does it only optimize within the space humans define? This is a philosophical question with practical implications. For now, SDLs excel at optimization and exploration within defined spaces. Creative hypothesis generation remains a human strength — though this may change as AI capabilities advance.


The Road Ahead

Where are self-driving laboratories heading?

Near-Term (2026-2027)

  • More Biological SDLs: Expect to see more self-driving labs in protein engineering, metabolic engineering, and cell therapy development.

  • Standardization: Efforts to standardize laboratory automation interfaces (like the SiLA and OPC UA standards) will make it easier to integrate diverse hardware.

  • AI Improvements: Better models for experimental design, error detection, and cross-domain transfer learning.

Medium-Term (2027-2029)

  • Networked SDLs: Multiple self-driving labs connected in a network, sharing data and coordinating experiments across sites.

  • Multi-Omics Integration: SDLs that can run genomics, transcriptomics, proteomics, and metabolomics assays in an integrated workflow.

  • Clinical Applications: SDLs for personalized medicine applications, such as optimizing cell therapies for individual patients.

Long-Term (2029+)

  • Fully Autonomous Discovery: SDLs that can identify novel biological phenomena, not just optimize known processes.

  • Integration with Computational Models: Tight coupling between SDLs and whole-cell or whole-organism simulation models, with experiments designed to test and refine the models.

  • Democratization: Lower-cost, modular SDL systems that make the technology accessible to smaller institutions and developing countries.


Conclusion: Amplifying Human Scientists

Self-driving laboratories represent a profound shift in how biological research is conducted. By closing the loop between hypothesis, experiment, and analysis, SDLs can explore the experimental space far more rapidly than human-led labs. They can work 24/7, they don’t get tired, and they can navigate high-dimensional parameter spaces that would overwhelm human intuition.

But SDLs are not a replacement for human scientists. They are a tool — a powerful tool — that amplifies human creativity and insight. The human scientist asks the important questions, interprets the results in the context of broader knowledge, and makes the creative leaps that drive science forward. The SDL handles the repetitive, time-consuming work of experimental iteration.

The vision of agentic omics — AI agents orchestrating domain-specific models and tools to accelerate biological discovery — finds its most concrete expression in the self-driving laboratory. Here, agents meet robots, and the closed loop of discovery becomes a reality.

The challenges are real: biological complexity, wet lab variability, cost, and accessibility. But the progress to date is encouraging. As the technology matures, self-driving laboratories will become an increasingly important part of the biological research landscape — not replacing human scientists, but empowering them to achieve more than ever before.


Glossary

  • Self-Driving Laboratory (SDL): An integrated system combining AI decision-making, robotic automation, and data analysis to autonomously conduct scientific experiments in a closed loop.

  • Bayesian Optimization: A probabilistic approach to optimization that models the relationship between inputs and outputs, balancing exploration and exploitation to efficiently find optimal conditions.

  • Active Learning: A machine learning paradigm where the algorithm selectively queries the most informative data points to label, reducing the amount of labeled data needed.

  • Design-Build-Test-Learn (DBTL) Cycle: The iterative workflow in synthetic biology and metabolic engineering: design genetic constructs, build them, test their function, and learn from results to inform the next design.

  • Biofoundry: A facility equipped with high-throughput automation for the design, construction, and testing of biological systems, often integrated with computational tools.

  • Liquid Handler: A robotic instrument that automatically transfers liquids between containers, essential for automating biological assays.

  • High-Content Screening: Automated microscopy combined with image analysis to extract quantitative phenotypic information from cells or organisms.

  • Pareto-Optimal: A solution where no objective can be improved without worsening another; used in multi-objective optimization to identify trade-off solutions.

  • ALCOA+: Data integrity principles required by regulators: Attributable, Legible, Contemporaneous, Original, Accurate, plus Complete, Consistent, Enduring, Available.

  • Human-in-the-Loop: A system design where humans participate in decision-making, either by approving actions (in-the-loop) or monitoring and intervening as needed (on-the-loop).

References

  1. Royal Society Open Science (2025). “Autonomous ‘self-driving’ laboratories: a review of technology and policy implications.” Royal Society Open Science, 12(7), 250646. https://doi.org/10.1098/rsos.250646

  2. Nature (2025). “Self-driving laboratories, advanced immunotherapies and five more technologies to watch in 2025.” Nature, January 20, 2025. https://www.nature.com/articles/d41586-025-00075-6

  3. Nature Communications (2025). “Science acceleration and accessibility with self-driving labs.” Nature Communications, April 24, 2025. https://www.nature.com/articles/s41467-025-59231-1

  4. Acceleration Consortium (2025). “Self-driving labs transform liquid-liquid extraction in chemical synthesis with new funding.” University of Toronto. https://acceleration.utoronto.ca/news/this-chemical-purification-process-just-got-smarter----self-driving-labs-transform-liquid-liquid-extraction-in-chemical-synthesis-with-new-funding

  5. Scispot (2025). “AI-Powered ‘Self-Driving’ Labs: Accelerating Life Science R&D.” April 14, 2025. https://www.scispot.com/blog/ai-powered-self-driving-labs-accelerating-life-science-r-d

  6. Nature Communications (2025). “Active learning-assisted directed evolution.” Nature Communications, January 16, 2025. https://www.nature.com/articles/s41467-025-55987-8

  7. Frontiers in Microbiology (2025). “Without safeguards, AI-Biology integration risks accelerating future pandemics.” November 26, 2025. https://www.frontiersin.org/journals/microbiology/articles/10.3389/fmicb.2025.1734561/full

  8. bioRxiv (2026). “An AI-Native Biofoundry for Autonomous Enzyme Engineering: Integrating Active Learning with Automated Experimentation.” February 1, 2026. https://www.biorxiv.org/content/10.64898/2026.02.01.703093v1

  9. PubMed (2025). “Artificial intelligence-powered biofoundries for protein engineering and metabolic engineering.” PMID: 41192167. https://pubmed.ncbi.nlm.nih.gov/41192167/

  10. Society of Chemical Industry (2025). “Autonomous labs and AI to boost biotech research.” December 2025. https://www.soci.org/news/2025/12/autonomous-labs-and-ai-boost-biotech-research

  11. GitHub (2025). “awesome-self-driving-labs.” Acceleration Consortium. https://github.com/AccelerationConsortium/awesome-self-driving-labs

  12. PMC (2025). “Autonomous ‘self-driving’ laboratories: a review of technology and policy implications.” PMC12368842. https://pmc.ncbi.nlm.nih.gov/articles/PMC12368842/