Introduction

The execution and planning of experiments have become increasingly rigorous and automated in response to the growing complexity of real-world problems. Experimental planning has gradually evolved from random to statistically driven design of experiments; meanwhile, experimental tools have advanced from simple facilitators of manual actions to highly automated platforms. As the complexity, intersectionality, and scope of challenges in energy1, medicine2,3, ecological harm reduction, and nonrenewable-resource management increase, laboratory research must again leap forward: from individualized research to massively collaborative efforts that incorporate diverse expertise and techniques. In order to bring experimentation into the hands of a broad and diverse community of scientists, however, a certain level of automation, throughput, and accessible design must be achieved.

Self-driving (also known as autonomous) laboratories (SDLs) are the result of technological efforts to automate the execution of experimental tasks to meet the demands of industry and academia, the design and selection of experiments to minimize the material and temporal costs of research, the refinement and generation of hypotheses to discover new relationships and knowledge, and the collaboration of multiple research groups to accelerate research4,5. An SDL typically comprises a suite of digital tools to make predictions, propose experiments, and update beliefs between experimental campaigns and a suite of automated hardware to carry out experiments in the physical world (Fig. 1); these two components then work jointly toward a human-defined objective (e.g., process or material property optimization, compound or property-set discovery, self-improvement, and combinations thereof). The primary differences between established high-throughput/cloud laboratories and SDLs lie in the judicious selection of experiments6,7, the adaptation of experimental methods8, and the development of workflows that can integrate the operation of multiple tools. This automation of experimental design provides the leverage for expert and literature knowledge to efficiently tackle the increasingly incomprehensible, multivariate design spaces required by modern problems9. Adaptability allows for SDLs to develop new techniques to handle new applications, expand the feasible experimental design space, and modify workflows on-the-fly to address and preclude safety and sustainability concerns10. Furthermore, the integration of tools, learning modules, and data enables SDLs to accumulate knowledge and continually improve.

Fig. 1: A schematic overview of how a self-driving laboratory (SDL) operates.

Charged with a human-defined goal, an SDL automatically performs iterations of scientific inquiry by designing and planning experiments, formulating precursors from a local material library, performing a reaction or assembly, preparing samples for characterization or subsequent reaction/assembly steps, performing characterization and analysis, contributing to the broader literature or providing its solution to the goal, learning from the results, and finally refining its design of experiments for the next iteration.
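
To make this loop concrete, the following minimal Python sketch traces one possible structure for the cycle in Fig. 1. It is a hedged illustration only: the class names, methods, and parameter values are hypothetical placeholders, not the implementation of any specific SDL.

```python
# A minimal sketch of the closed loop in Fig. 1 (all names are hypothetical).
from dataclasses import dataclass


@dataclass
class Experiment:
    parameters: dict               # e.g., precursor ratios, temperature, time
    result: float | None = None    # measured property, filled in after execution


class Planner:
    """Digital side: proposes the next experiment and updates its beliefs."""

    def __init__(self):
        self.history: list[Experiment] = []

    def propose(self) -> Experiment:
        # Placeholder for Bayesian optimization, active learning, etc.
        return Experiment(parameters={"temperature_C": 80, "ratio": 0.5})

    def update(self, experiment: Experiment) -> None:
        self.history.append(experiment)


class Robot:
    """Physical side: formulates, reacts, and characterizes a sample."""

    def run(self, experiment: Experiment) -> Experiment:
        experiment.result = 0.0    # stand-in for a real measurement
        return experiment


def campaign(iterations: int = 10) -> list[Experiment]:
    planner, robot = Planner(), Robot()
    for _ in range(iterations):        # human-defined budget/goal
        proposal = planner.propose()   # design and plan the experiment
        completed = robot.run(proposal)  # execute and characterize
        planner.update(completed)      # learn and refine the design
    return planner.history
```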

SDLs, by acting as highly capable collaborators in the research process, can serve as nexuses for collaboration and inclusion in the sciences—helping coordinate and optimize grand and intersectional research efforts and reducing the physical and technical obstacles of performing research manually. When combined with collaborators who bring broad domain expertise, this potential new paradigm for research (SDL-assisted research) could allow the scientific community to adequately address previously intractable Grand Challenges11 (such as developing economically viable solar power technologies and industrial processes, making breakthroughs in personalized health and safety, and creating new analytical devices and methods). Already, SDLs have shown promise in accelerating molecular discovery3,12, the discovery of new synthesis routes for nanoparticles13, crystallographic phase mapping14, microscopy15,16, and HPLC method development17, among other feats reviewed more thoroughly in the literature4,18,19.

Widespread access to SDLs is necessary to fully realize their promise20,21. A team science22 research paradigm involves more actors in conducting research and incorporates more diverse ideas into the formulation and execution of research problems. In discussing the democratization of research, we have chosen to focus on how many researchers can participate in the scientific method and how to make the generation and analysis of hypotheses more accessible to those researchers. Toward the democratization of research through SDLs, there is an open question as to how SDL technologies will be balanced between open-access, centralized facilities (Centralized approach)—cf. the European Organization for Nuclear Research (CERN) research facilities and the BioPacific MIP23—and networks of distributed facilities (Distributed approach)—cf. the Galaxy Zoo24 and Foldit25 projects and the Harvard Clean Energy Project26. Both the access to scientific research that SDLs provide and the communities that SDLs necessitate and foster position this research paradigm particularly well for enhancing human–machine collaboration, multi-disciplinary and data-driven research, public outreach, and science education27. Furthermore, by accelerating research, SDLs attract industry partnerships, whose economic interest in efficient research and development (R&D) can provide the support to build and improve SDL technologies. These external actors will require personnel to build and maintain SDLs—which in turn engages more people with the technology and can bring in more industry scientists.

Recent efforts on prototype SDL platforms and their associated technologies demonstrate the first steps of democratizing SDL-assisted research. Numerous systems have been released with open-source tools such as Chemspyd28, PyLabRobot29, PerQueue30, and Jubilee31 among others32,33,34,35,36,37,38,39. Access to such tools facilitates others in developing their own research platforms. Others have demonstrated collaborations between research groups40,41, across academic levels42,43, and with industry partners44 as well as tools which increase accessibility to non-computer scientists45,46,47. Moreover, additional studies have begun characterizing and benchmarking the performance of current SDLs13,42,48,49,50,51 to facilitate communal comparison and improvement. While these efforts and other SDL technology demonstrations5 inspire hope for the future of SDLs as a democratizing agent in scientific research, there are critical challenges which need to be addressed.

In this perspective, we first discuss two paradigms (centralized and distributed) by which SDLs can be made accessible to the research community and why we believe both avenues should be explored. We then discuss the current roadblocks toward achieving such a democratized future of SDL-assisted research before addressing how we might overcome these obstacles.

Balancing a centralized and distributed future

While creating automated experimental apparatus may be feasible for a general laboratory, the effort required to develop and maintain an SDL is unarguably large. Centralized facilities that allow (virtual) access by applicants concentrate efforts and personnel52,53,54,55,56,57,58 (Fig. 2, left); alternatively, open-source59,60 networks encourage peer-to-peer collaborations that can leverage specialization and modularization (Fig. 2, right).

Fig. 2: Schematic illustrating both centralized (top left) and distributed (top right) self-driving laboratories (SDLs).

Laboratories are represented with Greek letters, their requested research projects (jobs) with numbers (for simplicity, each laboratory is planning one job), and their individual capabilities with icons in gray circles. In the centralized approach, multiple laboratories submit their job requests to a single, highly capable facility, and time on various center facilities is equitably distributed between applicants throughout the day. In the distributed approach, various (temporary) groups form around capabilities, and the groups can distribute tasks amongst themselves (jobs can freely flow between collaborators), share information, and transmit instructions and materials (digital letter and airplane) as appropriate.

The categorization of SDL deployment as centralized or distributed is useful but relative. The delineation between these can vary: an SDL for every individual researcher, research group, university, etc.—extending hyperbolically to a single centralized SDL for the entire world. We have chosen to divide the distributed and centralized paradigms between research groups and shared university facilities as there is a noticeable change in the degree of customization and flexibility in addressing research challenges between these two levels.

Hybrid approaches are also feasible, and potentially preferable at this stage. Individual laboratories could utilize simplified, low-cost automation systems61 for workflow development, testing, and troubleshooting before submitting the finalized workflow to an external facility. Moreover, to tailor centralized facilities’ capabilities to the needs of specific research groups, individual laboratories could develop an instrument in accordance with facility guidelines such that the unit can “plug in” to a centralized facility62,63,64. This requires financial and logistic support for transporting and integrating bespoke equipment, but addresses the throughput concerns of an individual laboratory and the specialization concerns of a centralized facility. Finally, collaborations with national laboratories provide a unique opportunity to explore large-scale coordination and to test how various centralized and distributed technologies can be implemented. National laboratories’ intermediate scales may be optimal for the development of self-driving systems to manage both academically and industrially relevant data provenance and metadata—a challenge that transcends fields and informs how disparate industrial sectors must adapt to embrace self-driving workflows.

Centralized, distributed, and hybrid approaches seek to keep SDLs open to researchers regardless of background or financial means and ensure that an enclave of privileged facilities does not have sole access to SDLs—a configuration that would worsen funding and publication disparities between laboratories and squander the potential of an SDL-assisted research paradigm. Despite these varied approaches, the challenge remains of how best to balance priorities between advanced, communal automation technologies and networks of specialized platforms. The optimal strategy for SDL deployment must consider how the initial investments (the barriers to entry) are overcome, how logistics and legal concerns are managed, and how the workforce is prepared to engage with SDLs in industry and academia.

In terms of financing and staffing, centralized facilities may be more attractive to industry and national investors, as funding a single meta-project helps guard against splintering and wasted/redundant effort, helps maintain long-term collaboration, and provides more stability (less risk) than an individual research group65. As the costs of commercial automation units and software decline, distributed SDLs become more feasible. Smaller, designer SDLs are in turn likely to attract local businesses and universities. The proximity and flexibility of distributed SDLs can facilitate rapid collaboration on novel and cutting-edge research for a given scientific or industrial niche. Conversely, a centralized facility may have too much inertia to rapidly address changing needs or may struggle to justify providing highly specialized equipment only a handful of users ever use.

The maintenance and sustainability of the SDL ecosystem also depend on good management. Team science management, regardless of approach, requires the coordination of data and experiments in a manner that is robust, equitable, and accountable66,67. The distributed approach, with its more fluid boundaries, would require considerably more coordination, making the digital aspect (digital twins68 and datasets) more difficult to maintain69. The centralized approach raises ethical questions of how projects are selected, time and resources allocated, and experiments managed. For credit in both approaches, SDLs could be given identifiers (cf. ORCID) for data provenance and attribution in manuscripts—though the task of acknowledging the personnel behind each SDL at the time of publication remains a logistical challenge. Unlike high-performance and distributed computing (a close analogue to future SDLs), SDLs consume and produce physical material, and the rights to that material—and who may purchase it—can be constrained by funding agreements.

In democratized science, data must be generated safely, ethically, and legally, and the quality of the data must be trustworthy70. With respect to local and national regulations for hazardous materials, dangerous processes, and sensitive data, a centralized facility may have an easier time obtaining regulatory certification but must also acquire more (and higher-grade) certifications; a smaller SDL in a distributed network need only apply for the certifications it needs but may struggle to acquire all the engineering controls required to meet them. Concerning data quality, any SDL would require routine testing and quality control to maintain public trust, in addition to study-specific benchmarking and control experiments to ensure the validity of results for new or novel materials and processes. In the centralized approach, a consortium of key facilities could develop these protocols and use them as the standard; in the distributed approach, more effort would be required to create robust, future- and site-proof standards for interoperability, shareability, and reproducibility, with periodic checkups to remain a trusted member of the network of collaborators. While overall maintenance and standardization may be easier for centralized facilities, the research groups using these facilities may find the established protocols and standards limiting in what research can be conducted.

SDLs must engage with and support their collaborators: people40. Any education strategy implemented should reinforce itself to ensure the sustainability of the SDL ecosystem. In the centralized paradigm, key facilities become centers of learning and can act as educational institutions that provide intensive and fulfilling education for participants. In the distributed paradigm, more diffuse facilities can more effectively reach a geographically diverse cohort of future SDL researchers and may have a larger overall capacity—increasing the number and diversity of researchers who benefit. Furthermore, low-cost and do-it-yourself SDL technologies can serve as educational tools for burgeoning researchers (e.g., “frugal twin” platforms61, Educational ARES71,72, and Legolas73). Ultimately, SDLs are in service of people, and how users are trained in and engage with these powerful research tools must be thoughtfully considered.

In both paradigms, the goals to increase throughput and reduce cost must be balanced against the quality of the data. Experimental fidelity can greatly impact the number of experiments required to arrive at a solution50,74, and further work is required to determine the optimal tradeoffs between these design goals. Any low-cost system would need rigorous reproducibility analysis to be of value to SDL-assisted research. The relationship between setup and operating costs and data fidelity will evolve with the advancement of automated laboratory technologies and in turn modify the optimal balance between centralized and distributed SDL research.

Table 1 Summary comparison of centralized and distributed self-driving laboratory (SDL) paradigms

In summary, centralized approaches pool resources to achieve more technologically advanced SDLs but face challenges in generalizability, while distributed approaches provide flexibility but require greater coordination (Table 1). As SDL-related technologies evolve, however, many of the capital and operating expenses (for the creators, managers, and users of SDLs) will change.

Roadblocks on the path to future SDLs

SDLs have the potential to expand (or restrict, if improperly managed) who is afforded the opportunity to do research. The same positive feedback loop that stands to accelerate SDL proliferation also forestalls their widespread use (Fig. 3). Individual SDLs are large projects, and current demonstrations of SDLs are limited by hardware and software capabilities or are deployed conservatively to reduce risk. As a result, there are few exemplar systems that are industrially relevant enough to attract widespread funding, and funding is required for high-impact demonstrations. Support from outside this cycle is needed to kickstart SDL proliferation, as are efforts to improve the power, generalizability, and accessibility of both the physical and digital aspects of SDLs.

Fig. 3: The cycle of challenges forestalling the advancement of self-driving laboratory (SDL) technologies.

Incomplete integration of artificial intelligence (AI), hardware, and software prevents the creation of agile experimental systems, and the complexity of these systems renders SDLs black boxes that can struggle to balance all the requests and questions given to them. Questions and requests to different aspects of SDLs receive unclear answers, and this incomplete information creates difficulties in collaborating with other scientists and industry partners, which in turn impacts funding and support. A lack of cohesive support hinders the advancement of human–AI–robot collaboration and the improvement of SDL technologies, starting the cycle anew.

In this section, we review the major roadblocks to SDL proliferation and to the subsequent democratization of automated research. The discussion focuses first on the development of SDLs and their workforce, then moves to laboratory- and community-level challenges and opportunities for advancement.

In the workforce

The transition from conventional to self-driving research in the materials and chemical space will require developing a specialized, yet multi-faceted, workforce. The status quo of research favors collaborations between researchers in closely related fields or between members of the same research campus. Unfortunately, the applications where SDLs are most promising also demand the greatest diversity and breadth of knowledge.

The needed workforce is not, however, a monolithic body: Developers combine hardware, software, processing, and materials innovations to realize new self-driving systems; technicians maintain and tune such systems; and users interact with self-driving research systems through the digital world by selecting hypotheses, guiding learning, and analyzing data. While these may be distinct roles, individuals may move between these roles throughout the course of their careers and as their research needs evolve. Moreover, these roles will require differing levels of expertise within and between fields as well as collaboration skills. As complex, intersectional engineered systems, SDLs will need a healthy distribution of actors to be successful.

Developers of self-driving experimentation systems must combine expertise in their experimental domain with expertise in automation. They must know the subtleties and pitfalls of various laboratory and research techniques in their discipline and be skilled enough in automation to realize these techniques with programmatic control. In contrast, traditional academic departments are fairly siloed, with robotics separate from chemistry or materials science, and learning to be a developer generally requires practice and dedicated training or self-teaching to fill the educational gaps. The need for practice systems can be met with low-cost systems that have been specifically designed as pedagogical tools73,75,76,77. The development and dissemination of such systems is an important part of efficiently training SDL developers.

The challenges associated with maintaining SDLs mirror the challenges in developing them. Technicians may be required to monitor processing signals to ensure smooth operation, enhance performance through collaboration78, intervene to repair or recalibrate the system when needed, and restock supplies. Technicians require less depth of expertise in the underlying science and robotics optimization than developers. As such, individuals with some experience with physical experiments (whether vocational or through undergraduate training) and instrument-specific training will be able to perform this role. Here, micro-certifications and online training are appropriate for delivering the specific knowledge needed to operate the relevant systems. Both the academy and industry should be involved in the development of these resources.

A user’s two main responsibilities are to (1) formulate good scientific hypotheses for the SDLs to explore and (2) (virtually) oversee the learning process to make adjustments as needed78. While the former can be thought of as a goal of doctoral education (implying most first-generation users will be domain experts), this domain expertise, in contrast to a developer’s, need not include automation. For proficiency in the latter task, users will have to practice overseeing an SDL—requiring easy-to-use tools that provide this experience and pedagogical resources that define the types of situations that can arise (and how to resolve them)79.

While collaboration between industry and academia is necessary for the advancement of SDL technologies, there is tension between what each group seeks to gain from that advancement (cf. Fig. 3, where each circle corresponds roughly to a different role). A user seeking to apply the SDL to solve a scientific problem may desire a more vertically integrated SDL technology, which packages hardware control, data management, and experimental planning together (such as Atinary or a lab-as-a-service provider52,80,81). A developer seeking to advance laboratory capabilities may desire the ability to combine, adapt, and create new modules—favoring horizontal integration. And, presently, there is a large overlap between technicians and both the developer and user roles—biasing the user experience toward expert knowledge and permitting band-aid solutions in design. A party more interested in workforce development, education, or mitigating job displacement82,83 may desire in-house development and implementation as a form of training. We encourage the community to thoughtfully consider these cases and to state explicitly in their publications which is their focus. In this way, the technology (theoretical and demonstrated) and the workforce can advance more equitably.

Successful efforts in a collaborative SDL will require team science: communicating with diverse researchers and people who may not share research experience, vocabulary, or a knowledge base. Large groups will need people skilled in team science who can facilitate the development and success of the team—a role academia does not currently prepare, though some programs, such as the “Growing convergence research” program at the National Science Foundation (NSF)84, are pushing this area. A consequence of this lack of a teaching culture is that teamwork is most often learned “on the job”. While collaboration is a lifelong learning objective, it has yet to be determined how team science should best be integrated into educational programs. Playing to the strengths of a diverse community, we need a multipronged strategy to address the training of academic and industrial researchers. Universities should engage with industry partners to integrate team and automated sciences into the curriculum—ideally these programs should overlap with the micro-credential programs for upskilling existing workers.

In the lab

The integration and automation of advanced data-science strategies for the analysis of results and the proposition of new experiments is a key feature which separates SDLs from prior work in high-throughput experimentation. Despite efforts to democratize research automation technologies, their transferability in practice can be limited by the tension between generality (being applicable to any hardware system) and specificity (taking full advantage of the nuances and capabilities of a given instrument)85. Similarly, machine learning (ML) for materials and molecular discovery struggles to be universal and is currently used primarily to simulate or plan experiments and analyze data.

Laboratory automation is a powerful tool that bridges the digital and physical worlds of SDLs, enabling active learning. Over the past few years, proof-of-concept automated setups have been demonstrated for many upstream86,87 and downstream42,88,89,90,91 steps in the traditional materials science research workflow, as well as more holistic SDLs3,52,55. The integration of disparate components into a single, usable platform, even in a modular fashion, requires establishing engineering controls, networking the systems together, defining data-collection and management structures, and providing human access points for troubleshooting and collaboration.

Developing automation tools and integrating them into SDLs relies heavily on access to application programming interfaces (APIs) and the documentation of both these APIs and the tools’ capabilities92. Few APIs are readily provided or supported by manufacturers, and many are poorly documented or come with restrictive licensing agreements—resulting in individual groups producing redundant, rarely universal, solutions. In addition, many vendor-provided APIs transmit data in opaque formats—curtailing active learning. For commercial equipment, preferential purchasing of solutions that provide native programmatic control with high-quality APIs and documentation (as well as vendor technical support and interoperable data) is a good start. Due to the size of the research market, however, vendors may not foresee sufficient economic returns in recapitalizing their product lines, especially when contrasted with closed-ecosystem approaches that are potentially more profitable. As fields newly adopting automation, chemical and materials research imposes heavy demands on system robustness (e.g., temperature, pressure, and chemical compatibility) and specialization (e.g., flexibility and cutting-edge applications); meanwhile, existing automation suppliers, often originating in the biological domain, tend to focus on general-purpose use, high throughput, and industrial-scale safety. It is imperative, then, that researchers engage with industry R&D departments to co-develop prototype automation tools—even if these collaborative efforts cannot be promptly released—as having a seat at the table enables gradual improvement of SDL integrations.
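
As an illustration of what native programmatic control enables, the sketch below wraps a hypothetical instrument that speaks only a plain serial text protocol behind a small, documented Python class. The device name and command strings are invented, and pyserial is assumed as the transport layer; this is a sketch of the pattern, not any vendor's actual API.

```python
# A hedged sketch of a thin, documented driver for a hypothetical pump that
# exposes only a serial text protocol. Commands ("DISP", "STAT?") are invented.
import serial  # pyserial, assumed available


class HypotheticalPump:
    """Thin driver exposing typed, documented programmatic control."""

    def __init__(self, port: str, baudrate: int = 9600, timeout: float = 2.0):
        self._conn = serial.Serial(port, baudrate, timeout=timeout)

    def _query(self, command: str) -> str:
        """Send one command line and return the instrument's reply."""
        self._conn.write((command + "\r\n").encode("ascii"))
        return self._conn.readline().decode("ascii").strip()

    def dispense(self, volume_ul: float, rate_ul_s: float) -> str:
        """Dispense `volume_ul` microliters at `rate_ul_s` microliters/second."""
        return self._query(f"DISP {volume_ul:.1f} {rate_ul_s:.1f}")

    def status(self) -> str:
        """Return the raw status string for logging and provenance."""
        return self._query("STAT?")

    def close(self) -> None:
        self._conn.close()
```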

The technologies the SDL community itself creates must consider accessibility to the broader research community, as open-source solutions are less likely to come from industry: open-source systems that are interchangeable between vendors undercut vendors’ competitive advantage. Academic29, commercial64, and governmental93,94 research efforts into providing open-source, high-quality APIs and SDL development tools show promise in making SDLs more accessible and interoperable. Groups currently developing SDLs should make an effort to reuse (or consult) as much code and data from prior studies as possible, even if it must be modified, and should publicly release their code upon publication. While it is often beyond the scope of a single laboratory to develop universal code or experimental techniques, taking the time to analyze code and data-reporting decisions can act as an additional form of dialogue between SDL developers. This encourages community involvement, reduces redundant effort, brings people into dialogue, and can help work out the bugs in our standardization efforts. Unlike the popular ML development libraries maintained by the technology industry, the long-term maintenance95 and cybersecurity of these code libraries—as data formats and software evolve, best practices and standards change, and programming-language preferences shift—are incompatible with the current funding paradigm96 (see section “In the community”).

Given the effort required to “glue” each hardware and software module of an SDL together, there have been efforts to automate or assist developers as they construct new SDLs. While powerful middleware, protocol, and orchestration tools that aim to address these interoperability issues have been developed (such as ROS, SiLA 2, and BlueSky64), their adoption is limited to groups developing the most complex SDLs, pointing to a broader issue of fragmented technology ecosystems between laboratories. Universal (highly abstract) middleware suites have learning curves that can challenge less experienced groups and can be difficult to quickly deploy for a specific application. Conversely, for the larger projects of more experienced groups, version management and overhead can become a concern when using these frameworks97 and may motivate a group to create its own middleware suite. Consequently, progress by less experienced groups is limited by their lack of awareness or expertise in the technology, and progress by more experienced groups is not easily transferable to other groups. In an attempt to address the inaccessibility of designing or using such middleware suites, several groups have turned toward using pre-trained large language models (LLMs) to generate this “glue” code automatically from API documentation98. The unpredictability of existing LLMs requires that special attention be paid to engineering operational safeguards99. For the near term, achieving genuine planning with LLMs requires considerable supplementation with more traditional, logic-based verification workflows100.
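
A minimal sketch of the “glue” idea follows: a shared interface that heterogeneous modules can be adapted to, so an orchestrator need not know each device's native API. The class names, task fields, and fake vendor driver are illustrative only and are not drawn from ROS, SiLA 2, or BlueSky.

```python
# A hedged sketch of middleware "glue": adapters expose a common contract.
from abc import ABC, abstractmethod
from typing import Any


class Module(ABC):
    """Common contract every adapted device exposes to the orchestrator."""

    @abstractmethod
    def execute(self, task: dict[str, Any]) -> dict[str, Any]:
        """Run one task and return results plus metadata."""


class FakeVendorHeater:
    """Stand-in for a vendor's native driver object (illustrative only)."""

    def set_temperature(self, celsius: float) -> None:
        print(f"heater set to {celsius} C")


class HeaterAdapter(Module):
    """Wraps the vendor-specific object behind the shared interface."""

    def __init__(self, native_driver: FakeVendorHeater):
        self._driver = native_driver

    def execute(self, task: dict[str, Any]) -> dict[str, Any]:
        self._driver.set_temperature(task["setpoint_C"])
        return {"status": "ok", "setpoint_C": task["setpoint_C"]}


def run_workflow(steps: list[tuple[Module, dict]]) -> list[dict]:
    """Orchestrator: executes each (module, task) pair and collects metadata."""
    return [module.execute(task) for module, task in steps]


steps = [(HeaterAdapter(FakeVendorHeater()), {"setpoint_C": 80.0})]
print(run_workflow(steps))
```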

Machine learning is crucial to realizing efficient experimental design and driving active learning in SDLs. While ML for SDLs shares the challenges of multi-objective and multi-property optimization101,102,103 that exist throughout data science, its coupling to physical platforms and the nontrivial input–output spaces of chemical and materials science presents challenging opportunities to advance artificial scientific intelligence.

In practice, experiments are subject to constraints, be they inherent to the physical system or imposed by humans for safety. Constraints are crucial for knowledge transfer between collaborators and for evaluating the transferability of a model between physical systems. Constraints can be encoded by using prior knowledge104 (which may introduce bias into the system) or can be learned during experimentation105 (which can necessitate more data, increasing cost). The relative nature of constraints, however, can hinder generalizability and predictive power—as is notoriously the case with predicting synthesizability106,107,108,109. Similarly, the objectives by which experiments are evaluated need to be quantifiable and measurable within the active learning cycle. For example, future-seeking objectives such as “optimize for sustainability” need to be translated into short-term measurables, each of which comes with its own set of questions (e.g., metric/assay choice, comparison techniques, economic contexts, etc.). Presently, neither human- nor ML-generated decompositions of these objectives seem to suffice, and new ways of combining and translating objectives are required to address complex or unclear connections between objectives (low- and high-level, immediate and far-off) so that collaborators can make targeted progress toward their diverse goals110,111,112. Recent efforts to address the measurement component of this problem113 include the use of proxies92,114 (i.e., the estimation of inaccessible properties of interest by using correlations to more accessible measurables—including the estimation of device-level properties via inexpensive replicas of real-world devices). While proxy measurements increase the scope of research problems an SDL can investigate (and can often reduce costs), they can struggle in extrapolative campaigns (e.g., material discovery) and so require that the correlations be continually maintained with new data115.
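
As one hedged illustration of encoding a prior-knowledge constraint in experiment selection, the sketch below filters infeasible candidates (here an invented temperature limit) before scoring the remainder with an upper-confidence-bound acquisition on a Gaussian-process surrogate. The data are toy values, the constraint is hypothetical, and scikit-learn is assumed.

```python
# A minimal sketch of constrained experiment selection with a GP surrogate.
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor


def select_next(X_obs, y_obs, candidates, kappa=2.0):
    """Return the feasible candidate with the highest UCB acquisition value."""
    # Prior-knowledge constraint (e.g., a safety limit on temperature);
    # a learned constraint would instead be modeled from data.
    feasible = candidates[candidates[:, 0] <= 150.0]  # column 0: temperature (C)
    gp = GaussianProcessRegressor(normalize_y=True).fit(X_obs, y_obs)
    mean, std = gp.predict(feasible, return_std=True)
    ucb = mean + kappa * std                          # explore/exploit tradeoff
    return feasible[np.argmax(ucb)]


# Toy observations: (temperature_C, concentration) -> yield.
X_obs = np.array([[60.0, 0.1], [90.0, 0.2], [120.0, 0.3]])
y_obs = np.array([0.35, 0.55, 0.48])
candidates = np.random.default_rng(0).uniform([40, 0.05], [200, 0.5], size=(100, 2))
print(select_next(X_obs, y_obs, candidates))
```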

A final, foundational ML challenge for SDLs is understanding and quantifying uncertainties. Uncertainty provides a crucial touchstone for human collaborators as a rough estimate of how confident116,117 the ML algorithms are and is used to determine which experiments are proposed. Unlike many data science applications where the uncertainties of observations must be estimated, SDLs present an opportunity to directly measure and transform uncertainties between platforms. Current SDL hardware fails to capture most experimental metadata, and despite manufacturer testing, experimental uncertainties must be measured for the particular chemical or material system being investigated. An SDL could use such information to learn about laboratory praxis and improve its own workflows for materials discovery and process optimization. Less speculatively, uncertainty, calibration, and benchmark studies are required for human researchers and collaborating SDLs to determine their trust in SDL-generated results50. Despite how crucial such studies are, the current suite of tests and the integration of these controls into SDL workflows are lacking.
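
One routine calibration check of the kind mentioned above is sketched below: comparing a model's predicted uncertainty against held-out replicate measurements by computing the empirical coverage of its nominal 95% prediction intervals. The numbers are invented for illustration; a well-calibrated model should cover roughly 95% of held-out points.

```python
# A hedged sketch of an uncertainty-calibration check via interval coverage.
import numpy as np


def interval_coverage(y_true, y_pred, y_std, z=1.96):
    """Fraction of observations falling inside mean +/- z * predicted std."""
    lower, upper = y_pred - z * y_std, y_pred + z * y_std
    return float(np.mean((y_true >= lower) & (y_true <= upper)))


# Toy numbers for illustration only.
y_true = np.array([0.52, 0.48, 0.61, 0.40])
y_pred = np.array([0.50, 0.50, 0.58, 0.45])
y_std = np.array([0.03, 0.03, 0.04, 0.02])
print(interval_coverage(y_true, y_pred, y_std))  # 0.75 here -> overconfident model
```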

In the community

Industrial adoption and implementation of SDL technologies are currently limited to a few specific areas, such as biotech, biopharma, and specialty chemicals/materials, where companies have applied SDLs in their discovery pipelines118,119,120,121,122,123,124,125. While traditional funding bodies are typically limited in the budget and scope of the SDL research initiatives they can support (making it difficult to fund the ambitious, large-scale projects needed for breakthrough demonstrations), industry participation can be instrumental in the development of large-scale SDL initiatives. While examples have recently emerged in related fields, further de-risking of the technology is needed for broad market penetration.

Thorough cost-benefit and other techno-economic analyses are typically required to illustrate the potential savings in time and resources that SDLs offer over traditional R&D processes and will help collaborators choose the best tool for the challenge at hand. Mature approaches are often viewed as safer investments for addressing new, complex problems, and even as SDL technologies advance, the question of whether to (or the temptation to) use brute-force, high-throughput experimentation will remain, as advances in one often apply to the other. While transparent analyses of SDLs may help partners overcome their own barriers (viability, security, IP, and competitiveness), external factors such as legislation over whether compounds or processes discovered by SDLs are patentable hang over industry support.

Data and records of prior work are essential to science, and SDL-assisted research is no different. Data management and modeling must be flexible and interoperable and must provide representations of experiments and results126. Whereas the motivations and challenges of FAIR scientific data have been discussed elsewhere127, attempts have been made to organize experimental information in general-purpose relational schemas128 and in semantic knowledge graphs38,129, alongside efforts to provide provenance tracking130—features which, once mature, will be indispensable for combating dubious data and for general data management in a distributed paradigm131. While epistemological and ontological frameworks are in development38,132 and can facilitate collaboration across linguistic, domain, and cultural barriers, they represent yet another complex system that must be integrated into an SDL; as such, the coming generation of these technologies must seek to be easy to deploy and integrate—potentially requiring a standardized interface.
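
To ground the knowledge-graph idea, the sketch below records one experiment as subject-predicate-object triples and shows a trivial traversal of the kind a provenance query would perform. The vocabulary and identifiers are invented for illustration and do not follow any existing ontology.

```python
# A hedged sketch of experiment provenance as subject-predicate-object triples.
triples = [
    ("experiment:0042", "performedBy", "sdl:alpha-01"),
    ("experiment:0042", "usedPrecursor", "material:ZnO-nanoparticle"),
    ("experiment:0042", "hasCondition", "condition:anneal-350C"),
    ("experiment:0042", "produced", "sample:0042-a"),
    ("sample:0042-a", "hasProperty", "measurement:band-gap-1.62eV"),
    ("measurement:band-gap-1.62eV", "measuredWith", "instrument:uvvis-07"),
]


def neighbors(graph, subject):
    """Outgoing edges of a node: the basic step in a provenance query."""
    return [(p, o) for s, p, o in graph if s == subject]


print(neighbors(triples, "experiment:0042"))
```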

As a consequence of the volume of data produced and the (mostly ad hoc) automation of experimental planning, execution, and analysis, there have been concerns about whether the data generated by an SDL can be trusted133,134. While ad hoc solutions are indispensable during the early stages of a technology’s development, they often result in redundant efforts29,135,136, increased setup times, and suboptimal outcomes, and they introduce the potential for unreliable experiments. For distributed SDLs, the completeness and thoroughness of platform and results characterization (meta-characterization)50,74,137 must be studied such that collaborators can properly assess literature data and train future users and technicians. Data must encompass not only experimental results (inputs and outputs) but also state and environmental information (e.g., temperature, relative humidity, and any metadata required to reproduce the results of the ML models—measurables often overlooked by commercial units)70,138,139. With complete details of an experiment, it may become possible to learn the mapping between different hardware architectures (e.g., batch vs. flow) and further the transferability and interoperability of SDL technologies. Until either a critical mass of SDLs is online or the self-reporting of platform metrics is sufficient to assuage investment risks, a lack of trust will continue to stymie the growth of SDL technologies.
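
A minimal sketch of what such a record might look like in practice is given below: a data class that keeps environmental and model metadata alongside inputs and outputs so that results remain reproducible and models can be retrained later. All field names and values are illustrative, not a proposed standard schema.

```python
# A hedged sketch of an experiment record carrying environmental metadata.
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone


@dataclass
class ExperimentRecord:
    platform_id: str   # which SDL produced the data
    inputs: dict       # e.g., composition, process settings
    outputs: dict      # e.g., measured properties
    environment: dict  # e.g., lab temperature, relative humidity
    model_metadata: dict = field(default_factory=dict)  # e.g., model version, seed
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )

    def to_json(self) -> str:
        return json.dumps(asdict(self), indent=2)


record = ExperimentRecord(
    platform_id="sdl-alpha-01",
    inputs={"precursor_A_mM": 10.0, "anneal_C": 350},
    outputs={"band_gap_eV": 1.62},
    environment={"lab_temperature_C": 21.3, "relative_humidity_pct": 38},
    model_metadata={"surrogate": "GP-v0.3", "random_seed": 7},
)
print(record.to_json())
```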

Many extant SDLs, however, are beholden to their externally funded project’s goals (particularly in academia) and either do not have the bandwidth for external validation studies or must conceal crucial aspects of their workflows to protect their sponsors’ IP (curtailing the cross-validation of their results by other SDLs). Currently, most laboratories can only report calibration, benchmarking, and self-validation data to garner support—with only a fraction of self-driving technologies actually incorporating repeatability or reproducibility metrics. Moreover, there are no standards for performing or enforcing such studies, and incentives are non-existent. Different fields and applications have different ideas of what is considered standard for experimental verification; and for multidisciplinary and multi-application SDLs, these discrepancies make defining a singular standard difficult.

Although there is pressure to present research in the most promising light while still being fair and honest, supplemental information needs to better characterize the theory and reality of platforms in operation50 (e.g., their performance in well-defined tasks, devices, capabilities, and interfaces) in order to avoid overpromising and backlash. The definition of (partial) success must be considered and discussed when reporting any metrics about SDL performance with respect to its objectives—e.g., for a “discovery” campaign, is there success in gaining any insights, or only when a new compound is created? How do the properties of interest affect the degree of success? And when does the failure of a chemistry count as a failure of the automation? SDL reports should include calibrations, standards, and comparative or benchmarking studies performed so that others can better build off of the reported results. In addition, metrics such as overall operational performance, the degree of human involvement with the workflow, resources used and wastes generated, as well as an open discussion of areas where the platform could be reasonably improved should be reported (cf. ref. 140, ref. 141, and supplemental materials of ref. 3).

These demands represent additional work in the present; however, it is important to establish rich descriptions of SDLs as the precedent for future work. This rigor will help to build and reinforce trust within and, perhaps more crucially, beyond the SDL community. Similarly, by having code and data available (when legal to do so), the communal development of SDLs can be accelerated and consensus can be achieved for best practices and a shared understanding of SDL language developed. Additionally, rigorous reporting on SDLs, especially their shortcomings, establishes norms that not every published SDL needs to be flawless (an admission that helps the groups behind less “spectacular” SDLs to enter the conversation) and facilitates the identification of opportunities for improvement (inviting collaboration for perpetual improvement).

The current structures for funding, publication, and maintenance are insufficient for SDLs to act as democratizing agents for scientific research96. Research funds are mostly allocated toward scientific results, rather than the infrastructure and development needed for SDLs—resulting in bootstrapping and leveraging of funds to build SDLs. The singular focus on the scientific results of interest and publishability can result in overlapping technologies tailored to (even pigeonholed in) a specific application. Instead, proposals should be tailored to drive the diversification of collaborators. Diverse expertise supplies SDLs with the breadth of knowledge required to build them and helps to foster and maintain collaborative relationships within the scientific and industrial communities so that future SDLs can thrive in a democratic and collaborative ecosystem. These cross-disciplinary collaborations will also facilitate the training of individuals in scientific communication and team science. Even when projects are simple, the inclusion of non-experts in SDL-related discussions helps prevent the gradual rise of the SDL skill floor that often occurs when only experts are allowed to participate in discussions—outside (even naive) opinions help to challenge assumptions and bring in new ideas. Therefore, it will be important that investments be made in SDL infrastructure (both human and machine), with the understanding and expectation that these new tools will lead to new and impactful scientific advances.

Conclusions

In our collective advancement toward a more democratized future of SDL-assisted research, we must focus our efforts. If the role of SDLs is to enable more scientists to participate in research and to open the act of research to more diverse and collaborative ideas, then the efficiency, accessibility, and interoperability of autonomous technologies must be improved. This will only be possible by getting a head start on team science and incorporating academic, industrial, and national input—building standards and protocols, and opening access to our tools and data. Both Centralized and Distributed approaches will require a consortium to outline living standards for automation, software, and data interfaces; Centralized-leaning technologies will require the initial investment; and Distributed-leaning technologies will require a confluence of low-cost assays and modules with interfaces tailored toward the layperson. Advancing these thrusts is mutually beneficial for making both SDLs and research more accessible, and the breadth of their coverage will enable SDLs to assist research in addressing problems of industrial and societal impact. The ultimate goal of this democratization is for the increase in participation and ideas to germinate into better and more creative solutions that could not be envisioned by traditional, insular approaches to research.

While advancements to improve individual components must be made, the bottlenecks of SDL-assisted research identified in this perspective article hint that additional work is needed in the prediction and management of scopes and throughputs when an SDL is being designed114,142. Such analysis can also help identify which bottlenecks are the result of capital expenditure and which are the result of fundamental limitations of the technology or technique. The latter invites innovation and can better illuminate the cross-SDL benefit of addressing these limitations—catalyzing the development of new SDLs. In this spirit, some effort should be made toward automating the act of automation by creating SDL “installation wizards” that can help in selecting equipment and developing variations of traditional workflows to make the most of the resources available.

An institute for automated laboratory infrastructure could better focus the SDL community’s efforts and provide partners with a meta-project with which to engage and interface. A consortium for SDLs would more readily sustain long-term funding to develop and maintain SDL software infrastructure (as opposed to specific hypothesis-driven research projects) as well as provide pre-competitive, non-proprietary support for academic and industrial researchers. Such a consortium could, as an external entity, attract long-term staff who could cultivate a set of best practices and engage in developing educational materials (online tutorials, in-person workshops) to train users. By focusing on core SDL issues, the tools developed as part of the consortium would serve both centralized and distributed SDL paradigms.

SDLs provide a means by which to further democratize research. While there are outstanding issues with the technology, they are surmountable; and while the approaches to address these issues vary depending on how centralized or distributed SDLs are implemented, the future of SDLs will encompass both. Addressing these challenges of automation, modeling, data management, collaboration, and training from both angles will help to bring about a future of inclusive and accessible research that is flexible, robust against changing paradigms, and better suited to address the world's ever more complex problems.