Creating an open-source dataset for characterising microbes
Contributing to the ALIGN datasets for genotype to phenotype modelling
In May 2024, our CTO, Satnam Surae, joined an inspiring and ambitious gathering of researchers, engineers, and innovators at the Align to Innovate Microbial Phenotyping Workshop (https://alignbio.org/datasets-microbes). The event centred on designing an extensible experimental platform for characterising microbes and linking their genotypes to phenotypic outcomes, which is essential for predictive modelling in biology.
🌱 A community-driven initiative
Align to Innovate is pioneering open science, harnessing automation and AI to generate comprehensive datasets for machine learning models in microbiology. The workshop brought together experts from academia, industry, and biotech startups - like us at twig - to collectively envision and design experiments capable of transforming microbial research.
📚 The vision: genotype to phenotype
Our primary objective was clear: to create a robust, scalable dataset that captures microbial growth across diverse environmental conditions, connecting genotypic information directly to phenotypic traits. Such datasets are critical to building accurate, predictive models that could vastly accelerate innovation in fields such as climate science, agriculture, and biomanufacturing.
Specifically, the platform aims to:
Automate microbial phenotyping, generating standardised growth curves through precise laboratory automation.
Measure phenotypes across diverse microbes, capturing data from 1,000 microbial strains under 1,000 different cultivation conditions - resulting in a total of one million unique experiments.
Utilise AI-driven experimental designs (via tools like BacterAI) to select the most informative cultivation conditions efficiently, ensuring the best use of resources and maximising data quality.
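To make the scale concrete, the design above can be pictured as a strain-by-condition grid. The Python sketch below counts that grid and draws a small batch from it. The uniform random draw is only a stand-in for an informed selection strategy and is not how BacterAI works. The function names and the batch size of 96 are hypothetical, chosen purely for illustration.

```python
import random


def grid_size(n_strains: int, n_conditions: int) -> int:
    """Number of unique (strain, condition) experiments in the full grid."""
    return n_strains * n_conditions


def sample_experiments(n_strains: int, n_conditions: int, k: int, seed: int = 0):
    """Pick k distinct (strain_index, condition_index) pairs uniformly at random.

    Assumption: random sampling is a placeholder for AI-driven selection.
    """
    rng = random.Random(seed)
    total = grid_size(n_strains, n_conditions)
    # Sample flat indices without replacement, then unpack each into a pair.
    flat_indices = rng.sample(range(total), k)
    return [divmod(i, n_conditions) for i in flat_indices]


total = grid_size(1000, 1000)
batch = sample_experiments(1000, 1000, k=96)  # 96 is an arbitrary batch size
print(total)       # 1000000
print(len(batch))  # 96
```

Even at this toy level the grid is far too large to run exhaustively in small batches, which is why choosing the most informative conditions matters.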
⚙️ twig’s contribution
As twig’s contributor, Satnam focused on:
Sharing our experience in automating biological workflows.
Discussing how AI-driven approaches are currently used and where they fall short.
Ensuring the developed protocols and datasets were not only scientifically rigorous but also highly applicable to real-world biomanufacturing challenges we face daily.
🔬 Why it matters
The gap between genotype and phenotype is a major bottleneck in biological sciences. By rigorously mapping this relationship, we open doors for:
Predicting microbial behaviour under various environmental conditions.
Optimising growth media formulations rapidly through machine learning.
Engineering microbes precisely tailored for specific biotechnological applications - ranging from sustainable agriculture to bioplastic and biofuel production.
🌍 Impact and applications
The outcomes of this initiative extend far beyond academia:
Biomanufacturing: Identifying novel host strains with unique capabilities, and enabling rapid optimisation of production strains and conditions, dramatically cutting R&D timelines and improving yields.
Environmental Biotechnology: Supporting the design of microbes capable of resource-efficient processes like biomining or wastewater biovalorisation.
Fundamental Research: Providing comprehensive datasets that enable researchers worldwide to tackle previously unanswerable questions about microbial growth and metabolism.
📈 Looking ahead
The Align to Innovate platform is a living resource, intended to expand over time by continuously adding new organisms, conditions, and data types. As the platform evolves, the potential to leverage it for transformative innovations in microbial biotechnology will only grow.
Participating in the workshop reinforced our belief in the power of open collaboration and standardised, AI-driven experimental methodologies. We’re excited to see this ambitious vision turn into reality and to continue collaborating within this ground-breaking community.
More details are available here: https://zenodo.org/records/15397940
Learn more about twig at: https://www.twig.bio
Contact us via info@twig.bio