Scientists have used an AI model to generate a novel fluorescent protein so different from its known natural relatives that, by the researchers' estimate, it corresponds to roughly 500 million years of evolution. The work, led by researchers at EvolutionaryScale and the Arc Institute, opens new avenues for understanding protein function and for applications in various scientific fields.
Key Takeaways
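The AI model, named ESM3, generated the sequence of a new fluorescent protein called esmGFP.
ESM3 was trained on billions of protein sequences, structures, and annotations; the 500-million-year figure is an estimate of how long natural evolution would need to produce a protein this different from known ones.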
The new protein has potential applications in medicine, environmental research, and biotechnology.
The EvolutionaryScale Model 3 (ESM3)
The ESM3 model represents a significant advance at the intersection of artificial intelligence and evolutionary biology. It was trained on 771 billion tokens derived from 3.15 billion protein sequences, 236 million protein structures, and 539 million protein annotations. This extensive training allowed ESM3 to mimic the evolutionary processes that have shaped life on Earth over billions of years.
The researchers aimed to create a new green fluorescent protein (GFP), a type of protein commonly used as a marker in biological research. The resulting protein, esmGFP, has an amino acid sequence that is only 58% identical to that of its closest known natural counterpart. The researchers estimate that a difference this large would have taken approximately 500 million years to arise through natural evolution.
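To make the 58% figure concrete, here is a minimal sketch of how a percent sequence identity can be computed from two already-aligned protein sequences. The function name `percent_identity`, the gap character, and the two short fragments are illustrative assumptions; the fragments are made up and are not the real esmGFP or natural GFP sequences. The article does not describe the alignment method the researchers used, and conventions for the denominator vary, so this shows only one common definition.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent of gap-free alignment columns where both residues match.

    Assumes the two sequences are already aligned (equal length, gaps
    marked with `gap`). This is one common convention, not necessarily
    the one used in the ESM3 study.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")

    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip columns where either sequence has a gap
        compared += 1
        if res_a == res_b:
            matches += 1

    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Toy, made-up aligned fragments (hypothetical, for illustration only).
designed = "MSKGEELFTG-VVPILVELDG"
natural = "MSRGAELFTGDVVPVLIELDG"
print(f"{percent_identity(designed, natural):.1f}% identity")
```

Under this definition, a value of 58% would mean that a little over four in every ten compared positions differ between esmGFP and its nearest known relative.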

Implications for Scientific Research
The ability to generate novel proteins through AI has profound implications for various scientific disciplines:
Biotechnology: The creation of unique proteins can lead to advancements in drug development and therapeutic applications.
Environmental Science: Researchers hope to engineer proteins capable of addressing pressing environmental issues, such as plastic waste degradation.
Evolutionary Biology: ESM3 provides a new tool for studying evolutionary processes, potentially reshaping our understanding of how life evolves.
Future Prospects
The success of ESM3 marks just the beginning of AI's role in biological research. As the model continues to evolve, it is expected to become increasingly powerful, enabling scientists to explore new frontiers in protein engineering and evolutionary studies. The open-access nature of ESM3 allows researchers worldwide to utilise this technology, fostering collaboration and innovation in the scientific community.
In conclusion, esmGFP shows that AI can generate working proteins far removed from anything known in nature, and it points toward further discoveries in protein design and its applications across various fields. The future of biology may well be intertwined with advances in AI, leading to solutions for some of the world's most pressing challenges.
Sources
AI model simulates 500 million years of evolution to generate a new fluorescent protein, MSN.
Scientists Just Used AI to Simulate 500 Million Years of Evolution, MSN.
Scientists Just Used AI to Simulate 500 Million Years of Evolution, Popular Mechanics.