How AI Is Unearthing Hidden Scientific Knowledge | Sara Beery | TED
Scientists reckon that a massive 80 percent of life on Earth is still a mystery to us. With habitats shrinking and resources dwindling, we're losing species faster than we can even find them. But what if the knowledge we need to understand and protect the natural world is already out there, hidden in millions of images and recordings? AI naturalist Sara Beery explains how we can learn to read this data before it's too late.
Key Takeaways
- We've only observed about 2 million out of an estimated 10 million species on Earth.
- Traditional data collection methods are too slow to keep up with the current extinction crisis.
- Vast ecological databases like iNaturalist contain hidden knowledge within their pixels.
- AI can help us quickly sift through millions of data points to find crucial information.
- New AI systems allow scientists to ask direct questions of databases, speeding up discovery.
- This AI-driven approach can significantly reduce the time and cost of conservation research.
The Scale of What We Don't Know
Imagine being a doctor trying to save a patient, but you can only see a fifth of their body. It would be incredibly difficult to figure out the right treatment, right? Well, that's pretty much the situation we're in with nature. We need to protect ecosystems that are in trouble, but we don't know enough about the life on our planet.
As an AI researcher and ecologist, I work on developing ways to learn more about the natural world. I believe AI can dramatically increase our knowledge of species and ecosystems. However, we need to change how we use AI in ecology. We need methods that are flexible, interactive, and that scientists can actually use to find the knowledge buried in our data.
Why is this so important? Scientists estimate there are about 10 million species on Earth, but we've only ever observed about 2 million. That means a staggering 80 percent of life is still unknown to us. And just knowing a species exists isn't enough to protect it. We need to know where it lives, what it eats, and whether it migrates and how far. This deeper information takes more than a single observation, but it's vital for understanding what puts species at risk.
For instance, if insect populations suddenly crash across North America – which is happening now – what does that mean for birds that rely on insects? Which birds will be most affected, and which can switch to other food sources? And what about the predators that eat those birds? Everything is connected, and a threat to one species can cause a whole ecosystem to collapse.
The Growing Crisis and Slow Discovery
Unfortunately, species are facing threats from all sides. Habitats are shrinking, temperatures are rising, and food and water sources are disappearing. Natural disasters and invasive species are also taking a heavy toll. As a result, extinction rates are now much higher than they used to be. Scientists and policymakers are trying to figure out what's causing this and what we can do to stop it. But it often feels like we're discovering species just in time to document their extinction. The Tapanuli orangutan, for example, was discovered in 2017 and was already critically endangered.
Traditional ways of collecting data are simply too slow for the crisis we're facing. But here's some good news: we have huge databases of ecological information that we've barely started to explore.
The Hidden Treasure in Data
Consider a platform like iNaturalist, where volunteers have uploaded 300 million images. In each image, a species has been identified. This kind of data has already been incredibly useful for science. But there's so much more knowledge hidden within the pixels themselves.
Take an image labeled as a Grant's zebra. It tells us a zebra was seen at a certain place and time. But look closer. If there are three zebras, we can identify them individually by their unique stripe patterns. This allows us to track how species move, study their social interactions, monitor their health, and even estimate population sizes. We can also see other animals in the image, like wildebeest and oxpeckers, which tells us about coexistence and the spread of disease. We can even look at the vegetation in the background to understand local carbon storage and the food chain.
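The talk does not say how individual identification turns into a population estimate. One classic approach is mark-recapture, where stripe patterns play the role of the "mark". The sketch below uses the Lincoln-Petersen estimator as an assumed example; the function name and the zebra IDs are hypothetical.

```python
def lincoln_petersen(first_survey_ids, second_survey_ids):
    """Estimate population size from two surveys of individually identified animals.

    Lincoln-Petersen estimate: N = (n1 * n2) / m, where n1 and n2 are the
    numbers of distinct individuals seen in each survey and m is the number
    seen in both.
    """
    first = set(first_survey_ids)
    second = set(second_survey_ids)
    resighted = first & second
    if not resighted:
        raise ValueError("no individuals seen in both surveys; estimate undefined")
    return len(first) * len(second) / len(resighted)


# Hypothetical IDs, as if assigned by matching stripe patterns across photos.
survey_one = ["z01", "z02", "z03", "z04"]
survey_two = ["z03", "z04", "z05", "z06", "z07", "z08"]

estimate = lincoln_petersen(survey_one, survey_two)
print(f"Estimated population: {estimate:.0f} zebras")
```

Real studies use more careful estimators and many more sightings. The point here is only that re-identifying the same animal across images is what makes the estimate possible.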
Now, imagine taking all this information from just one image and multiplying it by 300 million images on iNaturalist. Then add in other databases like millions of bioacoustic recordings, tens of millions of camera-trap images, and thousands of hours of deep-sea footage. We're sitting on an ecological goldmine, but the challenge is accessing this knowledge efficiently.
If it took you just one second to look at each image, you'd need to work full-time for 40 years to go through all the images on iNaturalist alone. This is where AI becomes a game-changer. It can help us look through all this data incredibly quickly.
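The 40-year figure can be checked with a quick calculation. This sketch assumes a "full-time" schedule of 40 hours a week for 52 weeks a year; the talk does not state the schedule, and the function name is illustrative.

```python
IMAGES = 300_000_000       # images on iNaturalist, per the talk
SECONDS_PER_IMAGE = 1      # viewing time per image, per the talk
HOURS_PER_WEEK = 40        # assumption: standard full-time week
WEEKS_PER_YEAR = 52        # assumption: no holidays or leave


def years_to_review(images, seconds_per_image,
                    hours_per_week=HOURS_PER_WEEK,
                    weeks_per_year=WEEKS_PER_YEAR):
    """Return the number of full-time working years needed to view every image."""
    total_hours = images * seconds_per_image / 3600
    hours_per_year = hours_per_week * weeks_per_year
    return total_hours / hours_per_year


years = years_to_review(IMAGES, SECONDS_PER_IMAGE)
print(f"{years:.1f} full-time years")
```

Under those assumptions the result rounds to 40 years, which matches the figure above.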
AI as a Discovery Tool
Currently, if an ecologist wants to find examples of birds eating insects in a database, they might train an AI model. This involves collecting hundreds or thousands of examples to teach the model what to look for. Once trained, the model can quickly find new examples. But gathering that many examples for every new question is still too slow.
Scientific discovery starts with curiosity – asking questions about the world. Wouldn't it be amazing if we could just ask our databases questions directly and get answers? My team at MIT has been working on a system called Inquire, which helps ecologists find answers in data without needing to collect examples or write code.
Inquire uses AI models that can measure how similar an image is to a piece of scientific language. Here's how it works: an ecologist breaks down a scientific question into search terms, for example, "bird eating insect." Inquire then compares that search directly against millions of images in seconds. It is designed to be fast and efficient, which keeps the system interactive and uses less computational power than some other AI approaches.
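The search step described above can be sketched as ranking images by their similarity to a text query. This is not Inquire's actual code. It assumes that images and the query have already been turned into numeric vectors by some model, and the three-dimensional vectors, image names, and function names are made up for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rank_images(query_vec, image_vecs, top_k=3):
    """Return the top_k image IDs, most similar to the query first."""
    scored = [(cosine_similarity(query_vec, vec), image_id)
              for image_id, vec in image_vecs.items()]
    scored.sort(reverse=True)
    return [image_id for _, image_id in scored[:top_k]]


# Placeholder embeddings; a real system would compute these with a trained model.
IMAGE_EMBEDDINGS = {
    "img_warbler_with_caterpillar": [0.9, 0.8, 0.1],
    "img_zebra_herd": [0.0, 0.1, 0.3],
    "img_butterfly_on_flower": [0.0, 0.9, 0.7],
    "img_sparrow_on_fence": [0.9, 0.0, 0.1],
}
QUERY_EMBEDDING = [0.7, 0.7, 0.0]  # stand-in for the query "bird eating insect"

ranking = rank_images(QUERY_EMBEDDING, IMAGE_EMBEDDINGS, top_k=4)
for image_id in ranking:
    print(image_id)
```

Because the image vectors can be computed once and reused for every query, each new question only needs one new vector and a round of comparisons. That is one plausible reason a system like this can stay interactive.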
Once the images are sorted by relevance, scientists can easily focus on the most likely results and verify them. This process generates human-verified data that can be exported and analyzed. One collaborator used Inquire to find thousands of examples of birds eating various food sources. They then analyzed differences in diets between summer and winter, a study that took them about three hours. A similar study done manually took 1,560 hours. The results from Inquire closely matched the manual study.
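To put the two figures side by side, here is a small calculation. The 40-hour work week is an assumption, and the variable names are illustrative.

```python
INQUIRE_HOURS = 3        # time with Inquire, per the talk ("about three hours")
MANUAL_HOURS = 1560      # time for the similar manual study, per the talk
HOURS_PER_WEEK = 40      # assumption: standard full-time week

speedup = MANUAL_HOURS / INQUIRE_HOURS
manual_weeks = MANUAL_HOURS / HOURS_PER_WEEK

print(f"Speedup: about {speedup:.0f}x")
print(f"Manual study: about {manual_weeks:.0f} full-time weeks")
```

In other words, the manual study represents most of a working year, while the Inquire-based study fit into a single afternoon.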
This is incredibly exciting because it means we can start accessing all this hidden knowledge much faster. Scientists are already using Inquire creatively to explore many different questions, such as how forests regenerate after fires, differences in species' mortality between urban and rural areas, or how flowering times are changing with the climate. The possibilities are truly endless, and any scientist can ask the questions they are interested in.
The Future of Conservation
This is just the beginning. We've shown this works for images, but similar systems could be designed for audio recordings, aerial video, satellite data, GPS tracking data, and any other type of ecological data. These different data types are all related and capture different perspectives of life on Earth. Imagine future systems that help scientists find hidden connections between all these data sources.
Of course, this alone won't solve our global nature crisis. But it helps us get the most out of the data we've already collected. This allows us to identify knowledge gaps and strategically collect new data to fill them. Ultimately, this reduces the time and cost of gathering information that supports conservation efforts, like ensuring species have the food and habitat they need when they're migrating, breeding, or recovering from disasters.
We are at a critical point. We face a huge biodiversity crisis, but we also have amazing tools to address it. Millions of people want to help with conservation and scientific discovery, and AI tools can help scientists find patterns in data at a scale impossible for humans alone. The future of conservation isn't just in remote places; it's also hidden in our ecological databases, both current and future. And that's where everyone can play a part. By collecting data and uploading it to platforms like iNaturalist, every photo, sound, and observation shared contributes to building a complete picture of life on Earth. With scientific AI tools, we can help save nature under threat.