AlphaEarth Foundations helps map our planet in unprecedented detail
Key Points
- AlphaEarth Foundations is a new AI model that integrates petabytes of Earth observation data into a unified digital "embedding" to map and monitor the planet's land and coastal waters.
- The system addresses data overload and inconsistency by combining diverse geospatial inputs into compact summaries of 10x10 meter squares, which require 16 times less storage than the other AI systems tested, making planetary-scale analysis far cheaper.
- The annual embeddings are released as the Satellite Embedding dataset, which organizations worldwide can use to create accurate custom maps for applications such as classifying unmapped ecosystems, monitoring agricultural changes, and informing conservation strategies.
AlphaEarth Foundations is an artificial intelligence (AI) model designed to generate a unified, high-resolution digital representation, or "embedding," of the Earth's land and coastal waters from petabytes of diverse Earth observation data. Earth observation data is complex, multimodal, frequently refreshed, and inconsistent across sources. To address this, the model functions like a "virtual satellite," integrating information from dozens of public sources, including optical satellite images, radar, 3D laser mapping, and climate simulations.
The core methodology involves processing this vast, disparate dataset to analyze the planet in 10x10 meter squares. For each square, the model generates a highly compact summary, known as an embedding. Each embedding is a 64-dimensional vector whose 64 components are the coordinates of a point on a sphere in 64-dimensional space, which gives a rich, continuous representation of geospatial features and how they change over time. A key innovation is the degree of compression: the embeddings require 16 times less storage space than those of the other AI systems tested, which significantly reduces the computational cost of planetary-scale analysis. The model also handles temporal data sampled at irregular intervals, allowing it to create a continuous view of a location's evolution and explain numerous measurements over time.
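To make the sphere description concrete, here is a minimal sketch of how two such embeddings might be compared. It assumes the vectors have unit length, so that the dot product equals the cosine of the angle between them. The vectors are random stand-ins rather than real embeddings, and the helper names `normalize` and `cosine_similarity` are illustrative only, not part of any AlphaEarth or Earth Engine API.

```python
import math
import random

DIM = 64  # embedding dimensionality stated in the text


def normalize(vec):
    """Scale a vector to unit length so it lies on the unit sphere."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_similarity(a, b):
    """For unit-length vectors, the dot product is the cosine of the angle."""
    return sum(x * y for x, y in zip(a, b))


rng = random.Random(0)

# Synthetic stand-in for one 10x10 m square's embedding in a given year.
year_a = normalize([rng.gauss(0, 1) for _ in range(DIM)])
# A small perturbation imitates the same square after a minor change.
year_b = normalize([x + 0.02 * rng.gauss(0, 1) for x in year_a])
# An independent random vector imitates an unrelated location.
other = normalize([rng.gauss(0, 1) for _ in range(DIM)])

sim_same = cosine_similarity(year_a, year_b)
sim_other = cosine_similarity(year_a, other)
print(f"slightly changed square: {sim_same:.3f}")
print(f"unrelated square:        {sim_other:.3f}")
```

Under this assumption, a similarity near 1 between two years suggests little change at a location, while a lower value suggests the surface has changed. Real thresholds would have to be chosen empirically.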
The Satellite Embedding dataset, powered by AlphaEarth Foundations and released through Google Earth Engine, comprises annual embeddings with over 1.4 trillion embedding footprints per year. In performance evaluations, AlphaEarth Foundations was consistently more accurate than traditional methods and other AI mapping systems, with a 24% lower error rate on average and better learning efficiency, particularly when label data was scarce. The model excels at tasks such as identifying land use, estimating surface properties, classifying unmapped ecosystems (e.g., distinguishing coastal shrublands from hyper-arid deserts), and monitoring agricultural and environmental changes. It does so even in challenging conditions, such as persistent cloud cover or the irregular satellite imaging of areas like Antarctica. Future developments include combining these temporal embeddings with general reasoning Large Language Model (LLM) agents like Gemini for enhanced analytical capabilities.
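As an illustration of how embeddings could support custom mapping with few labels, the sketch below fits a nearest-centroid classifier on five labeled examples per class. This is not the evaluation protocol used for AlphaEarth Foundations. The class names, noise level, and data are all synthetic assumptions, and the helper functions are hypothetical.

```python
import math
import random

DIM = 64  # embedding dimensionality stated in the text


def normalize(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def make_samples(center, n, noise, rng):
    """Draw n noisy unit vectors around a class center (synthetic data)."""
    return [normalize([c + noise * rng.gauss(0, 1) for c in center])
            for _ in range(n)]


def fit_centroids(labeled):
    """Average the labeled embeddings of each class and re-normalize."""
    centroids = {}
    for label, vectors in labeled.items():
        mean = [sum(col) / len(vectors) for col in zip(*vectors)]
        centroids[label] = normalize(mean)
    return centroids


def predict(centroids, vec):
    """Pick the class whose centroid has the highest dot product with vec."""
    return max(centroids, key=lambda label: dot(centroids[label], vec))


rng = random.Random(42)
# Hypothetical class names; the centers are random, not real embeddings.
centers = {name: normalize([rng.gauss(0, 1) for _ in range(DIM)])
           for name in ("shrubland", "desert", "cropland")}

# Only five labeled examples per class, imitating a scarce-label setting.
train = {name: make_samples(c, 5, 0.05, rng) for name, c in centers.items()}
centroids = fit_centroids(train)

# Held-out synthetic samples drawn from the same class centers.
test = [(name, v) for name, c in centers.items()
        for v in make_samples(c, 50, 0.05, rng)]
correct = sum(predict(centroids, v) == name for name, v in test)
accuracy = correct / len(test)
print(f"accuracy on synthetic data: {accuracy:.2f}")
```

Because the synthetic classes are well separated by construction, the accuracy printed here says nothing about real-world performance. The sketch only shows the workflow: a handful of labeled squares, a simple classifier on the 64-dimensional vectors, and a prediction for every other square.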