Spanner: Google's Globally-Distributed Database
📜 Abstract
Spanner is Google's scalable, multi-version, globally-distributed, and synchronously-replicated database. It is the first system to distribute data at global scale and support externally-consistent distributed transactions. This paper describes how Spanner is structured, its feature set, the rationale underlying various design decisions, and a novel time API that exposes clock uncertainty. Spanner's data model, based on late-binding, schematized semi-relational tables, and its query language support general-purpose transactions. In addition, Spanner has been deployed in Google's production environment for several years, having evolved from a Bigtable-like versioned key-value store. It is available as a service within Google, where it is used to manage structured data at Google scale.
✨ Summary
The paper, “Spanner: Google’s Globally-Distributed Database,” introduces Spanner, Google’s scalable, globally-distributed database, which replicates data synchronously and supports externally-consistent distributed transactions. Its most novel component is a time API, called TrueTime in the paper, that exposes clock uncertainty as an explicit interval instead of hiding it. Spanner uses these intervals when assigning commit timestamps, waiting out the uncertainty where necessary, and this is what makes external consistency achievable.
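To make the role of clock uncertainty concrete, here is a minimal Python sketch of a TrueTime-style interface and the commit-wait rule. It is an illustration under stated assumptions, not Spanner's implementation: the names `TrueTimeSketch`, `TTInterval`, and `commit_with_wait` are hypothetical, the uncertainty bound `epsilon` is a fixed constant here (in the real system it varies over time), and the replication, locking, and Paxos steps are omitted entirely.

```python
import time
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class TTInterval:
    """An interval assumed to contain the true absolute time."""
    earliest: float
    latest: float


class TrueTimeSketch:
    """Hypothetical stand-in for a time API that exposes clock uncertainty."""

    def __init__(self, epsilon: float,
                 clock: Callable[[], float] = time.monotonic):
        self.epsilon = epsilon  # assumed fixed uncertainty bound, in seconds
        self.clock = clock      # local clock; injectable for testing

    def now(self) -> TTInterval:
        t = self.clock()
        return TTInterval(t - self.epsilon, t + self.epsilon)

    def after(self, t: float) -> bool:
        """True only if t has definitely passed."""
        return self.now().earliest > t

    def before(self, t: float) -> bool:
        """True only if t has definitely not arrived yet."""
        return self.now().latest < t


def commit_with_wait(tt: TrueTimeSketch,
                     apply_write: Callable[[float], None],
                     sleep: Callable[[float], None] = time.sleep,
                     poll: float = 0.001) -> float:
    """Pick a commit timestamp, then wait until it is definitely in the past.

    Simplification: a real system would replicate the write while waiting;
    here the write only becomes visible once the wait is over.
    """
    s = tt.now().latest        # timestamp no earlier than the true time
    while not tt.after(s):     # commit wait: roughly 2 * epsilon
        sleep(poll)
    apply_write(s)             # make the write visible at timestamp s
    return s
```

Because a write only becomes visible after its timestamp has definitely passed, any transaction that starts afterwards receives a larger timestamp, which is the ordering guarantee external consistency requires. The cost is a wait of roughly twice the uncertainty bound per commit, so a tighter bound translates directly into lower commit latency.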
Spanner organizes data in schematized semi-relational tables and provides a SQL-based query language alongside general-purpose transactions. Having evolved from a Bigtable-like versioned key-value store, it has been deployed in production within Google’s infrastructure for several years, where it manages structured data at Google scale.
Spanner’s architecture has inspired a number of later database systems and studies of distributed databases. It is cited in research such as “CockroachDB: The Resilient Geo-Distributed SQL Database” (SIGMOD 2020), which reflects its influence on resilient, synchronously-replicated databases designed for global distribution.
The system also strongly influenced the design of many NewSQL and enterprise database systems that aim for high availability and strong consistency. Google Cloud’s Spanner service is one of the best-known examples of the design applied in practice. Many academic papers cite the work when exploring improvements to distributed transaction management and when revisiting consistency and scalability in large-scale systems.