PostgreSQL compatibility is a no-brainer for any modern database offering a unique feature set. Every modern database vendor is now offering some level of compatibility to reach developers and increase adoption. Recently, Google Spanner announced compatibility, saying, “PostgreSQL has emerged as the ‘API’ for operational databases.” But, even if a database claims compatibility, it can be difficult to decipher what that means in relation to another database, as not all “compatibility” is created equal.
Read on to learn about a framework you can use to analyze PostgreSQL compatibility across modern database vendors. The resulting metrics enable you to choose a database that delivers the specific capabilities you want and need.
PostgreSQL is the gold standard for operational databases. This relational database has made major strides in user adoption since 2014 and has been named “Database of the Year” several times by DB-Engines. It was voted the “Most Wanted” database in Stack Overflow's 2021 Developer Survey.
But, what makes PostgreSQL such a powerful API for modern databases?
For starters, it has an expansive, proven feature set that most technologists know and love. It supports many “post-relational” database concepts, such as in-database functions. These characteristics make it highly functional for a wide range of use cases and applications. Even better, it’s robust, fully extensible, and 100% open source. It is also supported by a vibrant community of developers, and a rich ecosystem of tools and frameworks.
Defining Key Compatibility Metrics
Modern database vendors often neglect to mention that there are four key metrics—or “levels”—of PostgreSQL compatibility. These metrics follow a natural adoption curve, from exploring a new database through to making a switch. Going further, the degree of compatibility isn’t all or nothing—databases can be higher or lower for any given metric. But, a database can only support an application to the degree it can support other forms of compatibility.
A PostgreSQL database reads some set of commands and queries off the network. These commands and queries are written in a sequence of bytes, and serialized. A database is considered wire-protocol compatible if it can understand the same sequence and serialization of data.
Wire-protocol compatibility allows PostgreSQL client drivers to communicate with a database. This means developers can start building an application with ease. They’re able to write an application in a variety of programming languages, such as Java, JDBC, Python, and Go. Finally, developers can use command-line tools—such as psql—to connect to the database and explore any number of schemas or queries.
Wire compatibility deals with how a database sequences and serializes data. With syntax compatibility, a database can parse the same PostgreSQL syntax. In other words, when a valid PostgreSQL snippet is sent to the database, it can create a table, execute a SQL statement, and return a valid result. The way it executes could be different, though. For example, in some cases, a specific set of parameters may be ignored.
If proper syntax is created, developers can use some of PostgreSQL’s tools and frameworks. They simply send the necessary commands and the database understands what to do with it.
The feature compatibility metric means a database should be able to support the advanced features of PostgreSQL, beyond just ANSI SQL. Advanced features include triggers, partial indexes, and stored procedures.
An equivalent feature doesn’t need to have the same PostgreSQL syntax to work.
However, in this instance, developers who rely on a certain PostgreSQL feature would need to re-learn equivalent feature sets and modify their applications.
Runtime compatibility ensures a database matches PostgreSQL execution semantics at runtime. Runtime-compatible databases should support queries to the system catalog, error messages, and error codes.
This compatibility metric automatically implies the three other forms of compatibility—wire, syntax, and feature are in place. As a result, developers who know PostgreSQL will feel at home from start to finish.
Compatibility Metrics in Practice
With these metrics defined, let’s take Google Spanner as an example and examine its level of PostgreSQL compatibility.
Evaluating Google Spanner
For those unaware, Spanner is a globally-distributed transactional database service. It’s a horizontally-scalable database originally built for internal Google use cases.
The database started with a proprietary API, followed by an ANSI SQL-compliant API. However, Google found adoption slow, so in November 2021 they introduced a PostgreSQL API. This new inclusion reveals a flavor of SQL was needed with a built-in user base and best-in-class functionality.
From a design perspective, Spanner is highly available and horizontally scalable. This means the database provides applications with high data integrity and on-demand scalability. In addition to this, Spanner provides distributed query processing and a distributed storage layer. These three design traits are unlike PostgreSQL.
Spanner added PostgreSQL compatibility by building in a layer that converts PostgreSQL statements at runtime to equivalent Spanner statements. Think of this process as a translating compiler or transpiler. The compiler runs as a sidecar next to an application and converts and issues Spanner statements to the database.
However, this current implementation leads to low PostgreSQL compatibility. Let’s explore why in greater detail.
Reviewing Spanner’s documentation, the following protocols are not supported:
- COPY protocol
- Prepared statement DESCRIBE
- PSQL meta-commands not included in this list
Some of these features are important, such as COPY. However, they may not always be blockers, which is why this database’s level of wire compatibility is moderate.
Spanner currently does not support several PostgreSQL data types, including:
- TIMESTAMP WITHOUT TIME ZONE
Specifically, SERIAL, TIMESTAMP, and CHAR are common in practice. As a result, Spanner’s syntax compatibility is low for applications containing these data types.
Finally, Spanner does not support several PostgreSQL features, such as:
- Ecosystem clients
- Stored procedures
- Fine-grained concurrency control
- Partial indexes
The lack of these features (and others) keeps feature compatibility low, and overall very difficult to achieve. The only available fix would be to build these features into the application layer.
As mentioned earlier, runtime compatibility implies all other forms of compatibility. Consequently, Spanner has low compatibility here. This is due to low syntax and feature compatibility, as well as moderate wire-protocol compatibility.
The Future is PostgreSQL-Compatible
For developers moving to Spanner, any PostgreSQL applications they already have in place will not run as expected. So, if you’re a developer who uses PostgreSQL, you’ll need to understand what’s supported, solve for any runtime requirements, and find workarounds for missing features.
These metrics can be applied to any modern database claiming PostgreSQL compatibility. But remember: No modern database is 100% compatible. And they shouldn’t try to be—design goals and value propositions are unique for each database.
With that said, Distributed SQL is a popular category of modern databases offering high PostgreSQL compatibility. Spanner introduced this category, but—as illustrated—ranks low in overall compatibility. However, there are open source alternatives that combine high compatibility with the horizontal scalability and resilience of cloud native architectures.
Whichever database you choose, higher compatibility with PostgreSQL provides a lower barrier to entry for developers, along with less mental friction. Therefore, applying compatibility metrics is critical to speed up (and future-proof) team adoption and productivity.