Google is hosting a version of its Cloud Next conference in Tokyo this week, and it’s putting the focus squarely on tweaking its databases for AI workloads (because at this point in 2024, AI is the only thing these major tech companies want to talk about). The announcements include updates to its Spanner SQL database, which now features graph and vector search support, as well as extended full-text search capabilities.
This wouldn’t be a Google announcement without some Gemini-powered features. These include Gemini in BigQuery and Looker to help users with data engineering and analysis, as well as governance and security tasks.
Google argues that while the vast majority of enterprises think that generative AI will be critical to the success of their business, they also know that much of their data remains unmanaged, leaving it outside of the scope of their analytics and AI initiatives.
“They have to really get out of all of their existing data silos and data islands, and get to a consolidated multimodal data platform, spanning structured and unstructured data — [because] GenAI is terrific at analyzing unstructured data — and combining data at rest with their data movement, so real-time data and data at rest processing,” explained Gerrit Kazmaier, Google’s VP and GM for Database, Data Analytics and Looker. Activating this enterprise data flow, he argued, is what a lot of these new features are all about.
Spanner gets graph and vector capabilities
Spanner powers most of Google’s own products, like Search, Gmail and YouTube, and its customer list includes the likes of Home Depot, Uber and Walmart. And while Spanner can handle a massive volume of data, vector and graph capabilities are a necessity to bring enterprise data into GenAI applications and enrich existing foundation models.
“What we’re thinking about is what would it really take for us to take Spanner’s availability, scale, relational model, and really expand that to be the best data platform for operational GenAI apps,” said Andi Gutmans, Google’s VP and GM for databases. As with so many database vendors, the first step here for Google is adding graph capabilities to Spanner, using the emerging GQL (Graph Query Language) standard. Enterprises can then use this graph to augment their GenAI applications, and the foundation models that power them, using retrieval-augmented generation (RAG), which is currently the de facto standard for this use case.
Also new in Spanner are full-text search and vector search, with the vector search capabilities backed by Google’s ScaNN algorithm. “With Spanner Graph, full-text search and vector search, we have evolved Spanner from not only being the most available, globally consistent and scalable database, to a multi-model database with intelligent capabilities that seamlessly interoperate to enable you to deliver a new class of AI-enabled applications,” Google says.
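For readers unfamiliar with the idea, vector search boils down to ranking stored embeddings by how close they are to a query embedding, and the top hits are what a RAG pipeline hands to the model as context. The sketch below is a brute-force illustration in plain Python, not Spanner's API and not ScaNN itself (ScaNN is an approximate method built to avoid comparing against every row); the document IDs, the three-dimensional vectors and the function names are all made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k(query, documents, k=2):
    """Return the IDs of the k stored vectors most similar to the query.

    This scans every document (exact, brute-force search); production
    systems use approximate indexes instead.
    """
    scored = [(cosine_similarity(query, vec), doc_id) for doc_id, vec in documents.items()]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc_id for _, doc_id in scored[:k]]


# Hypothetical embeddings; real ones would come from an embedding model
# and have hundreds of dimensions.
DOCUMENTS = {
    "returns-policy": [0.9, 0.1, 0.0],
    "store-hours": [0.1, 0.9, 0.1],
    "shipping-rates": [0.7, 0.2, 0.3],
}

if __name__ == "__main__":
    # The retrieved IDs would then be used to fetch text for the model's prompt.
    print(top_k([1.0, 0.0, 0.0], DOCUMENTS, k=2))
```

In a database with built-in vector search, the ranking step happens inside the query engine rather than in application code, which is the convenience Google is pitching here.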
In addition to these AI-centric updates, Spanner is getting a new, optional pricing structure. Dubbed “Spanner editions,” it is a tier-based pricing model meant to give customers more flexibility. Until now, Google Cloud customers had to choose between a single-region offering and a multi-region version, which also offered a bundle of additional features like replication.
Bigtable goes SQL
On Thursday, Google also announced a major update to Bigtable, its NoSQL database for unstructured data and latency-sensitive workloads. Bigtable now features SQL support (or more precisely, support for GoogleSQL, the company’s own SQL dialect), making it significantly easier for virtually any developer to use the service.
Previously, developers had to use the Bigtable API to query their databases. Currently, Bigtable supports roughly 100 SQL functions.
Oracle on Google Cloud
For the Oracle database fans out there, Google will now allow them to host their Oracle Exadata and Autonomous Database services right in Google Cloud data centers, and they can link their applications between Google Cloud and the Oracle Cloud. For Google, that means more workloads in its cloud, and for Oracle, it at least means these users are still paying their licensing fees, even if they aren’t using the Oracle cloud.
Also new in Google Cloud is support for open-source Apache Spark and Kafka for data streaming and processing, as well as real-time streaming from Analytics Hub (Google’s service for securely sharing data between organizations).