2025 (c) Timescale, Inc., d/b/a Tiger Data. All rights reserved.


Published at Aug 29, 2024

Vector Store vs. Vector Database: Understanding the Connection



Written by Anya Sage

Vector embeddings—numerical representations of words, phrases, or other data in a high-dimensional space—are a critical component of semantic search and AI systems. They allow machines to capture semantic meaning by encoding relationships and similarities between concepts. Yet embedded vectors are a unique data type that requires special handling due to their high-dimensional nature. To address this need, two related data storage systems have emerged: vector stores and vector databases. 

The terms “vector store” and “vector database” are often used interchangeably, so parsing the exact connection between them can be hard. But it’s a connection that’s important to understand because it sheds light on the nature of vector data storage/retrieval and the technical details of building vector data systems. 

To explain the vector store vs. vector database connection, we first define vector stores and vector databases. Then, we examine the relationship between them and the resulting technical complexities. Finally, we consider what to look for when evaluating vector databases for your projects and show how Timescale's vector database, anchored on our mature PostgreSQL cloud platform, meets these assessment criteria. Let's dive in.

What's the Difference Between a Store and a Database?

Before we zoom into the specifics, let’s define vector stores and vector databases. Both tools are designed for storing and searching embedded vectors. However, there are subtle differences between them that outline their relationship and functionality.

What is a vector store?

A vector store is a specialized system designed for holding embedded vectors. Due to the unique properties of vector embeddings, vector stores require specific design considerations that set them apart from traditional data storage systems.

Vector embeddings are high-dimensional numerical representations of data often used in machine learning and natural language processing tasks. An embedding is a compact representation of raw data, such as an image or text, transformed into a vector comprising floating-point numbers. It’s a powerful way of representing data according to its underlying meaning.


Vector embeddings work by representing features or objects as points in a multidimensional vector space, where the relative positions of these points represent meaningful relationships between the features or objects.

The key characteristics of vector stores include:

  1. Optimization for high-dimensional data: Vector embeddings typically consist of hundreds or thousands of dimensions, which pose unique challenges for storage and retrieval.

  2. Specialized retrieval algorithms: Unlike traditional databases that use exact matching queries, vector stores employ nearest-neighbor searches with specific distance metrics. These algorithms, such as those found in the scikit-learn library, are designed to find the most similar vectors based on their numerical properties. As scikit-learn explains: “The principle behind nearest neighbor methods is to find a predefined number of training samples closest in distance to the new point, and predict the label from these.” 

  3. Efficiency focus: Traditional databases are often inefficient when dealing with vector data. Vector stores are built from the ground up to efficiently handle the storage and retrieval of high-dimensional vectors.

  4. Limited data type flexibility: To optimize performance, vector stores typically focus on supporting high-dimensional numerical data, sacrificing the versatility (of handling various data types) found in general-purpose databases.

  5. Streamlined schema designs: Compared to general-purpose databases, vector stores often have less flexible schema designs, prioritizing structures that are optimized for vector data.

  6. Specialized query support: Instead of supporting a wide range of query types, vector stores are optimized for nearest neighbor retrieval, which is the primary operation performed on vector data.
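To make the nearest-neighbor idea concrete, here is a minimal sketch in plain Python. It is an exact, brute-force scan rather than the optimized algorithms a real vector store would use, and the three-dimensional embeddings and the names `store` and `nearest_neighbors` are made up for illustration.

```python
import math

def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_neighbors(query, vectors, k):
    """Return the ids of the k stored vectors closest to the query.

    This is an exact, brute-force scan; production vector stores
    typically replace it with an approximate index.
    """
    ranked = sorted(vectors, key=lambda vid: euclidean(query, vectors[vid]))
    return ranked[:k]

# Hypothetical 3-dimensional embeddings (real ones have hundreds of dimensions).
store = {
    "cat": [0.9, 0.1, 0.0],
    "kitten": [0.8, 0.2, 0.1],
    "car": [0.0, 0.9, 0.4],
}

result = nearest_neighbors([0.88, 0.12, 0.02], store, k=2)
print(result)
```

The scan compares the query against every stored vector, which is why dedicated indexing matters once a store holds millions of embeddings.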

What is a vector database?

A vector database, on the other hand, is a more comprehensive system that incorporates the capabilities of a vector store while providing additional features and functionality. Here are the key characteristics of a vector database:

  1. Extended database functionality: vector databases are often built as extensions of existing database systems, adding vector storage and retrieval capabilities to proven database technologies.

  2. Integration of vector and relational data: these systems connect stored vectors to the robust, complex query systems and structured data typically found in relational databases.

  3. Broader query support: vector databases allow for more complex queries that can combine vector similarity searches with traditional database operations.

  4. Flexible data model: unlike pure vector stores, vector databases can often handle a mix of vector and non-vector data types, providing greater versatility for complex applications.

  5. Advanced indexing and optimization: many vector databases incorporate advanced indexing techniques to improve the performance of both vector and non-vector queries.
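The "broader query support" idea can be sketched as a structured filter followed by a similarity ranking. The example below is an in-memory illustration only, not how any particular database executes such a query; the `Document` type, its fields, and the sample data are hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class Document:
    doc_id: int
    category: str      # ordinary structured column
    embedding: list    # vector column

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors (1.0 means same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def search(docs, query, category, k):
    """Filter on a structured field, then rank the survivors by similarity."""
    candidates = [d for d in docs if d.category == category]
    candidates.sort(key=lambda d: cosine_similarity(query, d.embedding), reverse=True)
    return [d.doc_id for d in candidates[:k]]

docs = [
    Document(1, "news", [1.0, 0.0]),
    Document(2, "blog", [0.9, 0.1]),
    Document(3, "news", [0.0, 1.0]),
    Document(4, "news", [0.7, 0.7]),
]

top = search(docs, query=[1.0, 0.1], category="news", k=2)
print(top)
```

Document 2 is the closest vector to the query overall, but it is excluded because it fails the category filter. That interplay between structured conditions and similarity ranking is what a pure vector store does not offer on its own.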

How Vector Stores and Vector Databases Are Related

To understand the relationship between vector stores and vector databases, think of them as interconnected components within a larger system. Most available vector storage and retrieval systems are actually vector databases that contain a vector store as a core component.

The relationship can be described as follows:

  • Vector store as a subsystem: the vector database has a vector store contained within it, serving as the specialized component for efficiently holding and searching vector data.

  • Database wrapper: the larger database system acts as a wrapper around the vector store, providing additional functionality and integration capabilities.

  • Connecting systems: the database layer includes systems that allow database queries to interact with the store's retrieval functions, bridging the gap between traditional database operations and vector-specific functionality.

A prime example of this relationship can be seen in the pgvector extension for PostgreSQL (enabling open-source vector similarity search for PostgreSQL so you can store your vectors with the rest of your data). This extension introduces tools for creating tables optimized for vector data storage and provides function calls that perform different types of nearest neighbor searches on vector tables.

In this case:

  • The vector tables created using pgvector serve as the vector store component.

  • The larger PostgreSQL database, with the pgvector extension, becomes a full-fledged vector database.

  • SQL queries can incorporate pgvector functions, seamlessly integrating vector operations (the vector store subsystem) with traditional database queries (the larger database system design).

The pgvector extension allows developers to leverage the power of vector similarity searches while still benefiting from the robust features and familiarity of a traditional relational database system.
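As a rough sketch of what this looks like in practice, the Python snippet below holds SQL that follows pgvector's documented syntax (the `vector` column type and the `<->` Euclidean distance operator). The table name `items`, its columns, and the `%s` placeholders (which assume a psycopg-style driver) are illustrative assumptions, and the snippet only builds the statements and parameters rather than connecting to a database.

```python
# SQL statements following pgvector's documented syntax. The table and
# column names ("items", "category", "embedding") are hypothetical.
CREATE_EXTENSION = "CREATE EXTENSION IF NOT EXISTS vector;"

CREATE_TABLE = """
CREATE TABLE items (
    id bigserial PRIMARY KEY,
    category text,
    embedding vector(3)
);
"""

# <-> is pgvector's Euclidean (L2) distance operator. The WHERE clause is
# ordinary SQL, so the vector search and the relational filter share one query.
NEAREST_IN_CATEGORY = """
SELECT id, category
FROM items
WHERE category = %s
ORDER BY embedding <-> %s::vector
LIMIT 5;
"""

def to_vector_literal(values):
    """Format a Python sequence in pgvector's text form, e.g. '[1.0,2.0,3.0]'."""
    return "[" + ",".join(str(float(v)) for v in values) + "]"

# Parameters a driver would bind to the two placeholders above.
params = ("news", to_vector_literal([3, 1, 2]))
print(params)
```

With a live connection, a driver call along the lines of `cursor.execute(NEAREST_IN_CATEGORY, params)` would run the filtered nearest-neighbor query.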

What to Look for in a Vector Store and Vector Database

When evaluating vector databases for your projects, there are key factors to consider. These factors help ensure that you choose a database that not only performs well but also integrates smoothly with your existing infrastructure and workflows.

Well-optimized vector store

Adding high-dimensional schema support and nearest-neighbor search capabilities to a database is not especially difficult. Optimizing these features for production use is the real challenge. A production-ready vector database should have a store component with the following characteristics:

  1. Efficient, fast storage: The system should be able to quickly insert, update, and delete vector data, even when dealing with large datasets.

  2. State-of-the-art nearest neighbor algorithms: Optimizing nearest neighbor search is an active field in algorithm research. The best systems stay at the cutting edge of these developments and implementations, continuously improving their performance.

  3. Scalability: The vector store should be able to handle growing datasets without significant performance degradation.

  4. Memory efficiency: Given the high-dimensional nature of vector data, efficient memory usage is needed to maintain performance as data volumes increase.

  5. Support for multiple distance metrics: Different applications may require different similarity measures, so a versatile vector store should support various distance metrics (for example, Euclidean, cosine, dot product).
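To see why the choice of metric matters, the short sketch below implements the three measures named above in plain Python and applies them to the same vectors. The vectors are invented for illustration.

```python
import math

def dot_product(a, b):
    """Inner product; grows with both alignment and vector length."""
    return sum(x * y for x, y in zip(a, b))

def euclidean_distance(a, b):
    """Straight-line (L2) distance between two points."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def cosine_distance(a, b):
    """1 - cosine similarity; 0 means the vectors point the same way."""
    norm = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    return 1.0 - dot_product(a, b) / norm

a = [1.0, 0.0]
b = [2.0, 0.0]   # same direction as a, twice the length
c = [0.0, 1.0]   # orthogonal to a

for name, other in (("b", b), ("c", c)):
    print(name, euclidean_distance(a, other), cosine_distance(a, other), dot_product(a, other))
```

Cosine distance treats `a` and `b` as identical because it ignores length, while Euclidean distance and dot product do not. Which behavior is right depends on how the embedding model was trained, hence the value of a store that supports several metrics.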

Clean connection with the database

While vector store efficiency and speed are critical, it's equally important that this component integrates smoothly with the broader database system. Some considerations in this area include:

  1. Intuitive syntax: Vector stores designed solely with optimization in mind can sometimes come with clunky or unusual syntax. Look for systems that offer a clean, intuitive interface for vector operations.

  2. Compatibility with database features: The vector store should work well with broader database tools like indexing, transactions, and backup systems.

  3. Query integration: It should be easy to combine vector similarity searches with traditional database queries, allowing for complex operations that leverage both vector and non-vector data.

  4. Consistent data types: The system should provide a seamless way to work with vector data types alongside standard database types.

  5. Performance optimization: Look for systems that can optimize queries involving both vector and non-vector operations.

Familiar and robust database system

Some vector database products offer excellent vector store performance and database integration (the two properties mentioned above), but they may be built from the ground up as entirely new systems. This specificity can introduce a significant learning curve and potential challenges:

  1. Learning new tools: Adopting a completely new database system can be time-consuming and costly, especially when it's primarily to handle one specific data type.

  2. Integration challenges: New systems may not easily integrate with existing tools and workflows in your organization.

  3. Limited community support: Newer, specialized systems might have smaller user communities, making it harder to find solutions to problems or best practices.

  4. Uncertain long-term support: There's always a risk that a new, specialized system might not receive long-term support or updates.

An ideal vector database builds on existing, well-supported database systems to mitigate these risks. This approach offers several advantages:

  • Shorter learning curve: Developers can leverage their existing knowledge of familiar database systems.

  • Robust ecosystem: Established databases often have a wide range of tools, extensions, and integrations available.

  • Large community: Popular database systems have large, active communities that can provide support and share knowledge.

  • Long-term stability: Well-established database systems are more likely to receive ongoing updates, security patches, and feature improvements.

  • Easier talent acquisition: It's typically easier to find developers experienced with popular database systems than those familiar with highly specialized new tools.

The vector store vs. vector database decision is a paradox of choice captured well in Making Postgres a Better AI Database: 

“The strength of the PostgreSQL ecosystem is what makes it the most loved database for professional developers…However, the rise of AI applications that leverage the capabilities unlocked by large language models (LLMs) means that developers now demand more from their databases. In order to become the preferred AI database, PostgreSQL, as we know it, will have to adapt to these new developer needs.”

As it turns out, evolving to meet changing developer needs is exactly what PostgreSQL has done since its inception, thanks to its rich extension ecosystem and community. So, with the specialized extensions as explained below, using Postgres for AI delivers the best of both worlds—better vector storage and retrieval performance without losing time-tested reliability and convenience. 

What Timescale Offers: PostgreSQL as a High-Performance Vector Database

Timescale's vector database system lives in Timescale Cloud, which enables developers to build production AI applications at scale with PostgreSQL.

With Timescale Cloud, developers can access pgvector, pgvectorscale, and pgai—extensions that turn PostgreSQL into an easy-to-use and high-performance vector database, plus a fully managed cloud database experience. It is designed to meet the demanding requirements of modern vector data applications while building on the strengths of established database technology—something that developers in companies large and small are excited about. 

But how does Timescale address the vector database selection criteria discussed above? Let’s examine that. 

Production-level vector store performance

Timescale Cloud offers an open-source PostgreSQL stack for AI applications, built on the pgvector, pgvectorscale, and pgai extensions. These extensions make Postgres the de facto database for building AI applications because they: 

  • Eliminate the need to use a standalone vector database in your AI data stack

  • Lower the barriers for adopting and scaling PostgreSQL for your AI applications

  • Empower you to easily build and scale RAG, search, and agent applications

Here’s a quick overview of each extension: 

  • Pgvector is the popular open-source extension that adds vector storage and similarity search to PostgreSQL.

  • Pgai brings more AI workflows—like embedding creation and model completion—to PostgreSQL, making it easier for developers to build search and retrieval augmented generation (RAG) applications.

  • Pgvectorscale builds on pgvector to enable development of more scalable AI applications, with higher-performance embedding search and cost-efficient storage.

Our benchmark test (using a dataset of 50 million Cohere embeddings of 768 dimensions each) compared the performance of PostgreSQL with pgvector and pgvectorscale against Pinecone, widely regarded as the market leader for specialized vector databases. The results showed that using PostgreSQL with pgvector and pgvectorscale dismantles the argument of “greater performance” often made to justify choosing a dedicated vector database. 

Compared to Pinecone’s storage-optimized index (s1), PostgreSQL with pgvector and pgvectorscale achieves 28x lower p95 latency and 16x higher query throughput for approximate nearest neighbor queries at 99% recall.


PostgreSQL with pgvector and pgvectorscale extensions outperformed Pinecone’s s1 pod-based index type, offering 28x lower p95 latency.
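For readers unfamiliar with the recall figure quoted in the benchmark, the sketch below shows the usual definition of recall for approximate nearest neighbor search: the share of the true top-k neighbors that the approximate index returned. This is the common textbook calculation, not a reproduction of the benchmark's own tooling, and the id lists are invented.

```python
def recall_at_k(approx_ids, exact_ids):
    """Fraction of the true top-k neighbors that the approximate search returned."""
    exact = set(exact_ids)
    if not exact:
        raise ValueError("exact_ids must not be empty")
    return len(exact & set(approx_ids)) / len(exact)

exact_top5 = [11, 42, 7, 93, 5]    # hypothetical ground truth from a full scan
approx_top5 = [42, 11, 7, 5, 60]   # hypothetical result from an approximate index

recall = recall_at_k(approx_top5, exact_top5)
print(recall)
```

Here the approximate search found four of the five true neighbors. A 99% recall target means the index is tuned so that, averaged over many queries, almost none of the true neighbors are missed.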

The benchmark test also revealed compelling cost benefits.


Self-hosting PostgreSQL with pgvector and pgvectorscale offers better performance while being 75-79% cheaper than using Pinecone.

PostgreSQL on Timescale Cloud has additional unique capabilities for handling vector data at scale. Its production-level performance features include:

  1. Advanced indexing algorithms: Pgvectorscale’s Streaming DiskANN overcomes limitations of in-memory indexes like HNSW by storing part of the index on disk, making it more cost-efficient to run and scale as vector workloads grow. Pgvectorscale’s Streaming DiskANN includes support for Statistical Binary Quantization (SBQ), a novel binary quantization method (developed by researchers at Timescale) that improves accuracy over traditional methods of quantization. 

  2. Efficient storage: Hybrid time-based vector search is optimized, leveraging the automatic time-based partitioning and indexing of Timescale’s hypertables to efficiently find recent embeddings, constrain vector search by a time range or document age, and store and retrieve LLM response and chat history with ease.

  3. Simplified AI stack: PostgreSQL on Timescale Cloud provides a single place for vector embeddings, relational data, time-series data, and event data that powers next-generation AI applications.

  4. Support for streaming filtering: pgvectorscale supports streaming filtering, which allows accurate retrieval even when secondary filters are applied during similarity search.
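To give a feel for what binary quantization does, here is a minimal sketch of the traditional sign-based approach: each dimension is reduced to one bit, and vectors are compared by counting differing bits. This is explicitly not SBQ, whose details are not covered in this article; it is the simpler baseline that SBQ is described as improving on, and the sample vectors are invented.

```python
def binarize(vector):
    """Plain sign-based quantization: one bit per dimension (1 if positive)."""
    bits = 0
    for value in vector:
        bits = (bits << 1) | (1 if value > 0 else 0)
    return bits

def hamming_distance(a_bits, b_bits):
    """Number of dimensions where two binarized vectors disagree."""
    return bin(a_bits ^ b_bits).count("1")

v1 = [0.4, -1.2, 0.3, 0.9]
v2 = [0.5, -0.7, -0.1, 1.1]    # close to v1
v3 = [-0.6, 0.8, -0.2, -0.3]   # roughly opposite to v1

codes = [binarize(v) for v in (v1, v2, v3)]
print(codes)
print(hamming_distance(codes[0], codes[1]), hamming_distance(codes[0], codes[2]))
```

Each four-dimensional float vector shrinks to four bits, and comparing codes becomes a cheap XOR and bit count. The trade-off is lost precision, which is the gap that more refined quantization methods aim to narrow.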

Built on Timescale's PostgreSQL foundation

A major Timescale advantage is its foundation in PostgreSQL, one of the most popular and well-supported open-source databases. PostgreSQL is emerging as the de facto database standard; as noted by Timescale CEO and co-founder Ajay Kulkarni in Why PostgreSQL Is the Bedrock for the Future of Data, “PostgreSQL for Everything” has become a “growing war cry among developers.” It’s a war cry for stack simplification at a time when specialized database proliferation has led to overly complex data pipelines.

Timescale Cloud for AI and vector data works with everything in your AI stack.

Timescale Cloud’s foundation in PostgreSQL provides several advantages:

  1. Proven reliability: PostgreSQL has been battle-hardened by production use for over three decades. Timescale Cloud inherits that reliability.

  2. Rich ecosystem: Users can leverage the vast array of tools, extensions, and integrations available for PostgreSQL, which have effectively turned PostgreSQL into a full-fledged platform.

  3. SQL support: Timescale Cloud allows users to combine vector similarity searches with standard SQL queries, providing powerful data manipulation capabilities.

  4. Built-for-PostgreSQL scalability: Timescale’s tiered storage architecture makes PostgreSQL big-data-ready by leveraging the flexibility of PostgreSQL and hypertables for effective data management. With tiered storage, you can automatically tier your data between disk and object storage (S3), effectively creating the ability to have an infinite table. 

  5. Advanced features: Alongside vector operations, users can take advantage of PostgreSQL features like full-text search, window functions, and JSON support, as well as capabilities like streaming replication, hot standby, and in-place upgrades.

Robust support and functionality

As a high-performance, developer-focused cloud platform, Timescale Cloud provides PostgreSQL services for the most demanding workloads, whether AI, time-series, analytics, or event workloads. It is ideal for production applications and provides a worry-free and easy development experience, with programmatic APIs, one-click database forking, high availability (HA), read replication, seamless upgrades, and expert support, plus robust security and privacy functionality. Timescale Cloud also benefits from PostgreSQL’s rock-solid foundations:

  1. Community support: users can tap into the vast knowledge base of the PostgreSQL community, which continues to make the core better and is witnessing more companies contributing, including the hyperscalers. 

  2. Timescale-specific support: Timescale provides additional support through its own community and team, focusing on vector-specific features and optimizations.

  3. Regular updates: as part of the larger Timescale ecosystem, Timescale Cloud receives regular updates and improvements.

  4. Time-series optimization: leveraging Timescale's expertise in time-series data, Timescale Cloud for AI and vector data offers powerful optimization potential for applications that combine vector and time-series data.

Conclusion

As we've explored in this article, vector stores and vector databases are closely related tools in the domain of high-dimensional data storage and retrieval. Vector stores are storage and retrieval tools optimized around the specific technical requirements of embedded vector data, while vector databases connect vector stores to familiar structured database systems.

Understanding the connection between vector stores and databases is important for teams evaluating tools for their vector-data-based projects. Whether you're working on semantic search, recommendation systems, or other AI-powered applications that rely on vector embeddings, that understanding allows more informed decisions when choosing a system that balances performance, usability, and integration with existing workflows.

This article reviewed assessment criteria for vector databases. Timescale Cloud meets them by offering powerful, robust store and database tools built on a familiar PostgreSQL backbone. By combining cutting-edge vector search capabilities with the reliability and extensive feature set of PostgreSQL, Timescale Cloud provides a compelling option for organizations looking to implement vector-based applications without sacrificing database functionality or ease of use. 

Try Timescale Cloud for free

With one database for your application's metadata, vector embeddings, and time-series data, you can say goodbye to the operational complexity of data duplication, synchronization, and keeping track of updates across multiple systems. Let’s sum up what you get with Timescale:

  • One mature cloud platform for your AI application (for vector, relational, and time-series data) 

  • Flexible and transparent pricing that decouples compute and storage

  • Ready to scale from day one so you can push to production with confidence

  • Enterprise-grade security and data privacy, including SOC2 Type II and GDPR compliance

Ready to explore the capabilities of Timescale Cloud and see if it’s right for your vector project? You can find out by trying it for free today.
