May 02, 2025
Choosing a vector database in 2025 is anything but easy. Every database system now seems to have vector search capabilities, from specialized vector-first solutions to traditional databases with add-ons. But basic main-memory HNSW (hierarchical navigable small world) vector search support doesn't equal production readiness. Many teams discover this painful truth only after their chosen solution buckles under production-scale vector search demands, forcing expensive migrations or performance compromises.
The challenge isn't just finding a vector database but finding one that your team knows how to run in production, can truly scale with your application's growing needs, and fits in with the rest of your data infrastructure. In this blog post, we’ll compare Qdrant versus Postgres with the pgvector and pgvectorscale extensions, two of the most popular open-source vector databases for developing AI applications. The bets are on: Which open-source vector database can search a dataset of 50 million Cohere embeddings with acceptable latency and throughput?
In one corner, we have Qdrant, an open-source specialized vector database designed for vector similarity search workloads. In the other corner, there's PostgreSQL, the popular and robust general-purpose relational database that gains vector capabilities through the pgvector extension, and with pgvectorscale adding specialized data structures and algorithms for large-scale vector search. Pgvectorscale (part of the pgai family) extends pgvector with StreamingDiskANN, a purpose-built search index for high performance and cost-efficient scalability.
Postgres and its vector search extensions are open source, with the flexibility to develop and deploy locally, self-host on-prem or in the cloud, or use a managed cloud service like Timescale Cloud. Postgres is a very mature database with advanced features necessary for production: high availability, streaming replication, point-in-time recovery, and observability. Qdrant offers vector search and filtering, but it is newer to the ecosystem and has a different set of operational features.
Under the hood, pgvectorscale and Qdrant share some similarities: Both are Rust-based implementations that support high-performance search with binary quantization (BQ) for efficient vector storage and filtered search capabilities. These commonalities can make the choice even more challenging for developers evaluating their options.
So the question is: When building an AI application, do you need a specialized vector database like Qdrant, or can you leverage Postgres’s familiar ecosystem that you might already know how to operationalize (and deploy in your data stack)? And more importantly, which performs better for large-scale vector workloads common in production AI applications like streaming video search, recommendation systems, and unstructured retrieval for RAG (retrieval-augmented generation) and agentic applications?
Before we dive into the full comparison, here’s the short answer:
The results show that Postgres is able to deliver in high-performance vector search use cases, despite its status as a general-purpose database rather than a specialized vector database: At 99 % recall, Postgres with pgvector and pgvectorscale achieves an order of magnitude more throughput than Qdrant and keeps latencies below 100 ms even while running queries in parallel.
Qdrant does achieve better tail latencies for high recall vector search and remains a solid choice for niche high-performance use cases. These results are consistent with the benchmark comparison we conducted in 2024 between Postgres with pgvector and Pinecone, another leading specialized vector database on the market.
Our tests show that despite being a general-purpose database, pgvector and pgvectorscale transform Postgres into a high-performance vector database capable of matching—or even outperforming—leading specialized vector databases like Qdrant on large-scale vector search workloads.
We believe that your default choice should be a general-purpose database (and we are, of course, biased towards Postgres) unless there is a compelling reason to switch to a specialized database. In the case of vector search, we don’t see one.
As the meme goes, “Postgres is all you need.”
Using Postgres empowers development teams to confidently build on the foundation they already know and trust, extending it with purpose-built extensions for vector search. This approach leverages existing operational knowledge, consolidates infrastructure, allows joins and other SQL operations to be combined with vector search, and simplifies the technology stack. If you could do that—without compromising performance—why wouldn’t you?
The performance demonstrated in our evaluation stands as a testament to the Postgres community's commitment to evolution and adaptation. Through continuous innovation and collaborative development, Postgres remains relevant even as data workloads transform in the AI era.
That said, we recognize that certain use cases may benefit from Qdrant's strengths, particularly applications requiring native horizontal scaling across many nodes or deployment scenarios where dedicated vector search services align better with architectural goals. These workloads specifically benefit from Qdrant's implementation characteristics.
The optimal choice ultimately depends on your requirements, existing infrastructure, and team expertise. We believe these benchmark results provide valuable data to inform that decision, showing that the "Postgres vs. specialized vector database" question isn't as clear-cut as many assume. With the right extensions, Postgres delivers competitive performance while maintaining the advantages of a mature, general-purpose database system that your team already knows how to operate.
Now that you have an overview, let’s dive into the specifics of how Qdrant compares to Postgres for large-scale vector search.
When evaluating vector databases for production AI applications, understanding the architectural differences is critical. These design choices impact performance, scalability, operational complexity, and cost-effectiveness in ways that directly affect your application's success.
Benchmarking tool: We used a fork of the industry-standard, open-source ANN-Benchmarks tool to benchmark both Qdrant and Postgres with pgvectorscale. Before testing performance, we modified it to measure parallel throughput in queries per second (QPS) when using multiple threads. We also made modifications to run different queries to warm up (versus test) the index. You can find all of our modifications in this tag of our fork of ANN-Benchmarks.
Dataset: 50 million Cohere embeddings of 768 dimensions each. The dataset was created by concatenating multiple Cohere Wikipedia datasets until we had 50 million vectors of 768 dimensions in our training dataset and 1,000 in our test dataset. Links to datasets are publicly available on HuggingFace here:
Client machine details: A standalone client machine ran the ANN-Benchmarks tool. We used AWS r6id.4xlarge instances, which have 16 vCPUs and 128 GB of RAM. We downloaded the dataset before the benchmarking started rather than streaming it during the runs, and stored it on the EC2 instance store volumes.
Database server machine details: We used AWS r6id.4xlarge EC2 machines, which have 16 vCPUs and 128 GB RAM. Disk storage used a 950 GB locally attached NVMe SSD. The machine ran Ubuntu 24.04. At the time of publishing, the monthly cost for such a machine was $835.
Testing methodology: We only tested approximate nearest neighbor search queries (ANN search). The queries did not involve filtering. The client ran 29,000 queries in each benchmark using training vectors to “pre-warm” the system. Then, the client used the 1,000 “real” test vectors, which were different from the pre-warm set, to query. We only used the figures from the test vectors for the results.
Performance metrics: We used the standard metrics reported by ANN-Benchmarks, but report on the following in this post: recall, query latency (p50, p95, and p99 percentiles), and query throughput measured in queries per second.
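For context, recall here is the standard ANN-Benchmarks notion: the fraction of the true nearest neighbors that the approximate search actually returns. A minimal illustrative sketch (not part of the benchmark harness) of how recall@k is computed:
def recall_at_k(approx_ids, exact_ids, k=10):
    # Fraction of the true k nearest neighbors returned by the approximate search.
    return len(set(approx_ids[:k]) & set(exact_ids[:k])) / k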
Favorable configurations for testing query latency and query throughput: Qdrant has a batch mode, which we used to test query throughput performance. In the batch mode, query latency is reported per batch, so we turned off batch mode to get per-query latency results for a fair query latency assessment. Rather than batching, pgvectorscale supports parallel query execution via threads, so both query latency and query throughput results reflect parallel query processing being enabled.
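For reference, Qdrant's batch mode groups multiple search requests into a single call. A minimal sketch with the Python client (the collection name and query vectors are hypothetical; this is illustrative, not the benchmark harness itself):
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# One request per query vector; Qdrant executes the whole batch in a single call.
results = client.search_batch(
    collection_name="cohere_50m",
    requests=[
        models.SearchRequest(vector=vec, limit=10)
        for vec in query_vectors  # assumed list of 768-dimensional query vectors
    ],
)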
HNSW index: We used Qdrant's HNSW index as the ANN index for vector searches.
Indexing during upload: We disabled indexing during the bulk upload (indexing_threshold=0), then re-enabled it after the upload completed.
ef_construct: Construction-time exploration factor (tested with values 64-512).
hnsw_ef: Search-time exploration factor (tested with values 8-768).
Note on finding the right index parameters: We should note that we had trouble finding the right parameters for Qdrant’s HNSW. The defaults weren’t great, and it was time-prohibitive to test all the possibilities used by ANN-Benchmark on such a big dataset. We iterated for weeks to try to find the right values through trial-and-error, but it’s always possible we missed a better configuration. We welcome any feedback here and will commit to updating the blog post if we find a better set of configuration values.
For the 99 % recall threshold, we used the following HNSW parameters: ef_construct=64 and hnsw_ef=768.
For the 90 % recall threshold, we used the following HNSW parameters: ef_construct=64 and hnsw_ef=48.
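As a concrete illustration, here is a minimal sketch of how the Qdrant configuration above maps onto the Python client API for the 99 % recall runs. The collection name is hypothetical, cosine distance is assumed, and query_embedding is assumed to exist; this is illustrative rather than the exact benchmark setup.
from qdrant_client import QdrantClient, models

client = QdrantClient(url="http://localhost:6333")

# Create the collection with indexing disabled for the bulk upload
# (indexing_threshold=0) and the construction-time parameter ef_construct=64.
client.create_collection(
    collection_name="cohere_50m",  # hypothetical collection name
    vectors_config=models.VectorParams(size=768, distance=models.Distance.COSINE),
    hnsw_config=models.HnswConfigDiff(ef_construct=64),
    optimizers_config=models.OptimizersConfigDiff(indexing_threshold=0),
)

# ...bulk-upload the 50M vectors, then re-enable indexing...

# Search with the search-time exploration factor used for the 99 % recall runs.
hits = client.search(
    collection_name="cohere_50m",
    query_vector=query_embedding,  # assumed 768-dimensional query vector
    search_params=models.SearchParams(hnsw_ef=768),
    limit=10,
)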
General approach: We experimented with various Postgres machine, database, and index configurations. We self-hosted the Postgres instance on AWS EC2 to accurately reflect the experience of running fully open-source software for developers.
StreamingDiskANN index: We used the StreamingDiskANN index for large-scale approximate nearest neighbor search. The StreamingDiskANN index for pgvector is a key innovation introduced by the pgvectorscale extension.
StreamingDiskANN index parameters: We used the following index parameters; most are default values, and non-default parameters are marked with an asterisk (*):
99 % recall threshold configuration:
num_neighbors: 50
search_list_size: 100
max_alpha: 1.2
query_rescore: 400* (default: 50)
query_search_list_size: 75* (default: 100)
num_bits_per_dimension: 0
use_bq: True
pq_vector_length: 0
90 % recall configuration:
num_neighbors: 50
search_list_size: 100
max_alpha: 1.2
query_rescore: 115* (default: 50)
query_search_list_size: 75* (default: 100)
num_bits_per_dimension: 0*
use_bq: True*
pq_vector_length: 0*
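To make the configuration above concrete, here is a minimal SQL sketch of how the build-time and query-time parameters for the 99 % recall configuration could be applied. The table and column names are hypothetical, the option and GUC names follow pgvectorscale's documented interface and may vary by version, and the use_bq and pq_vector_length settings above are omitted from this sketch.
CREATE EXTENSION IF NOT EXISTS vector;
CREATE EXTENSION IF NOT EXISTS vectorscale;

CREATE TABLE embeddings (
    id BIGINT PRIMARY KEY,
    embedding VECTOR(768)  -- 768-dimensional Cohere embeddings
);

-- Build-time parameters: num_neighbors, search_list_size, max_alpha
CREATE INDEX embeddings_diskann_idx ON embeddings
USING diskann (embedding vector_cosine_ops)
WITH (num_neighbors = 50, search_list_size = 100, max_alpha = 1.2);

-- Query-time parameters for the 99 % recall configuration
SET diskann.query_rescore = 400;
SET diskann.query_search_list_size = 75;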
At a 99 % recall threshold, both Postgres and Qdrant achieve sub-100 ms percentile latencies for p50, p95, and p99. Qdrant achieves 1 % better p50 query latency (30.75 ms vs. 31.07 ms), 39 % lower p95 latency (36.73 ms vs. 60.42 ms), and 48 % better p99 query latency (38.71 ms vs. 74.60 ms) compared to Postgres with pgvector and pgvectorscale.
The benchmark results show that both vector search solutions deliver strong performance (sub-100 ms), and both systems achieve reasonable latency metrics for many production use cases. One important takeaway is that Qdrant demonstrates smaller variance between percentiles, which makes it a better choice for applications where tail latency is critical.
At a 90 % recall threshold, the results are again close, with both Qdrant and Postgres with pgvector and pgvectorscale achieving sub-20 ms query latencies across all percentiles.
At a 90 % recall threshold, Qdrant achieves 50.3 % lower p50 query latency (4.74 ms vs. 9.54 ms), 58.6 % lower p95 latency (5.50 ms vs. 13.30 ms), and 63.2 % lower p99 query latency (5.79 ms vs. 15.73 ms).
Postgres with pgvector and pgvectorscale achieves 11.4x higher throughput than Qdrant at 99 % recall when searching over 50M embeddings, with Postgres handling 471.57 queries per second compared to Qdrant's 41.47 queries per second.
Postgres with pgvector and pgvectorscale shows a substantial advantage in processing capacity, handling 471.57 queries per second compared to Qdrant's 41.47 QPS. This 11.4x performance gap suggests Postgres may be better suited for high-throughput applications where maintaining high recall is critical. The difference could have significant implications for production environments where query volume is a primary concern, especially when scaling to larger datasets while maintaining high accuracy and low latency requirements.
At 90% recall, Postgres with pgvector and pgvectorscale achieves 4.4x higher throughput than Qdrant when searching over 50M embeddings, with Postgres able to handle 1,589 queries per second compared to Qdrant's 360.
Concurrent read queries with Qdrant appear to suffer from contention that dramatically impacts read throughput compared to Postgres + pgvector(scale). This is likely simply due to Qdrant’s relative immaturity: Postgres has had many years to iron out sources of contention in heavily concurrent read workloads, and pgvector(scale) does not introduce any new ones.
Pgvectorscale took around 11.1 hours to build an index for 50M vectors. Qdrant took only around 3.3 hours to build the same index. In this case, the tables are turned, and pgvectorscale’s implementation is the one showing relative immaturity; index-building in pgvectorscale is currently a serial, single-threaded implementation. Parallelizing the implementation (and performing other optimizations) should eventually close this gap and is something the Timescale engineering team is working on presently.
Pgvector and pgvectorscale can be installed as extensions into an existing Postgres database, leveraging standard infrastructure often already in place. This approach benefits teams already invested in the Postgres ecosystem, as it integrates seamlessly without requiring additional services or infrastructure changes.
In contrast, Qdrant requires a standalone deployment. The good news is that the deployment is fairly simple, allowing developers to get started easily via Docker. This container-friendly approach makes Qdrant well-suited for containerized environments and cloud deployments where teams want a dedicated vector database solution while managing multiple databases.
The query interfaces of these systems reflect their divergent design philosophies. Pgvector(scale) leverages standard SQL syntax that will be immediately familiar to most developers, particularly those with database experience. This SQL foundation enables complex queries that combine vector similarity with traditional SQL operators, allowing for sophisticated data operations.
For example, a typical Postgres pgvectorscale query might look like:
SELECT product_name, description,
embedding <=> $1 AS distance
FROM products
WHERE category = 'electronics' AND in_stock = true
ORDER BY distance
LIMIT 5;
This query finds the five most similar electronics products currently in stock, showcasing how vector similarity seamlessly integrates with traditional SQL filtering.
The ability to work with the vast ecosystem of Postgres clients, object-relational mappers (ORMs), and tools in virtually any programming language represents a significant advantage for teams already using SQL-based workflows. A filtering condition is simply a WHERE clause. The full gamut of SQL features, such as joins with other tables, can be freely used in combination with vector similarity search, yielding great expressive power.
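For instance, a hypothetical query (table and column names are illustrative) that joins vector similarity search with another table:
SELECT p.product_name, o.last_ordered_at,
       p.embedding <=> $1 AS distance
FROM products p
JOIN order_summaries o ON o.product_id = p.id
WHERE o.last_ordered_at > now() - interval '30 days'
ORDER BY distance
LIMIT 5;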
Qdrant approaches the developer experience differently, offering various language clients and functionality narrowly scoped for vector search operations, in contrast to Postgres’s more full-spectrum database operations. A comparable query with Qdrant’s Python client might look like:
client.search(
collection_name="products",
query_vector=query_embedding,
query_filter=models.Filter(
must=[
models.FieldCondition(
key="category",
match=models.MatchValue(value="electronics")
),
models.FieldCondition(
key="in_stock",
match=models.MatchValue(value=True)
)
]
),
limit=5
)
Many developers appreciate Qdrant's streamlined table creation and recreation capabilities with single function calls, as well as its REST API and gRPC interfaces that offer integration flexibility with the database. Filtering conditions, such as the ones in the example above, are expressed with a JSON-based domain-specific language (DSL). While relatively expressive, the DSL has basic limitations: for example, joins are not supported.
Postgres with pgvectorscale provides configuration flexibility through fine-grained control over index parameters for the StreamingDiskANN, HNSW, and IVFFlat index types. Developers can tune numerous settings, such as num_neighbors, search_list_size, and query_rescore, to optimize the accuracy-performance trade-off for their specific use cases.
Beyond vector search, Postgres supports multiple index types, including HNSW and StreamingDiskANN for vector search and B-tree, GiST, and GIN for associated metadata. It also supports partial indexes for specialized queries combining vector and metadata conditions.
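As an illustration (the table, columns, and index choices are hypothetical), metadata indexes and a partial vector index can sit side by side on the same table:
-- B-tree index on a metadata column used for filtering
CREATE INDEX ON products (category);

-- GIN index on a jsonb attributes column
CREATE INDEX ON products USING gin (attributes);

-- Partial vector index (pgvector HNSW here) covering only in-stock products
CREATE INDEX ON products USING hnsw (embedding vector_cosine_ops)
WHERE in_stock;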
Qdrant focuses on providing vector-specific configuration options optimized for its core purpose. While offering fewer configuration parameters than Postgres, these options are carefully tailored for vector workloads. Qdrant's payload indexing capabilities are designed to enhance filtering performance in vector-centric workflows without requiring developers to understand general database indexing strategies.
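For example, a payload index on the category field from the earlier filter could be created with the Python client like this (a sketch, assuming the products collection shown earlier):
client.create_payload_index(
    collection_name="products",
    field_name="category",
    field_schema=models.PayloadSchemaType.KEYWORD,
)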
Qdrant starts building the vector index as soon as you begin adding vectors to a collection. Upload requests insert vectors immediately, but the index isn't complete right away.
If you make a request while the index is being built (what Qdrant calls the yellow state), it doesn’t use the HNSW for the unoptimized portion of the collection but instead does a scan over all vectors to find the closest ones to the query.
We ran into an issue where one of our testing indices was stuck in the grey state, where the HNSW index isn’t built until another update occurs (even though we already inserted all our vectors). We resolved this by using the Qdrant web UI to manually trigger an index rebuild.
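If you run into similar behavior, the collection's indexing status can also be checked programmatically; a small sketch with the Python client (collection name hypothetical):
info = client.get_collection(collection_name="cohere_50m")
print(info.status)  # green = index built, yellow = optimization in progress
print(info.indexed_vectors_count, info.points_count)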
These systems’ operational characteristics reflect their origins and intended use cases. Postgres with pgvectorscale inherits Postgres' enterprise-grade operational features, including rich support for consistent backups, streaming backups, and both incremental and full backups. The availability of point-in-time recovery provides robust protection against operator errors, while mature replication and failover solutions ensure high availability for mission-critical applications.
Qdrant offers basic backup and snapshot mechanisms and support for distributed clusters with replication, focusing on the core operational needs of vector database workloads. While these capabilities cover essential requirements for data protection and availability, they lack some of the advanced recovery options available in Postgres's mature ecosystem.
Postgres provides an extensive observability ecosystem that includes hundreds of metrics through postgres_exporter for Prometheus, query execution planning with the EXPLAIN command, and detailed query statistics tracking via pg_stat_statements.
Additional tools like pg_buffercache for database memory inspection and automatic logging of slow queries give operators exceptional visibility into database performance and behavior, making troubleshooting significantly easier when problems arise.
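For example (hypothetical table and column names), the plan for a vector search and the slowest statements can be inspected with standard tooling:
-- Inspect how a vector search is executed
EXPLAIN (ANALYZE, BUFFERS)
SELECT id
FROM products
ORDER BY embedding <=> (SELECT embedding FROM products WHERE id = 1)
LIMIT 10;

-- Slowest statements by mean execution time (requires pg_stat_statements)
SELECT query, calls, mean_exec_time
FROM pg_stat_statements
ORDER BY mean_exec_time DESC
LIMIT 10;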
Qdrant implements basic monitoring capabilities with standard metrics, providing the essential information needed to operate a vector database in production. While less extensive than Postgres's observability toolset, these monitoring features are focused on the metrics most relevant to vector search performance, offering a streamlined approach to monitoring for teams primarily concerned with vector operations.
Postgres excels in managing complex data relationships with mature support for schema evolution through ALTER TABLE commands and ACID-compliant transactions for reliable data operations. The ability to define constraints, triggers, and foreign keys helps maintain data quality across complex relationships between vectors and traditional data, making Postgres with pgvectorscale ideal for applications where vectors represent just one aspect of a richer data model.
Qdrant takes a more specialized approach with a collection-based organization of vector data and associated payloads, optimized for vector search workloads rather than strict transactional consistency. This purpose-built design simplifies schema requirements for vector-centric applications, prioritizing search performance over complex relational capabilities. This streamlined approach can reduce unnecessary complexity for teams focused primarily on vector search without complex data relationships.
The community and ecosystem surrounding these technologies present perhaps their starkest contrast. Pgvectorscale benefits from Postgres' massive, 30-year-old ecosystem with its vast array of management tools, monitoring solutions, and client libraries. The extensive documentation, tutorials, and community resources, combined with Postgres' well-established position in enterprise environments, provide an unmatched foundation of knowledge and support for production deployments.
Qdrant represents a newer approach with a growing community specifically focused on vector search. Designed with modern vector search use cases in mind, Qdrant's ecosystem is more specialized but evolving rapidly to address the unique challenges of vector-centric applications. This vector-first approach means the community is highly focused on innovations specific to embedding search without the legacy considerations of general-purpose databases.
Our benchmarks demonstrate that Postgres with pgvector and pgvectorscale can indeed support high-accuracy vector search on large datasets. Compared to Qdrant, it delivers an order of magnitude higher throughput while maintaining sub-100 ms percentile latencies. However, Qdrant does offer lower tail latencies and faster index builds. Overall, we think these results challenge the assumption that specialized vector databases inherently outperform general-purpose databases for vector workloads, and they show that Postgres can perform well for large-scale vector search use cases.
Choose Postgres with pgvector/pgvectorscale for:
Consider Qdrant for:
Get started today: Pgvector and pgvectorscale are both open source under the Postgres License and are available for you to use in your AI projects today. You can also access pgvector and pgvectorscale on any database service on the Timescale Cloud Postgres platform. For self-hosted deployments, you can find installation instructions on the pgvector and pgvectorscale GitHub repositories, respectively.
Get involved with the pgvectorscale community: