
Mar 26, 2026
Table of contents
01 What You Will Learn
02 Row Retrieval Lifecycle
03 Try This Now: Measure Your Row Weight
04 Page Size Limits and Wide Rows
05 Serialization Tax
06 Benchmarking the Cost
07 Conclusion

Switching from SELECT * to explicit field lists can reduce query I/O by over 90% in high-volume environments. In tables with over 500 million rows, wildcard queries hurt performance so much that no amount of index tuning can offset it. You may have already optimized your autovacuum settings and partitioned your data to handle high-frequency ingestion, yet still see latencies climb. The bottleneck in these datasets is rarely how the data enters the system; it is the cost of how you request it.
In early-stage development, wildcard selectors feel like a harmless shortcut as you improve your schemas. However, on tables with billions of rows, this shortcut forces the database to move gigabytes of cold metadata that your application never uses. By leading with the specific fields you need, you stop paying the hidden tax of resource over-provisioning and reclaim the full performance of your database.
This guide breaks down the mechanical impact of wildcard queries on high-volume systems. You will learn how Postgres moves rows from disk to your application, why wide rows amplify I/O, what serialization adds on top, and how to benchmark the cost yourself.
To understand why wildcards hurt your database system, you have to look at how Postgres moves data from the physical disk to your application. Postgres stores data in 8 KB pages. When you run a query, the engine does not just grab the specific fields you want. It loads the entire page containing that row into the buffer cache.
With a wildcard query, you force the database to process and move more data than necessary from the disk to the memory. Your system has to process all the data with checks like Multi-Version Concurrency Control, wasting valuable CPU cycles. The query also pollutes your cache by filling it with cold data that your database never actually uses. As the buffer pool fills with this extra data, it replaces the hot data your queries actually need. This creates a cycle of increased I/O and higher latency that only slows your system down.
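If you want to see this cache pollution for yourself, the pg_buffercache extension (shipped with Postgres contrib and enabled via CREATE EXTENSION pg_buffercache) exposes the contents of the buffer pool. The following sketch, adapted from the pattern shown in the Postgres documentation, lists which relations occupy the most shared buffers:

```sql
-- Requires: CREATE EXTENSION pg_buffercache;
-- Shows which relations currently hold the most 8 KB buffers.
SELECT c.relname,
       count(*) AS buffers
FROM pg_buffercache b
JOIN pg_class c
  ON b.relfilenode = pg_relation_filenode(c.oid)
 AND b.reldatabase IN (0, (SELECT oid FROM pg_database
                           WHERE datname = current_database()))
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;
```

Run it before and after a heavy SELECT * query; if your big table suddenly dominates the list, the wildcard is evicting hotter pages.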
You can see the difference in data volume by measuring the byte size of a single row versus a single column. Run these two queries on one of your larger tables:
Measure the Full Row Weight (wildcard)
SELECT pg_column_size(t.*) AS row_bytes
FROM your_table_name t
LIMIT 1;
Measure the Needed Data Weight (explicit field list)
SELECT pg_column_size(your_column_name) AS column_bytes
FROM your_table_name
LIMIT 1;
If your row_bytes is 1,200 and your column_bytes is 8, you are moving 150 times more data than necessary per row. When your query retrieves 10,000 rows, that 1,192-byte difference becomes 11.9 MB of cold data clogging your buffer cache.
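You can also compute the ratio in a single query instead of eyeballing two results. In this sketch, your_table_name and your_column_name are placeholders for your own schema:

```sql
-- Compare full-row weight to the weight of the one column you need.
SELECT pg_column_size(t.*)                AS row_bytes,
       pg_column_size(t.your_column_name) AS column_bytes,
       round(pg_column_size(t.*)::numeric
             / pg_column_size(t.your_column_name), 1) AS amplification
FROM your_table_name t
LIMIT 1;
```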
Row-based storage ties I/O efficiency to the width of your rows. Because Postgres interleaves values from different columns on the same 8 KB page, wide rows mean fewer rows fit on a single page.
When you use a wildcard query on a wide table, you maximize the amount of data read per row. If your rows are 1 KB wide, you can only fit about eight rows on a page. If your query filters by a time range and fetches thousands of rows, the database must load hundreds of pages into memory.
This creates a read amplification effect. You are reading 10-15 times as much data as a columnar layout would require. While a specialized engine like Tiger Data can read only the specific columns requested from compressed arrays, vanilla Postgres is forced to move the entire heap tuple. This extra data consumes space in the shared buffers, pushing out other useful pages and forcing the next query to go back to the slow disk.
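You can estimate how many rows fit on each of your pages directly from the catalog. This sketch assumes the table has been ANALYZEd recently, since reltuples and relpages are estimates maintained by ANALYZE and VACUUM:

```sql
-- Rough rows-per-page estimate from catalog statistics.
SELECT relname,
       reltuples::bigint AS estimated_rows,
       relpages          AS pages_8kb,
       round((reltuples / NULLIF(relpages, 0))::numeric, 1) AS rows_per_page
FROM pg_class
WHERE relname = 'your_table_name';
```

A low rows_per_page value confirms you are paying for a full 8 KB read to retrieve only a sliver of useful data per row.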
The final stage of the query is serialization. This is the process of converting data from the internal database format to a format that can be transmitted over the network.
Every column in a SELECT * result must be serialized, wasting CPU cycles on the database server and consuming extra network bandwidth. Large telemetry datasets typically include metadata strings or JSON blobs that are expensive to serialize. If your application logic only needs a single floating-point value, serializing a 500-character string for every row is pure waste.
Consider a workload processing 1 million rows. If each row contains an unused 1 KB JSON blob, you are forcing the database to serialize and transmit 1 GB of useless data. On a system with 50,000 inserts per second, this cumulative effect leads to a bottleneck where the database spends more time talking than thinking.
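You can put a number on this waste for your own schema. In this sketch, your_table_name and unused_json_column are placeholders for a table and a column your queries select but never read:

```sql
-- Total serialization cost of one unused column across the table.
SELECT count(*) AS total_rows,
       avg(pg_column_size(unused_json_column))::int AS avg_bytes_per_row,
       round(sum(pg_column_size(unused_json_column))::numeric
             / 1000000000, 2) AS wasted_gb
FROM your_table_name;
```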
You can see the performance costs of wildcard queries in action by using EXPLAIN (ANALYZE, BUFFERS). This command shows exactly how many data blocks the database had to touch to satisfy your request.
Suppose you have a table named benchmark_metrics with 25 metadata columns and one million rows, and you want to find all the readings for device_id = 42.
The Wildcard Approach
This query fetches all columns, including all the heavy metadata, even though we only want to see the performance of a specific device.
EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM benchmark_metrics
WHERE device_id = 42;

The Explicit Approach
This query fetches only the specific reading value we need.
EXPLAIN (ANALYZE, BUFFERS)
SELECT reading_value FROM benchmark_metrics
WHERE device_id = 42;

Wildcard queries create a performance ceiling by forcing unnecessary page loads and data serialization. Replacing them with explicit field lists preserves buffer pool space and eliminates the CPU tax of moving unused metadata.
To make the transition from wildcard queries to explicit field lists, start by using the pg_stat_statements extension to find the exact queries causing the biggest performance hit. The following query filters your statement history for anything containing SELECT * and ranks the results by total execution time, with the average per-call time alongside so you can see which wildcard queries are dragging down your latencies the most.
SELECT
query,
calls,
total_exec_time / 1000 AS total_seconds,
mean_exec_time AS avg_ms,
rows / calls AS avg_rows_per_call
FROM pg_stat_statements
WHERE query ILIKE '%SELECT *%'
AND query NOT ILIKE '%pg_stat_statements%'
ORDER BY total_exec_time DESC
LIMIT 10;
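Once you have refactored the worst offenders, reset the counters and let the workload run again so you can measure the improvement against a clean slate; pg_stat_statements_reset() ships with the same extension:

```sql
-- Clear accumulated statistics so post-refactor numbers start fresh.
SELECT pg_stat_statements_reset();
```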
By identifying these high-impact queries first, you can refactor where it delivers the most immediate relief. Audit your application code and dashboard configurations to replace wildcard selectors with explicit field lists. This change reduces pressure on your buffer pool and is the first step off the optimization treadmill toward reclaiming your database's full performance.
