The Hidden Performance Cost of Wildcard Queries

Scaling PostgreSQL

By NanoHertz Solutions - Jake Hertz

5 min read

Mar 26, 2026

Table of contents

01 What You Will Learn
02 Row Retrieval Lifecycle
03 Try This Now: Measure Your Row Weight
04 Page Size Limits and Wide Rows
05 Serialization Tax
06 Benchmarking the Cost
07 Conclusion


Switching from SELECT * to explicit field lists can reduce query I/O by over 90% in high-volume environments. In tables with over 500 million rows, wildcard queries hurt performance so badly that no amount of index tuning can offset it. You may have already tuned autovacuum and partitioned your data to handle high-frequency ingestion, yet still watch latencies climb. In these systems, the bottleneck is rarely how data enters the database; it is how your queries ask for it on the way out.

In early-stage development, wildcard selectors feel like a harmless shortcut while your schema is still evolving. On tables with billions of rows, however, that shortcut forces the database to move gigabytes of cold metadata your application never uses. By selecting only the fields you need, you stop paying a hidden tax that otherwise forces you to over-provision hardware, and you reclaim the full performance of your database.

What You Will Learn

This guide breaks down the mechanical impact of wildcard queries on high-volume systems. You will learn:

  • How page-level I/O and the buffer cache amplify the cost of row retrieval.
  • The way wide rows trigger cache eviction and pollute your memory.
  • Techniques to reduce serialization taxes that slow down your application.
  • How to audit your data-to-result ratio to identify resource waste.

Row Retrieval Lifecycle

To understand why wildcards hurt your database system, you have to look at how Postgres moves data from the physical disk to your application. Postgres stores data in 8 KB pages. When you run a query, the engine does not just grab the specific fields you want. It loads the entire page containing that row into the buffer cache.

With a wildcard query, you force the database to move more data than necessary from disk into memory. Every tuple it touches must also pass visibility checks under Multi-Version Concurrency Control (MVCC), wasting CPU cycles on columns your application will discard. The query pollutes your cache, too: as the buffer pool fills with cold data, it evicts the hot pages your other queries actually need, creating a cycle of increased I/O and higher latency.
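
You can watch this cache pressure directly. Below is a minimal sketch using the pg_buffercache contrib extension (assuming it is installed and you have sufficient privileges) that counts how many 8 KB buffers each relation currently occupies; run it before and after a large SELECT * to see other tables get displaced:

-- Requires the pg_buffercache contrib extension.
CREATE EXTENSION IF NOT EXISTS pg_buffercache;

-- Count the 8 KB buffers each relation holds in the shared buffer pool.
SELECT c.relname,
       count(*) AS buffers,
       pg_size_pretty(count(*) * 8192) AS cached_size
FROM pg_buffercache b
JOIN pg_class c ON b.relfilenode = pg_relation_filenode(c.oid)
WHERE b.reldatabase = (SELECT oid FROM pg_database
                       WHERE datname = current_database())
GROUP BY c.relname
ORDER BY buffers DESC
LIMIT 10;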

Try This Now: Measure Your Row Weight

You can see the difference in data volume by measuring the byte size of a single row versus a single column. Run these two queries on one of your larger tables:

Measure the Full Row Weight (wildcard)

SELECT pg_column_size(t.*) AS row_bytes 
FROM your_table_name t 
LIMIT 1;

Measure the Needed Data Weight (explicit field list)

SELECT pg_column_size(your_column_name) AS column_bytes 
FROM your_table_name 
LIMIT 1;

If your row_bytes is 1,200 and your column_bytes is 8, you are moving 150 times more data than necessary per row. When your query retrieves 10,000 rows, that 1,192-byte difference becomes 11.9 MB of cold data clogging your buffer cache.
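
To see where that weight comes from column by column, you can also read average widths out of the planner's statistics. A quick sketch, assuming the table has been ANALYZEd recently (your_table_name is the same placeholder as above):

-- avg_width is the average stored size of each column in bytes.
SELECT attname, avg_width
FROM pg_stats
WHERE tablename = 'your_table_name'
ORDER BY avg_width DESC;

The widest columns at the top of this list are usually the ones a wildcard query drags along for free.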

Page Size Limits and Wide Rows

Row-based storage ties I/O efficiency to the width of your rows. Because Postgres stores all of a row's columns together on the same 8 KB page, wide rows mean fewer rows fit on a single page.

When you use a wildcard query on a wide table, you maximize the amount of data read per row. If your rows are 1 KB wide, you can only fit about eight rows on a page. If your query filters by a time range and fetches thousands of rows, the database must load hundreds of pages into memory.

This creates a read amplification effect: you are reading 10-15 times as much data as a columnar layout would require. While a columnar engine such as TimescaleDB can read only the specific columns requested from compressed arrays, vanilla Postgres is forced to move the entire heap tuple. The extra data consumes space in the shared buffers, pushing out other useful pages and forcing the next query to go back to the slow disk.
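
You can check how tightly your own tables pack rows onto pages straight from the catalog. A minimal sketch, assuming benchmark_metrics (the example table used later in this post) exists and has fresh statistics from ANALYZE:

-- relpages and reltuples are planner estimates refreshed by ANALYZE.
SELECT relname,
       reltuples::bigint AS est_rows,
       relpages AS pages,
       round((reltuples / NULLIF(relpages, 0))::numeric, 1) AS rows_per_page
FROM pg_class
WHERE relname = 'benchmark_metrics';

A low rows_per_page number on a hot table is a strong hint that wildcard queries against it are paying heavy read amplification.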

Serialization Tax

The final stage of the query is serialization. This is the process of converting data from the internal database format to a format that can be transmitted over the network.

Every column in a SELECT * result must be serialized, wasting CPU cycles on the database server and consuming extra network bandwidth. Large telemetry datasets typically include metadata strings or JSON blobs that are expensive to serialize. If your application logic only needs a single floating-point value, serializing a 500-character string for every row is pure waste.

Consider a workload processing 1 million rows. If each row contains an unused 1 KB JSON blob, you are forcing the database to serialize and transmit 1 GB of useless data. On a system with 50,000 inserts per second, this cumulative effect leads to a bottleneck where the database spends more time talking than thinking.
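
You can estimate this payload gap for a real query before refactoring it. A rough sketch, reusing the benchmark_metrics table and columns from the next section; pg_column_size reports stored size, which is a reasonable proxy for what must be read and serialized:

-- Compare the bytes a wildcard result carries against the bytes
-- the application actually needs, over the same set of rows.
SELECT pg_size_pretty(sum(pg_column_size(t.*))) AS full_row_payload,
       pg_size_pretty(sum(pg_column_size(t.reading_value))) AS needed_payload
FROM benchmark_metrics t
WHERE t.device_id = 42;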

Benchmarking the Cost

You can see the performance costs of wildcard queries in action by using EXPLAIN (ANALYZE, BUFFERS). This command shows exactly how many data blocks the database had to touch to satisfy your request.

Suppose you have a table named benchmark_metrics with 25 metadata columns and one million rows, and you want to find all the readings for device_id = 42.

The Wildcard Approach

This query fetches all columns, including all the heavy metadata, even though we only want to see the performance of a specific device.

EXPLAIN (ANALYZE, BUFFERS)
SELECT * FROM benchmark_metrics 
WHERE device_id = 42;

The Explicit Approach

This query fetches only the specific reading value we need:

EXPLAIN (ANALYZE, BUFFERS)
SELECT reading_value FROM benchmark_metrics 
WHERE device_id = 42;
In the two plans, compare the Buffers: shared hit/read counts and the reported execution times. With a plain index on device_id, both queries touch the same heap pages, but the explicit version skips the per-row cost of copying and serializing 25 unused columns. With a covering index on (device_id, reading_value), Postgres can go further and answer the query with an index-only scan that never touches the heap at all.
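
A minimal sketch of that covering index, assuming the example table above (the INCLUDE clause requires Postgres 11 or later):

-- Store reading_value in the index leaf pages so the query can be
-- answered by an index-only scan that skips the heap entirely.
CREATE INDEX idx_metrics_device_reading
    ON benchmark_metrics (device_id) INCLUDE (reading_value);

Index-only scans also depend on the visibility map being current, so a recently vacuumed table benefits the most.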

Conclusion

Wildcard queries create a performance ceiling by forcing unnecessary page loads and data serialization. Replacing them with explicit field lists preserves buffer pool space and eliminates the CPU tax of moving unused metadata.

To make the transition from wildcard queries to explicit field lists, start by using the pg_stat_statements extension to find the exact queries causing the biggest performance hit. The following query filters your statement history for anything containing SELECT * and ranks the results by total execution time. It also reports the average execution time and average rows per call, so you can see which wildcard queries are dragging down your latencies the most.

SELECT 
    query, 
    calls, 
    total_exec_time / 1000 AS total_seconds, 
    mean_exec_time AS avg_ms, 
    rows / calls AS avg_rows_per_call 
FROM pg_stat_statements 
WHERE query ILIKE '%SELECT *%' 
  AND query NOT ILIKE '%pg_stat_statements%' 
ORDER BY total_exec_time DESC 
LIMIT 10;
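
One caveat: pg_stat_statements only records statistics once the library is preloaded and the extension is created, so if the query above returns nothing, check your configuration first:

-- In postgresql.conf (requires a server restart):
--   shared_preload_libraries = 'pg_stat_statements'
CREATE EXTENSION IF NOT EXISTS pg_stat_statements;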

By identifying these high-impact queries, you can quickly refactor your application to deliver the most immediate relief. Audit your application code and dashboard configurations today to replace these wildcard selectors with explicit field lists. This immediate change will reduce pressure on your buffer pool and be the first step in moving off the optimization treadmill to reclaim your database performance.
