
Published at Mar 6, 2024


Designing Your Database Schema: Wide vs. Narrow Postgres Tables

An image of data as a hologram—Designing your database schema, narrow, medium or wide table layout

Written by Chris Engelbert and Carlota Soto

When setting up your PostgreSQL database, even before you start ingesting data, you’re confronted with a few important schema design decisions. Settling on a database schema is akin to an artist choosing their canvas before starting to paint: the shape of your canvas, or in this case, your table layout, will significantly influence how your database shapes up, affecting your performance, usability, and maintainability.

There are two major table design options for your PostgreSQL database: narrow and wide. In this blog post, we look at both and explain why and how you would use them, while also introducing a third option (the medium layout) as a middle ground between the traditional wide and narrow designs.

Database Schema: What Are Narrow and Wide Tables in PostgreSQL? 

Let’s start by describing the differences between narrow and wide tables in a database schema and then discuss the pros and cons of each option. 

Narrow PostgreSQL tables 

In a schema design, narrow tables are characterized by having few data value columns, meaning that, for each row in the table, there aren't many different types of information being stored. For instance, a narrow table might simply store a timestamp and a singular measurement, like temperature. 

However, it's essential to understand that having "few data value columns" does not equate to "few columns in total." A narrow table can still have multiple columns that serve other purposes. For example, columns could be present to categorize or label each row, associating it with the device the data came from, identifying the metric being measured, or assigning it to a particular time series. So, while the core data in a narrow table might be simple, the table can still be structured to offer context or classification for each data point.

Here is an example of a narrow table:

CREATE TABLE narrow_table (
    created    TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    point_id   UUID NOT NULL DEFAULT gen_random_uuid(),
    device_id  UUID NOT NULL,
    metric     TEXT NOT NULL,
    value      DOUBLE PRECISION NOT NULL
);

-- Example insert statement for the given data:
INSERT INTO narrow_table (created, point_id, device_id, metric, value)
VALUES ('2022-01-01 00:00:00.0+00',
        '00000000-0000-0000-0000-000000000123',
        '00000000-0000-0000-0000-000000000010',
        'temp',
        24);

As you can see, each row has a single column that holds the value, in this case a double precision number. The example also uses one column for the metric name (“temp” for temperature), but it could be any metric whose readings can be stored as a double precision value.

If you want to store values of other data types, you could create a separate table for text strings or whatever other type you need. Alternatively, you often see JSONB as the data type for the value column; that way, you can store any valid JSON value in the column.

CREATE TABLE narrow_table (
    created    TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    point_id   UUID NOT NULL DEFAULT gen_random_uuid(),
    device_id  UUID NOT NULL,
    metric     TEXT NOT NULL,
    value      JSONB NOT NULL
);
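
For illustration, here's a rough sketch of how values of different shapes could be inserted into that JSONB-based table (the device ID and readings below are made up):

-- Hypothetical inserts: a numeric reading and a text status stored in the same column
INSERT INTO narrow_table (device_id, metric, value)
VALUES ('00000000-0000-0000-0000-000000000010', 'temp', '24.7');

INSERT INTO narrow_table (device_id, metric, value)
VALUES ('00000000-0000-0000-0000-000000000010', 'status', '"charging"');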

Wide PostgreSQL tables

The wide table layout is the exact opposite of the narrow layout, meaning that you’ll end up with a lot of columns. Most often, it’s one column per available metric. Depending on the number of metrics you have, wide tables can get—as the name indicates—very wide, very quickly. It is common to see 200 or more columns in a single table. 

A simple example of a wide table could be something like this:

CREATE TABLE measurement_table (
    created        TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    point_id       UUID NOT NULL DEFAULT gen_random_uuid(),
    device_id      UUID NOT NULL,
    temp           DOUBLE PRECISION,
    hum            DOUBLE PRECISION,
    co2            INTEGER,
    wind_speed     INTEGER,
    wind_direction INTEGER
);

-- Example insert statement for the given data:
INSERT INTO measurement_table (created, point_id, device_id, temp, hum, co2, wind_speed, wind_direction)
VALUES ('2022-01-01 00:00:00.0+00',
        '00000000-0000-0000-0000-000000000123',
        '00000000-0000-0000-0000-000000000010',
        24.7, 57.1, 271, NULL, NULL);

Based on your use case, the value columns may be nullable or not. That mostly depends on how data is delivered: for example, whether all values arrive together and can be stored in one row, or whether they become available at different times, so only some of the columns are filled on each insert.

In the above example, the three values for temperature, humidity, and CO2 are provided together, while wind speed and wind direction would be in a separate data packet, hence the NULL values.

Narrow vs. Wide Table Layout: Advantages and Disadvantages 

Advantages of a narrow table layout

When designing your database schema, the narrow table layout is great if you have a low cardinality of metrics and you know that upfront, along with all their data types. If you don’t know this, you can remove the metric name as a column and create one table for each metric type, somehow mangling the metric into the table name like this:

Table: temperature

| point_id | device_id | value |
|      123 |        10 |  24.7 |

Table: humidity

| point_id | device_id | value |
|      123 |        10 |  57.1 |

This example includes independent tables for the temperature and humidity metrics. That way, you can think about the data type of every metric independently, and the table design becomes much easier.

Extending the layout with additional metrics is easy: create an additional hypertable matching the metric’s data type, and you’re done. 
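
As a sketch of what that could look like with TimescaleDB (assuming the extension is installed; the table and column names below are hypothetical), you create the new per-metric table and turn it into a hypertable:

-- Hypothetical per-metric table for a newly added metric
CREATE TABLE pressure (
    created    TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    point_id   UUID NOT NULL DEFAULT gen_random_uuid(),
    device_id  UUID NOT NULL,
    value      DOUBLE PRECISION NOT NULL
);

-- Partition it on the time column (TimescaleDB)
SELECT create_hypertable('pressure', 'created');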

Disadvantages of a narrow table layout 

The previous advantage, however, can also turn into a disadvantage. If you have a high number of different metrics, the number of individual tables could become a hassle to manage. While PostgreSQL doesn’t have an issue with many tables, humans often do.

Additional complexity may come in if you want to split those tables further down, for example, having a temperature table per customer—also known as multi-tenancy. You could handle this by mangling the name (<<customer_id>>_temperature) or using a lookup table that maps the customer and metric to a generated table name. It could get tedious, though.
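
One possible sketch of such a lookup table (all names here are hypothetical) maps a customer and metric to the physical table that stores it:

-- Hypothetical mapping of (customer, metric) to the mangled table name
CREATE TABLE metric_tables (
    customer_id UUID NOT NULL,
    metric      TEXT NOT NULL,
    table_name  TEXT NOT NULL,
    PRIMARY KEY (customer_id, metric)
);

-- The application looks up which table to query, e.g., 'customer_42_temperature'
SELECT table_name
FROM metric_tables
WHERE customer_id = '00000000-0000-0000-0000-000000000042'
  AND metric = 'temperature';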

Another way to mitigate complexity is using JSONB as the value column’s data type. This allows you to unify all metrics per customer into a single table, adding back the metric column. Remember, you still need to understand what data type the value will be when queried since JSONB might force you to cast the resulting value into the data type your application expects (for example, JSON doesn’t use real integers, only numbers according to IEEE 754).
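
For example, if a temperature reading is stored as a plain JSON number in the value column, a query could extract it and cast it back to the type the application expects (a sketch based on the JSONB narrow_table above):

-- Extract the scalar JSON value as text, then cast it to the expected type
SELECT created,
       device_id,
       (value #>> '{}')::DOUBLE PRECISION AS temperature
FROM narrow_table
WHERE metric = 'temp';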

Last but not least, querying many different metrics at the same time can mean you need a lot of JOINs or UNIONs, especially when every metric has its own table. While that doesn’t necessarily slow down the query, writing or generating the queries can be tedious. Plus, every JOIN operation increases the input/output (I/O) load on the system when preparing the result. All of this becomes more likely as the cardinality of metrics keeps increasing.
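
To illustrate, combining readings from the per-metric temperature and humidity tables shown earlier could require either a UNION ALL (stacking the metrics) or a JOIN on time and device; a sketch, assuming both tables carry created and device_id columns:

-- Stack two metrics into a narrow result set
SELECT created, device_id, 'temperature' AS metric, value FROM temperature
UNION ALL
SELECT created, device_id, 'humidity' AS metric, value FROM humidity;

-- Or line them up side by side per timestamp and device
SELECT t.created, t.device_id, t.value AS temperature, h.value AS humidity
FROM temperature t
JOIN humidity h
  ON h.created = t.created
 AND h.device_id = t.device_id;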

Advantages of a wide table layout

The wide table is a great choice for your schema design if you know all potential metrics upfront. This is often the case with systems that have been in production for a long time and data that will be migrated over. If new columns need to be added only at a low rate, it may be a viable option, but adding a column may be a heavy operation.

Besides that, supporting multi-tenancy is simple: just add a new wide table per customer. (Still, remember that when adding a new column, you may have to add it to all of the customers’ tables too, which can be a tedious process.)

Querying data is simple: we don’t have to cast values since they’re already stored in their native data types, and there is no need to look up which column to query since there is one column per metric. Also, querying a subset of the columns is very efficient, especially when chunks are compressed into columnar representation.

Querying data from multiple customers is quick. Just JOIN and UNION the different intermediate queries together, and you’re done.
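
A sketch of such a cross-customer query, assuming hypothetical per-customer tables measurement_customer_a and measurement_customer_b that follow the wide layout above:

-- Combine the same wide columns from two per-customer tables
SELECT 'customer_a' AS customer, created, device_id, temp, hum
FROM measurement_customer_a
UNION ALL
SELECT 'customer_b' AS customer, created, device_id, temp, hum
FROM measurement_customer_b;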

Disadvantages of a wide table layout 

While the wide table layout is a common choice for beginners, it is probably the most complicated to manage and extend over time.

The biggest issue with the wide table is adding more metric columns. Keeping track of things in systems that are constantly growing with new metrics can be challenging.

Also, while dropping a column because you don’t need the metric anymore is an immediate operation (the column is just hidden from the query engine), it doesn’t immediately remove the content from disk.

A row’s old column data is only physically removed when the row is rewritten, for example by an UPDATE, which will most likely never happen with time-series data, so the dangling data stays around forever. Adding nullable columns is cheap, as is adding columns with DEFAULT values: for existing rows, the default is simply evaluated at query time.
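
As a quick sketch against the measurement_table above (the added column names are hypothetical):

-- Dropping a metric column is fast, but its data stays on disk until rows are rewritten
ALTER TABLE measurement_table DROP COLUMN co2;

-- Adding a nullable column, or one with a DEFAULT, is cheap;
-- existing rows get the default filled in at query time
ALTER TABLE measurement_table ADD COLUMN pressure DOUBLE PRECISION;
ALTER TABLE measurement_table ADD COLUMN firmware TEXT DEFAULT 'v1';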

Quick summary for your database schema

Advantages of a narrow table layout

  • Works well for low cardinality of known metrics and data types.

  • Allows individual data type consideration for each metric.

  • Simplifies table design with independent tables for each metric.

  • Easy to extend with new metrics by adding a new table. 

Disadvantages of a narrow table layout:

  • Managing a high number of tables for different metrics can be challenging.

  • Multi-tenancy (e.g., a table per customer) adds complexity.

  • Using JSONB requires casting values, impacting data type clarity.

  • Querying multiple metrics involves multiple JOINs or UNIONs, complicating query writing.

  • Implies increased I/O load due to JOIN operations, especially with growing metric cardinality.

Advantages of a wide table layout:

  • Works well when all potential metrics are known in advance.

  • Simplifies multi-tenancy with a new table per customer.

  • Native data types eliminate the need for casting during queries.

  • Efficient querying for subsets of columns, especially with columnar compression.

  • Faster queries across multiple customers using JOIN and UNION.

Disadvantages of a wide table layout:

  • Challenging to manage and extend over time, especially for beginners.

  • Adding new metric columns can be hard in dynamic systems.

  • Dropping a column doesn't immediately free up disk space, leaving unused data.

  • DEFAULT values on newly added columns are evaluated at query time for existing rows, which can affect query performance.

An Alternative: Medium Table Layout 

There’s a middle-ground alternative between the two narrow and wide models when designing your database schema: the medium table layout.

The basic design principle is to create one column for each necessary data type: 

CREATE TABLE medium_table (
    created     TIMESTAMP WITH TIME ZONE NOT NULL DEFAULT NOW(),
    point_id    UUID NOT NULL DEFAULT gen_random_uuid(),
    device_id   UUID NOT NULL,
    metric      TEXT NOT NULL,
    int_value   INTEGER,
    float_value DOUBLE PRECISION
);

-- Example insert statement for the given data:
INSERT INTO medium_table (point_id, device_id, metric, int_value, float_value)
VALUES ('00000000-0000-0000-0000-000000000123',
        '00000000-0000-0000-0000-000000000010',
        'temp', NULL, 2);

To elaborate on the difference between this layout and the previous ones:

  • Narrow tables typically have one or very few columns representing data values (e.g., metric and value).

  • Wide tables have multiple columns, with each column representing a distinct metric or data point. If we had separate columns for temperature, humidity, pressure, etc., in the previous table, that would be a wide table layout.

  • The medium table above sits in between the narrow and wide layouts. It has columns representing different data types or categories of metrics but retains some level of generic structure. In the provided table, there's a separation between int_value and float_value, but the actual metric being stored (e.g., temperature, humidity) is still identified by the metric column. This offers more flexibility than a narrow table but doesn't go all the way to a distinct column for each metric like a wide table does.

If you need more data types, you can just add more columns. This is a very convenient setup if you know all the expected data types, which is fairly easy compared to “knowing all possible metrics in advance.” However, you must know which metric is stored in which specific data type column. The medium table layout has one column per needed data type, with only the relevant one filled in each row, so a single column (the metric column) defines what kind of data is stored in that specific row. This may require an additional lookup table that not only defines the metrics but also which data type column needs to be queried for each. For a low, static number of metrics, you could handle this in the application layer.
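
A sketch of such a lookup table (with hypothetical names), mapping each metric to the value column that holds it:

-- Hypothetical metric catalog for the medium layout
CREATE TABLE metric_definitions (
    metric       TEXT PRIMARY KEY,
    value_column TEXT NOT NULL  -- 'int_value' or 'float_value'
);

INSERT INTO metric_definitions (metric, value_column)
VALUES ('temp', 'float_value'),
       ('co2',  'int_value');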

This layout provides most of the benefits of the narrow design and is easy to extend with additional metrics, as long as you don’t need to introduce a new data type. Even then, you only need to add a new column to the table layout; how heavy that operation is depends on the number of rows already in your hypertable.

The medium layout is a very practical design choice if you are sure about the different data types that can occur in the future but are not yet certain of all potential metrics. This is a common situation in IoT use cases, where new types of metrics are added frequently. As the number of metrics increases, you have to add them to your lookup mechanism (metric to data type column), but the table’s design doesn’t change, and you can keep adding new metrics without touching the layout.

Multi-tenancy is easily achieved by creating one hypertable per customer and storing all corresponding data in that particular table. If you need to query data for multiple customers, you can JOIN and UNION ALL the data together and get a single result set.

It is also straightforward to query multiple metrics simultaneously since all data resides in the same table. If you want to retrieve data of multiple different data types, though, you may end up using CASE WHEN clauses and filling result columns with NULLs where no value is available for a specific metric but is for other queried ones. This would need to be handled in the application layer.
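
For example, pulling temperature and CO2 readings out of the medium table into separate result columns could look like the following sketch, which groups by timestamp and device and leaves NULLs where a metric has no value:

-- Pivot two metrics from the medium layout into one row per timestamp and device
SELECT created,
       device_id,
       MAX(CASE WHEN metric = 'temp' THEN float_value END) AS temperature,
       MAX(CASE WHEN metric = 'co2'  THEN int_value   END) AS co2
FROM medium_table
GROUP BY created, device_id;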

Which Table Layout to Choose? 

There are quite a few bits to consider when making a decision. While it is possible to migrate from one layout to another, the process can be quite tedious, especially with a growing number of records being stored. The following table gives you a quick overview of the key pros and cons of every option: 

|                         | Narrow Table Layout  | Medium Table Layout | Wide Table Layout    |
| Ease of Use             | Easy                 | Easy                | Somewhat easy        |
| Cost of Up-Front Design | Low                  | Somewhat low        | Somewhat high        |
| Extensibility           | Easy                 | Somewhat easy       | Somewhat complicated |
| Multi-Tenancy           | Somewhat complicated | Easy                | Easy                 |

Let’s break them down. 

When to use a narrow table in my schema design? 

Consider using narrow tables in your database schema if the following applies:

  • You have a low cardinality of metrics. Narrow tables thrive when you know your metrics and their data types upfront. It's simpler to manage and provides a clear layout.

  • Your metrics are evolving. If you’re unsure about the range of metrics you might handle, you might enjoy the flexibility of creating separate tables for each metric type. 

  • You value extensibility. As your system grows and requires new metrics, you can just introduce a new table tailored to that metric's data type.

However, be cautious if the following applies:

  • You anticipate a surge in metrics. Managing many tables can be a hassle. 

  • You require complex multi-tenancy. Creating a unique table for each metric-customer combination can become tedious.

  • You'll query across multiple metrics frequently. Crafting queries that touch on many metrics may necessitate multiple JOINs or UNIONs, potentially complicating your SQL and putting more strain on your system.

When to use a wide table in my schema design? 

Consider using wide tables if the following applies:

  • Your metrics are established. If you have a clear understanding of all the metrics you’ll ever need, wide tables offer a structured and predictable layout.

  • You need simplified multi-tenancy. With one table per customer, handling data across multiple tenants becomes straightforward.

  • Your queries are column-specific. Since each metric has its own column, queries are direct and efficient.

Similarly as before, be cautious if the following applies: 

  • You anticipate new metrics often. Constantly adding columns can be challenging, especially in growing systems.

  • Your system will undergo frequent changes. Regularly adding columns to all customer tables can become a maintenance headache.

When to use a medium table in my schema design? 

Lastly, consider using medium tables if: 

  • You want a balanced approach. If you're looking for something between the granularity of narrow tables and the structure of wide ones, medium tables offer a good compromise.

  • You're certain about data types. If you have a clear idea about the data types but are unsure of the metrics, medium tables provide flexibility.

  • You prioritize straightforward multi-tenancy. 

Best Practices for Designing Your Hypertable Schema

If you need tips to design your hypertable schema, check out this video.

Timescale Special Considerations: On Hypertables and Compression 

Let’s have a quick look at how Timescale’s compression plays with the different table layouts. Compression is one of Timescale’s essential features, allowing you to reduce database size by up to 10x.

P.S. If you don’t know what Timescale is, check out this article. It makes PostgreSQL faster by adding automatic partitioning for your large tables, query planning improvements, and many other things.

Generally speaking, all layouts are fully supported when using Timescale compression. 

The narrow table layout is straightforward to manage: since no new columns are expected at a later point in time (given you decided early on whether to use the metric column or the table-name setup), there are no special considerations for compression. You can also expect a really good compression ratio since the records are very uniform and, therefore, compress extremely well.

Thanks to the DML capabilities of Timescale compression, medium and wide table layouts also play nicely with compression. If you have to add new columns, you can add them directly.

And as a final note, columns with NULL values compress really well: a common worry is that the number of NULL values in medium or wide layouts creates a lot of unnecessary bloat in the hypertable. But due to the way compressed chunks are designed, NULL values compress to almost nothing.
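
As a rough sketch of what enabling compression could look like for the wide measurement_table from earlier (assuming it has already been converted to a hypertable; the exact segmenting, ordering, and policy settings depend on your workload):

-- Enable columnar compression, segmenting by device and ordering by time
ALTER TABLE measurement_table SET (
    timescaledb.compress,
    timescaledb.compress_segmentby = 'device_id',
    timescaledb.compress_orderby   = 'created DESC'
);

-- Automatically compress chunks older than seven days
SELECT add_compression_policy('measurement_table', INTERVAL '7 days');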

Wrap-Up 

Your use case characteristics and your own priorities will determine which table layout you ultimately choose for your schema design. There is no “one-size-fits-all” layout. Look at your specific use case and decide on the pros and cons of the different layout aspects.

You may have different use cases that benefit from different layouts. For example, you might want to store IoT metrics from sensors differently from your infrastructure’s observability data.

While the medium table layout is probably the least common one in the wild, it is often an excellent trade-off between an easy-to-grasp design, future extensibility, and a quick start. Some lookup mechanism is necessary to know which data type column holds each metric.

If you know the exact metrics you want to store, the wide table layout may be a great choice. It provides a quick setup, great query speeds (especially with compression enabled), and queries that look like your typical relational database queries. Finally, no lookup tables are necessary.

If you want the most straightforward design choice and the highest isolation of different types of metrics, the narrow table layout is yours. Given that no changes to the table’s layout are expected, it is the easiest to design, since you probably know exactly what the data type for a specific metric (or a set of metrics sharing the same data type) looks like. With a high cardinality of metrics, a lookup mechanism may be necessary to know which table to query (especially with name mangling).

Whatever you do, just remember one thing: it is possible to switch from one layout to another, but migrating isn’t simple. If you’re unsure about the future, consider one of the simpler extensible layout options.

Read more

  • Best Practices for Time-Series Data Modeling: Single or Multiple Partitioned Table(s) a.k.a. Hypertables

  • Best Practices for (Time-)Series Metadata Tables
