---
title: Handle semi-structured data with JSON | Tiger Data Docs
description: Use JSON or JSONB to store semi-structured data, such as user-defined fields, and learn how to index the JSONB structure and individual fields
---

JSON and JSONB fit payloads where the shape changes per tenant or device. Keep identifiers and timestamps as typed columns, and tuck the long tail of optional fields into JSON so you can index what matters.

```
CREATE TABLE metrics (
  time TIMESTAMPTZ,
  user_id INT,
  device_id INT,
  data JSONB
);
```

That example keeps `time`, `user_id`, and `device_id` as real columns because predicates and joins hit them far more often than rarely used JSON keys.

You should also use the JSONB data type, that is, JSON stored in a binary format, rather than JSON data type. JSONB data types are more efficient in both storage overhead and lookup performance.

Note

Use JSONB for user-defined data rather than sparse data. This works best for most data sets. For sparse data, use NULLable fields and, if possible, run on top of a compressed file system like ZFS. This will work better than a JSONB data type, unless the data is extremely sparse, for example, more than 95% of fields for a row are empty.

## Index the JSONB structure

When you index JSONB data across all fields, it is usually best to use a GIN (generalized inverted) index. In most cases, you can use the default GIN operator, like this:

```
CREATE INDEX idxgin ON metrics USING GIN (data);
```

For more information about GIN indexes, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/datatype-json.html#JSON-INDEXING).

This index only optimizes queries where the `WHERE` clause uses the `?`, `?&`, `?|`, or `@>` operator. For more information about these operators, see the [PostgreSQL documentation](https://www.postgresql.org/docs/current/functions-json.html#FUNCTIONS-JSONB-OP-TABLE).

## Index individual fields

JSONB columns sometimes have common fields containing values that are useful to index individually. Indexes like this can be useful for ordering operations on field values, [multicolumn indexes](https://www.postgresql.org/docs/current/indexes-multicolumn.html), and indexes on specialized types, such as a postGIS geography type. Another advantage of indexes on individual field values is that they are often smaller than GIN indexes on the entire JSONB field. To create an index like this, it is usually best to use a [partial index](https://www.postgresql.org/docs/current/indexes-partial.html) on an [expression](https://www.postgresql.org/docs/current/indexes-expressional.html) accessing the field. For example:

```
CREATE INDEX idxcpu
  ON metrics(((data->>'cpu')::double precision))
  WHERE data ? 'cpu';
```

In this example, the expression being indexed is the `cpu` field inside the `data` JSONB object, cast to a double. The cast reduces the size of the index by storing the much smaller double, instead of a string. The `WHERE` clause ensures that the only rows included in the index are those that contain a `cpu` field, because the `data ? 'cpu'` returns `true`. This also serves to reduce the size of the index by not including rows without a `cpu` field. Note that in order for a query to use the index, it must have `data ? 'cpu'` in the WHERE clause.

This expression can also be used with a multi-column index, for example, by adding `time DESC` as a leading column. Note, however, that to enable index-only scans, you need `data` as a column, not the full expression `((data->>'cpu')::double precision)`.
