<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0" xmlns:media="http://search.yahoo.com/mrss/">
    <channel>
        <title><![CDATA[Tiger Data Blog]]></title>
        <description><![CDATA[Insights, product updates, and tips from TigerData (Creators of TimescaleDB) engineers on Postgres, time series & AI. IoT, crypto, and analytics tutorials & use cases.]]></description>
        <link>https://www.tigerdata.com/blog</link>
        <image>
            <url>https://www.tigerdata.com/icon.ico</url>
            <title>Tiger Data Blog</title>
            <link>https://www.tigerdata.com/blog</link>
        </image>
        <generator>RSS for Node</generator>
        <lastBuildDate>Tue, 07 Apr 2026 16:24:21 GMT</lastBuildDate>
        <atom:link href="https://www.tigerdata.com/blog" rel="self" type="application/rss+xml"/>
        <ttl>60</ttl>
        <item>
            <title><![CDATA[How Messari Uses Data to Open the Cryptoeconomy to Everyone]]></title>
            <description><![CDATA[Learn how Messari sets up its data stack to collect, calculate, and contextualize crypto metrics, break down data silos, and bring transparency to the cryptoeconomy.]]></description>
            <link>https://www.tigerdata.com/blog/how-messari-uses-data-to-open-the-cryptoeconomy-to-everyone</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/how-messari-uses-data-to-open-the-cryptoeconomy-to-everyone</guid>
            <category><![CDATA[Dev Q&A]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Adam Inoue]]></dc:creator>
            <pubDate>Wed, 25 Aug 2021 14:04:30 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2021/08/feature-image-Messari.png">
            </media:content>
            <content:encoded><![CDATA[<p><em>This is an installment of our “Community Member Spotlight” series, where we invite our customers to share their work, shining a light on their success and inspiring others with new ways to use technology to solve problems.</em></p><p><em>In this edition, </em><a href="https://twitter.com/AdamIn0ue"><em>Adam Inoue</em></a><em>, Software Engineer at Messari, joins us to share how they bring transparency to the cryptoeconomy, combining tons of data about crypto assets with real-time alerting mechanisms to give investors a holistic view of the market and ensure they never miss an important event.</em></p><p><a href="https://messari.io/">Messari</a> is a data analytics and research company on a mission to organize and contextualize information for crypto professionals. Using Messari, analysts and enterprises can analyze, research, and stay on the cutting edge of the crypto world – all while trusting the integrity of the underlying data. </p><p>This gives professionals the power to make informed decisions and take timely action. We are uniquely positioned to provide an experience that combines automated data collection (such as our quantitative <a href="https://messari.io/asset/ethereum/metrics/all">asset metrics</a> and charting tools) with <a href="https://messari.io/research">qualitative research</a> and <a href="https://messari.io/intel/list">market intelligence</a> from a global team of analysts.</p><p>Our users range from some of the most prominent analysts, investors, and individuals in the crypto industry to top platforms like Coinbase, BitGo, Anchorage, 0x, Chainalysis, Ledger, Compound, MakerDAO, and many more. </p><h2 id="about-the-team">About the Team</h2><p>I have over five years of experience as a backend developer in roles where I’ve primarily focused on high-throughput financial systems, financial reporting, and relational databases to support those systems. 
</p><p>After some COVID-related career disruptions, I started at Messari as a software engineer this past April (2021). I absolutely love it. The team is small but growing quickly, and everyone is specialized, highly informed, and at the top of their game. (Speaking of growing quickly, <a href="https://messari.io/careers">we’re hiring</a>!)</p><p>We’re still small enough to function mostly as one team. We are split into front-end and back-end development. The core of our back end is a suite of microservices written in Golang and managed by Kubernetes, and I—along with two other engineers—“own” managing the cluster and associated services. (As an aside, another reason I love Messari: we’re a fully remote team. I’m in Hawaii, and those two colleagues are in New York and London. Culturally, we also minimize meetings, which is great because we’re so distributed, <em>and</em> we end up with lots of time for deep work.)</p><p>From a site reliability standpoint, my team is responsible for all of the backend APIs that serve the live site, our <a href="https://messari.io/api">public API</a>, our real-time data ingestion, the ingestion and calculation of asset metrics, and more.</p><p>So far, I’ve mostly specialized in the ingestion of real-time market data—and that’s where TimescaleDB comes in!</p><h2 id="about-the-project">About the Project</h2><p>Much of our website is completely free to use, but we have <a href="https://messari.io/pro">Pro</a> and <a href="https://messari.io/enterprise">Enterprise</a> tiers that provide enhanced functionality. For example, our Enterprise version includes <a href="https://messari.io/intel/list">Intel</a>, a real-time alerting mechanism that notifies users about important events in the crypto space (e.g., forks, hacks, protocol changes, etc.) 
as they occur.</p><p>We collect and calculate a huge catalog of <a href="https://messari.io/asset/ethereum/metrics/all">crypto-asset metrics</a>, like price, volume, all-time cycle highs and lows, and detailed information about each currency. Handling these metrics uses a relatively low proportion of our compute resources, while real-time trade ingestion is a much more resource-intensive operation. </p><p>Our crypto price data is currently calculated based on several thousand trades per second (ingested from partners, such as <a href="https://www.kaiko.com">Kaiko</a> and <a href="https://www.gemini.com">Gemini</a>), as well as our own on-chain integrations with <a href="https://thegraph.com">The Graph</a>. We also keep exhaustive historical data that goes as far back as the dawn of Bitcoin. (You can read more about the <a href="https://www.investopedia.com/terms/b/bitcoin.asp" rel="noreferrer">history of Bitcoin</a>.)</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/Untitled--1-.png" class="kg-image" alt="Messari dashboard with a dark background. The dashboard consists of four parts: 1) watchlist tracking assets like Bitcoin or Ethereum, 2) ROI chart, 3) line graph showing real-time Bitcoin price data, and 4) Intel chart, a real-time alerting mechanism." 
loading="lazy" width="2000" height="1396" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/Untitled--1-.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/01/Untitled--1-.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2022/01/Untitled--1-.png 1600w, https://timescale.ghost.io/blog/content/images/2022/01/Untitled--1-.png 2048w" sizes="(min-width: 720px) 720px"><figcaption><i><em class="italic" style="white-space: pre-wrap;">Messari dashboard, with data available at </em></i><a href="https://messari.io/"><i><em class="italic" style="white-space: pre-wrap;">messari.io</em></i></a><i><em class="italic" style="white-space: pre-wrap;"> for free</em></i></figcaption></figure><p><strong>Our data pipelines are the core of the quantitative portion of our product—and are, therefore, mission-critical. </strong>For our site to be visibly alive, the most important metric is our real-time volume-weighted average price (VWAP), although we calculate hundreds of other metrics on an hourly or daily basis. We power our real-time view through WebSocket connections to our back-end, and we keep the latest price data in memory to avoid having to make constant repeated database calls. </p><p>Everything “historical”—i.e., even as recently as five minutes ago—makes a call to our time-series endpoint. <strong>Any cache misses there will hit the database, so it’s critical that the database is highly available.</strong></p><p>We use the price data to power the quantitative views we display on our live site and directly serve our data to API users. 
Much of what we display on our live site is regularly retrieved and cached by a backend-for-frontend GraphQL server, but some of it is also retrieved by HTTP calls or WebSocket connections from one or more Go microservices.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/Untitled.png" class="kg-image" alt="Messari dashboard with a dark background. The dashboard shows all available data about Ethereum, including real-time price, ROI, and key metrics like market cap and price." loading="lazy" width="2000" height="998" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/Untitled.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/01/Untitled.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2022/01/Untitled.png 1600w, https://timescale.ghost.io/blog/content/images/2022/01/Untitled.png 2048w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The asset view of Messari dashboard, showing various price stats for a specific currency.</span></figcaption></figure><p>The accuracy of our data is extremely important because it’s public-facing and used to help our users make decisions. And, just like the rest of the crypto space, we are also scaling quickly, both in terms of our business and the amount of data we ingest. </p><h2 id="choosing-and-using-timescaledb">Choosing (and Using!) TimescaleDB</h2><p>We’re wrapping up a complete transition to TimescaleDB from <a href="https://www.influxdata.com">InfluxDB</a>. It would be reasonable to say that <a href="https://timescale.ghost.io/blog/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/" rel="noreferrer">we used InfluxDB until it fell over</a>; we asked it to do a huge amount of ingestion and continuous aggregation, not to mention queries around the clock, to support the myriad requests our users can make. 
</p><p>Over time, we pushed it enough that it became less stable, so eventually, it became clear that InfluxDB wasn’t going to scale with us. Thus, <a href="https://github.com/dev-kpyc">Kevin Pyc</a> (who served as the entire backend “team” until earlier this year) became interested in TimescaleDB as a possible alternative.</p><p>The pure PostgreSQL interface and impressive performance characteristics sold him on TimescaleDB as a good option for us. </p><p>From there, the entire tech group convened and agreed to try TimescaleDB. We were aware of its performance claims but needed to test it out ourselves against our exact use case. I began by reimplementing our real-time trade ingestion database adapter on TimescaleDB—and on every test, TimescaleDB blew my expectations out of the water.</p><p>The most significant aspects of our system are INSERT and SELECT performance.</p><ul><li>INSERTs of real-time trade data are constant, 24/7, and rarely dip below 2,000 rows per second. At peak times, they can exceed 4,500—and, of course, we expect this number to continually increase as the industry continues to grow and we see more and more trades.</li><li>SELECT performance impacts our APIs’ response time for anything we haven’t cached; we briefly cache many of the queries needed for the live site, but less common queries end up hitting the database. </li></ul><p>When we tested these with TimescaleDB, both our SELECT and INSERT performance results flatly outperformed InfluxDB. In testing, even though our fully managed <a href="https://www.timescale.com/products" rel="noreferrer">Timescale</a> instance is currently only located in us-east-1 and most of our infrastructure is in a us-west region, we saw an average of ~40&nbsp;ms improvement in both types of queries. Plus, we could batch-insert 500 rows of data instead of 100, with no discernible drop in execution time relative to InfluxDB. 
</p><p><strong>These impressive performance benchmarks, combined with the fact that we can use Postgres with foreign key relationships to derive new datasets from our existing ones (which we weren’t able to do with InfluxDB), are key differentiators for TimescaleDB.</strong></p><p>✨ <strong>Editor’s Note: </strong><em>For more comparisons and benchmarks, see how TimescaleDB compares to </em><a href="https://timescale.ghost.io/blog/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/" rel="noreferrer"><em>InfluxDB</em></a><em>, </em><a href="https://timescale.ghost.io/blog/how-to-store-time-series-data-mongodb-vs-timescaledb-postgresql-a73939734016/" rel="noreferrer"><em>MongoDB</em></a><em>, </em><a href="https://timescale.ghost.io/blog/timescaledb-vs-amazon-timestream-6000x-higher-inserts-175x-faster-queries-220x-cheaper/" rel="noreferrer"><em>AWS Timestream</em></a><em>, and other </em><a href="https://www.timescale.com/learn/the-best-time-series-databases-compared" rel="noreferrer"><em>time-series database alternatives</em> <em>on various vectors</em></a><em>, from performance and ecosystem to query language and beyond. For tips on optimizing your database insert rate, see our </em><a href="https://timescale.ghost.io/blog/blog/13-tips-to-improve-postgresql-insert-performance/"><em>13 ways to improve PostgreSQL insert performance</em></a><em> blog post.</em></p><p>We are also really excited about continuous aggregates. We store our data at minute-level granularity, so any granularity of data above one minute is powered by continuous queries that feed a rollup table. </p><p>In InfluxDB-world, we had a few problems with continuous queries: they tended to lag a few minutes behind real-time ingestion, and, in our experience, continuous queries would occasionally fail to pick up a trade ingested out of order—for instance, one that’s half an hour old—and it wouldn’t be correctly accounted for in our rollup queries. 
</p><p>Switching these rollups to TimescaleDB continuous aggregates has been great; they’re never out of date, and we can gracefully refresh the proper time range whenever we receive an out-of-order batch of trades or are back-filling data.</p><p>At the time of writing, I’m still finalizing our continuous aggregate views—we had to refresh them all the way back to 2010!—but all of the other parts of our implementation are complete and have been stable for some time.</p><p>✨<em> </em><strong>Editor’s Note</strong>:<em> Check out the </em><a href="https://docs.timescale.com/use-timescale/latest/continuous-aggregates/" rel="noreferrer"><em>continuous aggregates documentation</em></a><em> and follow </em><a href="https://docs.timescale.com/tutorials/latest/blockchain-query/" rel="noreferrer"><em>the step-by-step tutorial</em></a><em> to learn how to utilize continuous aggregates for analyzing crypto data.</em></p><h2 id="current-deployment-future-plans">Current Deployment &amp; Future Plans</h2><p>As I mentioned earlier, all of the core services in our back-end are currently written in Go, and we have some projects on the periphery written in Node or Java. We don't currently need to expose TimescaleDB to any project that isn't written in Go. We use <a href="https://gorm.io">GORM</a> for most database operations, so we connect to TimescaleDB with a <code>gorm.DB</code> object.</p>
<p>We try to use GORM conventions as much as possible; for TimescaleDB-specific operations like <a href="https://docs.timescale.com/api/latest/compression/add_compression_policy/">managing compression policies</a> or the <a href="https://docs.timescale.com/use-timescale/latest/hypertables/create/"><code>create_hypertable</code> step</a> where no GORM method exists, we write out queries literally.</p>
<p>For instance, we initialize our tables using <code>repo.PrimaryDB.AutoMigrate(repo.Schema.Model)</code>, which is a GORM-specific feature, but we create new hypertables as follows:</p>
<pre><code>// Create the hypertable with a literal SQL query; GORM has no
// dedicated method for create_hypertable.
res := repo.PrimaryDB.Table(tableName).Exec(
	fmt.Sprintf("SELECT create_hypertable('%s', 'time', chunk_time_interval =&gt; INTERVAL '%s');",
		tableName, getChunkSize(repo.Schema.MinimumInterval)))
if res.Error != nil {
	return res.Error
}
</code></pre>
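<p>Other TimescaleDB-specific operations follow the same literal-query pattern. As a hypothetical sketch (not Messari’s actual code; the table name and 48-hour window are placeholders mirroring the compression policy described later in this post):</p>

```go
package main

import "fmt"

// compressionPolicySQL builds an add_compression_policy statement in the
// same literal-query style as the create_hypertable call above.
// Note: the hypertable must first have compression enabled via
// ALTER TABLE ... SET (timescaledb.compress = true).
func compressionPolicySQL(tableName, compressAfter string) string {
	return fmt.Sprintf("SELECT add_compression_policy('%s', INTERVAL '%s');",
		tableName, compressAfter)
}

func main() {
	fmt.Println(compressionPolicySQL("trades", "48 hours"))
	// Executed through GORM as, e.g.:
	// repo.PrimaryDB.Exec(compressionPolicySQL("trades", "48 hours"))
}
```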
<p>Currently, our architecture that touches TimescaleDB looks like this:</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2021/08/Messari---Dev-Q-A---Architecture--1-.jpg" class="kg-image" alt="The architecture diagram of Messari solution" loading="lazy" width="2000" height="1352" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2021/08/Messari---Dev-Q-A---Architecture--1-.jpg 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2021/08/Messari---Dev-Q-A---Architecture--1-.jpg 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2021/08/Messari---Dev-Q-A---Architecture--1-.jpg 1600w, https://timescale.ghost.io/blog/content/images/2021/08/Messari---Dev-Q-A---Architecture--1-.jpg 2000w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The current architecture diagram</span></figcaption></figure><p>We use Prometheus for a subset of our monitoring, but for our real-time ingestion engine, we’re in an advantageous position: the system’s performance is obvious just by looking at our logs. </p><p>Whenever our database upsert backlog is longer than a few thousand rows, we log that with a timestamp to easily see how large the backlog is and how quickly we can catch up. </p><p>Our backlog tends to be shorter and more stable with TimescaleDB than it was previously—and our developer experience has improved as well. </p><p>Speaking for myself, I didn’t understand much about our InfluxDB implementation’s inner workings, but after talking it through with my teammates, it seems highly customized and hard to explain from scratch. The hosted TimescaleDB implementation with Timescale is much easier to understand, particularly because we can easily view the live database dashboard, complete with all our table definitions, chunks, policies, and the like.</p><p>Looking ahead, we have a lot of projects that we’re excited about! 
One of the big ones is that, with TimescaleDB, we’ll have a much easier time deriving metrics from multiple existing data sets. </p><p>In the past, because InfluxDB is NoSQL, linking <a href="https://www.tigerdata.com/blog/time-series-introduction" rel="noreferrer">time series</a> together to generate new, derived, or aggregated metrics was challenging. <strong>Now, we can use simple JOINs in one query to easily return all the data we need to derive a new metric</strong>. </p><p>Many other projects have to remain under wraps for now, but we think TimescaleDB will be a crucial part of our infrastructure for years to come, and we’re excited to scale with it.</p><h2 id="getting-started-advice-resources">Getting Started Advice &amp; Resources</h2><p>TimescaleDB is complex, and it's important to understand early on how <a href="https://www.tigerdata.com/blog/database-indexes-in-postgresql-and-timescale-cloud-your-questions-answered" rel="noreferrer">hypertables</a> are implemented. To best benefit from TimescaleDB’s features, you need to think about how to chunk your hypertables, what retention and compression policies to set, and whether/how to set up continuous aggregates. (Particularly with regard to your hypertable chunk size, because it's hard to change that decision later.)</p><p>In our case, the “answers” to three of these questions carried over from our previous InfluxDB setup: compress after 48 hours (the maximum time in the past we expect to ingest a trade); retain everything; and roll up all of our price and volume data into our particular set of intervals (5m, 15m, 30m, 1h, 6h, 1d, and 1w).</p><p>The most difficult part was understanding how long our chunks should be (i.e., setting our <a href="https://docs.timescale.com/api/latest/hypertable/set_chunk_time_interval/#optional-arguments"><code>chunk_time_interval</code></a> on each hypertable). We settled on one day, mostly by default, with some particularly small metrics chunked after a year instead.</p>
<p>I’m not sure these decisions would be as obvious for other use cases. </p><p>In summary, the strongest advantages of TimescaleDB are its performance and pure Postgres interface. Both of these make us comfortable recommending it across a wide range of use cases. Still, the decision shouldn’t be cavalier; we tested Timescale for several weeks before committing to the idea and finishing our implementation.</p><p><em>We’d like to thank Adam and all of the folks at Messari for sharing their story and for their effort to lower the barriers to investing in crypto assets by offering a massive number of crypto-asset metrics and a real-time alerting mechanism.</em></p><p><em>We’re always keen to feature new community projects and stories on our blog. If you have a story or project you’d like to share, reach out on Slack (</em><a href="https://timescaledb.slack.com/team/U03797BSQKT?ref=timescale.com"><em>@Ana Tavares</em></a><em>), and we’ll go from there.</em></p>]]></content:encoded>
        </item>
        <item>
            <title><![CDATA[How METER Group Brings a Data-Driven Approach to the Cannabis Production Industry]]></title>
            <description><![CDATA[Learn how METER Group architected its data stack to collect and visualize massive amounts of data – and help customers make informed business decisions.]]></description>
            <link>https://www.tigerdata.com/blog/how-meter-group-brings-a-data-driven-approach-to-the-cannabis-production-industry</link>
            <guid isPermaLink="true">https://www.tigerdata.com/blog/how-meter-group-brings-a-data-driven-approach-to-the-cannabis-production-industry</guid>
            <category><![CDATA[Dev Q&A]]></category>
            <category><![CDATA[PostgreSQL]]></category>
            <dc:creator><![CDATA[Paolo Bergantino]]></dc:creator>
            <pubDate>Mon, 26 Jul 2021 14:42:35 GMT</pubDate>
            <media:content medium="image" url="https://timescale.ghost.io/blog/content/images/2021/07/174779559_978280639374264_6865881920322659010_n.jpg">
            </media:content>
            <content:encoded><![CDATA[<p><em>This is an installment of our “Community Member Spotlight” series, where we invite our customers to share their work, shining a light on their success and inspiring others with new ways to use technology to solve problems.</em></p><p><em>In this edition, </em><a href="https://www.linkedin.com/in/paolobergantino/"><em>Paolo Bergantino</em></a><em>, Director of Software for the Horticulture business unit at METER Group, joins us to share how they make data accessible to their customers so that they can maximize their cannabis yield and increase efficiency and consistency between grows.</em></p><p><a href="https://aroya.io/">AROYA</a> is the leading cannabis production platform servicing the U.S. market today. AROYA is part of <a href="https://www.metergroup.com/">METER Group</a>, a scientific instrumentation company with 30+ years of expertise in developing sensors for the agriculture and food industries. We have taken this technical expertise and applied it to the cannabis market, developing a platform that allows growers to grow more efficiently and increase their yields—and to do so consistently and at scale.</p><h2 id="about-the-team">About the team</h2><p>My name is <a href="https://www.linkedin.com/in/paolobergantino/">Paolo Bergantino</a>. I have about 15 years of experience developing web applications in various stacks, and I have spent the last four here at METER Group. Currently, I am the Director of Software for the Horticulture business unit, which is in charge of the development and infrastructure of the AROYA software platform. My direct team consists of about ten engineers, three QA engineers, and a UI/UX designer. (<a href="https://www.metergroup.com/career/">We’re also hiring!</a>)</p><h2 id="about-the-project">About the project</h2><p>AROYA is built as a React Single-Page App (SPA) that communicates with a Django/DRF back-end. 
In addition to using <a href="https://www.timescale.com/products">Timescale Cloud</a> for our database, we use AWS services such as EC2+ELB for our app and workers, <a href="https://aws.amazon.com/elasticache/redis/">ElastiCache for Redis</a>, <a href="https://aws.amazon.com/s3/">S3</a> for various tasks, <a href="https://docs.aws.amazon.com/iot/latest/developerguide/sqs-rule-action.html">AWS IoT/SQS</a> for handling packets from our sensors, and some other services here and there.</p><p>As I previously mentioned, AROYA was born out of our desire to build a system that leveraged our superior sensor technology in an industry that needed such a system. Cannabis worked out great in this respect, as the current legalization movement throughout the U.S. has resulted in a lot of disruption in the space. </p><p>The more we spoke to growers, the more we were struck by how much mythology there was in growing cannabis and by how little science was being applied by relatively large operations. As a company with deeply scientific roots, we found it to be a perfect match and an area where we could bring some of our knowledge to the forefront. <strong>We ultimately believe the only survivors in the space are those who can use data-driven approaches to their cultivation to maximize their yield and increase efficiency and consistency between grows. </strong></p><p>As part of the AROYA platform, we developed a wireless module (called a “nose”) that could be attached to our sensors. 
Using Bluetooth Low Energy (BLE) for low power consumption and attaching a solar panel to take advantage of the lights in a grow room, the module can run indefinitely without charging.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2021/07/204951218_1941930365957763_2364085903187304407_n.jpg" class="kg-image" alt="The AROYA nose device with the cannabis plants in the background" loading="lazy" width="1032" height="1290" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2021/07/204951218_1941930365957763_2364085903187304407_n.jpg 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2021/07/204951218_1941930365957763_2364085903187304407_n.jpg 1000w, https://timescale.ghost.io/blog/content/images/2021/07/204951218_1941930365957763_2364085903187304407_n.jpg 1032w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The AROYA nose in its natural habitat (</span><a href="https://www.instagram.com/p/CQbpXuXBgke/"><span style="white-space: pre-wrap;">aroya.io</span></a><span style="white-space: pre-wrap;"> Instagram)</span></figcaption></figure><p>The most critical sensor we attach to this nose is called the TEROS 12, the three-pronged sensor pictured below. It can be installed into any growing medium (like rockwool, coconut coir, soil, or mixes like perlite, pumice, or peat moss) and give insights into the temperature, water content (WC), and electrical conductivity (EC) of the medium. 
Without getting too into the weeds (pardon the pun), WC and EC, in particular, are crucial in helping growers make informed irrigation decisions that will steer the plants into the right state and ultimately maximize their yield potential.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2021/07/161449707_437044057364377_9165544412024707037_n.jpg" class="kg-image" alt="The white three-pronged sensor called TEROS laying on the grey shelf" loading="lazy" width="845" height="556" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2021/07/161449707_437044057364377_9165544412024707037_n.jpg 600w, https://timescale.ghost.io/blog/content/images/2021/07/161449707_437044057364377_9165544412024707037_n.jpg 845w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The AROYA nose with a connected TEROS 12 sensor (</span><a href="https://www.instagram.com/p/CMe_cyuBHUO/"><span style="white-space: pre-wrap;">aroya.io </span></a><span style="white-space: pre-wrap;">Instagram)</span></figcaption></figure><p>We also have an ATMOS 14 sensor for measuring the climate in the rooms and <a href="https://aroya.io/cultivation/">a whole suite of sensors</a> for other use cases.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2021/07/201782665_1014596755960402_5100068875632858009_n-1.jpg" class="kg-image" alt="The white ATMOS sensor hanging from the ceiling with cannabis plants underneath it" loading="lazy" width="1032" height="1290" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2021/07/201782665_1014596755960402_5100068875632858009_n-1.jpg 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2021/07/201782665_1014596755960402_5100068875632858009_n-1.jpg 1000w, https://timescale.ghost.io/blog/content/images/2021/07/201782665_1014596755960402_5100068875632858009_n-1.jpg 1032w" 
sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">An AROYA repeater with an ATMOS 14 sensor for measuring the climate (</span><a href="https://www.instagram.com/p/CQME57tB6Ds/"><span style="white-space: pre-wrap;">aroya.io </span></a><span style="white-space: pre-wrap;">Instagram)</span></figcaption></figure><p>AROYA’s core competency is collecting this data—e.g., EC, WC, soil temp, air temperature, etc.—and serving it to our clients in real-time (or, at least “real-time” for our purposes, as our typical sampling interval is 3 minutes).</p><p>Growers typically split their growing rooms into irrigation zones. We encourage them to install statistically significant numbers of sensors into each room and its zones, so that AROYA gives them good and actionable feedback on the state of their room. For example, there’s a concept in cultivation called "<a href="https://aroya.io/resources/crop-steering/">crop steering</a>" that basically says that if you stress the plant in just the right way, you can "steer" it into generative or vegetative states at will and drive it to squeeze every last bit of flower. How and when you do this is crucial to doing it properly.</p><p>Our data allows growers to dial in their irrigation strategy so they can hit their target "dry back" for the plant (this is more or less the difference between the water content at the end of irrigation and the water content at the next irrigation event). Optimizing dry back is one of the biggest factors in making crop steering work, and it's basically impossible to do well without good data. 
(We provide lots of other data that helps growers make decisions, but this is one of the most important ones.)</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2021/07/image--1-.png" class="kg-image" alt="A line chart with dark blue background showing electrical conductivity and water content data related to a room in AROYA." loading="lazy" width="1723" height="659" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2021/07/image--1-.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2021/07/image--1-.png 1000w, https://timescale.ghost.io/blog/content/images/size/w1600/2021/07/image--1-.png 1600w, https://timescale.ghost.io/blog/content/images/2021/07/image--1-.png 1723w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">Graph showing electrical conductivity (EC) and water content (WC) data related to a room in AROYA.</span></figcaption></figure><p>This can be even more important when multiple cultivars (“strains”) of cannabis are grown in the same room, as the differences between two cultivars regarding their needs and expectations can be pretty dramatic. For those unfamiliar with the field, an example might be that different cultivars "drink" water differently, and thus must be irrigated differently to achieve maximum yields. There are also "stretchy" cultivars that grow taller faster than "stocky" ones, and this also affects how they interact with the environment. AROYA not only helps in terms of sensing, but in documenting and helping understand these differences to improve future runs.</p><p>The most important thing from collecting all this data is making it accessible to users via graphs and visualizations in an intuitive, reliable, and accurate way, so they can make informed decisions about their cultivation.</p><p>We also have alerts and other logic that we apply to incoming data. 
These visualizations and business logic can happen at the sensor level, at the zone level, at the room level, or sometimes even at the facility level.</p><p>A typical use case with AROYA might be that a user logs in to their dashboard to view sensor data for a room. Initially, they view charts aggregated to the zone level, but they may decide to dig deeper into a particular zone and view the individual sensors that make up that zone. Or, vice versa, they may want to zoom out and view data averaged all the way up to the room. So, as we designed our solution, we needed to ensure we could get to (and provide) the data at the right aggregation level quickly.</p><h2 id="choosing-and-using-timescaledb">Choosing and using TimescaleDB</h2><h3 id="the-initial-solution">The initial solution</h3><p>During the days of our closed alpha and beta of AROYA with early trial accounts (late 2017 through our official launch in December 2019), the amount of data coming into the system was not significant. Our own hardware was still being developed (and hardware development is nice and slow), so we had to make do with some legacy data loggers that METER also produces. </p><p>These data loggers only sampled every 5 minutes and, at best, reported every 15 minutes. We used <a href="https://aws.amazon.com/rds/aurora/postgresql-features/">AWS’ RDS Aurora PostgreSQL</a> service and cobbled together a set of triggers and functions that partitioned our main readings table by each client facility—but no more. Because we have so many sensor models and data types we can collect, I chose to use a <a href="https://docs.timescale.com/timescaledb/latest/overview/data-model-flexibility/narrow-data-model/#narrow-table-model">narrow data model</a> for our main readings table. 
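</p><p><em>✨ <strong>Editor’s Note:</strong> To illustrate the narrow table model and the sensor/zone/room aggregation levels described above, here is a hypothetical Python sketch (all names and values are invented):</em></p>

```python
# Illustrative sketch (not AROYA's schema): a narrow data model stores one
# row per (timestamp, sensor, metric, value), so each aggregation level is
# simply a different grouping of the same rows.
from collections import defaultdict
from statistics import mean

# Narrow rows: (timestamp, sensor_id, metric, value)
rows = [
    (0, "s1", "wc", 41.0), (0, "s2", "wc", 43.0),  # zone A
    (0, "s3", "wc", 39.0),                          # zone B
]
zone_of = {"s1": "A", "s2": "A", "s3": "B"}
room_of = {"A": "veg-1", "B": "veg-1"}

def rollup(rows, level):
    """Average each metric at 'sensor', 'zone', or 'room' level."""
    groups = defaultdict(list)
    for ts, sensor, metric, value in rows:
        if level == "sensor":
            key = (ts, sensor, metric)
        elif level == "zone":
            key = (ts, zone_of[sensor], metric)
        else:  # room
            key = (ts, room_of[zone_of[sensor]], metric)
        groups[key].append(value)
    return {key: mean(vals) for key, vals in groups.items()}

print(rollup(rows, "zone"))  # zone A averages s1 and s2; zone B is just s3
print(rollup(rows, "room"))  # the room average covers all three sensors
```

<p><em>In AROYA’s case these rollups happen in PostgreSQL; the sketch only shows how a narrow row format makes each aggregation level a grouping choice.</em></p><p>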
</p><p>This overall setup worked well enough at first, but as we progressed from alpha to beta and our customer base grew, it became increasingly clear that it was not a long-term solution for our <a href="https://www.tigerdata.com/blog/time-series-introduction" rel="noreferrer">time series data</a> needs. I could have expanded my self-managed system of triggers and functions and cobbled together additional partitions within a facility, but this did not seem ideal. There had to be a better way! </p><p>I started looking into dedicated time-series solutions. I am a bit of a home automation aficionado, and I was already familiar with InfluxDB—but <strong>I didn’t wish to split my relational data and readings data or teach my team a new query language. </strong></p><p><strong>TimescaleDB, being built on top of PostgreSQL, initially drew my attention: it “just worked” in every respect I could expect it to, and I could use the same tools I was used to.</strong> At this point, however, I had a few reservations about some non-technical aspects of hosting TimescaleDB that prevented me from going full steam ahead with it.</p><p><em>✨ <strong>Editor’s Note:</strong> For more comparisons and benchmarks, see how TimescaleDB compares to </em><a href="https://timescale.ghost.io/blog/timescaledb-vs-influxdb-for-time-series-data-timescale-influx-sql-nosql-36489299877/" rel="noreferrer"><em>InfluxDB</em></a><em>, </em><a href="https://timescale.ghost.io/blog/how-to-store-time-series-data-mongodb-vs-timescaledb-postgresql-a73939734016/" rel="noreferrer"><em>MongoDB</em></a><em>, </em><a href="https://timescale.ghost.io/blog/timescaledb-vs-amazon-timestream-6000x-higher-inserts-175x-faster-queries-220x-cheaper/" rel="noreferrer"><em>AWS Timestream</em></a><em>, and other </em><a href="https://www.timescale.com/learn/the-best-time-series-databases-compared" rel="noreferrer"><em>time-series database alternatives</em> <em>on various vectors</em></a><em>, from performance and 
ecosystem to query language and beyond.</em></p><h3 id="applying-a-band-aid-and-setting-a-goal">Applying a band-aid and setting a goal</h3><p>If I am perfectly truthful, before this point, I did not have any serious requirements or standards for what I considered adequate quality of service for our application. I had a bit of an “<a href="https://en.wikipedia.org/wiki/I_know_it_when_I_see_it">I know it when I see it</a>” attitude towards the whole thing. </p><p>When we had a potential client walk away during a demo due to a particularly slow-loading graph, I knew we had a problem on our hands and that we needed something really solid for the long term. </p><p>Still, at the time, we also needed something to get us by until we could perform a thorough evaluation of the available solutions and build something around that. At this point, I decided to place a <a href="https://redis.io/topics/cluster-tutorial">Redis cluster</a> between RDS and our application, which stored the last 30 days of sensor data (at all the aggregation levels required) as a Pandas data frame. Any chart request coming in for data within the last 30 days (which accounted for something like 90% of our requests) would simply hit Redis. Anything older would cobble together the answer using both Redis and the database. <strong>Performance for the 90% use case was adequate, but anything that hit the database was getting increasingly dreadful as more and more historical data piled up.</strong> </p><p>At this point, I set the goalposts for what our new solution would need to meet: <em>Any chart request, which is an integral part of AROYA, needs to take less than one second for the API to serve.</em></p><h3 id="the-research-and-the-first-solution">The research and the first solution</h3><p>We looked at other databases: we revisited InfluxDB, and we got into a beta of AWS Timestream and evaluated that. 
We even considered going NoSQL for the whole thing. We ran tests and benchmarks, created matrices of pros and cons, estimated costs, and the whole shebang. <strong>Nothing compared favorably to what we were able to achieve with TimescaleDB.</strong></p><p>Ultimately,<strong> the feature that really caught our attention was </strong><a href="https://docs.timescale.com/timescaledb/latest/getting-started/create-cagg/"><strong>continuous aggregates</strong></a><strong> in TimescaleDB</strong>. The way our logic works is that we look at the timeframe the user is requesting and sample our data accordingly. In other words, if a user fetches three months' worth of data, we would not send three months' worth of raw data to the front end to be graphed. Instead, we would group our data into appropriately sized buckets that give us just the amount of data we want to display in the interface. </p><p>It would require quite a few views, but if we created continuous aggregates for every aggregation level and bucket size we cared about and then directly queried the right aggregation/bucket combination (depending on the parameters requested), that should do it, right? The answer was a resounding <strong>yes</strong>. </p><p><strong>The performance we were able to achieve using these views shattered the competition. </strong>Although I admit we were kind of “cheating” by precalculating the data, the point is that we could easily do it. Not only that, but when we ran load tests on our proposed infrastructure, we were blown away by how much more traffic we could support without any service degradation. 
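</p><p><em>✨ <strong>Editor’s Note:</strong> The bucketing logic described above can be sketched as follows (the bucket widths and target point count are hypothetical, not AROYA’s actual values):</em></p>

```python
# Illustrative only: pick a bucket width so that any requested timeframe
# resolves to roughly a fixed number of chart points, then query the
# matching pre-aggregated view. All constants here are hypothetical.

TARGET_POINTS = 500  # roughly how many points the front end should draw

# Bucket widths (in seconds) for which a pre-aggregated view exists.
BUCKETS = [180, 900, 3600, 21600, 86400]  # 3 min ... 1 day

def pick_bucket(range_seconds):
    """Smallest available bucket that keeps the point count near target."""
    for bucket in BUCKETS:
        if range_seconds / bucket <= TARGET_POINTS:
            return bucket
    return BUCKETS[-1]

# A 1-day chart can use the raw 3-minute resolution...
print(pick_bucket(24 * 3600))        # 180 seconds (480 points)
# ...while a 3-month chart falls back to 6-hour buckets.
print(pick_bucket(90 * 24 * 3600))   # 21600 seconds (360 points)
```

<p><em>Each bucket width would then map to its own continuous aggregate to query.</em></p><p>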
We could also eliminate all the complicated infrastructure that our Redis layer required, which was quite a load off (literally and figuratively).</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/image--4-.png" class="kg-image" alt="The chart with two lines showing the application server load before and after the TimescaleDB deployment." loading="lazy" width="613" height="317" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/image--4-.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/image--4-.png 613w"><figcaption><span style="white-space: pre-wrap;">Grafana dashboard for the internal team showing app server load average before and after deployment of the initial TimescaleDB implementation.</span></figcaption></figure><p>The Achilles’ heel of this solution, as an astute reader may have already noticed, is that we were paying for this performance in disk space. </p><p>I initially brushed this off as a fair trade and moved on with my life. 
<strong>We found </strong><a href="https://docs.timescale.com/timescaledb/latest/getting-started/compress-data/"><strong>TimescaleDB’s compression</strong></a><strong> to be as good as advertised, which gave us 90%+ space savings in our underlying hypertable, </strong>but our sizable collection of uncompressed continuous aggregates grew by the day (keep reading to learn why this is a “but”...).</p><p><em>✨ <strong>Editor’s Note</strong>: We’ve put together resources about </em><a href="https://docs.timescale.com/timescaledb/latest/getting-started/create-cagg/#what-are-continuous-aggregates"><em>continuous aggregates</em></a><em> and </em><a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/compression/"><em>compression</em></a><em> to help you get started.</em></p><h3 id="the-%E2%80%9Cfinal%E2%80%9D-solution">The “final” solution</h3><p>AROYA has been on an amazing trajectory since launch, and our growth was evident in the months before and after we deployed our initial TimescaleDB implementation. Thousands upon thousands of sensors hitting the field was great for business – but bad for our disk space. </p><p>Our monitoring told a good story of how long our chart requests were taking, as 95%+ of them were under 1 second, and virtually all were under 2 seconds. Still, within a few months of deployment, we needed to upgrade tiers in Timescale Cloud solely to keep up with our disk usage.</p><p>We had adequate computing resources for our load, but 1 TB was no longer enough, so we doubled our total instance size to get another 1 TB. While everything was running smoothly, I felt a dark cloud overhead as our continuous aggregates grew and grew in size.</p><p>The clock was ticking, and before we knew it, we were approaching 2 TB of readings. So, we had to take action. 
</p><p>We had attended a webinar hosted by Timescale and heard someone make a relatively off-hand comment about rolling their own compression for continuous aggregates. This planted a seed, and that was all we needed to get going.</p><p>The plan was thus: first, after consulting with Timescale staff, we learned we had way too many bucket sizes. We could use <a href="https://docs.timescale.com/api/latest/analytics/time_bucket/">TimescaleDB’s time_bucket functions</a> to compute some of these buckets on the fly, without affecting performance or keeping as many continuous aggregates. That was an easy win. </p><p>Next, we split each of our current continuous aggregates into three separate components:</p><ul><li>First, we kept the original continuous aggregate.</li><li>Then, we leveraged the <a href="https://docs.timescale.com/timescaledb/latest/how-to-guides/user-defined-actions/">TimescaleDB job scheduler</a> to move and compress chunks from the original continuous aggregate into a <a href="https://docs.timescale.com/timescaledb/latest/getting-started/create-hypertable/">hypertable</a> for that specific bucket/aggregation view.</li><li>Finally, we created a plain old view that UNIONed the two and made it a transparent change for our application. 
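</li></ul><p><em>✨ <strong>Editor’s Note:</strong> The "fewer bucket sizes plus on-the-fly re-bucketing" idea above can be sketched in Python as follows (an illustration of the approach, not TimescaleDB’s implementation; the data is invented):</em></p>

```python
# Illustrative sketch: instead of keeping a continuous aggregate per bucket
# size, keep sums and counts at a fine bucket and roll them up into coarser
# buckets on demand.
from collections import defaultdict

def time_bucket(width, ts):
    """Floor a timestamp to its bucket start, like TimescaleDB's time_bucket."""
    return ts - (ts % width)

# Fine-grained (15-minute) pre-aggregates: bucket_start -> (sum, count)
fine = {0: (84.0, 2), 900: (88.0, 2), 1800: (80.0, 2), 3600: (90.0, 2)}

def rebucket(fine, width):
    """Combine fine buckets into coarser ones and return per-bucket averages."""
    coarse = defaultdict(lambda: (0.0, 0))
    for ts, (s, c) in fine.items():
        key = time_bucket(width, ts)
        cs, cc = coarse[key]
        coarse[key] = (cs + s, cc + c)
    return {ts: s / c for ts, (s, c) in sorted(coarse.items())}

print(rebucket(fine, 3600))  # hourly averages from 15-minute aggregates
```

<ul><li><em>Note that the sketch carries sums and counts rather than averages, so the coarser buckets remain exact averages of the underlying data.</em>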
</li></ul><p>This allowed us to compress everything but the last week of all of our continuous aggregates, and the results were as good as we could have hoped for.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0-5.png" class="kg-image" alt="The line chart showing the compression of data from 1.83 TB to 700 GB" loading="lazy" width="626" height="229" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/pasted-image-0-5.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0-5.png 626w"><figcaption><span style="white-space: pre-wrap;">The 1.83 TB database was compressed down to 700 GB.</span></figcaption></figure><p><strong>We were able to take our ~1.83 TB database and compress it down to 700 GB</strong>. Not only that, about 300 GB of that is log data that’s unrelated to our main reading pipeline. </p><p>We will be migrating this data out soon, which gives us a vast amount of room to grow. (We think we can even move back to the 1 TB plan at this point, but we have to test to ensure that compute doesn’t become an issue.) The rate of growth in disk usage also slowed massively, which bodes well for this solution in the long term. What’s more, there was virtually no penalty for doing this in terms of performance for any of the metrics we monitor.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0--1--1.png" class="kg-image" alt="The dot plot on the dark background showing how long sampling of chart requests takes to serve." 
loading="lazy" width="624" height="232" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/pasted-image-0--1--1.png 600w, https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0--1--1.png 624w"><figcaption><span style="white-space: pre-wrap;">Our monitoring shows how long sampling of chart requests takes to serve.</span></figcaption></figure><p>Ultimately TimescaleDB had wins across the board for my team. <strong>Performance was going to be the driving force behind whatever we went with, and TimescaleDB has delivered that in spades.</strong></p><h2 id="current-deployment-future-plans">Current deployment &amp; future plans</h2><p><strong>We currently ingest billions of readings every month using TimescaleDB and couldn’t be happier. </strong>Our data ingest and charting capabilities are two of the essential aspects of AROYA’s infrastructure. </p><p>While the road to get here has been a huge learning experience, our current infrastructure is straightforward and performant, and we’ve been able to rely on it to work as expected and to do the right thing. 
I am not sure I can pay a bigger compliment than that.</p><figure class="kg-card kg-image-card kg-card-hascaption"><img src="https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0--2-.png" class="kg-image" alt="The architecture diagram of AROYA solution" loading="lazy" width="1600" height="764" srcset="https://timescale.ghost.io/blog/content/images/size/w600/2022/01/pasted-image-0--2-.png 600w, https://timescale.ghost.io/blog/content/images/size/w1000/2022/01/pasted-image-0--2-.png 1000w, https://timescale.ghost.io/blog/content/images/2022/01/pasted-image-0--2-.png 1600w" sizes="(min-width: 720px) 720px"><figcaption><span style="white-space: pre-wrap;">The current architecture diagram</span></figcaption></figure><p>We’ve recently gone live with our AROYA Analytics release, which builds upon what we’ve done to deliver deeper insights into the environment and the operations at the facilities using our service. Every step of the way, it’s been straightforward (and performant!) to calculate the metrics we need with our TimescaleDB setup. </p><h2 id="getting-started-advice-resources">Getting started advice &amp; resources</h2><p>I think it’s worth mentioning that there were many trade-offs and requirements that guided me to where AROYA is today with our use of TimescaleDB. Ultimately, my story is simply the set of decisions that led me to where we are now, and people’s mileage may vary depending on their requirements. </p><p>I am sure that the set of functionality offered means that, with a little bit of creativity, TimescaleDB can work for just about any time-series use case I can think of.</p><p>The exercise we went through when iterating from our initial non-Timescale solution to Timescale was crucial in getting me comfortable with that migration. Moving such a critical part of my infrastructure was scary, and it is <em>still</em> scary. 
</p><p>Monitoring everything you can, having redundancies, and being vigilant about any unexpected activity (even if it’s not something that would trigger an error) have helped us stay out of trouble.</p><p>We have a big <a href="https://grafana.com/">Grafana</a> dashboard on a TV in our office that displays various metrics, and multiple times we’ve seen something odd and uncovered an issue that could have festered into something much worse if we hadn’t dug into it right away. Finally, diligent load testing of the infrastructure and staging runs of any significant modifications have made our deployments a lot less stressful, since they instill quite a bit of confidence.</p><p><em><strong>✨ Editor’s Note:</strong> Check out our </em><a href="https://www.youtube.com/playlist?list=PLsceB9ac9MHTjwvV18QJnPcLrTXm_Q-Ft"><em>Grafana 101 video series</em></a><em> and </em><a href="https://docs.timescale.com/timescaledb/latest/tutorials/grafana/"><em>Grafana tutorials</em></a><em> to learn everything from building awesome, interactive visualizations to setting up custom alerts, sharing dashboards with teammates, and solving common issues.</em></p><p>I would like to give a big shout-out to Neil Parker, who is my right-hand man in anything relating to AROYA infrastructure and did virtually all of the actual work in getting many of these things set up and running. 
I would also like to thank <a href="https://twitter.com/michaelfreedman">Mike Freedman</a> and <a href="https://www.linkedin.com/in/priscilafletcher/">Priscila Fletcher</a> from Timescale, who have given us a great deal of time and information and helped us on our journey with TimescaleDB.</p><p><em>We’d like to give a big thank you to Paolo and everyone at AROYA for sharing their story, as well as for their efforts to help transform the cannabis production industry, equipping growers with the data they need to improve their crops, make informed decisions, and beyond.</em></p><p><em>We’re always keen to feature new community projects and stories on our blog. If you have a story or project you’d like to share, reach out on Slack (</em><a href="https://timescaledb.slack.com/archives/D01TTSRCFC7"><em>@</em></a><em>Ana Tavares), and we’ll go from there.</em></p>]]></content:encoded>
        </item>
    </channel>
</rss>