|
| 1 | +## IoT recipes |
| 2 | + |
| 3 | +This section contains recipes for IoT issues: |
| 4 | + |
| 5 | +### Work with columnar IoT data |
| 6 | + |
| 7 | +Narrow and medium width tables are a great way to store IoT data. A lot of reasons are outlined in |
| 8 | +[Designing Your Database Schema: Wide vs. Narrow Postgres Tables][blog-wide-vs-narrow]. |
| 9 | + |
| 10 | +One of the key advantages of narrow tables is that the schema does not have to change when you add new |
| 11 | +sensors. Another big advantage is that each sensor can sample at different rates and times. This helps |
| 12 | +support things like hysteresis, where new values are written infrequently unless the value changes by a |
| 13 | +certain amount. |
| 14 | + |
| 15 | +#### Narrow table format example |
| 16 | + |
| 17 | +Working with narrow table data structures presents a few challenges. In the IoT world one concern is that |
| 18 | +many data analysis approaches - including machine learning as well as more traditional data analysis - |
| 19 | +require that your data is resampled and synchronized to a common time basis. Fortunately, TimescaleDB provides |
| 20 | +you with [hyperfunctions][hyperfunctions] and other tools to help you work with this data. |
| 21 | + |
| 22 | +An example of a narrow table format is: |
| 23 | + |
| 24 | +| ts | sensor_id | value | |
| 25 | +|-------------------------|-----------|-------| |
| 26 | +| 2024-10-31 11:17:30.000 | 1007 | 23.45 | |
| 27 | + |
| 28 | +Typically you would couple this with a sensor table: |
| 29 | + |
| 30 | +| sensor_id | sensor_name | units | |
| 31 | +|-----------|--------------|--------------------------| |
| 32 | +| 1007 | temperature | degreesC | |
| 33 | +| 1012 | heat_mode | on/off | |
| 34 | +| 1013 | cooling_mode | on/off | |
| 35 | +| 1041 | occupancy | number of people in room | |
| 36 | + |
| 37 | +A medium table retains the generic structure but adds columns of various types so that you can |
| 38 | +use the same table to store float, int, bool, or even JSON (jsonb) data: |
| 39 | + |
| 40 | +| ts | sensor_id | d | i | b | t | j | |
| 41 | +|-------------------------|-----------|-------|------|------|------|------| |
| 42 | +| 2024-10-31 11:17:30.000 | 1007 | 23.45 | null | null | null | null | |
| 43 | +| 2024-10-31 11:17:47.000 | 1012 | null | null | TRUE | null | null | |
| 44 | +| 2024-10-31 11:18:01.000 | 1041 | null | 4 | null | null | null | |
| 45 | + |
| 46 | +To remove all-null entries, use an optional constraint such as: |
| 47 | + |
| 48 | +```sql |
| 49 | + CONSTRAINT at_least_one_not_null |
| 50 | + CHECK ((d IS NOT NULL) OR (i IS NOT NULL) OR (b IS NOT NULL) OR (j IS NOT NULL) OR (t IS NOT NULL)) |
| 51 | +``` |
| 52 | + |
| 53 | +#### Get the last value of every sensor |
| 54 | + |
| 55 | +There are several ways to get the latest value of every sensor. The following examples use the |
| 56 | +structure defined in [Narrow table format example][setup-a-narrow-table-format] as a reference: |
| 57 | + |
| 58 | +- [SELECT DISTINCT ON][select-distinct-on] |
| 59 | +- [JOIN LATERAL][join-lateral] |
| 60 | + |
| 61 | +##### SELECT DISTINCT ON |
| 62 | + |
| 63 | +If you have a list of sensors, the easy way to get the latest value of every sensor is to use |
| 64 | +`SELECT DISTINCT ON`: |
| 65 | + |
| 66 | +```sql |
| 67 | +WITH latest_data AS ( |
| 68 | + SELECT DISTINCT ON (sensor_id) ts, sensor_id, d |
| 69 | + FROM iot_data |
| 70 | + WHERE d is not null |
| 71 | + AND ts > CURRENT_TIMESTAMP - INTERVAL '1 week' -- important |
| 72 | + ORDER BY sensor_id, ts DESC |
| 73 | +) |
| 74 | +SELECT |
| 75 | + sensor_id, sensors.name, ts, d |
| 76 | +FROM latest_data |
| 77 | +LEFT OUTER JOIN sensors ON latest_data.sensor_id = sensors.id |
| 78 | +WHERE latest_data.d is not null |
| 79 | +ORDER BY sensor_id, ts; -- Optional, for displaying results ordered by sensor_id |
| 80 | +``` |
| 81 | + |
| 82 | +The common table expression (CTE) used above is not strictly necessary. However, it is an elegant way to join |
| 83 | +to the sensor list to get a sensor name in the output. If this is not something you care about, |
| 84 | +you can leave it out: |
| 85 | + |
| 86 | +```sql |
| 87 | +SELECT DISTINCT ON (sensor_id) ts, sensor_id, d |
| 88 | + FROM iot_data |
| 89 | + WHERE d is not null |
| 90 | + AND ts > CURRENT_TIMESTAMP - INTERVAL '1 week' -- important |
| 91 | + ORDER BY sensor_id, ts DESC |
| 92 | +``` |
| 93 | + |
| 94 | +It is important to take care when down-selecting this data. In the previous examples, |
| 95 | +the time that the query would scan back was limited. However, if there any sensors that have either |
| 96 | +not reported in a long time or in the worst case, never reported, this query devolves to a full table scan. |
| 97 | +In a database with 1000+ sensors and 41 million rows, an unconstrained query takes over an hour. |
| 98 | + |
| 99 | +#### JOIN LATERAL |
| 100 | + |
| 101 | +An alternative to [SELECT DISTINCT ON][select-distinct-on] is to use a `JOIN LATERAL`. By selecting your entire |
| 102 | +sensor list from the sensors table rather than pulling the IDs out using `SELECT DISTINCT`, `JOIN LATERAL` can offer |
| 103 | +some improvements in performance: |
| 104 | + |
| 105 | +```sql |
| 106 | +SELECT sensor_list.id, latest_data.ts, latest_data.d |
| 107 | +FROM sensors sensor_list |
| 108 | + -- Add a WHERE clause here to downselect the sensor list, if you wish |
| 109 | +LEFT JOIN LATERAL ( |
| 110 | + SELECT ts, d |
| 111 | + FROM iot_data raw_data |
| 112 | + WHERE sensor_id = sensor_list.id |
| 113 | + ORDER BY ts DESC |
| 114 | + LIMIT 1 |
| 115 | +) latest_data ON true |
| 116 | +WHERE latest_data.d is not null -- only pulling out float values ("d" column) in this example |
| 117 | + AND latest_data.ts > CURRENT_TIMESTAMP - interval '1 week' -- important |
| 118 | +ORDER BY sensor_list.id, latest_data.ts; |
| 119 | +``` |
| 120 | + |
| 121 | +Limiting the time range is important, especially if you have a lot of data. Best practice is to use these |
| 122 | +kinds of queries for dashboards and quick status checks. To query over a much larger time range, encapsulate |
| 123 | +the previous example into a materialized query that refreshes infrequently, perhaps once a day. |
| 124 | + |
| 125 | +Shoutout to **Christopher Piggott** for this recipe. |
| 126 | + |
| 127 | +[blog-wide-vs-narrow]: https://www.timescale.com/learn/designing-your-database-schema-wide-vs-narrow-postgres-tables |
| 128 | +[setup-a-narrow-table-format]: /tutorials/:currentVersion:/cookbook/#narrow-table-format-example |
| 129 | +[select-distinct-on]: /tutorials/:currentVersion:/cookbook/#select-distinct-on |
| 130 | +[join-lateral]: /tutorials/:currentVersion:/cookbook/#join-lateral |
| 131 | +[hyperfunctions]: /use/:currentVersion:/hyperfunctions/ |
0 commit comments