-
Notifications
You must be signed in to change notification settings - Fork 177
Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Merge pull request #177 from awslabs/sql-join-tables
Sql Example - Join tables in Amazon Timestream SQL query
- Loading branch information
Showing
8 changed files
with
30,764 additions
and
5 deletions.
There are no files selected for viewing
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,90 @@ | ||
# Join tables in Amazon Timestream SQL query | ||
|
||
The goal for this join is that there are 2 tables | ||
1. Sensor events for sensor measurements | ||
2. Sensor status events and other details | ||
|
||
The JOIN examples enriches the sensor events with data from the status event. Common use is to only filter sensor events for sensors that are active. The active/stopped informaiton is here stored in the second table of sensor details. | ||
|
||
### 1. Sensor Event table | ||
Contains measure events (temperature, humidity) | ||
|
||
| Column | Type | Description | | ||
|--------|------|-------------| | ||
|gpio|Dimension| Sensor channel identifier | | ||
|time|Timestamp| Time measure event was taken| | ||
|measure_name|measure_name|Measure type| | ||
|temperature|DOUBLE|Temperature in F| | ||
|humidity|DOUBLE|Humidity in % | | ||
|
||
### 2. Sensor Details table | ||
Additional details about the sensor. These are time based change events | ||
|
||
| Column | Type | Description | | ||
|--------|------|-----------------------------------------| | ||
|gpio|Dimension| Sensor channel identifier | | ||
|time|Timestamp| Time details were created/changed | | ||
|measure_name|measure_name| event type (initial_event/change_event) | | ||
|name|VARCHAR|readable name of sensor| | ||
|status|VARCHAR|active/stopped status of sensor| | ||
|
||
## SQL query | ||
|
||
Two SQL example statements are included in the file ```query_example_join_tables.sql``` | ||
1. Standard SQL JOIN to illustrate SQL compatibility | ||
2. Performance optimized CTE JOIN | ||
|
||
## Loading example data | ||
|
||
The following commands is using the ```create_batch_load_task.py``` utility to load the example csv files used in this | ||
query example. Please adjust as needed | ||
|
||
### 1. Load sensor events data | ||
|
||
```shell | ||
cd <github-clone-root>/amazon-timestream-tools/sample_apps/sql/utils | ||
``` | ||
|
||
```shell | ||
python3 ./create_batch_load_task.py \ | ||
region=<your_region> \ | ||
mapping=../join_tables/datamodel_events.json \ | ||
input_bucket=<s3-bucket-name> \ | ||
object_key_prefix=<upload_folder> \ | ||
data_file=../join_tables/sensor_events.csv \ | ||
database=amazon-timestream-tools \ | ||
table=sensor_events \ | ||
partition_key=gpio | ||
``` | ||
|
||
### 2. Load sensor details data | ||
|
||
```shell | ||
python3 ./create_batch_load_task.py \ | ||
region=<your_region> \ | ||
mapping=../join_tables/datamodel_details.json \ | ||
input_bucket=<s3-bucket-name> \ | ||
object_key_prefix=<upload_folder> \ | ||
data_file=../join_tables/sensor_details.csv \ | ||
database=amazon-timestream-tools \ | ||
table=sensor_details \ | ||
partition_key=gpio | ||
|
||
``` | ||
|
||
| **⚠ Note**: | | ||
|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | ||
| It is recommended to use the database name `amazon-timestream-tools` and table names `sensor_events` and `sensor_details` as described below. The SQL statement uses `"amazon-timestream-tools"."sensorddata"` in WHERE clause and would not need to be modified. If you use a different database name and table combination, please adjust the SQL query as needed. | | ||
|
||
Parameters used: | ||
|
||
Parameter | Description | Recommended Value | ||
-----------------|----------------------------------------------------------------------------------------------------------------------------------------|------------------- | ||
**region** | Region where database is deployed | any region where Timestream is available | ||
**mapping** | File used to map CSV data columns to Timestream attributes | `../join_tables/datamodel_events.json` <br/> `../join_tables/datamodel_details.json` | ||
**input_bucket** | S3 bucket used to upload data file and report Batch Load Status | existing S3 bucket | ||
**object_key** | Folder where data file will be uploaded | any string, should not be root folder | ||
**data_file** | CSV file for this example | `../join_tables/sensor_events.csv ` <br/> `../join_tables/sensor_details.csv` | ||
**database** | Database in region. Database will be created if not exists. | `amazon-timestream-tools` | ||
**table** | Table where data is loaded. If this table does not exist, the table will be created | `sensor_events` <br/>`sensor_details` | ||
**partition_key**| Custome defined partition key (CDPK) used for this data example | `gpio` |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
{ | ||
"TimeUnit": "MILLISECONDS", | ||
"TimeColumn": "time", | ||
"DimensionMappings": [ | ||
{ | ||
"SourceColumn": "gpio", | ||
"DestinationColumn": "gpio" | ||
} | ||
], | ||
"MultiMeasureMappings": { | ||
"MultiMeasureAttributeMappings": [ | ||
{ | ||
"SourceColumn": "name", | ||
"TargetMultiMeasureAttributeName": "name", | ||
"MeasureValueType": "VARCHAR" | ||
}, | ||
{ | ||
"SourceColumn": "status", | ||
"TargetMultiMeasureAttributeName": "status", | ||
"MeasureValueType": "VARCHAR" | ||
} | ||
] | ||
}, | ||
"MeasureNameColumn": "measure_name" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
{ | ||
"TimeUnit": "MILLISECONDS", | ||
"TimeColumn": "time", | ||
"DimensionMappings": [ | ||
{ | ||
"SourceColumn": "gpio", | ||
"DestinationColumn": "gpio" | ||
} | ||
], | ||
"MultiMeasureMappings": { | ||
"MultiMeasureAttributeMappings": [ | ||
{ | ||
"SourceColumn": "temperature", | ||
"TargetMultiMeasureAttributeName": "temperature", | ||
"MeasureValueType": "DOUBLE" | ||
}, | ||
{ | ||
"SourceColumn": "humidity", | ||
"TargetMultiMeasureAttributeName": "humidity", | ||
"MeasureValueType": "DOUBLE" | ||
} | ||
] | ||
}, | ||
"MeasureNameColumn": "measure_name" | ||
} |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,25 @@ | ||
-- Example 1 - Standard JOIN | ||
SELECT events.time, events.gpio, temperature, humidity, name, status | ||
FROM "amazon-timestream-tools"."sensor_events" as events | ||
JOIN "amazon-timestream-tools"."sensor_details" as details | ||
ON events.gpio = details.gpio | ||
WHERE details.status = 'active' | ||
AND events.time BETWEEN '2023-09-19' AND '2023-11-01' | ||
AND details.time BETWEEN '2023-09-19' AND '2023-11-01' | ||
|
||
-- Example 2 - CTO JOIN (preferred) | ||
with events as ( | ||
SELECT time, gpio, temperature, humidity | ||
FROM "amazon-timestream-tools"."sensor_events" | ||
WHERE time BETWEEN '2023-09-19' AND '2023-11-01' | ||
), | ||
details as ( | ||
SELECT time, gpio, name, status | ||
FROM "amazon-timestream-tools"."sensor_details" | ||
WHERE time BETWEEN '2023-09-19' AND '2023-11-01' | ||
AND status = 'active' | ||
) | ||
SELECT events.time, events.gpio, temperature, humidity, name, status | ||
FROM events | ||
JOIN details | ||
ON events.gpio = details.gpio |
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,4 @@ | ||
gpio,time,name,status,measure_name | ||
17,1695335950657,office,active,initial_event | ||
22,1695335950657,outside,active,change_event | ||
27,1695335950657,window,stopped,change_event |
Oops, something went wrong.