Documentation of custom sql query checks updated.
piotrczarnas committed Nov 3, 2024
1 parent 6818205 commit f905d83
Showing 32 changed files with 450 additions and 3,013 deletions.
@@ -441,7 +441,7 @@ spec:
|[*sql_condition_failed_on_table*](../checks/table/custom_sql/sql-condition-failed-on-table.md)|Maximum count of rows that failed SQL conditions|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A table-level check that uses a custom SQL expression on each row to verify (assert) that all rows pass a custom condition defined as an SQL condition. Use the {alias} token to reference the tested table. This data quality check can be used to compare columns on the same table. For example, the condition can verify that the value in the *col_price* column is higher than the *col_tax* column using an SQL expression: `{alias}.col_price > {alias}.col_tax`. Use an SQL expression that returns a *true* value for valid values and a *false* one for invalid values, because it is an assertion.|:material-check-bold:|
|[*sql_condition_passed_percent_on_table*](../checks/table/custom_sql/sql-condition-passed-percent-on-table.md)|Minimum percentage of rows that passed SQL condition|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A table-level check that ensures that a minimum percentage of rows passed a custom SQL condition (expression). Measures the percentage of rows passing the condition. Raises a data quality issue when the percent of valid rows is below the *min_percent* parameter.| |
|[*sql_aggregate_expression_on_table*](../checks/table/custom_sql/sql-aggregate-expression-on-table.md)|Custom aggregated SQL expression within range|[Reasonableness](../dqo-concepts/data-quality-dimensions.md#data-reasonableness)|A table-level check that calculates a given SQL aggregate expression on a table and verifies if the value is within a range of accepted values.| |
|[*sql_invalid_record_count_on_table*](../checks/table/custom_sql/sql-invalid-record-count-on-table.md)|Custom SELECT SQL that returns invalid records|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A table-level check that uses a custom SQL query that returns invalid values from a column. Use the {alias} token to reference the tested table. This data quality check can be used to compare columns on the same table. For example, when this check is applied to an *age* column, the condition can verify that the *age* is lower than 18 using an SQL expression: `{alias}.age < 18`.|:material-check-bold:|
|[*sql_invalid_record_count_on_table*](../checks/table/custom_sql/sql-invalid-record-count-on-table.md)|Custom SELECT SQL that returns invalid records|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A table-level check that uses a custom SQL query that returns invalid values from a column. Use the {table} token to reference the tested table. This data quality check can be used to compare columns on the same table. For example, when this check is applied to an *age* column, the check can find invalid records in which the *age* is lower than 18 using an SQL query: `SELECT age FROM {table} WHERE age < 18`.|:material-check-bold:|
|[*import_custom_result_on_table*](../checks/table/custom_sql/import-custom-result-on-table.md)|Import custom data quality results on table|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A table-level check that uses a custom SQL SELECT statement to retrieve a result of running a custom data quality check that was hardcoded in the data pipeline, and the result was stored in a separate table. The SQL query that is configured in this external data quality results importer must be a complete SELECT statement that queries a dedicated table (created by the data engineers) that stores the results of custom data quality checks. The SQL query must return a *severity* column with values: 0 - data quality check passed, 1 - warning issue, 2 - error severity issue, 3 - fatal severity issue.| |


@@ -458,7 +458,7 @@ that are used by those checks.
|[*sql_condition_failed_on_column*](../checks/column/custom_sql/sql-condition-failed-on-column.md)|Maximum count of rows that failed SQL conditions|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A column-level check that uses a custom SQL expression on each column to verify (assert) that all rows pass a custom condition defined as an SQL expression. Use the {alias} token to reference the tested table, and the {column} to reference the column that is tested. This data quality check can be used to compare columns on the same table. For example, when this check is applied to a *col_price* column, the condition can verify that the *col_price* is higher than the *col_tax* using an SQL expression: `{alias}.{column} > {alias}.col_tax`. Use an SQL expression that returns a *true* value for valid values and *false* for invalid values, because it is an assertion.|:material-check-bold:|
|[*sql_condition_passed_percent_on_column*](../checks/column/custom_sql/sql-condition-passed-percent-on-column.md)|Minimum percentage of rows that passed SQL condition|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A column-level check that ensures that a minimum percentage of rows passed a custom SQL condition (expression). Measures the percentage of rows passing the condition. Raises a data quality issue when the percent of valid rows is below the *min_percent* parameter.| |
|[*sql_aggregate_expression_on_column*](../checks/column/custom_sql/sql-aggregate-expression-on-column.md)|Custom aggregated SQL expression within range |[Reasonableness](../dqo-concepts/data-quality-dimensions.md#data-reasonableness)|A column-level check that calculates a given SQL aggregate expression on a column and verifies if the value is within a range of accepted values.| |
|[*sql_invalid_value_count_on_column*](../checks/column/custom_sql/sql-invalid-value-count-on-column.md)|Custom SELECT SQL that returns invalid values|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A column-level check that uses a custom SQL query that returns invalid values from a column. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). Use the {alias} token to reference the tested table, and the {column} to reference the column that is tested. For example, when this check is applied to a column, the condition can verify that the column has a value lower than 18 using an SQL expression: `{alias}.{column} < 18`.|:material-check-bold:|
|[*sql_invalid_value_count_on_column*](../checks/column/custom_sql/sql-invalid-value-count-on-column.md)|Custom SELECT SQL that returns invalid values|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A column-level check that uses a custom SQL query that returns invalid values from a column. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). Use the {table} token to reference the tested table, and the {column} to reference the column that is tested. For example, when this check is applied to a column, it can find invalid values in the column that are lower than 18 using an SQL query: `SELECT {column} FROM {table} WHERE {column} < 18`.|:material-check-bold:|
|[*import_custom_result_on_column*](../checks/column/custom_sql/import-custom-result-on-column.md)|Import custom data quality results on column|[Validity](../dqo-concepts/data-quality-dimensions.md#data-validity)|A column-level check that uses a custom SQL SELECT statement to retrieve the result of a custom data quality check on a column that was hardcoded in the data pipeline. The result is retrieved by querying a separate **logging table**, whose schema is not fixed. The logging table should have columns that identify the table and the column for which it stores custom data quality check results, and a *severity* column of the data quality issue. The SQL query that is configured in this external data quality results importer must be a complete SELECT statement that queries a dedicated logging table, created by the data engineering team.| |


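The *severity* codes listed for the *import_custom_result_on_table* check above (0 passed, 1 warning, 2 error, 3 fatal) can be captured in a small lookup. The following Python sketch is only an illustration: the `Severity` enum and the `parse_severity` helper are hypothetical names that are not part of DQOps, and the sketch assumes the imported query returns the severity as an integer.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Severity codes expected in the *severity* column (hypothetical names)."""
    PASSED = 0   # data quality check passed
    WARNING = 1  # warning issue
    ERROR = 2    # error severity issue
    FATAL = 3    # fatal severity issue


def parse_severity(value: int) -> Severity:
    """Convert a raw severity value to a Severity member.

    Raises ValueError for values outside the documented 0..3 range.
    """
    return Severity(value)


print(parse_severity(2).name)
```

Rejecting values outside 0..3 is a design choice of this sketch, so that a malformed logging table is noticed early.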
10 changes: 5 additions & 5 deletions docs/checks/column/custom_sql/index.md
@@ -60,15 +60,15 @@ A column-level check that calculates a given SQL aggregate expression on a colum
### [sql invalid value count on column](./sql-invalid-value-count-on-column.md)
A column-level check that uses a custom SQL query that returns invalid values from a column.
This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries).
Use the {alias} token to reference the tested table, and the {column} to reference the column that is tested.
For example, when this check is applied to a column, the condition can verify that the column has a value lower than 18 using an SQL expression: `{alias}.{column} < 18`.
Use the {table} token to reference the tested table, and the {column} to reference the column that is tested.
For example, when this check is applied to a column, it can find invalid values in the column that are lower than 18 using an SQL query: `SELECT {column} FROM {table} WHERE {column} < 18`.


| Data quality check name | Friendly name | Check type | Description | Standard |
|-------------------------|---------------|------------|-------------|----------|
|[<span class="no-wrap-code">`profile_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#profile-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[profiling](../../../dqo-concepts/definition-of-data-quality-checks/data-profiling-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, the condition can verify that the column has a value lower than 18 using an SQL expression: &#x60;{alias}.{column} &lt; 18&#x60;.|:material-check-bold:|
|[<span class="no-wrap-code">`daily_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#daily-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[monitoring](../../../dqo-concepts/definition-of-data-quality-checks/data-observability-monitoring-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, the condition can verify that the column has a value lower than 18 using an SQL expression: &#x60;{alias}.{column} &lt; 18&#x60;.|:material-check-bold:|
|[<span class="no-wrap-code">`monthly_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#monthly-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[monitoring](../../../dqo-concepts/definition-of-data-quality-checks/data-observability-monitoring-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, the condition can verify that the column has a value lower than 18 using an SQL expression: &#x60;{alias}.{column} &lt; 18&#x60;.|:material-check-bold:|
|[<span class="no-wrap-code">`profile_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#profile-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[profiling](../../../dqo-concepts/definition-of-data-quality-checks/data-profiling-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, it can find invalid values in the column that are lower than 18 using an SQL query: &#x60;SELECT {column} FROM {table} WHERE {column} &lt; 18&#x60;.|:material-check-bold:|
|[<span class="no-wrap-code">`daily_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#daily-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[monitoring](../../../dqo-concepts/definition-of-data-quality-checks/data-observability-monitoring-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, it can find invalid values in the column that are lower than 18 using an SQL query: &#x60;SELECT {column} FROM {table} WHERE {column} &lt; 18&#x60;.|:material-check-bold:|
|[<span class="no-wrap-code">`monthly_sql_invalid_value_count_on_column`</span>](./sql-invalid-value-count-on-column.md#monthly-sql-invalid-value-count-on-column)|Custom SELECT SQL that returns invalid values|[monitoring](../../../dqo-concepts/definition-of-data-quality-checks/data-observability-monitoring-checks.md)|Runs a custom query that retrieves invalid values found in a column and returns the number of them, and raises an issue if too many failures were detected. This check is used for running test queries or ready-made queries that users already have in their own systems (legacy SQL queries). For example, when this check is applied to a column, it can find invalid values in the column that are lower than 18 using an SQL query: &#x60;SELECT {column} FROM {table} WHERE {column} &lt; 18&#x60;.|:material-check-bold:|



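To make the `{table}` and `{column}` tokens concrete, here is a minimal Python sketch of how a query template like the one in the description could be expanded and how the number of invalid values could be counted. It is an illustration under stated assumptions, not the DQOps implementation: `render_query` and `count_invalid_values` are hypothetical helpers, the substitution is plain string replacement, and an in-memory SQLite table named `customers` stands in for the monitored table.

```python
import sqlite3

# Template in the style of the documented example; how DQOps actually
# renders and quotes the tokens is not shown here (assumption).
QUERY_TEMPLATE = "SELECT {column} FROM {table} WHERE {column} < 18"


def render_query(template: str, table: str, column: str) -> str:
    """Substitute the {table} and {column} tokens with concrete names."""
    return template.replace("{table}", table).replace("{column}", column)


def count_invalid_values(conn: sqlite3.Connection, query: str) -> int:
    """Count the rows returned by a query that selects invalid values."""
    wrapped = f"SELECT COUNT(*) FROM ({query}) AS invalid_rows"
    return conn.execute(wrapped).fetchone()[0]


# Hypothetical sample data: two of the four ages are below 18.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (age INTEGER)")
conn.executemany(
    "INSERT INTO customers (age) VALUES (?)", [(15,), (17,), (21,), (40,)]
)

query = render_query(QUERY_TEMPLATE, table="customers", column="age")
invalid_count = count_invalid_values(conn, query)
print(query)          # SELECT age FROM customers WHERE age < 18
print(invalid_count)  # 2
```

Wrapping the user's query in `SELECT COUNT(*)` keeps the original SELECT intact, which fits the stated goal of reusing legacy SQL queries without rewriting them.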