Update documentation for 2.6.0 #484

Merged
merged 10 commits into from
Mar 13, 2023
9 changes: 0 additions & 9 deletions CHANGELOG.md
@@ -11,22 +11,13 @@ Released changes are shown in the
## [Not released]

### Added
- Persistent id.
- New fields in config page.
- The initial config file and all subsequent changes are saved in the caching folder.
- Outcome option to dropdown in Smart Tag Analysis.
- Link from confusion matrix cells and row/column labels to utterance table.
- Preserve edits done to the config via the API when relaunching Azimuth with the env var `LOAD_CONFIG_HISTORY=1`.
- Support for dataset-only smart tag analysis.

### Changed
- Change the outcome per threshold bar chart to area chart, making the x axis continuous, and add vertical dashed line marking current threshold.

### Deprecated/Breaking Changes

### Removed

### Fixed
- Fix Utterance Details page collapsing with extra long utterances.

### Security
Binary file modified docs/docs/_static/images/dashboard/post-processing-analysis.png
Binary file modified docs/docs/_static/images/dataset-warnings/dataset-warnings-4.png
Binary file modified docs/docs/_static/images/exploration-space/utterance-table.png
Binary file modified docs/docs/_static/images/settings.png
50 changes: 49 additions & 1 deletion docs/docs/getting-started/changelog.md
@@ -1,6 +1,54 @@
# Releases

## [2.5.3] - 2022-02-16
## [2.6.0] - 2023-03-10

### Added
- **Config history.**
- The initial config file and all subsequent changes are saved in the caching folder in a `config_history.jsonl` file.
- Preserve edits to the config executed via the API when relaunching Azimuth with the env var `LOAD_CONFIG_HISTORY=1`.
- See details in the [Getting Started](c-run.md#2-running-the-app) or the [Development](../development/launching.md#back-end) section.
- **Persistent id.**
- Users can now specify a persistent id for each utterance that will persist through time.
- See how to specify it in the column section of the [Project Config](../reference/configuration/project.md#columns).
- It will be used for exporting/importing proposed actions (See below).
- If specified, it will be displayed when hovering on the index column in the utterance table, as detailed [here](../user-guide/exploration-space/utterance-table.md#index).
- **Import/Export of proposed actions.**
- Proposed actions can be exported in a simple CSV file using the persistent id as the key.
- This allows the proposed actions to be imported back at any time, including with a new dataset version.
- See details in the [Utterance table](../user-guide/exploration-space/utterance-table.md#proposed-action) section.
- **New interactions on the Exploration Space.**
- Link from confusion matrix cells and row/column labels to utterance table. Example provided [here](../user-guide/exploration-space/confusion-matrix.md#interaction).
- Users can now search for indices or persistent ids in the [utterance search box](../user-guide/exploration-space/index.md#filter-categories).
- **Support for the training set only.**
- Azimuth can now launch with a training set only.
- Dataset warnings are now also available with just one split (training or evaluation).
- **Better support for CSV files.** New helper function to load CSV files. Example provided [here](../reference/custom-objects/dataset.md#examples).

### Changed
- **Enhanced config page.** Additional fields from the config can be modified from the [settings page](../user-guide/settings.md). Depending on the requested changes, some start-up tasks may be restarted.
- **Syntax smart tags.**
- Syntax smart tags are now computed even if utterances have more than one sentence.
- For that reason, `short_sentence` and `long_sentence` were renamed to `short_utterance` and `long_utterance`. The default value for `long_utterance` was set to 12 words.
- See details in the [Syntax](../key-concepts/syntax-analysis.md) section.
- **Improved visualizations.**
- Change the [outcome per threshold bar chart](../user-guide/post-processing-analysis.md) to an area chart, making the x-axis continuous, and add a vertical dashed line marking the current threshold.
- Add outcome option to the dropdown in [smart tag analysis](../user-guide/smart-tag-analysis.md) on the Dashboard, and display the analysis even when no pipeline is selected.
- [Word clouds](../user-guide/exploration-space/prediction-overview.md#word-clouds) now use the language from the config to determine the stop words (previously, only English was supported).
- Show short/long utterances on the [word count histogram](../user-guide/dataset-warnings.md#length-mismatch) in dataset warnings.
- **Performance improvements.** A few routes and caching logic were improved, making the app faster.

### Deprecated/Breaking Changes
- **Dependency Update.** A few libraries were updated to reduce security issues. These updates might cause breaking changes when loading user models and data.
- Bump `datasets` from 1.16.1 to 2.1.0
- Bump `tensorflow` from 2.8.0 to 2.11.0
- Bump `torch` from 1.9.0 to 1.13.1

### Fixed
- Fix the Utterance Details page collapsing with extra long utterances.
- Fix potential time-out issues for bigger datasets and models after start-up.
- Fix browser history after navigating to the exploration page with no pipeline.

## [2.5.3] - 2023-02-16

### Fixed
- Support truncation for HF pipelines.
2 changes: 1 addition & 1 deletion docs/docs/reference/configuration/project.md
@@ -121,7 +121,7 @@ follows:
2. Optional column for the raw text input (before any pre-processing). Unused at the moment.
3. Features column for the label
4. Optional column to specify whether an example has failed preprocessing. Unused at the moment.
5. Column with a unique identifier for every example that should be persisted if the dataset is modified, such as if new examples are added or if examples are modified or removed.
5. Column with a unique identifier for every example that should be persisted if the dataset is modified, such as if new examples are added or if examples are modified or removed. It defaults to the Azimuth generated index.

=== "Config Example"

45 changes: 40 additions & 5 deletions docs/docs/reference/custom-objects/dataset.md
@@ -33,20 +33,20 @@ def load_your_dataset(azimuth_config: AzimuthConfig, **kwargs) -> DatasetDict:

### Dataset splits

Azimuth expects the `train` and one of `validation` or `test` splits to be available. If
both `validation` and `test` are available, we will pick the former. The `train` is not mandatory for Azimuth to run.
Azimuth expects at least one of the `train`, `validation`, or `test` splits to be available.

* If both `validation` and `test` are available, we will pick the former as the `evaluation` split.
* The app can load a `train` split only, an `evaluation` split only, or both.
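
The selection rule above can be sketched as follows. This is only an illustration under stated assumptions, not Azimuth's actual implementation: the helper name `pick_splits` is hypothetical, and only the split names and the preference of `validation` over `test` come from the text.

```python
from typing import Dict, Iterable, Optional


def pick_splits(available: Iterable[str]) -> Dict[str, Optional[str]]:
    """Map the available dataset splits to the `train` and `evaluation` roles.

    Hypothetical helper: `validation` is preferred over `test` for the
    evaluation role, and at least one of the two roles must be filled.
    """
    names = set(available)
    train = "train" if "train" in names else None
    # `validation` wins over `test` when both are present.
    evaluation = next((s for s in ("validation", "test") if s in names), None)
    if train is None and evaluation is None:
        raise ValueError("Expected at least one of: train, validation, test")
    return {"train": train, "evaluation": evaluation}


print(pick_splits(["train", "validation", "test"]))
print(pick_splits(["test"]))
print(pick_splits(["train"]))
```

With a `train` split only, the evaluation role stays empty, which matches the training-set-only mode described in the changelog.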

## Column names and rejection class

Go to the [:material-link: Project Config](../configuration/project.md) to see other attributes that
should be set along with the dataset.

## Example
## Examples

Using this API, we can load SST2, a sentiment analysis dataset.

**Note:** in this case, we can omit `azimuth_config` from the definition because we don't need it.

=== "azimuth_shr/loading_resources.py"

```python
@@ -77,4 +77,39 @@ Using this API, we can load SST2, a sentiment analysis dataset.
}
```

We can also load a CSV file.

=== "azimuth_shr/loading_resources.py"

```python
from datasets import DatasetDict, load_dataset


def load_csv(train_path=None, validation_path=None) -> DatasetDict:
    data_files = dict()
    if train_path:
        data_files["train"] = train_path
    if validation_path:
        data_files["validation"] = validation_path
    ds_dict = load_dataset(path="csv", data_files=data_files)
    return ds_dict
```
=== "Configuration file"

```json
{
  "dataset": {
    "class_name": "loading_resources.load_csv",
    "remote": "/azimuth_shr",
    "kwargs": {
      "train_path": "path_to_data"
    }
  }
}
```

**Note:** in both cases, we can omit `azimuth_config` from the definition because we don't need it.

For more examples, users can refer to `azimuth_shr/loading_resources.py` in the repo.

--8<-- "includes/abbreviations.md"
1 change: 1 addition & 0 deletions docs/docs/user-guide/dataset-warnings.md
@@ -39,6 +39,7 @@ above `Z`% (default is 5%), the analysis flags it.
Length mismatch compares the number of **words per utterance** in both sets. The application flags
a warning if the difference in mean and/or standard deviation between the 2 distributions is above
`A` and `B` (default is 3 for both), respectively.
The values determining a short and a long utterance for the smart tags are displayed on the plot.
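
As a rough sketch of the check described above, assuming that words are split on whitespace, that the comparison uses the absolute difference between the two sets, and that the population standard deviation is used (none of which is confirmed by this page), the flagging logic could look like this. The function name `length_mismatch` and its parameter names are hypothetical.

```python
from statistics import mean, pstdev
from typing import Dict, List, Union


def length_mismatch(
    train: List[str],
    evaluation: List[str],
    mean_threshold: float = 3.0,  # `A` in the text
    std_threshold: float = 3.0,  # `B` in the text
) -> Dict[str, Union[float, bool]]:
    """Compare words-per-utterance statistics between two sets (illustrative)."""
    # Assumption: a "word" is a whitespace-separated token.
    train_counts = [len(utterance.split()) for utterance in train]
    eval_counts = [len(utterance.split()) for utterance in evaluation]
    mean_diff = abs(mean(train_counts) - mean(eval_counts))
    std_diff = abs(pstdev(train_counts) - pstdev(eval_counts))
    return {
        "mean_diff": mean_diff,
        "std_diff": std_diff,
        "warning": mean_diff > mean_threshold or std_diff > std_threshold,
    }


result = length_mismatch(
    train=["pay my bill", "check balance"],
    evaluation=[
        "I would like to know when my next bill is due please",
        "hello there can you tell me my balance",
    ],
)
print(result)
```

Here the training utterances average 2.5 words and the evaluation utterances average 10 words, so the mean difference (7.5) exceeds the default threshold and the warning is raised.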

## Configuration

24 changes: 12 additions & 12 deletions docs/docs/user-guide/exploration-space/confusion-matrix.md
@@ -1,25 +1,25 @@
# Confusion Matrix

The confusion matrix displays the **model confusion between each pair of intents**. The confusion is
defined as the number of utterances with a given label that are predicted as another label.
The confusion matrix displays the **model confusion between each pair of intents**.
The confusion is defined as the number of utterances with a given label that are predicted as another label.
The prediction [outcome](../../key-concepts/outcomes.md) colors are shown on the confusion matrix.

![Screenshot](../../_static/images/exploration-space/confusion-matrix.png)

## Normalization
The toggle "Normalize" in the top right corner allows alternating between normalized and raw values.
When normalized, the number of utterances is divided by the total number of utterances
with the given label.

!!! example

In this example, 45% of utterances labeled as `bill_due` were predicted as `bill_balance`.
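
The normalization can be illustrated with a small sketch. The counts below are made up and chosen only so that the `bill_due` row reproduces the 45% figure; the `normalize_rows` helper and the nested-dictionary layout are assumptions for illustration, not Azimuth's internal representation.

```python
from typing import Dict

# label -> prediction -> number of utterances (illustrative counts)
Confusion = Dict[str, Dict[str, int]]


def normalize_rows(confusion: Confusion) -> Dict[str, Dict[str, float]]:
    """Divide each cell by the total number of utterances with that label."""
    normalized = {}
    for label, row in confusion.items():
        total = sum(row.values())
        normalized[label] = {
            prediction: (count / total if total else 0.0)
            for prediction, count in row.items()
        }
    return normalized


raw = {
    "bill_due": {"bill_due": 11, "bill_balance": 9},
    "bill_balance": {"bill_due": 2, "bill_balance": 18},
}
normalized = normalize_rows(raw)
print(normalized["bill_due"]["bill_balance"])  # 0.45
```

Each row sums to 1 after normalization, so a cell reads as the share of utterances with that label that received a given prediction.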

## Class Ordering
The default order for the rows and columns is determined based on the reverse Cuthill-McKee algorithm, which will group as many classes as possible with similar confusion. The algorithm ignores all confusion values under 10%. The rejection class is also ignored and is always the last one in the order.

Toggling off "Reorder classes" disables the reordering and allows showing the confusion matrix according to the class order provided by the user.

!!! tip "Outcome colors"

The prediction [outcome](../../key-concepts/outcomes.md) colors are shown on the confusion
matrix.

![Screenshot](../../_static/images/exploration-space/confusion-matrix.png)

!!! example

In this example, 45% of utterances labeled as `bill_due` were predicted as `bill_balance`.
## Interaction
Users can click on any confusion matrix cell, row, or column to filter the data accordingly.
For example, clicking on a row label will filter the utterances with that specific label.
2 changes: 1 addition & 1 deletion docs/docs/user-guide/exploration-space/index.md
@@ -79,7 +79,7 @@ filters are listed below.

![Screenshot](../../_static/images/control-panel/utterances-search.png){: style="width:200px"}

* **Search a particular string** to filter utterances that contain it.
* **Search a particular string** to filter utterances. It can be a substring from the utterance or its exact index or persistent id.
* Filter predictions based on their **confidence value**. You can specify a minimum and a maximum
value.
* Filter predictions according to their prediction [**outcomes**](../../key-concepts/outcomes.md).
26 changes: 21 additions & 5 deletions docs/docs/user-guide/exploration-space/utterance-table.md
@@ -20,10 +20,12 @@ the [:material-link: Utterance Details](utterance-details.md) page.
:material-sort: Click a column header to sort the table by the column values. Each click
rotates between ascending order, descending order, and no sorting.

### ID
### Id

A **unique ID** for each utterance is created for referencing purposes. When exporting the
utterances, the utterance ID refers to the column `row_idx`.
A **unique index** for each utterance is generated for referencing purposes. When exporting the
utterances, the utterance index refers to the column `row_idx`.

If a persistent id is provided in the [columns](../../reference/configuration/project.md#columns) section of the config, hovering on the index will display both the generated index and the persistent id, when they differ.

### Utterance

@@ -53,10 +55,24 @@ information, see [Smart Tags](../../key-concepts/smart-tags.md).

For each data point, the user can specify if an action needs to be
taken. [Proposed Actions](../../key-concepts/proposed-actions.md) are explained in the Key Concepts
section. The actions are done outside the app. Export the proposed actions in a `.csv` file and use
the list to resolve the utterance issues. The exported file also contains the smart tags.
section. The actions are done outside the app, using the exported list to resolve the utterance issues.

!!! tip "Apply in batch"

Proposed actions can be applied in **batches** by selecting multiple rows (or selecting all
based on the current search) and applying the change.

![Screenshot](../../_static/images/exploration-space/import-export.png){: style="width:400px"}

#### Exporting
To export the proposed actions, two options are available:

1. **Exporting only the proposed actions**. In the CSV, only rows with proposed actions will be present, and the two columns will be the [persistent id](../../reference/configuration/project.md#columns) and the proposed action.
1. **Exporting the complete dataset**, including all columns (smart tags, predictions, and so on). This can be useful for purposes other than proposed actions.

#### Importing
From the CSV exported by the first option, proposed actions can be imported back into Azimuth using the import button.

* This can be useful if the dataset is changed (for example, labels modified or rows removed), and the user wants to verify if the proposed actions were resolved.
* It can also be useful if you need to kill Azimuth and restart it without having access to the cache.
* By default, if some persistent ids are in the imported file, but not in the dataset in Azimuth, they will be ignored.
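
The import behavior described above can be sketched outside the app as follows. This is a minimal illustration, not Azimuth's import code: the CSV column names `persistent_id` and `proposed_action`, the action values, and the helper `import_proposed_actions` are all assumptions made for the example.

```python
import csv
import io
from typing import Dict, Iterable


def import_proposed_actions(csv_text: str, dataset_ids: Iterable[str]) -> Dict[str, str]:
    """Return the proposed actions whose persistent id exists in the dataset.

    Hypothetical helper: persistent ids that are in the imported file but not
    in the dataset are ignored, mirroring the default behavior described above.
    """
    known_ids = set(dataset_ids)
    reader = csv.DictReader(io.StringIO(csv_text))
    actions = {}
    for row in reader:
        persistent_id = row["persistent_id"]  # assumed column name
        if persistent_id in known_ids:
            actions[persistent_id] = row["proposed_action"]  # assumed column name
    return actions


exported = "persistent_id,proposed_action\nutt-1,relabel\nutt-9,remove\n"
print(import_proposed_actions(exported, ["utt-1", "utt-2"]))
# {'utt-1': 'relabel'}
```

Because the key is the persistent id rather than the generated index, the same file can still be matched after rows are added to or removed from the dataset.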
2 changes: 1 addition & 1 deletion docs/docs/user-guide/post-processing-analysis.md
@@ -4,7 +4,7 @@

With the Threshold Comparison page, you can compare the performance of the model on the evaluation
set at **different threshold values**. The visualization shows the performance for threshold values
between 0 and 95%, with increments of 5%.
between 0 and 100%, with increments of 5%. The current threshold value is displayed as a vertical line.

A suggested minimum amount of correct predictions, as well as a maximum amount of incorrect
predictions, are displayed on the plot.
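
To make the threshold sweep concrete, here is a hedged sketch. It assumes that a prediction whose confidence is below the threshold is replaced by a rejection class, and that a prediction is correct when the final class equals the label. The rejection class name `NO_INTENT`, the tuple layout, and the function `outcomes_per_threshold` are hypothetical and not taken from this page.

```python
from typing import Dict, List, Tuple, Union

REJECTION_CLASS = "NO_INTENT"  # hypothetical name for the rejection class

# (label, predicted class, confidence)
Example = Tuple[str, str, float]


def outcomes_per_threshold(examples: List[Example]) -> List[Dict[str, Union[float, int]]]:
    """Count correct and incorrect predictions for thresholds 0%, 5%, ..., 100%."""
    results = []
    for step in range(21):  # 21 values: 0 to 100% with increments of 5%
        threshold = step * 5 / 100
        correct = 0
        for label, predicted, confidence in examples:
            # Assumption: below the threshold, the prediction is rejected.
            final = predicted if confidence >= threshold else REJECTION_CLASS
            correct += final == label
        results.append(
            {
                "threshold": threshold,
                "correct": correct,
                "incorrect": len(examples) - correct,
            }
        )
    return results


examples = [
    ("bill_due", "bill_due", 0.9),
    ("bill_due", "bill_balance", 0.4),
    (REJECTION_CLASS, "bill_due", 0.3),
]
sweep = outcomes_per_threshold(examples)
print(sweep[10])  # threshold 0.5
```

Plotting `correct` and `incorrect` against `threshold` gives the kind of curve the page describes, with the current threshold marked on the same x-axis.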
13 changes: 6 additions & 7 deletions docs/docs/user-guide/settings.md
@@ -1,11 +1,10 @@
# Settings

## Speed-up the Start-Up

Disable behavioral testing and similarity analysis in the config file to increase the start-up
speed, as explained
in [:material-link: Behavioral Testing Configuration](../reference/configuration/analyses/behavioral_testing.md)
and [:material-link: Similarity Configuration](../reference/configuration/analyses/similarity.md). Enable
them through settings later on without restarting the app.
The settings page allows the user to edit the config.
Click on :fontawesome-solid-gear: in the top right of the Azimuth app to access it.

![Screenshot](../_static/images/settings.png)

* `Discard` any current modifications if you change your mind.
* When clicking `Apply and close`, the start-up tasks may start again, depending on the changes made to the config.
* For now, only select fields are editable through this screen.
3 changes: 2 additions & 1 deletion docs/docs/user-guide/smart-tag-analysis.md
@@ -3,6 +3,7 @@
The Smart Tag Analysis shows the proportion of samples that have been tagged by each smart tag
family, broken down by [**prediction outcomes**](../key-concepts/outcomes.md), along with
sample counts and prediction accuracies.
If no pipeline is selected, only the sample count will be available.

The analyses associated with each smart tag family may also be associated with
a specific model behavior, failure mode, and/or approach to address any issues. For example,
@@ -40,7 +41,7 @@ interpreted as references to rows.

### Columns
- The first column shows the class variable for which other values are presented. Use the
dropdown :material-arrow-down-drop-circle-outline: to switch between labels and predictions.
dropdown :material-arrow-down-drop-circle-outline: to switch between labels, predictions and outcomes.
- The second and third columns show sample count and pipeline accuracy, which can help with
identifying or prioritizing classes to investigate. For example, you may want to sort by
accuracy in ascending order, to focus on classes for which the model had more difficulty.
2 changes: 1 addition & 1 deletion pyproject.toml
@@ -1,6 +1,6 @@
[tool.poetry]
name = "azimuth"
version = "2.5.3"
version = "2.6.0"
description = "Azimuth provides a unified error analysis experience to data scientists."
readme = "README.md"
authors = ["Azimuth team <[email protected]>"]