[dagster-sigma] Implement materializing workbooks #25433

Draft
wants to merge 65 commits into
base: graphite-base/25433

Conversation

benpankow
Member

Summary & Motivation

How I Tested These Changes

Changelog

Insert changelog entry or delete this section.

@benpankow
Member Author

benpankow commented Oct 22, 2024

Warning

This pull request is not mergeable via GitHub because a downstack PR is open. Once all requirements are satisfied, merge this PR as a stack on Graphite.

This stack of pull requests is managed by Graphite. Learn more about stacking.


@benpankow benpankow force-pushed the benpankow/sigma-mat-workbook branch 2 times, most recently from 00f230c to e3c0dd4 Compare October 22, 2024 21:46
@benpankow benpankow changed the title from "asdf" to "[dagster-sigma] Implement materializing workbooks" Oct 23, 2024
@benpankow benpankow force-pushed the benpankow/sigma-translator-2 branch 4 times, most recently from 43dc0ca to 2b55bc3 Compare October 24, 2024 05:03
maximearmstrong and others added 9 commits October 24, 2024 14:09
## Summary

Standardizes the `DagsterLookerApiTranslator` format to have central
`get_asset_spec` and `get_asset_key` methods. Updates all methods so that
they receive a `LookerStructureData` object as input.

## How I Tested These Changes

Updated unit tests
… resource (#25501)

## Summary

As part of migrating to the APIs mocked out in
https://www.notion.so/dagster/Asset-integration-customization-options-11a18b92e462807c94c2f84675021120
and
https://www.notion.so/dagster/Power-BI-semantic-model-API-options-11918b92e4628006a25af3f58a6b48d9,
detaches the asset spec loading process from the Looker resource, and
moves building asset definitions explicitly into user code.

```python
looker_resource = LookerResource(
    client_id=os.environ["LOOKERSDK_CLIENT_ID"],
    client_secret=os.environ["LOOKERSDK_CLIENT_SECRET"],
    base_url=os.environ["LOOKERSDK_HOST_URL"],
)

looker_specs = load_looker_asset_specs(looker_resource=looker_resource)

pdts = build_looker_pdt_assets_definitions(
    resource_key="looker",
    request_start_pdt_builds=[RequestStartPdtBuild(model_name="my_model", view_name="my_view")],
)


defs = Definitions(
    assets=[*pdts, *looker_specs],
    resources={"looker": looker_resource},
)

```


## How I Tested These Changes

New unit test, updated existing unit tests.
## Summary & Motivation

Remove the `flagCodeLocationPage` flag, permanently enabling the new
code location pages in OSS.

## How I Tested These Changes

View app with legacy nav turned on and off, navigate to Deployment.
Click on code locations, verify that their new pages render correctly.

## Changelog

[ui] New code location pages with library versions, metadata, and
definitions.
## Summary & Motivation

Fix the `'unknown'` state for AMP flags, when navigating directly to
`/overview/automation`.

This route currently redirects to `/automation` if the user is not using
the legacy nav and if the AMP sensor flag state is not
`'has-global-amp'`. Unfortunately, on initial pageload we may not yet
have received the response from the GraphQL request that asks for the
flag state, so this is likely to be `'unknown'`, and the page redirects.

Fix this by waiting until the query resolves before redirecting or
rendering anything other than a `div`.

## How I Tested These Changes

Test permutations of legacy nav and AMP sensor flag state.

- Verify that redirect to `/overview/automation` occurs when the flag is
`'has-global-amp'`, on navigation and full pageload.
- Verify that when the flag is set to `'has-sensor-amp'`, the page
redirects to `/overview/sensors` for legacy nav and `/automation` for
new nav.
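
The redirect rules described above can be summarized as a small decision function. This is only a language-neutral sketch written in Python, not the UI code from the commit: `resolve_automation_route` and its `loading` parameter are hypothetical names, and treating every state other than `'has-global-amp'` and `'unknown'` like `'has-sensor-amp'` is an assumption.

```python
from typing import Optional


def resolve_automation_route(
    flag_state: str, legacy_nav: bool, loading: bool = False
) -> Optional[str]:
    """Return the path to land on, or None while the flag query is unresolved."""
    # While the GraphQL response is pending the state is 'unknown':
    # render a placeholder and do not redirect yet.
    if loading or flag_state == "unknown":
        return None
    # A global AMP setup keeps the user on the overview automation page.
    if flag_state == "has-global-amp":
        return "/overview/automation"
    # Assumption: all remaining states behave like 'has-sensor-amp'.
    return "/overview/sensors" if legacy_nav else "/automation"
```

Returning `None` for the unresolved case keeps the "do nothing yet" branch explicit, which is the part the original bug skipped.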

## Changelog

[ui] Fix redirect behavior on full pageloads of auto-materialize
overview page.
…au resource (#25459)

## Summary

As part of migrating to the APIs mocked out in
https://www.notion.so/dagster/Asset-integration-customization-options-11a18b92e462807c94c2f84675021120
and
https://www.notion.so/dagster/Power-BI-semantic-model-API-options-11918b92e4628006a25af3f58a6b48d9,
detaches the asset spec loading process from the Tableau resource, and
moves building asset definitions explicitly into user code.

```python
resource = TableauCloudWorkspace(
    connected_app_client_id=EnvVar("CONNECTED_APP_CLIENT_ID"),
    connected_app_secret_id=EnvVar("CONNECTED_APP_SECRET_ID"),
    connected_app_secret_value=EnvVar("CONNECTED_APP_SECRET_VALUE"),
    username=EnvVar("USERNAME"),
    site_name=EnvVar("SITE_NAME"),
    pod_name=EnvVar("POD_NAME"),
)

tableau_specs = load_tableau_asset_specs(
    workspace=resource,
)

non_executable_asset_specs = [
    spec for spec in tableau_specs if spec.tags.get("dagster-tableau/asset_type") == "data_source"
]

executable_asset_specs = [
    spec
    for spec in tableau_specs
    if spec.tags.get("dagster-tableau/asset_type") in ["dashboard", "sheet"]
]

defs = Definitions(
    assets=[
        build_tableau_executable_assets_definition(
            resource_key="tableau",
            workspace=resource,
            specs=executable_asset_specs,
            refreshable_workbook_ids=["b75fc023-a7ca-4115-857b-4342028640d0"],
        ),
        *non_executable_asset_specs,
    ],
    resources={"tableau": resource},
)

```


## How I Tested These Changes

New unit test, updated existing unit tests.
## Summary

Standardizes the `DagsterTableauTranslator` format to have central
`get_asset_spec` and `get_asset_key` methods.

## How I Tested These Changes

Updated unit tests
…powerbi` package (#25519)

## Summary

Adds a new section to our Power BI docs explaining how to install the
package.
## Summary

As part of migrating to the APIs mocked out in https://www.notion.so/dagster/Asset-integration-customization-options-11a18b92e462807c94c2f84675021120  and https://www.notion.so/dagster/Power-BI-semantic-model-API-options-11918b92e4628006a25af3f58a6b48d9, detaches the asset spec loading process from the resource.

```python
resource = SigmaOrganization(
    base_url=SigmaBaseUrl.AWS_US,
    client_id=EnvVar("SIGMA_CLIENT_ID"),
    client_secret=EnvVar("SIGMA_CLIENT_SECRET"),
)

sigma_specs = load_sigma_asset_specs(resource, dagster_sigma_translator=MyCoolTranslator)
```

## How I Tested These Changes

New unit test, updated existing unit tests.
@benpankow benpankow changed the base branch from benpankow/sigma-translator-2 to graphite-base/25433 October 24, 2024 20:11
## Summary

Standardizes the `DagsterSigmaTranslator` format to:

1) Have a central `get_asset_spec`, `get_asset_key` method
2) Pass the `asset_key` returned by `get_asset_key` to `get_asset_spec`
3) Enforce that the key returned by `get_asset_spec` matches the input key
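
The three-point contract above can be illustrated with a self-contained sketch. The classes below are stand-ins and not the real `dagster-sigma` API: `AssetSpec`, `SketchTranslator`, `resolve_spec`, and the tuple-based `AssetKey` are hypothetical names used only to show the key being computed once, passed into `get_asset_spec`, and checked afterwards.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, Tuple

AssetKey = Tuple[str, ...]  # stand-in for dagster.AssetKey


@dataclass
class AssetSpec:
    """Stand-in for an asset spec; only the fields needed for the sketch."""

    key: AssetKey
    metadata: Dict[str, Any] = field(default_factory=dict)


class SketchTranslator:
    """Hypothetical translator with central get_asset_key / get_asset_spec."""

    def get_asset_key(self, data: Dict[str, Any]) -> AssetKey:
        return ("sigma", data["name"].lower().replace(" ", "_"))

    def get_asset_spec(self, key: AssetKey, data: Dict[str, Any]) -> AssetSpec:
        # The key is received as input rather than recomputed here.
        return AssetSpec(key=key, metadata={"url": data.get("url")})


def resolve_spec(translator: SketchTranslator, data: Dict[str, Any]) -> AssetSpec:
    key = translator.get_asset_key(data)          # 1) central key method
    spec = translator.get_asset_spec(key, data)   # 2) key passed to spec method
    if spec.key != key:                           # 3) enforce matching keys
        raise ValueError(f"get_asset_spec returned {spec.key}, expected {key}")
    return spec
```

A subclass that overrides `get_asset_spec` and returns a different key would fail the check in `resolve_spec` instead of silently producing mismatched assets.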

## How I Tested These Changes

Updated unit tests
cmpadden and others added 29 commits October 25, 2024 16:34
## Summary & Motivation

<img width="655" alt="image"
src="https://github.com/user-attachments/assets/1b1760e2-23c8-4b99-91d9-af6b310e842f">

## How I Tested These Changes

## Changelog

> Insert changelog entry or delete this section.
## Summary & Motivation

Update `dagster-ge` to support newer versions and drop support for older
versions.

Why?

- Prior to this PR, the most recent version of `great_expectations`
supported by `dagster_ge` is `0.17.11`, released 2023-08-17 (so over a
year old). There have been many releases (including 1.x series) since
then.
- Pydantic 2 is only supported in 0.17.15+, so we can't update `dagster`
to Pydantic 2-only until `dagster-ge` supports it.
- The later versions of `great_expectations` we want to support drop
some of the old APIs this integration was still supporting.
- Due to `dagster-ge` supporting only old `great_expectations`,
`dagster-ge` was a PITA for environment management, with nth-order
dependencies capped at old versions conflicting with the newer versions
we'd like to use elsewhere.

What was done here:

- Prior to this PR, `dagster-ge` exposed two versions of its sole public
API: `ge_validation_op_factory` and `ge_validation_op_factory_v3`.
`ge_validation_op_factory_v3` uses APIs that are supported in more
recent GE versions. `ge_validation_op_factory` used ancient APIs. Also
`ge_validation_op_factory_v3` wasn't even exported from the top-level. I
simply renamed `ge_validation_op_factory_v3` to
`ge_validation_op_factory` and deleted the old
`ge_validation_op_factory`.
- General consolidation refactor of tests for clarity and concision.
- Adapt `examples/with_great_expectations` to use the new version.
- Update dev install script to install `dagster-ge` again. It was
skipping it due to the aforementioned nth-order dependency conflicts.

## How I Tested These Changes

Revamped tests.

## Changelog

[dagster-ge] `dagster-ge` now only supports
`great_expectations>=0.17.15`. The `ge_validation_op_factory` API has
been replaced with the API previously called
`ge_validation_op_factory_v3`. Now there is only one API,
`ge_validation_op_factory`.
## Summary & Motivation

As titled, this does a quick refresh of the docs around automation
conditions and adds a bit of information about the arbitrary Python
functionality.

## How I Tested These Changes

## Changelog

NOCHANGELOG
## Summary & Motivation
Adds ignored files to vscode env for airlift demo
## Summary & Motivation

## How I Tested These Changes

## Changelog

> Insert changelog entry or delete this section.
Internal companion PR: dagster-io/internal#12359

## Summary & Motivation

Update `dagster-helm` to use Pydantic 2+ instead of Pydantic 1. This
unblocks updating `dagster` to use exclusively 2+ (Pydantic 1.x is EOL).

Because the entire purpose of `dagster-helm` is to define our helm
schema with pydantic, and Pydantic 1 -> 2 has many breaking changes,
there are a lot of updates here:

- Fields of form: `field: Optional[SomeType]` were updated to `field:
Optional[SomeType] = None`. The `None` default is automatically applied
in v1 but not v2.
- `Config` classes in our models were removed in favor of
`model_config=ConfigDict(...)` or setting config directly in the class
args. This is the preferred API in v2.
- Renamed `schema_extra` config specifications to `json_schema_extra`
- Kubernetes models with no defined fields but a `json_schema_extra`
pointing to an external schema now have `extra=allow` on them. This
allows fields to actually be set on these objects-- this was possible in
v1 because `BaseModel.construct()` ignored the `extra` setting and
always set extra fields, but that behavior changed in v2.
- Tests that constructed an object for a schema with nested models
previously sometimes just used a dict instead of a model object on the
test object. This stopped working and now we make sure the test object
contains all the nested model objects.
- Few other tweaks for niche API breakages.

## How I Tested These Changes

Existing test suite.
Internal companion PR: dagster-io/internal#12357

## Summary & Motivation

Bump min version of pydantic to 2.x in the `dagster` core. This rests on
downstack work supporting pydantic 2 in some subsidiary packages.
Changes include:

- Removing tox envs for pydantic 1/2-specific testing
- A bunch of code that branched on whether pydantic 1 vs 2 was installed
in env

There is a whole additional "pydantic compat layer" that was designed to
provide a uniform internal interface for our pydantic-using code
depending on whether 1 or 2 was installed. I'll remove this in a
followup.

## How I Tested These Changes

Existing test suite.

## Changelog

`dagster` now requires `pydantic>=2`.
## Summary & Motivation

Previous PR was the wrong fix

## How I Tested These Changes

## Changelog

> Insert changelog entry or delete this section.
## Summary & Motivation

Python 3.8 is throwing an error for `dagster-ge`, and we're dropping 3.8
support everywhere soon anyway, so remove `dagster-ge` 3.8 support.
…ext (#25445)

refactor to use selectors and do point lookups against the workspace
context, allowing different context implementations more flexibility

## How I Tested These Changes

existing coverage
## Summary & Motivation

Adding `dagster-ray` to community integrations page
…utionContext a standalone class (#25542)

## Summary & Motivation
#25541 makes
`AssetExecutionContext` not a subclass of `OpExecutionContext`. This PR
is the changes to Pipes libraries required to make that change.

Includes updating type signatures to allow for `AssetExecutionContext`,
since previously a type annotation of `OpExecutionContext` would have
allowed an `AssetExecutionContext` to be passed.

For any `context` calls that needed to be updated, I've added a comment
explaining the change.
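
A minimal sketch of why the annotations need widening, using stand-in classes rather than the real Dagster ones. `describe_run` is a hypothetical helper, and modelling the asset context as a wrapper around an op context is an assumption made for illustration.

```python
from typing import Union


class OpExecutionContext:
    """Stand-in for dagster.OpExecutionContext."""

    def __init__(self, run_id: str) -> None:
        self.run_id = run_id


class AssetExecutionContext:
    """Stand-in: wraps an op context instead of inheriting from it."""

    def __init__(self, op_context: OpExecutionContext) -> None:
        self.op_execution_context = op_context

    @property
    def run_id(self) -> str:
        return self.op_execution_context.run_id


# With no subclass relationship, an `OpExecutionContext` annotation alone
# no longer admits an asset context, so the signature names both types.
def describe_run(context: Union[OpExecutionContext, AssetExecutionContext]) -> str:
    return f"run {context.run_id}"
```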

## How I Tested These Changes
existing test suite
…25552)

go through the context for resolving pieces of information to facilitate
different backing implementations in other workspace classes.

## How I Tested These Changes

existing tests
## Summary & Motivation

A new release `4.0.0` of `croniter` introduces breaking changes in
current dagster schedule definitions.

### What did you expect to happen?

```python
@schedule(
    cron_schedule="0 10 * * 7",
)
def some_schedule(context: ScheduleEvaluationContext):
    ...  # body omitted in the original report
```

The current cron string should work. Instead, we are getting the error:

```
dagster._core.errors.DagsterInvalidDefinitionError: Found invalid cron schedule '0 10 * * 7' for schedule 'some_schedule''. Dagster recognizes standard cron expressions consisting of 5 fields.
```

Mentioned Issue: #25570

## Changelog

> Pins croniter below 4.0.0
Test that will fail if we break the ability to reference '7' for Sunday.

## Summary & Motivation

## How I Tested These Changes

## Changelog

> Insert changelog entry or delete this section.
## Summary & Motivation

This fixes an issue flagged here:
https://dagsterlabs.slack.com/archives/C03CCE471E0/p1729870052943159, in
which two uses of `useCursorPaginatedQuery` exist on the page and need
to separately store their pagination state in the URL.

I chose to change the new `RunsFeed`'s cursor because 1) the runs feed
is embedded within a variety of pages (eg: asset automation tab, sensor
runs / schedule runs, etc), and I think it's likely to be the common
element conflicting with other uses of the hook and 2) it's new, and
uses a new cursor style, so any saved pre-1.9 links would be broken
anyway.

I changed the cursor to `runs_before=abcd123`, which reflects how the
cursor operates in the run feed API.
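
The conflict can be shown with a short standard-library sketch. `set_query_param` is a hypothetical helper, and the `/assets/my_asset` path and the `cursor=page2` value are made up for illustration; only the `runs_before=abcd123` parameter comes from the description above.

```python
from urllib.parse import parse_qs, urlencode, urlsplit, urlunsplit


def set_query_param(url: str, key: str, value: str) -> str:
    """Return `url` with the query parameter `key` set to `value`."""
    parts = urlsplit(url)
    # Keep the first value of each existing parameter.
    params = {k: v[0] for k, v in parse_qs(parts.query).items()}
    params[key] = value
    return urlunsplit(parts._replace(query=urlencode(params)))


# Two paginated lists sharing one key: the second write clobbers the first.
clobbered = set_query_param("/assets/my_asset", "cursor", "page2")
clobbered = set_query_param(clobbered, "cursor", "abcd123")

# With a dedicated key for the runs feed, both states survive in the URL.
separate = set_query_param("/assets/my_asset", "cursor", "page2")
separate = set_query_param(separate, "runs_before", "abcd123")
```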

## How I Tested These Changes

I repro'd the case in the slack discussion and verified that the
offending scenario works correctly with this change.

I verified that there are no places where we use `cursor=${` to
explicitly navigate to the runs page with a cursor already applied.

Co-authored-by: bengotow <[email protected]>
…try (#25576)

## Summary & Motivation
It has been a long time since we had build issues requiring this now that our grpcio version is unpinned.

Resolves #15699 

## How I Tested These Changes
👀
#25511)

## Summary & Motivation
Switches the latest materializations by partition resolver to always use
the optimized latest storage ids by partition call.

## How I Tested These Changes
BK
…RL to be too long and page to be unloadable" (#25577)

## Summary & Motivation
As titled, fixes the above issue.

The original implementation redirects to the `/setup` path, which only
saves the configuration to localStorage and then immediately redirects
to the `/playground` route. This solution cuts out the middleman: it
saves the config to localStorage directly and then opens `/playground`.
This avoids the need to pass a large config via the URL.

## How I Tested These Changes

Used the test sensor feature and opened one of the configs it returned.
## Summary & Motivation

Eliminate `isLegacyNav` flag, put all users into new navigation structure.

## How I Tested These Changes

Buildkite. Load app, verify that nav and routes are rendered properly.

## Changelog

[ui] Enable new top navigation and deployment pages for all users.
Rename this method to better communicate that getting the full set of
asset keys may not be a "free" operation, as is the case in asset graph
implementations that lazily load. This was motivated by the callsites
updated in this PR that used `key in graph.all_asset_keys` instead of
`graph.has(key)` which can be supported by more performant
implementations.
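
To illustrate the performance argument, here is a sketch of a lazily loading graph. `LazyAssetGraph`, `load_all_keys`, and the `full_loads` counter are hypothetical and not the Dagster classes or the renamed method, whose new name is not stated here.

```python
from typing import Callable, FrozenSet, Iterable


class LazyAssetGraph:
    """Hypothetical graph whose full key set is expensive to materialize."""

    def __init__(
        self,
        exists: Callable[[str], bool],
        load_all: Callable[[], Iterable[str]],
    ) -> None:
        self._exists = exists
        self._load_all = load_all
        self.full_loads = 0  # counts how often the full key set was fetched

    def has(self, key: str) -> bool:
        # Point lookup: no need to load every key.
        return self._exists(key)

    def load_all_keys(self) -> FrozenSet[str]:
        # Deliberately named like an operation, not an attribute,
        # to signal that calling it may not be free.
        self.full_loads += 1
        return frozenset(self._load_all())


store = {"orders", "customers"}
graph = LazyAssetGraph(exists=lambda k: k in store, load_all=lambda: store)

cheap = graph.has("orders")                  # no full load
costly = "orders" in graph.load_all_keys()   # triggers a full load
```

Both expressions give the same answer, but only the second one forces the whole key set to be fetched.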

## How I Tested These Changes

existing coverage
## Summary & Motivation

Remove support for Python 3.8

## How I Tested These Changes

## Changelog

Dagster no longer supports Python 3.8, which hit EOL on 2024-10-07.
There is a great deal of code that calls `asset_graph.get`, so instead of
trying to reroute things through a new context method, let's just focus
on improving that code path.

## How I Tested These Changes

existing tests
## Summary & Motivation
Small update to the backfill change to more accurately reflect UI
behavior.

## How I Tested These Changes
## Summary & Motivation

As the title. Fixes
[DS-402](https://linear.app/dagster-labs/issue/DS-402/add-dbt-and-dagster-branch-deployment-guide)

## How I Tested These Changes

Docs preview

## Changelog

[dagster-dbt] A guide on how to use dbt defer with Dagster branch
deployments has been added to the dbt reference.

- [ ] `NEW` _(added new feature or capability)_
- [ ] `BUGFIX` _(fixed a bug)_
- [x] `DOCS` _(added or updated documentation)_
## Summary & Motivation

Make `load_tableau_asset_specs` and
`build_tableau_executable_assets_definition` experimental before
releasing to 1.9
## Summary & Motivation

Make `load_looker_asset_specs` and `build_looker_pdt_assets_definitions`
experimental before releasing to 1.9