Skip to content

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inline Macro Support #3379

Closed
jaypeedevlin opened this issue May 20, 2021 · 2 comments
Closed

Inline Macro Support #3379

jaypeedevlin opened this issue May 20, 2021 · 2 comments
Labels
discussion enhancement New feature or request

Comments

@jaypeedevlin
Copy link
Contributor

Describe the feature

The ability to define a macro inside a .sql file that is available only to that .sql file, like a function called inside the same .py file would work. The idea would be when you have repeated logic in a model that you want to abstract into a macro that won't be used in other models, so it doesn't feel right to have a 'single use' macro in the macros folder.

Describe alternatives you've considered

Just throwing the macro in the macros folder.

Additional context

Based on this conversation in Slack. This also seems to be supported in the dbt cloud IDE (but not via CLI)

Who will this benefit?

Likely this is for a medium to advanced dbt user.

Are you interested in contributing this feature?

If this is something within my grasp I'd definitely be up for contributing this.

@jaypeedevlin jaypeedevlin added enhancement New feature or request triage labels May 20, 2021
@jtcohen6
Copy link
Contributor

@jaypeedevlin Thanks for opening! I'm mostly for this, just want to register a few concerns that come to mind.

Pro: "Private" logic, unique to one model file, can already be expressed via most Jinja constructs within a model file (ifs, loops, etc). Sometimes, there's logic you just want to express functionally, with inputs and outputs. The ability to define a "private" macro at the top of a model, and use/reuse over the course of the file, keeps the logic close to home while also expressing it in the clearest way possible.

Con: Macros are resources in dbt. This "private" macro probably wouldn't work like other ("global") macros in a number of ways, because it wouldn't be a project resource: It couldn't be described, registered in the manifest, shown in the documentation site, or reused if imported from a package. (I say probably because anything is possible with enough code.) As long as those drawbacks don't feel surprising to you—it is a private macro, after all—then it may not be cause for concern.

In which types of resource files would you expect to be able to define and call inline macros? I'm thinking models, snapshots, analyses, data tests.

This also seems to be supported in the dbt cloud IDE (but not via CLI)

Yes! This is actually supported via the dbt RPC server, which has the ability to parse, compile, and execute an arbitrary blob of dbt-onic Jinja-SQL, via the compile_sql and run_sql methods only. It does this using the context of the project, but it doesn't actually register the new blob o' stuff within the project context. If that blob includes macros, it will parse them out here:

https://github.com/fishtown-analytics/dbt/blob/4d5d0e2150ff2776f5171eb229c367af839b9a35/core/dbt/parser/rpc.py#L50-L61

So if I've got a query, similar to the one linked in Slack:

{% macro test_func(column_name) %}
  {{ column_name }}
{% endmacro %}

SELECT
  {{ test_func('user_id') }}
FROM (
    SELECT 1 as user_id
) sbq

I can base64-encode that query, fire it off to a run_sql request, and see the results:

{
    "jsonrpc": "2.0",
    "method": "run_sql",
    "params": {
        "sql": "eyUgbWFjcm8gdGVzdF9mdW5jKGNvbHVtbl9uYW1lKSAlfQogIHt7IGNvbHVtbl9uYW1lIH19CnslIGVuZG1hY3JvICV9CgpTRUxFQ1QKICB7eyB0ZXN0X2Z1bmMoJ3VzZXJfaWQnKSB9fQpGUk9NICgKICAgIFNFTEVDVCAxIGFzIHVzZXJfaWQKKSBzYnE=",
        "name": "my_inline_macro_query"
    },
    "id": "2db9a2fe-9a39-41ef-828c-25e04dd6b07d"
}
"results": [
    {
        "logs": [],
        "raw_sql": "\n\nSELECT\n  {{ test_func('user_id') }}\nFROM (\n    SELECT 1 as user_id\n) sbq",
        "compiled_sql": "\n\nSELECT\n  \n  user_id\n\nFROM (\n    SELECT 1 as user_id\n) sbq",
        "node": {...},
        "timing": [...],
        "table": {
            "column_names": [
                "user_id"
            ],
            "rows": [
                [
                    1.0
                ]
            ]
        },
        "generated_at": "2021-05-25T12:26:22.403038Z"
    }
]

In any case, I do think the implementation of this could be tricky; we'd want to keep parse-time performance in mind. I'm open to continuing the conversation, and hearing what other folks think about the benefits, how they'd want to use this, or if there's any risk of confusing the interface to macros in dbt today.

@jaypeedevlin
Copy link
Contributor Author

Hi Jeremy! Thanks for the thoughtful reply. A few responses:

As long as those drawbacks don't feel surprising to you—it is a private macro, after all—then it may not be cause for concern.

I didn't explicitly say it, but that would be my assumption — it would be scoped just to the file that it's contained in.

In which types of resource files would you expect to be able to define and call inline macros? I'm thinking models, snapshots, analyses, data tests

Agreed on those, I can't think of any others.

This issue was moved to a discussion.

You can continue the conversation there. Go to discussion →

Labels
discussion enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants