[CT-1216] [Bug] {{ concat }} macro not working on Spark / Databricks #5888
Comments
@jelstongreen Thanks for opening, and providing the clear reproduction case. Given the log message you're seeing, it looks like the Which version of
# packages
packages:
  - package: dbt-labs/dbt_utils
    version: 0.9.2
{% set native_query = hash(concat(['a','b','c'])) %}
{{ log("Without namespace: " ~ native_query, info=True) }}
{% set dbt_utils_query = dbt_utils.hash(dbt_utils.concat(['a','b','c'])) %}
{{ log("With dbt_utils namespace: " ~ dbt_utils_query, info=True) }}
SELECT 1
$ dbt parse
17:37:38 Running with dbt=1.2.1
17:37:38 Start parsing.
17:37:38 Dependencies loaded
17:37:38 ManifestLoader created
17:37:38 Without namespace: md5(cast(concat(a, b, c) as string))
17:37:38 Warning: the `concat` macro is now provided in dbt Core. It is no longer available in dbt_utils and backwards compatibility will be removed in a future version of the package. Use `concat` (no prefix) instead. The test.my_model model triggered this warning.
17:37:38 Warning: the `hash` macro is now provided in dbt Core. It is no longer available in dbt_utils and backwards compatibility will be removed in a future version of the package. Use `hash` (no prefix) instead. The test.my_model model triggered this warning.
17:37:38 With dbt_utils namespace: md5(cast(concat(a, b, c) as string))
17:37:38 Manifest loaded
17:37:38 Manifest checked
17:37:38 Flat graph built
17:37:38 Manifest loaded
17:37:38 Performance info: target/perf_info.json
17:37:38 Done.
... |
Hey @jtcohen6 thanks for looking into this! These are my packages:
I'm thinking this could be to do with my dispatch config, which I've not really updated since first starting out with dbt. Does this need to be amended?
@jelstongreen Yeah! What happens if you remove that |
@jtcohen6 Unfortunately no change with that section removed:
I've worked this one out: We had an internal macro called Probably no need for any resolution here, although it would be handy to know if an internal macro is conflicting with a built-in macro of the same name, particularly after a version change, as it can be difficult to determine otherwise. Thanks for your help @jtcohen6 !
@jelstongreen Ah, thanks for the update! Definitely open to thinking about how it could be easier to debug these ones in the future. It's true that there's more possibility for namespace collision when calling macros from the global namespace, though with the benefit of making it simpler for folks to override specific macro behavior when they see the need. Unfortunately, because this override happens during macro resolution, rather than dispatch resolution (subtle distinction), there's no way to identify at parse time / via
"macro.dbt_utils.default__surrogate_key": {
"unique_id": "macro.dbt_utils.default__surrogate_key",
"package_name": "dbt_utils",
"root_path": "/Users/jerco/dev/scratch/testy/dbt_packages/dbt_utils",
"path": "macros/sql/surrogate_key.sql",
"original_file_path": "macros/sql/surrogate_key.sql",
"name": "default__surrogate_key",
"macro_sql": "\n\n{%- macro default__surrogate_key(field_list) -%}\n\n{%- if varargs|length >= 1 or field_list is string %}\n\n{%- set error_message = '\nWarning: the `surrogate_key` macro now takes a single list argument instead of \\\nmultiple string arguments. Support for multiple string arguments will be \\\ndeprecated in a future release of dbt-utils. The {}.{} model triggered this warning. \\\n'.format(model.package_name, model.name) -%}\n\n{%- do exceptions.warn(error_message) -%}\n\n{# first argument is not included in varargs, so add first element to field_list_xf #}\n{%- set field_list_xf = [field_list] -%}\n\n{%- for field in varargs %}\n{%- set _ = field_list_xf.append(field) -%}\n{%- endfor -%}\n\n{%- else -%}\n\n{# if using list, just set field_list_xf as field_list #}\n{%- set field_list_xf = field_list -%}\n\n{%- endif -%}\n\n\n{%- set fields = [] -%}\n\n{%- for field in field_list_xf -%}\n\n {%- set _ = fields.append(\n \"coalesce(cast(\" ~ field ~ \" as \" ~ type_string() ~ \"), '')\"\n ) -%}\n\n {%- if not loop.last %}\n {%- set _ = fields.append(\"'-'\") -%}\n {%- endif -%}\n\n{%- endfor -%}\n\n{{ hash(concat(fields)) }}\n\n{%- endmacro -%}",
"resource_type": "macro",
"tags": [],
"depends_on": {
"macros": [
"macro.dbt_utils.type_string",
"macro.dbt_utils.hash",
"macro.dbt_utils.concat"
]
},
FYI @dbeatty10 @joellabes - this is an issue that would have been prevented by calling the macro explicitly as I'm going to close this specific issue as resolved for now.
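For anyone wanting a rough way to spot this kind of clash, here is a minimal sketch in Python. It assumes the manifest has a top-level `macros` mapping whose entries carry `name` and `package_name`, as in the excerpt above. The function name `find_name_collisions`, the project name `my_project`, and the sample data are hypothetical. It only flags macro names that the root project shares with another package; it cannot tell you which definition a given call actually resolved to.

```python
from collections import defaultdict


def find_name_collisions(manifest: dict, root_package: str) -> dict:
    """Return {macro_name: [other packages]} for names the root project also defines."""
    packages_by_name = defaultdict(set)
    for macro in manifest.get("macros", {}).values():
        packages_by_name[macro["name"]].add(macro["package_name"])
    return {
        name: sorted(packages - {root_package})
        for name, packages in packages_by_name.items()
        if root_package in packages and len(packages) > 1
    }


# Hypothetical, heavily trimmed manifest used only for illustration.
sample_manifest = {
    "macros": {
        "macro.my_project.concat": {"name": "concat", "package_name": "my_project"},
        "macro.dbt.concat": {"name": "concat", "package_name": "dbt"},
        "macro.dbt_utils.concat": {"name": "concat", "package_name": "dbt_utils"},
        "macro.dbt.hash": {"name": "hash", "package_name": "dbt"},
    }
}

print(find_name_collisions(sample_manifest, "my_project"))
# prints {'concat': ['dbt', 'dbt_utils']}
```

To run it against a real project you would load the manifest with `json.load` (assuming it is written to `target/manifest.json`) and pass your own project name as `root_package`.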
Oh boy, I had no idea that Personally I prefer Now that it does... 🤔 which failure case is more common/difficult to work around? This or the one raised in 5720? |
@joellabes To be clear, they yield different results only if you've defined a macro in your own project named So, I also find myself leaning toward explicit namespaces... I don't remember exactly why @dbeatty10 and I opted for prefixless, but knowing us, there's definitely a thread somewhere. The bug in #5720 is decidedly a bug, not just a point of user confusion — it's something we can try to fix. It's a gnarly bug, though. We can invest more time in trying to resolve it, which would clear the way toward us being able to recommend explicit namespaces unambiguously, but it won't be fixed by dbt-core v1.3 + dbt-utils v1.0. At the same time, the issue will just go away when we remove the deprecated back-compat macros from |
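The difference between prefixless and namespaced calls can be pictured with a toy lookup in Python. This is a simplified model under stated assumptions, not dbt's actual resolution code: it assumes a prefixless name is searched in the root project before the built-in namespace, while a namespaced call goes straight to the named package. The names `resolve`, `NAMESPACES`, `my_project`, and both `concat` stand-ins are hypothetical.

```python
from typing import Callable, Dict, List, Optional

Macro = Callable[[List[str]], str]


def core_concat(fields: List[str]) -> str:
    # Stand-in for the built-in macro, matching the log output earlier in the thread.
    return "concat(" + ", ".join(fields) + ")"


def project_concat(fields: List[str]) -> str:
    # Stand-in for an internal project macro that happens to share the name.
    return " || ".join(fields)


NAMESPACES: Dict[str, Dict[str, Macro]] = {
    "dbt": {"concat": core_concat},
    "my_project": {"concat": project_concat},
}


def resolve(name: str, namespace: Optional[str] = None,
            root_project: str = "my_project") -> Macro:
    """Look up a macro by name, optionally pinned to one namespace."""
    if namespace is not None:
        # Explicit prefix: only the named package is consulted.
        return NAMESPACES[namespace][name]
    # No prefix: the root project shadows the built-in namespace (assumed order).
    for candidate in (root_project, "dbt"):
        macro = NAMESPACES.get(candidate, {}).get(name)
        if macro is not None:
            return macro
    raise KeyError(name)


print(resolve("concat")(["a", "b", "c"]))                   # a || b || c
print(resolve("concat", namespace="dbt")(["a", "b", "c"]))  # concat(a, b, c)
```

In this model the two calls differ only because `my_project` defines its own `concat`; remove that entry and both return the built-in result.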
Wholeheartedly agree with using explicit namespaces IFF we can't avoid bugs otherwise! i.e., @jtcohen6 our original discussion was basically this:
An underlying assumption for me was that with/without prefix was equivalent. We should be explicit if we will have bugs otherwise. We will need to add To fully unpack my ordered preferences for syntax (if we could create our ideal world):
In the first instance, all the Jinja has melted away as-if |
@dbeatty10 I think I may have a fix for the bug that crops up when users explicitly specify Given that, I'm inclined to encourage explicit
✅ We can open an issue for this. @joellabes I imagine we'd also want to update the references in |
I already have a PR open for this page in our documentation, so I'll add it there: |
My preferences are actually the exact inverse of yours! If we are being clever, then I want to see where the magic is coming from. Mostly because:
is a great greenfield stance to take, but it's already a SQL standard function implemented by a lot of databases, so this would open us up to wild amounts of clashes - it would really only be possible with the dbt proxy server rewriting everything before the warehouse saw it. Which isn't impossible I guess! But certainly a surprise for the new employee when they discover that something is swizzling their functions from under them. Maybe I'm thinking too small? I'm open to the scales falling from my eyes
:infinitypartyparrot: |
Is this a new bug in dbt-core?
Current Behavior
compiles to:
Expected Behavior
The behaviour should mimic the previously used :
which compiles to:
Steps To Reproduce
Create an example model containing:
Relevant log output
Which database adapter are you using with dbt?
spark
Additional Context
Databricks