Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(templating): Safer Jinja template processing #11704

Merged
merged 14 commits into from
Nov 17, 2020

Conversation

robdiciuccio
Copy link
Member

@robdiciuccio robdiciuccio commented Nov 14, 2020

SUMMARY

Prevent passing potentially unsafe modules/methods to the Jinja template context:

  • Removes default exposed modules such as datetime and timedelta
  • Proxies exposed methods such as current_user_id and url_param through a method that enforces static return values

There are no changes to Jijna templating syntax with this update, other than the default modules (such as datetime) are no longer available by default.

Adding custom Jinja context items is still possible via JINJA_CONTEXT_ADDONS as is overriding the template processor on a per-database engine basis via CUSTOM_TEMPLATE_PROCESSORS.

Additional context/discussion in #11617

TODO

  • Write tests

TEST PLAN

  • Test Jinja templating in SQL Lab and in virtual datasets

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

@mistercrunch @ktmud @dpgaspar @villebro

@codecov-io
Copy link

codecov-io commented Nov 14, 2020

Codecov Report

Merging #11704 (f7fbd59) into master (c241c6f) will decrease coverage by 7.04%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #11704      +/-   ##
==========================================
- Coverage   63.04%   56.00%   -7.05%     
==========================================
  Files         895      408     -487     
  Lines       43315    14437   -28878     
  Branches     4015     3716     -299     
==========================================
- Hits        27308     8085   -19223     
+ Misses      15829     6352    -9477     
+ Partials      178        0     -178     
Flag Coverage Δ
cypress 56.00% <ø> (?)
javascript ?
python ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...uperset-frontend/src/dashboard/util/dnd-reorder.js 0.00% <0.00%> (-100.00%) ⬇️
...rset-frontend/src/dashboard/util/getEmptyLayout.js 0.00% <0.00%> (-100.00%) ⬇️
...dashboard/components/resizable/ResizableHandle.jsx 0.00% <0.00%> (-100.00%) ⬇️
.../src/dashboard/util/getFilterScopeFromNodesTree.js 0.00% <0.00%> (-93.48%) ⬇️
...src/dashboard/components/gridComponents/Header.jsx 10.52% <0.00%> (-86.85%) ⬇️
...rc/dashboard/components/gridComponents/Divider.jsx 13.33% <0.00%> (-86.67%) ⬇️
...set-frontend/src/views/CRUD/welcome/EmptyState.tsx 5.71% <0.00%> (-82.10%) ⬇️
...et-frontend/src/components/ListView/ActionsBar.tsx 11.11% <0.00%> (-81.20%) ⬇️
...et-frontend/src/components/Menu/LanguagePicker.tsx 0.00% <0.00%> (-80.00%) ⬇️
superset-frontend/src/components/Link.tsx 7.69% <0.00%> (-79.81%) ⬇️
... and 797 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c241c6f...f7fbd59. Read the comment docs.

@@ -677,6 +677,10 @@ class CeleryConfig: # pylint: disable=too-few-public-methods
# language. This allows you to define custom logic to process macro template.
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {}

# Prevent access to classes/objects and proxy methods in the default Jinja context,
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS.
Copy link
Member

@mistercrunch mistercrunch Nov 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We may not want to make JINJA_CONTEXT_ADDONS mutually exclusive with SAFE_JINJA_PROCESSING. Someone may want to add a safe function to their environment without having to fully pivot into the legacy/more risky approach. It should be easy to support this, but we should highlight the caveats.

"""
Exposing functionality through JINJA_CONTEXT_ADDONS has security implications as it opens a window for a user to execute untrusted code. It's important to make sure that you make sure that the objects exposed (as well as objects attached to those objets) are harmless. We recommend only exposing simple/pure functions that return native types.
"""

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was unintentional. Fixing to allow JINJA_CONTEXT_ADDONS to coexist with SAFE_JINJA_PROCESSING

def set_context(self, **kwargs: Any) -> None:
extra_cache = ExtraCache(self._extra_cache_keys)
self._context = {
"url_param": partial(safe_proxy, extra_cache.url_param),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wasn't familiar with partial, but I'm guessing that the intent is to clean up other methods/context that could be attached to the callable (?)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The intent with partial is to wrap the callable with a method to enforce a safe return value.

Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A good step forward for securing Jinja. Would be curious to hear other community thoughts on this, especially if someone is extensively using datetime, random, uuid etc from the current base context. Perhaps kick off a DISCUSS on the mailing list?

@@ -677,6 +677,10 @@ class CeleryConfig: # pylint: disable=too-few-public-methods
# language. This allows you to define custom logic to process macro template.
CUSTOM_TEMPLATE_PROCESSORS: Dict[str, Type[BaseTemplateProcessor]] = {}

# Prevent access to classes/objects and proxy methods in the default Jinja context,
# unless explicitly overridden by JINJA_CONTEXT_ADDONS or CUSTOM_TEMPLATE_PROCESSORS.
SAFE_JINJA_PROCESSING: bool = True
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it make sense to introduce a TEMPLATE_PROCESSOR parameter that accepts TemplateProcessorEnum values, something like

TemplateProcessorEnum(enum.Enum):
    SafeJinja = 1
    LegacyJinja = 2
    Chevron = 3
    Custom = 4

In this approach, LegacyJinja would include the old datetime, random etc base context, and SafeJinja would have a more limited set.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure we want to support so many different modes. To me it's more important to find a "paved path" of safe and flexible templating solution that makes the most sense. Every feature flag we added here is more like a temporary solution for compatibility rather than something we want to support in the long-term.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@ktmud I agree, I think we should push safety and (potentially unsafe) customizability as a path forward.

superset/jinja_context.py Show resolved Hide resolved
none_type,
"bool",
"str",
"unicode",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I'm pretty sure unicode has been replaced by str in Python 3.

return_value = func(*args, **kwargs)
value_type = type(return_value).__name__
if value_type not in allowed_types:
raise SupersetTemplateException("Unsafe template value")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to get a more verbose error here, something like

raise SupersetTemplateException(__("Unsafe return type for function %(func)s: %(value_type)s", func=func.__name__, value_type=value_type)

@robdiciuccio
Copy link
Member Author

Taking @ktmud 's feedback, making this less configurable and moving towards sane defaults which can be overridden via existing JINJA_CONTEXT_ADDONS and CUSTOM_TEMPLATE_PROCESSORS configuration.

UPDATING.md Outdated
@@ -36,7 +38,7 @@ assists people when migrating to a new version.
and requires more work. You can easily turn on the languages you want
to expose in your environment in superset_config.py

* [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True on `DEFAULT_FEATURE_FLAGS`
* [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True in `FEATURE_FLAGS`
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe we should prevent DEFAULT_FEATURE_FLAGS from being overridden by custom config files/modules.

Copy link
Member

@dpgaspar dpgaspar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! but some tests are still failing, left a couple of non blocking comments and a question regarding presto context

UPDATING.md Outdated
- [11172](https://github.com/apache/incubator-superset/pull/11172): Breaking change: SQL templating is turned off be default. To turn it on set `ENABLE_TEMPLATE_PROCESSING` to True on `DEFAULT_FEATURE_FLAGS`
>>>>>>> master
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems like a merge conflict, while your at it can you fix turned off be default to turned off by default

self._context[self.engine] = {
"first_latest_partition": partial(safe_proxy, self.first_latest_partition),
"latest_partitions": partial(safe_proxy, self.latest_partitions),
"latest_sub_partition": partial(safe_proxy, self.latest_sub_partition),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Not totally related with this PR but it's strange that def latest_sub_partition(self, table_name: str, **kwargs: Any) -> Any: returns Any should probably be Optional[str]

superset/jinja_context.py Show resolved Hide resolved
Copy link
Member

@villebro villebro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, non-blocking comment on naming of function.

superset/jinja_context.py Outdated Show resolved Hide resolved
Copy link
Member

@mistercrunch mistercrunch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@ktmud ktmud left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with one small question

COLLECTION_TYPES = ("list", "dict", "tuple", "set")


@memoized
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the benefit of memoization here? Is current_app proxy very expensive?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The value doesn't change over time, so just trying to save some processing cycles. Probably not much gain here, but attempting not to repeat initialization logic.

@craig-rueda craig-rueda merged commit 01d15f5 into apache:master Nov 17, 2020
@craig-rueda craig-rueda deleted the rd/safe-jinja branch November 17, 2020 19:55
@mistercrunch mistercrunch added the rush! Requires immediate attention label Nov 18, 2020
auxten pushed a commit to auxten/incubator-superset that referenced this pull request Nov 20, 2020
* Enable safer Jinja template processing

* Allow JINJA_CONTEXT_ADDONS with SAFE_JINJA_PROCESSING

* Make template processor initialization less magical, refactor classes

* Consolidat Jinja logic, remove config flag in favor of sane defaults

* Restore previous ENABLE_TEMPLATE_PROCESSING default

* Add recursive type checking, update tests

* remove erroneous config file

* Remove TableColumn models from template context

* pylint refactoring

* Add entry to UPDATING.md

* Resolve botched merge conflict

* Update docs on running single python test

* Refactor template context checking to support engine-specific methods
@junlincc junlincc removed the rush! Requires immediate attention label Mar 18, 2021
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.0.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 1.0.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants