Skip to content
This repository has been archived by the owner on Feb 8, 2018. It is now read-only.

review site for html injections #722

Closed
chadwhitacre opened this issue Mar 12, 2013 · 52 comments · Fixed by #3164
Closed

review site for html injections #722

chadwhitacre opened this issue Mar 12, 2013 · 52 comments · Fixed by #3164

Comments

@chadwhitacre
Copy link
Contributor

I just discovered that I wasn't escaping statements. :-(

I'm not sure how that happened, but it did. We need to review the whole site for html injections.


Want to back this issue? Place a bounty on it! We accept bounties via Bountysource.

@wilkie
Copy link
Contributor

wilkie commented Mar 12, 2013

In investigating the issue 8506070 caused, I've audited participant statement usage throughout the site.

User Statement

stored non-escaped

statement.json: wrapped!
index.html: meta og:description: SERIOUSLY BAD: not-escaped (I confirmed in a development environment that I can indeed close the tag and inject a <link> or <meta> etc in <head>, which is obvi-bad)
index.html: displaying the statement: escaped twice! see #723 This doesn't seem to be a place where escape is needed, oddly.
about/stats: escaped for every user
about/goals.html: BAD: not escaped
templates/profile-edit.html: intentionally not escaped

@chadwhitacre
Copy link
Contributor Author

Fixed in b4bdc32: index.html: meta og:description: SERIOUSLY BAD: not-escaped (I confirmed in a development environment that I can indeed close the tag and inject a <link> or <meta> etc in <head>, which is obvi-bad)

Fixed in 959e72a: index.html: displaying the statement: escaped twice! see #723 This doesn't seem to be a place where escape is needed, oddly.

Fixed in cca632d: about/goals.html: BAD: not escaped

Good to close?

@wilkie
Copy link
Contributor

wilkie commented Mar 14, 2013

well, the issue suggests looking for html injections throughout the site. I only focused on statement fields, so technically the issue is not yet fulfilled.

@chadwhitacre
Copy link
Contributor Author

Good point. :)

@chadwhitacre
Copy link
Contributor Author

@buttscicles caught a few more of these (thanks!). I'm beginning to realize now why Tornado moved to HTML-escaping by default in 2.x. This feels like whack-a-mole. Gittip uses Aspen's default templating language, Tornado 1.x. We have a ticket over on Aspen to upgrade to 2.x (AspenWeb/pando.py#158). Picking up over there ...

@chadwhitacre
Copy link
Contributor Author

Let's take this up now that we've migrated to Jinja2 (#1783).

@chadwhitacre
Copy link
Contributor Author

@clone1018 Are we good to close now that we're on Jinja2?

@clone1018
Copy link
Contributor

I wouldn't say we're good but we're good for now.

@chadwhitacre
Copy link
Contributor Author

@clone1018 Would you be willing to conduct a review so we can close this ticket?

@clone1018
Copy link
Contributor

@whit537 I wouldn't feel completely safe until we change Jinja2 to being autoescape by default

http://jinja.pocoo.org/docs/extensions/#autoescape-extension

@clone1018
Copy link
Contributor

I'm in favor of closing this specific ticket for now, we can reticket any other issues and if escaping becomes a problem we can ticket the autoescape extension.

@chadwhitacre
Copy link
Contributor Author

I'm leery of autoescape, because the rules for escaping differ with the syntax you're templating into. If we're inserting into an href attribute like href="{{ foo }}" then we have different escaping needs than if we are inserting into an HTML context. Auto-escaping can give us a false sense of security that everything is escaped properly when we really still need to be aware of every single place in templates where we insert stuff and whether or not we're doing it right.

Even with autoescape we still need this audit.

@seanlinsley
Copy link
Contributor

Could you go into detail @whit537 about the sorts of escaping rules that could cause trouble? In general, automatically escaping everything is a boon to security because it gives you a much smaller surface area to check for holes.

@chadwhitacre
Copy link
Contributor Author

@seanlinsley
Copy link
Contributor

So you're saying... it might not use the right type of escaping for the given context?

@chadwhitacre
Copy link
Contributor Author

Yup. We got bit by this once already, iirc.

@chadwhitacre
Copy link
Contributor Author

Basically, we need an audit based on that cheat sheet. Whether or not we use autoescaping is orthogonal.

@chadwhitacre
Copy link
Contributor Author

Really, it would be great to have some tooling that presented a list of all the places in templates where we insert stuff, so we can review and check them off one by one.

@chadwhitacre
Copy link
Contributor Author

Now that we have i18n we should consider including that in a review. IRC

@blrhc
Copy link
Contributor

blrhc commented Nov 11, 2014

I'll start taking a look at this. Thanks.

@chadwhitacre
Copy link
Contributor Author

IRC

@chadwhitacre
Copy link
Contributor Author

Retort:

  • We're only using Jinja2 for HTML.
  • But it doesn't hurt to use it when it's not needed.
  • Hmmm ... really?

@chadwhitacre
Copy link
Contributor Author

It just seems increasingly stupid to rely on "a human knows" for the 544 places where we are potentially vulnerable to injection. Performance will have to be really bad to justify not switching to autoescaping. No?

@chadwhitacre
Copy link
Contributor Author

@chadwhitacre
Copy link
Contributor Author

Autoescape interacts with i18n somehow:

screen shot 2015-02-04 at 1 15 30 pm

@chadwhitacre
Copy link
Contributor Author

Also if [newstyle gettext calls] are used in an autoescaping environment they better support automatic escaping.

http://jinja.pocoo.org/docs/dev/extensions/#newstyle-gettext

@chadwhitacre
Copy link
Contributor Author

So it looks like implementing autoescaping will take some work. It's not a trivial quick answer to this ticket.

@Changaco
Copy link
Contributor

Changaco commented Feb 4, 2015

  • We're only using Jinja2 for HTML.

Not true, we also use it for plain text in email simplates.

@chadwhitacre
Copy link
Contributor Author

P.S. The place to implement autoescaping would be in aspen_jinja2, by adding autoescape=True, extensions['jinja2.ext.autoescape'] to the Environment instantiation. Then, another thing to work around is that we don't want to autoescape the CSS and JS assets. The workaround for that is:

[---]
{% autoescape false %}{{ css }}{% endautoescape %}

If the application enables the Autoescape Extension one can activate and deactivate the autoescaping from within the templates.

http://jinja.pocoo.org/docs/dev/templates/#autoescape-extension

@chadwhitacre
Copy link
Contributor Author

Not true, we also use it for plain text in email simplates.

Okay, so if we go with autoescaping I guess we'd use the {% autoescape false %} mechanism to turn it off for the text/plain email parts.

@chadwhitacre
Copy link
Contributor Author

@Changaco Can you speak to whether we're using newstyle gettext calls?

@Changaco
Copy link
Contributor

Changaco commented Feb 4, 2015

Sort of, we use custom functions.

chadwhitacre added a commit that referenced this issue Feb 4, 2015
Figured this would be worth having for now until we switch to autoescape
(#3152).
chadwhitacre added a commit that referenced this issue Feb 4, 2015
Figured this would be worth having for now until we switch to autoescape
(#3152).
@chadwhitacre
Copy link
Contributor Author

Here's a slightly differently-formatted list:

https://gist.github.com/whit537/03b1e5f88fcf38c3dd87

Looks like we have 527 replacements on master right now.

@chadwhitacre
Copy link
Contributor Author

I separated out the places where we are just doing i18n ({{ _("Greetings, program!") }}).

https://gist.github.com/whit537/007d96927377a7ed0419

@chadwhitacre
Copy link
Contributor Author

Category N
High Risk 336
Simple i18n 139
Assets 23
Escaped 29

https://gist.github.com/whit537/0a60f06b87ecc332f4fe

@chadwhitacre
Copy link
Contributor Author

Alright, I still don't want to go through 336 of these. :-)

@chadwhitacre
Copy link
Contributor Author

The next step down the autoescaping path is to benchmark performance. If performance isn't "really a lot" worse with autoescaping then we should just use that. It'll take a chunk of work to switch over though due to i18n so benchmarking probably makes sense to make sure it'll be worth it.

@chadwhitacre
Copy link
Contributor Author

If performance is really as atrocious as promised then we should review the 336 places and make it part of our PR review habit.

@chadwhitacre
Copy link
Contributor Author

Here's a script I wrote to help with benchmarking:

https://gist.github.com/whit537/352a6b02d5f644720007

Gonna do some benchmarks now.

@chadwhitacre
Copy link
Contributor Author

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py | grep " seconds "
11.9522500038 seconds without autoescaping
12.4190990925 seconds *with* autoescaping
[gratipay] $
#!/usr/bin/env python
"""This is a helper script for benchmarking page performance re: #722.

Set this in local.env to avoid a canonicalizing redirect:

    CANONICAL_HOST=

Then run this script like so:

    [gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py

"""
import time
import aspen_jinja2_renderer
from gratipay.testing import Harness


def autoescaping_compile_meta(self, configuration):
    loader = None
    if configuration.project_root is not None:
        # Instantiate a loader that will be used to resolve template bases.
        loader = aspen_jinja2_renderer.FileSystemLoader(configuration.project_root)
    return aspen_jinja2_renderer.Environment( loader=loader
                                            , autoescape=True
                                            , extensions=['jinja2.ext.autoescape']
                                             )

start = time.time()
for i in range(100):
    Harness.client.GET('/')
print("{} seconds without autoescaping".format(time.time() - start))

aspen_jinja2_renderer.Factory.compile_meta = autoescaping_compile_meta

start = time.time()
for i in range(100):
    Harness.client.GET('/')
print("{} seconds *with* autoescaping".format(time.time() - start))

@chadwhitacre
Copy link
Contributor Author

10x

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py | grep " seconds "
1.3902361393 seconds without autoescaping
1.18036794662 seconds *with* autoescaping
[gratipay] $

@chadwhitacre
Copy link
Contributor Author

[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 10 | grep " seconds "
1.29223489761 seconds without autoescaping
1.17796278 seconds *with* autoescaping
[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 100 | grep " seconds "
11.9243550301 seconds without autoescaping
12.4741439819 seconds *with* autoescaping
[gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py 1000 | grep " seconds "
124.910935163 seconds without autoescaping
132.204340935 seconds *with* autoescaping
[gratipay] $
#!/usr/bin/env python
"""This is a helper script for benchmarking page performance re: #722.

Set this in local.env to avoid a canonicalizing redirect:

    CANONICAL_HOST=

Then run this script like so:

    [gratipay] $ honcho run -e defaults.env,local.env ./env/bin/python benchmark-autoescaping.py

"""
import sys
import time

import aspen_jinja2_renderer
from gratipay.testing import Harness


def autoescaping_compile_meta(self, configuration):
    loader = None
    if configuration.project_root is not None:
        # Instantiate a loader that will be used to resolve template bases.
        loader = aspen_jinja2_renderer.FileSystemLoader(configuration.project_root)
    return aspen_jinja2_renderer.Environment( loader=loader
                                            , autoescape=True
                                            , extensions=['jinja2.ext.autoescape']
                                             )

try:
    n = int(sys.argv[1])
except IndexError:
    n = 10

start = time.time()
for i in range(n):
    Harness.client.GET('/')
print("{} seconds without autoescaping".format(time.time() - start))

aspen_jinja2_renderer.Factory.compile_meta = autoescaping_compile_meta

start = time.time()
for i in range(n):
    Harness.client.GET('/')
print("{} seconds *with* autoescaping".format(time.time() - start))

@chadwhitacre
Copy link
Contributor Author

Alright, performance concerns look like FUD to me.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging a pull request may close this issue.

6 participants