Avoiding an XSS Loophole in Twig
Cross-site scripting, or XSS, is a class of web application vulnerabilities. When an attacker is able to inject a code snippet of their choice into a web page and have it treated as HTML, JavaScript or similar, they gain vast freedom to perform actions on behalf of the users to whom that code is displayed. The Twig template engine comes with an "autoescape" feature that prevents many XSS attack vectors by default. Yet, you still need to be aware of potential pitfalls. In this blog post, I'll cover one example.
Starting point
Assume we have a macro that simplifies some output task. For the sake of simplicity, let's have it print a list.
{% macro list(items) %}
<ul>
{% for item in items %}
<li>{{ item }}</li>
{% endfor %}
</ul>
{% endmacro %}
{{ _self.list([book.title, book.author])}}
So far, everything is fine.
Thanks to Twig's autoescape feature, this code is protected from XSS attacks: even when an attacker can supply arbitrary values for book.title or book.author, Twig will make sure the htmlspecialchars() function is applied to make the values safe for use in HTML.
You can try the example in a Twig fiddle.
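To make the effect of autoescaping concrete, here is a minimal sketch. The book data is made up for illustration, and the snippet assumes the default HTML autoescaping strategy is active. It prints the same value twice, once relying on autoescaping and once with the escape filter spelled out.

```twig
{# Hypothetical value an attacker might submit as the author name #}
{% set book = { title: 'Some Title', author: '<script>alert(1)</script>' } %}

{# With HTML autoescaping active, both lines print the same escaped text,
   with &lt;script&gt; in place of <script>. #}
<li>{{ book.author }}</li>
<li>{{ book.author|escape('html') }}</li>
```

Twig does not escape a second time when the escape filter is the last filter applied, so the explicit form is merely redundant here.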
Adding HTML with Good Intentions™
Now, assume we're tasked with making the author's name editable. We add functionality somewhere to do that, and we'd like to allow the user to jump to that edit form right from our list.
We might change the macro invocation to look like this:
{{ _self.list([
book.title,
book.author ~ '<a href="#">edit</a>'
])}}
We give it a shot by reloading the page in the browser, or by looking at the updated fiddle.
We immediately recognize our mistake, since the <a href="#">...</a> markup shows up as literal text on the page instead of a clickable link. Right, that autoescape feature! What do we need to do?
Let's update our macro, since we know we want to include HTML in the output.
{% macro list(items) %}
<ul>
{% for item in items %}
{# Use |raw for HTML passthrough #}
<li>{{ item | raw }}</li>
{% endfor %}
</ul>
{% endmacro %}
Another quick refresh (fiddle) shows us that it's now working as expected. Let's commit and we're done.
What just happened
By applying the |raw filter in our macro, we have declared that whatever value the item variable contains, it is safe to use as-is in the current context. This turns off Twig's autoescaping at the point where the list items are printed.
We have shifted responsibility for proper escaping from Twig to the developer using our macro. And that not only affects the item where we add the HTML link, which might be fixed like so:
{{ _self.list([
book.title,
book.author|e ~ '<a href="#">edit</a>'
])}}
Note that I've added |e, the shorthand for the escape filter, to apply HTML escaping before concatenating the author name with the HTML markup.
This, of course, falls short, since we need to escape not only the book.author item, but also book.title.
Even worse, when our macro is a reused component called from various places, we'd need to have this escaping applied to all inputs supplied to the macro, which might be a lot of work to find and update.
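For this one call site, "escaping everything yourself" might look like the following sketch. It assumes the macro variant that prints items with |raw, so every value handed in has to be escaped by the caller.

```twig
{# Assumes the list macro prints items with |raw, so escaping is now the
   caller's job for every single item. #}
{{ _self.list([
    book.title|e,
    book.author|e ~ ' <a href="#">edit</a>'
]) }}
```

Every other call of the macro elsewhere in the code base would need the same treatment.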
You would probably take a step back and find some other way to solve your current task without making such a change. Maybe you'd just stop using the macro in the current situation and duplicate the code in this one spot. That is, if you are aware of the problem.
You ain't gonna notice (in time)
The challenge, in my opinion, is that you are likely not going to notice the problem you just created. It takes a lot of awareness to understand the subtle implications of the change.
This may not only be an issue for less experienced developers or template designers. Imagine the template not being as simple as in the above example, but a bit more cluttered. For example, the list macro might also need to deal with array keys and blend in CSS classes (maybe taken from a second parameter). That is a lot of distraction even for more experienced Twig users, and you might easily miss the important point.
Also, since Twig, and probably other template engines as well, does a great job at producing "safe" output most of the time nowadays, we have become a bit more negligent or less well trained when it comes to HTML and XSS than some of us might have been a decade ago.
To make matters worse, nothing will point you to the problem in the short run. Whilst the initial HTML that you added showed up in the output and immediately prompted you to "fix" it, the XSS loophole is lurking in your template with no indication of a problem. The "lorem ipsum" style data you will typically use during development and also during automated tests (functional or acceptance) is most likely not suited to finding XSS exploits like this.
Here is a fiddle where an author attended a creative writing course and changed their name afterwards, to better illustrate the problem. Imagine an admin user of your online book shop visiting a page where the above macro is used: they see nothing unusual, no "funny looking" author name. Yet, JavaScript is executed in the background in the context of the admin's session.
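As a rough illustration of the kind of value involved, consider the following made-up author name. The data and the payload are hypothetical, and the snippet assumes the macro variant that prints items with |raw.

```twig
{# Hypothetical hostile data; the script payload is only an illustration #}
{% set book = {
    title: 'An Unremarkable Title',
    author: 'Jane Doe<script>/* attacker-controlled code */</script>'
} %}

{# With |raw in the macro, the <script> element is written into the page
   unchanged. The browser runs it, and the visible text is just "Jane Doe". #}
{{ _self.list([book.title, book.author]) }}
```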
A more robust pattern
Here is another way you could have written the previous change in Twig.
{% set author_with_link %}
{{ book.author }} <a href="#">edit</a>
{% endset %}
{{ _self.list([book.title, author_with_link])}}
The interesting thing about this is that you can keep the list macro as-is, without adding |raw in it.
When Twig processes the {% set … %}...{% endset %} section, it will apply autoescaping as usual. The book.author value will have HTML escaping applied before being printed, and literal HTML markup in templates is safe by default.
Now, the author_with_link variable does not contain a simple string, but an instance of a special class in Twig (Twig\Markup). It will feel like a string for most purposes, so you can pass it around, concatenate it with other strings or pass it through filters.
As long as you use it unchanged, Twig will "remember" that output escaping has already been applied to this piece of HTML. You can verify in this fiddle that the HTML link is in fact printed as plain HTML, whereas HTML markup in the book.author data is escaped.
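The "unchanged" part matters. As a sketch of the difference, assuming author_with_link was captured as in the example above: printing the variable directly keeps the markup, whereas concatenating it with another string yields an ordinary string again, which Twig then escapes as a whole.

```twig
{# Printed unchanged: the captured markup is output as HTML #}
{{ author_with_link }}

{# Concatenation produces a plain string, so autoescaping applies to the
   whole result and the link markup shows up as literal text #}
{{ author_with_link ~ ' (verified)' }}
```

The second form fails visibly rather than silently, which is the safer direction for a mistake to go.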
Conclusion
Be careful when you apply the |raw filter in Twig. It shifts the responsibility for applying appropriate output encoding from Twig to you, the template developer.
When using |raw, especially in centralized places like includes or macros, make sure everyone passing in data is aware of their obligation to apply appropriate escaping. Introducing |raw in such a place may be a backwards compatibility break for the include or macro.
Glitches in output escaping are likely to go unnoticed for a long time, since you don't typically test with the necessary inputs unless you are doing security audits.
Doing string concatenation operations with HTML snippets in Twig may be a sign of danger. Even though it may take a few more lines of code to write and variable names to come up with, writing markup in Twig and capturing it with {% set … %} may help you stay on the safe side, especially when passing data along afterwards.