Content Security Policy introduction

I blogged about Content Security Policy about 2 years ago, when it was still called 'Site Security Policy'. It started as a specification and an add-on, and turned into a patch a bit later. Finally it made it into Firefox 4 beta 1. I think CSP is the next web security revolution, so make yourself aware of how it works and what the implications are.

So what is it? The short version is that it's a very effective measure against cross-site scripting. By sending a policy in the 'X-Content-Security-Policy' header, you can specify exactly from which locations you accept javascript and other content. This allows you to block scripts from any domains unknown to you, and inline scripts altogether.

A simple example

    X-Content-Security-Policy: allow 'self'

A simple PHP example to see this in action:

    <?php

    header("X-Content-Security-Policy: allow 'self'");

    ?>
    <html>
      <head>
        <title>CSP test</title>
      </head>
      <body>

    <script type="text/javascript">

    alert('XSS!');

    </script>

      </body>
    </html>

If the above code is opened in Firefox 4.0 beta1, the script will not execute, and a warning is added to the "Error Console" (in the Tools menu).

Not only does this header block inline scripts, it also blocks the following:

  • eval(). This is important for people using eval() to parse json responses.
  • setTimeout and setInterval if the function is provided as a string.
  • javascript: urls
  • HTML event attributes (onclick, onload, etc.).
  • All images, plugin objects (flash, quicktime etc.), audio, video, html frames and fonts not served from the same domain as the html page.
  • XMLHttpRequest to domains other than the source domain.

Fortunately there are fine-grained controls over what you want to allow from which domains. Here is an example from the specification.

    X-Content-Security-Policy: allow 'self'; img-src *; \
                               object-src media1.com media2.com *.cdn.com; \
                               script-src trustedscripts.example.com

This example starts with "allow 'self'", which by default only allows content from the same domain. The "img-src *" rule allows images from any domain. "object-src media1.com media2.com *.cdn.com" allows <object> tags to use files from media1.com, media2.com and any subdomain of cdn.com, and "script-src trustedscripts.example.com" only allows scripts from that host. To learn more about these, I would recommend taking a good look at the directives list in the specification.
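If you'd rather not glue the header together by hand, a policy like the one above can be built from an array. This is only a minimal sketch: buildCspHeader() is a name I made up for illustration, and it assumes the 'directive source source; directive source' syntax used in the examples in this post.

```php
<?php
// Hypothetical helper: turns a directive => sources map into a header line.
function buildCspHeader(array $directives): string
{
    $parts = [];
    foreach ($directives as $name => $sources) {
        // Each directive is its name followed by a space-separated source list.
        $parts[] = $name . ' ' . implode(' ', $sources);
    }
    return 'X-Content-Security-Policy: ' . implode('; ', $parts);
}

$policy = buildCspHeader([
    'allow'      => ["'self'"],
    'img-src'    => ['*'],
    'object-src' => ['media1.com', 'media2.com', '*.cdn.com'],
    'script-src' => ['trustedscripts.example.com'],
]);

// Headers can only be sent before any output has been written.
if (!headers_sent()) {
    header($policy);
}
```

Keeping the policy in one array also makes it easier to review which domains you are actually trusting.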

Options and reporting

Using the 'options' directive it's possible to turn on specific measures. Valid values for options are 'eval-script' and 'inline-script'.

    X-Content-Security-Policy: allow 'self'; options inline-script, eval-script

The preceding example allows inline scripts (using html event attributes, or the script tag) as well as the 'eval()' function. In general I would try to avoid this though.

When a security rule is violated, it's possible to get the browser to send a report back to the server. For example, if an image is referenced from a blocked domain, the browser can send a simple report to a url you specify.

    X-Content-Security-Policy: allow 'self'; report-uri http://example.org/cspreport.php

This allows you to detect any problems with your policy, or successful attempts by your evil users to inject code. An example of such a report is the following:

    {
      "csp-report":
        {
          "request": "GET http://index.html HTTP/1.1",
          "request-headers": "Host: example.com
                             User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.3a5pre) Gecko/20100601 Minefield/3.7a5pre
                             Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                             Accept-Language: en-us,en;q=0.5
                             Accept-Encoding: gzip,deflate
                             Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
                             Keep-Alive: 115
                             Connection: keep-alive",
          "blocked-uri": "http://evil.com/some_image.png",
          "violated-directive": "img-src 'self'",
          "original-policy": "allow 'none'; img-src *, allow 'self'; img-src 'self'"
        }
    }
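On the receiving end, the script behind the report-uri needs to read and decode that report. The following is a rough sketch of what cspreport.php could do, assuming the browser sends the report as a valid JSON request body with the 'csp-report' key shown above. parseCspReport() is a hypothetical name, and the inline sample body only exists to make the example self-contained.

```php
<?php
// Hypothetical helper: extracts the interesting fields from a raw report body.
// Returns null when the body is not a recognisable CSP report.
function parseCspReport(string $rawBody): ?array
{
    $data = json_decode($rawBody, true);
    if (!is_array($data) || !isset($data['csp-report']) || !is_array($data['csp-report'])) {
        return null;
    }
    $report = $data['csp-report'];
    return [
        'blocked-uri'        => (string) ($report['blocked-uri'] ?? ''),
        'violated-directive' => (string) ($report['violated-directive'] ?? ''),
    ];
}

// In a real handler the body would come from the request:
// $report = parseCspReport(file_get_contents('php://input'));
$sampleBody = '{"csp-report": {"blocked-uri": "http://evil.com/some_image.png", "violated-directive": "img-src \'self\'"}}';
$report = parseCspReport($sampleBody);

if ($report !== null) {
    error_log('CSP violation: ' . $report['violated-directive'] . ' blocked ' . $report['blocked-uri']);
}
```

Keep in mind that anybody can POST to this url, so treat the report contents as untrusted input.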

Final notes

Using CSP does not mean you can go easy on other security measures. At the moment only a very limited number of users have a browser that supports CSP, so everybody else still needs to be protected. However, it's still a great idea to implement. Your Firefox users will automatically be better protected, and because of the reporting functionality, they automatically help you detect holes, which benefits everybody.

My guess is that CSP is going to be very important, and is here to stay. There are two things you can do to prepare for the future:

  1. Figure out your policy. It's a good idea for your web application to know where its resources are coming from anyway. Advertisers in particular tend to be bad at this, using many different domains and scripts that load other scripts.
  2. Try to avoid any inline scripting, html event handlers and eval(). They are all avoidable, and in my opinion it is a good idea to keep your javascript out of html anyway. This is a big one, because both inline scripts and html events are still very popular. With the popularity of libraries such as jQuery, I do think it will be easier to just grab most of the inline scripts and move them to an external script.

Storing encrypted session information in a cookie


Our session system is due for an upgrade. Currently all PHP sessions are stored in the database, and some things are getting a bit slow. There have been a couple of approaches I've been considering, one of which is simply storing all the information in a browser cookie.

First I want to make clear I don't necessarily condone this. I'm writing this post because I'm hoping for some more community feedback. Is this a really bad idea? I would love to know.

The benefits

If all the session data is stored in the browser, it means that I don't need to store it on the server. I actually don't care all that much for having the data on the server (unless it's the only secure way), it's mostly a gigantic map with session tokens and user id's (along with some other info).

I also feel it's more natural for HTTP, as it makes it a bit more stateless.

Sample code

    <?php

    class BrowserSession {

        public $secret = 'this will need to be a cryptographic random number';
        public $currentUser = null;

        // Sessions time out after 10 minutes
        public $timeout = 600;

        function init() {

            if (!isset($_COOKIE['MYSESSION'])) {
                echo "No session cookie found\n";
                return;
            }

            // Expect exactly three fields: user id, timestamp, signature
            $parts = explode(':', $_COOKIE['MYSESSION']);
            if (count($parts) !== 3) {
                echo "The session cookie was malformed\n";
                return;
            }
            list($userId, $time, $signature) = $parts;

            // The cookie is old
            if ((int)$time + $this->timeout < time()) {
                echo "The session cookie timed out\n";
                return;
            }

            // hash_equals() compares the signatures in constant time
            if (!hash_equals($this->generateSignature($userId, $time), $signature)) {
                echo "The signature was incorrect\n";
                return;
            }

            $this->currentUser = $userId;

            echo "Logged in as user: $userId\n";

        }

        function login($userId) {

            $this->currentUser = $userId;

            $time = time();

            $cookie = $userId . ':' . $time . ':' . $this->generateSignature($userId, $time);

            // Arguments: name, value, expiry, path, domain, secure, httponly
            setcookie('MYSESSION', $cookie, $time + $this->timeout, '', '', false, true);

            echo "Set cookie: $cookie\n";

        }

        function generateSignature($userId, $time) {

            $stringToSign =
               $userId . "\n" .
               $time . "\n" .
               $_SERVER['HTTP_USER_AGENT'] . "\n" .
               $_SERVER['REMOTE_ADDR'];

            return hash_hmac('SHA1', $stringToSign, $this->secret);

        }

    }

    // Buffer output, so the cookie header can still be sent after echo
    ob_start();
    $session = new BrowserSession();
    $session->init();

    if (isset($_GET['login'])) $session->login($_GET['login']);
    else {

        echo '<br /><a href="?login=1234">Log in as user 1234</a>';

    }
    ?>

A few notes:

  • The preceding code was just intended as a proof of concept; it's missing some validation.
  • Currently the secret would be the same for every user. I was thinking of appending some per-user information to the secret. If somebody does guess or bruteforce the secret, they would only have access to a single user's information.
  • If a user changes their password, existing sessions should expire. To do this the signature should also include a sequence number that changes when the password changes.
  • Currently this only stores a user id. It could be extended to contain more data, but this is all I need.

So, is there anything fundamentally wrong with this approach? In general the client should never be trusted, but for setups where the security requirements aren't as high (highly subjective, I know) I feel this might be strong enough. OAuth, OpenID and Amazon AWS all seem to trust HMAC+SHA1, but those applications do work differently.

Credit where it's due

I first asked this question on Stack Overflow. The users there already gave some great suggestions and pointed out some of the flaws. Thank you!

What happened to HTTP authentication?

Rant warning

We enter our usernames and passwords on pretty much all the sites we commonly visit. Authentication is probably one of the first things you're taught when starting to work with PHP. For some reason, in 99% of the cases this is done through an HTML form, with the username and password submitted as a urlencoded string.

You probably know that HTTP also has native authentication, in the form of Basic and Digest authentication (read my older article if you want to know how). Every browser and pretty much any HTTP client supports it too. There are some big benefits to that, because it provides a very standardized mechanism to authenticate a client, whether it's a machine or a human.

What baffles me is that HTTP authentication hasn't been developed further. HTTP Digest is pretty secure by itself, and has some nice features (hashed password, protection against man in the middle and replay attacks, message digests), which are way more advanced than what an HTML POST form with a session cookie can provide.
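To show what 'hashed password' means here: with Digest the password never goes over the wire, only a hash that mixes it with a server-supplied nonce. Below is a small sketch of the calculation for qop=auth as I understand it from RFC 2617. digestResponse() is a name I picked for illustration, and the input values are the ones from the RFC's own worked example.

```php
<?php
// Computes the Digest "response" value for qop=auth.
function digestResponse(
    string $username, string $realm, string $password,
    string $method, string $uri,
    string $nonce, string $nc, string $cnonce
): string {
    $ha1 = md5($username . ':' . $realm . ':' . $password); // the server can store this instead of the password
    $ha2 = md5($method . ':' . $uri);                       // ties the response to this request
    return md5(implode(':', [$ha1, $nonce, $nc, $cnonce, 'auth', $ha2]));
}

echo digestResponse(
    'Mufasa', 'testrealm@host.com', 'Circle Of Life',
    'GET', '/dir/index.html',
    'dcd98b7102dd2f0e8b11d0f600bfb0c093', '00000001', '0a4f113b'
), "\n";
```

The nonce and the request counter (nc) are what make a captured response useless for a replay.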

What's missing?

  1. There's no way for a user to see if they are authenticated to a site. Perhaps a username in the address bar?
  2. Pretty much everybody always wonders how they can code a logout mechanism. Because there are no session cookies that can be destroyed, there are some hacks that trick the browser to ask for credentials again. There should be no need for the server to provide this functionality. The browser knows it's logged in, and HTTP applications are stateless. We need an in-browser log-out button.
  3. Less important: some javascript hooks that allow developers to still use html forms to set up HTTP authentication.

Mozilla is doing some interesting things with their Account Manager add-on for Firefox, but even that add-on does not support HTTP authentication. With Account Manager they are jumping through some hoops with javascript hooks so it works with regular authentication systems, but you'd think that if HTTP authentication was used, things could be a lot more straightforward. The browser knows exactly who is logged in.

So, does anyone know how this happened? Is there a major flaw in HTTP authentication I'm just missing?

When to escape your data

Two common examples of escaping data are escaping values for SQL queries and escaping text for HTML output.

The question I'd like to ask today is, when to do this? There are two possible moments:

  1. Right when the data comes in. For SQL this used to be done with 'magic quotes' quite a bit in PHP-land. In general I don't see this happening a lot anymore for SQL. I do however see data encoded using htmlentities/htmlspecialchars before entering the database.
  2. The other way to go about it, is to only escape when you know how you're going to use it. For example, only call htmlspecialchars right before you echo() your data into your document.

I would personally argue that #2 is the best way to go about things. The first reason is that you don't know exactly how your data might be used in the future. If you pre-encoded everything using htmlentities, but at some point in the future you need the data to be used in an XML feed, you're going to be in trouble. The reason for this is that the only predefined entities in XML are &amp;, &lt;, &gt;, &quot; and &apos;. If you are going to need to output to CSV, very different rules apply. Other examples are: escaping for urls, escaping for command-line arguments, escaping for javascript and escaping for mime-headers.
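As a small illustration of why the raw value is the thing to store, here is the same string escaped for three different outputs. The variable names and the search.php url are just made up for the example.

```php
<?php
// The raw value, stored exactly as the user entered it.
$raw = 'Tom & "Jerry" <b>';

// HTML output: escape right before echoing into the page.
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// URL query parameter: completely different rules.
$forUrl = 'search.php?q=' . rawurlencode($raw);

// JSON document: different again.
$forJson = json_encode($raw);

echo $forHtml, "\n", $forUrl, "\n", $forJson, "\n";
```

None of the three results can be reliably turned into one of the others, which is exactly the problem you run into when only the pre-escaped version made it into the database.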

In the illustrated example, this is no big disaster. A workaround would be to call htmlspecialchars_decode() or html_entity_decode() first, and then escape for your desired output. A worse case is filtering. If you have been stripping out all, or some, html tags before saving data to the database, and later on you decide you want to show some of them anyway, that data is now lost.

Conclusion

So my argument is to store raw data. Only encode once you know where the data is going to be used, right before output. If you're worried about the overhead of escaping right before output in an html page, cache the output.

Whichever route you go, make sure this is clearly documented. There are 2 ways this can go wrong:

  1. Escaping is done on both input and output. Now you see literal &amp;'s in your html, or quotes preceded by backslashes (\'hello\').
  2. Escaping is forgotten at both ends. Now you might be vulnerable to SQL injection attacks, XSS attacks or data corruption.

What do you think? I'm especially interested in the other side of the argument.

Frame busting and clickjacking prevention

Clickjacking allows
an attacker to trick your users into clicking parts of your interface without
their consent. A simple way to describe this: an attacker will embed
your application in their site as an iframe. On top of the iframe they can
show a completely different interface. The user thinks they are clicking buttons
on the attacker's interface, while in fact they are hitting the 'Delete my
account' button in, for example, Gmail.

Because this technique relies entirely on frames, it can be
prevented by using a 'frame busting' technique. As a bonus, this will also
stop, for example, Digg from framing and monetizing your content.

Frame busting can be achieved with a simple javascript technique:

    <script type="text/javascript">
    if (top !== self) top.location.replace(self.location.href);
    </script>

Security through javascript?

If you think this sounds like a bad idea, you are probably right. Users might
simply have javascript disabled, and I also don't like relying on UI developers
too much to implement preventive security measures (although I realize in most
cases you do have to).

In Internet Explorer the situation is worse, because IE allows you to specify
the non-standard attribute security="restricted" on an iframe:

    <iframe src="http://www.rooftopsolutions.nl/" security="restricted"></iframe>

This attribute tells IE to not allow executing of javascript in the iframe,
which actually is not a bad security measure for other types of attacks. In this
case however, it allows the attacker to disable the framebusting script.

X-Frame-Options

Thankfully, Internet Explorer 8 introduces a new feature that allows the site
owner to disallow frames altogether, which is in my opinion an even better
protection mechanism, because it doesn't rely on javascript to be executed.

The name of the http header is specified as such:

    X-FRAME-OPTIONS: SAMEORIGIN
    X-FRAME-OPTIONS: DENY

You only have to specify one of these two. 'sameorigin' means the page
can only be framed from an html page hosted on the same domain; 'deny' will
kill framing altogether.

PHP example:

    <?php
    header('X-FRAME-OPTIONS: DENY');
    ?>

Firefox also appears to
have started implementing this feature, and there's a feature request for
WebKit open as well.

Protecting yourself

Unfortunately you can safely assume most sites don't implement either of
these security measures. For Firefox users I would therefore strongly recommend
using the NoScript plugin. Not only
does it implement X-FRAME-OPTIONS for Firefox, it also actively detects
clickjacking attempts.

Reference: hackademix.net

Preventing XSS in Javascript strings

Escaping user input in your HTML is essential for preventing the world's #1 vulnerability.

When you're embedding user input into javascript, a simple htmlspecialchars won't cut it; you'll need to make sure you're escaping other things, like \n (line endings) and \ (backslashes). Google doctype has a good list of characters in need of proper escaping to prevent users breaking your javascript.

However, when I asked whether a simple string replacement would be good enough, the members of the Web security mailing list gave me a different answer.

When escaping or filtering output using a blacklist (such as the one published on Google doctype), browser and unicode escaping bugs are not taken into consideration. Some new vulnerability might appear in the future, which would immediately open a hole in your app. For this reason it's wiser to go with a much more defensive white-list approach, essentially only letting things through that you know are safe.
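To make the white-list idea concrete, here is a minimal sketch of such an escaper for javascript strings. It is my own illustration, not taken from any library: jsStringEscape() is a made-up name, the set of 'safe' characters is deliberately tiny, and it assumes the input is UTF-8.

```php
<?php
// Letters, digits and a few harmless characters pass through untouched;
// everything else is written as a \xHH or \uXXXX escape.
function jsStringEscape(string $input): string
{
    // Split into UTF-8 characters; this fails on invalid UTF-8.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }
    $out = '';
    foreach ($chars as $char) {
        if (preg_match('/\A[A-Za-z0-9 ,.]\z/', $char)) {
            $out .= $char;                             // known to be safe
        } elseif (strlen($char) === 1) {
            $out .= sprintf('\\x%02X', ord($char));    // any other ASCII character
        } else {
            $out .= substr(json_encode($char), 1, -1); // non-ASCII: \uXXXX
        }
    }
    return $out;
}

$userInput = "'; alert('XSS'); //";
echo "var myString = '", jsStringEscape($userInput), "';\n";
```

Quotes, slashes, angle brackets and line endings never reach the page as literal characters, so it doesn't matter which of them turns out to be dangerous in some browser.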

Introducing Reform

Reform is a tool that does exactly this. Reform allows you to escape your data for a javascript, xml, html or vbscript (yes it still exists) context. It provides libraries for Java, .NET, PHP, Perl, Python, Javascript and ASP. Pretty cool!

One dislike I have is that it only considers a really small set of unicode codepoints safe. Especially when dealing with non-latin languages, this is going to add a great deal to the bandwidth usage and hurt the legibility of your source code. One would think there have to be more ranges that can be considered 'safe'.

PHP example:

    <?php
      // Assuming the Reform class is included..

      echo '<script type="text/javascript"> var myString = ', Reform::JsString($userInput), '; </script>';

    ?>

I made a couple of changes in the PHP version, specifically:

  • Prepended the 'static' keyword to every method to make it work in PHP5's strict mode.
  • Removed the UTF-8 checks, I'm in a controlled environment, mbstring is installed, and the internal encoding is utf-8.
  • Added a parameter to Reform::JsString to not automatically put the string between quotes (').

IE8 comprehensive protection

Today on the IE blog a big announcement was made regarding the upcoming security features in Internet Explorer 8.

Definitely check it out! Among other things it includes an XSS protection filter, HTML sanitizing built straight into the scripting engine and a way to disable the infamous 'content sniffing'. I'd still hope to see the content-sniffing 'feature' become opt-in, instead of the proposed opt-out solution, but hey, at least it allows us to plug the hole.

To serve files as text/plain, serve the document with the Content-Type header as:

    Content-Type: text/plain; authoritative=true;

I have to say, I'm quite impressed how IE is catching up with things like standards and security.

Site Security Policy

Via: Jeremiah Grossman.

A Site Security Policy has been proposed by Mozilla employee Brandon Sterne. This is an extremely important specification, and could be a big step ahead for security on the web.

<rant>

Over the last decade websites have transformed into feature-rich web applications, with the introduction of Javascript and XMLHTTPRequest, Flash and whatnot. While great for user experience, this has also brought huge security implications, resulting in over 80% of all documented security vulnerabilities in 2007 being carried out using XSS.

While implementing all this fancy new stuff, browser vendors have been slacking on thinking through the security implications, and essentially the responsibility for safe browsing has been put on the user. (Remember the 90's advice of disabling cookies while browsing?). Browsers have become better over time, but one single XSS hole on any site can still have devastating effects for you and your users.

With the demand for web development and web developers still way higher than the supply, education is in a sorry state too. In job interviews I've conducted it's been rare to find a junior who knows of the concept XSS, and finding one who can explain the implications of CSRF, is, well, I've yet to meet one. I don't claim to be a security expert myself, but I feel everybody who's in the profession of web development, should at least be aware of the basic attack vectors and how to prevent them.

“If builders built buildings the way programmers wrote programs, then the first woodpecker that came along would destroy civilization.”

- Weinberg’s Second Law

</rant>

With the Site Security Policy we're given the ability to lock down certain types of behaviours. It allows us to disable javascript included from unknown domains (a whitelist approach), and HTTP requests initiated by external domains, essentially fixing the CSRF problem completely. Additionally, it defines a way for a browser to log attempts to violate the policy.

For the actual implementation details, I'd suggest just reading the spec. Even though it's still a proposal, it's good reading material.

So what's next?

The spec needs to be finished. While the current policy is distributed through HTTP headers, some people seem to prefer an external file, like how crossdomain.xml or robots.txt is implemented. The latter would have my vote, because the policy can then be easily cached, which can save some bytes in the end. It would also be easier for people to upload a policy file to a server where there's no scripting available, and it allows the policy to be enforced for a complete domain, instead of having to add it to each and every script.

And last but not least, browser vendors would need to implement it. Sterne works at Mozilla, so that's a good sign already. Personally, I can't wait.

Getting around "su : must be run from a terminal"

I killed the sshd daemon on one of our servers by accident today. I wanted to avoid going to the data center, and I was able to upload and run a PHP script to give me a shell.

Problem was, that it would run under the www-data user and trying to su to root gave me the following message:

    su : must be run from a terminal

After some googling, I found the solution from Tero's glob. If you have python installed, just run the following from your shell:

    echo "import pty; pty.spawn('/bin/bash')" > /tmp/asdf.py
    python /tmp/asdf.py

You now have a proper terminal, and things like 'su' will work as usual.

HTML Purifier rocks!


I had to create an RSS aggregator for my job, and I had to find (or create) a good tool that sanitizes the HTML that comes in. I stumbled upon HTML purifier, and I haven't seen a better tool for the job yet.

Some of the features:

  • It can turn the html into valid XHTML (transitional or strict)
  • So it also balances out tags.
  • Removes any code that could expose a security risk (tested with RSnake's XSS cheat sheet).
  • Allows you to truncate HTML (if you don't want to show an entire post) and still results in proper HTML!

So yeah, if you need something similar, I'd suggest you check it out.


About

My name is Evert, and I've been writing semi-regularly on this blog since 2006.
