Content Security Policy introduction

I blogged about Content Security Policy about 2 years ago, when it was still called 'Site Security Policy'. It started as a specification and an add-on, and turned into a patch a bit later. It has finally made it into Firefox 4 beta 1. I think CSP is the next web security revolution, so make yourself aware of how it works and what the implications are.

So what is it? The short version is that it's a very effective measure against cross-site scripting. By sending a policy through the 'X-Content-Security-Policy' header, you can specify exactly from which locations you accept javascript and other content. This allows you to block scripts from any domains unknown to you, and to block inline scripts altogether.

A simple example

  X-Content-Security-Policy: allow 'self'

A simple PHP example to see this in action:

  <?php

  header("X-Content-Security-Policy: allow 'self'");

  ?>
  <html>
    <head>
      <title>CSP test</title>
    </head>
    <body>

  <script type="text/javascript">

  alert('XSS!');

  </script>

    </body>
  </html>

If the above code is opened in Firefox 4.0 beta1, the script will not execute, and a warning is added to the "Error Console" (in the Tools menu).

Not only does this header block inline scripts, it also blocks the following:

  • eval(). This is important for people using eval() to parse JSON responses.
  • setTimeout and setInterval if the function is provided as a string.
  • javascript: urls
  • HTML event attributes (onclick, onload, etc.).
  • All images, plugin objects (flash, quicktime etc.), audio, video, html frames and fonts not served from the same domain as the html page.
  • XMLHttpRequest to domains other than the source domain.

Fortunately there are fine-grained controls over what you want to allow from which domains. Here is an example from the specification.

  X-Content-Security-Policy: allow 'self'; img-src *; \
                             object-src media1.com media2.com *.cdn.com; \
                             script-src trustedscripts.example.com

This example starts with "allow 'self'", which by default allows only content from the same domain. The "img-src *" rule allows images from any domain. "object-src media1.com media2.com *.cdn.com" allows <object> tags to use files from media1.com, media2.com and any subdomain of cdn.com, and "script-src trustedscripts.example.com" restricts scripts to that one host. To learn more about these, I would recommend just taking a good look at the directives list in the specification.
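If you keep your policy in code, it may be easier to maintain as an array than as one long string. The following is a minimal sketch of that idea; buildCspPolicy() is a hypothetical helper of my own, not part of PHP or of any CSP library.

```php
<?php

/**
 * Builds a policy string from a map of directive => list of sources.
 * Hypothetical helper, only meant to illustrate the header format.
 */
function buildCspPolicy(array $directives): string
{
    $parts = [];
    foreach ($directives as $name => $sources) {
        $parts[] = $name . ' ' . implode(' ', $sources);
    }
    return implode('; ', $parts);
}

$policy = buildCspPolicy([
    'allow'      => ["'self'"],
    'img-src'    => ['*'],
    'object-src' => ['media1.com', 'media2.com', '*.cdn.com'],
    'script-src' => ['trustedscripts.example.com'],
]);

// header() only works before any output has been sent.
if (!headers_sent()) {
    header('X-Content-Security-Policy: ' . $policy);
}
```

This produces the same policy as the example above, on a single line.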

Options and reporting

Using the 'options' directive it's possible to relax specific restrictions again. Valid values for options are 'eval-script' and 'inline-script'.

  X-Content-Security-Policy: allow 'self'; options inline-script, eval-script

The preceding example allows inline scripts (using html event attributes, or the script tag) as well as the 'eval()' function. In general I would try to avoid this though.

When a security rule is violated, it's possible to get the browser to send a report back to the server. For example, if an image is referenced from a blocked domain, the browser can send a simple report to a url you specify.

  X-Content-Security-Policy: allow 'self'; report-uri http://example.org/cspreport.php

This allows you to detect any problems with your policy, or successful attempts by your evil users to inject code. An example of such a report is the following:

  {
    "csp-report":
      {
        "request": "GET http://index.html HTTP/1.1",
        "request-headers": "Host: example.com
                           User-Agent: Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.3a5pre) Gecko/20100601 Minefield/3.7a5pre
                           Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
                           Accept-Language: en-us,en;q=0.5
                           Accept-Encoding: gzip,deflate
                           Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
                           Keep-Alive: 115
                           Connection: keep-alive",
        "blocked-uri": "http://evil.com/some_image.png",
        "violated-directive": "img-src 'self'",
        "original-policy": "allow 'none'; img-src *, allow 'self'; img-src 'self'"
      }
  }
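On the receiving end, a script such as cspreport.php could pick the interesting fields out of the report and log them. This is only a rough sketch: parseCspReport() is a name I made up, and I'm assuming the browser sends the report as a JSON request body shaped like the example above.

```php
<?php

/**
 * Extracts the interesting fields from a raw report body.
 * Returns null when the body is not JSON with a "csp-report" object.
 */
function parseCspReport(string $body): ?array
{
    $data = json_decode($body, true);
    if (!is_array($data) || !isset($data['csp-report']) || !is_array($data['csp-report'])) {
        return null;
    }
    $report = $data['csp-report'];
    return [
        'blocked-uri'        => $report['blocked-uri'] ?? null,
        'violated-directive' => $report['violated-directive'] ?? null,
    ];
}

// In a real endpoint the body would come from the request:
// $report = parseCspReport(file_get_contents('php://input'));
$report = parseCspReport(
    '{"csp-report": {"blocked-uri": "http://evil.com/some_image.png", "violated-directive": "img-src \'self\'"}}'
);

if ($report !== null) {
    error_log('CSP violation: ' . $report['violated-directive'] . ' blocked ' . $report['blocked-uri']);
}
```

Keep in mind that anyone can post to this url, so treat the report contents as untrusted input.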

Final notes

Using CSP does not mean you can go easy on other security measures. At the moment only a very limited number of users will have support for CSP, so everybody else still needs to be protected. However, it's still a great idea to implement. Your Firefox users will automatically be better protected, and because of the reporting functionality they automatically help you detect holes, which benefits everybody.

My guess is that CSP is going to be very important, and is here to stay. There are two things you can do to prepare for the future:

  1. Figure out your policy. It's a good idea for your web application to know where its resources are coming from anyway. Advertisers in particular tend to be bad at this, using many different domains and scripts that load other scripts.
  2. Try to avoid any inline scripting, html event handlers and eval(). They are all avoidable, and in my opinion it is a good idea to keep your javascript out of html anyway. This is a big one, because both inline scripts and html events are still very popular. With the popularity of libraries such as jQuery, I do think it will be easier to just grab most of the inline scripts and move them to an external script.

New job at IBuildings


For 2 weeks now I've been employed by IBuildings. First I'll spend a couple of weeks working from their office in Vlissingen, and then, if all goes well, I'll move to Utrecht.

IBuildings is actually a company I've been wanting to work for for a while, so I'm pretty happy. So far it's a bit of an adjustment to work regular hours again, but I'm having fun. It's good to be working in an office again. Working from home can definitely get to you after a while. Having lots of talented people around is a big plus.

And: if you know a good place to live in Utrecht, drop me a line! I'm looking to rent a place not too far from downtown :)

Storing encrypted session information in a cookie


Our session system is due for an upgrade. Currently all PHP sessions are stored in the database, and some things are getting a bit slow. There have been a couple of approaches I've been considering, one of which is simply storing all the information in a browser cookie.

First I want to make clear I don't necessarily condone this. The reason I'm writing this post is that I'm hoping for some more community feedback. Is this a really bad idea? I would love to know.

The benefits

If all the session data is stored in the browser, it means that I don't need to store it on the server. I actually don't care all that much for having the data on the server (unless it's the only secure way), it's mostly a gigantic map with session tokens and user id's (along with some other info).

I also feel it's more natural for HTTP, as it makes it a bit more stateless.

Sample code

  <?php

  class BrowserSession {

      public $secret = 'this will need to be a cryptographic random number';
      public $currentUser = null;

      // Sessions time out after 10 minutes
      public $timeout = 600;

      function init() {

          if (!isset($_COOKIE['MYSESSION'])) {
              echo "No session cookie found\n";
              return;
          }

          // Expect exactly three parts: user id, timestamp and signature
          $parts = explode(':', $_COOKIE['MYSESSION']);
          if (count($parts) !== 3) {
              echo "The session cookie is malformed\n";
              return;
          }
          list($userId, $time, $signature) = $parts;

          // The cookie is old
          if (time() > (int)$time + $this->timeout) {
              echo "The session cookie timed out\n";
              return;
          }

          // hash_equals (PHP 5.6+) compares in constant time
          if (!hash_equals($this->generateSignature($userId, $time), $signature)) {
              echo "The signature was incorrect\n";
              return;
          }

          $this->currentUser = $userId;

          echo "Logged in as user: " . htmlspecialchars($userId) . "\n";

      }

      function login($userId) {

          $this->currentUser = $userId;

          // Use one timestamp for both the cookie value and the signature
          $time = time();

          $cookie = $userId . ':' . $time . ':' . $this->generateSignature($userId, $time);

          // The last argument makes the cookie HttpOnly
          setcookie('MYSESSION', $cookie, $time + $this->timeout, '', '', false, true);

          echo "Set cookie: " . htmlspecialchars($cookie) . "\n";

      }

      function generateSignature($userId, $time) {

          $stringToSign =
             $userId . "\n" .
             $time . "\n" .
             $_SERVER['HTTP_USER_AGENT'] . "\n" .
             $_SERVER['REMOTE_ADDR'];

          return hash_hmac('SHA1', $stringToSign, $this->secret);

      }

  }

  ob_start();
  $session = new BrowserSession();
  $session->init();

  if (isset($_GET['login'])) $session->login($_GET['login']);
  else {

      echo '<br /><a href="?login=1234">Log in as user 1234</a>';

  }
  ?>

A few notes:

  • The preceding code was just intended as a proof of concept; it's missing some validation.
  • Currently the secret would be the same for every user. I was thinking of appending some per-user information to the secret. If somebody does guess or bruteforce the secret, they would only have access to a single user's information.
  • If a user changes their password, existing sessions should expire. To do this the signature should also include a sequence number that changes when the password changes.
  • Currently this only stores a user id. It could be extended to contain more data, but this is all I need.
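The second and third notes could be combined along these lines. This is just a sketch under my own assumptions: signSession() and $passwordVersion are hypothetical names, the counter would have to be stored with the user record and bumped on every password change, and I've swapped SHA1 for SHA-256 here.

```php
<?php

/**
 * Signs a session for one user. $passwordVersion is a hypothetical
 * counter that is incremented whenever the password changes, so
 * cookies signed with an older value stop validating.
 */
function signSession(string $userId, int $time, int $passwordVersion, string $secret): string
{
    // Derive a per-user key from the shared secret and the user id.
    $userKey = hash_hmac('sha256', $userId, $secret, true);

    $stringToSign = $userId . "\n" . $time . "\n" . $passwordVersion;

    return hash_hmac('sha256', $stringToSign, $userKey);
}

$secret = 'replace with a long random value';
$before = signSession('1234', 1280000000, 1, $secret);
$after  = signSession('1234', 1280000000, 2, $secret);

// A cookie signed before the password change no longer matches.
var_dump(hash_equals($before, $after)); // bool(false)
```

The downside is that validating a cookie now needs the current counter for that user, so it's not entirely free of server-side lookups anymore.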

So, is there anything fundamentally wrong with this approach? In general the client should never be trusted, but for setups where the security requirements aren't as high (highly subjective, I know) I feel this might be strong enough. OAuth, OpenID and Amazon AWS all seem to trust HMAC+SHA1, but those applications do work differently.

Credit where it's due

I first asked this question on Stack Overflow. The users there already gave some great suggestions and pointed out some of the flaws. Thank you!

What happened to HTTP authentication?

Rant warning

We enter our usernames and passwords on pretty much all the sites we commonly visit. Authentication is probably one of the first things you're taught when starting to work with PHP. For some reason, in 99% of the cases this is done through an HTML form, with the username and password submitted as a urlencoded string.

You probably know that HTTP also has native authentication, in the form of Basic and Digest authentication (read my older article if you want to know how). Every browser and pretty much any HTTP client supports it too. There are some big benefits to that, because it provides a very standardized mechanism to authenticate a client, whether it's a machine or a human.
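To give an idea of how little is involved on the server side, here's a minimal sketch for the Basic scheme, where the client sends base64("username:password") in the Authorization header. parseBasicAuth() is a hypothetical helper of mine; Digest takes more work and isn't shown.

```php
<?php

/**
 * Parses the value of an "Authorization: Basic ..." header.
 * Returns [username, password], or null if it is not valid Basic auth.
 */
function parseBasicAuth(string $headerValue): ?array
{
    if (stripos($headerValue, 'Basic ') !== 0) {
        return null;
    }
    $decoded = base64_decode(substr($headerValue, 6), true);
    if ($decoded === false || strpos($decoded, ':') === false) {
        return null;
    }
    // The password may itself contain colons, so split only once.
    return explode(':', $decoded, 2);
}

$credentials = parseBasicAuth('Basic ' . base64_encode('evert:open sesame'));
```

When no valid credentials come in, the server answers with a 401 status and a WWW-Authenticate header, which is what makes the browser show its login dialog.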

What baffles me is that HTTP authentication hasn't been developed further. HTTP Digest is pretty secure by itself, and has some nice features (hashed password, protection against man in the middle and replay attacks, message digests) which are far more advanced than what an HTML POST form with a session cookie can provide.

What's missing?

  1. There's no way for a user to see if they are authenticated to a site. Perhaps a username in the addressbar?
  2. Pretty much everybody always wonders how they can code a logout mechanism. Because there are no session cookies that can be destroyed, there are some hacks that trick the browser into asking for credentials again. There should be no need for the server to provide this functionality. The browser knows it's logged in, and HTTP applications are stateless. We need an in-browser log-out button.
  3. Less important: some javascript hooks that allow developers to still use html forms to set up HTTP authentication.

Mozilla is doing some interesting things with their Account Manager add-on for Firefox, but even that add-on does not support HTTP authentication. With Account Manager they are jumping through some hoops with javascript hooks so it works with regular authentication systems, but you'd think that if HTTP authentication was used, things could be a lot more straightforward. The browser knows exactly who is logged in.

So, does anyone know how this happened? Is there a major flaw in HTTP authentication I'm just missing?

Guidelines for generating XML

Over the last little while I've come across quite a few XML feed generators written in PHP, with varying degrees of 'correctness'. Even though generating XML should be very simple, there are still quite a few pitfalls I feel every PHP or (insert your language) developer should know about.

1. You are better off using an XML library

This is the first and foremost rule. Most people end up generating their xml using simple string concatenation, while there are many dedicated tools out there that really help you generate your own XML.

In PHP land the best example is XMLWriter. It is actually quite easy to use:

  <?php

  $xmlWriter = new XMLWriter();
  $xmlWriter->openMemory();
  $xmlWriter->startDocument('1.0','UTF-8');
  $xmlWriter->startElement('root');
  $xmlWriter->text('Contents of the root tag');
  $xmlWriter->endElement(); // root
  $xmlWriter->endDocument();
  echo $xmlWriter->outputMemory();

  ?>

Granted, XMLWriter is verbose, but you have to worry a lot less about escaping and validating your xml documents.

2. Understand Unicode

Do you know the difference between a byte, a character and a codepoint? If you don't, I'd probably think twice about hiring you. It's absolutely shocking how many programmers are out there that don't understand the basics of unicode, UTF-8 and how it relates to the web.

An often-heard excuse is that people in English-speaking countries don't have to care about non-ASCII characters. However, if you need to use the euro sign (€) or if you deal with people copy-pasting from Word documents, you most definitely will come across problems.

A simple call to utf8_encode is not actually enough. If some of your source-data was already encoded as UTF-8 you will end up losing data. Only use utf8_encode if you know your source-data is encoded as ISO-8859-1.
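If you really can't be sure what comes in, a check before converting at least avoids encoding UTF-8 data a second time. Here's a rough sketch, assuming the mbstring extension is available and that anything which isn't valid UTF-8 is ISO-8859-1; toUtf8() is a made-up name.

```php
<?php

/**
 * Returns $input as UTF-8. Input that is already valid UTF-8 is left
 * untouched; anything else is ASSUMED to be ISO-8859-1.
 */
function toUtf8(string $input): string
{
    if (mb_check_encoding($input, 'UTF-8')) {
        return $input;
    }
    return mb_convert_encoding($input, 'UTF-8', 'ISO-8859-1');
}

$latin1 = "caf\xE9";     // "café" in ISO-8859-1
$utf8   = "caf\xC3\xA9"; // "café" in UTF-8

var_dump(toUtf8($latin1) === $utf8); // bool(true)
var_dump(toUtf8($utf8) === $utf8);   // bool(true)
```

This is a heuristic and not a fix: some ISO-8859-1 byte sequences happen to be valid UTF-8 too, so knowing the encoding of each source is still the only reliable approach.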

The one true way to go about it is to make sure that every step of the way in your web application is UTF-8, including your HTTP/HTML content type, your MySQL database and anything that ingests data for your application (email, CSV importers, XML readers, web services). Once you are absolutely sure every part of your application is UTF-8, and you have converted any old data, things will start to behave correctly.

3. CDATA is never a solution

It might be tempting to solve any encoding issues by simply surrounding it with <![CDATA[ and ]]>. This might make sure that XML parsers don't throw an error when reading, but they still have 'incorrect' characters. If your XML document has CDATA tags, or you think you need CDATA, you are probably wrong.

More often than not, using CDATA actually stems from encoding problems (see section 2). CDATA is not a method to encode binary data; xml parsers will still throw errors if they come across certain byte sequences. If you really do need to include binary data in XML, the best way is to use something like base64_encode instead.
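As a small illustration of the base64 route, using XMLWriter again. The 'attachment' element and its 'encoding' attribute are names I picked for the example, not any kind of standard.

```php
<?php

$binary = "\x00\x01\x02\xFF"; // bytes that can't appear as text in XML

$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
$xmlWriter->startDocument('1.0', 'UTF-8');
$xmlWriter->startElement('attachment');
// Tell the consumer how to get the original bytes back.
$xmlWriter->writeAttribute('encoding', 'base64');
$xmlWriter->text(base64_encode($binary));
$xmlWriter->endElement(); // attachment
$xmlWriter->endDocument();

$xml = $xmlWriter->outputMemory();
echo $xml;
```

The consumer simply runs the element contents through base64_decode (or its equivalent) after parsing.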

If your XML feed uses CDATA because of encoding issues you actually defer your problem to the consumer of your XML feed. So instead of seeing 'weird characters' on your side, the person that reads your xml feed now has no good way to detect which encoding was actually used. If it's for example an RSS feed you're generating, this can result in RSS readers throwing errors, or characters showing up incorrectly.

4. Be liberal with whitespace

An error like "unexpected character at line 1, column 176456" is much harder to debug than "line 5078, column 24". Whitespace between xml tags usually does not have any significance, so you can add as much indentation and as many linebreaks (\n) as you want. Note that tools such as XMLWriter can indent for you once you turn it on (setIndent).
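With XMLWriter, indentation is off by default as far as I know, so it needs to be switched on after opening the writer. A short sketch (the element names are just examples):

```php
<?php

$xmlWriter = new XMLWriter();
$xmlWriter->openMemory();
$xmlWriter->setIndent(true);       // one element per line
$xmlWriter->setIndentString('  '); // two spaces per nesting level
$xmlWriter->startDocument('1.0', 'UTF-8');
$xmlWriter->startElement('orders');
$xmlWriter->writeElement('order-number', '5078');
$xmlWriter->endElement(); // orders
$xmlWriter->endDocument();

$xml = $xmlWriter->outputMemory();
echo $xml;
```

Every element now ends up on its own line, which makes parser errors a lot easier to locate.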

5. Be verbose

Even though you might easily figure out that <ORD_NR> means 'order number', there's no reason why you shouldn't actually state it as <order-number>. Note that the following rules appear to be favored by most people:

  • Use lowercase for tags and attribute names.
  • Use dashes (-) to separate words, not underscores (_).
  • Minimize the use of attributes, nested tags allow more flexibility.

6. Be careful with entities

The only entities predefined in XML are &lt; (<), &gt; (>), &amp; (&), &quot; (") and &apos; ('), so any other named entity will simply not work and will throw errors.

HTML DTD's add many entities, so if you're mostly used to using HTML you might expect other entities to work. If your source-data already has entities, you might have to get rid of these first.

In PHP it means you should use htmlspecialchars, instead of htmlentities.
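To make the difference visible, here's a small comparison. I'm passing ENT_XML1 to htmlspecialchars, which is available in PHP 5.4 and up and makes it emit &apos; for single quotes.

```php
<?php

$source = "Tom & Jerry's <caf\xC3\xA9>"; // "café" written as UTF-8 bytes

// htmlentities() also converts characters that have an HTML named
// entity; &eacute; is not defined in plain XML.
$forHtmlOnly = htmlentities($source, ENT_QUOTES, 'UTF-8');

// htmlspecialchars() only touches & < > " and ' and leaves the rest alone.
$forXml = htmlspecialchars($source, ENT_XML1 | ENT_QUOTES, 'UTF-8');

echo $forHtmlOnly, "\n";
echo $forXml, "\n";
```

The first result contains &eacute;, which an XML parser will reject unless a DTD defines it. The second keeps the é as a regular UTF-8 character.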

Feel free to discuss, disagree, or add on to this list in the comments, I'm happy to hear your experiences.

Blogging for 4 years

I've been blogging for 4 years now, so it's time for some reflection.

This year I've been able to crank out a mere 40 blog posts. Compared to 50 the year before, and 54 before that, it appears there's a year-over-year decline. One can only hope that the quality went up to make up for it.

Some crazy changes have happened in my life: I ended up moving from Toronto, Canada to the Netherlands, and I'm now in Daegu, South Korea. I switched from a full-time job to freelancing and from a crappy old custom blog to Habari. All good choices so far =).

Thanks for reading and commenting!

SabreDAV 1.2 released (with CalDAV support)

It's taken almost 12 months, but I finally finished a CalDAV plugin for SabreDAV. I've stayed within the standard as much as possible, but had to leave out some features that failed to meet the cost/benefit requirement.

Most importantly, there's solid support for Apple iCal, Evolution, Lightning/Sunbird, and the iPhone.

It all uses PDO, and it's tested on both SQLite3 and MySQL.

SabreDAV is primarily intended as a toolkit to implement these protocols in different applications. Despite that, it should be reasonably easy to set up your own CalDAV server. Head over to the instructions to figure out how.

Other changes and additions

  • CalDAV (RFC4791).
  • PDO backends for Locks, Authentication and Calendars.
  • 95% unittesting code coverage. 415 unittests. There's actually more unittesting code now than 'normal' code.
  • ACL (RFC3744) principals. Note that privileges are not yet implemented.
  • Support for Extended MKCOL (RFC5689).
  • Support for current-user-principal (RFC5397).
  • Now throwing an error if you're using Finder on an unsupported server (nginx, apache + fcgi, lighttpd).
  • Support for If-Range, If-Modified-Since, If-Unmodified-Since, If-Match and If-None-Match.
  • There are now 2 distributions: 1 unified zip with all the features, as well as 4 separate pear packages (Sabre, Sabre_HTTP, Sabre_DAV, Sabre_CalDAV).

If you're upgrading from 1.0, some changes have been made. Take a look at the migration guide for more information.

Download.

Future plans

The next big thing will be CardDAV. It won't take nearly as long as CalDAV support, as there are a lot of similarities. In general I feel I should spend a bit less time on this. I've been spending a large portion of my time in developing SabreDAV into a mature project, which can be hard to justify if it's not a source of income. I need to eat, after all.

I'm still enjoying it very much though and the best way to keep me motivated is to let me know you're using it or by requesting a new feature =).

When to escape your data

Two examples of escaping data are the following:

The question I'd like to ask today is, when to do this? There are two possible moments:

  1. Right when the data comes in. For SQL this used to be done with 'magic quotes' quite a bit in PHP-land. In general I don't see this happening a lot anymore for SQL. I do however see data encoded using htmlentities/htmlspecialchars before entering the database.
  2. The other way to go about it, is to only escape when you know how you're going to use it. For example, only call htmlspecialchars right before you echo() your data into your document.

I would personally argue that #2 is the best way to go about things. The first reason is that you don't know exactly how your data might be used in the future. If you pre-encoded everything using htmlentities, but at some point in the future you need the data to be used in an XML feed, you're going to be in trouble. The reason for this is that the only entities predefined in XML are &amp;, &lt;, &gt;, &quot; and &apos;. If you are going to need to output to CSV, very different rules apply. Other examples are: escaping for urls, escaping for command-line arguments, escaping for javascript and escaping for mime-headers.

In the illustrated example, this is no big disaster. A workaround would be to call htmlspecialchars_decode() or html_entity_decode() first, and then escape for your desired output. A worse case is filtering. If you have been stripping out all, or some, html tags before saving to the database, and later on you decide you wanted to show some of them anyway, that data is now lost.

Conclusion

So my argument is to store raw data. Only encode once you know where you're going to need it. If you're worried about the overhead of escaping right before output in an html page, cache the output.
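In practice that could look something like the following sketch: one raw value, escaped differently depending on where it ends up. The variable names and the search.php url are only there for illustration.

```php
<?php

$raw = 'Tom & "Jerry" <b>'; // stored exactly as it was entered

// Inside an HTML page
$forHtml = htmlspecialchars($raw, ENT_QUOTES, 'UTF-8');

// As a query string parameter
$forUrl = 'search.php?q=' . rawurlencode($raw);

// As a string literal inside a <script> block
$forJs = json_encode($raw, JSON_HEX_TAG | JSON_HEX_AMP);

echo $forHtml, "\n", $forUrl, "\n", $forJs, "\n";
```

Had the value been stored with htmlentities applied, the url and javascript versions would both have ended up carrying literal &amp; and &quot; sequences.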

Whichever route you go, make sure this is clearly documented. There are 2 ways this can go wrong:

  1. Escaping is done on input and output. Now you see literal &amp;'s in your html, or quotes prepended by slashes. (\'hello\').
  2. Escaping is forgotten at both ends. Now you might be vulnerable to SQL injection attacks, XSS attacks or data corruption.

What do you think? I'm especially interested in the other side of the argument.

Goodbye old Firefox profile

My firefox was getting a bit sluggish, which sucks.. because it's still my favourite browser. Today was just a disaster. The entire browser would just hang for seconds at a time, slow startup times.. etc.

I realized that firefox was running on a profile I made 4 years ago. It has seen Firefox 2.0 up until 3.6, as well as some beta and minefield builds. Dozens of add-ons have been installed, updated, removed and installed again and it has been migrated to 3 different computers. This couldn't be good for performance. Who knows what kind of settings, files and data are left hanging here and there.

So I went ahead today and created a new firefox profile. This is simply done by starting firefox with the --profilemanager argument.

I needed my bookmarks, history and saved passwords though. To get these 3 items, copy the following files from your old, to your new profile (while firefox is not running!).

  • places.sqlite - Contains history and bookmarks.
  • signons.sqlite - Contains all your saved passwords.
  • key3.db - Contains the encryption keys to decode your passwords.
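If you'd rather script the copy, something like this sketch would do it. copyProfileFiles() is a hypothetical helper and the paths in the usage comment are placeholders; your actual profile directories will be named differently.

```php
<?php

/**
 * Copies the listed files from one profile directory to another.
 * Returns the names of the files that were actually copied.
 */
function copyProfileFiles(string $oldProfile, string $newProfile, array $files): array
{
    $copied = [];
    foreach ($files as $file) {
        $source = $oldProfile . '/' . $file;
        // Skip files that don't exist in the old profile.
        if (is_file($source) && copy($source, $newProfile . '/' . $file)) {
            $copied[] = $file;
        }
    }
    return $copied;
}

// Usage (placeholder paths, and firefox must not be running):
// copyProfileFiles('/path/to/old-profile', '/path/to/new-profile',
//     ['places.sqlite', 'key3.db' /* plus the saved passwords file */]);
```

The return value makes it easy to spot a file that was missing from the old profile.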

After that, it was just a matter of installing Firebug, Firecookie and noscript again.

Right now the browser feels like new again. If you're seeing similar problems, I can highly recommend it.. but I can't be held responsible for losing any of your valuable data.. make backups! Your new profile will also have none of the customizations you've ever made.

Some links

Dropbox client library for PHP

I enjoy using Dropbox. It is a very easy syncing and backup tool, and it works everywhere. A few days ago the developer API was released. After a bit of wrestling with oauth, I completed a client library for PHP, and open sourced it (MIT licensed).

If you want to give it a shot, you first need to sign up for the developer program and get yourself your security tokens. Once that's done, you can install the library using:

  pear channel-discover pear.sabredav.org
  pear install sabredav/Dropbox-alpha

If that worked, you should be able to start using the API. The following example displays account information and uploads a file.

  <?php

  /* Please supply your own consumer key and consumer secret */
  $consumerKey = '';
  $consumerSecret = '';

  include 'Dropbox/autoload.php';

  session_start();

  $dropbox = new Dropbox_API($consumerKey, $consumerSecret);

  /* Display account information */
  var_dump($dropbox->getAccountInfo());

  /* Upload itself */
  $dropbox->putFile(basename(__FILE__), __FILE__);

The script needs to be run in a browser, because you will be redirected to Dropbox to authorize access.

I hope people like the library, and if you have any suggestions, feel free to let me know. If you want to contribute, you can head over to the project site on google code.

If you haven't used Dropbox yet and want to try it, consider signing up through this link. If you do so, both you and my girlfriend get an extra 250MB space for free (and she really needs it).


About

My name is Evert, and I've been writing semi-regularly on this blog since 2006.

I'm currently available for contract work.

more info.
