Escaping user input in your HTML is essential for preventing the world's #1 vulnerability, cross-site scripting (XSS).
When you're embedding user input into JavaScript, a simple htmlspecialchars won't cut it. You'll also need to escape other things, like \n (line endings) and \ (backslashes). Google Doctype has a good list of characters in need of proper escaping to prevent users from breaking your JavaScript.
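To see why htmlspecialchars alone falls short here, a minimal sketch: the variable names are only illustrative, and the input is a made-up example containing a line ending and a backslash.

```php
<?php
// Illustrative input: contains a newline and ends with a single backslash.
$userInput = "line one\nline two \\";

// htmlspecialchars only rewrites &, <, > and quotes (with ENT_QUOTES).
$escaped = htmlspecialchars($userInput, ENT_QUOTES, 'UTF-8');

// Both characters survive untouched, so either one can still break
// a JavaScript string literal the value is placed into.
var_dump(strpos($escaped, "\n") !== false); // bool(true)
var_dump(strpos($escaped, "\\") !== false); // bool(true)
```

A raw newline ends the string literal early with a syntax error, and a trailing backslash escapes the closing quote.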
However, when I asked whether a simple string replacement would be good enough, the members of the Web Security mailing list gave me a different answer.
When escaping or filtering output using a blacklist (such as the one published on Google Doctype), browser and Unicode escaping bugs are not taken into consideration. Some new vulnerability might appear in the future, which would immediately open a hole in your app. For this reason it's wiser to go with a much more defensive whitelist approach, essentially only letting through the things you know are safe.
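A whitelist escaper for a JavaScript string could look roughly like the sketch below. The function name jsStringWhitelist and the exact set of allowed characters are my own assumptions for illustration, and it needs PHP 7.2+ with mbstring for mb_ord.

```php
<?php
// Hypothetical helper for illustration only: escape a UTF-8 string for use
// as a JavaScript string literal, letting through only whitelisted characters.
function jsStringWhitelist(string $input): string
{
    // Split into Unicode characters; preg_split returns false on invalid UTF-8.
    $chars = preg_split('//u', $input, -1, PREG_SPLIT_NO_EMPTY);
    if ($chars === false) {
        throw new InvalidArgumentException('Input is not valid UTF-8');
    }

    $out = '';
    foreach ($chars as $char) {
        // Assumed whitelist: ASCII letters, digits, space, comma and period.
        if (preg_match('/^[A-Za-z0-9 ,.]$/', $char)) {
            $out .= $char;
            continue;
        }
        // Everything else is replaced by a numeric escape sequence.
        $codePoint = mb_ord($char, 'UTF-8');
        if ($codePoint < 0x100) {
            $out .= sprintf('\\x%02X', $codePoint);
        } elseif ($codePoint < 0x10000) {
            $out .= sprintf('\\u%04X', $codePoint);
        } else {
            // Characters outside the BMP become a UTF-16 surrogate pair.
            $codePoint -= 0x10000;
            $out .= sprintf(
                '\\u%04X\\u%04X',
                0xD800 + ($codePoint >> 10),
                0xDC00 + ($codePoint & 0x3FF)
            );
        }
    }

    // Wrap the result in single quotes so it is a complete literal.
    return "'" . $out . "'";
}

echo jsStringWhitelist("</script>\n"), "\n"; // prints '\x3C\x2Fscript\x3E\x0A'
```

Because unknown characters are escaped by default, a browser quirk involving a character nobody thought to blacklist has nothing to work with.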
Introducing Reform
Reform is a tool that does exactly this. Reform allows you to escape your data for a JavaScript, XML, HTML or VBScript (yes, it still exists) context. It provides libraries for Java, .NET, PHP, Perl, Python, JavaScript and ASP. Pretty cool!
One dislike I have is that it only considers a really small set of Unicode code points safe. Especially when dealing with non-Latin languages, this is going to add a great deal to your bandwidth usage and hurt the legibility of your source code. One would think there have to be more ranges that could be considered 'safe'.
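To put a rough number on that overhead, here is a small back-of-the-envelope sketch. It assumes escaped characters are written as \xHH (below code point 256) or \uHHHH, which is an assumption on my part about the output format, and it needs PHP 7.1+ for the "\u{...}" syntax and array destructuring.

```php
<?php
// Each entry: [raw UTF-8 character, its assumed escaped form].
$samples = [
    'e-acute'  => ["\u{E9}", '\\xE9'],     // 2 bytes raw in UTF-8
    'CJK U+65E5' => ["\u{65E5}", '\\u65E5'], // 3 bytes raw in UTF-8
];

foreach ($samples as $label => [$raw, $escaped]) {
    printf(
        "%s: %d bytes raw, %d bytes escaped\n",
        $label,
        strlen($raw),
        strlen($escaped)
    );
}
// e-acute: 2 bytes raw, 4 bytes escaped
// CJK U+65E5: 3 bytes raw, 6 bytes escaped
```

Under these assumptions, text made up entirely of such characters roughly doubles in size once escaped.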
PHP example:
<?php
// Assuming the Reform class is included..

echo '<script type="text/javascript"> var myString = ', Reform::JsString($userInput), '; </script>';

?>
I made a couple of changes in the PHP version, specifically:
- Prepended the 'static' keyword to every method to make it work in PHP5's strict mode.
- Removed the UTF-8 checks: I'm in a controlled environment where mbstring is installed and the internal encoding is UTF-8.
- Added a parameter to Reform::JsString to not automatically put the string between quotes (').
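To give an idea of what the first and last changes amount to, here is a stand-in sketch. ReformSketch, the $addQuotes parameter name and the escaping rule inside it are all assumptions for illustration; this is not Reform's actual code. It relies on mbstring for mb_convert_encoding.

```php
<?php
// Hypothetical stand-in class, not the real Reform implementation.
class ReformSketch
{
    // Declared static so ReformSketch::JsString() can be called without
    // an instance and without strict-mode notices.
    // $addQuotes (assumed name) controls whether the result is wrapped in '.
    public static function JsString($input, $addQuotes = true)
    {
        // Assumed rule: escape everything outside a small whitelist as \uHHHH.
        $escaped = preg_replace_callback(
            '/[^A-Za-z0-9 ,.]/u',
            function ($match) {
                // Going through UTF-16BE yields one 4-hex-digit unit per
                // BMP character and a surrogate pair for anything above it.
                $utf16 = mb_convert_encoding($match[0], 'UTF-16BE', 'UTF-8');
                $out = '';
                foreach (str_split(bin2hex($utf16), 4) as $unit) {
                    $out .= '\\u' . strtoupper($unit);
                }
                return $out;
            },
            (string) $input
        );

        // preg_replace_callback returns null when the input is not valid UTF-8.
        if ($escaped === null) {
            throw new InvalidArgumentException('Input is not valid UTF-8');
        }

        return $addQuotes ? "'" . $escaped . "'" : $escaped;
    }
}

echo 'var a = ', ReformSketch::JsString("it's"), ";\n";
// var a = 'it\u0027s';
echo "var b = 'prefix ", ReformSketch::JsString("it's", false), "';\n";
// var b = 'prefix it\u0027s';
```

Leaving the quotes off is handy when the escaped value is only one part of a longer string literal, as in the second call.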
