Rotating an image, retaining the original size

UPDATE: Ignore this post; it ranks number one in uselessness on my blog so far. imagerotate() already does what I tried to achieve. I didn't try it out, I just relied on a confusing line in the PHP manual. For details, read Pierre's response on this blog.

Here's a small code snippet that rotates an image while retaining its dimensions (for 90 and 270 degree rotations, width becomes height and height becomes width).

The image argument should be an image resource, created by for example imagecreatefromjpeg() or imagecreatetruecolor().

The rotation argument expects 0, 1, 2 or 3, for a rotation of 0, 90, 180 or 270 degrees. Unlike PHP's imagerotate() function, this is clockwise.

    <?php

    // Rotates $image clockwise in steps of 90 degrees.
    // $image is passed by reference and replaced by the rotated image.
    function imageSmartRotate(&$image, $rotation) {

        $width = imagesx($image);
        $height = imagesy($image);

        switch ($rotation) {

            case 0 : return true;
            case 2 : $image = imagerotate($image, 180, 0); return true;
            case 1 :
            case 3 :
                // checking the biggest coordinate (width or height)
                $maxXY = $width > $height ? $width : $height;

                // create our new (square) canvas
                $rImg = imagecreatetruecolor($maxXY, $maxXY);

                // copy our image into the top left corner
                imagecopy($rImg, $image, 0, 0, 0, 0, $width, $height);

                // rotate it (this function is counter clockwise (and counter intuitive))
                $rImg = imagerotate($rImg, 360 - ($rotation * 90), 0);

                // swap width/height
                $h = $width;
                $width = $height;
                $height = $h;

                // create the final canvas, with the proper dimensions
                $image = imagecreatetruecolor($width, $height);

                // if rotated 90 degrees the image will be in the top right corner
                // if rotated 270 degrees the image will be in the bottom left corner
                if ($rotation == 1) {
                    imagecopy($image, $rImg, 0, 0, $maxXY - $width, 0, $width, $height);
                } else {
                    imagecopy($image, $rImg, 0, 0, 0, $maxXY - $height, $width, $height);
                }
                return true;

            default :
                return false;

        }

    }

    ?>

This method does take a bit more memory than usual, because it needs to create 2 additional images to complete. Are there more efficient ways to do this?

Caching in PHP using the filesystem, APC and Memcached

Caching is very important and really pays off in big internet applications. When you cache the data you're fetching from the database, in a lot of cases the load on your servers can be reduced enormously.

One way of caching is simply storing the results of your database queries in files. Opening a file and unserializing its contents is often a lot faster than doing an expensive SELECT query with multiple joins.

Here's a simple file-based caching engine.

    <?php

    // Our class
    class FileCache {

        // This is the function you store information with
        function store($key,$data,$ttl) {

            // Opening the file
            $h = fopen($this->getFileName($key),'w');
            if (!$h) throw new Exception('Could not write to cache');
            // Serializing along with the TTL
            $data = serialize(array(time()+$ttl,$data));
            if (fwrite($h,$data)===false) {
                throw new Exception('Could not write to cache');
            }
            fclose($h);

        }

        // General function to find the filename for a certain key
        private function getFileName($key) {

            return '/tmp/s_cache' . md5($key);

        }

        // The function to fetch data returns false on failure
        function fetch($key) {

            $filename = $this->getFileName($key);
            if (!file_exists($filename) || !is_readable($filename)) return false;

            $data = file_get_contents($filename);

            $data = @unserialize($data);
            if (!$data) {

                // Unlinking the file when unserializing failed
                unlink($filename);
                return false;

            }

            // checking if the data was expired
            if (time() > $data[0]) {

                // Unlinking
                unlink($filename);
                return false;

            }
            return $data[1];
        }

    }

    ?>

Key strategies

All the data is identified by a key. Your keys have to be unique system-wide; it is therefore a good idea to namespace your keys. My personal preference is to name the key after the class that's storing the data, combined with, for example, an id.

example

Your user-management class is called My_Auth, and all users are identified by an id. A sample key for cached user-data would then be "My_Auth:users:1234". '1234' is here the user id.
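A tiny helper can keep the key format consistent across your code. This is only a sketch; makeCacheKey() is a name I made up for the example, and the ':' separator simply follows the sample key above.

```php
<?php

// Hypothetical helper: joins the parts of a cache key with ':' so that
// every key starts with the name of the class that owns the data.
function makeCacheKey($className, $type, $id) {
    return $className . ':' . $type . ':' . $id;
}

echo makeCacheKey('My_Auth', 'users', 1234), "\n";
```

Whatever format you pick, the important part is that every piece of code builds its keys the same way.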

Some reasoning behind this code

An earlier version of this code read the file in chunks of 4096 bytes, because this is often the default block size on Linux filesystems, and this or a multiple of this is generally the fastest. Much later I found out that file_get_contents() is actually faster.

Lots of caching engines based on files actually don't specify the TTL (the time it takes before the cache expires) at the time of storing data in the cache, but while fetching it from the cache. This has one big advantage; you can check if a file is valid before actually opening the file, using the last modified time (filemtime()).
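To make that alternative concrete, here is a minimal sketch of a fetch function that takes the TTL as an argument and relies on filemtime(). The function name fetchWithTtl() is made up for this example, and it assumes the file contains nothing but the serialized data (no expiry timestamp).

```php
<?php

// Sketch of the alternative scheme: the TTL is passed when fetching, and the
// file's last-modified time decides whether the cache entry is still valid.
// fetchWithTtl() is a hypothetical name, not part of the FileCache class.
function fetchWithTtl($filename, $ttl) {

    // A missing file is simply a cache miss
    if (!file_exists($filename)) return false;

    // Expired? We know this without opening the file.
    if (filemtime($filename) + $ttl < time()) return false;

    return unserialize(file_get_contents($filename));
}

// Usage: write an entry, then read it back with a 60 second TTL
$file = tempnam(sys_get_temp_dir(), 's_cache');
file_put_contents($file, serialize(array('name' => 'foobar')));

$fresh = fetchWithTtl($file, 60);

// Pretend the file was written an hour ago
touch($file, time() - 3600);
clearstatcache();
$stale = fetchWithTtl($file, 60);

unlink($file);
```

Note that the same entry can be fresh for one caller and stale for another, depending on the TTL each of them passes in.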

The reason I did not go with this approach is because most non-file based cache systems do specify the TTL on storing the data, and as you will see later in the article we want to keep things compatible. Another advantage of storing the TTL in the data, is that we can create a cleanup script later that will delete expired cache files.
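Such a cleanup script could look roughly like this. It is a sketch that assumes the file format used in this article (a serialized array with the expiry timestamp first) and file names starting with 's_cache'; cleanupCache() is a made-up name.

```php
<?php

// Sketch of a cleanup routine for cache files that contain a serialized
// array(expiryTimestamp, data). The 's_cache*' pattern is an assumption
// based on the file names used in this article.
function cleanupCache($directory) {

    $deleted = 0;

    $files = glob($directory . '/s_cache*');
    if ($files === false) return 0;

    foreach ($files as $filename) {

        $entry = @unserialize(file_get_contents($filename));

        // Remove unreadable entries and entries whose expiry time has passed
        if (!is_array($entry) || !isset($entry[0]) || time() > $entry[0]) {
            if (unlink($filename)) $deleted++;
        }

    }

    return $deleted;
}
```

You could run something like this from cron every now and then, so expired files that are never requested again don't pile up.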

Usage of this class

The number one place in web applications where caching is a good idea is on database queries. MySQL and others usually have a built-in cache, but it is far from optimal, mainly because they have no awareness of the logic of your application (and they shouldn't have), and the cache is usually flushed whenever there's an update on a table. Here is a sample function that fetches user data and caches the result for 10 minutes.

    <?php

    // constructing our cache engine
    $cache = new FileCache();

    function getUsers() {

        global $cache;

        // A somewhat unique key
        $key = 'getUsers:selectAll';

        // check if the data is not in the cache already
        if (!$data = $cache->fetch($key)) {
            // there was no cache version, we are fetching fresh data

            // assuming there is a database connection
            $result = mysql_query("SELECT * FROM users");
            $data = array();

            // fetching all the data and putting it in an array
            while($row = mysql_fetch_assoc($result)) { $data[] = $row; }

            // Storing the data in the cache for 10 minutes
            $cache->store($key,$data,600);
        }
        return $data;
    }

    $users = getUsers();

    ?>

The reason I picked the mysql_ set of functions here is that most readers will probably know them. Personally I prefer PDO or another abstraction library. This example assumes there's a database connection and a users table, and it leaves out things like error handling.

Problems with the library

The first problem is simple: the library will only work on Linux (and similar systems), because it uses the /tmp folder. Luckily we can use the php.ini setting 'session.save_path' instead.

    <?php

    // (this method lives inside the FileCache class)
    private function getFileName($key) {

        return ini_get('session.save_path') . '/s_cache' . md5($key);

    }

    ?>

The next problem is a little bit more complex. When one of our cache files is being read while another process is writing to it at the same time, you can get really unusual results. Caching bugs can be hard to find because they only occur in very specific circumstances. You might never see this issue happen yourself, but somewhere out there one of your users will.

PHP can lock files with flock(). flock() operates on an open file handle (opened by fopen()) and either locks a file for reading (shared lock, everybody can read the file) or for writing (exclusive lock, everybody waits until the writing is done and the lock is released). Because file_get_contents() is the most efficient way to read, and we can only use flock() on file handles, we'll use a combination of both.

The updated store and fetch methods will look like this:

    <?php
    // This is the function you store information with
    function store($key,$data,$ttl) {

        // Opening the file in read/write mode
        $h = fopen($this->getFileName($key),'a+');
        if (!$h) throw new Exception('Could not write to cache');

        flock($h,LOCK_EX); // exclusive lock, will get released when the file is closed

        fseek($h,0); // go to the beginning of the file

        // truncate the file
        ftruncate($h,0);

        // Serializing along with the TTL
        $data = serialize(array(time()+$ttl,$data));
        if (fwrite($h,$data)===false) {
            throw new Exception('Could not write to cache');
        }
        fclose($h);

    }

    function fetch($key) {

        $filename = $this->getFileName($key);
        if (!file_exists($filename)) return false;
        $h = fopen($filename,'r');

        if (!$h) return false;

        // Getting a shared lock
        flock($h,LOCK_SH);

        $data = file_get_contents($filename);
        fclose($h);

        $data = @unserialize($data);
        if (!$data) {

            // If unserializing somehow didn't work out, we'll delete the file
            unlink($filename);
            return false;

        }

        if (time() > $data[0]) {

            // Unlinking when the file was expired
            unlink($filename);
            return false;

        }
        return $data[1];
    }

    ?>

Well, that actually wasn't too hard. Only 3 new lines. The next issue we're facing is updates of data. When somebody updates, say, a page in the CMS, they usually expect the corresponding page to update instantly. In those cases you can update the data using store(), but in some cases it is simply more convenient to flush the cache. So we need a delete method.

    <?php

    function delete( $key ) {

        $filename = $this->getFileName($key);
        if (file_exists($filename)) {
            return unlink($filename);
        } else {
            return false;
        }

    }

    ?>

Abstracting the code

This cache class is pretty straightforward. The only methods in there are delete, store and fetch. We can easily abstract that into the following base class. I'm also giving it a proper prefix (I tend to prefix everything with Sabre; name yours whatever you want). A good reason to prefix all your classes is that they will never collide with other class names if you need to include other code. The PEAR project made a stupid mistake by naming one of their classes 'Date'. By doing this and refusing to change it, they actually prevented an internal PHP date class from being named Date.

    <?php

    abstract class Sabre_Cache_Abstract {

        abstract function fetch($key);
        abstract function store($key,$data,$ttl);
        abstract function delete($key);

    }

    ?>

The resulting FileCache (which I'll rename to Filesystem) is:

    <?php

    class Sabre_Cache_Filesystem extends Sabre_Cache_Abstract {

        // This is the function you store information with
        function store($key,$data,$ttl) {

            // Opening the file in read/write mode
            $h = fopen($this->getFileName($key),'a+');
            if (!$h) throw new Exception('Could not write to cache');

            flock($h,LOCK_EX); // exclusive lock, will get released when the file is closed

            fseek($h,0); // go to the start of the file

            // truncate the file
            ftruncate($h,0);

            // Serializing along with the TTL
            $data = serialize(array(time()+$ttl,$data));
            if (fwrite($h,$data)===false) {
                throw new Exception('Could not write to cache');
            }
            fclose($h);

        }

        // The function to fetch data returns false on failure
        function fetch($key) {

            $filename = $this->getFileName($key);
            if (!file_exists($filename)) return false;
            $h = fopen($filename,'r');

            if (!$h) return false;

            // Getting a shared lock
            flock($h,LOCK_SH);

            $data = file_get_contents($filename);
            fclose($h);

            $data = @unserialize($data);
            if (!$data) {

                // If unserializing somehow didn't work out, we'll delete the file
                unlink($filename);
                return false;

            }

            if (time() > $data[0]) {

                // Unlinking when the file was expired
                unlink($filename);
                return false;

            }
            return $data[1];
        }

        function delete( $key ) {

            $filename = $this->getFileName($key);
            if (file_exists($filename)) {
                return unlink($filename);
            } else {
                return false;
            }

        }

        private function getFileName($key) {

            return ini_get('session.save_path') . '/s_cache' . md5($key);

        }

    }

    ?>

There you go: a complete, proper OOP, file-based caching class. I hope I explained things well.
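The point of the abstract base class is that application code only needs to know about fetch, store and delete. Here is a small sketch of that idea. The base class is repeated so the example runs on its own; Example_Cache_Array and getGreeting() are made up for the illustration and are not part of the library.

```php
<?php

// Same interface as the base class from this article
abstract class Sabre_Cache_Abstract {
    abstract function fetch($key);
    abstract function store($key, $data, $ttl);
    abstract function delete($key);
}

// Hypothetical in-memory engine, only useful within one request or in tests
class Example_Cache_Array extends Sabre_Cache_Abstract {

    private $items = array();

    function fetch($key) {
        if (!isset($this->items[$key])) return false;
        list($expires, $data) = $this->items[$key];
        if (time() > $expires) {
            unset($this->items[$key]);
            return false;
        }
        return $data;
    }

    function store($key, $data, $ttl) {
        $this->items[$key] = array(time() + $ttl, $data);
        return true;
    }

    function delete($key) {
        if (!isset($this->items[$key])) return false;
        unset($this->items[$key]);
        return true;
    }

}

// Application code only depends on the abstract type
function getGreeting(Sabre_Cache_Abstract $cache) {

    $greeting = $cache->fetch('Example:greeting');
    if ($greeting === false) {
        $greeting = 'Hello world'; // stands in for an expensive query
        $cache->store('Example:greeting', $greeting, 600);
    }
    return $greeting;
}

$cache = new Example_Cache_Array();
echo getGreeting($cache), "\n"; // first call fills the cache
echo getGreeting($cache), "\n"; // second call is served from the cache
```

Because getGreeting() only type-hints the abstract class, you can swap in any other engine that extends it without touching the function.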

Memory based caching through APC

If files aren't fast enough for you, and you have enough memory to spare, memory-based caching might be the solution. Obviously, storing and retrieving stuff from memory is a lot faster. The APC extension not only provides an opcode cache (which speeds up your PHP scripts by caching the parsed script), but it also provides a simple mechanism to store data in shared memory.

Using shared memory in APC is extremely simple. I'm not even going to explain it; the code should tell enough.

    <?php

    class Sabre_Cache_APC extends Sabre_Cache_Abstract {

        function fetch($key) {
            return apc_fetch($key);
        }

        function store($key,$data,$ttl) {

            return apc_store($key,$data,$ttl);

        }

        function delete($key) {

            return apc_delete($key);

        }

    }

    ?>

My personal problem with APC was that it tended to break my code, so if you want to use it, give it a test run first. I have to admit that I haven't checked it again since they fixed 'my' bug. Now that this bug is fixed, APC is amazing for single-server applications and for data that is used really often.

Memcached

Problems start when you are dealing with more than one webserver. Since there is no shared cache between the servers, situations can occur where data is updated on one server and it takes a while before the other server is up to date. It can be really useful to have a really high TTL on your data and simply replace or delete the cache whenever there is an actual update. When you are dealing with multiple webservers, this scheme is simply not possible with the previous caching methods.

Introducing memcached. Memcached is a cache server originally developed by the LiveJournal people and now being used by sites like Digg, Facebook, Slashdot and Wikipedia.

How it works

  • Memcached consists of a server and a client part. The server is a standalone program that runs on your servers and the client is in this case a PHP extension.
  • If you have 3 webservers which all run Memcached, all webservers connect to all 3 memcached servers. The 3 memcached servers are all in the same 'pool'.
  • The cache servers all only contain part of the cache. Meaning, the cache is not replicated between the memcached servers.
  • To find the server where the cache is stored (or should be stored) a so-called hashing algorithm is used. This way the 'right' server is always picked.
  • Every memcached server has a memory limit. It will never consume more memory than the limit. If the limit is exceeded, older cache is automatically thrown out (whether or not the TTL has been exceeded).
  • This means it cannot be used as a place to simply store data. The database does that part. Don't confuse the purpose of the two!
  • Memcached runs the fastest (like many other applications) on a Linux 2.6 kernel.
  • By default, memcached is completely open. Be sure to have a firewall in place to lock out outside IPs, because this can be a huge security risk.
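To get a feeling for how a key ends up on one particular server, here is a deliberately simplified sketch. It is not the exact algorithm the Memcache extension uses; pickServer() and the host names are made up for the example.

```php
<?php

// Simplified illustration of picking a server from a pool by hashing the key.
// This is NOT the exact algorithm used by the memcache extension.
function pickServer($key, array $servers) {

    // crc32() can return a negative number on 32-bit systems, so format it
    // as an unsigned value before taking the remainder
    $hash = (float) sprintf('%u', crc32($key));

    return $servers[(int) fmod($hash, count($servers))];
}

$pool = array('www1', 'www2', 'www3');

// The same key always maps to the same server, as long as the pool is identical
echo pickServer('My_Auth:users:1234', $pool), "\n";
```

Every webserver that uses the same pool in the same order will come up with the same answer for a given key, which is why no lookup table or replication is needed.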

Installing

When you are on Debian/Ubuntu, installing is easy:

    apt-get install memcached

You are stuck with whatever version is packaged though. Debian tends to be slow with updates. Other distributions might also have a pre-built package for you. In any other case you might need to download Memcached from the site and compile it with the usual:

    ./configure
    make
    make install

There's probably a README in the package with better instructions.

After installation, you need the PECL extension. All you need to do for that (usually) is:

    pecl install Memcache

You also need the zlib development library. For Debian, you can get this by entering:

    apt-get install zlib1g-dev

However, 99% of the time the automatic PECL installation fails for me. Here are the alternative installation instructions:

    pecl download Memcache
    tar xfvz Memcache-2.1.0.tgz # version might differ
    cd Memcache-2.1.0
    phpize
    ./configure
    make
    make install

Don't forget to enable the extension in php.ini by adding the line extension=memcache.so and restarting the webserver.

The good stuff

After the Memcached server is installed and running, and you have PHP running with the Memcache extension, you're off. Here's the Memcached class.

    <?php

    class Sabre_Cache_MemCache extends Sabre_Cache_Abstract {

        // Memcache object
        public $connection;

        function __construct() {

            $this->connection = new MemCache;

        }

        function store($key, $data, $ttl) {

            return $this->connection->set($key,$data,0,$ttl);

        }

        function fetch($key) {

            return $this->connection->get($key);

        }

        function delete($key) {

            return $this->connection->delete($key);

        }

        function addServer($host,$port = 11211, $weight = 10) {

            $this->connection->addServer($host,$port,true,$weight);

        }

    }

    ?>

Now, the only thing you have to do in order to use this class is add servers. Add servers consistently! Meaning that every webserver should add the exact same memcached servers, so the keys will be distributed in the same way from every webserver.

If a server has double the memory available for memcached, you can double the weight. The chance that data will be stored on that specific server will also be doubled.
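One way to picture weights is to give every server as many 'buckets' as its weight and let the hash of the key select a bucket. The sketch below does just that. It is an illustration only; buildBuckets() and pickWeightedServer() are made-up names and the real extension may distribute keys differently.

```php
<?php

// Every server gets as many buckets as its weight
function buildBuckets(array $weights) {
    $buckets = array();
    foreach ($weights as $host => $weight) {
        for ($i = 0; $i < $weight; $i++) $buckets[] = $host;
    }
    return $buckets;
}

// The hash of the key selects one bucket, and with it one server
function pickWeightedServer($key, array $buckets) {
    $hash = (float) sprintf('%u', crc32($key));
    return $buckets[(int) fmod($hash, count($buckets))];
}

$buckets = buildBuckets(array('www1' => 10, 'www2' => 20, 'www3' => 10));

// Count where 10000 sample keys end up
$counts = array('www1' => 0, 'www2' => 0, 'www3' => 0);
for ($i = 0; $i < 10000; $i++) {
    $counts[pickWeightedServer('key' . $i, $buckets)]++;
}
print_r($counts);
```

With these weights www2 owns half of all buckets, so it should receive roughly twice as many keys as each of the other two servers.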

Example

    <?php

    $cache = new Sabre_Cache_MemCache();
    $cache->addServer('www1');
    $cache->addServer('www2',11211,20); // this server has double the memory, and gets double the weight
    $cache->addServer('www3',11211);

    // Store some data in the cache for 10 minutes
    $cache->store('my_key','foobar',600);

    // Get it out of the cache again
    echo($cache->fetch('my_key'));

    ?>

Some final tips

  • Be sure to check out the docs for Memcache and APC and try to determine what's right for you.
  • Caching can help everywhere SQL queries are done. You'd be surprised how big the difference can be in terms of speed.
  • In some cases you might want the cross-server abilities of memcached, but you don't want to use up your memory or have your items automatically flushed out. Wikipedia came across this problem and traded in fast memory caching for virtually infinite-size file-based caching by creating a memcached-compatible engine called Tugela Cache. You can still use the PECL Memcache client with it, so switching should be pretty easy. I don't have experience with this, nor do I know how stable it is.
  • If you have different requirements for different parts of your cache, you can always consider using the different types alongside each other.

Creating a Gopher server with PHP and InetD

This tutorial will teach you how to create a Gopher server using InetD and PHP. Along the way you will learn how to create a simple socket server using InetD, and something about the gopher protocol.

Gopher

A long time ago, in the early nineties, Gopher was the preferred way to access internet content. Only later on did Tim Berners-Lee's HTTP/WWW idea take off. Sixapart recently wrote an article about this chapter of ancient internet history.

Gopher basically has a few main functions and it is kind of restricted to those: listing directories (or menus), serving files and searching. One of the most innovative ideas was that hyperlinks were tightly integrated in the protocol. The modern internet is in a sense based on this important concept. If you want to see a gopher server in action, check out: gopher://gopher.quux.org/. Only a few browsers support this protocol; among them are Firefox 1.5 or higher, Camino 1.0 or higher, or a recent Seamonkey or Flock. IE used to support it, but after a security bug was found a while ago, Microsoft disabled gopher support instead of fixing it. Lynx will also work if you're on Linux.

InetD

InetD has to be the easiest way to write a socket server: you simply make an entry in /etc/inetd.conf and your script can work with standard input/output. More about this later.

What do you need?

The examples here are written for PHP 5.1.x. They will most likely also work in PHP 5.0.x, but you can't run them in 4.x. Further, you need root access to a *nix server with inetd installed (usually any Linux server comes with inetd). You also need a gopher-compatible browser. You can download the files for this tutorial here, but this is not required.

Let's get started

We will first write a basic telnet server, because this is the easiest thing to do. We will make a telnet server that waits for a line to be entered, echoes exactly that line and then breaks the connection. If we were using the socket_* functions things would probably be a bit harder, but for us the script will actually look like this:

    #!/usr/bin/php
    <?php

    $data = fgets(STDIN);
    echo('You said: ' . $data);

    ?>

The first line (#!/usr/bin/php) is a special way of telling linux to use the php interpreter for the rest of the file. If PHP is installed in a non-standard location (other than /usr/bin) you can find it by typing whereis php on the command line.

fgets() is php's function to return 1 line from an open file. STDIN is a constant that refers to unix standard input.

Making it run

Give this test file the +x permission. You can do that using the following command:

    chmod +x testsocket.php

This tells Linux (or Mac) that this file is allowed to run as an executable.

Try running it with ./testsocket.php (or whatever filename you gave it). If everything worked as expected, it will wait until you type something in and press enter, reply with the exact same thing and exit.

Turning it into a telnet server

To do this you will need to edit your /etc/inetd.conf file. Open it with your favourite editor and add this line:

    telnet stream tcp nowait www-data /path/to/your/testsocket.php

The spaces in between are either regular spaces or tabs. Be sure to change the last parameter to the correct path to your script. This runs the script with the username 'www-data', which is the default username for the Apache server on Debian. You might want to change it to the user your Apache server runs as. This can be www-data, nobody, apache, httpd or a few others. You can also run the script as root, but this gives the script privileges you might not want to give it (all of them).

This binds the script to the default telnet port (which is 23). Now you need to restart inetd to force it to reload the settings files. You can do that by running the following command (as root):

    killall -HUP inetd

You can try your new telnet server out by running telnet localhost from the command line, or if you want to try it from another machine, run telnet yourhostname.com or open telnet://yourhostname.com.

If it didn't work for you, you might want to check out your system logs. They could tell you a bit more. You can do that (on most systems) with:

    tail -f /var/log/syslog

The last entry(ies) should give you information.

And now.. gopher

In order to do this, I need to explain a bit more about the gopher protocol. When a gopher client connects to a gopher server, it first sends a string containing the information it wants, followed by a linebreak ("\n"). After that, the gopher server sends back the requested information.

If you go to the root of a gopher server it will start out with a directory listing, like gopher://gopher.quux.org/.

View source in Firefox won't help you; gopher uses a special format to transmit this directory listing. Every item (including text lines) is sent on one line (separated by \n). This is how a line is built up:

    [itemType]Name[tab]location[tab]server[tab]port[linebreak]

ItemType is a single character which tells the client what type of item this is. 0 means file, 1 means directory, 8 means telnet link, I means an image file and i is informational text.

At the end of the directory listing you'll find a . and the server closes the connection.
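To make the format a bit more tangible, here is a small sketch that builds such a listing. gopherLine() and gopherMenu() are helpers I made up for this example (they are not part of the Gopher_Server class used later on), and the hostname is a placeholder. I'm assuming port 70, the default gopher port, and "\r\n" as the line ending, which is what RFC 1436 prescribes; you can pass "\n" as the last argument if you prefer.

```php
<?php

// Builds one line of a gopher directory listing:
// [itemType]Name[tab]location[tab]server[tab]port[linebreak]
function gopherLine($type, $name, $location, $server, $port = 70, $eol = "\r\n") {
    return $type . $name . "\t" . $location . "\t" . $server . "\t" . $port . $eol;
}

// Joins the lines and adds the closing '.' that ends the listing
function gopherMenu(array $lines, $eol = "\r\n") {
    return implode('', $lines) . '.' . $eol;
}

$host = 'gopher.example.org'; // placeholder hostname

echo gopherMenu(array(
    gopherLine('i', 'Welcome to my gopher server', '', $host),
    gopherLine('1', 'Articles', '/articles', $host),
    gopherLine('0', 'About this server', '/about.txt', $host),
));
```

If you pipe the output through cat -A (or a hex viewer) you can see the tabs and line endings the client has to parse.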

A simple gopher server

Add in another line in /etc/inetd.conf. You can write it in the exact same way, but start the line with gopher instead of telnet. Don't forget to restart!

    gopher stream tcp nowait www-data /path/to/your/gopherserver.php

For our gopherserver.php, we will start out with a simple class that does some of the work for us. It is highly recommended to check out the class first, you can find it here. I decided not to put it here, because there's a lot of code. The code is pretty much self-explanatory.

Now, our gopherserver will look like this:

    #!/usr/bin/php
    <?php

    require_once 'Gopher/Server.php';

    $server = new Gopher_Server();

    $server->setHostname('gopher.rooftopsolutions.nl');

    $server->exec();

    ?>

Now you should have your own gopher server running. The server class is not complete though. Whatever request you send it, it will reply with the exact same response. If you plan to actually use this, you should change the processRequest method to return the correct results.

You will find a bunch of constants for the types of files you can serve. If you want to serve a binary file it doesn't matter if you use G_BINARY, G_MACFILE or G_DOSFILE. A bunch of them are not supported in modern clients.

That's it for this tutorial. We have a proof of concept server running and you should be able to extend it to actually serve information. If people are interested I can make a follow-up tutorial which would explain the search feature and handling urls to the HTTP web. Let me know if it worked for you.

On HttpOnly, Firefox-specific XSS and this years major Livejournal XSS attack

Yep, that's a long title, but they are all related to each other in some way. In the first few paragraphs I will explain what cookies and XSS are. You might want to skip ahead if you already know this.

Sessions, Cookies

HTTP is stateless. This means that every request to the server is a 'new' one and normally there is no relation to a first or second request. To allow maintaining a session or 'state' between multiple requests, HTTP cookies are used.

A cookie is basically an HTTP header with a tiny piece of information that gets re-sent with every request to the server. A popular way to make use of this is through PHP's $_SESSION system. This sends a cookie with a unique id to the client, which allows PHP to retain a user's information across pages.

XSS

If you allow users to, for example, comment on one of your pages and allow (certain) html, it is sometimes possible to inject a piece of javascript. There are many tricks to evade the so-called html sanitizers. strip_tags() is PHP's built-in sanitizer, but it doesn't work really well. If, for example, you allow users to use a <p> tag, which might seem harmless, there are tricks to abuse the style="" or onclick="" attributes, just to name a few.
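You can see the problem for yourself with a few lines of PHP. This is just a quick demonstration with a made-up comment string; the htmlspecialchars() call shows the blunt alternative of escaping everything when you don't need to allow any html at all.

```php
<?php

$comment = '<p onclick="alert(document.cookie)">Nice post!</p><script>alert(1)</script>';

// Allowing <p> keeps the whole tag, including its onclick attribute
$stripped = strip_tags($comment, '<p>');

// Escaping everything is the safe default when no html is needed
$escaped = htmlspecialchars($comment, ENT_QUOTES, 'UTF-8');

echo $stripped, "\n";
echo $escaped, "\n";
```

The script tag is gone from the first result, but the onclick handler on the allowed <p> tag survives untouched.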

XSS and cookies

So how do you abuse javascript and cookies combined?

With, for example, PHP's session system, the contents of the cookie are all that is needed to take over someone's session. The hacker would be logged in as you and might be able to change your password and log you out afterwards. The contents of the cookies are available to javascript through the variable document.cookie.

HttpOnly, a solution

Microsoft came up with a way to prevent this from happening, available since Internet Explorer 6.0 (starting from Windows XP SP1). They added an extra piece of information to a cookie that still allows the use of cookies in the way you are used to, but prevents the cookie from being read by javascript (basically it is invisible to javascript). Be sure to check out Microsoft's spec at MSDN.

Safari and Opera quickly started supporting this. Because of this it is becoming pretty useful to use in practice. Remember that this doesn't mean you can just accept any html on your site, you should still always sanitize the bad stuff or not allow it at all! But in the case you missed something, it can make it a lot more difficult for your attacker to steal sessions.

UPDATE: Safari/Opera actually ignore it, my excuses, I didn't check my sources.

Under the hood

Normally, a cookie header will look like this:

    Set-Cookie: USER=username; expires=Wednesday, 09-Nov-99 23:12:40 GMT;

But with HttpOnly, it will look like this:

    Set-Cookie: USER=username; expires=Wednesday, 09-Nov-99 23:12:40 GMT; HttpOnly

It's a small change, and all normal browsers should still accept this even if they don't understand the HttpOnly part. There is an exception though, and it goes by the name of IE 5 for Mac. This browser won't understand the cookie and totally ignores it. Personally I don't support this browser for any application anymore, as it has too many bugs. But if your boss wants it, this might prevent you from using HttpOnly.
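If you want to experiment with this today, you can build the header by hand and send it with header(). The sketch below only builds the string; buildSetCookie() is a made-up helper and the date format is one common choice, not the only one browsers accept.

```php
<?php

// Builds a Set-Cookie header line, optionally with the HttpOnly flag.
// buildSetCookie() is a hypothetical helper for illustration.
function buildSetCookie($name, $value, $expires, $httpOnly = true) {

    $header = 'Set-Cookie: ' . $name . '=' . rawurlencode($value)
            . '; expires=' . gmdate('D, d-M-Y H:i:s', $expires) . ' GMT';

    if ($httpOnly) $header .= '; HttpOnly';

    return $header;
}

// In a web request you would pass the result to header()
echo buildSetCookie('USER', 'username', time() + 3600), "\n";
```

Leaving the flag off (fourth argument false) gives you the old-style header, which makes it easy to switch it off for browsers that choke on it.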

PHP support

A guy named Scott MacVicar created a patch for PHP that adds an extra parameter to setcookie() to enable HttpOnly for your cookies. The patch should also enable this by default for the session system.

If the patch gets accepted we will likely be able to use this in PHP 6.0 and perhaps even PHP 5.2. I'm looking forward to that. There is a chance though that, because of the IE5/Mac breakage, it eventually won't be auto-enabled for the session system.

So what about firefox?

Firefox doesn't support it. There is currently a bug open for it (Bug #178993). An initial solution was posted over 2 years ago (January 2004), and a few other patches followed later on, but the Mozilla folks refused all of them because they want to maintain the exact format of cookies.txt (the file they use to store cookies), since other applications might rely on that format. A few solutions for that have been posted, but it doesn't seem like a high priority for them.

A workaround for firefox (kind of)

There have been solutions for firefox that also blocked reading of cookies by javascript. Firefox has a magic function __defineGetter__ that can block reading of variables. To do this for all cookies on your site, include the following snippet on top of your html page:

    HTMLDocument.prototype.__defineGetter__("cookie",function (){return null;});

However, you can't rely on this! There are still ways to get this cookie if the hacker can somehow create an iframe in your html page. The hacker has a reference to the same cookies if he uses the data: protocol in the src="" attribute. This will still make it a bit harder to steal cookies, so it's not a bad idea to implement. For a longer explanation of this workaround, check out http://www.wisec.it/sectou.php?lang=en.

The LiveJournal case

The same people who submitted the initial Firefox patch (see above) 2 years ago also got hit by this attack in January 2006. Over 46% of accounts were hijacked, which comes to over 900,000 stolen accounts. (Check out their post about their solution.)

The reason they got hit is that they allowed users to use remote CSS stylesheets for their pages. CSS used to be only a specification for how HTML elements look on a page, but over the last few years it has become more dynamic, and there are now ways to exploit CSS for XSS attacks, for example through IE's non-standard behavior: property and Firefox's -moz-binding: property (there are more you can exploit, but that's outside the scope of this article). These properties allow an author to attach custom behavior to an HTML element. The technique to do this in Firefox is called XBL.

Normally, when a page is loaded from domain A and another (in a different frame, for example) is loaded from domain B, malicious scripts from domain B can never access cookies from domain A. This is called a 'same origin check'. It generally works in all browsers for all kinds of content, but XBL is an exception (see bug #324253).
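To make the idea a bit more concrete, here's a rough sketch of such a check, written in PHP. This is my own simplification and not how any browser actually implements it: the function names are made up, and it only compares scheme, host and port.

```php
<?php
// Hypothetical helper: reduces a URL to its origin (scheme, host, port).
function originOf($url) {
    $parts = parse_url($url);
    if ($parts === false || !isset($parts['scheme'], $parts['host'])) {
        return null; // not an absolute URL we can compare
    }
    $scheme = strtolower($parts['scheme']);
    $defaultPorts = array('http' => 80, 'https' => 443);
    if (isset($parts['port'])) {
        $port = $parts['port'];
    } else {
        // Fall back to the scheme's default port, if we know one.
        $port = isset($defaultPorts[$scheme]) ? $defaultPorts[$scheme] : null;
    }
    return array($scheme, strtolower($parts['host']), $port);
}

// Two URLs share an origin only if scheme, host and port all match.
function isSameOrigin($a, $b) {
    $originA = originOf($a);
    $originB = originOf($b);
    return $originA !== null && $originA === $originB;
}

var_dump(isSameOrigin('http://a.example/page', 'http://a.example:80/other')); // bool(true)
var_dump(isSameOrigin('http://a.example/page', 'http://b.example/page'));     // bool(false)
```

A script from domain B fails this comparison against domain A, which is why it normally can't get at A's cookies.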

Apparently fixing this requires some major changes in how Firefox works. The last comment in this bug is from February this year (2006). I hope they will wake up some day soon and make our lives a little bit easier by both supporting HttpOnly and securing XBL.

So the LiveJournal attack could happen because the hackers used XBL in their CSS, which in turn accessed the HTTP cookies through JavaScript. The attack didn't affect users of other browsers, because: A) they support HttpOnly, and B) even if they didn't, the Microsoft equivalent of XBL, .HTC files, doesn't work across different domains.

SabreAMF 0.2 is here

I just published a new version of SabreAMF, and it includes support for the flex2 messaging system. This means you can now use flex2 remoting with great ease.

It also includes a callback server class, which is now recommended for use with flex2. This class handles the standard AMF3 stuff like CommandMessages and translates Exceptions into ErrorMessages.

You can download it here. You can find an example of how to use the CallbackServer class here. An article on how to use this will follow shortly.

PHP Application Structure

One of the first things you do when you create a PHP application is to create a good directory structure.

Many people have their own way of doing things and this is good. The most important thing is that people take the time to think of a good way to do stuff.

Today I'll share mine. I'm not saying this is the best way of doing stuff; this is simply a system that has worked well for me.

Here's my main listing:

  • lib/
  • conf/
  • resources/
  • index.php

As you can see, I only have 3 directories, each with a specific purpose, and I'm always using a single index.php which handles all the requests.

lib/

This is where my class libraries go. All the classes are in a PEAR structure, which means: 1 class per file, and the file/directory maps exactly to the class name. For example, the class Services_MetaWebLog can be found in lib/Services/MetaWebLog.php.

There are a few good reasons why this is a good way to structure your classes:

  • Many systems already use this structure, so if you need foreign libraries (from PEAR, Solar or whatever) you can usually just copy-paste their classes into your structure
  • It allows you to easily find certain classes
  • It allows a pretty cool __autoload() function
  • If you are using Subversion, you can easily include other projects' libraries with the svn:externals property

This is the __autoload function, in case you need it. It's a bit simplified, but you'll get the point.

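  <?php
  // Map a PEAR-style class name to a file under lib/ and load it,
  // e.g. Services_MetaWebLog becomes lib/Services/MetaWebLog.php
  function __autoload($classname) {
      require_once 'lib/' . str_replace('_', '/', $classname) . '.php';
  }
  ?>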
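If you want something a little less simplified, here's a sketch of a stricter variant. Note that this is not the function I use above: it registers the loader with spl_autoload_register() instead of defining __autoload(), and the names classToPath() and loadClass() are made up for this example.

```php
<?php
// Hypothetical helper: maps a PEAR-style class name to a file path under lib/.
function classToPath($classname, $baseDir = 'lib/') {
    // Only accept letters, digits and underscores, so a class name
    // can never smuggle in "../" or other path tricks.
    if (!preg_match('/^[A-Za-z0-9_]+$/', $classname)) {
        return null;
    }
    return $baseDir . str_replace('_', '/', $classname) . '.php';
}

// Hypothetical loader: only includes the file if it actually exists.
function loadClass($classname) {
    $path = classToPath($classname);
    if ($path !== null && is_file($path)) {
        require_once $path;
    }
}

spl_autoload_register('loadClass');

echo classToPath('Services_MetaWebLog'), "\n"; // lib/Services/MetaWebLog.php
```

Because the loader quietly skips missing files, other registered autoloaders still get a chance to find the class.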

conf/

This is where global application configuration goes. In my case it's usually a settings.xml containing, for example, the database DSN.
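As an illustration, here's roughly how such a file could be read with SimpleXML. The layout of the XML (a database element containing a dsn element), the DSN value and the function name loadSettings() are all assumptions for this example, so adapt them to whatever your own settings.xml looks like.

```php
<?php
// Stand-in for the contents of conf/settings.xml (assumed layout).
$exampleXml = <<<XML
<settings>
  <database>
    <dsn>mysql:host=localhost;dbname=example</dsn>
  </database>
</settings>
XML;

// Hypothetical helper: parses the settings and fails loudly on broken XML.
function loadSettings($xml) {
    $settings = simplexml_load_string($xml);
    if ($settings === false) {
        throw new Exception('Could not parse settings XML');
    }
    return $settings;
}

$settings = loadSettings($exampleXml);
$dsn = (string) $settings->database->dsn;
echo $dsn, "\n"; // mysql:host=localhost;dbname=example

// With a real file you would use simplexml_load_file('conf/settings.xml') instead.
```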

Another tip is to put your Apache configuration here. You can easily create a symlink to it from /etc/apache2/sites-enabled. This way you can keep your Apache configuration close to your application. If you use Apache 1, simply Include this file from your main configuration.

resources/

This is where I put all the static stuff (I guess static would also have been a good name). My sub-directories tend to look like this:

  • css/
  • templates/
  • images/
  • js/
  • swf/

Good luck!


About

My name is Evert, and I've been writing semi-regularly on this blog since 2006.
