My PHP Advent article

My PHP Advent article just got published. It's a list of best practices around dealing with dates and times in PHP. Have a read and tell me what you think. Also, be sure to follow @phpadvent or subscribe.

PHP Includes file generator

While profiling SabreDAV, I noticed that on several occasions more than half of the request time was spent in the autoloader.

So instead of autoloading, I now prefer to unconditionally include every file for each package (there are 5 packages). For a while I maintained these include files by hand, but a while back I automated the process.

This is how you run it:

    phpincludes . includes.php

This will generate a file such as:

    <?php

    // Begin includes
    include __DIR__ . '/Interface1.php';
    include __DIR__ . '/Class1.php';
    include __DIR__ . '/Class2.php';
    include __DIR__ . '/Class3.php';
    // End includes

You can edit everything before "// Begin includes" and after "// End includes". Subsequent runs will only replace the lines in between those comments.

The script will automatically resolve class and interface dependencies and load the files in the correct order. It also has support for a PHP 5.2-compatible syntax (dirname(__FILE__) instead of __DIR__).
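To illustrate the marker-based replacement, here is a minimal sketch. It is not the actual phpincludes code: the function name replaceIncludesBlock() is hypothetical, it assumes PHP 7 or newer, and it expects the file list to be sorted into dependency order already.

```php
<?php

/**
 * Hypothetical helper: swaps the lines between the two marker comments
 * for a fresh list of include statements and leaves the rest untouched.
 */
function replaceIncludesBlock(string $contents, array $files): string
{
    $begin = '// Begin includes';
    $end = '// End includes';

    $start = strpos($contents, $begin);
    $stop = strpos($contents, $end);

    if ($start === false || $stop === false || $stop < $start) {
        // No usable markers: return the input unchanged
        return $contents;
    }

    $lines = [];
    foreach ($files as $file) {
        $lines[] = "include __DIR__ . '/" . $file . "';";
    }
    $body = $lines ? implode("\n", $lines) . "\n" : '';

    // Keep everything up to and including the begin marker,
    // then the new includes, then everything from the end marker on.
    return substr($contents, 0, $start + strlen($begin)) . "\n"
        . $body
        . substr($contents, $stop);
}

$old = "<?php\n// my header\n// Begin includes\ninclude __DIR__ . '/Old.php';\n// End includes\n// my footer\n";
echo replaceIncludesBlock($old, ['Interface1.php', 'Class1.php']);
```

Anything outside the two comments, such as the header and footer lines in the sample input, survives the rewrite.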

If you like it, head over to GitHub, or install it with:

    pear channel-discover pear.sabredav.org
    pear install sabredav/PHPIncludes

iconv_substr vs mb_substr

While working on an application I ran across a huge bottleneck which I isolated down all the way to the use of the iconv_substr function. If you ever wonder which is better to use, this should help your decision:

Benchmark script

    <?php

    $str = str_repeat("foo", 100000);
    $time = microtime(true);

    iconv_substr($str, 1000, 90000, 'UTF-8');

    echo "iconv_substr: " . (microtime(true) - $time) . "\n";

    $time = microtime(true);

    mb_substr($str, 1000, 90000, 'UTF-8');

    echo "mb_substr: " . (microtime(true) - $time) . "\n";

    $time = microtime(true);

    substr($str, 1000, 90000);

    echo "substr: " . (microtime(true) - $time) . "\n";
    ?>

The results varied widely between machines, operating systems and PHP versions, but here are two results I recorded.

First, PHP 5.3.4 on OS/X:

    iconv_substr: 0.014400005340576
    mb_substr: 0.00049901008605957
    substr: 3.7193298339844E-5 # Note the E-notation, this was actually something like 0.00003 seconds.

As you can see, iconv took about 0.014 seconds, while mbstring took only about 0.0005 seconds. That's already a significant difference (iconv is roughly 2800% slower), but the difference became more apparent when running this on a Debian box with PHP 5.2.13:

    iconv_substr: 8.3735179901123
    mb_substr: 0.00039505958557129
    substr: 4.8160552978516E-5

Yup, it took 8.3 seconds. Compared to mb_substr, that's an increase of over 2,100,000%. So next time you're wondering which of the two may be smarter to use, this may help you decide.

It's important to note that OS/X uses libiconv as its iconv implementation, while my Debian test machine uses glibc, so it looks like libiconv is much, much faster than glibc's iconv. mbstring still leaves both in the dust though.

I'm interested to hear what your results are, especially if they differ.

SabreDAV 1.5 released with CardDAV support


Over the last month I've been working hard at the Atmail office in sunny Australia to get CardDAV support built into SabreDAV, and I've finally completed all the steps to do this release.

So there it is, CardDAV. Unfortunately there are not yet a lot of clients that actually use it, and it mainly comes down to iOS and OS/X, but I've been asked about CardDAV a lot and suspect more people will become interested in this protocol (especially if more vendors start supporting it).

So that's pretty much it; head over to the download page to fetch a copy. I've had to break a couple of minor APIs; you can read about those in the migration document.

I tried my best to write good documentation for the new stuff, but it's always very time consuming, and it's not as good as I'd like. If you have the time and the will to write more, let me know!

Lastly, a big thank you to Nick Boutelier for creating the new SabreDAV logo!

Numeric string comparison in PHP

Although PHP's loose comparison type juggling tends to invoke some negative responses, I don't think it has ever really worked against me, and in my opinion it works quite comfortably, as long as you make sure you always use strict checking (=== and !==) where you can, and only fall back to the loose checks when you must.

As a PHP developer, I think it's very important to understand and memorize exactly how these work, whether you're writing new code, or maintaining existing code.

A few days ago on PHP-internals I saw a behavior that was completely new to me, and very much counter-intuitive.

    if ('20110204024217300000' == '20110204024217300264')
        echo 'equal';
    else
        echo 'not equal';

Guess what the output is.

For loose comparisons, PHP will always try to convert numeric strings to numbers, even when both sides of the comparison are strings. Because these numbers are too big to fit in an integer, they are converted to floats. For both strings this conversion ends up as the same number: "2.0110204024217E+19" (give or take, based on the standard precision settings), so the two are considered equal.

In my mind it makes sense to do this type juggling when a comparison is done with <, >, <= or >=, but it definitely feels like a bug when doing an equals check.

The moral is: always do strict checks when you are able to.
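As a small illustration of that advice, using the same two strings as above: neither the strict operator nor strcmp() converts its operands to numbers, so both report a difference.

```php
<?php

$a = '20110204024217300000';
$b = '20110204024217300264';

// Strict comparison never converts the operands, so the two
// values are compared as strings and are not identical.
var_dump($a === $b); // bool(false)

// strcmp() also compares them as strings; 0 would mean equal.
var_dump(strcmp($a, $b) === 0); // bool(false)
```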

Thanks to Matt Palmear for pointing this out.

SabreDAV 1.4.0-beta released

Last Saturday I put up version 1.4 of SabreDAV.

It's taken a while to get this one out. Much longer than I thought. The result was that there's been very little released over the past few months. In an effort to change this, I decided to release 1.4.0 as soon as possible, rather than when all the features are ready. I believe this is better for the end-user and for me as well (release early, release often, etc).

So there it is. These are the new major features:

  • WebDAV ACL support. This part is not 100% done. It can be integrated into existing APIs, but there's no central ACL store or ability to modify ACLs through the WebDAV protocol yet. These additional features will be added in subsequent versions.
  • CalDAV proxy support. This is a proprietary Apple extension, allowing users to delegate calendar access to other users.
  • Integrated the 'VObject' library, which provides an easy way to read and write iCalendar objects with an API similar to SimpleXML.
  • Added the ICSExportPlugin, allowing you to export iCalendar-formatted calendars.

full changelog

To allow for a proper ACL implementation, much of the 'principal' functionality has been moved from Sabre_DAV_Auth to Sabre_DAVACL. There's a Migration guide available with all the details.

As usual, if you're not ready to migrate to 1.4 because of the API breaks or because it's still considered beta, I'll be maintaining 1.3 for at least another year. However, I'll be doing this on a strictly on-demand basis. So if you need a bugfix backported or a release, feel free to ask on the mailing list.

Lastly, thanks to all the users. The number of deployments and the amount of feedback are steadily growing, and that's very rewarding.

Download here.

Taking advantage of PHP namespaces with older code

During Rob Allen's ZF2 talk at PHPBenelux an audience member shouted this really useful tip, which I thought was worth sharing.

If you're running PHP 5.3 and you have to use pesky old code that uses long class prefixes (yea, so, pretty much all PHP code out there), you can still make use of namespace features to shorten them.

    <?php
    use Sabre_DAV_Auth_Backend_PDO as AuthBackend;
    use Zend_Controller_Action_Helper_AutoComplete_Abstract as AutoComplete;

    $backend = new AuthBackend();
    ?>

Might have been super obvious to most of you, but it just hadn't occurred to me.

iCalendar / vCard parser for PHP

I've just finished an iCalendar / vCard parser for PHP. It's done almost completely with a 'natural' SimpleXML-like interface, so it should (hopefully) be just as easy to parse and modify iCalendar / vCard objects (ics/vcf files).

To install using pear, run the following:

    pear channel-discover pear.sabredav.org
    pear install sabredav/Sabre_VObject-alpha

Or download from pear.sabredav.org.

For testing, I used this iCalendar file: icalendartest.ics.

To load in an object, you use the Reader class:

    // Link to the correct path if you manually downloaded the package
    include 'Sabre/VObject/includes.php';

    // Reading an object
    $calendar = Sabre_VObject_Reader::read(file_get_contents('icalendartest.ics'));

iCalendar objects consist of components (VEVENT, VTODO, VTIMEZONE, etc), properties (SUMMARY, DESCRIPTION, DTSTART, etc) and parameters, which are to properties what attributes are to elements in XML. To show a listing of all events in a calendar, this snippet would work:

    echo "There are ", count($calendar->vevent), " events in this calendar\n";

    // Looping through events
    foreach ($calendar->vevent as $event) {

        echo (string)$event->dtstart, ": ", $event->summary, "\n";

    }

You can easily modify properties:

    $calendar->vevent[0]->description = "It's a birthday party";

Creating new objects uses the following syntax:

    $todo = new Sabre_VObject_Component('vtodo');
    $todo->summary = 'Take out the dog';
    $calendar->add($todo);

And to turn your newly modified calendar back into an ics file:

    file_put_contents('output.ics', $calendar->serialize());

Lastly, parameters are accessible through array-syntax:

    echo (string)$calendar->vevent[0]->dtstart['tzid'], "\n";

I had fun building this, and I hope it's useful to you as well. It's 100% unit tested, but bugs might still appear due to the complex nature of the API. Use at your own risk :). This library will be part of the SabreDAV project, which is also where you can go for the source, to report bugs or to make suggestions.

Internationalized domain names, are you ready?

Since May 11 TLDs (top-level domain names) have been added. In order for this to work successfully, a lot of applications will have to be fixed.

Many email-validation scripts might use an approach like this:

    $ok = preg_match('/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$/i', $email);

This one is pretty simple: it matches the most common address formats, as long as the TLD (.com, .nl, .uk, etc) is between 2 and 6 letters long. For a bit more sophistication you might want to ensure that the TLD is a bit more valid:

    $ok = preg_match('/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$/i', $email);

Note: both of these regexes were taken from regular-expression.info, which is the top Google hit and has decent examples.
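To see why such patterns need fixing, here is a quick check of both regexes against an address on a Punycode-encoded domain (the xn-- form of the Korean test domain used in this post). The variable names are only for illustration.

```php
<?php

$simple = '/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.[A-Z]{2,6}$/i';
$strict = '/^[A-Z0-9._%+-]+@[A-Z0-9.-]+\.(?:[A-Z]{2}|com|org|net|edu|gov|mil|biz|info|mobi|name|aero|asia|jobs|museum)$/i';

$classic = 'example@example.com';
$idn = 'example@xn--9n2bp8q.xn--9t4b11yi5a';

// A classic address passes both checks
var_dump(preg_match($simple, $classic)); // int(1)
var_dump(preg_match($strict, $classic)); // int(1)

// The encoded TLD contains digits and hyphens and is longer than
// 6 characters, so both patterns reject it.
var_dump(preg_match($simple, $idn)); // int(0)
var_dump(preg_match($strict, $idn)); // int(0)
```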

The new TLDs use non-ASCII characters, and they might become aliases for existing top-level domains, or new TLDs altogether. Here are the currently working examples:

At first sight these look like regular UTF-8 characters, but if you look at the source code of this page, you'll notice that they are actually encoded differently.

The Korean URL http://실례.테스트 is actually encoded as http://xn--9n2bp8q.xn--9t4b11yi5a/. This encoding is called Punycode.

If you want to support these new URLs (and thus these domain names in email addresses), you need to support Punycode. You will likely receive UTF-8 encoded domain names in email addresses (example@실례.테스트), but internally you must make sure that you only deal with the Punycode representation.

This translation is also what modern browsers do. If you paste "http://xn--9n2bp8q.xn--9t4b11yi5a/" directly into the Firefox address bar, it will show you the UTF-8 characters instead. Firefox will re-encode to Punycode though, and use that format for HTTP requests.

The best way really to check for valid email addresses is to use a very liberal regex, but verify with a simple MX record lookup if a mailserver exists for the given domain. This example is an expansion on the first regex.

    $email = 'example@xn--9n2bp8q.xn--9t4b11yi5a';

    if (preg_match('/^[A-Z0-9._%+-]+@([A-Z0-9.-]+\.[A-Z0-9-]{2,})$/i', $email, $matches)) {
        $hostname = $matches[1];
        // getmxrr() returns true when at least one MX record was found
        if (getmxrr($hostname, $hosts)) {
            echo "Host has an MX record\n";
        } else {
            echo "Host does not exist or does not have an MX record\n";
        }
    } else {
        echo "Email address did not match regular expression\n";
    }

The preceding code does not convert UTF-8 to Punycode though. There's not yet an easy native way in PHP to do this, but PEAR's Net_IDNA2 provides a way. The implementation seems very complex though, and leaves me wondering if there's an easier way to go about it.
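On PHP builds where the intl extension happens to be loaded, its idn_to_ascii() function may be a simpler route. The sketch below rests on that assumption, and also assumes PHP 7.1 or newer for the type hints; toAsciiDomain() is a hypothetical wrapper name, and it gives up (returns null) when the extension is missing.

```php
<?php

/**
 * Hypothetical helper: returns the Punycode (ASCII) form of a domain,
 * or null when it cannot be converted.
 */
function toAsciiDomain(string $domain): ?string
{
    // Pure ASCII domains are already in their wire format
    if (!preg_match('/[^\x00-\x7F]/', $domain)) {
        return $domain;
    }

    // Assumption: the intl extension is loaded; without it we give up
    if (!function_exists('idn_to_ascii')) {
        return null;
    }

    $ascii = idn_to_ascii($domain, IDNA_DEFAULT, INTL_IDNA_VARIANT_UTS46);

    return $ascii === false ? null : $ascii;
}

var_dump(toAsciiDomain('example.org'));
var_dump(toAsciiDomain('실례.테스트'));
```

With intl available, the second call should produce the xn-- form of the Korean test domain, which can then be fed to the regex and MX lookup above.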

Storing encrypted session information in a cookie


Our session system is due for an upgrade. Currently all PHP sessions are stored in the database, and some things are getting a bit slow. There have been a couple of approaches I've been considering, one of which is simply storing all the information in a browser cookie.

First, I want to make clear that I don't necessarily condone this. The reason I'm writing this post is that I'm hoping for some more community feedback. Is this a really bad idea? I would love to know.

The benefits

If all the session data is stored in the browser, it means that I don't need to store it on the server. I actually don't care all that much for having the data on the server (unless it's the only secure way). It's mostly a gigantic map with session tokens and user ids (along with some other info).

I also feel it's more natural for HTTP, as it makes it a bit more stateless.

Sample code

    <?php

    class BrowserSession {

        public $secret = 'this will need to be a cryptographic random number';
        public $currentUser = null;

        // Sessions time out after 10 minutes
        public $timeout = 600;

        function init() {

            if (!isset($_COOKIE['MYSESSION'])) {
                echo "No session cookie found\n";
                return;
            }

            // Expected cookie format: userId:time:signature
            $parts = explode(':', $_COOKIE['MYSESSION']);
            if (count($parts) !== 3) {
                echo "The session cookie is malformed\n";
                return;
            }
            list($userId, $time, $signature) = $parts;

            // The cookie is old
            if ((int)$time + $this->timeout < time()) {
                echo "The session cookie timed out\n";
                return;
            }

            // A mismatch means the cookie was tampered with or the secret changed
            if ($signature !== $this->generateSignature($userId, $time)) {
                echo "The signature was incorrect\n";
                return;
            }

            $this->currentUser = $userId;

            echo "Logged in as user: $userId\n";

        }

        function login($userId) {

            $this->currentUser = $userId;

            $time = time();

            // Use the same timestamp in the cookie and in the signature
            $cookie = $userId . ':' . $time . ':' . $this->generateSignature($userId, $time);

            // HTTP-only cookie that expires together with the session
            setcookie('MYSESSION', $cookie, $time + $this->timeout, '', '', false, true);

            echo "Set cookie: $cookie\n";

        }

        function generateSignature($userId, $time) {

            // Tie the signature to the user agent and IP address of the client
            $stringToSign =
                $userId . "\n" .
                $time . "\n" .
                $_SERVER['HTTP_USER_AGENT'] . "\n" .
                $_SERVER['REMOTE_ADDR'];

            return hash_hmac('SHA1', $stringToSign, $this->secret);

        }

    }

    // Buffer output so setcookie() can still send its header after an echo
    ob_start();
    $session = new BrowserSession();
    $session->init();

    if (isset($_GET['login'])) $session->login($_GET['login']);
    else {

        echo '<br /><a href="?login=1234">Log in as user 1234</a>';

    }
    ?>

A few notes:

  • The preceding code was just intended as a proof of concept; it's missing some validation.
  • Currently the secret would be the same for every user. I was thinking of appending some per-user information to the secret. If somebody does guess or brute-force the secret, they would only have access to a single user's information.
  • If a user changes their password, existing sessions should expire. To do this the signature should also include a sequence number that changes when the password changes.
  • Currently this only stores a user id. It could be extended to contain more data, but this is all I need.
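The sequence-number idea from the notes could look roughly like the sketch below. It is only an illustration under a few assumptions: the function names signSession() and isValidSession() and the $passwordVersion counter are hypothetical, SHA-256 is swapped in for SHA1, the user agent and IP address are left out for brevity, and hash_equals() requires PHP 5.6 or newer (the type hints need PHP 7).

```php
<?php

/**
 * Hypothetical signing helper: the per-user password version is part of
 * the signed string, so bumping it invalidates all older cookies.
 */
function signSession(string $secret, string $userId, int $time, int $passwordVersion): string
{
    $stringToSign = $userId . "\n" . $time . "\n" . $passwordVersion;

    return hash_hmac('sha256', $stringToSign, $secret);
}

function isValidSession(
    string $secret,
    string $userId,
    int $time,
    int $passwordVersion,
    string $signature,
    int $timeout,
    int $now
): bool {
    if ($time + $timeout < $now) {
        return false; // The cookie is too old
    }

    // hash_equals() compares in constant time, unlike ===
    return hash_equals(signSession($secret, $userId, $time, $passwordVersion), $signature);
}

$secret = 'example secret, replace with random bytes';
$now = 1000000;
$signature = signSession($secret, '1234', $now, 1);

var_dump(isValidSession($secret, '1234', $now, 1, $signature, 600, $now + 60));  // bool(true)
var_dump(isValidSession($secret, '1234', $now, 2, $signature, 600, $now + 60));  // bool(false)
var_dump(isValidSession($secret, '1234', $now, 1, $signature, 600, $now + 601)); // bool(false)
```

The second call fails because the password version changed after the cookie was signed, and the third because the cookie is past its timeout. The server would still need one lookup per request to fetch the current password version for the user.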

So, is there anything fundamentally wrong with this approach? In general the client should never be trusted, but for setups where the security requirements aren't as high (highly subjective, I know) I feel this might be strong enough. OAuth, OpenID and Amazon AWS all seem to trust HMAC+SHA1, but those applications do work differently.

Credit where it's due

I first asked this question on Stack Overflow. The users there already gave some great suggestions and pointed out some of the flaws. Thank you!


About

My name is Evert, and I've been writing semi-regularly on this blog since 2006.

I'm currently available for contract work.
