PHPRPC and PHP frameworks

I started the process of submitting PHP-RPC to the major frameworks. I feel I should submit it to all of them, so people can use PHP-RPC regardless of their framework of choice.

Besides that, it might be a good way to gather feedback or critique from the pros.

PEAR

For PEAR I submitted it as a new PEAR2 package. PEAR2 is the upcoming next major version of PEAR, and will be PHP5-only. Much of my code seemed to follow the PEAR2 coding standards, but the approval process will tell.

The most interesting (or weird) change I had to make concerns how classes are included from within other classes. The old PEAR standards dictate:

  <?php
  require_once 'My/Other/Class.php';

  ?>

Which assumes PEAR and its packages are in the include_path. However, the standard for PEAR2 is:

  <?php
  if (!class_exists('My_Other_Class', true)) {
      throw new Exception('Undefined class: My_Other_Class');
  }
  ?>

So, this means that the user of the package has to manually include all the dependencies. There is also an allfiles.php in every directory, which loads the entire package.
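One way an application could avoid including every dependency by hand is to register an autoloader, so that class_exists() (with true as its second argument) loads the file on demand. This is only a sketch: it assumes the usual PEAR convention where underscores in a class name map to directories, and the function names are made up for this example rather than taken from the PEAR2 standard.

```php
<?php

// Hypothetical helper: translates a PEAR-style class name to a relative path,
// e.g. My_Other_Class becomes My/Other/Class.php.
function pearClassToPath($className) {
    return str_replace('_', DIRECTORY_SEPARATOR, $className) . '.php';
}

// Hypothetical autoloader: looks the file up through the include_path.
function pearAutoload($className) {
    $path = pearClassToPath($className);
    // stream_resolve_include_path() returns false when the file is not found
    $resolved = stream_resolve_include_path($path);
    if ($resolved !== false) {
        require_once $resolved;
    }
}

spl_autoload_register('pearAutoload');

// class_exists() with true as second argument now triggers the autoloader
// before a PEAR2-style check would decide to throw.
if (!class_exists('My_Other_Class', true)) {
    echo "My_Other_Class could not be loaded\n";
}
```

With something like this in place, the class_exists() check inside a package only throws when the class really cannot be found anywhere on the include_path.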

This allfiles.php is considered 'for beginners'. The wiki states that it's also for opcode cache friendliness, but this is false (I submitted a bug report). As a consequence, PEAR2 packages become a bit harder to use for the following target audience: "Advanced developer, but doesn't want to trace each class' dependency tree".

The proposal.

Solar

I opened a ticket in Solar's trac asking whether it's smarter to first write the Solar implementation, or to first ask for approval for the contribution, because it would be good to know if Paul M. Jones hates the idea before I start.

Solar follows PEAR's old coding standards. The only annoyance here is that I need to prepend underscores to every private and protected property. (An idea that stems from the PHP4 era, where there was no property visibility).

The ticket.

Zend Framework

I haven't really started with Zend yet. The coding standards seem to be nearly the same as the ones I use myself (except for the change from Sabre_ to Zend_), but in order to submit code to Zend, or even propose a package, you have to sign a contract first, which means I have to print, sign and scan their PDF. Sadly, the only type of paper we have in this house is rolling paper.

PHP-RPC update 4

This should be the last version of the spec for PHP-RPC, unless somebody has some great feedback with stuff I overlooked. It might need some clarification and better writing here and there, but I think the general idea is there.

The API for the server class currently works exactly the same as before, but support for multi-calls has been added. I also added a client class, which is helpful when you move past the prototyping phase and need a more decent way to interact with the service.

Example usage:

  <?php

  $url = 'http://localhost/~evert/phprpc/server.php';

  require_once 'Sabre/PHPRPC/Client.php';

  $client = new Sabre_PHPRPC_Client($url, 'system');

  $data = $client->testingMethod('test');

  print_r($data);

  ?>

Multi-call example:

  <?php

  $url = 'http://localhost/~evert/phprpc/server.php';

  require_once 'Sabre/PHPRPC/Client.php';

  $client = new Sabre_PHPRPC_Client($url, 'system');

  $client->startMultiCall();
  $client->testingMethod('test');
  $client->testingMethod2('test');
  $data = $client->execMultiCall();

  print_r($data);

  ?>

The source can be downloaded from here. I also added the code to a subversion repository.

Here's the updated proposal. Changes have been highlighted.

The proposal (0.3)

Goals

  • Client should be very easy to implement. Server is allowed to be a bit more complex.
  • No duplication of the HTTP protocol. For example, HTTP already provides encryption, redirecting and authentication.
  • PHP 4 and 5 compatibility (and 6 when it is released).
  • Client and server implementations should be built from the idea 'be strict in what you produce, be liberal in what you accept'

The request

Requests are made using either GET or POST. Both should be accepted. GET is more appropriate for fetching information, whereas POST is used for posting new data. POST has the advantage that it doesn't have any limits in the size of the request and an encoding can be supplied. GET has the advantage that information can be fetched using a one-liner.

When there is no encoding specified, UTF-8 is assumed. Data supplied using POST should be encoded as application/x-www-form-urlencoded (this is how a browser submits data by default).

The method that's called should always be supplied as the 'method' variable. The method can contain periods (.) to separate namespaces, like XML-RPC. Arguments can be specified in two ways, and the API documentation should specify which way is appropriate. The first way is using named arguments. A GET example would be:

  http://www.example.org/services/phprpc?method=getUsers&maxItems=20

The method here is getUsers, the named argument is maxItems and its value is 20.

The second way is using a list of arguments, which might be more appropriate in cases where you want to directly map services and methods from a class on the server to the API. This is also how XML-RPC works.

  http://www.example.org/services/phprpc?method=getUsers&arguments[0]=20&arguments[1]=1

The first argument is 20, the second is 1.
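To make the two notations concrete, here is a small sketch of how a client could build both kinds of URL with http_build_query(). The helper name buildRequestUrl and its $named flag are made up for this example and are not part of the spec.

```php
<?php

// Hypothetical helper: builds a PHP-RPC GET url from a method name and arguments.
// Named arguments go directly into the query string, a plain list is sent
// as the 'arguments' variable.
function buildRequestUrl($baseUrl, $method, array $arguments, $named) {
    if ($named) {
        $query = array_merge(array('method' => $method), $arguments);
    } else {
        $query = array('method' => $method, 'arguments' => array_values($arguments));
    }
    // http_build_query() percent-encodes the square brackets, PHP decodes
    // them again on the receiving side.
    return $baseUrl . '?' . http_build_query($query, '', '&');
}

$base = 'http://www.example.org/services/phprpc';

echo buildRequestUrl($base, 'getUsers', array('maxItems' => 20), true), "\n";
// http://www.example.org/services/phprpc?method=getUsers&maxItems=20

echo buildRequestUrl($base, 'getUsers', array(20, 1), false), "\n";
// http://www.example.org/services/phprpc?method=getUsers&arguments%5B0%5D=20&arguments%5B1%5D=1
```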

Smart servers should use reflection to automatically map named arguments to the actual arguments in a list.
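A minimal sketch of what that reflection-based mapping could look like. The helper mapNamedArguments and the getUsers service function are invented for this illustration, and a real server would probably need the same thing for class methods (ReflectionMethod) as well.

```php
<?php

// Hypothetical helper: orders named arguments according to the parameter
// list of the target function, so it can be called with a plain list.
function mapNamedArguments($functionName, array $named) {
    $reflection = new ReflectionFunction($functionName);
    $ordered = array();
    foreach ($reflection->getParameters() as $parameter) {
        $name = $parameter->getName();
        if (array_key_exists($name, $named)) {
            $ordered[] = $named[$name];
        } elseif ($parameter->isDefaultValueAvailable()) {
            $ordered[] = $parameter->getDefaultValue();
        } else {
            throw new InvalidArgumentException('Missing argument: ' . $name);
        }
    }
    return $ordered;
}

// Example service function (made up for this illustration)
function getUsers($maxItems, $page = 1) {
    return 'maxItems=' . $maxItems . ', page=' . $page;
}

$arguments = mapNamedArguments('getUsers', array('maxItems' => 20));
echo call_user_func_array('getUsers', $arguments), "\n"; // maxItems=20, page=1
```

A missing required argument ends up as an exception here, which a server could translate into a 400 status.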

Clients SHOULD supply the version of PHP they are running. This can be either a complete version number, or just the major version (e.g. 4, 5, 6). Clients should supply this as the phpVersion parameter. If the version number is not supplied, the current stable PHP version is assumed, which at the time of writing is 5.

Clients MAY also supply the version of the PHP-RPC protocol as the 'version' parameter. Currently this is 0.3.

Clients MAY supply a returnClasses parameter. The value for returnClasses is either 0 or 1 and this can tell the server if the client is aware of typed objects that might be sent from the server.

The server

The server MUST allow both GET and POST requests. The server MUST treat any incoming text without encoding as UTF-8.

The server SHOULD allow both named arguments and indexed arguments for methods where this is possible.

If the client sent phpVersion, the server MUST convert the returned serialized string so it can be read by the client. If the phpVersion is 4 or 5, the server MUST convert all Unicode strings (type U) to binary strings (type s). If the phpVersion is 4, the server MUST convert all private and protected properties to public properties.

Servers SHOULD also convert all typed objects to either stdClass objects or arrays when the client-supplied returnClasses is set to 0, if this is appropriate.
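As an illustration of the returnClasses = 0 case, a server could flatten objects to arrays before serializing. This is just one possible approach; the helper objectsToArrays and the User class are made up for the example.

```php
<?php

// Hypothetical helper: recursively replaces objects by plain arrays.
// get_object_vars() only returns the properties visible from the calling
// scope, so private and protected properties are dropped here.
function objectsToArrays($value) {
    if (is_object($value)) {
        $value = get_object_vars($value);
    }
    if (is_array($value)) {
        foreach ($value as $key => $item) {
            $value[$key] = objectsToArrays($item);
        }
    }
    return $value;
}

// Example class, made up for this illustration
class User {
    public $name = 'evert';
    protected $password = 'secret';
}

echo serialize(objectsToArrays(array(new User()))), "\n";
// a:1:{i:0;a:1:{s:4:"name";s:5:"evert";}}
```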

The return data is always in PHP's serialize data format. The Content-Type header should always be 'application/x-php-serialized'

The server will always return an array with the following properties:

  • result: The actual return data (or an array with information about an exception, in which case it should have at least the 'message' property).
  • status: HTTP status code for the method call (200 = success, 500 = internal server error, 400 = bad request, etc.). Custom error codes have to start at 600.
  • version: Optional. PHP-RPC protocol version. Currently this is 0.3.
  • server: Optional. Name of the server. Can be any string.
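To show what this structure looks like in practice, here is a rough sketch of both sides. The helper buildResponse, the server name and the sample data are all made up for the example.

```php
<?php

// Hypothetical helper: wraps a return value in the response structure.
function buildResponse($result, $status = 200) {
    return array(
        'result'  => $result,
        'status'  => $status,
        'version' => '0.3',
        'server'  => 'ExampleServer', // any string, made up here
    );
}

// What the server would send as the response body
$body = serialize(buildResponse(array('user1', 'user2')));

// What a client does with it
$response = unserialize($body);
if ($response['status'] === 200) {
    print_r($response['result']);
} else {
    echo 'Error ', $response['status'], ': ', $response['result']['message'], "\n";
}
```

Because the status travels inside the serialized structure, the client can read the error message without having to look at the HTTP status line.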

Multicall

Servers should be able to parse multi-call requests. This allows a client to wrap multiple method calls into one HTTP request. Multi-call only works with arguments specified as sequences.

Making a multi-call is simple: instead of supplying the method as a string, it should be specified as an array with 1 or more method names. Arguments are also wrapped in an array (which then contains one array of arguments per method).

The 'result' key in the response structure will also have to be an array. Each item in this array contains at least a 'status' and a 'result' property, which have the same meaning as in the main result structure.

This means the status of the entire call can be 200 (success), while the individual responses to methods can contain an error code. The top-level status code will only be an error if the actual request was somehow malformed, and the server couldn't process the individual requests.
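A sketch of how I read the multi-call rules, from the client's point of view. The method names come from the client example; the argument values and the helper splitMultiCallResponse are made up.

```php
<?php

// The calls we want to combine: method name plus a sequence of arguments.
$calls = array(
    array('system.testingMethod',  array('test')),
    array('system.testingMethod2', array('test', 5)),
);

$query = array('method' => array(), 'arguments' => array());
foreach ($calls as $call) {
    $query['method'][]    = $call[0];
    $query['arguments'][] = $call[1];
}

// urldecode() is only used to keep the output readable
$queryString = urldecode(http_build_query($query, '', '&'));
echo $queryString, "\n";

// Hypothetical helper: collects the results of a multi-call response and
// reports failed calls by their position.
function splitMultiCallResponse(array $response) {
    $results = array();
    $errors  = array();
    foreach ($response['result'] as $index => $item) {
        if ($item['status'] === 200) {
            $results[$index] = $item['result'];
        } else {
            $errors[$index] = $item['status'];
        }
    }
    return array($results, $errors);
}
```

The position of each item in the response matches the position of the method in the request, so a client can tell which call failed even when the top-level status is 200.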

PHP-RPC update 3

I figured the best way to come up with a usable PHP-RPC spec is to put it into practice and see where the problems appear. This is to prevent putting 'academic correctness' before usefulness.

So, I created a server implementation, and so far it works well. It's under a BSD license, so feel free to give it a shot. Right now I put everything in the 'Sabre' package; hopefully at one point this will be Zend, PEAR2 or Solar, but I'll look at that when I can put the 1.0 stamp on it. This implementation has only been tested with PHP 5.2 and 5.1.

download link

The server class:

  <?php

  // creating the server object
  require_once 'Sabre/PHPRPC/Server.php';
  $server = new Sabre_PHPRPC_Server();

  // handling method calls
  // $method contains a method name, for example 'blog.getPosts'
  // $argumentNotation is 1 if the arguments are a simple array, 2 if they are specified as a struct
  function invokeMethod($method, array $arguments, $argumentNotation) {

      return 'Hello World! You called the following method: ' . $method;

  }

  $server->setInvokeCallback('invokeMethod');

  // after this point, everything is automatic.
  $server->exec();

  ?>

A sample client:

  <?php

  $url = 'http://www.example.org/phprpc.php';

  $data = file_get_contents($url . '?method=system.testingMethod');
  $data = unserialize($data);

  echo $data['result']; // will output "Hello World! You called the following method: system.testingMethod"

  ?>

Here's the updated proposal. Changes have been highlighted.

The proposal (0.2)

Goals

  • Client should be very easy to implement. Server is allowed to be a bit more complex.
  • No duplication of the HTTP protocol. For example, HTTP already provides encryption, redirecting and authentication.
  • PHP 4/5/6 compatibility.
  • Client and server implementations should be built from the idea 'be strict in what you produce, be liberal in what you accept'

The request

Requests are made using either GET or POST. Both should be accepted. GET is more appropriate for fetching information, whereas POST is used for posting new data. POST has the advantage that it doesn't have any limits in the size of the request and an encoding can be supplied. GET has the advantage that information can be fetched using a one-liner.

When there is no encoding specified, UTF-8 is assumed. Data supplied using POST should be encoded as application/x-www-form-urlencoded (this is how a browser submits data by default).

The method that's called should always be supplied as the 'method' variable. The method can contain periods (.) to separate namespaces, like XML-RPC. Arguments can be specified in two ways, and the API documentation should specify which way is appropriate. The first way is using named arguments. A GET example would be:

  http://www.example.org/services/phprpc?method=getUsers&maxItems=20

The method here is getUsers, the named argument is maxItems and its value is 20.

The second way is using a list of arguments, which might be more appropriate in cases where you want to directly map services and methods from a class on the server to the API. This is also how XML-RPC works.

  http://www.example.org/services/phprpc?method=getUsers&arguments[0]=20&arguments[1]=1

The first argument is 20, the second is 1.

Smart clients should autodetect if the user is trying to use named arguments or a sequence by checking out the type of the keys in the array.
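Checking the keys could be as simple as the following sketch. The function name isSequence is made up, and treating an empty array as a sequence is my own assumption.

```php
<?php

// Hypothetical helper: returns true when the array looks like a sequence
// (keys 0, 1, 2, ...) and false when it has named keys.
function isSequence(array $arguments) {
    $expected = 0;
    foreach ($arguments as $key => $value) {
        if ($key !== $expected) {
            return false;
        }
        $expected++;
    }
    return true;
}

var_dump(isSequence(array(20, 1)));            // bool(true)
var_dump(isSequence(array('maxItems' => 20))); // bool(false)
```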

Smart servers should use reflection to automatically map named arguments to the actual arguments in a list.

Clients SHOULD supply the version of PHP they are running. This can be either a complete version number, or just the major version (e.g. 4, 5, 6). Clients should supply this as the phpVersion parameter. If the version number is not supplied, the current stable PHP version is assumed, which at the time of writing is 5.

Clients MAY also supply the version of the PHP-RPC protocol as the 'version' parameter. Currently this is 0.2.

Clients MAY supply a returnClasses parameter. The value for returnClasses is either 0 or 1 and this can tell the server if the client is aware of typed objects that might be sent from the server.

The server

The server MUST allow both GET and POST requests. The server MUST treat any incoming text without encoding as UTF-8.

The server SHOULD allow both named arguments and indexed arguments for methods where this is possible.

If the client sent phpVersion, the server MUST convert the returned serialized string so it can be read by the client. If the phpVersion is 4 or 5, the server MUST convert all Unicode strings (type U) to binary strings (type s). If the phpVersion is 4, the server MUST convert all private and protected properties to public properties.

Servers SHOULD also convert all typed objects to either stdClass objects or arrays when the client-supplied returnClasses is set to 0, if this is appropriate.

I'm fairly sure I will remove the following paragraph. If PHP gets an HTTP 500 on a file_get_contents, it will emit a warning and return false, which removes the possibility to easily grab the error message.

When the method call was successful, the server should send HTTP code 200. When an error occurred, the server should send an appropriate HTTP error code (for example 400 for missing arguments, 500 for unexpected exceptions, 401 if the user should authenticate first and 403 if the method was not allowed to be called).
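For what it's worth, newer PHP versions have an ignore_errors option for the http stream context, which makes file_get_contents() return the body even when the server answers with an error status. A sketch of how a client could use it; the helper names are made up, and this is no longer the 2 or 3 line client the spec is aiming for.

```php
<?php

// Hypothetical helper: extracts the status code from a status line such as
// "HTTP/1.1 500 Internal Server Error". Returns 0 if it cannot be parsed.
function parseStatusLine($statusLine) {
    if (preg_match('#^HTTP/\d+(?:\.\d+)?\s+(\d{3})#', $statusLine, $matches)) {
        return (int)$matches[1];
    }
    return 0;
}

// Hypothetical helper: fetches a url and returns array(status, body), even
// when the server answered with an error status.
function fetchWithStatus($url) {
    $context = stream_context_create(array(
        'http' => array('ignore_errors' => true),
    ));
    $body = file_get_contents($url, false, $context);
    // $http_response_header is filled in by the http stream wrapper
    $status = isset($http_response_header[0]) ? parseStatusLine($http_response_header[0]) : 0;
    return array($status, $body);
}
```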

The return data is always in PHP's serialize data format. The Content-Type header should always be 'application/x-php-serialized'

The server will always return an array with the following properties:

  • result: The actual return data (or an array with information about an exception, in which case it should have at least the 'message' property).
  • status: HTTP status code for the method call (200 = success, 500 = internal server error, 400 = bad request, etc.). Custom error codes have to start at 600.
  • version: Optional. PHP-RPC protocol version. Currently this is 0.2.
  • server: Optional. Name of the server. Can be any string.

PHP serializer 0.2

I updated the serializer to make use of spl_object_hash() where this function is available. This means it will go a lot faster on PHP 5.2 when serializing objects (suggestion from Sebastian Bergmann).
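The idea, roughly sketched below, is to look objects up by hash instead of comparing against every object seen so far. This is not the actual serializer code; the class SeenObjects is a made-up illustration of the technique with a fallback for PHP versions without spl_object_hash().

```php
<?php

// Hypothetical helper: remembers which objects were already serialized.
// find() returns the position of an object seen before, or false for a new one.
class SeenObjects {

    private $hashes  = array(); // hash => position, used when spl_object_hash() exists
    private $objects = array(); // position => object, also keeps the objects alive
    private $useHash;

    function __construct() {
        $this->useHash = function_exists('spl_object_hash');
    }

    function find($object) {
        if ($this->useHash) {
            $hash = spl_object_hash($object);
            return isset($this->hashes[$hash]) ? $this->hashes[$hash] : false;
        }
        // fallback: linear search, strict comparison means "same instance"
        return array_search($object, $this->objects, true);
    }

    function add($object, $position) {
        // Holding on to the object matters: a hash can be reused once the
        // object it belonged to has been destroyed.
        $this->objects[$position] = $object;
        if ($this->useHash) {
            $this->hashes[spl_object_hash($object)] = $position;
        }
    }
}
```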

Links

PHP serializer in userland code

I did a bit of work on an alternative for serialize(), written in PHP.

I wanted to build this as a helper class for a draft PHP-RPC server. The reason I needed a custom one was that I wanted to be able to spit out PHP4-compatible serialized data and, in the future, when it's ported to PHP6, also PHP5-compatible data.

Some of my findings:

  • It's dead slow compared to the built-in version (as expected). What PHP's built-in serializer could do in 0.00366 seconds took mine 0.0948 seconds.
  • Even though it's CPU-expensive, less memory is needed for big structures, because it uses echo and can stream the output straight to the client if needed.
  • When a property is private or protected, it really is. There's no way to grab the value. I was hoping Reflection would have allowed me to cheat.
  • There's no proper way to find out if two variables reference the same data. The only way is to change one of them, and see if the second also changed.
  • I was hoping SPLObjectStorage would be able to give me back an index of the stored object. Instead I'm looping through all the objects I have and using === to see if they are the same.
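The "change one and see if the other changed" trick for references can be sketched like this. The function name isSameReference is made up, and this is an illustration of the idea rather than the code from the serializer.

```php
<?php

// Hypothetical helper: checks whether two variables are references to the
// same data by temporarily changing one and looking at the other.
function isSameReference(&$a, &$b) {
    $backup = $a;
    // a fresh object is never identical (===) to any existing value
    $marker = new stdClass();
    $a = $marker;
    $same = ($b === $marker);
    $a = $backup; // restore the original value
    return $same;
}

$x = array(1, 2);
$y = &$x; // reference
$z = $x;  // copy

var_dump(isSameReference($x, $y)); // bool(true)
var_dump(isSameReference($x, $z)); // bool(false)
```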

I'm starting to wonder now if it's a better idea to just use serialize() and make the needed fixes with regexes and stuff, but that's an experiment for another night.

For the people who might find it useful, here's the download and source code.

List of differences with PHP's serialize:

  1. It only checks references for objects.
  2. It converts Serializable objects to strings when the target version is PHP4.
  3. It has a setting that allows you to automatically convert any object to an array or STDClass.
  4. It ignores all private and protected variables.

PHP-RPC

Update: I found out the difference between pointer and normal references, so I updated the 'data format' section. Update 2: Got the definition of all data types.

Over the past time I've seen several proposals and implementations of people trying to leverage PHP's serialize format for RPC (remote procedure calls).

PHP's format is very compact compared to XML-RPC, not to mention SOAP. There's no complex XML parsing involved and it's very fast to parse. Consuming a webservice that uses this format can often be done in 2 or 3 lines of code without any external library.

Additionally, it allows you to send over typed objects. You could, for example, send an object of the 'User' class, and on the other end of the line it would show up with the exact same class name.

Disadvantages:

  • Most of the advantages only apply when it's used with PHP. There are better ways to communicate when there are other languages involved.
  • The class mapping will only be effective when both the client and the server have the same class definitions.
  • The serialize structure is not 100% compatible between versions.
  • There is no formal standard for either the structure or the RPC protocol.

I hope, by typing the following document I can fix the last 2 problems in this list. The classmapping issue might also be fixed down the road by adding some kind of negotiation scheme. Another TODO is adding introspection and multiple calls in one request, like most XML-RPC implementations today support.

If there's enough interest in a standard like this, I will change this document into a more 'official' one and detach it from this blog. If there's not, well, it means I will have a nice set of business requirements for use within our business :). Please note that this is an early draft, so it is subject to change.

The proposal (0.1)

Goals

  • Client should be very easy to implement. Server is allowed to be a bit more complex.
  • No duplication of the HTTP protocol. For example, HTTP already provides encryption, redirecting and authentication.
  • PHP 4/5/6 compatibility.
  • Client and server implementations should be built from the idea 'be strict in what you produce, be liberal in what you accept'

The request

Requests are made using either GET or POST. Both should be accepted. GET is more appropriate for fetching information, whereas POST is used for posting new data. POST has the advantage that it doesn't have any limits in the size of the request and an encoding can be supplied. GET has the advantage that information can be fetched using a one-liner.

When there is no encoding specified, UTF-8 is assumed. Data supplied using POST should be encoded as application/x-www-form-urlencoded (this is how a browser submits data by default).

The method that's called should always be supplied as the 'method' variable. The method can contain periods (.) to separate namespaces, like XML-RPC. Arguments can be specified in two ways, and the API documentation should specify which way is appropriate. The first way is using named arguments. A GET example would be:

  http://www.example.org/services/phprpc?method=getUsers&maxItems=20

The method here is getUsers, the named argument is maxItems and its value is 20.

The second way is using a list of arguments, which might be more appropriate in cases where you want to directly map services and methods from a class on the server to the API. This is also how XML-RPC works.

  http://www.example.org/services/phprpc?method=getUsers&arguments[0]=20&arguments[1]=1

The first argument is 20, the second is 1.

Smart clients should autodetect if the user is trying to use named arguments or a sequence by checking out the type of the keys in the array.

Smart servers should use reflection to automatically map named arguments to the actual arguments in a list.

Clients SHOULD supply the version of PHP they are running. This can be either a complete version number, or just the major version (e.g. 4, 5, 6). Clients should supply this as the phpVersion parameter. If the version number is not supplied, the current stable PHP version is assumed, which at the time of writing is 5.

Clients SHOULD also supply the version of the PHP-RPC protocol as the 'version' parameter. Currently this is 0.1.

Clients MAY supply a returnClasses parameter. The value for returnClasses is either 0 or 1 and this can tell the server if the client is aware of typed objects that might be sent from the server.

The server

The server MUST allow both GET and POST requests. The server MUST treat any incoming text without encoding as UTF-8.

The server SHOULD allow both named arguments and indexed arguments for methods where this is possible.

If the client sent phpVersion, the server MUST convert the returned serialized string so it can be read by the client. If the phpVersion is 4 or 5, the server MUST convert all Unicode strings (type U) to binary strings (type s). If the phpVersion is 4, the server MUST convert all private and protected properties to public properties.

Servers SHOULD also convert all typed objects to either stdClass objects or arrays when the client-supplied returnClasses is set to 0, if this is appropriate.

When the method call was successful, the server should send HTTP code 200. When an error occurred, the server should send an appropriate HTTP error code (for example 400 for missing arguments, 500 for unexpected exceptions, 401 if the user should authenticate first and 403 if the method was not allowed to be called).

The return data is always in PHP's serialize data format. The Content-Type header should always be 'application/x-php-serialized'

When an error occurred, the server MUST send back an array with at least the 'message' property, which should contain a description of the error that occurred. The server MAY supply more information in this array, such as line number, filename, class of the exception, stack trace, etc.

The serialized data format

All data is serialized using PHP's serialize format. This is an unofficial specification. Although the format is human-readable, it is and should be treated as a binary format.

All items start with a 1-byte type identifier. These are the different types:

  a  array
  b  boolean
  C  object which implements Serializable
  d  double
  i  integer
  N  null
  o  seems to be a deprecated way to encode objects
  O  object + class
  r  reference
  R  pointer reference
  s  string
  S  escaped string. PHP6 uses this, but recent versions of PHP5 can also decode it.
  U  Unicode string (PHP6)

A boolean, double and integer all have the format:

  type:value;

Where type is either b, i or d and value is a literal number (e.g. 12, 85.12, or 1 for true, 0 for false).

A null is specified as:

  N;

A string is specified with the length of the string, and the actual string between double quotes.

  s:10:"helloworld";

A Unicode string works the exact same way; however, this type is only supported in PHP6. PHP6 differentiates between binary strings and Unicode strings. Strings coming from older versions of PHP will therefore always be treated as binary strings in PHP6. Unicode strings will be supplied as UTF-16 and the length specifies the number of bytes, not characters.

Arrays wrap their elements in curly braces { }. The contents of the array are simply a list of the other types (or more arrays).

  a:lengthofarray:{key1 + value1 + key2 + value2}

Note: the + signs are not literals here. Also note that arrays and objects are the only types that do not end with a semi-colon (;).

Example:

  a:2:{i:0;s:3:"moo";i:1;s:4:"unox";}
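These basic types are easy to check by feeding a few values to serialize(). I left doubles out of this quick test because, as far as I know, the number of digits in their output depends on PHP's precision settings.

```php
<?php

echo serialize(true), "\n";                 // b:1;
echo serialize(12), "\n";                   // i:12;
echo serialize(null), "\n";                 // N;
echo serialize('helloworld'), "\n";         // s:10:"helloworld";
echo serialize(array('moo', 'unox')), "\n"; // a:2:{i:0;s:3:"moo";i:1;s:4:"unox";}

// and back again
print_r(unserialize('a:2:{i:0;s:3:"moo";i:1;s:4:"unox";}'));
```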

Objects work similarly to arrays, but they include the name of the class.

  O:classnamelength:"ClassName":propertycount:{key1 + value1 + key2 + value2}

Example:

  O:7:"MyClass":2:{s:8:"\0*\0prop1";s:6:"value1";s:5:"prop2";s:6:"value2";}

Here \0 stands for a single NUL byte (0x00).

PHP5 introduces private and protected properties. If the name of a property is prepended with a * wrapped in NUL bytes (0x00 + * + 0x00), it means the property is protected. When a property is private, it includes the name of the defining class and contains 0x00 before and 0x00 after the name of the class.

Written out, that would be:

  public s:9:"property1";s:6:"value1";
  protected s:12:"0x00 + * + 0x00 + property2";s:6:"value2";
  private s:20:"0x00 + ClassName + 0x00 + property3";s:6:"value3"; // all whitespace and + signs should be ignored in these lines, each 0x00 counts as one byte
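Because NUL bytes are invisible in most terminals, the easiest way to see the exact property names is to make them visible before printing. A small check, using a made-up MyClass with one property of each visibility:

```php
<?php

class MyClass {
    public    $property1 = 'value1';
    protected $property2 = 'value2';
    private   $property3 = 'value3';
}

// Replace every NUL byte by the two visible characters \0 before printing.
// The lengths in the output still count each NUL as a single byte.
echo str_replace("\0", '\0', serialize(new MyClass())), "\n";
// O:7:"MyClass":3:{s:9:"property1";s:6:"value1";s:12:"\0*\0property2";s:6:"value2";s:18:"\0MyClass\0property3";s:6:"value3";}
```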

PHP5 also introduced the Serializable interface, which allows custom encoding of objects. Serializable objects are encoded as:

  C:classnamelength:"ClassName":datalength:{data}

So if the serialize method returned "foo", the result could look like this:

  C:9:"TestClass":3:{foo}

Lastly, references. When the structure you're serializing contains a reference to a variable that was used earlier, it will be encoded as a reference. The main reason for this is that with a circular reference, the serializer would keep on traversing your structure until, well, it would break. Also, if two variables reference the same data, that link is actually maintained.

A reference looks like this:

  R:19;

There is a second reference type in PHP5. Objects in PHP sort of work like other references, but not completely. To illustrate I will just show an example.

  <?php

  class MyClass {
      var $myProp = 1;
  }

  $obj1 = new MyClass(); // new object
  $obj2 = &$obj1; // pointer reference
  $obj3 = $obj1; // value reference

  $obj1->myProp = 2;
  echo $obj2->myProp, "\n"; // will display 2
  echo $obj3->myProp, "\n"; // will display 2

  $obj1 = new MyClass();
  $obj1->myProp = 3;
  echo $obj2->myProp, "\n"; // will display 3
  echo $obj3->myProp, "\n"; // will display 2

  ?>

PHP4 made a copy of every object when it was assigned to a new variable. To see the difference, the output of this script in PHP4 would be:

  2
  1
  3
  1

Value references in PHP are serialized as:

  r:19;

If you want to know which variable this refers to, look for the 19th variable you decoded so far in your structure (you start counting at 1), excluding other references and property names (so array indexes don't count, but array values do).

PHP4 seems to be treating r and R as the same thing (pointer references), so there's no need for conversion for PHP4 clients. I tested this with PHP 4.4.4.
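Both reference types, and the way the counting works, can be seen with two tiny structures. In both cases the array itself is variable 1 and its first value is variable 2. The outputs in the comments are what I expect from PHP5.

```php
<?php

// The same object twice: the second occurrence becomes an 'r' reference.
$object = new stdClass();
echo serialize(array($object, $object)), "\n";
// a:2:{i:0;O:8:"stdClass":0:{}i:1;r:2;}

// Two array elements that are PHP references to each other: 'R'.
$list = array(1);
$list[1] = &$list[0];
echo serialize($list), "\n";
// a:2:{i:0;i:1;i:1;R:2;}
```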

About

My name is Evert, and I've been writing semi-regularly on this blog since 2006.
