sklar.com

© 1994-2017. David Sklar. All rights reserved.

Converting a PHP Extension to PHP 7

To dig under the hood of PHP 7 a bit, I decided to see how hard it would be to update my Boxwood extension to be compatible with PHP 7.

Boxwood is something I built at Ning a few years ago to do efficient multi-word replacement in text. We used it to “bleep out” naughty words that Network Creators didn’t want to see on their networks. By building a trie of all the words to replace, it’s able to do a single pass through the text that might contain naughtiness and make replacements speedily.
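To make the trie idea concrete, here is a minimal standalone sketch in C. It is not Boxwood's actual code: the names (trie_node, trie_add, trie_mask and so on) are hypothetical, matching is byte-wise and case-sensitive, there is no word-boundary check, and every byte of a match is overwritten with a mask character. Strictly speaking it scans the text once from left to right and does a short trie walk at each position, so the cost depends on the text length and the longest stored word, not on how many words are in the list.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One trie node: a child pointer for every possible byte value. */
typedef struct trie_node {
    struct trie_node *child[256];
    int is_word;                 /* nonzero if a stored word ends here */
} trie_node;

trie_node *trie_new(void) {
    trie_node *n = calloc(1, sizeof *n);
    assert(n != NULL);           /* sketch only: treat allocation failure as fatal */
    return n;
}

/* Add one word to the trie, creating nodes as needed. */
void trie_add(trie_node *root, const char *word) {
    trie_node *n = root;
    for (const unsigned char *p = (const unsigned char *)word; *p; p++) {
        if (!n->child[*p]) {
            n->child[*p] = trie_new();
        }
        n = n->child[*p];
    }
    n->is_word = 1;
}

/* Length of the longest stored word that starts at s, or 0 if none does. */
size_t trie_longest_match(const trie_node *root, const char *s) {
    const trie_node *n = root;
    size_t best = 0, i = 0;
    while (s[i] && (n = n->child[(unsigned char)s[i]]) != NULL) {
        i++;
        if (n->is_word) {
            best = i;
        }
    }
    return best;
}

/* Overwrite every match in text with the mask character, in place. */
void trie_mask(const trie_node *root, char *text, char mask) {
    size_t i = 0;
    while (text[i]) {
        size_t len = trie_longest_match(root, text + i);
        if (len > 0) {
            memset(text + i, mask, len);
            i += len;            /* skip past the word just masked */
        } else {
            i++;
        }
    }
}

void trie_free(trie_node *n) {
    if (!n) return;
    for (int i = 0; i < 256; i++) {
        trie_free(n->child[i]);
    }
    free(n);
}
```

With "darn" and "heck" added, calling trie_mask() on a buffer holding "well darn it" with '*' as the mask leaves "well **** it" in the buffer.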

It’s not a terribly complicated PHP extension, but it exercises a few Zend Extension API features: function calls (obviously), parameter parsing and type checking, resources, module globals and hash table traversal.

Guided by the handy instructions at https://wiki.php.net/phpng-upgrading I forked ning/boxwood over to davidsklar/boxwood, created a php7 branch and got to work. You can see the complete diff here.

Total time to make all the changes was about 40 minutes, and that includes some unrelated-to-PHP-7 housekeeping to remove warnings that gcc didn’t care about when I originally released boxwood but clang/LLVM (what I’m using now) complains about.

The interesting changes are all in php_boxwood.c.

First, I had to change how the boxwood resource is handled. Boxwood uses a PHP resource to represent a collection of words to bleep. The resource is created with boxwood_new(), words get added to the resource by boxwood_add_text(), and then boxwood_replace_text() does the replacement. The resource type is now zend_resource (instead of zend_rsrc_list_entry) and there’s a new syntax for creating a resource. Additionally, to retrieve a resource from PHP’s internal resource list when it’s passed as an argument to a userspace function, I had to change to use the zend_fetch_resource() function instead of the ZEND_FETCH_RESOURCE() macro.

Next, there were some changes to string handling in function arguments and return values. Instead of receiving a string argument with s in the zend_parse_parameters() argument specifier string, I use S. (That’s a capital S instead of lowercase s.) This puts the passed-in string into a zend_string structure, instead of separate variables for the character data and the length. The val member of the struct has the character data and the len member has its length.
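The difference between the two calling styles can be shown without any PHP headers. The sketch below uses a simplified stand-in, model_string (a hypothetical name), that keeps only the len and val members; the real zend_string also starts with a refcount header and a cached hash value, and in an extension you would normally read it through the ZSTR_VAL() and ZSTR_LEN() macros. The helper functions are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for zend_string: length and bytes travel together.
 * (The real struct has extra header fields before these two.) */
typedef struct model_string {
    size_t len;      /* byte length, not counting the trailing NUL */
    char   val[];    /* character data stored inline after the header */
} model_string;

/* Allocate a model_string holding a copy of len bytes from src. */
model_string *model_string_init(const char *src, size_t len) {
    model_string *s = malloc(sizeof *s + len + 1);
    assert(s != NULL);           /* sketch only: treat allocation failure as fatal */
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Old style: character data and length arrive as two separate variables. */
size_t count_byte_old(const char *text, size_t text_len, char needle) {
    size_t count = 0;
    for (size_t i = 0; i < text_len; i++) {
        if (text[i] == needle) {
            count++;
        }
    }
    return count;
}

/* New style: one pointer carries both, via the val and len members. */
size_t count_byte_new(const model_string *text, char needle) {
    return count_byte_old(text->val, text->len, needle);
}
```

The new-style function has one fewer parameter to keep in sync, which is most of what changes at each call site.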

After that, I had to update the hash traversal in boxwood_replace_text(). The code became much simpler. Instead of tracking hash position myself and using a for loop with verbose increment and condition steps, I use the dainty ZEND_HASH_FOREACH_VAL() macro which conveniently iterates for me, plopping each hash value in a zval for my use.
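In PHP 7 the macro is used as a pair: ZEND_HASH_FOREACH_VAL(ht, val) { ... } ZEND_HASH_FOREACH_END();. The sketch below is a standalone model of how such an open/close macro pair can hide the loop bookkeeping. It is not the Zend implementation: model_table, model_slot and the MODEL_ macros are hypothetical names, and the "hash" is just an array of slots, some of which are marked unused.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for a hash table and its buckets. */
typedef struct model_slot {
    int used;        /* 0 means the slot is empty and must be skipped */
    int value;
} model_slot;

typedef struct model_table {
    model_slot *slots;
    size_t      n_slots;
} model_table;

/* Opening half: starts the loop, skips empty slots, and points v at the
 * current value. The body written by the caller goes between the halves. */
#define MODEL_FOREACH_VAL(t, v) do { \
        model_table *t_ = (t); \
        for (size_t i_ = 0; i_ < t_->n_slots; i_++) { \
            if (!t_->slots[i_].used) continue; \
            (v) = &t_->slots[i_].value;

/* Closing half: ends the for loop and the do/while wrapper. */
#define MODEL_FOREACH_END() \
        } \
    } while (0)

/* Caller code: no position variable, no increment or condition steps. */
int sum_values(model_table *t) {
    int *val;
    int total = 0;
    MODEL_FOREACH_VAL(t, val) {
        total += *val;
    } MODEL_FOREACH_END();
    return total;
}
```

The caller supplies only a variable to receive each value and the loop body, which is why the traversal code in the extension shrinks so much.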

The other little cleanup was due to the disappearance of IS_BOOL: in the one place I used it, I now have to test for IS_TRUE and IS_FALSE separately.

All in all, an easy adventure. Next I think I’ll see how it goes getting it to work with HHVM’s ext_zend_compat.