Converting a PHP Extension to PHP 7
23 Mar 2015
To dig under the hood of PHP 7 a bit, I decided to see how hard it would be to update my Boxwood extension to be compatible with PHP 7.
Boxwood is something I built at Ning a few years ago to do efficient multi-word replacement in text. We used it to “bleep out” naughty words that Network Creators didn’t want to see on their networks. By building a trie of all the words to replace, it’s able to do a single pass through the text that might contain naughtiness and make replacements speedily.
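To make the trie idea concrete, here is a minimal sketch in plain C. It is not Boxwood's actual code: the names (trie_node, trie_add, bleep) are hypothetical, matching is case-sensitive, it ignores word boundaries, and it masks every character of a match, all of which are simplifying assumptions. The text is scanned left to right once, and at each position the trie is walked to find the longest listed word that starts there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* One node per byte value; is_word marks the end of a listed word. */
typedef struct trie_node {
    struct trie_node *next[256];
    int is_word;
} trie_node;

static trie_node *trie_new(void) {
    trie_node *n = calloc(1, sizeof *n);
    assert(n != NULL);
    return n;
}

/* Add one word to the trie, creating nodes as needed. */
static void trie_add(trie_node *root, const char *word) {
    trie_node *n = root;
    for (const unsigned char *p = (const unsigned char *)word; *p; p++) {
        if (!n->next[*p]) {
            n->next[*p] = trie_new();
        }
        n = n->next[*p];
    }
    n->is_word = 1;
}

static void trie_free(trie_node *n) {
    if (!n) return;
    for (int i = 0; i < 256; i++) {
        trie_free(n->next[i]);
    }
    free(n);
}

/* Length of the longest listed word starting at s, or 0 if there is none. */
static size_t trie_match(const trie_node *root, const char *s) {
    const trie_node *n = root;
    size_t best = 0;
    for (size_t i = 0; s[i]; i++) {
        n = n->next[(unsigned char)s[i]];
        if (!n) break;
        if (n->is_word) best = i + 1;
    }
    return best;
}

/* Overwrite every matched word in text with a run of mask characters. */
static void bleep(const trie_node *root, char *text, char mask) {
    size_t i = 0;
    while (text[i]) {
        size_t len = trie_match(root, text + i);
        if (len > 0) {
            memset(text + i, mask, len);
            i += len;   /* skip past the replaced word */
        } else {
            i++;
        }
    }
}
```

To use it, build the trie once with trie_new() and trie_add(), call bleep() on each writable buffer, and release the trie with trie_free(). The cost of a scan depends on the text length and the longest listed word, not on how many words are in the list, which is the point of using a trie here.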
It’s not a terribly complicated PHP extension but it exercises a few Zend Extension API features such as function calls (obviously), parameter parsing and type checking, resources, module globals and hash table traversal.
Guided by the handy instructions at https://wiki.php.net/phpng-upgrading I forked ning/boxwood over to davidsklar/boxwood, created a php7 branch and got to work. You can see the complete diff here.
Total time to make all the changes was about 40 minutes, and that includes some unrelated-to-PHP-7 housekeeping to remove warnings that gcc didn’t care about when I originally released boxwood but clang/LLVM (what I’m using now) complains about.
The interesting changes are all in php_boxwood.c.
First, I had to change how the boxwood resource is handled. Boxwood uses a PHP resource to represent a collection of words to bleep. The resource is created with boxwood_new(), words get added to the resource by boxwood_add_text(), and then boxwood_replace_text() does the replacement. The resource type is now zend_resource (instead of zend_rsrc_list_entry) and there’s a new syntax for creating a resource. Additionally, to retrieve a resource from PHP’s internal resource list when it’s passed as an argument to a userspace function, I had to change to use the zend_fetch_resource() function instead of the ZEND_FETCH_RESOURCE() macro.
Next, there were some changes to string handling in function arguments and return values. Instead of receiving a string argument with s in the zend_parse_parameters() argument specifier string, I use S. (That’s a capital S instead of lowercase s.) This puts the passed in string into a zend_string structure (instead of separate variables for the character data and length). The val member of the struct has the character data.
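The shape of that structure is easier to see in code. The sketch below uses a simplified stand-in, demo_string, which is a hypothetical name and not the real zend_string definition: the real struct also carries a refcount header and a cached hash ahead of len, and declares the data as val[1] rather than a C99 flexible array member. What it does share with the real thing is that the length travels with the character data, so one pointer replaces the old pair of variables.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Simplified stand-in for zend_string (assumption: illustration only). */
typedef struct demo_string {
    size_t len;   /* byte length, not counting the trailing NUL */
    char val[];   /* character data stored inline */
} demo_string;

/* Copy len bytes into a newly allocated demo_string. */
static demo_string *demo_string_init(const char *bytes, size_t len) {
    demo_string *s = malloc(sizeof *s + len + 1);
    assert(s != NULL);
    s->len = len;
    memcpy(s->val, bytes, len);
    s->val[len] = '\0';
    return s;
}

/* Count occurrences of c, trusting len rather than a NUL terminator. */
static size_t demo_string_count(const demo_string *s, char c) {
    size_t n = 0;
    for (size_t i = 0; i < s->len; i++) {
        if (s->val[i] == c) n++;
    }
    return n;
}
```

Because the length is stored alongside the bytes, a string with an embedded NUL byte keeps its full length, where strlen() on the raw data would stop early. In released versions of PHP 7 the ZSTR_VAL() and ZSTR_LEN() macros are available for reaching these two members.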
After that, I had to update the hash traversal in boxwood_replace_text(). The code became much simpler. Instead of tracking hash position myself and using a for loop with verbose increment and condition steps, I use the dainty ZEND_HASH_FOREACH_VAL() macro which conveniently iterates for me, plopping each hash value in a zval for my use.
The other little cleanup was due to the disappearance of IS_BOOL: in the one place I used it, I now have to test for IS_TRUE and IS_FALSE separately.
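Here is roughly what that check looks like, sketched with stand-in types so it compiles without the PHP headers. The names demo_zval, DEMO_IS_FALSE and DEMO_IS_TRUE are hypothetical; the tag values are chosen to mirror PHP 7's IS_FALSE, IS_TRUE and IS_LONG. In PHP 7 the boolean's value lives in the type tag itself, so there is no separate payload to read as there was with IS_BOOL.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in type tags (assumption: values mirror PHP 7's zend_types.h). */
enum {
    DEMO_IS_FALSE = 2,
    DEMO_IS_TRUE  = 3,
    DEMO_IS_LONG  = 4
};

/* Stand-in for a zval: a payload plus a type tag. */
typedef struct demo_zval {
    long lval;            /* unused for booleans in this sketch */
    unsigned char type;
} demo_zval;

/* There is no single boolean tag any more, so test both. */
static int demo_is_bool(const demo_zval *zv) {
    assert(zv != NULL);
    return zv->type == DEMO_IS_FALSE || zv->type == DEMO_IS_TRUE;
}

/* The truth value is read from the tag, not from the payload. */
static int demo_bool_value(const demo_zval *zv) {
    assert(zv != NULL);
    return zv->type == DEMO_IS_TRUE;
}
```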
All in all an easy adventure. Next I think I’ll see how it goes getting it to work with HHVM’s ext_zend_compat.