Converting a PHP Extension to PHP 7
23 Mar 2015
To dig under the hood of PHP 7 a bit, I decided to see how hard it would be to update my Boxwood extension to be compatible with PHP 7.
Boxwood is something I built at Ning a few years ago to do efficient multi-word replacement in text. We used it to “bleep out” naughty words that Network Creators didn’t want to see on their networks. By building a trie of all the words to replace, it’s able to do a single pass through the text that might contain naughtiness and make replacements speedily.
It’s not a terribly complicated PHP extension but it exercises a few Zend Extension API features such as function calls (obviously), parameter parsing and type checking, resources, module globals and hash table traversal.
Total time to make all the changes was about 40 minutes, and that includes some unrelated-to-PHP-7 housekeeping to remove warnings that gcc didn’t care about when I originally released Boxwood but that clang/LLVM (what I’m using now) complains about.
The interesting changes are all in php_boxwood.c.
First, I had to change how the boxwood resource is handled. Boxwood uses a PHP resource to represent a collection of words to bleep. The resource is created with boxwood_new(), words get added to the resource by boxwood_add_text(), and then boxwood_replace_text() does the replacement. The resource type is now zend_resource (instead of zend_rsrc_list_entry) and there’s a new syntax for creating a resource. Additionally, to retrieve a resource from PHP’s internal resource list when it’s passed as an argument to a userspace function, I had to change to use the zend_fetch_resource() function instead of the
Next, there were some changes to string handling in function arguments and return values. Instead of receiving a string argument with s in the zend_parse_parameters() argument specifier string, I use S. (That’s a capital S instead of lowercase s.) This puts the passed in string into a zend_string structure (instead of separate variables for the character data and length). The val member of the struct has the character data.
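The difference between "two separate variables" and "one struct" is easy to see in a standalone sketch. The `demo_string` type below is a simplified stand-in I made up for illustration, not the real zend_string definition (which, as far as I know, also carries reference-counting and hash bookkeeping). It only shows the general shape: the length and the character data travel together, and the data is reached through a `val` member.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in: length and bytes stored in a single allocation. */
typedef struct demo_string {
    size_t len;
    char val[];   /* flexible array member holding the character data */
} demo_string;

/* Copy len bytes from src into a new demo_string (NUL-terminated for convenience). */
static demo_string *demo_string_new(const char *src, size_t len) {
    assert(src != NULL);
    demo_string *s = malloc(sizeof(demo_string) + len + 1);
    if (s == NULL) {
        perror("malloc");
        exit(EXIT_FAILURE);
    }
    s->len = len;
    memcpy(s->val, src, len);
    s->val[len] = '\0';
    return s;
}

/* Old style: the caller passes the data pointer and the length separately. */
static size_t count_byte_split(const char *data, size_t data_len, char c) {
    size_t n = 0;
    for (size_t i = 0; i < data_len; i++) {
        if (data[i] == c) {
            n++;
        }
    }
    return n;
}

/* New style: one argument carries both the data and its length. */
static size_t count_byte_struct(const demo_string *s, char c) {
    size_t n = 0;
    for (size_t i = 0; i < s->len; i++) {
        if (s->val[i] == c) {
            n++;
        }
    }
    return n;
}
```

With the struct version there is one less variable to declare per string argument, and the length can't get out of sync with the data it describes.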
After that, I had to update the hash traversal in boxwood_replace_text(). The code became much simpler. Instead of tracking hash position myself and using a for loop with verbose increment and condition steps, I use the dainty ZEND_HASH_FOREACH_VAL() macro which conveniently iterates for me, plopping each hash value in a zval for my use.
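For a feel of why a foreach-style macro tidies things up, here is a toy version in standalone C. Everything in it (`demo_table`, `DEMO_FOREACH_VAL`, `DEMO_FOREACH_END`) is hypothetical and far simpler than PHP's real hash tables. It only mimics the pattern: an opening macro that hides the loop bookkeeping and assigns each value to a variable you supply, paired with a closing macro (the real one is closed with ZEND_HASH_FOREACH_END()).

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Toy "table": just an array of strings and a count. */
typedef struct demo_table {
    const char **values;
    size_t count;
} demo_table;

/* Opening macro: starts the loop and stores each element in val_. */
#define DEMO_FOREACH_VAL(table, val_) do { \
        const demo_table *demo_t_ = (table); \
        for (size_t demo_i_ = 0; demo_i_ < demo_t_->count; demo_i_++) { \
            val_ = demo_t_->values[demo_i_];

/* Closing macro: ends the for loop and the enclosing do/while. */
#define DEMO_FOREACH_END() \
        } \
    } while (0)

/* Manual traversal: the caller tracks the position itself. */
static size_t total_length_manual(const demo_table *t) {
    size_t total = 0;
    for (size_t i = 0; i < t->count; i++) {
        const char *v = t->values[i];
        total += strlen(v);
    }
    return total;
}

/* Macro traversal: the position tracking is hidden inside the macro pair. */
static size_t total_length_foreach(const demo_table *t) {
    size_t total = 0;
    const char *v;
    DEMO_FOREACH_VAL(t, v) {
        total += strlen(v);
    } DEMO_FOREACH_END();
    return total;
}
```

Both functions compute the same thing; the second one just leaves the loop body as the only part you have to read.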
The other little cleanup was due to the disappearance of IS_BOOL – the one place I used that I now have to test for
All in all an easy adventure. Next I think I’ll see how it goes getting it to work with HHVM’s ext_zend_compat.