ning
sklar.com/blog
Wednesday, September 29. 2010
At work, we added a language filter to Ning Pro last month. It lets Network Creators have naughty words (for the Network Creator's definition of "naughty") replaced with * characters.
A straightforward way to do this in PHP is to pass an array of words to look for and their replacements to a function like str_replace() or str_ireplace(). Or, similarly, use a regular expression that gloms the search terms together (and potentially checks word boundaries). There are assorted WordPress plugins that work like this.
The problem with this approach is that it's really slow. Especially if you have a lot of words you're looking for. The amount of time it takes to do the search and replace grows in proportion to the number of words you're looking for. This is particularly unfortunate because usually, none of the words are ever found!
For our language filter, we took a different approach. We've packaged it up into a PHP extension called Boxwood and are releasing it today as open source. (Find it on github: http://github.com/ning/boxwood.)
With Boxwood, you can have your list of search terms be as long as you like -- the search and replace algorithm doesn't get slower with more words on the list of words to look for. It works by building a trie of all the search terms and then scanning your subject text just once, walking down elements of the trie and comparing them to characters in your text. It supports US-ASCII and UTF-8, case-sensitive or insensitive matching, and has some English-centric word boundary checking logic.
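To make the trie idea concrete, here is a rough sketch in plain PHP. It is not Boxwood's code or API: the function names trie_build() and trie_replace() are made up for illustration, and the sketch is byte-oriented, case-sensitive, and does no word boundary checking. It also restarts the trie walk at each position in the text, so the work per position is bounded by the longest search term rather than by the number of terms.

```php
<?php
// Hypothetical helper: build a trie as nested arrays keyed by single bytes.
// The three-character key 'end' marks the end of a word; it cannot collide
// with the one-byte keys used for the edges.
function trie_build(array $words) {
    $root = array();
    foreach ($words as $word) {
        $node = &$root;
        $len = strlen($word);
        for ($i = 0; $i < $len; $i++) {
            $c = $word[$i];
            if (!isset($node[$c])) {
                $node[$c] = array();
            }
            $node = &$node[$c];
        }
        $node['end'] = true;
        unset($node); // drop the reference before the next word
    }
    return $root;
}

// Hypothetical helper: replace every (longest) match with '*' characters,
// one per matched byte.
function trie_replace($text, array $trie) {
    $out = '';
    $len = strlen($text);
    $i = 0;
    while ($i < $len) {
        $node = $trie;
        $matchLen = 0;
        // Walk down the trie for as long as the text keeps matching.
        for ($j = $i; $j < $len && isset($node[$text[$j]]); $j++) {
            $node = $node[$text[$j]];
            if (isset($node['end'])) {
                $matchLen = $j - $i + 1; // longest match seen so far
            }
        }
        if ($matchLen > 0) {
            $out .= str_repeat('*', $matchLen);
            $i += $matchLen;
        } else {
            $out .= $text[$i];
            $i++;
        }
    }
    return $out;
}

$trie = trie_build(array('darn', 'heck'));
echo trie_replace('a darn good heck of a day', $trie), "\n";
```

Adding more words makes the trie bigger, but the scan still only follows the one path that matches the text in front of it.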
Take it for a drive and let us know what you think!
Tuesday, May 4. 2010
I just posted on the Ning code blog about the PHP microbenchmarking framework we released:
I'm pleased to announce the release of ub, a PHP microbenchmarking framework. You can download it from http://github.com/ning/ub.
The goal is to make it as easy as possible to compare the runtime of alternative approaches to the same problem, such as different regular expressions, or different methods for string or array manipulation.
The source distribution contains a README with some documentation and a bunch of sample benchmarks.
For normal use, it is rare that two similar, but different approaches produce appreciable differences in runtime. (Inefficient regexes and bloated call stacks aside.) The payoff from this kind of benchmarking is really on operations that happen hundreds or thousands of times in a request, or are happening on hundreds or thousands of servers. At that point, shaving off small amounts of runtime performance can really make a difference.
I am looking forward to beefing up the set of included benchmarks -- contributions are welcome!
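For a sense of what this kind of comparison looks like by hand, here is a minimal sketch. It does not use ub or its API (see the README in the source distribution for that). The time_it() helper and the two candidates are assumptions picked purely for illustration.

```php
<?php
// Hypothetical helper: call $fn $iterations times and return elapsed seconds.
function time_it($fn, $iterations = 10000) {
    $start = microtime(true);
    for ($i = 0; $i < $iterations; $i++) {
        call_user_func($fn);
    }
    return microtime(true) - $start;
}

$subject = str_repeat('The quick brown fox. ', 50);

// Two alternative approaches to the same problem: is "fox" in the string?
$candidates = array(
    'strpos' => function () use ($subject) {
        return strpos($subject, 'fox') !== false;
    },
    'preg_match' => function () use ($subject) {
        return preg_match('/fox/', $subject) === 1;
    },
);

foreach ($candidates as $name => $fn) {
    printf("%-12s %.4f sec\n", $name, time_it($fn));
}
```

The numbers vary from run to run and machine to machine, so only differences that hold up across repeated runs are worth acting on.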
Tuesday, September 16. 2008
I presented my Static and Dynamic Analysis at Ning talk today at the 2008 Zend/PHP Conference. The conference is much bigger this year than last, and it's very exciting to see all the different things people are doing.
The slides from my talk are available at http://www.sklar.com/files/static-dynamic-analysis-zendcon-2008.pdf .
Thursday, November 8. 2007
Friday, October 19. 2007
Jon Aquino, my talented cow-orker, built a time-lapse view for Subversion. This is fantastic.
We switched from Perforce to Subversion recently (at Ning), and one of the things that I missed about Perforce was its handy time-lapse-view feature. No longer!
Friday, October 12. 2007
The slides from my API design talk yesterday are available.
If you missed Zendcon, come to the DC PHP Conference in a few weeks. I'll be talking about API Design there on November 8.
Thursday, October 4. 2007
As Diego and Gina have pointed out, the Ning Platform has just celebrated its second birthday.
The past two years have been a lot of work and a lot of fun. It's been gratifying to see the crazy, heartwarming, innovative uses folks have come up with for the platform and also extremely rewarding to work with such talented people.
I wonder what I'll have to say in a blog post a year from now that links to this one?
Wednesday, July 11. 2007
I'm going to be giving a talk on "API Design in PHP" at ZendCon 2007 in October.
Our PHP API at Ning is a little over two years old at this point. There are some things I love about it that I think we've done really well, some things that we have revised, some things we can do better, and some interesting things planned for the future.
The talk is all about what we've learned developing, building, and supporting the API with both a ferocious devotion to backwards compatibility and a rapid new release cycle.
See you in October!
Wednesday, April 25. 2007
So I've got this string (in PHP) and I need to scan through it character by character. I can't scan byte by byte because it's 2007, our users write in all sorts of languages, and the string is UTF-8.
The PHP 5 solution uses mb_strlen() to find the length and then mb_substr() to grab each character:
$j = mb_strlen($theString);
for ($k = 0; $k < $j; $k++) {
    $char = mb_substr($theString, $k, 1);
    // do stuff with $char
}
In PHP 6, one would do:
foreach (new TextIterator($theString, TextIterator::CHARACTER) as $char) {
    // do stuff with $char
}
Some rough benchmarks on a 1500-character (2900-byte) string (Linux, whatever processor is inside this Thinkpad T43 here, your mileage may vary, etc etc etc) give me about 61 scans/sec with PHP 5.2.1, where a "scan" is one pass through the mb_substr() loop above with a single if() test comparing each character to '<'.
Under PHP 6.0.0-dev with unicode.semantics=on, switching from mb_strlen() and mb_substr() to regular strlen() and substr() produces about the same result. And indexing with $theString[$k] is the same speed as substr().
However, the TextIterator case is much faster, about 450 scans/sec!
Nicely done!
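For PHP versions where TextIterator isn't available, one way to avoid calling mb_substr() once per character is to split the string into characters up front and loop over the pieces. This is a sketch of a swapped-in technique, not something benchmarked in the numbers above: it uses preg_split() with the /u (UTF-8) modifier, and the helper names utf8_chars() and count_char() are made up for illustration.

```php
<?php
// Hypothetical helper: split a UTF-8 string into an array of characters.
// preg_split() returns false when the input is not valid UTF-8, so fall
// back to an empty array in that case.
function utf8_chars($theString) {
    $chars = preg_split('//u', $theString, -1, PREG_SPLIT_NO_EMPTY);
    return $chars === false ? array() : $chars;
}

// Hypothetical helper: the same kind of "scan" as above, counting how many
// characters are equal to $needle.
function count_char($theString, $needle) {
    $n = 0;
    foreach (utf8_chars($theString) as $char) {
        if ($char === $needle) {
            $n++;
        }
    }
    return $n;
}

// "\xC3\xA9" is the two-byte UTF-8 encoding of e-acute.
echo count_char("<p>caf\xC3\xA9</p>", '<'), "\n";
```

The tradeoff is memory: the whole array of characters exists at once, where the mb_substr() loop holds only one character at a time.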
Thursday, September 28. 2006
It's been a lot of hard work, so I'm quite excited that we've just released three great new Ning sites: Ning Videos, Ning Photos, and Ning Group.
I particularly like the embeddable slideshow that Ning Photos has, and its companion in Ning Videos, the embeddable player -- so you can put photos or videos on your blog or wherever. Both apps let you e-mail in content from your phone, too. Ning Group has some spiffy HTML parsing and file upload features so you can share documents with folks and incorporate music, pics, or anything else in the forums.
Plus, all three sites have the juicy bits that every site on the Ning platform gets -- things such as cloneability, complete customization, and built-in REST APIs. I've been watching the feeds for clones of photos and videos -- I suppose seeing who's cloned sites you care about is the Web 2.0 version of ego surfing.
More on the Ning Blog and from Kyle.
Thursday, August 24. 2006
Some neat Ning + PHP related stuff recently: Ben and Elizabeth set up a Group clone for PHPCommunity -- http://phpcommunity.ning.com.
Ben also set up an app -- http://zendfw.ning.com -- where he installed the Zend Framework and made a few tweaks so it's running happily on the Ning Playground. I was pleased to see that our URL mapping support can handle everything that Zend Framework needs.
Tuesday, July 18. 2006
I live in New York City, but I'm gearing up for some extended travel. From September 1 to the end of the year, I'll be in Palo Alto, CA. I'm looking forward to working at Ning HQ for a few months. Then, from the beginning of January 2007 to the end of May, I'll be in Paris, resuming my working-remotely existence.
I have some packing to do!
Monday, June 26. 2006
Over the weekend we launched the "Ningbar" -- what Diego, Brian, and the rest of the crew at Ning have been cranking away on for a while.
The Ningbar is not just a replacement for the sidebar that used to accompany all Ning apps. It's the control center for getting the most out of Ning, whether you're using an app, cloning an app (which now takes exactly two clicks), or writing code behind an app. In the Ningbar, you can get stats about the app you're using (who else uses it, what do your friends do on the app), check your messages or send messages to others, customize your apps, or clone the app you're looking at.
Of course (we wouldn't have it any other way!) the Ningbar is completely customizable and programmable. You can tweak every aspect of its appearance and behavior. And because the Ningbar sits on top of our open Javascript and REST APIs, what the Ningbar can do is only limited by what cool stuff you can think of to build. (This panel showing the current weather was a fun quick hack.)
Gina's post on the Ning blog has lots of screen shots and gives some more background on Ningbar goodies. As always, http://documentation.ning.com has the programming and API details for how to make the Ningbar do your bidding. In particular, check out the sections on our new Javascript APIs and interface customization.
Saturday, April 22. 2006
Cow-orker and Javascript fellow traveller Jon Aquino posted his thoughts about working with two different Javascript toolkits (Dojo and Prototype):
Prototype is more of a Porsche, whereas Dojo is more like a Hummer.
I had been ruminating along similar lines (since we've both been playing with both toolkits) but ended up instead with this haiku:
Prototype makes your
Javascript look like Perl; but
Dojo? like Java.
Monday, April 10. 2006
BadgerFish is an attempt to have a straightforward way to represent arbitrary XML documents as JSON objects. Think of it as SimpleXML for Javascript.
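To show roughly what that mapping looks like, here is a small PHP sketch that turns a SimpleXML element into a BadgerFish-style structure and encodes it as JSON. It covers only part of the convention (text content goes in a "$" property, attributes go in "@"-prefixed properties, repeated child elements become arrays) and ignores namespaces. The badgerfish() function name is made up for illustration.

```php
<?php
// Hypothetical helper: convert a SimpleXMLElement into a nested array that
// follows the BadgerFish conventions (no namespace handling). Note that an
// element with no attributes, children, or text comes out as an empty
// array, which json_encode() writes as [] rather than {}.
function badgerfish(SimpleXMLElement $el) {
    $out = array();

    // Attributes become "@name" properties.
    foreach ($el->attributes() as $name => $value) {
        $out['@' . $name] = (string) $value;
    }

    // Group child elements by name; repeated names become a list.
    $children = array();
    foreach ($el->children() as $name => $child) {
        $children[$name][] = badgerfish($child);
    }
    foreach ($children as $name => $list) {
        $out[$name] = count($list) === 1 ? $list[0] : $list;
    }

    // Text content goes in the "$" property.
    $text = trim((string) $el);
    if ($text !== '') {
        $out['$'] = $text;
    }
    return $out;
}

$xml = simplexml_load_string('<alice charlie="david">bob</alice>');
echo json_encode(array($xml->getName() => badgerfish($xml))), "\n";
// prints {"alice":{"@charlie":"david","$":"bob"}}
```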