software
sklar.com/blog
Wednesday, September 29. 2010
At work, we added a language filter to Ning Pro last month. It lets Network Creators have naughty words (for the Network Creator's definition of "naughty") replaced with * characters.
A straightforward way to do this in PHP is to pass an array of words to look for and their replacements to a function like str_replace() or str_ireplace(). Or, similarly, use a regular expression that gloms the search terms together (and potentially checks word boundaries). There are assorted WordPress plugins that work like this.
The problem with this approach is that it's really slow. Especially if you have a lot of words you're looking for. The amount of time it takes to do the search and replace grows in proportion to the number of words you're looking for. This is particularly unfortunate because usually, none of the words are ever found!
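To make the straightforward approach concrete, here is a minimal sketch of both variants. The word list and the function names (filter_naive, filter_regex) are made up for illustration; they are not from any particular plugin.

```php
<?php
// Hypothetical word list; each replacement is a run of '*' as long as the word.
$naughty = array('darn', 'heck', 'fudge');
$stars = array_map(function ($w) {
    return str_repeat('*', strlen($w));
}, $naughty);

// Variant 1: hand both arrays to str_ireplace().
// The terms are applied one after another, so more terms means more work.
function filter_naive($text, array $words, array $replacements) {
    return str_ireplace($words, $replacements, $text);
}

// Variant 2: glom the terms into one case-insensitive regex with word boundaries.
function filter_regex($text, array $words) {
    $quoted = array_map(function ($w) {
        return preg_quote($w, '/');
    }, $words);
    $pattern = '/\b(?:' . implode('|', $quoted) . ')\b/i';
    return preg_replace_callback($pattern, function ($m) {
        return str_repeat('*', strlen($m[0]));
    }, $text);
}
```

Without the word-boundary check, filter_naive() also stars out the "Heck" in "Heckle", which is one reason the regex variant is tempting despite its cost.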
For our language filter, we took a different approach. We've packaged it up into a PHP extension called Boxwood and are releasing it today as open source. (Find it on github: http://github.com/ning/boxwood.)
With Boxwood, you can have your list of search terms be as long as you like -- the search and replace algorithm doesn't get slower with more words on the list of words to look for. It works by building a trie of all the search terms and then scanning your subject text just once, walking down elements of the trie and comparing them to characters in your text. It supports US-ASCII and UTF-8, case-sensitive or insensitive matching, and has some English-centric word boundary checking logic.
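The trie idea can be sketched in plain PHP. This is only an illustration of the technique, not Boxwood's code or API: the function names (trie_build, trie_replace) are invented here, it handles single-byte, case-insensitive matching only, and it skips the word boundary logic entirely.

```php
<?php
// Build a trie as nested arrays keyed by character.
// The '' key marks the end of a complete search term.
function trie_build(array $words) {
    $root = array();
    foreach ($words as $word) {
        $word = strtolower($word);
        if ($word === '') {
            continue; // ignore empty terms
        }
        $node = &$root;
        for ($i = 0, $n = strlen($word); $i < $n; $i++) {
            $c = $word[$i];
            if (!isset($node[$c])) {
                $node[$c] = array();
            }
            $node = &$node[$c];
        }
        $node[''] = true;
        unset($node);
    }
    return $root;
}

// Scan the text left to right. At each position, walk down the trie as far
// as the text allows and mask the longest term that ends along the way.
function trie_replace($text, array $trie, $mask = '*') {
    $lower = strtolower($text);
    $len = strlen($text);
    $out = '';
    $i = 0;
    while ($i < $len) {
        $node = $trie;
        $matched = 0;
        for ($j = $i; $j < $len && isset($node[$lower[$j]]); $j++) {
            $node = $node[$lower[$j]];
            if (isset($node[''])) {
                $matched = $j - $i + 1;
            }
        }
        if ($matched > 0) {
            $out .= str_repeat($mask, $matched);
            $i += $matched;
        } else {
            $out .= $text[$i];
            $i++;
        }
    }
    return $out;
}
```

The work done at each position in the text is bounded by the length of the longest term, not by how many terms are in the trie, which is why a longer word list doesn't slow the scan down.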
Take it for a drive and let us know what you think!
Wednesday, November 14. 2007
How many different ways (in PHP) are there to concatenate the string values in two variables and put the result in a third variable?
Here are a few to start:
$alice = $bob . $charlie;
$alice = "$bob$charlie";
$alice = sprintf('%s%s', $bob, $charlie);
ob_start(); echo $bob, $charlie; $alice = ob_get_clean();
$alice = implode('', array($bob,$charlie));
I would say something like "Of course, this entire exercise is just for fun and in practice is totally useless," but whenever I start out thinking that, something actually productive eventually emerges. So perhaps this is totally useless, perhaps not!
Friday, October 19. 2007
Jon Aquino, my talented cow-orker, built a time-lapse view for Subversion. This is fantastic.
We switched from Perforce to Subversion recently (at Ning), and one of the things that I missed about Perforce was its handy time-lapse-view feature. No longer!
Wednesday, April 25. 2007
I upgraded a machine to Ubuntu Feisty Fawn (7.04) today and was having trouble recompiling the Cisco VPN client. Thanks to this thread, I went to this blog post and applied this patch and everything was fine.
I must admit that applying some random patch to one's VPN client inspires a moderate amount of queasiness, but fortunately the patch is simple enough to understand so one can be confident no mischief is involved.
Thursday, September 28. 2006
It's been a lot of hard work, so I'm quite excited that we've just released three great new Ning sites: Ning Videos, Ning Photos, and Ning Group.
I particularly like the embeddable slideshow that Ning Photos has, and its companion in Ning Videos, the embeddable player -- so you can put photos or videos on your blog or wherever. Both apps let you e-mail in content from your phone, too. Ning Group has some spiffy HTML parsing and file upload features so you can share documents with folks and incorporate music, pics, or anything else in the forums.
Plus, all three sites have the juicy bits that every site on the Ning platform gets -- things such as cloneability, complete customization, and built-in REST APIs. I've been watching the feeds for clones of photos and videos -- I suppose seeing who's cloned sites you care about is the Web 2.0 version of ego surfing.
More on the Ning Blog and from Kyle.
Thursday, June 29. 2006
The XML spec says that in XML documents, "Legal characters are tab, carriage return, line feed, and the legal characters of Unicode and ISO/IEC 10646." That is:
Char ::= #x9 | #xA | #xD | [#x20-#xD7FF] | [#xE000-#xFFFD] | [#x10000-#x10FFFF]
This means that the control characters under 0x20 (with the exception of the Three Wise Whitespace) are not allowed.
This restriction goes all the way back to the definition of a "character" in the W3C Working Draft of November 14, 1996.
Plenty of brilliance went into crafting that document, so I must be missing something extremely obvious here: why are those control characters outlawed? What is the reasoning behind the spec preventing me from including in an XML document:
<data></data>
Preventing terminal beeps from errant Ctrl-Gs?
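For what it's worth, the Char production quoted above translates almost directly into a PCRE character class. A minimal sketch, assuming the input is UTF-8 (the function names xml_chars_ok and xml_strip_illegal are made up for illustration):

```php
<?php
// Returns true if every character in $s is allowed by the XML 1.0 Char production.
// With the /u modifier, preg_match() returns false on invalid UTF-8, which
// this function also reports as "not ok".
function xml_chars_ok($s) {
    $pattern = '/\A[\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]*\z/u';
    return preg_match($pattern, $s) === 1;
}

// Drops every character outside the Char production.
// Returns null if $s is not valid UTF-8.
function xml_strip_illegal($s) {
    $pattern = '/[^\x{9}\x{A}\x{D}\x{20}-\x{D7FF}\x{E000}-\x{FFFD}\x{10000}-\x{10FFFF}]/u';
    return preg_replace($pattern, '', $s);
}
```

So a string containing Ctrl-G ("\x07") fails xml_chars_ok(), while tabs, carriage returns, and line feeds pass.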
Monday, June 26. 2006
Over the weekend we launched the "Ningbar" -- what Diego, Brian, and the rest of the crew at Ning have been cranking away on for a while.
The Ningbar is not just a replacement for the sidebar that used to accompany all Ning apps. It's the control center for getting the most out of Ning, whether you're using an app, cloning an app (which now takes just exactly 2 clicks), or writing code behind an app. In the Ningbar, you can get stats about the app you're using (who else uses it, what do your friends do on the app), check your messages or send messages to others, customize your apps, or clone the app you're looking at.
Of course (we wouldn't have it any other way!) the Ningbar is completely customizable and programmable. You can tweak every aspect of its appearance and behavior. And because the Ningbar sits on top of our open Javascript and REST APIs, what the Ningbar can do is only limited by what cool stuff you can think of to build. (This panel showing the current weather was a fun quick hack.)
Gina's post on the Ning blog has lots of screen shots and gives some more background on Ningbar goodies. As always, http://documentation.ning.com has the programming and API details for how to make the Ningbar do your bidding. In particular, check out the sections on our new Javascript APIs and interface customization.
Monday, June 19. 2006
What would regular expressions for non-text look like? I think many of the quantifiers would be similar, but the notions of "character classes" and what goes in an atom would be totally different.
Two ways of specifying bits to match for audio could be score-oriented or sample-oriented. Score-oriented "classes" could match patterns consisting of particular notes, notes in particular keys, with particular duration, particular chords, parts of particular chords, and so on. These could be built up into multi-note patterns. Other notation would look for other parts of the score -- tempo, all those Italian words that describe how to play the notes, etc.
Sample-oriented classes could match particular samples (or subsets of samples), certain rhythms, melodies, etc. With fancy enough signal processing, the classes could match against particular words being sung, a particular singer doing it, or slices that "sound like" some other slice (where "sound like" is implemented by some pluggable algorithm).
A text-based notation might be doable for the score-oriented classes and some of the sample-oriented classes, but to fully extend the analogy the expression of the audio-regex could/should be with audio (or visual representations of the audio) as well -- "Find the five bars on either side of any music that sounds like **this**" where **this** is some scoring or sample that's either taken literally or has been appropriately "expression-ified" (with audio processing filters?), or, alternatively, is a visual representation of a score that has been similarly transformed.
Extending this to video is an expansion of the "sample-oriented" audio expression language, but with visual idioms in addition to the auditory ones -- some possible classes to match against are things such as frames with human faces in them, frames with a particular face, indoor frames, outdoor frames, frames that are predominantly a particular color, frames from the work identified on IMDB by ID XXX, and so on.
Saturday, April 22. 2006
Cow-orker and Javascript fellow traveller Jon Aquino posted his thoughts about working with two different Javascript toolkits (Dojo and Prototype):
Prototype is more of a Porsche, whereas Dojo is more like a Hummer.
I had been ruminating along similar lines (since we've both been playing with both toolkits) but ended up instead with this haiku:
Prototype makes your
Javascript look like Perl; but
Dojo? like Java.
Monday, April 10. 2006
BadgerFish is an attempt to have a straightforward way to represent arbitrary XML documents as JSON objects. Think of it as SimpleXML for Javascript.
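Here is a rough sketch of the core BadgerFish rules as I understand them, done with SimpleXML: element text goes in a "$" property, attributes go in "@"-prefixed properties, and repeated sibling elements become an array. The function name badgerfish() is made up, namespaces are ignored, and empty elements are not handled specially, so treat it as an illustration rather than a complete converter.

```php
<?php
// Convert a SimpleXMLElement into a BadgerFish-style nested array.
// Assumptions: no namespace handling, whitespace-only text is dropped.
function badgerfish(SimpleXMLElement $el) {
    $out = array();
    // Attributes become '@name' properties.
    foreach ($el->attributes() as $name => $value) {
        $out['@' . $name] = (string) $value;
    }
    // Child elements become nested properties; repeated names become a list.
    foreach ($el->children() as $name => $child) {
        $converted = badgerfish($child);
        if (!isset($out[$name])) {
            $out[$name] = $converted;
        } elseif (isset($out[$name][0])) {
            $out[$name][] = $converted;      // already a list: append
        } else {
            $out[$name] = array($out[$name], $converted); // second sibling: start a list
        }
    }
    // Text directly inside this element goes in the '$' property.
    $text = trim((string) $el);
    if ($text !== '') {
        $out['$'] = $text;
    }
    return $out;
}

// Wrap the root element under its own name and encode as JSON.
function badgerfish_json($xmlString) {
    $root = simplexml_load_string($xmlString);
    return json_encode(array($root->getName() => badgerfish($root)));
}
```

For example, badgerfish_json('<alice charlie="david">bob</alice>') produces an object with an "alice" property holding both "@charlie" and "$".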
Tuesday, February 14. 2006
This demo is stupendous. Changing interaction-with-"mouse"-pointer from serial to parallel provides a much wider information flow between you and the computer.
I look forward to when the screens in the demo are commonplace!
Friday, November 4. 2005
At the risk of getting beat up by elisp hoodlums the next time I find myself alone in a dark digital alley at night, I must admit that I am spending less time with XEmacs these days and more time with jEdit.
Jon turned me on to it at work* after it became his preferred editor for writing Ning apps. The initial thing that made me switch was that jEdit's SFTP support is excellent, while XEmacs's (via tramp) is nonexistent.
You come for the SFTP, but you stay for the rest of the features. jEdit really has been an almost perfect balance of providing all the things I want but otherwise not getting in my way. There has been a bit of a learning curve as I figure out what key combos do what or which plugin is appropriate for a task, but it's not bad. Each time I think that jEdit doesn't have some feature I need, it turns out I'm wrong.
When I complained about missing isearch-forward, Jon pointed out that C-, does incremental search (and C-g moves to the next match). When I thought, "a File-Open dialog box is so clunky compared to C-x C-f and then typing a path with tab completion", I discovered that the File-Open dialog box that C-o brings up in jEdit supports tab completion and auto-navigates to subdirectories as you type them.
That said, I am not completely gaga on the jEdit kool-aid yet. XEmacs has better source control system integration and its php-mode does better "indent to the right place (with spaces) when I hit tab". I also have a juicy set of elisp macros for writing Docbook XML and haven't tried any XML writing in jEdit yet. jEdit has somewhat better PHP intelligence than XEmacs -- its PHPPlugin can find various syntax errors, while XEmacs has nothing like that. However, jEdit's PHPPlugin isn't up to the code-completion of, e.g. Zend Studio, which is really nice.
* Does saying "at work" really make sense when I'm in NYC, he's in BC, and "the office" is in Palo Alto? I suppose "at" is virtual now, too.
Wednesday, October 5. 2005
We launched Ning yesterday -- a playground for building and using social applications.
I'm very excited to have this out and about now for experimentation and use. It's been incredibly fun to build and noodle on the consequences as we built it, but I suspect the feedback we now get from users and developers as well as the apps that new developers build will create an even bigger wave of neat ideas.
Working with such a spectacular team of folks has been (and continues to be!) a thrill. (As well as with all the spectacular folks who don't have blogs to link to.)
If you don't know PHP, dive in and clone existing apps to get your own social apps up and running in seconds. If you do know PHP, use our PHP API and components to build a cool new app in (slightly more) seconds.
Plenty of post-launch cleanup and coordination to do (as well as absorbing all of the nice traffic from /., del.icio.us, digg, MeFi, boingboing, etc.) but I'll have plenty more to say in the coming days, weeks, and months.
Thursday, October 14. 2004
As much as I wanted my Windows Re-Education Camp efforts to succeed completely, my brain and my fingers have Unix idioms too deeply ingrained in them to make the increasing amount of adjustment effort worth it.
Instead of firing up my long-dormant VMWare installation, I thought I'd give CoLinux a try.
Continue reading "Wrangling CoLinux Networking"
Tuesday, September 21. 2004
I am looking for virtual market software -- along the lines of Blogshares or Hollywood Stock Exchange, but for trading arbitrary "securities".
This could be a neat project to build (whilst tiptoeing around relevant patents), but if something customizable already exists, I'd like to take a look at it.
Any thoughts?