ideas
sklar.com/blog
Friday, April 11. 2008
Chris often cites this xkcd cartoon in security talks, since it's a) funny and b) a good example of SQL Injection.
I was curious to see what sorts of shenanigans one can get away with in a legal name. I'm still waiting to hear back from the NYC agency that issues birth certificates, but here's what the US Social Security Administration told me:
The maximum number of characters to be shown on the Social Security number (SSN) card for the first and middle name is 26; the maximum number of characters for the last name is 26. Full names will not be reduced to initials unless the combination of first and middle names exceeds 26 characters. The only acceptable characters are alphas, hyphens, and apostrophes. The SSN card will be printed as entered into the enumeration system, but the SSN record will not display the hyphens/apostrophes.
So with hyphens and apostrophes you might be able to get away with a little syntax error mischief, I suppose.
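To make that concrete, here is a minimal PHP sketch. The function names and the students table are hypothetical, and the per-field 26-character check is a simplification of the SSA wording (which applies one limit to first and middle names combined). The point is only that a name can satisfy the character rules and still carry an apostrophe into a carelessly built query.

```php
<?php
// Hypothetical validator: letters, hyphens, and apostrophes only, at most 26 characters.
function is_valid_card_name($name) {
    return preg_match('/^[A-Za-z\'-]{1,26}$/', $name) === 1;
}

// Naive string concatenation, shown only to illustrate the problem.
// The "students" table is an assumption for this example.
function naive_query($lastName) {
    return "SELECT * FROM students WHERE last_name = '" . $lastName . "'";
}

$name = "O'Drop-Table";
var_dump(is_valid_card_name($name));              // passes the character rules
var_dump(is_valid_card_name("Robert'); DROP"));   // rejected: parentheses, semicolon, space
print naive_query($name) . "\n";                  // the quotes in the result no longer balance
```

A parameterized query would sidestep the quoting problem entirely, which is presumably why the mischief tops out at a syntax error.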
Turns out the SSA has really detailed public documentation of all their procedures.
Friday, March 28. 2008
I had never heard of Haskell Curry before I read the preface to SICP just now, but it occurs to me that his distinction of having each of his names (first and last) turned into a term used in the discipline he studied (Haskell the programming language and Curry the operation) is sort of like how Glenn Seaborg, working at LBL, could have a letter addressed to him using element names (seaborgium / lawrencium berkelium / californium / americium).
Who else can you think of that falls into this admittedly fuzzy-edged bucket?
Wednesday, May 16. 2007
A PHP issue that comes up over and over again is how the static keyword doesn't know about inheritance. That is, code such as:
class Masons {
    static $where = "World-wide";
    static function show() {
        print self::$where;
    }
}
class Stonecutters extends Masons {
    static $where = "Springfield";
}
Stonecutters::show();
prints World-wide, not Springfield, because the self inside Masons::show() is bound at compile time to the Masons class. This is different from how $this works in instances, so it can be unexpected.
There are plenty of good reasons why PHP 5 works this way, and it seems that in PHP 6 the static keyword will be usable in place of self to get the dynamic behavior a lot of folks are looking for (that is, print static::$where; in Masons::show() would cause Stonecutters::show() to print Springfield).
All well and good once PHP 6 is done.
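For reference, here is a minimal sketch of the static:: form described above. It assumes a PHP version with late static binding (the feature ended up shipping in PHP 5.3); the only change from the earlier snippet is static:: in place of self::, plus a newline after each value.

```php
<?php
// Assumes PHP 5.3 or later, where static:: resolves to the class named in the call.
class Masons {
    static $where = "World-wide";
    static function show() {
        // static:: is resolved at call time, unlike self::
        print static::$where . "\n";
    }
}
class Stonecutters extends Masons {
    static $where = "Springfield";
}

Stonecutters::show(); // prints Springfield
Masons::show();       // prints World-wide
```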
In the meantime, I was noodling around with runkit and came up with some glue that lets you do something like this:
class Model {
    public static function find($class, $filters) {
        // This method would actually do an SQL query or
        // REST request to retrieve data
        print "I'm looking for a $class with ";
        $tmp = array();
        foreach ($filters as $k => $v) { $tmp[] = "$k=$v"; }
        print implode(', ', $tmp);
        print "\n";
    }
    public static function findById($class, $id) {
        return self::find($class, array('id' => $id));
    }
}
class Monkey extends Model { }
class Elephant extends Model {
    public static function findByTrunkColor($color) {
        // No $class argument here: this relies on the glue to supply it
        return self::find(array('color' => $color));
    }
}
And then after the mysterious (for a few seconds) glue:
MethodHelper::fixStaticMethods('Model');
you can do:
Monkey::find(array('name' => 'George', 'is' => 'curious'));
// prints: I'm looking for a Monkey with name=George, is=curious
Elephant::findById(1274);
// prints: I'm looking for a Elephant with id=1274
Elephant::findByTrunkColor('grey');
// prints: I'm looking for a Elephant with color=grey
Monkey::findById('abe');
// prints: I'm looking for a Monkey with id=abe
(The innards of MethodHelper after the jump...)
Continue reading "Runkit, "static", and inheritance"
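The MethodHelper innards are not reproduced here. As a point of comparison only, the same calling style can be sketched without runkit on a PHP version that has late static binding (5.3 or later), using get_called_class(). This is a different technique from the runkit glue, not the post's implementation.

```php
<?php
// Comparison sketch, not the post's MethodHelper. Assumes PHP 5.3+, where
// get_called_class() reports the class named in the static call.
class Model {
    public static function find($filters) {
        $class = get_called_class();
        $tmp = array();
        foreach ($filters as $k => $v) { $tmp[] = "$k=$v"; }
        print "I'm looking for a $class with " . implode(', ', $tmp) . "\n";
    }
    public static function findById($id) {
        return static::find(array('id' => $id));
    }
}
class Monkey extends Model { }
class Elephant extends Model {
    public static function findByTrunkColor($color) {
        return static::find(array('color' => $color));
    }
}

Monkey::find(array('name' => 'George', 'is' => 'curious'));
// prints: I'm looking for a Monkey with name=George, is=curious
Elephant::findById(1274);
// prints: I'm looking for a Elephant with id=1274
Elephant::findByTrunkColor('grey');
// prints: I'm looking for a Elephant with color=grey
```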
Thursday, July 27. 2006
The slides from my OSCON 2006 presentation, "I'm 200, You're 200: Codependency in the Age of the Mashup," are available at http://www.sklar.com/files/I'm-200-You're-200.pdf.
Monday, June 19. 2006
What would regular expressions for non-text look like? I think many of the quantifiers would be similar, but the notions of "character classes" and what goes in an atom would be totally different.
Two ways of specifying bits to match for audio could be score-oriented or sample-oriented. Score-oriented "classes" could match patterns consisting of particular notes, notes in particular keys, with particular duration, particular chords, parts of particular chords, and so on. These could be built up into multi-note patterns. Other notation would look for other parts of the score -- tempo, all those Italian words that describe how to play the notes, etc.
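As a toy illustration of the score-oriented idea, here is a sketch that cheats by flattening a melody into text and borrowing an ordinary regex engine. The encoding (note names separated by single spaces, with # and b for sharps and flats) is an assumption made up for this example; a real notation would need durations, chords, and much more.

```php
<?php
// Hypothetical encoding: a melody is a string of note names separated by single spaces.
$melody = "C D E F# G A B C";

// A score-oriented "class": any natural note, i.e. a note in C major.
// The lookahead rejects a letter that is followed by a sharp (#) or flat (b) sign.
$cMajorNote = '[A-G](?![#b])';

// Quantifier borrowed from text regexes: runs of three or more such notes.
$pattern = '/(?:' . $cMajorNote . ' ?){3,}/';

preg_match_all($pattern, $melody, $matches);
$runs = array_map('trim', $matches[0]);
print_r($runs); // the runs "C D E" and "G A B C"
```

The quantifier carries over unchanged; all of the domain knowledge lives in what counts as an atom.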
Sample-oriented classes could match particular samples (or subsets of samples), certain rhythms, melodies, etc. With fancy enough signal processing, the classes could match against particular words being sung, a particular singer doing it, or slices that "sound like" some other slice (where "sound like" is implemented by some pluggable algorithm).
A text-based notation might be doable for the score-oriented classes and some of the sample-oriented classes, but to fully extend the analogy the expression of the audio-regex could/should be with audio (or visual representations of the audio) as well -- "Find the five bars on either side of any music that sounds like **this**" where **this** is some scoring or sample that's either taken literally or has been appropriately "expression-ified" (with audio processing filters?), or, alternatively, is a visual representation of a score that has been similarly transformed.
Extending this to video is an expansion of the "sample-oriented" audio expression language, but with visual idioms in addition to the auditory ones -- some possible classes to match against are things such as frames with human faces in them, frames with a particular face, indoor frames, outdoor frames, frames that are predominantly a particular color, frames from the work identified on IMDB by ID XXX, and so on.
Sunday, May 21. 2006
From the novel Olympos, by Dan Simmons: Is he dead? Irretrievably dead?
The Prime Integrator standing near the surgical table watching the procedure does not lift his head as he answers on the common band. No... normally the procedure would be to inject several million nanocytes to repair the human's damaged aorta and heart muscle, then insert more specialized molecular machines to replenish his blood supply and strengthen his immune system. ...[T]his is not possible with scholic Hockenberry.
Why not? asks the Callistan integrator, Cho Li.
Scholic Hockenberry's cells are signed.
Signed? says Mahnmut. ...
Signed - copyrighted and copy-protected, sends Asteague/Che on the common band. ... The cells and subcellular machinery ignore our own nano-interrogation and destroy any alien intrusion.
Tuesday, February 14. 2006
I got my OSCON proposal ("I'm 200, You're 200: Codependency in the Age of the Mashup") in under the wire. The theme of the talk is thinking about all the dependencies your web app has -- some you're probably aware of and some you're not -- as well as strategies for minimizing those dependencies and routing around failure.
I've got a bunch of examples I'm planning on talking about (if the talk is accepted), but I'd be interested to hear other folks' experiences in solving related problems -- do you have novel methods for caching information, application or network architecture, code structure, or anything else that helps you layer reliability onto a distributed set of resources not all under your control? Let me know!
OSCON is getting bigger every year -- it's interesting to see it grow as the world of open source encompasses more and different kinds of hackers, business people, and others. Like Adam, I'm not picking PHP track talks this year for OSCON, so you can consider this post a relatively unbiased conference plug.
This demo is stupendous. Changing interaction-with-"mouse"-pointer from serial to parallel provides such a wider information flow between you and the computer.
I look forward to when the screens in the demo are commonplace!
Thursday, March 3. 2005
Why do folks who want the freedom to remix content as they see fit get their digital dander up when other people remix their own content? Gmail's introduction was accompanied by a barrage of complaints that the automated scanning of messages to display ads was involuntarily subjecting those who correspond with Gmail users to a privacy-invading examination and modification of their messages. Now, the introduction of a version of Google Toolbar that annotates addresses, ISBNs, and similar data with links to maps and Amazon.com has provoked a similar storm of outrage. The complaint goes something like "I am a virtuous content publisher whose brilliant web pages are being chewed up by the Evil Google Content Manipulation Borg without my consent! Foul! Foul!"
What these complaints conveniently elide, however, is that it is individual users who are making the choice to have their web pages modified. These users must install the Toolbar and then click on the appropriate button in the Toolbar to follow the additional link. Nothing sneaky is going on behind the scenes.
Objecting to users modifying web pages before, as, or after they view them is a dead end. Tools and add-ons such as popup blockers, custom style sheets, screen readers, and auto-form-field-filler-outers are precisely those sorts of content modifiers. And surely no one suggests that those be banished to preserve the sanctity of the web browsing experience, right? What would be next? Preventing me from putting squibs of black electrical tape over the annoying LEDs that glow all night from various devices in my home? Should I expect a nastygram from Uniden because my application of tape has prevented me from enjoying the experience of the "Charge" light on my cordless phone as they intended?
What applies to the MPAA and the RIAA applies just as strongly to Joe Homepage. If you've got thoughts/words/songs so brilliant that you can't bear to have them disparaged by content-modifying Philistines (hint: nestled at the bottom of this particular slippery slope is "The EULA for this sonata requires you to listen to it on speakers that are at least THIS good to properly receive the artist's message."), then keep that brilliance to yourself. If you require the ego gratification/financial compensation/curiosity satisfaction that comes from transmitting your message to other humans, then you absolutely, positively must accept the idea that you lose some control over what happens to your baby.
A related important point that Cory Doctorow makes in his swell pro-Toolbar commentary is:
This shows how an authors' association like the Science Fiction Writers of America could collect its members' ISBNs and affiliate IDs for their favorite web-stores and provide plugins that would rewrite every single instance of my ISBNs on pages viewed through the plugin with a link to my affiliate account on Amazon, making me some serious coin. Wanna support an author? Install her plugin and help her feed her kids. Wanna support a charity? Install its plugin and have all the affiliate links rewritten to its benefit. Wanna support yourself? Install the plugin that rewrites every ISBN with your own affiliate ID.
The framework of configurable end-user content modification provides a powerful engine for consumer choice. This goes beyond tossing affiliate commissions for your purchases into whichever non-profit bucket you value (although that's a fine idea). This extends to content ratings, corporate business practice review, editorial commentary, and many other areas. Once the framework is in place, users can choose to have recommendations on whether a particular site is appropriate flow into their browser from the Christian Coalition or from MoveOn. As you shop for refrigerators, you choose whether you want annotated information about the refrigerator makers from Greenpeace, the Cato Institute, or Consumer Reports. The annotation and modification of arbitrary content, with end user consent, is a golden opportunity to build a world of informed readers and consumers.
All that said, the Toolbar isn't perfect (and it is, after all, a beta). For example, you can choose your map provider, but not your bookseller. The primacy of end-user choice would certainly be reinforced if all the various link destinations were configurable.
I don't mean to be an apologist for Google. I like some things about the company and its products, and I dislike other things about the company and its products. The most important issue here is not the specifics of the toolbar. The most important issue is recognizing that we all have to give up the control over our content that many of us demand of Big Media Corporations.
Thursday, September 30. 2004
I want to receive marketing, billing, and other corporate communications via RSS instead of e-mail.
When a company sends me an e-mail message, it is subject to all of the annoyingness of e-mail: spam filtering, drowning in my inbox (unless I spend the time to filter it somewhere), etc. Plus, I either have to send my e-mail address off into the wilderness, hoping it doesn't become spambait, or go to the trouble of setting up a new alias for each commercial purpose.
If, instead, I could just add a personalized feed into my feed reader, we (me and the company) would both be better off: I can trust the source of the message, so I don't need to do any spam filtering. I can retrieve the feed over an SSL connection, so it's a feasible way to provide sensitive data like a bank statement or monthly bill. The company I'm getting the feed from knows when my feed is retrieved so it has some rough metrics for determining that I've actually received the data in the feed.
A nice bonus would be if there were a way for the feed publisher to provide advisory information to my feed reader on how often to check for updates -- the hypothetical feed for my monthly phone bill doesn't need to be refreshed every few hours. Once a day is ample.
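For what it's worth, RSS 2.0 does define an optional ttl channel element: the number of minutes a feed may be cached before it is refreshed. Whether a given reader honors it is another matter. A minimal sketch of reading it follows; the feed body and the refresh_minutes helper are made up for illustration.

```php
<?php
// Hypothetical feed body; only the <ttl> element matters here.
$rss = <<<XML
<?xml version="1.0"?>
<rss version="2.0">
  <channel>
    <title>Monthly bill</title>
    <link>https://feeds.example.com/monthly-bill/</link>
    <description>Hypothetical billing feed</description>
    <ttl>1440</ttl>
  </channel>
</rss>
XML;

// Returns the advisory refresh interval in minutes, or a default when
// the feed cannot be parsed or carries no <ttl> element.
function refresh_minutes($rssXml, $default = 60) {
    $feed = simplexml_load_string($rssXml);
    if ($feed === false || !isset($feed->channel->ttl)) {
        return $default;
    }
    return (int) $feed->channel->ttl;
}

print refresh_minutes($rss) . "\n"; // 1440 minutes, i.e. once a day
```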
So, how about it, airlines, banks, utility companies, credit card companies, and all other corporate behemoths that are, I'm sure, exemplary custodians of my personal and financial data? Instead of (ok, in addition to, I'll be realistic) asking for my e-mail address, give me a URL like:
https://sklar:password@feeds.example.com/monthly-bill/
to plug into my feed reader. I promise I'll read everything you send me!
Monday, May 3. 2004
Much of the Gmail-inspired outrage has focused on what happens to email messages sent by non-Gmail subscribers to Gmail subscribers. "If you want to sign up to have Google's version of SkyNet scan your messages for ads," some objectors say, "that's fine, but don't involuntarily subject me to that scanning just because I send you a message."
That is, Gmail subscribers themselves may willingly opt in to whatever onerous TOS Gmail provides, but third-party correspondents are afforded no such opportunity, nor should they have to.
This faulty objection springs from the seductively misleading "privacy" of email.
When I send you an email message, that message is out of my control the instant I send it. This is a lesson that has been learned by countless Internet users who accidentally include someone they're mocking on a CC: list, mistakenly send personal correspondence to a mailing list, or send something so outrageous that their friends can't keep it to themselves and it ends up in the New York Times.
It is this last circumstance that is most relevant to the Gmail debate. The custodianship of email messages you send lies with the recipient. Shared values of (on- and off-line) etiquette, friendship, and sociability usually govern that custodianship acceptably. When I share sensitive personal thoughts with friends, whether via e-mail, phone, or good old face to face conversation, they don't rebroadcast those thoughts to others. Not because of a legal requirement or a Terms of Service agreement, but because of our friendship. Even handwritten letters (with ink and paper, remember those) are subject to unauthorized distribution. In a professional context, I choose to whom and how I disclose confidential or sensitive information based on my judgement about the trustworthiness and motivations of the recipient of the information. A non-disclosure or other legal agreement helps, but doesn't prevent disclosure. It just makes punishing the disclosure easier.
The technology that underlies e-mail doesn't remove the need for the same kind of social guidelines for how it is used. If I find the computerized scanning of e-mail text to generate context sensitive ads repellent (which I don't), then I must balance my repulsion with my desire to communicate with whomever@gmail.com. It is certainly impractical for me to familiarize myself with the practices of all handlers of all destinations of all email messages I send, but that is not a new problem.
When you send someone an e-mail message, what do you know about the server that it eventually ends up on? Do you trust the administrators of that server? Where do that server's backups live? Who is the night manager at that off-site storage facility? All of these unknowns certainly affect your privacy as an e-mail author. These mysterious individuals and locations guard your prose. Any one of them could give or sell it to the world.
Encrypting your correspondence doesn't really buy you much more bulletproof protection. Yes, a PGP encrypted e-mail message gives you some protection against snoopers while the message is in transit and probably guarantees that the first person to read the decrypted message is your chosen recipient. But what happens then? Does the recipient save a plain text copy of the message to his computer? Forward on the decrypted contents to others? The same social necessities and system administration unknowns apply.
So, how to prevent nefarious, rude, encryption-inexperienced, or just plain disagreeable correspondents from making (dare I say it) fair use of your email messages that you don't like? One way is to cuddle up to the DRM boogeyman. If the Internet has made everyone a publisher, no personal printing press turns out more content than the email client. Individual publishers of email now have something very much in common with the media behemoths that want to squash song sharing. The same technology that is derided for putting restrictive encumbrances on legally acquired PDFs, DVDs, and MP3s could also prevent perceived villains like the Gmail ad-bot from operating on your lovingly crafted email content.
Such a restrictive solution is as bad a policy for email as it is for most other digital content. Email authors must realize that they give up control when they send an email. This has always been true, but perhaps the Gmail fuss makes it clearer.
Over and over again, I read and hear that the communication implications of the Internet mean distributed publishing power, grassroots efforts, infinite channels, reduction in centralized control, insert starry-eyed phrase of choice. If true, this applies to everyone, not just large corporations. If we are publishers, we all must give up some control of our creations.