July 2008 – Jarrett House North

Personal wikis, and other diversions

StockTrader 2.0 Sample (Nicholas Allen's Indigo Blog)

Sample end-to-end .NET/WCF application

(tags: c# sample)
Getting Things Done: Get organized with GTDTiddlyWiki

A simple personal wiki engine.

(tags: wiki web javascript free)
Set up your personal Wikipedia (Lifehacker)

MediaWiki isn't the easiest-to-install wiki software out there, but it's still (for my money) the best. A quick personal setup guide for running it on a Windows XP box.

(tags: wikipedia windows)
Weaver, McCain's Former Strategist, Calls "Celeb" Ad "Childish" (The Atlantic)

It's interesting to see the ranks start to crumble.

(tags: 2008 mccain)

links for 2008-07-31

PerversionTracker

They’re back! And they’re unleashed on the App Store!

(tags: humor iphone apple)
Poor security quality in software. Someone is watching over me. (Zero in a bit)

Why do we assume that our software is secure?

(tags: security)

The gloaming

So here I am back in Lenox. The skies are beautiful but ominous, and there’s a day of Russian ahead; our residency for Tchaikovsky’s Eugene Onegin has begun.

I’m currently flashing back to my one encounter with the language, a class in 1986, and am very grateful that I was exposed to the soft consonants ahead of time. Some of our Boston-bred palates are having real difficulty with the vowel sounds, though you can’t tell en masse, thank goodness.

It’s always a crapshoot, the lodging that our fair parent organization provides. Usually it’s just fine, but tonight my roommate isn’t here, they almost mixed up my room with a bunch of sopranos next door, and I had to manually configure my IP address so that I could get on the motel wireless. But I’m on now. (And it’s a good thing I’m not doing demos anymore; it’s slow, slow, slow.)

links for 2008-07-30

Alaska Senator Is Indicted on Corruption Charges (NYTimes.com)

Oddly, VECO’s pipelines also look like a series of tubes.

(tags: gop congress tedstevens)
A Simple Book Repair Manual (Dartmouth)

How to rescue your torn pages and broken book covers in, well, a bunch of steps.

(tags: books bookbinding)

Around Boston: new light, old park

Two stories caught my eye in the Globe, one with proximity to my vocation and one to my avocation.

The first was regarding the undeveloped land to the south of our offices in Burlington. Pointedly subtitled “city can’t develop land in Burlington, Woburn,” the story details the ongoing dance between citizens of the suburbs, who want to see the Mary Cummings Park maintained as parkland, and the City of Boston, which was deeded the land by Cummings on the condition that it stay a “public pleasure ground,” and which apparently would prefer that nothing ever be done with it. If the city can’t develop it, that is. A word to the Friends of the Park: better keep a close eye on the docket. Boston’s actions here smell like a delaying tactic until they can get a judge to break the conditions of the deed and allow them to sell the property to developers.

Speaking of delays, the second article regards the removal of the blackout panels over the top windows in Symphony Hall. I remember looking up at the interior panels from the stage during a rehearsal this spring and wondering about them to my fellow tenors, none of whom agreed that they were really windows. And no wonder; there’s no living memory of them ever having been windows. The panels were put into place in the early 1940s, and their removal, I imagine, leaves the old hall emerging blinking into the sunlight like Hiroo Onoda. But the removal, as the article highlights, indicates the profoundly conservative attitude of the BSO regarding the hall’s acoustics. I wonder what the impact on the aesthetics will be?

links for 2008-07-29

What He Knows for Sure (The New Yorker)

A profile of Tavis Smiley, who “hates blowouts”, and his cautions on Obama.

(tags: 2008 obama)
R. Stevens Steers Diesel Sweeties Back to Its Roots (Wired)

R. makes the analogy between webcomics and organic farming. Of course mass market news is the national supermarket chain! Can’t believe I didn’t make that analogy first.

(tags: kommix webcomics dieselsweeties)
Apple Fails to Patch Critical Exploited DNS Flaw (TidBITS)

What the hell is Apple thinking?

(tags: security)
Evilgrade Will Destroy Us All (Metasploit)

Implementation of an exploit for Dan Kaminsky’s DNS vulnerability that allows an attacker to send fake updates via a variety of mechanisms to a victim’s computer.

(tags: security)
Upcoming Byrne/Eno album: “Everything That Happens Will Happen Today” (Boing Boing)

New collaboration from Eno and Byrne. The last thing they did together was either “My Life in the Bush of Ghosts” or “Remain in Light,” depending on how you count.

(tags: music davidbyrne brianeno)
Foreigners (The New Yorker)

A balanced review of last week’s Obama trip finds him mostly sinking 3-pointers, with John McCain throwing a lot of elbows.

(tags: 2008 election obama mccain)
Terms and Conditions (NASAImages.org)

So can you use the images on Wikipedia?

(tags: nasa copyright)
Personal History: All the Answers (The New Yorker)

Charles Van Doren’s first person account of the quiz show scandal.

(tags: history television)
Tom Vanderbilt’s Why We Drive the Way We Do (Wired)

I guess I’ll need to read the whole thing to find out whether he goes deeper than information asymmetry, into physics phenomena (stop and start traffic is like wave motion, e.g.)

(tags: physics book review psychology automobiles)

Upcoming: Business of Software 2008 in Boston

I was about to delete an email from Bob Cramblitt on my old blog, until I actually read it and realized it was relevant to at least some of my readers:

Hi Tim:

Thought you’d like to know that Seth Godin, Joel Spolsky, Jason Fried and others are coming to Boston for the Business of Software 2008 conference. This is the only conference run by people who actually manage successful software companies. All substance, no BS and not a Web 2.0 to be found.

Your blog readers can get $100 off registration by entering “MASS” when registering at www.businessofsoftware.org.

So there you go. Never let it be said that reading my blog got you nowhere. (Disclaimer: this was my only contact with Bob Cramblitt and I’m not getting anything for posting this.)

BrowseRank and the challenge of improving search

I posted a quick link to an article about Microsoft’s new BrowseRank search technology a few days ago. Here’s why the paper is informative, why I think BrowseRank is an interesting technology for improving search, and why I think it’s doomed as a general-purpose basis for building relevance data for the web.

Informative: This paper should be required reading for anyone who wants to know the fundamentals of how web search ranking currently works, what PageRank actually does for Google, and how to objectively test the quality of a search engine. It also offers an interesting two-pronged critique of PageRank:

PageRank can be manipulated. PageRank assumes that a link from a page with authority to another page confers some higher rank on the second page. The paper points out the well-known issue that, since the “authority” of the first page is also derived from inbound links, it’s possible to use Google bombing, link farms and other mechanisms to artificially inflate the importance of individual pages for fun and profit. It’s pretty well known that Google periodically adjusts its implementation of PageRank to correct for this problem.
PageRank neglects user behavior. The paper argues this somewhat tendentiously, saying that PageRank doesn’t incorporate information about the amount of time the user spends on the page–of course, the paper’s whole hypothesis is that time on page matters, so this doesn’t reveal any deep insight into PageRank. But it’s an interesting point that PageRank does assume that only web authors contribute to the ranking algorithm. Or does it? I’ll come back to this in a bit.
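To make the manipulation point concrete, here is a minimal sketch of textbook power-iteration PageRank, not Google's production implementation. The page names, the ten-page "link farm," and the damping value of 0.85 are all assumptions chosen for illustration.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Power-iteration PageRank over a dict mapping page -> list of outbound links."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page starts each round with the small "random jump" share.
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for page in pages:
            targets = links.get(page, [])
            if targets:
                # A page splits its current rank evenly among the pages it links to.
                share = damping * rank[page] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # Dangling page (no outbound links): spread its rank over all pages.
                share = damping * rank[page] / n
                for p in pages:
                    new_rank[p] += share
        rank = new_rank
    return rank


# A tiny hypothetical web in which nobody links to "target".
honest = {
    "hub": ["a", "b"],
    "a": ["hub"],
    "b": ["hub"],
    "target": ["hub"],
}

# The same web plus a hypothetical link farm: ten pages that all link to "target".
farmed = dict(honest)
farmed.update({f"farm{i}": ["target"] for i in range(10)})

before = pagerank(honest)["target"]
after = pagerank(farmed)["target"]
print(f"target rank without farm: {before:.4f}, with farm: {after:.4f}")
```

Because rank flows only along links, the farm pages raise the target's score without any human ever having read it, which is the weakness the paper is pointing at.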

Interesting: The proposed BrowseRank algorithm uses user data–pages visited, browse activity, and time on page–to create a user browsing graph that relies on the user’s activity in the browser to confer value on pages. The authors suggest that the user data could be provided by web server administrators, in the form of logs, or directly by users via browser add-ins. A footnote helpfully suggests that “search engines such as Google, Yahoo, and Live Search provide client software called toolbars, which can serve the purpose.”

The claim of the paper is that user behavior such as time on page confers an “implicit vote” on the content in a way that’s harder to spam than PageRank. I’ll come back to this point too.

Doomed: BrowseRank relies on the following:

A way to obtain a statistically valid sample of user browsing information
A reliable way to determine intent from user browsing information, such as session construction
An assumption that time on page is a statistically valid indicator of page quality

There are problems with each of these requirements that are non-trivial.

User browsing information. The paper proposes that user browsing data can be obtained through the use of a client-side browsing input or by parsing server logs, and says that this practice would eliminate linkspam. Well, yeah, but it opens up two concerns: first, how are you going to recruit those users and site administrators so that you get a representative sample? And second, how do you ensure that the users are not themselves spamming the quality information? In the first case, we have plenty of evidence (Alexa, Comscore) that user-driven panel results can yield misleading information about things like site traffic. In the second case, we know that it’s trivial to trick the browser into doing things even without having a toolbar installed (botnet, anyone?), and it’s been proven that Alexa rankings can be manipulated.

In short, the user browse data model has two main problems: it’s difficult enough to recruit a representative panel of honest users to install a browser plugin that will monitor their online activities, and screening out spam activity is more difficult still.
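A toy illustration of the spam concern, with heavy caveats: this is not the paper's estimator, just a rough sketch that builds a browsing graph from sessions, estimates how often each page is visited, and weights that by average time on page. The function name, the reset factor of 0.85, the sample sessions, and the "bot" sessions are all my own assumptions.

```python
from collections import defaultdict


def browse_scores(sessions, damping=0.85, iterations=50):
    """Score pages from sessions, each a list of (page, seconds_on_page) tuples."""
    transitions = defaultdict(lambda: defaultdict(int))  # page -> next page -> count
    dwell = defaultdict(list)                            # page -> observed seconds
    for session in sessions:
        for i, (page, seconds) in enumerate(session):
            dwell[page].append(seconds)
            if i + 1 < len(session):
                transitions[page][session[i + 1][0]] += 1

    pages = sorted(dwell)
    n = len(pages)
    visit = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        nxt = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            total = sum(transitions[p].values())
            if total:
                # Follow observed clicks in proportion to how often users made them.
                for q, count in transitions[p].items():
                    nxt[q] += damping * visit[p] * count / total
            else:
                # Sessions always ended here: restart the walk anywhere.
                for q in pages:
                    nxt[q] += damping * visit[p] / n
        visit = nxt

    # Weight visit frequency by mean time on page, then normalize to sum to 1.
    weighted = {p: visit[p] * sum(dwell[p]) / len(dwell[p]) for p in pages}
    norm = sum(weighted.values())
    return {p: w / norm for p, w in weighted.items()}


# Hypothetical honest users: a quick search, then a long read.
organic = [
    [("search", 5), ("article", 240)],
    [("search", 4), ("article", 300)],
    [("search", 6), ("splash", 2), ("article", 180)],
]
# Hypothetical bots that each report an hour spent on a spam page.
bots = [[("spam", 3600)] for _ in range(5)]

clean = browse_scores(organic)
spammed = browse_scores(organic + bots)
print(sorted(clean, key=clean.get, reverse=True))
print(sorted(spammed, key=spammed.get, reverse=True))
```

In this toy, the honest data ranks the article first, but five fabricated sessions are enough to put the spam page on top. Nothing in the scoring can tell a real reader from a script that reports a long visit.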

Session construction: Knowledge about the user’s session is one of those interesting things that turn out to be quite difficult to construct in practice, especially when you care about meaningful time on page data. The method described in the Microsoft paper is pretty typical, and neglects usage patterns like the following:

Spending large amounts of time in a web app UI opening tabs to read later (web based blog aggregator)
Going quickly back and forth between multiple windows or multiple tabs (continuous partial attention)
The last page in a session gets assigned too much time on page because of the arbitrary 30 minute session limit (the “bathroom break” problem)
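The patterns above are easier to see with a small sketch of timeout-based session construction. The 30-minute limit comes from the paper as described here; everything else (the function names, the sample page views, and the "naive fill" that charges the full timeout to the last page) is an assumption for illustration and may not match how the paper handles the final page.

```python
SESSION_TIMEOUT = 30 * 60  # the 30-minute session limit, in seconds


def build_sessions(views, timeout=SESSION_TIMEOUT):
    """Split one user's page views into sessions.

    views: list of (timestamp_seconds, url), sorted by timestamp.
    Returns a list of sessions; each session is a list of (url, seconds_on_page),
    where seconds_on_page is None when it cannot be observed (last page in a session).
    """
    sessions = []
    current = []
    for i, (ts, url) in enumerate(views):
        gap = views[i + 1][0] - ts if i + 1 < len(views) else None
        if gap is not None and gap <= timeout:
            # Time on page is only ever inferred from the next page view.
            current.append((url, gap))
        else:
            # No next view within the timeout: the session ends here.
            current.append((url, None))
            sessions.append(current)
            current = []
    return sessions


def naive_fill(session, timeout=SESSION_TIMEOUT):
    """Assumed naive policy: charge the whole timeout to the unobserved last page."""
    return [(url, timeout if secs is None else secs) for url, secs in session]


# Hypothetical log: open two posts in tabs from a feed reader, read the second,
# go back to the reader, then wander off for over an hour.
views = [
    (0, "reader"),
    (20, "post-1"),
    (25, "post-2"),
    (900, "reader"),
    (5000, "news"),
]

sessions = build_sessions(views)
for session in sessions:
    print(naive_fill(session))
```

Here "post-1" is credited with five seconds because a second tab opened right after it, and the final visit to the reader is credited with a full 30 minutes that may well have been spent away from the keyboard.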

Time on page as an indicator of search quality: This is where my main gripe with the article comes from. The authors conclude that their user browsing graph yields better results than PageRank and TrustRank. The problem is, better results at what? The tests posed were to construct a top 20 list of web sites; differentiate between spam and non-spam sites; and identify relevant results for a sample query. The authors claim BrowseRank’s superiority in all three areas. I would argue that the first test is irrelevant; the second was not done on an even playing field; and the third is incomplete. To wit: First, if you aren’t using the relationship between web pages in your algorithm, you shouldn’t need to know what the absolute top 20 sites are because the information is completely irrelevant to the results for a specific query. Second, conducting a test on spam sorting with user input that operates on a spammy corpus without spammy users is not a real world test.

Third, the paper’s authors themselves note that “The use of user behavior data can lead to reliable importance calculation for the head web pages, but not for the tail web pages, which have low frequency or even zero frequency in the user behavior data.” In other words, BrowseRank is great, if you only care about what everyone else cares about. The reality is that most user queries are in the long tail, so optimizing how you’re doing on the head web pages is a little like rearranging deck chairs on the Titanic. And because we don’t know what the sample queries were for this part of the study, it’s impossible to tell for which type of searches BrowseRank performs better.

Finally, there’s a real philosophical difference between BrowseRank and PageRank. BrowseRank assumes that the only interaction a user can have with a web page is to read it. (This is the model of user as consumer.) PageRank makes a more powerful assumption: that a user is free to make contributions to the web by adding to it, specifically by writing new content. The paper talks a lot about Web 2.0 in the context of sites like MySpace and Facebook, but arguably PageRank, which implicitly empowers the user by assuming their equal participation in authoring the Web, is the more Web 2.0-like metric.

links for 2008-07-28

Elliott Carter, a modernist oasis (Exhibitionist – Boston.com)

I’m really sorry I wasn’t out there for any of the Carter. There’s little enough modern music programmed at Tanglewood when the TFC is in residence.

(tags: tanglewood bso)
URLInfo Reveals Hidden Web Site Server Details (LifeHacker)

Interesting promise to make HTTP headers visible. Didn’t work for me tonight though.

(tags: web security)
Tchaikovsky saw reality in ‘Eugene Onegin’ (Boston Globe)

Preview for next weekend’s concert–the TFC will be in performance under Sir Andrew Davis for this one.

(tags: bso tanglewood tchaikovsky onegin)
Black Radio on Obama Is Left’s Answer to Limbaugh (NYTimes.com)

Interesting perspective from black radio on the Democratic candidate.

(tags: 2008 election obama black)
Dr. Horrible’s iPhone Remote is Available as a Web App (Geekdad)

Nice!! Someone needs to hook this up to the Bluetooth in late model BMWs and really take them for a drive.

(tags: iphone humor)
Indiana Jones and the Temple of Absurdly Implausible Excess (NYTimes)

“Nuked the fridge” does have a certain ring to it.

(tags: movies)
FSF’s “Defective By Design” Targets Apple Genius Bars (Slashdot)

Clogging the Genius Bars so Apple customers can’t get service isn’t enlightening. It’s a sign of defeat from an organization that acts as though it has no power and no honor.

(tags: apple fsf)

links for 2008-07-26

Wow what a picture (Dave Winer/AP)

Dave is right, it’s an impressive photo (of Obama greeting the crowd in Berlin) on several levels.

(tags: 2008 election obama photography)
Microsoft tries to one-up Google PageRank (CNET)

BrowseRank sounds a lot like an algorithm I saw being discussed internally a few years ago. Interesting to see the proposal in daylight, as it relies on browser plugins and the cooperation of server admins rather than publicly accessible HTML documents.

(tags: microsoft search google)
Prof whose ‘last lecture’ became a sensation dies (AP)

Alas, Randy Pausch has given his last “Last Lecture.”

(tags: obit)
Brief analysis of “Analyzing Websites for User-Visible Security Design Flaws” (Attrition.org)

Update to yesterday’s link about vulnerable banking sites: Breezy, old, inaccurate security research considered harmful.

(tags: security)

Veracode is hiring

If you’ve ever wondered what it would be like to work at an amazing company in the security space, wonder no more. Veracode is growing, and we’ve got quite a few openings in sales, engineering, QA, research, and even (particularly) in product management.

If you’ve read my posts about security and product management, if you’ve read about us in the press, and if you think you’ve got what it takes, drop us a line. Of course you’re welcome to contact me and ask questions about the company too.

links for 2008-07-25

CNN reporter says bad things about the TSA, gets hassled every time he flies (Boing Boing)

Can the next administration please turn on the lights at TSA and see what the nest of vermin over there are really up to?

(tags: securitytheater tsa)
I’m an absent-minded engineer (Salon)

There’s a kind of poetry in Cary’s answer; what might have been condescending turns into a celebration of systematic thinking and absentmindedness.

(tags: psychology)
Wikipedia, Meet Knol (New York Times)

Cathedral vs. bazaar? One possible benefit I can see for this is that the ability to have non-grey prose in an online encyclopedia might spur some discussion about how to liven up Wikipedia a little.

(tags: wikipedia google)
Letterpress From Scratch ( i love typography )

I don’t need another expensive hobby. I don’t need another expensive hobby. I don’t need another expensive hobby.

(tags: typography)
Funky16Corners Radio v.54 – Come Together (Funky16Corners)

Funky16Corners hits the million visitor mark, celebrates with an hour of funky and soulful Beatles covers. Hellz yeah.

(tags: beatles funk music)
Analyzing Websites for User-Visible Security Design Flaws

76% of all bank websites have at least one security design flaw? Wonder if that’s more or less true now than when the study was done.

(tags: security)
Transcript – Obama’s speech in Berlin (NYTimes)

100,000 people show up in Berlin waving American flags to meet a US presidential candidate. Maybe our international currency isn’t totally devalued. (Oh, and the speech? Totally inspirational.)

(tags: 2008 election obama)
WordPress for iPhone › Version 1.1 and Beyond

WordPress for iPhone is an open source project. Here’s the roadmap for the next few versions.

(tags: wordpress iphone opensource)
Sorry We Asked, Sorry You Told (washingtonpost.com)

If you shine enough light on the motivations driving the folks who oppose gays in the military, I hope all of them put on a performance like Elaine Donnelly. What a nutjob.

(tags: military politics)
Robbins Spring (The Arlington Advocate)

A brief history of the neighborhood in which I live, through the early part of the 20th century.

(tags: arlington localhistory)

Update: Images and WordPress 2.6

I may have been too hasty to condemn the WordPress for iPhone app. One of my criticisms was that it couldn’t upload a photo to my site. Well, I just discovered that I couldn’t either, even using the browser. This appears to be another issue with WordPress 2.6.

Fortunately the fix is simple: fill in the otherwise optional Full URL path to files (optional) field in the Settings » Miscellaneous section of your control panel with the actual path to your images–usually http://yourdomain.com/yourwordpressdirectory/wp-content/–and save the settings. The forum doesn’t have a consensus on what caused this optional field to become mandatory, but filling it in appears to fix the problem for most users.

I’m going long in arks.

Seriously, people, what is going on with the rain out here? We have had deluging thunderstorms every day this week. There was a stranded van on my commute this morning. On Route 2A in Burlington, for heaven’s sake.

On Monday this week, I was picking up some things at the Walgreens in Arlington Heights, which has the World’s Smallest Parking Lot™ — and shares it with a Trader Joe’s and a Starbucks. The parking lot abuts the Minuteman Trail, which runs alongside some six feet below street level between the parking lot and a field behind. On this particular day, there was a lake about fifteen feet across in the middle of the parking lot, of unknown depth. I skirted it carefully as I parked my car, but when I got out I heard a noise like a waterfall. And I realized that there was a storm drain in the middle of the lake, which connected to an overflow pipe that emptied out beside the trail. Well, there must have been a few hundred gallons a minute going through that pipe:

(That’s the overflow pipe on the left. The lake in the background is the bicycle trail.)

It was raining so hard on Monday that Mass Ave flooded in Arlington Heights in front of the Panera. There were still sandbags there later in the week. And it was raining harder than that this morning.

All I’m saying is, when I start to see animals coming up the hill to get to higher ground at my office, I’m cornering the market on gopher wood.

links for 2008-07-24

Exposing Bush’s historic abuse of power (Salon)

Does the executive branch keep a massive database of illegally obtained information on dozens of individuals?

(tags: bush politics intelligence)
Oh man, I’m gonna abuse this (Big Contrarian)

Instant Rimshot, meet Cowbell Plus, a real iPhone app that allows you to play percussion instruments by shaking the phone.

(tags: iphone humor)
Vanity Fair Covers The New Yorker

Heh. The only possible reply to TNY’s really screwed up attempt at irony. Check the fireplace.

(tags: 2008 election mccain humor schadenfreude)
The Return of a Lost Jersey Tomato (NYTimes)

Alas, if only I could grow tomatoes! Maybe next summer.

(tags: cucina garden)