The delusions of community
“Publish the PageRank algorithm now,” writes Harold Davis on an otherwise reputable O’Reilly site.
There is something deeply troubling about the complex and opaque nature of the 100+ variable unpublished PageRank algorithm as it stands today. In effect, this means that nobody (except Google insiders) understands how information in this most important of information portals passes the gate keepers.
Because there is no cryptography involved, the algorithm gains some small amount of security by not being published. Given the horrendous abuse of the commons that we see today, I find it difficult to believe that any sane person would think there’s anything to gain for Google in publishing it. From some secondhand information (i.e. I know people at Google), I know that there is a constant battle to keep ahead of people who have managed to reverse engineer the algorithm to some small extent. Giving them the whole algorithm would make it nearly impossible to stay ahead. This isn’t cryptography, where secrecy is indicative of bad design, but instead a situation where secrecy is the only option, unless they care to begin ignoring huge sections of the Internet, something that’s already happening with blogs.
But not publishing the details of the PageRank algorithm goes against the tenets of open source espoused by many who work at Google, violates the idea that information should be freely available (after all, this is a most important piece of meta information!), and deprives Google of the open-source-like benefits of community scrutiny.
Does Mr. Davis really believe everything benefits from a million unexpert eyes? Google has managed to hire just about everyone in the field, so the likelihood of an outsider solving some huge problem in their equation is pretty small. How many of the big math problems get solved by random people, even though they’ve been published for hundreds of years?
But what is Google to gain? Some “good will” that will not help them when their engine is trounced over and over in a mutual assured destruction pact with those who care little about the commons? A piffle. Those who are interested in these fields are already pursuing them. Google doesn’t stand to benefit one whit from Joe Random Programmer looking at a 100-variable equation. Most programmers suck at understanding equations to start with; otherwise you wouldn’t see so many harebrained ideas whose problems could have been solved by opening a volume of Knuth.
Google has become a magnet for the best and brightest in many, many fields. Those who believe that open source is about raw numbers are delusional in the extreme. It is instead about the natural attraction of those with an interest to the field that interests them. That is something that doesn’t happen often in the commercial world. When people work on what they’re interested in, and likely knowledgeable in, the average quality of the individual contributor goes up. This is the “magic,” if there is any (and I’m not saying there is, honestly).
So I say, free the PageRank algorithm now!
I say free your mind from the hegemony of “open source everywhere,” and understand that some things are a fit for it, and some are not. Also, for someone who so admires the “feedback” from open source, he doesn’t have comments or trackbacks enabled on his own blog.
This entry was posted at 4:31 pm on 19 April 2005 and is filed under Technology.
http://www.braintique.com/research/mt-archives/000149.shtml