Thursday, October 27, 2005

The Power Laws of Blogosphere Popularity

If you are reading this blog, you are most likely an English speaker (with Tagalog, Bisayan, Northern Californian or "Austryan" accent), but the subject matter today applies to most languages, and to much else besides...

Consider the English language. It is made up of words such as the ones that make up this sentence. But a very few words, like "the" or "and" or "of" or "or", are used in spoken and written English much more frequently than, say, "zugzwang" or "pleonasm", which are used very rarely indeed.

By compiling the statistics on a one-million-word sample called the Brown University Corpus of Standard American English, linguists have discovered that the word "the" occurs about 7% of the time, and the word "of" about 3% of the time. But a huge number of English words turned out to be hapax legomena: they occurred only once in the one-million-word sample, that is, they are very rarely used in written English. (My use of "zugzwang" has probably given that poor word a tremendous boost.)

The main scientific result of such research was Zipf's Law: the frequency of usage of any word is inversely proportional to its rank in the frequency table. If you list all the words by RANK, from the most frequently used to the least frequently used, and plot how frequently each one is used against its rank, you get a hyperbola.
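For readers who like to tinker, here is a minimal Python sketch of the rank-and-count idea. It is only an illustration: the function names `rank_frequency` and `zipf_prediction` and the toy sentence are my own inventions, not anything from the Brown Corpus research, and the "prediction" is the idealized form of Zipf's Law (top share divided by rank).

```python
import re
from collections import Counter


def rank_frequency(text):
    """Return (rank, word, count, share) tuples, most frequent word first."""
    # Crude tokenizer (an assumption): lowercase runs of letters and apostrophes.
    words = re.findall(r"[a-z']+", text.lower())
    counts = Counter(words)
    total = sum(counts.values())
    return [
        (rank, word, count, count / total)
        for rank, (word, count) in enumerate(counts.most_common(), start=1)
    ]


def zipf_prediction(top_share, rank):
    """Share of usage an idealized Zipf's Law predicts for a given rank."""
    return top_share / rank


# Toy sample, far too small to show the law properly; swap in any long text.
sample = "the cat and the dog and the bird saw the end of the road"
table = rank_frequency(sample)

for rank, word, count, share in table[:3]:
    predicted = zipf_prediction(table[0][3], rank)
    print(rank, word, count, round(share, 3), round(predicted, 3))
```

If the top word takes about 7% of all usage, `zipf_prediction(0.07, 2)` gives 3.5% for the runner-up, which is roughly in line with the "about 3%" quoted above for "of" (assuming "of" is in fact the second-ranked word, which the figures above do not state outright).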

So what, you say. Well, it turns out that so-called power-law distributions are ubiquitous in all sorts of collections: for example, in the size of human settlements, in the magnitude of earthquakes, and in the distribution of wealth.
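One rough way to check whether a ranked list follows a power law is to take logarithms of both the rank and the size: a power law then shows up as a straight line, and the slope of that line is the exponent. The sketch below assumes made-up data that follows size = 1000 / rank exactly; the helper name `loglog_slope` is hypothetical, and a plain least-squares fit like this one is a crude estimate, not a rigorous statistical test.

```python
import math


def loglog_slope(xs, ys):
    """Least-squares slope of log(y) against log(x)."""
    log_x = [math.log(x) for x in xs]
    log_y = [math.log(y) for y in ys]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_y = sum(log_y) / n
    numerator = sum((a - mean_x) * (b - mean_y) for a, b in zip(log_x, log_y))
    denominator = sum((a - mean_x) ** 2 for a in log_x)
    return numerator / denominator


# Synthetic data (an assumption for illustration): size falls off as 1 / rank.
ranks = list(range(1, 101))
sizes = [1000.0 / r for r in ranks]

slope = loglog_slope(ranks, sizes)
print(round(slope, 6))
```

Real data such as city sizes or earthquake magnitudes will never sit exactly on the line, so treat the slope as a rough guide only.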

THE GAP BETWEEN THE RICH AND THE POOR: In the case of the rich and the poor, it's actually worse than one might think. The New Scientist reports that:
"THE rich are getting richer while the poor remain poor. If you doubt it, ponder these numbers from the US, a country widely considered meritocratic, where talent and hard work are thought to be enough to propel anyone through the ranks of the rich. In 1979, the top 1 per cent of the US population earned, on average, 33.1 times as much as the lowest 20 per cent. In 2000, this multiplier had grown to 88.5. If inequality is growing in the US, what does this mean for other countries?"

And it turns out that BLOGOSPHERE POPULARITY is also governed by a measurable POWER LAW DISTRIBUTION:

The plot above comes from Clay Shirky's Power Laws, Weblogs and Inequality. His post was quite famous in 2003 when it first came out, and it revealed an important aspect of blogosphere dynamics. The so-called A-List or top bloggers get the lion's share of the trillion or so clicks or eyeballs that are the currency and lifeblood of the World Wide Web. A click is a neuronal signal in the pathways of the global mind. Here are the tycoons of that world, the main synapses of the blogosphere.

So let me pose this question to Philippine Commentary readers: What do Instapundit and Andrew Sullivan or Michelle Malkin have in common with the simple, ubiquitous word "the" and the connective preposition "of" that they should all be at the top of their respective power law distributions?

What makes them such powerful memes?


