I'm out of town visiting family this week, but stole a few minutes away at the local library to post Chapter 6. This chapter begins the second section of the book, which explores crowdsourcing in action (as opposed to the first section of the book, which is devoted to the conditions that made crowdsourcing possible, nay, inevitable). For anyone just tuning in, I'm hoping to collect critical comments on these excerpts, which will then be published in an appendix in the crowdsourcing book due out this July.
“There were never in the world two opinions alike, any more than two hairs or two grains. Their most universal quality is diversity.” — Michel de Montaigne
Looking for a diversion one winter evening in 1995, Caltech professor Scott E. Page built a computer model in which “artificial agents”—little computer programs that interact according to rules written into their computer code—tried to solve a difficult problem. Such computer simulations are helpful to economists because they provide a controlled environment in which to test how humans, that most unpredictable of species, interact in complex systems like financial markets.
Page ran his simulation using two groups of agents. One was meant to represent the best and the brightest possible solvers—we’ll call this the Mensa group. The other group was composed of agents with a wide variation of problem-solving ability—some of the agents were talented, but many were not. It was as if he’d stopped by the faculty lounge at a mid-tier university and culled out everyone wearing brown socks. To Page’s great surprise, the brown socks outperformed the Mensa agents. As a random collection of mathematicians could hardly be expected to outperform Mensa’s best minds, Page decided to fiddle with his simulation, changing the rules by which the agents interacted. He got the same result. Still incredulous, he rewrote the program in a different computer language. The brown socks still won, over and over and over again. Page wanted to know why.
What began as a study break blossomed into a decade-plus research project, culminating in 2007 with a book titled “The Difference: How the Power of Diversity Creates Better Groups, Firms, Schools, and Societies.” By applying logical rigor and mathematical precision to collective intelligence, Page has created a theoretical framework to explain why groups often outperform experts. Why did the brown socks beat the Mensa agents so consistently? The brown socks weren’t as talented as the Mensa members, but they had something better: diversity.
Page has developed what he calls the Diversity Trumps Ability theorem. At its heart is the observation that people of high ability are a homogeneous group. They are often trained in the same institutions, tend to possess similar perspectives and apply similar problem-solving techniques, or heuristics. They are indeed better than the crowd at large, but at fewer things. And many problems don’t succumb to a single heuristic, or even to a set of similar ones. They require the brown socks, so to speak, to come along and try an approach that the “best minds” would never think to apply. “This theorem,” Page writes, “is no mere metaphor or cute empirical anecdote that may or may not be true ten years from now. It’s a logical truth.”
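For readers who want to poke at the idea themselves, here is a minimal sketch of this kind of experiment in Python. It is not Page's program: the ring-shaped landscape, the step-size heuristics, and every name and parameter below (`climb`, `group_climb`, `run`, the group size, the seed) are assumptions chosen for illustration. Each agent is a short list of step sizes it is willing to try, the "Mensa" group is the set of agents that score best working alone, and the "brown socks" are drawn at random.

```python
import random
from itertools import permutations


def make_landscape(n, rng):
    """Assign a random value to each of n points arranged on a ring."""
    return [rng.uniform(0, 100) for _ in range(n)]


def climb(start, heuristic, values):
    """One agent's search: try each step size in turn, move whenever the
    point that many steps ahead is better, stop when no step helps."""
    n = len(values)
    pos = start
    improved = True
    while improved:
        improved = False
        for step in heuristic:
            nxt = (pos + step) % n
            if values[nxt] > values[pos]:
                pos = nxt
                improved = True
    return pos


def group_climb(start, group, values):
    """Agents take turns, each starting where the previous one got stuck.
    The search ends when a full round produces no improvement."""
    pos = start
    while True:
        before = pos
        for heuristic in group:
            pos = climb(pos, heuristic, values)
        if pos == before:
            return pos


def average_score(group, values):
    """Average final value over every possible starting point."""
    n = len(values)
    return sum(values[group_climb(s, group, values)] for s in range(n)) / n


def run(seed=0, n=100, max_step=8, k=3, group_size=10):
    """Compare the best solo performers with a random group (toy setup)."""
    rng = random.Random(seed)
    values = make_landscape(n, rng)
    heuristics = list(permutations(range(1, max_step + 1), k))
    ranked = sorted(heuristics,
                    key=lambda h: average_score([h], values),
                    reverse=True)
    best_group = ranked[:group_size]                      # the "Mensa" agents
    random_group = rng.sample(heuristics, group_size)     # the "brown socks"
    return average_score(best_group, values), average_score(random_group, values)


if __name__ == "__main__":
    best, diverse = run()
    print(f"best solo performers: {best:.2f}   random group: {diverse:.2f}")
```

Which group comes out ahead on any single run depends on the seed and the parameters, so treat the output as something to experiment with rather than a proof. A tiny hand-built case shows the mechanism, though: on the landscape `[1, 5, 2, 9, 3, 0]`, an agent that only steps by 1 gets stuck at the value 5, an agent that only steps by 2 gets stuck at 3, and the two working in turn reach 9.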
Understanding diversity is imperative to understanding collective intelligence, and collective intelligence is an essential ingredient in one of the primary categories of crowdsourcing—the attempt to harness many people’s knowledge in order to solve problems or predict future outcomes or help direct corporate strategy. Collective intelligence is the form of group cognition that we see at work in ant colonies that act like cells of a single organism. We also see it in the very human ritual of voting, in which millions of individual choices result in a single decision. Scholars from disciplines ranging from sociology to behavioral psychology to computer science have studied the phenomenon since the early years of the 20th century, but the emergence of the Internet has given new import to collective intelligence, for the simple reason that the Internet has done more than anything else in history to facilitate it.
The types of crowdsourcing that traffic in collective intelligence take two primary forms. The first category is the prediction, or information, market, in which investors purchase “futures” pegged to some expected outcome, like the winner of a presidential contest or the Oscar for Best Picture. These function much like a stock market: Individuals open an account, then buy and sell shares at the going market value. If an investor buys low (a dark horse nominee for Best Picture, perhaps), he cleans up when his prediction turns out to be correct. For instance, the best insight into the prevailing political winds in the US election of 2008 isn’t a poll or a pundit, but a chart on Slate that displays the relative price of the candidates on the futures market, Intrade. (Intrade showed Obama leading Clinton just after the Super Tuesday primaries, weeks before national polls reflected his improved fortunes.) The second form is the problem-solving, or crowdcasting, network in which someone with a problem broadcasts it to a large, undefined network of potential solvers. InnoCentive is an example of this, with its distributed collection of 120,000 scientists who tackle thorny R&D problems for Fortune 500 companies.
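The arithmetic behind "buying low" can be sketched in a few lines of Python. This assumes a simple binary contract that pays a fixed amount if the event happens and nothing otherwise; the function names, the $1 payout, and the prices in the usage note are hypothetical, and real markets add fees and other wrinkles that this ignores.

```python
def implied_probability(price, payout=1.0):
    """Read a contract price as a rough estimate of the event's probability.

    Assumes a binary contract worth `payout` if the event happens, else 0.
    """
    if not 0 < price < payout:
        raise ValueError("price must be between 0 and the payout")
    return price / payout


def profit(price, shares, event_happened, payout=1.0):
    """Net gain or loss from buying `shares` contracts at `price` each."""
    cost = price * shares
    proceeds = payout * shares if event_happened else 0.0
    return proceeds - cost
```

Under these assumptions, a dark horse trading at 25 cents on a $1 contract is being given roughly a one-in-four chance by the market. Someone who buys 100 shares nets $75 if the dark horse wins and loses the $25 stake if it doesn't.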
But, as is often said of Wikipedia, collective intelligence works better in practice than it does in theory. The results it produces seem counter-intuitive, running against the grain of what we thought we knew about the way the world works—basically, that Mensa should always trump brown socks. In this as in other ways, the Internet provides an opportunity to rethink our intuitive understanding of human behavior. Before we dive into crowdsourcing’s practical applications of collective intelligence, we’ll try to establish some theories for why it works.
Crowdsourcing is rooted in a fundamentally egalitarian principle: Every individual possesses some knowledge or talent that some other individual will find valuable. In the broadest terms, crowdsourcing involves making a connection between the two. That is to say, in another counter-intuitive twist, the individual—with all his or her peculiarities—is at the center of crowdsourcing. In our particulars, we are all the raw stuff of circumstance: birthplace, family, geography, experience and the innumerable other variables that combine in the strange alchemy that produces the unique person. When uniqueness persists in large groups we call it diversity, a term that’s been saddled with some unfortunate baggage in two decades of identity politics. But so far as crowdsourcing is concerned it’s important to divorce the concept of diversity from the politics of diversity. What scholars and entrepreneurs are discovering is that the sum of our differences constitutes an immensely powerful force that can be applied to solving problems or developing new products or simply making the world, online and off, a more interesting place to live. To paraphrase Montaigne, the only thing we all have in common is that we are all quite different. In the networked age, that can be a very good thing.
The Difference a Tweak Makes
Ned Gulley operates something of a Petri dish for diversity’s effects on group intelligence. Gulley is a software designer at The MathWorks, the company best known for developing MATLAB, a language used by mathematicians and engineers for crunching the kinds of extravagantly complex calculations that leave most people slack-jawed. In 1999 the company decided to hold a programming contest. Previous contests had been conducted over email, but judging the resulting entries was cumbersome and tedious. So Gulley suggested hosting the new competition on the Web site and grading it in real time.
The company’s goal was straightforward enough: “Provide an entertaining diversion to the community of MATLAB users while encouraging the exchange of good programming practices.” Programming contests have occupied a time-honored position in geek culture since the earliest days of computing, for just this reason: they make the development of skills feel like a game. Collegiate computer science departments began holding various tourneys as early as 1970, and informal matches were being held long before that. MathWorks was taking part in a venerable tradition.
At first blush Gulley’s format appeared utterly conventional. Contestants were required to solve what is commonly called a “traveling salesman problem,” the classic example of which asks for the shortest possible round trip a salesman can take through a given list of cities. Participants submitted a solution in the form of an algorithm, or computer code that directed the salesman through a number of steps. The contest ended after ten days, at which point the most efficient algorithm would be declared the winner.
But Gulley added an extra twist: Participants were allowed to steal each other’s code in order to create a better solution. Every time a new solution was sent in, it was quickly scored, ranked and posted to the Web site. Every other contestant could then see the programming code, in full. They could cut-and-paste the best bits and resubmit it with any improvements, however minor. If the tweaks, as Gulley calls them, created a more efficient algorithm, it vaulted the contestant into first place, even if he or she had only changed a few lines of code.
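To make the idea of a tweak concrete, here is a toy version in Python rather than MATLAB. None of it comes from the contest itself: the scoring function, the nearest-neighbor starting entry, and the segment-reversing "2-opt" tweak are standard textbook techniques chosen as assumptions for this sketch, and all the names are hypothetical. One function stands in for an original submission, and a second takes that submission and squeezes a shorter round trip out of it.

```python
import math


def tour_length(cities, order):
    """Score an entry: total length of the round trip through all cities."""
    return sum(
        math.dist(cities[order[i]], cities[order[(i + 1) % len(order)]])
        for i in range(len(order))
    )


def nearest_neighbor(cities):
    """A plausible first submission: always visit the closest unvisited city."""
    unvisited = set(range(1, len(cities)))
    order = [0]
    while unvisited:
        last = cities[order[-1]]
        nxt = min(unvisited, key=lambda j: math.dist(last, cities[j]))
        unvisited.remove(nxt)
        order.append(nxt)
    return order


def two_opt_tweak(cities, order):
    """A 'tweak' on someone else's entry: keep reversing segments of the
    tour for as long as a reversal makes the round trip shorter."""
    best = list(order)
    best_len = tour_length(cities, best)
    improved = True
    while improved:
        improved = False
        for i in range(1, len(best) - 1):
            for j in range(i + 1, len(best)):
                candidate = best[:i] + best[i:j + 1][::-1] + best[j + 1:]
                cand_len = tour_length(cities, candidate)
                if cand_len < best_len - 1e-12:
                    best, best_len = candidate, cand_len
                    improved = True
    return best
```

On four cities at the corners of a unit square, an entry that crosses the square diagonally twice is uncrossed by the tweak, which leaves a trip around the perimeter of length 4. The tweaker needs no new idea about the whole problem, only a small change to the existing route.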
The result, Gulley says, closely resembles the actual process of software development. “In an office full of developers, if one person solves a problem, everyone else will gather around to see how they did it, have their ‘a ha’ moment, then factor it into their own code,” he says. “There’s a pervasive myth of Thomas Edison in our culture. The smart guy who’ll get us out of this fix, who’ll walk into a room and come out with a brilliant solution.” In reality, most breakthroughs are the product of teamwork. “I wanted to create a contest that more accurately modeled the way ideas really move through the world.”
Far from alienating MATLAB users, encouraging outright theft has inspired ever-greater levels of obsessive effort on their part. Nathan, a contestant from Ireland, wrote Gulley to note that he was afflicted by “physical trembling while making the final preparations to submit code.” The most devoted players schedule vacations around it, or even cut classes and use sick days in the race to be at the top of the leaderboard. Gulley believes the MATLAB contest falls into a new category of competition he calls “addictive collaboration.” The way it works, Gulley says, is that one programmer will spend all night devising a brilliant algorithm that takes the lead. “Then someone comes along, adds a little tweak and then they go into first place. The first programmer is like, ‘That asshole! He just cut me off by copying my code!’ And so the first programmer adds another tweak in order to regain the lead.” The ultimate goal, Gulley says, isn’t to win. It’s to come up with a brilliant tweak that will impress the other competitors. “It’s like a shadow scoring system based on reputation.”
But the extraordinary aspect of the MATLAB contest isn’t the fervor it inspires. It is the fact that the ten-day hurly-burly, in which all intellectual property is thrown into the public square to be used and re-used at will, turns out to be an insanely efficient method of problem solving. The contest has been held twice a year since its inception in 1999. On average, Gulley notes, the best algorithm at the end of the contest period exceeds the best algorithm from day one by a factor of 1,000.
This is the point at which practice would seem to leave theory in the dust, gasping for breath. How can we explain so mind-boggling an improvement? Obviously the presence of the crowd helped. Many smart programmers were able to create a better solution than any single one of them would have put forth. It’s also clear that the free exchange of ideas abetted the process, as it fostered a collaborative environment in which good ideas could be tweaked into better ones. But none of this explains the exponential arc traced by each successively more efficient algorithm. What’s striking is that the best programmers aren’t necessarily the ones making the most valuable contributions. The novices often provided a crucial tweak that led to a breakthrough. Or as Gulley puts it: “Sometimes a script kiddie”—geek slang for an inexperienced, usually pubescent hacker—“will come along and make this tiny change, and even Edison has to rub his eyes and say, ‘Whoa!’”
In other words, the brown socks beat Mensa again. However counter-intuitive this might seem, it’s not entirely unanticipated: the theoretical basis for the MATLAB contest’s results was developed some 50 years before Page ever started pitting his computer agents against one another.
Diversity and the Market
The economist F.A. Hayek is best remembered, and not always fondly, as having composed the theories that drove the free market policies of Margaret Thatcher and Ronald Reagan. Born in Vienna at the turn of the 20th century, Hayek was already a highly reputed economist by the time Hitler and Stalin had come into power. At the London School of Economics and Political Science Hayek advocated a harshly critical view of the planned economies utilized by both Nazi and Soviet states at the time. Hayek believed the marketplace was a highly efficient mechanism, whose ability to coordinate economic activity suffered only to the extent any individual or expert attempted to meddle with it. Hayek did as much to champion Adam Smith’s famous “invisible hand” as any other contemporary economist, and in 1974 he won the Nobel Prize in Economics in part for his work on this subject. With friends like Thatcher and Reagan, Hayek never wanted for enemies, but controversy has tended to overshadow the broad range of his contributions. In a seminal 1945 paper, “The Use of Knowledge in Society,” Hayek observed that society had failed to properly appreciate a category of knowledge that rested in neither the academy nor corporate boardrooms, “the knowledge of particular circumstances of time and place.” Due to such “local knowledge,” or what’s now commonly called “private information,” nearly every individual “has some advantage over all others because he possesses unique information of which beneficial use might be made.” The remaining project, then, lies in obtaining all that dispersed information. Hayek writes that:
Each member of society can have only a small fraction of the knowledge possessed by all, and each is therefore ignorant of most of the facts on which the working of society rests … civilization rests on the fact that we all benefit from knowledge which we do not possess. And one of the ways in which civilization helps us to overcome that limitation on the extent of individual knowledge is by conquering ignorance, not by the acquisition of more knowledge, but by the utilization of knowledge which is and which remains widely dispersed among individuals.
And this was written before the emergence of the Internet, which has proven better at aggregating and utilizing widely dispersed information than Hayek ever could have imagined. The MATLAB contest could almost be an exercise attempting to prove Hayek’s central observation—that we may well already possess the solutions to our greatest dilemmas, and the project before us is simply to gather all that knowledge into a central storehouse. It’s no coincidence that Gulley adopts the language of Hayekian economics when he speaks of the open architecture of the MATLAB contest as a way of “tickling people’s private information out of each other.”
Gulley has painstakingly gathered a wealth of observational data, from which he’s been able to draw some conclusions about exactly why this addictive collaboration works as effectively as it does. By rendering the data visually in the form of a line graph, Gulley was able to determine that advances are made in great leaps, which are followed by longer periods of minor tweaking. “People will sniff out the slack in an algorithm, like hyenas worrying over a carcass. Then they get exhausted until someone comes along and whips the carcass into a new position, then it starts all over again.”
The charts Gulley has created could just as easily be depicting the course of evolution in animal species. Genetic mutation doesn’t follow a linear curve either, but is instead characterized by, well, leaps and tweaks, an observation central to the theory of “punctuated equilibrium” developed by the evolutionary biologists Stephen Jay Gould and Niles Eldredge. Gulley’s hardly unaware of the parallels and believes, on the basis of some compelling evidence, that his data on the MATLAB contest might reveal some deeper truth about how progress is rendered—in social as well as biological terms. “We’re taught a version of history in which great men, Napoleon for instance, are the sole actors. But the reality is much messier, and involves a complex interplay between those that make the leaps and those that make the tweaks.” History requires the service of the script kiddies—the brown socks—to come along and employ their unique perspective to reorient everyone to a new viewpoint.
“We have truly brilliant people playing. One of them will make a breakthrough, and on its own it would have been the best solution in an old school contest. Just because they’re brilliant. But with the MATLAB contest, immediately people are able to come along and tweak it and squeeze some slack out of it. No single person could do that. It’s the swarm, this great big collective brain we have access to. What will really be amazing is if we can tap that brain to cure cancer.”