At a conference in Mexico recently, I ran into Wired editor Chris Anderson. His essay on the petabyte age, published a couple of years ago, sounded the death knell for the scientific method. I was seduced by the argument at the time, as well as by the beautiful graphics that accompanied the piece. Visualising Big Data can be a pleasure, as this graphic of edits to Wikipedia pages shows.
But when I started to dig around, I found that there’s nothing new about Big Data. People have been complaining about the data deluge since the 1600s.
“One of the diseases of this age is the multiplicity of books; they doth so overcharge the world that it is not able to digest the abundance of idle matter that is every day hatched and brought forth into the world,” thundered Barnaby Rich in 1613. He himself contributed 26 books to the multiplicity and eventually gave his name to the Barnaby Rich effect: “a high output of scientific writings accompanied by complaints on the excessive productivity of other authors.”
What about the fact that new technologies now let us simply throw gobs of data at the wall, see what sticks, and turn that into a new theory, rather than starting with a hypothesis and laboriously collecting the data to confirm or refute it? In an essay just out in Prospect, I’m forced to conclude that hypothesis-driven science has always been a bit of a myth, shaped more by the way science is funded than by the need to create or maintain rigour.
I had fun writing the essay because it gave me an excuse to sit in the rare manuscripts room of the glorious Wellcome Library, rummaging through books written 300 years ago by the fathers of data mining and scraping, John Graunt and William Petty. As I note in the essay:
In one of his “Essays on Political Arithmetick,” Petty took death rates collected for another purpose, stirred them with a couple of wild assumptions on population, and seasoned them with a dash of prejudice to conclude that British hospitals were much less likely to kill their patients than French ones, where “Half the said numbers did not die by natural necessity but by the evil administration of the hospital.” In a precursor to the World Bank’s habit of pricing productivity lost by ill-health, Petty goes on to calculate the cost of the unnecessary deaths, valuing the French at £60 each, “being about the value of Argier Slaves (which is less than the intrinsik value of People at Paris).”
English commentator trashes French health system. Indeed, there’s nothing new about the way we use data…