One of the great advantages of data journalism is that it allows you to go beyond anecdote and produce evidence, said Steve Doig, professor at the Walter Cronkite School of Journalism at Arizona State University, speaking at the International Journalism Festival in Perugia.
The first in a series of data journalism panels at the Festival looked at how data journalism and computer assisted reporting have developed over the last few decades.
Doig was among the first journalists to make use of a computer in reporting. Philip Meyer was possibly the first, Doig said, when he carried out surveys in 1967 to help analyse the civil rights riots. Meyer wrote a book called Precision Journalism, but at the time it was difficult to carry out many of his ideas because computers were not yet widely accessible. After the ‘micro-revolution’ of the late 1970s, reporters began to play around with computers, Doig said, and he realized a computer could help him do his job better.
A key advance in data journalism came in 1989, Doig said, when Bill Dedman, a reporter from the Atlanta Journal-Constitution, won a Pulitzer for The Color of Money, a story that used data journalism techniques to investigate racial discrimination in home mortgage lending.
Sarah Cohen, journalism professor at Duke University and former Washington Post writer, believes that it is important for journalists to borrow from other fields such as social sciences or history.
Her work at Duke is now looking to the future of data reporting, she said, such as text mining, machine learning, and how to extract more meaning from audio and video records. Aron Pilhofer, interactive editor at The New York Times, agreed that such areas might be the future of data journalism.
Simon Rogers, data editor at the Guardian, explained that a lot of the journalism the Guardian does these days involving data is “quick and dirty.” There are various free tools available that make it much easier to process data and display it in exciting ways. “There is almost no excuse these days not to use these tools,” Pilhofer said, “now that there’s free open-source software.” Now that the technological barriers have largely been lowered, the bigger problems are people and how to obtain data, he added.
Rogers highlighted an interactive visualisation on the Guardian website that Pilhofer described as the “single best social media data interactive I’ve ever seen.” The visualisation demonstrates how rumours spread and were retracted on Twitter during last summer’s riots in London. It is not easy to make something meaningful out of social media statistics, Pilhofer stressed.
It’s now almost impossible to work alone with data, panelists emphasized. “I don’t think there’s anyone anymore who knows how to do everything in data journalism,” Doig said. Cohen explained that when she was at the Washington Post, most data investigations involved at least three people: one with the idea, one good writer and one data expert. Rogers works with a researcher and an investigative reporter.
You can do quite a bit with a small team, however, said Pilhofer, explaining that at The New York Times, even though there are 14 people on his team, most projects involve two or three. All panelists agreed that reaching out to the wider community to find people with the skills you need is an important part of getting the most out of data. “Have a statistician as a friend,” advised Elisabetta Tola, founder of Formicheblu.
The panel offered advice to journalists who are keen to start working with data but fear that their editors will not approve: just do it and trust that they will see its value. “Start to build stuff,” said Rogers. “If you create a great visualization and show it to your editor you are going to get it published,” said Cohen.
Working with data requires a culture change in more ways than one, said Pilhofer. Already, The New York Times has released all sorts of open-source software, and he expects the future of this kind of reporting to be much more collaborative and to involve more people who have never considered themselves journalists.
Journalists should learn to admit what they don’t know, he said: when it comes to data, it’s better to be open about the limits of your knowledge and allow others to help you learn more.