
Data can be the starting point of an investigative story, the basis for it or the thing that conclusively proves its premise. It can also be misinterpreted or used selectively to make a case that’s technically accurate but contextually false or misleading.

In his job as a data journalist at the Center for Public Integrity, Joe Yerardi does analyses that help uncover great stories hiding in big sets of numbers. He also questions the factors that can add nuance to, or lead to counterintuitive conclusions about, patterns that appear obvious.

He combines traditional reporting techniques with data analysis to tell investigative stories. His reporting has earned, among other honors, a Gerald Loeb Award for an investigation into the influence pharmaceutical companies wield over state Medicaid programs’ drug purchasing decisions.

This is Yerardi’s second stint at Public Integrity; he interned on its data team in 2012. Before rejoining, he covered a wide range of beats as a data reporter for inewsource in San Diego and as the data editor at the San Antonio Express-News. He earned his undergraduate degrees in history and journalism at New York University and his master’s in journalism at the University of Missouri.

This year, Yerardi worked as a data reporter on two major series: “Criminalizing Kids,” which revealed that schools refer tens of thousands of students to law enforcement every year, and “Cheated at Work,” which revealed that U.S. employers that illegally underpay workers face few repercussions, even when they do so repeatedly.

Public Integrity data journalist Joe Yerardi

What does your day-to-day work as a data journalist look like? Can you explain your work to the average person? 

Big picture? I turn data into compelling narratives. So how does one actually do that? The same way any reporter finds a story: You ask questions. In my case, I’m asking questions of the data.

This might strike some readers as kind of weird, but think about it. As reporters, we want to know things like: How often is something happening? Is it happening more or less since a new policy was implemented? Are different groups of people affected by it to a greater or lesser degree? Is it happening more often in some places than others?

You can ask a politician or an activist or a researcher these questions. You can also ask data these questions. Both people and data can actually mislead you, but that’s another discussion.

What we tend to do at CPI is ask these questions of the data, take the answers the data gives us to humans and then ask those humans to put the answers from the data in the broader context. That’s how I’ve always conceived of data reporting: The data tells you what’s happening, and the people tell you why it’s happening.

You worked on two major projects this year: “Criminalizing Kids” and “Cheated at Work.” What did your involvement look like on both? 

I was the primary data reporter on both series. Since the core of both projects rested on complex data analyses, I’ve been pretty busy this year.

I’ll tackle Cheated at Work first.

We got the key data from a FOIA request our labor reporter, Alexia Fernández Campbell, filed with the Department of Labor. The DoL sent us a database of closed investigations conducted by their Wage and Hour Division.

The database was split into a few main tables that we had to join together in order to analyze.

I’ve now worked on three stories in the series, each requiring me to ask different questions of the data. For some of the stories, the DoL database on its own allowed us to answer questions we wanted to know: How much did the Department collect in back wages owed to employees vs. how much did they leave uncollected, for example? For some of the stories, I had to add data from outside sources like the Census Bureau in order to reach conclusions the DoL data on its own could not.

Nobody had to file a FOIA to get the data for Criminalizing Kids. Instead, I downloaded a school year’s worth of law enforcement referral and enrollment data by demographic group from the Department of Education’s Civil Rights Data Collection (CRDC).

As with the DoL data, the referral and enrollment data came in separate tables I needed to join together before I could analyze. And as with the DoL data, I had to combine the CRDC data with data from other sources to reach certain conclusions.
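For readers curious what joining two such tables involves, here is a minimal sketch in Python. It is not Public Integrity's actual code: the column names (`school_id`, `group`, `enrolled`, `referred`) and every number are made up for illustration, and the per-1,000 rate is just one plausible way to compare groups of different sizes.

```python
# Hypothetical rows standing in for two separate tables; not real CRDC figures.
enrollment = [
    {"school_id": "A", "group": "all", "enrolled": 500},
    {"school_id": "B", "group": "all", "enrolled": 400},
]
referrals = [
    {"school_id": "A", "group": "all", "referred": 10},
    {"school_id": "B", "group": "all", "referred": 2},
]


def join_tables(enrollment_rows, referral_rows):
    """Join the two tables on (school_id, group) and add a rate per 1,000 students."""
    # Index the referral table by the shared key so each lookup is cheap.
    referred_by_key = {
        (row["school_id"], row["group"]): row["referred"] for row in referral_rows
    }
    joined = []
    for row in enrollment_rows:
        key = (row["school_id"], row["group"])
        # Assumption: a missing referral row means zero referrals. With real
        # data, that is the kind of question to put to the agency's specialists.
        referred = referred_by_key.get(key, 0)
        rate = 1000 * referred / row["enrolled"] if row["enrolled"] else None
        joined.append({**row, "referred": referred, "rate_per_1000": rate})
    return joined


joined = join_tables(enrollment, referrals)
for row in joined:
    print(row["school_id"], row["group"], row["rate_per_1000"])
```

How the sketch treats a school with no matching referral row is a judgment call, which is one reason checking how the data is collected and stored matters before publishing anything.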

A lot of the questions I asked of the data were informed by the reporting of my co-reporters — Alexia Fernández Campbell and Susan Ferriss on Cheated at Work, and Corey Mitchell and Susan Ferriss on Criminalizing Kids.

Much of my work involved reshaping the data in all sorts of different ways that would allow me to ask different questions of it. Counting wage theft cases by year requires differently formatted data than counting wage theft cases by industry. Likewise, I had to reformat the data for Criminalizing Kids to run calculations at the school level, then the district level, then the state level, and finally the national level.

It’s definitely not the sexiest part of the job but a lot of investigative data journalism is figuring out which column to group by.
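A tiny illustration of that point, again in Python and again with invented records rather than anything from the DoL database: the same rows answer different questions depending on the column used for grouping. The `count_by` helper and the field names are hypothetical.

```python
from collections import Counter

# Invented case records for illustration only.
cases = [
    {"year": 2019, "industry": "restaurants", "state": "TX"},
    {"year": 2019, "industry": "construction", "state": "TX"},
    {"year": 2020, "industry": "restaurants", "state": "CA"},
]


def count_by(rows, column):
    """Count rows per distinct value of the chosen column."""
    return Counter(row[column] for row in rows)


by_year = count_by(cases, "year")          # how many cases each year?
by_industry = count_by(cases, "industry")  # how many cases in each industry?
by_state = count_by(cases, "state")        # the same idea, rolled up to states

print(by_year)
print(by_industry)
print(by_state)
```

Swapping the column name is all it takes to move from a by-year question to a by-industry one, or from one geographic level to another.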

And I have to stress that I spent a lot of time communicating with (and convincing government PR folks to let me communicate with) data specialists at government agencies for both stories to make sure I was properly interpreting the data. I want to stress just how important it is to really make sure you understand how the data is collected, stored and interpreted before you publish anything. That means getting in touch with the people who use this data every day in their work. That’s crucial.

Both stories involved some work with partner news organizations. With Criminalizing Kids, especially, I spent a good deal of time helping local reporters take our data (which we’ve since made publicly available on the internet for anyone who wants to use it) and use it to tell stories relevant to their communities.

As a data journalist, your work spans across the whole newsroom and doesn’t give you a specific beat. What’s exciting about this? What can be challenging about this? 

I’ve always been a happy generalist. There are so many fascinating things (and you can define that word broadly) in the world. And I enjoy learning a little bit about a lot of them. I sort of view my job in the same way: basically, gain just enough knowledge about a topic to make you dangerous.

I think a lot of journalists would tell you something similar. Learn fast and then share what you’ve learned with the world (or at least your readers). That’s fun because it keeps the job fresh. You’re constantly learning about new issues, institutions, phenomena and then translating that knowledge into compelling narratives for a broad audience.

The challenge, of course, is that it’s difficult to come up with story ideas when you lack a beat. You don’t have sources feeding you stories, you don’t have a list of other reporters you can follow to see what they’re doing. You really have to come up with story ideas out of thin air. That’s a big challenge.

Why should people reading this support Public Integrity’s award-winning work?

There’s all sorts of types of news out there and many ways to categorize it. But one way of thinking about it is to divide the great mass of content on our screens into two buckets: news that closes your mind and news that opens your mind.

In the former bucket, I’d put a lot of the stuff you’d read in clickbaity tabloid sites or watch on cable news: stories that are designed to reinforce people’s existing biases and worldviews. Stuff that makes audiences go “Ah, yes. I was right all along to feel that way about x topic or y group of people.”

In the latter bucket, I’d put the sort of work we do at the Center for Public Integrity. I’m talking about journalism that seeks to tell people things they didn’t know. And not just in the “Oh, that’s a fun bit of trivia” way like some journalistic Snapple Cap Fact. I mean investigating issues and institutions that affect people’s lives to uncover abuse, discrimination and injustice. And then going and writing about it so people can get mad, get organized and force the people with the power to make it right.

As consumers, each of us gets to choose which sort of news we want to see more of. Which would you rather support?


Help support this work

Public Integrity doesn’t have paywalls and doesn’t accept advertising so that our investigative reporting can have the widest possible impact on addressing inequality in the U.S. Our work is possible thanks to support from people like you. Donate now.


