Posted on October 23, 2019

Building a Database from Scratch: Behind the Scenes with Documenting Hate Partners

Rachel Glickhouse, ProPublica, October 23, 2019

For nearly three years, ProPublica’s Documenting Hate project has given newsrooms around the country access to a database of personal reports sent to us by readers about hate crimes and bias incidents. We’ve brought aboard more than 180 newsrooms, and some have followed up on these reports — verifying them, uncovering patterns and telling the stories of victims and witnesses. Some partners have done significant data journalism projects of their own to augment what they found in the shared dataset.

The latest such project comes from News12 in Westchester County, New York. Reporter Tara Rosenblum joined the Documenting Hate project after a spate of hate incidents in her coverage area. Last month, News12 aired her five-part series about hate crimes in the Hudson Valley and a half-hour special covering hate in the tri-state area. The station also published a public database of hate incidents going back a decade. {snip}

Rosenblum and her team built the database by requesting records from every police department in their coverage area, following up on tips from Documenting Hate and collecting clips about hate incidents the news network was already reporting on. Getting records was a laborious process, particularly from small agencies, some of which accept requests only by fax. {snip}

She also expanded the scope of the project beyond her local newsroom and brought in News12 reporters from the network’s bureaus in Connecticut, New Jersey, Long Island, the Bronx and Brooklyn. The local newsrooms used Rosenblum’s investigation as their model, examining hate incidents since 2016. In all, six News12 reporters in three states documented around 2,300 hate incidents.


Catherine Rentz, a reporter at The Baltimore Sun, wanted to investigate hate incidents in her area after learning how the Maryland State Police tracks hate crimes. (Since this writing, Rentz left the Sun to pursue freelance projects.) Maryland has been collecting hate crime data since the 1980s, so there was much to explore, Rentz said. {snip}


To collect the data, she set up a spreadsheet and entered each case by hand, since the state police records were in PDF files and she wasn’t able to easily extract data from them. She faced a number of other challenges. For instance, many agencies redacted victims’ names, which made it hard to use the data to find potential sources to interview. And when she did find names, some victims didn’t want to talk about what happened to them.


In the course of her investigation, Rentz discovered that some agencies did collect reports of potential bias crimes but weren’t reporting them to the state police, so those incidents weren’t being counted. She also looked at prosecutions; in 2017, there were nearly 400 bias crimes reported to police, but only three hate crime convictions.


Last year, Reveal investigated hate incidents that involved the invocation of President Donald Trump’s name or policies. They published a longform story and produced a radio show. Reporter Will Carless built a database using reports from the Documenting Hate project and news clips. He worked his way through a color-coded spreadsheet of hundreds of entries to verify reports and find sources to highlight in the story. After the investigation was published, Carless says he received emails from readers who said similar incidents had happened to them; others thanked him for connecting the dots and gathering data on previously disparate stories. He also said a few academics told him they were going to include the story in their courses that involve hate speech.

And this year, HuffPost created a database for a forthcoming story examining hate incidents in which the perpetrator used the phrase “go back to your country” or “go back to” a specific country. Their database combined tips submitted to the Documenting Hate project with news clips culled from the Lexis-Nexis database, social media reports and police reports gathered by ProPublica. The investigation is slated to publish this fall.


Like News12, HuffPost opened the project up to its newsroom colleagues, bringing in reporters from HuffPost bureaus in the United Kingdom and Canada. {snip} Their plan is to publish stories using the tips they collect when HuffPost’s U.S. newsroom publishes its investigation.

Want to create your own database of hate crimes? Here are some tips about how to get started.

  1. Get hate crimes data from your local law enforcement agency.


Some things to keep in mind:

More than half of hate crime victims don’t report to the police at all. And the police don’t always do a good job handling these crimes.

That’s because police officers don’t always receive adequate training about how to investigate or track hate crimes. Still, training doesn’t guarantee that these crimes are handled properly. Some police mismark hate crimes or don’t know how to fill out forms to properly track these crimes. {snip}

  2. Put together a list of known incidents using media reports and crimes tracked by nonprofit organizations.


You can also consult organizations that track incidents and add them to your list of known crimes. They can give you a sense of how police respond to hate crimes against these groups. Here are some examples.

  • CAIR (Muslim community)
  • ADL (Jewish community)
  • SAALT (South Asian community)
  • AAI (Arab community)
  • AVP (LGBTQ community)
  • MALDEF (Latino community)
  • NAACP (black community)
  • NCTE (trans community)
  • HRC (LGBTQ community)

  3. Review the police records carefully, and request incident reports to get the full picture.

Once you receive data from the police department, compare it with your list of known hate crimes from media and nonprofit reports. That will be especially useful if the police claim to have no hate crimes in the time period. Ask about any discrepancies.
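
The comparison described above can be sketched in a few lines of Python. This is only an illustration, not a method used by the Documenting Hate project: the field names (`date`, `city`) and the sample records are hypothetical, and matching on date and city is an assumption that will miss incidents logged under a different date or jurisdiction.

```python
from datetime import date


def incident_key(record):
    """Build a rough matching key from a record's date and city (assumed fields)."""
    return (record["date"], record["city"].strip().lower())


def find_missing(known_incidents, police_records):
    """Return known incidents with no police record on the same date in the same city."""
    police_keys = {incident_key(r) for r in police_records}
    return [r for r in known_incidents if incident_key(r) not in police_keys]


# Hypothetical sample data, for illustration only.
known_incidents = [
    {"date": date(2018, 3, 4), "city": "Springfield", "source": "news clip"},
    {"date": date(2018, 7, 19), "city": "Riverton", "source": "nonprofit tracker"},
]
police_records = [
    {"date": date(2018, 3, 4), "city": "springfield ", "bias": "anti-Jewish"},
]

missing = find_missing(known_incidents, police_records)
for incident in missing:
    print(incident["date"].isoformat(), incident["city"], "- not in police data")
```

Anything this turns up is a question to bring to the department, not proof of an omission; a near match on a neighboring date is worth checking by hand.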


Then review the data and incident reports for potential mismarked crimes. Take a look at the types of bias listed for each crime. We found that reports of anti-heterosexual bias crimes were almost always mismarked, either as different types of bias crimes or crimes that weren’t hate crimes at all.

{snip} We’ve encountered cases in which police marked incidents as having anti-Native American bias in their forms or computer systems because they thought they were selecting “none” or “not applicable.”

Next, check the crime types. We’ve also seen that certain crime types are unlikely to involve a bias motivation but are sometimes erroneously marked as hate crimes; examples include drug charges, suicide, drug overdose and hospice death. {snip}
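
The checks in this step can be sketched as a simple filter that flags records for manual review. This is a minimal Python sketch under stated assumptions: the field names (`bias`, `crime_type`), the label spellings and the sample records are hypothetical, and real police data will use its own codes. A flag only means a reporter should read the incident report, not that the record is wrong.

```python
# Bias categories the article reports seeing mismarked (label spellings are assumptions).
SUSPECT_BIAS = {"anti-heterosexual", "anti-native american"}

# Crime types the article lists as unlikely to involve a bias motivation.
SUSPECT_CRIME_TYPES = {"drug charges", "suicide", "drug overdose", "hospice death"}


def review_reasons(record):
    """Return a list of reasons this record deserves a manual look (empty if none)."""
    reasons = []
    bias = record.get("bias", "").strip().lower()
    crime = record.get("crime_type", "").strip().lower()
    if bias in SUSPECT_BIAS:
        reasons.append(f"bias type '{bias}' may be mismarked")
    if crime in SUSPECT_CRIME_TYPES:
        reasons.append(f"crime type '{crime}' is unlikely to involve bias")
    return reasons


# Hypothetical sample records, for illustration only.
records = [
    {"id": 1, "bias": "Anti-Heterosexual", "crime_type": "vandalism"},
    {"id": 2, "bias": "anti-Black", "crime_type": "assault"},
    {"id": 3, "bias": "Anti-Native American", "crime_type": "Drug overdose"},
]

flagged = {r["id"]: review_reasons(r) for r in records if review_reasons(r)}
for record_id, reasons in flagged.items():
    print(record_id, "; ".join(reasons))
```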