The Marshall Project “Ask a Question”

The Marshall Project is answering questions from the community about Cuyahoga County’s criminal courts, and sharing what we have gathered from the public docket of felony cases.

The Marshall Project spent months analyzing court records and voting patterns to understand who chooses county judges and who experiences the consequences.

The analysis is based on the period from 2016 through 2021; exceptions are noted in specific articles.

Court records

The Marshall Project scraped criminal case records from the Cuyahoga County Clerk of Courts’ Search Selection and Entry site, which provides web-based access to basic information about criminal cases. It includes defendant information like race and home address, along with case dockets that include descriptions of events like sentencing and links to original PDF filing documents. The scraper ran on and off from May 2021 through January 2022. The Marshall Project will never publish personally identifying information from this data.

This data was loaded into a PostGIS database. Defendants’ home addresses were geocoded with geocod.io and joined with geographies from other sources such as Cuyahoga Board of Elections precincts (to compare with election results), and U.S. Census places (to compare with population demographics). These spatial joins form the bulk of the analysis used in this piece.
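For readers curious what a spatial join involves, here is a minimal Python sketch, not the project's actual PostGIS queries. It assigns each geocoded point to the polygon that contains it using a standard ray-casting test, then counts points per area. The precinct names, square boundaries and coordinates are hypothetical, and points that fall exactly on a boundary are not treated specially.

```python
from collections import Counter


def point_in_polygon(lon, lat, polygon):
    """Ray-casting test: True if (lon, lat) falls inside the polygon.

    polygon is a list of (lon, lat) vertices; the ring is closed implicitly.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through the point matter.
        if (y1 > lat) != (y2 > lat):
            # Longitude at which this edge crosses that horizontal line.
            x_cross = x1 + (lat - y1) * (x2 - x1) / (y2 - y1)
            if lon < x_cross:
                inside = not inside
    return inside


def join_points_to_areas(points, areas):
    """Count points per area name; points in no area are counted under None."""
    counts = Counter()
    for lon, lat in points:
        match = next(
            (name for name, poly in areas.items()
             if point_in_polygon(lon, lat, poly)),
            None,
        )
        counts[match] += 1
    return counts


# Hypothetical precinct boundaries and geocoded addresses (not real data).
areas = {
    "Precinct A": [(0, 0), (2, 0), (2, 2), (0, 2)],
    "Precinct B": [(2, 0), (4, 0), (4, 2), (2, 2)],
}
points = [(1, 1), (0.5, 1.5), (3, 1), (5, 5)]
counts = join_points_to_areas(points, areas)
```

A spatial database like PostGIS performs the same kind of containment test with spatial indexes, which is what makes it practical across tens of thousands of addresses.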

The court provided us with a list of case numbers spanning 2016 to 2021. We used this list to audit our database of scraped cases and confirm that our scrape was a complete record of all cases seen by the court in this timeframe. More than 98% of our data matched the case numbers provided by the court. The few mismatches were cases that had either been sealed or, we believe, recently expunged. Since the details of expunged cases are no longer open to the public, the court could not confirm the status of certain cases that were missing from the court’s list of case numbers but captured by our scraper.
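An audit like this comes down to comparing two sets of case numbers. The Python sketch below is an illustration, not the project's code. It assumes the match rate is the share of scraped cases that also appear on the court's list, and the case-number format shown is hypothetical.

```python
def audit_case_numbers(scraped, official):
    """Compare scraped case numbers against the court-provided list."""
    scraped, official = set(scraped), set(official)
    matched = scraped & official
    return {
        # Assumption: the match rate is measured against the scraped cases.
        "share_of_scraped_matched": len(matched) / len(scraped) if scraped else 0.0,
        # Captured by the scraper but absent from the court's list
        # (for example, sealed or expunged cases).
        "only_in_scrape": sorted(scraped - official),
        # On the court's list but never captured by the scraper.
        "only_in_court_list": sorted(official - scraped),
    }


# Hypothetical case numbers (not real cases).
scraped = ["CR-16-000001", "CR-16-000002", "CR-17-000003", "CR-18-000004"]
official = ["CR-16-000001", "CR-16-000002", "CR-17-000003", "CR-19-000005"]
report = audit_case_numbers(scraped, official)
```

Listing the mismatches in both directions separates cases the scraper missed from cases that were later removed from the public record.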

To calculate the disparity in outcomes for common charges like theft and drug possession, we used several techniques. One approach used a natural language classifier to determine the outcome of each case. Another used a simple flag indicating whether a case ended with the defendant being sent to prison, and applied more restrictive criteria, considering only cases with a single count of the charge. A third approach employed a dataset of cases from 2009 to 2019, obtained and processed by Lawstata, that includes a count of defendants’ prior cases. Using this dataset, we applied similar criteria and filtered on a maximum number of priors, looking at scenarios where the defendant had no prior cases, as well as scenarios with a maximum of one and two prior cases. All techniques show similar variation between judges.
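The second and third techniques can be illustrated with a short Python sketch. It is not the project's code: the field names (judge, charges, priors, prison) and the sample cases are hypothetical, and "a single count of the charge" is read here to mean that the charge is the only count in the case.

```python
from collections import defaultdict


def prison_rate_by_judge(cases, charge, max_priors=None):
    """Share of qualifying cases per judge that ended in a prison sentence.

    A case qualifies if the given charge is its only count and, when
    max_priors is set, the defendant has no more than that many prior cases.
    """
    totals = defaultdict(lambda: [0, 0])  # judge -> [prison count, case count]
    for case in cases:
        if case["charges"] != [charge]:
            continue  # skip cases with other or additional counts
        if max_priors is not None and case["priors"] > max_priors:
            continue  # skip defendants with too many prior cases
        tally = totals[case["judge"]]
        tally[0] += int(case["prison"])
        tally[1] += 1
    return {judge: sent / n for judge, (sent, n) in totals.items()}


# Hypothetical cases (not real data).
cases = [
    {"judge": "Judge X", "charges": ["theft"], "priors": 0, "prison": True},
    {"judge": "Judge X", "charges": ["theft"], "priors": 2, "prison": True},
    {"judge": "Judge X", "charges": ["theft", "drug possession"], "priors": 0, "prison": True},
    {"judge": "Judge Y", "charges": ["theft"], "priors": 0, "prison": False},
    {"judge": "Judge Y", "charges": ["theft"], "priors": 1, "prison": True},
]
all_rates = prison_rate_by_judge(cases, "theft")
no_priors = prison_rate_by_judge(cases, "theft", max_priors=0)
```

Running the same function with max_priors set to 0, 1 and 2 mirrors the scenarios described above and shows whether differences between judges persist once defendants' histories are comparable.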

This type of transparency alone can’t change the harms experienced by people in the system. We hope it will spark conversation and deeper understanding of how the court operates.

We’ll be adding more information, so check back or sign up for an alert when we answer more questions. Have a question you’d like us to tackle?

Click here to ask us. If we can answer it, we will.

We do not include the personal information of people who faced charges. To look up a specific case, use the clerk of the court’s website. Check out our download page to get some of the raw data we used to answer the questions below.