Understanding the aggregate behavior of the courts is one of the most difficult tasks to accomplish at the federal level because of the paywall on court records. As we’ve found time and time again in our interviews with users (a huge thank you to all of our interviewees by the way!), simply being able to search across all of the 94 federal districts and count the number of cases would be a gigantic step forward.
The Administrative Office of the United States Courts (AO) publishes yearly statistics on the number of civil and criminal filings for all of the courts. Of course, this is a very clean representation of what is contained within PACER in comparison to what is retrieved with the search function.
In order to get a grasp on how many cases were filed across all of the district courts in 2016 and the variation in volume, we searched every single district’s PACER Case Management/Electronic Court Filing system for cases that were filed between January 1st and December 31st. For those who are curious, executing the search to identify a year’s worth of case numbers across every district ends up costing just under $2,000, as long as nothing goes wrong with PACER and no page has to reload (which leads to a double charge).
The FJC statistics cover April 1, 2016 to March 31, 2017 while we chose to download all cases from January 1 to December 31, 2016, so this is not a strict comparison. However, the 3-month difference in starting date does allow us to begin investigating how variable the volume of filings may be and how reliably users can ‘count’ and generalize search results from PACER.
Which court has the most filings?
Simple questions are great to start with – allowing us to see how much variation there might be and, more importantly, examine our own assumptions. If you are a non-lawyer (like myself) you may expect to see a greater volume of cases in populous court districts, like the Southern District of New York, or in well-known districts, such as the Eastern District of Texas which is frequently reported on because of the number of patent infringement cases it has heard historically.
To examine the total volume in a year we count every civil case (cases filed with a ‘cv’ designation) and criminal defendant (cases filed with a ‘cr’ designation).
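As a rough illustration of that counting rule, here is a minimal Python sketch. It assumes case numbers shaped like "2:16-cv-01234" (an assumption about the format, not something taken from our actual pipeline), and the function name and sample list are hypothetical. Note that it counts case numbers only; counting criminal defendants would require one row per defendant in the input.

```python
import re
from collections import Counter

# Assumed case-number shape, e.g. "2:16-cv-01234"; real search results may differ.
CASE_TYPE = re.compile(r"\d{2}-(cv|cr)-\d+", re.IGNORECASE)

def count_by_type(case_numbers):
    """Count civil ('cv') and criminal ('cr') entries, ignoring other designations."""
    counts = Counter()
    for number in case_numbers:
        match = CASE_TYPE.search(number)
        if match:
            counts[match.group(1).lower()] += 1
    return counts

# Made-up sample, for illustration only.
sample = ["2:16-cv-00001", "2:16-cr-00002", "2:16-cv-00003", "2:16-mc-00004"]
counts = count_by_type(sample)
print(counts["cv"], counts["cr"])
```

Anything without a ‘cv’ or ‘cr’ designation (the miscellaneous ‘mc’ entry in the sample, for instance) is simply left out of the totals.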

In 2016 it turns out that the ‘Big Easy’ nickname of New Orleans doesn’t carry over to the workload of the Eastern District of Louisiana (LAED), which is solidly first in both the FJC statistics and query data. Notably, the difference in case volume between the two data sources in LAED is very small, less than a percent, despite being offset by three months.
For those of us without intimate knowledge of the Eastern District of Louisiana, it may come as a bit of a surprise that it ranks solidly first. The Law360 report on how LAED is struggling with its caseload because of unfilled vacancies becomes even easier to understand after seeing the volume of cases. It makes one wonder how 14 appointed and 5 magistrate judges (current count of active judges) could possibly make it through over 15,000 cases in a year.
After Louisiana the results may be more expected for those of us that were thinking of population – the Central District of California (covering Los Angeles, Orange County, and San Bernardino among others) is second and the Southern District of Texas (covering Houston, Galveston, and Corpus Christi) is third.
Visually we see that there is variation between the datasets in the volume for individual courts. The largest differences are seen in Hawaii and the Eastern District of North Carolina (NCED). NCED is a bit difficult to notice given its mid-pack ranking, but we count 24 percent fewer cases using the search query, which would drop it down 9 rankings. Hawaii, on the other hand, is starkly apparent as an anomaly amongst low volume courts at the bottom of the ranking. Based on the search query we count about 65% more cases than are in the FJC data (1,575 versus 955 cases), which would increase the District of Hawaii’s ranking 15 places.
However, the differences are generally much smaller: 71 out of the 94 courts move two or fewer positions when comparing the two datasets. This is reflected in the aggregate, with the AO reporting 367,937 civil and criminal cases and our count totaling 361,353 from the queries, a 1.8% difference due to the difference in time and other unknown factors. To better understand and contextualize the variation that we do see it’s best to examine civil and criminal case filings separately.
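For anyone who wants to check the arithmetic, the sketch below shows one way to compute the percent difference and the change in rank between two sets of counts. The two totals and the Hawaii figures are the ones quoted in this post; the function names and the three-court toy dictionaries are invented for illustration and are not our actual analysis code.

```python
def percent_difference(query_count, reported_count):
    """Percent by which the query count differs from the reported count."""
    return (query_count - reported_count) / reported_count * 100

def rank_shifts(reported, query):
    """Positions gained (+) or lost (-) by each court when ranked by query counts."""
    def ranks(counts):
        ordered = sorted(counts, key=counts.get, reverse=True)
        return {court: i + 1 for i, court in enumerate(ordered)}
    reported_rank, query_rank = ranks(reported), ranks(query)
    return {court: reported_rank[court] - query_rank[court] for court in reported}

total = percent_difference(361_353, 367_937)   # all courts, query vs. reported
hawaii = percent_difference(1_575, 955)        # District of Hawaii
print(round(total, 1), round(hawaii, 1))       # -1.8 64.9

# Toy counts, invented only to show how rank shifts are read.
reported = {"A": 300, "B": 200, "C": 100}
query = {"A": 300, "B": 90, "C": 100}
shifts = rank_shifts(reported, query)
print(shifts)                                  # {'A': 0, 'B': -1, 'C': 1}
```

Read this way, the Hawaii query count is roughly 165% of the reported figure, which is the same as about 65% more cases.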
Civil case filings

Examining the civil case filings alone easily informs us what drives traffic in LAED – almost all of its filed cases are civil. Given the high volume, it’s likely a safe assumption that there is at least one extremely large multi-district litigation case that is in LAED in 2016. Otherwise, we see that the Central District of California is still second, but now the Southern District of West Virginia is third which breaks simplistic assumptions about population areas being used to guess at case volume.
The FJC data places the District of New Jersey fourth, but from the queries we would place it seventh, behind Northern Illinois, Southern New York, and Southern Florida. The largest rank difference between the two datasets is still the Eastern District of North Carolina, which drops 11 positions in rank using the query data due to a decrease of ~500 case filings. Overall this comparison is as stable as the total results, with 72 of the 94 districts moving two or fewer positions in rank, which makes sense given that civil filings make up the bulk of cases in the federal courts.
Criminal case filings

When we examine the criminal filings, we see a stark difference from the total filings graph. This is to be expected in some sense since criminal filings are lower volume, totaling about a third of civil filings, and it is reflected in the fact that only 45 of the 94 districts move two or fewer positions in rank.
In terms of criminal filings, we see that everything is bigger in Texas, with the Western and Southern Districts taking first and second. Which of the two comes first depends on the dataset, but in the context of the top courts this difference likely isn’t very surprising in a 3-month period. When we look at the rest of the top six courts (Arizona, New Mexico, the Southern District of California, and the Southern District of Florida), we can see a pattern that is known to many: a large number of criminal filings are related to immigration. These six courts all fall along the southern border of the United States where a large number of these cases would arise and the volume could be variable over time (explaining the toss-up between the Western and Southern Districts of Texas for first place).
In general, we would expect to see more visual variation given the smaller number of criminal case filings. A 3-month difference in time may also be more consequential when filings are tied to a large investigation that ‘breaks’ all at once (which may explain the differences seen in Maryland and the Eastern District of Virginia). These differences are difficult to understand without reading the docket reports, e.g. the District of Hawaii having over 800 defendant filings in the query data, while the FJC reports just 216.
We pulled the filing dates for the criminal cases in Hawaii and counted the number filed per month to see if there were any anomalous patterns in the time period that the FJC data doesn’t cover. However, when we look at the figure there doesn’t seem to be anything odd about the number of filings in January, February or March that would explain the discrepancy. Further, even if we exclude those three months we still count 618 criminal defendants from April to December, which is nearly triple the number reported in the FJC data. It’s clear that information from the docket will be necessary to understand where this difference comes from.
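The monthly tally described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the sample dates are made up, and it assumes every filing date falls within a single calendar year, as in our 2016 pull.

```python
from collections import Counter
from datetime import date

def filings_per_month(filing_dates):
    """Return a 12-item list of counts, January first."""
    counts = Counter(d.month for d in filing_dates)
    return [counts.get(month, 0) for month in range(1, 13)]

def count_from_month(filing_dates, first_month):
    """Count filings on or after the given month (4 = April)."""
    return sum(1 for d in filing_dates if d.month >= first_month)

# Made-up dates, for illustration only.
sample = [
    date(2016, 1, 15), date(2016, 2, 3),
    date(2016, 4, 20), date(2016, 4, 21), date(2016, 12, 30),
]
monthly = filings_per_month(sample)
from_april = count_from_month(sample, 4)
print(monthly)      # [1, 1, 0, 2, 0, 0, 0, 0, 0, 0, 0, 1]
print(from_april)   # 3
```

Dropping January through March is then just a matter of calling `count_from_month` with 4, which is how a figure like the April to December count can be separated from the full-year total.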
Counting on PACER?
Overall we find a pretty strong agreement between the number of cases that we can count in the query results and the number reported by the FJC despite the 3-month difference in the starting date. While that should have been a given from the start, it still feels impressive to reach that result after how much we all talk about the deficiencies in PACER.
Most variations are small, especially in civil filings, and it feels relatively safe to assume that much of this variation is due to the 3-month difference, which suggests that running a search on PACER and counting the number of returned cases should match up closely with what the AO would provide (assuming they ever started providing a service to tell you the number of cases involving ‘Nike’ or ‘beekeepers’). The single biggest point against that claim would be what we observe in the District of Hawaii. Unfortunately, we’re going to have to spend somewhere between 80 and 2,400 dollars to clarify what’s happening in that district in 2016—making the largest deficiency of PACER abundantly clear.
