One of the persistent complaints we heard in our interviews over the last year concerned data access: namely, that the current model is clunky and expensive. These two factors contribute heavily to the next insight from our interviews: users want to inspect every individual case, even when their question is about how the courts operate systematically. The reason for this behavior is rather basic. Put simply, no one is entirely sure what PACER contains, or whether their idea of how to search for cases of interest will actually capture all of the relevant cases.
For those without an institution that pays for a license to Westlaw or Bloomberg, this issue is exacerbated because search results are one of the categories exempt from the maximum fee cap. That means that no matter how many ‘pages’ of results are returned, whether it’s 1 or 1,000, you pay 10 cents for each one. Thus the ominous warning on every PACER search portal page.
When you don’t know what kind of information is on the other side, the current cost structure discourages the exploration needed to confirm that all pertinent cases are identified and that your search query is as encompassing as possible.
Finding all cases of a certain type may not seem like a real issue: a person simply enters the correct case type and a plaintiff or defendant name as filters, and PACER returns all the corresponding cases. But it really is an issue because of the nature of what PACER is.
PACER is an operational system: it processes and stores the daily transactions that occur in a case, with a number of players generating those transactions. For those of us who are not litigators, it can be easy to forget the system that sits behind PACER, Case Management/Electronic Case Filing (CM/ECF). The system is built to facilitate the daily functioning of the courts: it’s where individual plaintiffs and lawyers file lawsuits, motions, and briefs, and where judges and court clerks record events for the case.
With that many cooks in the kitchen, it’s easy to understand how variation creeps in and makes it difficult to be certain that a query identifies all relevant cases. One of the easiest ways to see this is to look for all cases filed against a corporation. If you were curious about cases filed against IBM, do you search for IBM? Or International Business Machines? Or International Business Machines, Inc.? Simple differences of opinion about what to write or how to use PACER as a tool can lead to large variation when we look across all 94 districts. This variation isn’t confined to differences in how something is written, though; it also affects basic aspects of using PACER as a consumer of data. Something that seems simple, like searching for all filed search warrants, is actually an arduous task because of this variation.
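To make the party-name problem concrete, here is a minimal sketch of how the three IBM spellings above could be collapsed onto one label before searching or counting. The alias table, the suffix list, and the function names are all assumptions made for illustration; they are not part of PACER or of any existing tool.

```python
import re

# Hypothetical alias table: maps a normalized spelling to one canonical label.
ALIASES = {
    "ibm": "IBM",
    "international business machines": "IBM",
}

# Corporate suffixes to drop before comparing names (illustrative, not exhaustive).
SUFFIXES = {"inc", "corp", "corporation", "co", "llc", "ltd"}


def normalize_party(name: str) -> str:
    """Lowercase, replace punctuation with spaces, and drop trailing corporate suffixes."""
    tokens = re.sub(r"[^a-z0-9\s]", " ", name.lower()).split()
    while tokens and tokens[-1] in SUFFIXES:
        tokens.pop()
    return " ".join(tokens)


def canonical_party(name: str) -> str:
    """Return the canonical label for a known alias, else the normalized name."""
    key = normalize_party(name)
    return ALIASES.get(key, key)


variants = [
    "IBM",
    "International Business Machines",
    "International Business Machines, Inc.",
]
labels = {canonical_party(v) for v in variants}
```

With this table all three spellings map to the same label, but only because each one was anticipated in advance. A spelling nobody thought to list falls through unchanged, which is exactly the uncertainty that pushes users to inspect every case.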
As one example of this variation, we can look at which case designations are filed where. When I talk about cases on PACER with my colleagues, we’re typically talking about civil or criminal cases, which carry a designation of ‘cv’ or ‘cr’ in the case number. In our last post we focused on the volume of cv and cr case filings and how it compared to what was reported in the IDB. But there are more types of filings: mj for magistrate cases, mc for miscellaneous cases, and so on.
When we examine the distribution of case designations filed in 2016, we see a vast world of case type designations that we rarely discuss!

The most frequent case designations are what we would generally expect: civil (cv), magistrate (mj), criminal (cr), petty offense (po), and miscellaneous (mc). But as usage becomes less frequent, the observed case designations become more unusual and surprising. Put simply, what we are seeing is differing operational usage of PACER itself.
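A tally like this can be sketched by pulling the designation out of each case number and counting. The pattern below assumes case numbers shaped like the ones quoted in this post (office, two-digit year, designation, sequence number, optional judge initials). The regular expression and the function name are illustrative assumptions, and all but two of the sample case numbers are made up.

```python
import re
from collections import Counter
from typing import Optional

# Assumed shape: <office>:<yy>-<designation>-<sequence>[-<judge initials>]
CASE_NUMBER = re.compile(
    r"^(?P<office>\d+):(?P<year>\d{2})-(?P<kind>[A-Za-z]+)-(?P<seq>\d+)"
)


def designation(case_number: str) -> Optional[str]:
    """Return the designation (e.g. 'cv'), or None if the number does not parse."""
    match = CASE_NUMBER.match(case_number.strip())
    return match.group("kind") if match else None


sample = [
    "2:16-av-55555",     # quoted in this post
    "5:16-dj-00005-SL",  # quoted in this post
    "1:16-cv-00012",     # made up for illustration
    "1:16-cv-00345",     # made up for illustration
    "3:16-cr-00007",     # made up for illustration
]

# Case is preserved on purpose so that capitalized designations stay distinct.
counts = Counter(d for d in map(designation, sample) if d is not None)
```

Keeping the original capitalization matters here, because lowercasing everything would hide designations that differ from the usual ones only by case.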
If you pay attention to the case designations at the bottom of the graph, you will see a number of case types that differ from the others: they are capitalized (e.g., ‘AM’, ‘AL’). Every single capitalized case type originates from one court, the Eastern District of Pennsylvania. All 72 of those cases are filings in response to multi-district litigation involving a drug, and the case type is actually an abbreviation of the drug name itself! So ‘AL’ cases are in response to the ‘Albuterol Cases’ and ‘LD’ cases are in response to the ‘Lidocaine Cases’.
While those case types may be highly unusual and unexpected, the Eastern District of Pennsylvania isn’t alone in constructing them. If we examine 2:16-av-55555 in the District of South Carolina, we find that it is nominally the case of “Plaintiff v. Defendant”, but it is really a docket for a case against the Charlotte Division of Pfizer, Inc., related to MDL 2502 concerning Lipitor, the brand name of atorvastatin calcium.
However, not all of these rare cases are related to multi-district litigation. If we look at 5:16-dj-00005-SL in the Northern District of Ohio, we find that the case is “United States of America v. DNA in the form of saliva from [omitted]”, a search warrant. The dj designation fits neither the nature of the case nor the designations commonly used for such cases, leaving its construction a mystery.
Unfortunately, search warrants are a rather difficult type of case to find systematically. The ‘sw’ designation is the 7th most used, with over 1,000 cases filed in 2016. Surprisingly (or maybe not, given our case in Northern Ohio), ‘sw’ cases occur in only 5 of the 94 districts. Does this mean that no federal search warrants were filed in the other 89 district courts? Highly unlikely. What it points to is the varying usage of the PACER system across the 94 districts.
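A check like the one behind the “5 of the 94 districts” figure can be sketched as follows: group filings by designation, then count how many distinct districts use each one. The function names and the records are hypothetical placeholders (three stand-in districts rather than real data), so the numbers it produces say nothing about the actual courts.

```python
from collections import defaultdict


def districts_by_designation(filings):
    """Map each designation to the set of districts where it appears."""
    seen = defaultdict(set)
    for district, kind in filings:
        seen[kind].add(district)
    return dict(seen)


def coverage(filings, kind, total_districts=94):
    """Return (districts that use `kind`, districts that never use it)."""
    used = len(districts_by_designation(filings).get(kind, set()))
    return used, total_districts - used


# Made-up (district, designation) pairs for illustration only.
filings = [
    ("district_a", "sw"),
    ("district_a", "sw"),
    ("district_b", "sw"),
    ("district_b", "mj"),
    ("district_c", "mj"),
    ("district_c", "cv"),
]

used, unused = coverage(filings, "sw", total_districts=3)
```

A designation with many filings but few districts, as with ‘sw’, is a hint that the same kind of case is being filed under other designations elsewhere.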
Of course, some of this variation in the volume of filings is natural and expected, not because of how PACER is used but because of each court’s jurisdiction. Petty offenses are more common in Arizona, Maryland, Southern Texas, Eastern Virginia, and Western Washington, and are rarely filed in most other districts. What the districts with frequent petty offense filings have in common is large areas of federal land, where something like speeding through a national park or being arrested at a federal building could trigger a petty offense charge. We would, of course, need to assess these cases systematically to determine whether that is the real cause.

Dark matter everywhere
The variety of data across district courts in PACER, combined with its fee structure and limited search interface, is a problem for all users. The reality is that there is more variety in how the system is used across the courts than most people would ever expect, stemming in no small part from the fact that PACER is really just a software tool that lawyers, judges, and clerks have to use in the course of their daily duties. However, this lack of standardization puts a dramatic burden on users who want to answer empirical questions, making it extremely costly and difficult to ensure that all cases of a certain type are identified and included in an analysis.
With the SCALES OKN we’re implementing features to make it easier to find the cases of interest. Some fixes are easy. For example, products that compete with PACER implement keyword search, which allows searching for individual words or phrases in the contents of the cases. Other fixes are harder, and that’s where we’re concentrating our efforts. A key part of our development work right now is identifying when litigation events occur (a motion to proceed in forma pauperis is filed, a case is transferred or consolidated in multi-district litigation, etc.) and their corresponding outcomes. Codifying these events and exposing them as search filters, along with our work to resolve lawyers, law firms, and corporations, will enable more advanced identification of cases of interest and a quicker path from questions to answers. Hopefully it also brings a near-term future where we can be confident in answers without inspecting every single docket, because we have an idea of what’s in PACER.
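As a rough illustration of what codifying litigation events could look like, the sketch below tags docket entry text with event labels using a few keyword patterns. It is not the SCALES implementation: the labels, the patterns, and the sample entries are all assumptions, and real docket language varies far more than these patterns allow for.

```python
import re

# Hypothetical event patterns, keyed by an illustrative label.
EVENT_PATTERNS = {
    "ifp_motion": re.compile(r"\bmotion\b.*\bin forma pauperis\b", re.IGNORECASE),
    "transfer": re.compile(r"\btransferred\b", re.IGNORECASE),
    "mdl_consolidation": re.compile(
        r"\bconsolidated\b.*\b(mdl|multi-?district)\b", re.IGNORECASE
    ),
}


def tag_events(entry_text):
    """Return the sorted labels of every pattern that matches a docket entry."""
    return sorted(
        label
        for label, pattern in EVENT_PATTERNS.items()
        if pattern.search(entry_text)
    )


# Made-up docket entries for illustration only.
entries = [
    "MOTION for Leave to Proceed in forma pauperis by Plaintiff",
    "Case transferred in from another district",
    "ORDER setting scheduling conference",
]
tagged = [tag_events(entry) for entry in entries]
```

Once entries carry labels like these, a search filter can ask for cases with a given event instead of relying on each court’s choice of designation or wording.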
