The new parser tool is designed to work in tandem with our existing public codebase, but it can operate on any downloaded PACER docket, whether acquired with the SCALES scraper tool or saved directly from your search results. After loading the raw HTML, the parser breaks the docket into its component parts and pulls out names, dates, and other useful data points. We hope that by automating many of the rote tasks involved in cleaning PACER data, we will help researchers improve the usefulness and uniformity of their datasets while cutting down on the time required to compile them. (You can read a full outline of the fields generated by the parser here.)
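To give a rough feel for what "breaking a docket into its component parts" involves, here is a minimal sketch using only the Python standard library. It is not the SCALES parser and does not use its API: the sample HTML, the `RowCollector` class, the `parse_docket_entries` function, and the output field names are all hypothetical, and real PACER pages are considerably messier than this three-column table.

```python
from datetime import datetime
from html.parser import HTMLParser

# Hypothetical, heavily simplified stand-in for a saved docket page.
SAMPLE_HTML = """
<table>
<tr><td>01/15/2020</td><td>1</td><td>COMPLAINT filed by Jane Doe.</td></tr>
<tr><td>02/03/2020</td><td>2</td><td>ANSWER to Complaint by Acme Corp.</td></tr>
</table>
"""


class RowCollector(HTMLParser):
    """Collect the text of each <td> cell, grouped by <tr> row."""

    def __init__(self):
        super().__init__()
        self.rows = []      # finished rows, each a list of cell strings
        self._row = None    # cells of the row currently being read
        self._cell = None   # text fragments of the cell currently being read

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag == "td" and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag == "td" and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            if self._row:
                self.rows.append(self._row)
            self._row = None
            self._cell = None


def parse_docket_entries(html):
    """Turn table rows into dicts (field names here are illustrative only)."""
    collector = RowCollector()
    collector.feed(html)
    collector.close()

    entries = []
    for row in collector.rows:
        if len(row) != 3:
            continue  # skip rows that do not match the assumed layout
        date_text, number, text = row
        try:
            filed = datetime.strptime(date_text, "%m/%d/%Y").date()
        except ValueError:
            continue  # skip header rows or malformed dates
        entries.append(
            {"date_filed": filed.isoformat(), "entry_number": number, "text": text}
        )
    return entries


entries = parse_docket_entries(SAMPLE_HTML)
for entry in entries:
    print(entry["date_filed"], entry["entry_number"], entry["text"])
```

Normalizing dates to ISO format at parse time, as this sketch does, is one simple way to get the kind of uniformity across datasets described above.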
The parsed dockets can also be integrated into any data pipeline and put to work for statistical analyses, data visualizations, large-scale aggregation, and countless other use cases. To get started with the SCALES software and explore how it can improve your workflow, check out the tutorial included at the top level of the `PACER-tools` repository and the more detailed documentation in each subfolder. And if you’re curious about the work we do at SCALES, or have ideas for improving these tools, drop us a line at firstname.lastname@example.org or open an issue on the git repository.