Corpus Linguistics in the Sixth Circuit and Beyond

Corpus linguistics has been in the news lately, which gives us a chance to discuss this interesting tool of statutory interpretation and, in the process, revisit some Sixth Circuit views about it.

What is corpus linguistics? We will let Circuit Judge Amul Thapar explain:

[Corpus linguistics] draws on the common knowledge of the lay person by showing us the ordinary uses of words in our common language. How does it work? Corpus linguistics allows lawyers to use a searchable database to find specific examples of how a word was used at any given time. . . These databases, available mostly online, contain millions of examples of everyday word usage (taken from spoken words, works of fiction, magazines, newspapers, and academic works). . . Lawyers can search these databases for the ordinary meaning of statutory language . . . The corresponding search results will yield a broader and more empirically-based understanding of the ordinary meaning of a word or phrase by giving us different situations in which the word or phrase was used across a wide variety of common usages. . . In short, corpus linguistics is a powerful tool for discerning how the public would have understood a statute’s text at the time it was enacted.

Wilson v. Safelite Grp., Inc., 930 F.3d 429, 440 (6th Cir. 2019) (Thapar, J., concurring) (internal citations omitted).
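For readers curious about what such a search looks like mechanically, here is a minimal sketch in Python. It is not how any court or any real corpus tool works. The three sentences in `TOY_CORPUS` are invented for illustration, and the `concordance` function, its parameters, and the word "vehicle" are our own assumptions. It shows only the basic idea of pulling every use of a word, with surrounding context, from texts dated within a chosen period.

```python
import re

# Hypothetical toy corpus of (year, sentence) pairs, invented for illustration.
# Real corpora contain millions of dated, sourced examples.
TOY_CORPUS = [
    (1868, "The vehicle was drawn by two horses along the turnpike."),
    (1870, "Language is the vehicle of thought."),
    (1925, "Every motor vehicle must carry a lamp after dark."),
]


def concordance(corpus, word, start_year, end_year, width=30):
    """Return (year, left context, keyword, right context) for each use of
    `word` in texts dated between start_year and end_year, inclusive."""
    pattern = re.compile(r"\b" + re.escape(word) + r"\b", re.IGNORECASE)
    hits = []
    for year, text in corpus:
        if not start_year <= year <= end_year:
            continue  # outside the period of interest
        for match in pattern.finditer(text):
            left = text[max(0, match.start() - width):match.start()]
            right = text[match.end():match.end() + width]
            hits.append((year, left, match.group(0), right))
    return hits


# Print each hit with the keyword bracketed so the surrounding usage is visible.
for year, left, keyword, right in concordance(TOY_CORPUS, "vehicle", 1860, 1880):
    print(f"{year}  {left:>30}[{keyword}]{right}")
```

Even in this toy sample, the two hits from the 1860s and 1870s show the same word used in a literal sense and a figurative one, which is the kind of variation a lawyer would then have to read and weigh.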

Corpus linguistics is a tool used to identify the original public meaning of words, which is no small thing when the outcome of a case often hinges on the meaning of a single word in a statute or the Constitution. And if you fail to address corpus linguistics in your brief, you may receive a letter from the court asking you to submit supplemental briefing to correct the omission.

Consider what happened a few days ago in the highly anticipated Ninth Circuit case of Jones v. Bonta, which involved the Second Amendment. Judges Nelson, Lee, and Stein “asked the parties to file supplemental briefing addressing in part the applicability of corpus linguistics to [the] case.” No. 20-56174, 2022 U.S. App. LEXIS 12657, at *16 n.6 (9th Cir. May 11, 2022). Similarly, a Sixth Circuit panel consisting of Judges Thapar and Siler and Eastern District of Kentucky Judge Hood “asked the parties to file supplemental briefs on the original meaning of Article III’s case-or-controversy requirement, specifically whether the corpus of Founding-era American English helped illuminate that meaning.” Wright v. Spaulding, 939 F.3d 695, 700 n.1 (6th Cir. 2019). Corpus linguistics did not control the outcome in either Jones v. Bonta or Wright v. Spaulding. But both cases show the appetite among judges for bringing new tools to bear when discerning the original public meaning of the Constitution or a statute.

Regardless of prior briefing, judges at the district or circuit level may use corpus linguistics to rule for or against you. Two cases illustrate the point. See United States v. Woodson, 960 F.3d 852 (6th Cir. 2020); Health Freedom Def. Fund, Inc. v. Biden, No. 8:21-cv-1693-KKM-AEP (M.D. Fla. Apr. 18, 2022). These examples underscore corpus linguistics’ utility in ascertaining the meaning of statutes.

In Woodson, the defendant and his accomplices robbed over a dozen diamond stores across multiple states. At sentencing, the district court determined that the defendant’s sentence should be enhanced because the defendant “relocated, or participated in relocating, a fraudulent scheme to another jurisdiction to evade law enforcement or regulatory officials.”

Judge Readler, writing for the majority, helpfully broke down the provision into four elements that trigger the enhancement: “(1) relocation or participation in relocation, (2) of a fraudulent scheme, (3) to another jurisdiction, (4) to evade law enforcement or regulatory officials.” In dispute was the district court’s reading of the first two elements. The defendant argued that because the scheme’s “home base” or “hub” remained Toledo, Ohio throughout the robberies, he never relocated the scheme. The district court ruled otherwise, holding that the defendant “had purposefully targeted stores” across multiple states “to impede communication between law enforcement, triggering the relocation enhancement.”

The panel disagreed with Woodson’s interpretation of “a scheme” as something tangible, such as a “hub.” Judge Readler, referencing numerous dictionaries, instead found that a scheme is something intangible, such as a plot or plan. Thus, the district court was correct that purposefully targeting diamond stores in multiple states was sufficient to satisfy the first two elements. Not stopping there, Judge Readler conducted a corpus linguistics analysis of the term “scheme” and found that it agreed with the dictionaries cited. Corpus linguistics extinguished any remaining doubt.

In Health Freedom Def. Fund, Inc., Middle District of Florida Judge Kathryn Kimball Mizelle used corpus linguistics to fortify her opinion in one of the most publicized cases so far this year. The Health Freedom Defense Fund challenged the CDC’s imposition of a mask mandate on all air travel in the U.S. pursuant to the Public Health Service Act of 1944 (“PHSA”).

The PHSA authorized the CDC to “make and enforce such regulations” necessary to “prevent the introduction, transmission, or spread of disease” through “fumigation, disinfection, sanitation” and other actions. The government argued that the airline mask mandate was a sanitation measure. Judge Mizelle found that, at the time the PHSA was passed, sanitation had one of two meanings: (1) “measures that clean something or that remove filth” or (2) “measures that keep something clean.” If sanitation carried the former meaning, the mandate failed because masks do not actively clean anything. If it carried the latter, the mandate should be upheld because masks keep the air free of COVID-19.

Judge Mizelle relied upon “all the traditional tools of statutory interpretation” to settle the question. First she looked to context. “Sanitation” was accompanied by active (not preventative) words such as fumigation and disinfection. This favored the first meaning. The structure and history of the statute were also in harmony with the first definition.

Moreover, by using corpus linguistics, Judge Mizelle determined that “customary usage at the time agree[d]” with her findings. She searched for uses of “sanitation” in the relevant corpus during the relevant time period and found that the most frequent use of the word was in the context of “a positive act to make a thing or place clean,” whereas only five percent of the uses were “of sanitation as a measure to maintain a status of cleanliness, or as a barrier to keep something clean.” Accordingly, Judge Mizelle concluded that “sanitation” carried the first meaning and the mask mandate was unauthorized.
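The arithmetic behind a finding like "only five percent" is simple once each search result has been read and assigned a sense by a human reviewer. The short Python sketch below illustrates that tallying step only. The labels and counts in `coded_hits` are invented so that the numbers are easy to follow; they are not the data from the opinion, and the `sense_shares` function is our own hypothetical helper.

```python
from collections import Counter

# Invented labels for illustration only; NOT the data from the opinion.
# Each entry stands for one corpus hit that a reviewer read and coded by sense.
coded_hits = (
    ["active_cleaning"] * 18
    + ["maintaining_cleanliness"] * 1
    + ["unclear"] * 1
)


def sense_shares(labels):
    """Return each sense's share of all coded hits, most frequent first."""
    counts = Counter(labels)
    total = sum(counts.values())
    if total == 0:
        return {}  # nothing was coded, so there are no shares to report
    return {sense: count / total for sense, count in counts.most_common()}


shares = sense_shares(coded_hits)
for sense, share in shares.items():
    print(f"{sense}: {share:.0%}")
```

The hard part is not the division but the coding: deciding which sense each hit reflects is a judgment call, which is one reason the approach has drawn the criticism discussed next.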

Corpus linguistics is not a silver bullet. As Judge Thapar noted, “corpus linguistics is one tool—new to lawyers and continuing to develop—but not the whole toolbox. Its foremost value may come in those difficult cases where statutes split and dictionaries diverge.” Wilson v. Safelite Grp., Inc., 930 F.3d 429, 440 (6th Cir. 2019) (Thapar, J., concurring); see also id. at 445-48 (Judge Stranch, concurring in her own opinion to respond to Judge Thapar’s “endorsement of ‘corpus linguistics’” and noting in particular the practical issues of privileging “newsworthy connotations of a term” over “the ordinary meaning,” the difficulty of culling “irrelevant results,” and concerns with courts conducting such statistical analyses).

Yet litigators should familiarize themselves with corpus linguistics when construing the meaning of a statute or constitutional provision. You may be asked to brief the court on it. And it might end up being the tool that persuades the court to accept an interpretation of a statute or the Constitution in your client’s favor.