Introduction
In March, the United States Supreme Court heard a case involving the issue of whether a private arbitration panel in another country is covered by the statutory phrase “foreign or international tribunal.” The statutory language, enacted in 1964, authorizes a federal district court to order witness testimony or production of evidence “for use in a proceeding in a foreign or international tribunal” if the witness or holder of the material resides or is found in the district. The Respondent here seeks to invoke this statutory authorization to assist it in a private arbitration held in a foreign country.
Whether Respondent can so rely on this statute is no small matter. In the case, the Respondent, Luxshare, Ltd, plans to initiate private arbitration proceedings in Germany against Petitioner ZF Automotive US, Inc. The German arbitration arises out of a business dispute involving hundreds of millions of dollars in alleged damages, under a private agreement calling for private commercial arbitration overseen by arbitrators who are private citizens selected and paid for by the parties.
At its core, this dispute hinges on a linguistic question: what did the term foreign tribunal mean in 1964? Petitioners argue that a foreign tribunal only refers to entities imbued with government or quasi-government authority. Respondent takes a broader view, arguing that foreign tribunal refers to any entity in a foreign country that can enter a decision and bind parties, even if that entity is purely private. The parties devote large chunks of their briefs to the underlying linguistic question, looking to dictionaries and various legal materials to support their position. But the parties’ attempts to divine the meaning of foreign tribunal suffer from shortcomings common to legal interpretation. This article turns to a tool that avoids these shortcomings and provides a more rigorous, objective, and transparent answer to the question at hand. That tool? Corpus linguistics.
Increasingly, our courts (including the U.S. Supreme Court) have looked to corpus linguistics to better answer the linguistic questions that judges face in interpreting the words of the law. Understandably, judges use economic tools to tackle economic questions and historical tools to answer historical questions. Should they not use linguistic tools for linguistic questions? “[W]ords are . . . the material of which laws are made. Everything depends on our understanding of them.” We can and should use the right tools for seeking this understanding.
This article will proceed in four parts. Part I presents the linguistic debate as framed by the parties, highlighting shortcomings of the traditional tools they employ. Part II explains how the tools of corpus linguistics can address these shortcomings. And Part III presents a corpus linguistic analysis of the terms foreign tribunal and foreign tribunal(s). This approach, more rigorous than that undertaken by the parties, can provide data on the linguistic question that undergirds the legal issue—which reading of the statute is more probable than the other. After all, a “problem in [legal interpretation] can seriously bother courts only when there is a contest between probabilities of meaning.” Corpus linguistics can help with that contest.
I. Background
The parties frame the linguistic debate at issue here as a question of the ordinary meaning of the statutory terms. They thus point to various sources to support their preferred reading of the statute, including dictionaries, ordinary usage, and legal usage. Some of these tools are a good start. But they do not provide a sufficiently objective, transparent basis for resolving the contest between dueling senses of the statutory terms at issue because they do not fully answer the linguistic question, instead requiring linguistic intuition to fill in the gaps.
A. The Linguistic Debate at Issue Here
1. Dictionaries
Both the petitioners and the respondent turn to dictionaries contemporaneous to the statute’s enactment to proffer a definition that supports their litigating position. They frame their reliance on dictionaries as a quest for the ordinary meaning of the statutory language. For example, ZF Automotive cites four contemporaneous ordinary dictionaries and one contemporaneous legal dictionary for the meaning of tribunal. Respondent Luxshare likewise quotes two ordinary dictionaries and two legal dictionaries for tribunal, though strangely two of these dictionaries are of recent vintage—2019 and 1996—calling into question their utility. From these dictionaries emerge the following definitions. First, the narrower sense:
- “[t]he seat of a judge;” “the bench on which a judge and his associates sit for administering justice”
- “[t]he whole body of judges who compose a jurisdiction”
- “a court or forum of justice”; “[a] seat or court of justice”; “a judicial court”
- “a judicial assembly”
The 1969 edition of Ballentine’s Law Dictionary, which the parties did not cite, also defined tribunal as “[a] court. The seat or bench for the judge or judges of a court.”
Second, the broader sense:
- “[a] court of justice or other adjudicatory body”
- “a person or body of persons having to hear and decide disputes so as to bind the parties”
- “[a]nything having the power of determining or judging”
- a “person or body of persons having authority to hear and decide disputes so as to bind the disputants”
At least one other dictionary not cited by the parties, Funk & Wagnalls New Standard Dictionary of the English Language (published in 1960), included the narrow sense. Whether it also included the broad sense is unclear, because its illustrative example at first seems to reflect the broader sense but may actually refer to an international tribunal that has government authority. The entry reads: “1. A court of justice; any judicial body, as a board of arbitrators. 2. The seat set apart for judges, magistrates, etc.”
Thus, dictionaries reveal that, around 1964, there were at least two senses of tribunal. One sense, common to every dictionary we or the parties could find, legal or ordinary, was narrow in nature and referred mostly to courts. The other, found in two (maybe three) ordinary dictionaries (and two later legal dictionaries that we are not giving weight to, given their date of publication), was broad in nature and could cover private arbitration bodies. One could be tempted from this evidence to infer that the narrow sense was the more common of the two senses. But as described below, such an inference would be a mistake based merely on dictionary frequencies. Likewise, parties sometimes refer to a “lead legal definition[],” “primary definition[],” or “secondary definition.” As described below, such labels are mistaken when derived from dictionaries.
None of the dictionaries defined the actual statutory terms, leaving the parties to look up their constituent words in dictionaries. Thus, the parties also looked up the definition of foreign. “Putting these definitions together,” the petitioners argued that the statutory terms “most naturally refer[] to a court or other governmental adjudicative or quasi-adjudicative body convened to render justice.” Thus, the terms do “not encompass a private arbitral panel whose authority derives solely from the contractual agreement of private parties rather than any government, and which is not composed of government adjudicators.” Respondent never put the two terms together to create a definition for foreign tribunal, but rather used dictionaries to argue that private commercial arbitration panels in foreign countries satisfy both the definition for foreign and the definition for tribunal.
2. Ordinary Usage
The parties claim to look at “ordinary” usage to support their legal positions. Hence, in rejecting a definition of foreign that could mean just located in a foreign country and instead embracing a definition that means belonging to another country, ZF Automotive presented examples such as “foreign leader,” “foreign official,” “foreign flag,” “foreign law,” and “foreign country.” From this, the petitioners concluded that “[w]hen the word ‘foreign’ modifies a noun with potential governmental or sovereign connotations—like ‘tribunal’—it typically indicates that the noun belongs to the sovereign entity.” However, neither party actually presented any evidence of ordinary usage of the term foreign tribunal. And Luxshare’s evidence of ordinary meaning was “dictionaries [some being legal dictionaries], judicial opinions, and other legal sources.” Legal sources are not very good indicators of ordinary meaning.
3. Legal Usage
Finally, the parties turned to legal usage. Thus, petitioners looked at the use of the term foreign as a modifier in other portions of the 1964 Act, how Congress has both used the term tribunal and described private arbitration, and how federal courts and legal scholars have used the terms foreign tribunal and arbitral tribunal. Likewise, respondent turned to a recent (2021) legal treatise, recent caselaw (2004 & 1997), and German law in defining foreign. Then, it used federal judicial usage (both recent and contemporaneous to 1964), the same recent legal treatise, various arbitration bodies’ rules, the Geneva Treaties, and legal commentary and scholarship to support its reading of tribunal.
B. The Weakness with the Parties’ Evidence & Methodologies
1. The Limitations of Dictionaries
a. Non-compositionality
Dictionaries generally define single words, not multi-word terms or phrases. Thus, if relying on dictionaries, one has to slice and dice statutory text rather than looking up the whole operative phrase. But this is deeply problematic. That is because of the linguistic phenomenon of non-compositional expression, wherein “a particular word sequence should be considered a single lexical item.”
Normally, the principle of compositionality applies. Linguists define compositionality as when “[t]he meaning of a semantically complex expression is a compositional function of the meanings of its semantic constituents.” In other words, often what you see is what you get: cherry pie is a pie made from cherries.
But sometimes, “the combination of words has a meaning of its own that is not a reliable amalgamation of the components at all,” such as for good or at all. In short, a phrase may be more (or less) than the sum of its parts. Related to “non-compositionality” is the idiom principle: “a language user has available to him or her a large number of semi-preconstructed phrases that constitute single choices [in communication], even though they might appear to be analysable into segments.” Take, for example, of course or in fact. Looking up their constituent words separately will not tell one the idiomatic meaning of the combined phrase. Non-compositional expressions come in several varieties, such as phrasal idioms (pulling someone’s leg), cliches, grammatical idioms (by and large), and frozen metaphors (the ball’s in your court).
The Supreme Court has recognized this linguistic phenomenon, observing that “two words together may assume a more particular meaning than those words in isolation.” In fact, in a different area of law—trademark law—the Court has noted this principle for over a century, which has come to be known as the Anti-Dissection Rule. This same principle can and should be applied to statutory interpretation so that the meaning of a multi-word term or phrase should be “derived from it as a whole, not from its elements separated and considered in detail”—“it should be considered in its entirety.” Judge Frank Easterbrook perhaps put this most colorfully when he observed in a trademark case involving a church’s name:
[T]he World Church produced . . . nothing but a dictionary. It did not offer any evidence about how religious adherents use or understand the phrase as a unit. It offered only lexicographers’ definitions of the individual words. That won’t cut the mustard, because dictionaries reveal a range of historical meanings rather than how people use a particular phrase in contemporary culture. (Similarly, looking up the words “cut” and “mustard” would not reveal the meaning of the phrase we just used.)
Thus, looking up the words foreign and tribunal in dictionaries may not give us a complete and accurate meaning of foreign tribunal. Yet because the parties were heavily relying on dictionaries, that is exactly what they resorted to here. This same criticism can be levied at the parties for looking at the usage in legal materials of just the words foreign, international, and tribunal.
b. Dictionaries as “museums of words” and linguistic intuition
Relatedly, dictionaries are not always very useful for dealing with context. That is because dictionaries are just “museum[s] of words”—“historical records (as reliable as the judgment and industry of the editors) of the meanings with which words have in fact been used by writers of good repute.” Hence, dictionaries “are often useful in answering hard questions of whether, in an appropriate context, a particular meaning is linguistically permissible,” not what is linguistically probable in a given context.
Thus, when lawyers, scholars, or jurists countenance one dictionary definition over another as the ordinary meaning of a word or phrase, that tells us more about their linguistic intuition than the dictionary because it is that intuition that is the analytical bridge from dictionary evidence to the interpretive conclusion. After all, dictionaries do not indicate which sense of a word is the ordinary sense—that would depend on context. And besides a lack of transparency, that intuition has at least two pitfalls stemming from the fact that an individual’s linguistic intuition is informed by her exposure to language over her lifetime. The first limitation of linguistic intuition, at least for most lawyers, scholars, and judges, is that they are seldom representative of ordinary members of society, tending to hail from more elite social circles with much more education. These demographic factors influence the language to which they are exposed.
Second, even if an attorney, academic, or judge were just an ordinary person who ran in ordinary circles with an ordinary level and source of education, she is still a product of her time. And that time confines, even distorts, her ability to properly intuit meaning from a time during which she did not live. That is due to the reality of linguistic drift. If the English language were static, then statutes written in an earlier time would not pose challenges to a later person’s linguistic intuition. But English is not static. Over time, meanings can change, sometimes dramatically and quickly. Take the constitutional term domestic violence. From the 1770s through the 1970s, the term consistently meant insurrection, rebellion, or rioting within a state. But starting in the 1980s, that began to change, and by the 1990s, domestic violence almost always meant “violent or aggressive behavior within the home, esp[ecially] violent abuse of a partner.” The previous sense that dominated for two centuries has now almost completely fallen out of use. And that shift occurred within less than two decades. Thus, someone relying on her own linguistic intuition formed in a time after a statute was adopted may miss that linguistic drift had occurred and inaccurately understand a statutory word or phrase.
Yet, when the parties, namely their well-educated and arguably upper-class lawyers, propose ordinary usage terms, like “foreign leader” or “foreign flag,” they are relying on their linguistic intuition formed by language exposure long after the statute was enacted.
c. “Lexicographical prescriptivism”
In the 1960s, Webster’s Third New International Dictionary made a move deemed controversial in the world of lexicography: it decided to define words according to actual usage rather than proper usage. This move to descriptive definitions rather than normative ones was a break from the past as “[l]exicographical prescriptivism in the United States is exactly as old as the making of dictionaries, because of the role played by the dictionary in a society characterized by a great deal of linguistic insecurity.”
Normative, or prescriptive, dictionaries “establish[] what is right in meaning and pronunciation,” providing users with what the lexicographer deems the “proper” usage of each word. Therefore, “the prescriptive school of thought relie[d] heavily on the editors of dictionaries to define and publish the proper meaning and usage of the terms.” In contrast, “[t]he editors of a descriptive dictionary describe how a word is being used and, unlike their prescriptive counterparts, do not decide how a word should be used.” To the extent any dictionary is prescriptive, it is less useful for determining how people actually used language—and dictionaries before and during the 1960s, outside of Webster’s Third, tend to be of the prescriptive variety. And these are many of the very dictionaries relied on by the parties here.
d. Relying on dictionary sense-ordering
Dictionaries list senses in numerical order. This sometimes gives rise to what has been called the “sense-ranking fallacy.” That fallacy is to deem a sense listed before another as being more “primary.” Justice Breyer did this in Muscarello v. United States. In looking at the verb carry, Justice Breyer deemed one sense as “primary” and another as “special,” in part because he observed that the “primary” sense occurred first in three dictionaries, whereas the “special” sense was numerically ranked lower. This sense-ordering caused Justice Breyer to consider the sense listed sooner as more ordinary.
Such a conclusion is flawed because dictionaries do not claim that the ordering of senses is based on which are more common, frequent, or ordinary. Rather, senses are either ordered based on when they were deemed to have historically entered the lexicon, or they are admittedly “an arbitrary arrangement or rearrangement.” Thus, at least based on the order senses appear in dictionaries, there is no “primary,” “lead,” or “secondary” sense, as some of the parties argued here.
e. Sense frequency across dictionaries
Another common mistake is to deem a sense that occurs more often across multiple dictionaries as the more common, ordinary, or primary sense. This misses the fact that the very “‘system of separating senses’ is ‘only a lexical convenience.’” And dictionaries do not agree as to where to draw the line. That is because “[l]exicographers tend to fall into one of two categories when it comes to writing definitions: lumpers and splitters.” A lumper “tend[s] to write broad definitions that can cover several or more minor variations on that meaning.” By contrast, a splitter “tend[s] to write discrete definitions for each of those minor variations.”
Additionally, “[t]he history of English lexicography usually consists of a recital of successive and often successful acts of piracy.” This tendency, at least historically, for dictionaries to use the definitions of other dictionaries, “can create a false consensus whereby it looks like all of the dictionaries independently agree, and thus reflect contemporaneous linguistic reality, but in actuality only reflect the views . . . of a few dictionary makers.” To what extent lexicographical piracy was occurring in the 1950s and 60s is uncertain. Many of the dictionaries the parties cite here have identical or near identical definitions, though. At the very least, extreme caution is warranted in surmising anything from the frequency of senses when surveying multiple dictionaries.
2. Non-Systematic Usage Sampling
To overcome the limitations of dictionaries, one can sample actual usage of the complete term at issue. The parties do this, but not systematically and not in numbers large enough to inspire much confidence. Their examples of language usage risk the same defect that afflicts reliance on legislative history: looking out among the crowd and calling on one’s friends. Or, to put it more bluntly, cherry-picking examples that support one’s position. The parties present only a handful of samples of usage, and often they rely on the usage of just one of the words of the multi-word term. Much more is needed to have any confidence in the results. And the sampling must either be random (if there are sufficient examples to need to sample) or weighted towards the usage that is closest in time to the relevant date, which here is 1964. What is more, parties are prone to read the data in a way favorable to their position, even if only subconsciously through confirmation bias or motivated reasoning. Our methods below help overcome these shortcomings.
II. A Brief Introduction to Corpus Linguistics
Due to the above-noted limitations with traditional statutory interpretive methodology and tools, something better is needed. Corpus linguistics has the potential to be that something better—in the words of Law Professor Larry Solum, to “revolutionize statutory . . . interpretation.” In this sense, corpus linguistics is akin to a paradigm-shifting technology or tool like the Hubble Telescope. Certainly, astronomers could glimpse the heavens from earth before the Hubble was launched. But the increased clarity and scope the Hubble brought to astronomic inquiries was revolutionary. What is more, corpus analysis brings transparency—researchers, courts, and parties can access the corpus and perform the same searches to analyze the data for themselves.
While corpus linguistics and corpora may sound exotic, they are not. A language corpus is similar in some regards to a corpus (or body) of precedent. Moreover, corpora are used in the construction of most modern dictionaries. Corpus linguistics—a robust empirical methodology within the field of linguistics—provides a variety of methods for analyzing a corpus to answer legal interpretive questions.
A. The Purpose of Corpus Linguistics
Corpus linguistics is the empirical study of language using samples (or bodies) of texts called corpora (in the plural). A corpus is constructed in order to study a particular register (variety of texts associated with a situational context) or speech community (group of language users who share the same dialect or language norms). Corpus linguistics is premised on the idea that “the best way to find out about how language works is by analyzing real examples of language as it is actually used.” In studying naturally occurring language use, corpus linguistics can avoid the observer’s paradox—the phenomenon whereby people tend to change their behavior when they are aware they are being studied (i.e., the Hawthorne Effect).
Corpus linguistics is founded on two premises: (1) that a corpus of texts can be constructed to be sufficiently representative of a particular register or speech community, and (2) that one can “empirically describe patterns of language use through analysis of that corpus.” So corpus linguistics “depends on both quantitative and qualitative analy[sis].” And corpus linguistics results “in research findings that have much greater generalizability and validity than would otherwise be feasible.” Because “a key goal of corpus linguistics is to aim for replicability of results, researchers and data creators have an important duty to discharge in ensuring the data they produce is made available to analysts in the future.”
B. Corpora
A corpus can be made of any kind of naturally occurring texts. Common examples include collections of samples of newspaper articles, books, or legal documents. The utility of a corpus will depend on the degree to which it represents the target language domain of interest. Corpus representativeness depends on two key considerations: “what types of texts should be included in the corpus and how many texts are required.” What is true for computing is true for corpus linguistics: “garbage in, garbage out,” as corpus-based results can be no better than the corpus being used (and they can be worse if the corpus data is not properly analyzed). If a corpus does not adequately represent the texts used within the register or by the speech community one wants to make observations about, then other features of the corpus, such as its size, will make little difference. For example, a corpus composed of the transcripts of the television show Game of Thrones will not tell us much about language usage among early 20th century Ethiopian children, no matter how big the corpus is. The corpus must match and represent the register or group about which one wants to draw inferences. Otherwise, one cannot make generalizations about the larger register or speech community of interest. Hence, using Google for corpus linguistics research is arguably not very effective because the searchable web represents a wide range of registers and speech communities.
C. Corpus Linguistic Methods
There are a large number of linguistic methods that have been developed and applied to corpus data. We first introduce a selection of methods that have been used for legal interpretation. Then we briefly introduce several other methods that are used within the larger field of corpus linguistics. Perhaps the most basic method for quantitatively analyzing corpus data is frequency—measuring how often, for instance, a word is used over time or in different types of texts (i.e., registers or genres).
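To make the frequency method concrete, the following is a minimal sketch in Python, not a description of how any particular corpus interface computes its figures. It counts how often a phrase occurs in each decade and normalizes the count per million words so that decades of different sizes can be compared. The function name, the toy documents, and the simple whitespace tokenization are all assumptions made for illustration.

```python
import re
from collections import defaultdict

def per_million_by_decade(docs, phrase):
    """docs: iterable of (year, text). Returns {decade: hits per million words}."""
    pattern = re.compile(r"\b" + re.escape(phrase.lower()) + r"\b")
    hits = defaultdict(int)   # raw phrase counts per decade
    words = defaultdict(int)  # total word counts per decade
    for year, text in docs:
        decade = (year // 10) * 10
        lowered = text.lower()
        hits[decade] += len(pattern.findall(lowered))
        words[decade] += len(lowered.split())  # crude whitespace tokenization
    return {d: hits[d] / words[d] * 1_000_000 for d in sorted(words) if words[d]}

# Hypothetical toy documents; a real study would draw on a balanced corpus.
toy_docs = [
    (1958, "The foreign tribunal ruled on the claim."),
    (1962, "A tribunal met. The foreign tribunal adjourned."),
    (1967, "No court was mentioned here at all."),
]
freq = per_million_by_decade(toy_docs, "foreign tribunal")
```

Normalizing by corpus size matters because a raw count of hits says little when one decade contains far more text than another.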
Another corpus method commonly used in legal interpretive research is concordance line analysis. Concordance lines are excerpts from texts centered on a search term. They can be used for qualitative analysis or to obtain frequency data. In cases where there are many hits resulting from a corpus query, researchers can extract a random sample of concordance lines from the corpus.
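As an illustration only, concordance lines and random sampling can be sketched in a few lines of Python. The function names, the 40-character context width, and the toy texts are assumptions; established corpus tools offer richer options, such as sorting by the words to the left or right of the hit.

```python
import random
import re

def concordance(texts, term, width=40):
    """Return keyword-in-context lines as (left context, hit, right context)."""
    pattern = re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
    lines = []
    for text in texts:
        for m in pattern.finditer(text):
            left = text[max(0, m.start() - width):m.start()]
            right = text[m.end():m.end() + width]
            lines.append((left, m.group(0), right))
    return lines

def sample_lines(lines, k, seed=0):
    """Draw a reproducible random sample of k lines (all lines if fewer than k)."""
    rng = random.Random(seed)  # fixed seed so others can replicate the sample
    return list(lines) if len(lines) <= k else rng.sample(lines, k)

# Hypothetical toy texts for illustration.
toy_texts = [
    "The foreign tribunal sat in Hamburg.",
    "A Foreign Tribunal heard the claim, and another foreign tribunal declined.",
]
kwic = concordance(toy_texts, "foreign tribunal")
subset = sample_lines(kwic, k=2, seed=1)
```

Fixing the random seed is a design choice in service of replicability: anyone rerunning the same query on the same corpus draws the same sample.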
To get meaning out of the concordance lines often requires classifying (or “coding”) the search results. We recommend that researchers base concordance line coding on the best practices and principles of content analysis and survey methodologies. For instance, one could search for a particular word, then classify each result presented in a concordance line according to a particular sense of that word. Additionally, if greater context than one sentence is needed, one can expand the size of the text excerpt surrounding the search hit to account for more context. In this way, one could analyze the results to determine something a dictionary cannot usually convey: which sense is more common in a given context (i.e., the distribution of senses). This particular exercise, using concordance lines to classify senses, has proven to be an effective method for addressing questions regarding the meaning of words and phrases in legal texts. Further, the nature of the search results prevents one from cherry-picking examples. Of course, classifying senses involves a measure of subjectivity in considering the context to properly classify (or code) a sense. But as explained further below, we have taken measures to minimize this subjectivity.
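Once each concordance line has been coded, the distribution of senses is simple arithmetic. The sketch below is illustrative only: the labels "narrow," "broad," and "unclear" and the toy codes are hypothetical, and excluding unclear lines from the denominator is an assumption a researcher would need to disclose.

```python
from collections import Counter

def sense_distribution(codes, exclude=("unclear",)):
    """Share of each coded sense among the lines that could be classified."""
    counts = Counter(c for c in codes if c not in exclude)
    total = sum(counts.values())
    return {sense: n / total for sense, n in counts.items()} if total else {}

# Hypothetical codes assigned to five concordance lines.
toy_codes = ["narrow", "narrow", "broad", "unclear", "narrow"]
dist = sense_distribution(toy_codes)
```

Reporting the number of excluded lines alongside the percentages keeps the result transparent, since a high share of unclear lines would weaken any conclusion drawn from the rest.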
Another tool found in most corpora is collocation. Some words “co-locate” more frequently than other words. One can think of this phenomenon as “word neighbors.” These semantic patterns of word association can sometimes be intuitive: we expect dark to appear more often in the same semantic environment as night than with perfume. But sometimes the patterns are surprising. This linguistic phenomenon has long been implicitly recognized in the law in the canon of construction called noscitur a sociis: “it is known by its associates.” Linguists just put it a slightly different way: “[y]ou shall know a word by the company it keeps!”
By seeing which words are collocates of each other, we can sometimes get additional insight into how people understand those words. This can be done in a corpus by searching for a word and indicating (1) how many words to the left or right (or both) of the search term one wants to examine, and (2) which statistical measure (e.g., frequency, MI score, T score) will be used to measure the strength of association. In this way, researchers are able to estimate how common it is for words to co-occur in close proximity. We can also use collocate analysis to see how usage patterns change. For instance, one of us in an earlier paper noted that the top five collocates (in raw frequency) of the term domestic violence from 1760-1979 were (1) against, (2) state(s), (3) protect, (4) convened, and (5) invasion. This reflects the sense as used in the Constitution of a rebellion or insurrection within a state. But the top five collocates of domestic violence from 1980-2009 showed a radical shift: (1) women, (2) abuse(d), (3) honor, (4) national, and (5) victims. These collocates reflect the sense of violence against a member of one’s household.
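A collocate search of this kind can be sketched as follows. This is a simplified illustration, not the formula any particular corpus interface uses: the MI score here is computed as the base-2 logarithm of observed co-occurrences over the number expected by chance within the window, and corpus tools differ in how they define the expected value. The function names and the toy sentence are assumptions.

```python
import math
from collections import Counter

def collocates(tokens, node, span=4):
    """Count words appearing within `span` tokens to the left or right of `node`."""
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == node:
            lo, hi = max(0, i - span), min(len(tokens), i + span + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts

def mi_score(tokens, node, collocate, span=4):
    """Simplified mutual information: log2(observed / expected co-occurrences)."""
    freq = Counter(tokens)
    observed = collocates(tokens, node, span)[collocate]
    if observed == 0:
        return float("-inf")  # the two words never co-occur in the window
    # Expected co-occurrences if the words were distributed independently.
    expected = freq[node] * freq[collocate] * 2 * span / len(tokens)
    return math.log2(observed / expected)

# Hypothetical toy text echoing the dark/night/perfume example.
toy_tokens = ("the dark night fell and the dark night passed "
              "while perfume lingered in the night").split()
near_night = collocates(toy_tokens, "night", span=2)
```

Raw frequency tends to favor common function words such as the, while MI favors words that appear near the search term more often than chance would predict, which is why the choice of statistical measure should be reported.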
Besides analysis at the word or phrasal level, through a corpus search one can consider grammatical context by looking at a term or phrase in a specific syntactic structure (e.g., a noun modified by a particular adjective). For example, in a recent paper, one of us applied grammatical analysis of corpus data to determine whether language users use the term vehicle to refer to scooters. To do this, we identified 230 instances where scooter occurred in close proximity with vehicle, and then we classified each of these into one of three categories: (1) scooters are referred to as vehicles, (2) scooters are not referred to as vehicles, and (3) inconclusive. For each of these categories, we established a number of grammatical structures that clearly indicated the category. Based on this, we found that scooters are referred to as vehicles in 87% of the cases where the data is conclusive.
There are other methods in corpus linguistics that we have not discussed in this section. Among these are methods that have been used in previous legal scholarship (e.g., n-grams or lexical bundles), as well as many others (such as dispersion, keyword analysis, collostructional analysis, text type analysis, and multi-dimensional analysis) that could potentially be used to address legal interpretative questions as research at the intersection of corpus linguistics and legal interpretation continues to grow.
III. Corpus Linguistic Analysis
A. Selecting a Corpus
While the parties never pointed to an instance of the term foreign tribunal(s) being used in a source of ordinary American English, the parties did argue that the term should be understood according to its ordinary meaning. To look at this, we turned to the Corpus of Historical American English, or COHA (pronounced koh-uh). COHA “is the largest structured corpus of historical English.” It contains more than 475 million words from 115,000 texts ranging from the 1820s to the 2010s. It is balanced by genre within each decade, with texts from four types of genres (or registers): fiction, magazines, newspapers, and non-fiction. COHA is also “balanced across decades for sub-genres and domains as well (e.g., by Library of Congress classification for non-fiction; and by sub-genre for fiction—prose, poetry, drama, etc.)” Further, “[t]his balance across genres and sub-genres allows researchers to examine changes and be reasonably certain that the data reflects actual changes in the ‘real world,’ rather than just being artifacts of a changing genre balance.”
While claiming they were looking at ordinary meaning, the parties also looked at various legal sources: cases, statutes, and legal scholarship. For cases, we first turned to the Corpus of Supreme Court Opinions of the United States, which “includes all opinions in the United States Reports and opinions published by the Supreme Court through the 2017 term,” resulting in a corpus of about 98 million words and 62,000 texts. As there are no other corpora created for the remaining sources of legal documents the parties relied on, for federal cases we turned to Westlaw, for U.S. statutes we turned to HeinOnline’s U.S. Code, and for legal scholarship we turned to HeinOnline’s Core U.S. journals database.
B. Best Coding Practices
Given the subjective nature of coding (reading samples of language usage to try to classify that usage into a sense of a word or term) and the tendency of people to read evidence to confirm their pre-existing position or in light of their own biases, we implemented some best practices for the sense-coding portion of our analysis. We did this to pursue the twin pillars of good social science research: reliability and validity. Reliability, which could also be called replicability, is the ability of others to replicate the results. Validity is the accuracy of the results in measuring the phenomena claimed to be measured.
To achieve reliability and validity, we followed three practices. First, we used two coders, with both coders coding all of the material independently of each other. We did this so we could see the rate of agreement between the coders. A low rate could mean the material is too difficult to code or that one coder is providing an idiosyncratic view of the material. Having two coders with a high rate of agreement provides greater confidence that the results are accurate and that others will reach similar results. Second, at least one of the coders, if not both, was completely blind to what the authors thought the results would be, thus eliminating any thumbs on the scale, so to speak. If coders think a certain outcome is expected or more likely, they may lean that way in their coding, so having the coders “blind” to such information helps mitigate confirmation bias or motivated reasoning, increasing both validity and reliability. Third, we only coded one instance of a term in a document, coding the first. We did this because multiple uses of a term in the same document are likely to take on the same sense, thus biasing the overall numbers if they are counted as separate instances. Public opinion pollsters do something similar, randomly sampling households rather than individuals since the opinions of members of the same household are highly correlated.
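The first-instance sampling rule and the agreement rate described above can be illustrated with a minimal Python sketch. This is not the procedure or software actually used for the study; the function names, document identifiers, concordance lines, and sense labels below are all hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Tuple


def first_instance_per_document(hits: List[Tuple[str, str]]) -> Dict[str, str]:
    """Keep only the first hit from each document.

    `hits` is a list of (document_id, concordance_line) pairs, assumed to be
    in the order in which the term appears.
    """
    sampled: Dict[str, str] = {}
    for doc_id, line in hits:
        if doc_id not in sampled:  # later hits in the same document are skipped
            sampled[doc_id] = line
    return sampled


def percent_agreement(coder_a: List[str], coder_b: List[str]) -> float:
    """Share of items (in percent) to which both coders gave the same label."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("Both coders must label the same non-empty set of items.")
    matches = sum(a == b for a, b in zip(coder_a, coder_b))
    return 100.0 * matches / len(coder_a)


# Hypothetical search hits: doc1 contains the term twice, doc2 once.
hits = [
    ("doc1", "... before a foreign tribunal ..."),
    ("doc1", "... the foreign tribunal held ..."),
    ("doc2", "... decisions of foreign tribunals ..."),
]
sampled = first_instance_per_document(hits)

# Hypothetical sense labels assigned independently by two coders.
coder_a = ["government", "government", "unclear", "government"]
coder_b = ["government", "unclear", "unclear", "government"]
agreement = percent_agreement(coder_a, coder_b)

print(len(sampled), agreement)
```

Simple percent agreement is shown because it is the measure reported in this article; chance-corrected measures exist but are not used here.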
C. “Foreign Tribunal”
1. Corpus of Historical American English (COHA)
To determine what the term foreign tribunal, in both its singular and plural form, meant in “ordinary” American English, we turned to COHA. In the more than 298 million words found in the corpus through 1964 (the cut-off year for our search), the term only showed up seven times in six documents, and never again after 1895. At the very least, this means that the term foreign tribunal(s) is a rare one in “ordinary” American English, and this may mean that there is no ordinary meaning of the term and that it only has a legal meaning.
We only coded one of the instances in the document that had two for reasons noted above, resulting in six hits. These six instances of the term were each independently coded by two coders. Coders determined the sense of foreign tribunal being used. They were given the following options and directions:
- Government sense: a tribunal that operates under government authority, such as a court
- Private/non-government: a tribunal that operates under non-governmental/private authority, such as private arbitration
- Other: if the term being used to describe something that does not fit into the first two categories
- Unclear: you cannot tell, which could be because there is not enough information or because you are not sure whether the tribunal mentioned fits into the government or non-government category
The first coder classified all six instances as falling under the government-authority sense. The second coder classified four of the instances as invoking the government-authority sense and two of the instances of the term as being unclear. Not once did a coder deem a use of foreign tribunal(s) in COHA to invoke the private-authority sense, nor did either coder deem any other sense as being used.
We also asked the coders to record the specific type of tribunal being discussed, such as a court, a legislature, arbitration, etc. For the six COHA instances, one coder deemed five references as being to a court and one reference as unclear, while the other deemed four of the six to be to a court, one to be to a state legislature, and the other to be unclear. Not once did a coder conclude the use of the term foreign tribunal(s) referred to arbitration. Of course, having only six instances of the term, and none after 1895, severely limits the conclusions we can draw from the findings. But at the very least, there is no clear evidence that the term foreign tribunal(s) as used in ordinary American English invoked the private/non-governmental sense and applied to arbitration.
2. Corpus of Supreme Court Opinions of the United States (COSCO-US)
We next looked to a corpus of U.S. Supreme Court opinions: COSCO-US. We limited the search to cases up through 1964. We also only coded the first instance of the term foreign tribunal(s) in a case, even if it appeared more than once. This resulted in forty-three instances, ranging from 1808 to 1958. Two coders independently coded all of these instances. They first coded the following sense categories (the same as coded in COHA, though described here in abbreviated form):
- government-authorized sense
- private, non-government-authorized sense
- any other sense
- unclear
The coders agreed 88% of the time, a sufficiently high rate of agreement. The results are in the chart below.
Sense Distribution of Foreign Tribunal(s) in Supreme Court Cases, 1789–1964

At least 90% of the time, coders found that the government-authority sense of tribunal was being invoked for the term foreign tribunal. The rest of the time, it was unclear which sense was being used. And not once did a coder find that the U.S. Supreme Court was using the private/non-government-authority sense.
The coders were also asked to record the type of tribunal being referenced. The first coder found that all but one of the instances were referring to a court, the one outlier being a legislature. The second coder concluded that thirty-six of the forty-three instances were referencing a court, six were unclear, and one referenced a surveyor general. This evidence indicates that, before the statute was enacted, the Supreme Court consistently used foreign tribunal in the narrow, government-authority sense, referring to courts rather than arbitration.
3. Westlaw Federal Court Opinions
A corpus of federal court decisions does not exist outside of the Founding Era. But for this type of analysis, where one is coding concordance lines in a corpus, a digital database without the additional tools of a linguistic corpus will still work. So, we searched in Westlaw for “foreign tribunal” to capture the terms foreign tribunal and foreign tribunals. We limited the search under “Filters by Jurisdiction” to “Federal Courts of Appeal” and “Federal District Courts.” We also limited the search to any cases prior to 10/03/1964, the date the new statutory language at issue here was enacted. We then ordered the results by date with the most recent listed first since caselaw closer to 1964 would be more relevant and less likely to be influenced by linguistic drift. We coded the first 100 cases that had a valid hit (some had to be discarded because the term foreign tribunal(s) appeared in a headnote rather than in the body of the opinion). This resulted in cases from 1868 to 1964.
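The sampling steps just described (apply the date cut-off, order by date with the most recent first, discard hits found only in a headnote, keep the first 100 valid cases) can be sketched in Python as follows. This is only an illustration under stated assumptions: the `CaseHit` record, its fields, and the example cases are hypothetical, and nothing here reflects Westlaw's actual interface or export format.

```python
from dataclasses import dataclass
from datetime import date
from typing import List


@dataclass
class CaseHit:
    citation: str        # hypothetical identifier for the case
    decided: date        # decision date
    term_in_body: bool   # False when the term appears only in a headnote


def sample_recent_valid_hits(
    hits: List[CaseHit], cutoff: date, sample_size: int
) -> List[CaseHit]:
    """Return the most recent `sample_size` valid hits decided before `cutoff`."""
    eligible = [h for h in hits if h.decided < cutoff]
    # Most recent first, so usage closest to the enactment date is sampled.
    eligible.sort(key=lambda h: h.decided, reverse=True)
    # Discard hits where the term is not in the body of the opinion.
    valid = [h for h in eligible if h.term_in_body]
    return valid[:sample_size]


# Hypothetical search results.
hits = [
    CaseHit("Case A", date(1950, 5, 1), True),
    CaseHit("Case B", date(1963, 2, 10), False),  # headnote only: discarded
    CaseHit("Case C", date(1961, 7, 4), True),
    CaseHit("Case D", date(1965, 1, 1), True),    # after the cut-off: excluded
    CaseHit("Case E", date(1902, 3, 9), True),
]

# The study kept the first 100 valid cases; 2 is used here to keep the example small.
sample = sample_recent_valid_hits(hits, cutoff=date(1964, 10, 3), sample_size=2)
print([h.citation for h in sample])
```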
The coding was for one of four categories:
- government-authorized sense
- private, non-government-authorized sense
- any other sense
- unclear
The coders coded the material independently of each other, resulting in an agreement rate of 98% for the senses of tribunal, a very high agreement rate. The findings are in the chart below.
Sense Distribution of Foreign Tribunal(s) in Federal Cases

The coders determined that the government sense was being invoked 98–100% of the time. Twice the second coder determined that the private sense of tribunal was invoked. In the first instance, the district court judge appeared to be referring to arbitration performed by a court in Spain, which would be more consistent with the government sense. The second case coded as invoking the private sense does refer to arbitration, but appears to do it in contrast to a foreign tribunal: “Arbitration clauses are found in virtually all the standard forms of charter parties and are particularly favored by shipping men as a means of avoiding litigation in distant countries before foreign tribunals.” In other spots in the opinion, the court appears to be contrasting arbitration and litigation, so this use of the term foreign tribunals is likely referring to courts in a foreign country, and thus the government-authority sense. It appears, then, that the second coder may have been mistaken in finding two instances of the private/non-governmental sense.
Further, the first coder classified the entity being referred to as a foreign tribunal as a court 99% of the time, with the lone other instance being a patent office. The second coder deemed 98% of the entities being referred to as a foreign tribunal to be courts, with the other 2% referencing arbitration, though these were the same two cases just discussed above, leading us to believe these references were mistaken. Thus, it appears federal courts used the term consistently with how the Supreme Court used it during that time: in the narrow, government-authority sense and usually referring to courts.
4. U.S. Code
We next looked at the United States Code as found in HeinOnline. We searched in “All Titles” under U.S. Code for the terms foreign tribunal and foreign tribunals, limiting the results to those that occurred up through 1964. After eliminating duplicates and only sampling the first instance if the term appeared more than once in a particular document, we were left with twelve results. The first coder found all twelve instances to refer to the narrow, government-authority sense. The second coder determined that eleven of the twelve used the narrow, government-authority sense, with the other instance being unclear. Not once could we find an example of the private/non-government sense. As for the type of entity that was referred to as a foreign tribunal, the first coder deemed all twelve instances to be courts, while the second coder found that eight of the twelve were courts, and the other four were unclear. We did not find an example of an arbitration body being referred to as a foreign tribunal. This usage is consistent with how the courts were using the term.
5. U.S. Law Reviews
Finally, we looked at HeinOnline’s Core U.S. Journals database to see how foreign tribunal(s) was used in legal scholarship. Given how many times the terms occurred, we limited the years to 1950–1964, which resulted in 201 hits. We eliminated any result quoting another source, any duplicates, and any hits that were merely titles of statutes with no context. If foreign tribunal(s) appeared multiple times in the document, we only sampled it once, taking the first instance unless that instance was eliminated for the reasons just noted. This resulted in ninety-eight instances of foreign tribunal(s) that we coded. The coding was for one of four categories:
- government-authorized sense
- private, non-government-authorized sense
- any other sense
- unclear
Two coders coded the material independently of each other, resulting in an agreement rate of 96% for the senses of tribunal, a very high rate of agreement. In the figure below, we report the percentages for each category coded:
Sense Distribution of Foreign Tribunal(s) in U.S. Law Reviews

The results are very clear and very stark. Almost every single time the terms foreign tribunal or foreign tribunals were used in the decade and a half before 1964 in U.S. legal scholarship, the term took on the government-authorized sense. Arguably only once did it take on the private sense. For that one instance, the coders disagreed, with one classifying it as taking on the government sense and the other coding it as being the private sense. The context was the trial in Israel of the infamous Nazi Adolf Eichmann. The sentence in which the term appeared was, “While arrangements were made for the taking of affidavits and for cross-examination before foreign tribunals, the understandable reluctance of former Nazis to appear before the court largely derogated from whatever direct applicability the territorial theory might have had to the Eichmann case.” Given this is in the context of a criminal case, it seems unlikely that the term foreign tribunals would cover private entities in other countries. The coder who coded this instance as involving the private/non-governmental sense was likely mistaken. The coder also classified the type of foreign tribunal here to be a court, which is in tension with it being the private/non-governmental sense and further supports the government sense. Hence, it appears the private sense of tribunal never occurred in our sample of U.S. law reviews.
What is more, in determining what type of foreign tribunal was being discussed, the coders never found anything other than courts being referenced. This usage in legal scholarship is consistent with how Congress, the Supreme Court, and lower federal courts used the term. Furthermore, this legal usage is consistent with the ordinary usage.
* * *
The data are about as one-sided as we have ever seen in doing corpus linguistic analysis. In 259 instances of the use of the term foreign tribunal or foreign tribunals across ordinary American English, U.S. Supreme Court opinions, federal court opinions, the U.S. Code, and U.S. legal scholarship, we found only three debatable instances of the use of a private/non-government-sense of tribunal—and those three were probably mistakenly coded. We also only found two possible instances where foreign tribunal(s) may have been referencing arbitration, but we also think those were probably mistakes. That is about as linguistically lopsided as it can get. Of course, we are not saying that it is impossible for foreign tribunal(s) to refer to a private, commercial arbitration panel. No doubt one could find an instance if one looked long and hard enough, just as one could probably find a few Republicans who would vote for Bernie Sanders for President. We are just saying that, based on the data we sampled, such usage was uncommon.
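As a quick arithmetic check on the totals reported above, the per-source counts of coded instances can be summed in a few lines of Python. The counts are taken from the preceding subsections; the variable names are ours, and the percentage is a simple illustration rather than a statistic reported elsewhere in this article.

```python
# Coded instances of "foreign tribunal(s)" per source, as reported above.
coded_instances = {
    "COHA": 6,
    "COSCO-US": 43,
    "Westlaw federal court opinions": 100,
    "U.S. Code": 12,
    "U.S. law reviews": 98,
}

total = sum(coded_instances.values())

# Instances that at least one coder assigned to the private/non-government
# sense: two in the federal court sample and one in the law review sample.
debatable_private = 2 + 1

share_private = 100.0 * debatable_private / total

print(f"Total coded instances: {total}")
print(f"Debatable private-sense instances: {debatable_private} ({share_private:.1f}%)")
```

Even counting all three debatable instances as genuine, the private sense would account for only a little over one percent of the sample.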
D. Alternative Explanation
1. Real-world Frequency
There is an alternative explanation for frequency data in a corpus. The data may not reflect linguistic reality but, assuming the corpus is properly constructed, could instead reflect non-linguistic reality. In other words, the data could reflect how frequently certain phenomena occur in the real world. Thus, if one looks in the corpus at the word car, one is more likely to find instances of Fords or Toyotas than Ferraris because there are just many more Fords and Toyotas in existence than Ferraris. But that does not mean a Ferrari is not a car. And to confirm that, one could look to see if every time a Ferrari showed up in the corpus, it was described as a car. Is the fact that the term foreign tribunal almost never shows up as referring to a private, non-government-authorized tribunal or to arbitration merely a reflection of how much less arbitration occurs as compared to government-authorized tribunals and courts?
One way to get some leverage on this question would be to know how many lawsuits are filed in courts each year versus how many arbitration proceedings are instituted. Of course, one would need to know that historical data for the time periods analyzed here, which is to say pre-1965. We do not have that data. But it does not appear that the data we have sampled could be entirely driven by courts and lawsuits being more prevalent in the real world than arbitration, because that would imply that arbitration almost never occurred.
2. Arbitration Analysis
To look at this difference between linguistic frequency and real-world frequency from another angle, we decided to sample 100 instances of the word arbitration from COHA, to capture more ordinary language, and COSCO-US, to capture more legal meaning. We recorded the general word used for the entity conducting the arbitration proceeding (panel, body, tribunal, commission, etc.). We did so to see whether, when the term arbitration is used, the entity conducting it is predominantly referred to as a tribunal or as something else. If that entity were predominantly called something other than a tribunal, it would be further evidence that the real-world frequency of arbitration is not what is driving the frequency data we see in our analysis of foreign tribunal(s). We recognize, though, that this type of analysis is less direct evidence of the meaning of foreign tribunal(s).
3. COHA
We searched for the terms arbitration and arbitrations in COHA that occurred from 1950–1964, finding 192 documents. We only took the first instance if there were multiple instances from the same document. This reduced our total to 117. The overwhelming majority (74%) of the hits did not reveal the type of entity performing the arbitration. Below are the results we found when we could determine the entity type.
Type of Entity Performing Arbitration in COHA, 1950–1964
Entity Type | Total | %
board(s) | 19 | 63.3%
commission | 4 | 13.3%
committee | 1 | 3.3%
court | 3 | 10.0%
panel | 2 | 6.7%
tribunal | 1 | 3.3%
As is evident, it is possible to refer to the entity that is performing arbitration as a tribunal. In this instance, it was a tribunal to handle disputes over the Suez Canal consisting of one member named by Egypt, one by the complaining party, and the third by both together or by the International Court of Justice in The Hague. (The coder deemed the source of this arbitration tribunal’s authority to be governmental in nature.) But from 1950–1964 in the representative sample of more “ordinary” American English we examined, it was rare to refer to an entity performing arbitration as a tribunal.
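The percentages in the COHA table can be reproduced from the raw counts with a short Python sketch. The counts and the total of 117 sampled hits come from the text above; the variable names are ours and purely illustrative.

```python
# Entity types named in the COHA arbitration sample (1950-1964), from the table above.
entity_counts = {
    "board(s)": 19,
    "commission": 4,
    "committee": 1,
    "court": 3,
    "panel": 2,
    "tribunal": 1,
}

total_sampled = 117                       # first instance per document
identified = sum(entity_counts.values())  # hits that revealed an entity type
undetermined = total_sampled - identified

# Percentages are computed over the identified hits only, as in the table.
shares = {
    entity: round(100.0 * count / identified, 1)
    for entity, count in entity_counts.items()
}
undetermined_share = 100.0 * undetermined / total_sampled

for entity, share in shares.items():
    print(f"{entity}: {share}%")
print(f"No entity type revealed: {undetermined} of {total_sampled} "
      f"({undetermined_share:.0f}%)")
```

Note that the table's denominator is the 30 hits in which an entity type could be determined, not all 117 sampled hits, so the share of all sampled hits that use tribunal is smaller still.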
4. COSCO-US
We performed the same analysis in COSCO-US to see what type of entity the U.S. Supreme Court referenced as performing arbitration. We only sampled the first instance if the term arbitration was used more than once in an opinion, treating majority and separate opinions as distinct. We also limited our results to the period from 1789 to 1964. This resulted in 88 instances, though again, an overwhelming majority (73%) did not reveal the entity type performing the arbitration. Below are the results we found when we could determine the entity type.
Type of Entity Performing Arbitration in COSCO-US, 1789–1964
Entity Type | Total | %
association | 1 | 4.2%
board | 12 | 50.0%
body | 1 | 4.2%
commission | 3 | 12.5%
committee | 2 | 8.3%
tribunal | 5 | 20.8%
Here we see that the Supreme Court refers to the entity that performs arbitration as a tribunal about a fifth of the time, though it is not the most common term, which is board. Of these five instances of tribunal, in one the Court referred to the entity both as a tribunal and as a commission. In another, it referred to the entity as both a court and a tribunal and seemed to be referring to a court proceeding as arbitration. The other three instances all seem to refer to an international tribunal of arbitration between the United States and Great Britain that was created by treaty and convened in Geneva, Switzerland to handle claims that arose out of the Civil War.
In sum, whether in more ordinary American English or in legal American English, at least as used by the U.S. Supreme Court, entities performing arbitration are unlikely to be referred to as a tribunal. This is further evidence that our findings for foreign tribunal are not driven by something other than linguistic usage.
Conclusion
In ZF Automotive US v. Luxshare, the parties have presented the Court with what Justice Frankfurter would call a “contest between probabilities of meaning.” But the methodologies and evidence presented by the parties to resolve that contest—dueling dictionaries and small samples of usage of the individual words of a multi-word term—were inadequate. After sampling 259 usages of the terms foreign tribunal and foreign tribunals across collections of texts using both ordinary and legal American English—including U.S. Supreme Court and federal court opinions, the U.S. Code, and U.S. legal scholarship—the data overwhelmingly show that the term foreign tribunal(s) was used in the sense of an entity using government authority to resolve a dispute, almost always a court. While there may be additional considerations the Court should take into account in resolving the legal question before it, the linguistic question is very clear: the term foreign tribunal seldom referred to a private arbitration body in American English prior to 1965, and the entity that was referred to as conducting arbitration was usually called something other than a tribunal.