U.S. Supreme Court uses corpus created by BYU professor Mark Davies
His billion-word online corpora aid study of language, culture
PROVO — Linguists like to joke that you can tell a lot about a word by the words it hangs out with. And 50 years ago, "gay" was hanging out with "grave" and "brilliant" while "sex" palled around with "hygiene" and "conflicts." Today, the words "gay" and "sex" have much more controversial companions, illustrating not only a change in the grammatical and lexical structure of the English language, but a cultural shift as well — all seen through the study of words.
"I love linguistics," said BYU linguistics professor Mark Davies. "I love looking at how and why language changes, but I'm equally as interested in history and culture, and language can serve as a beautiful window on that."
Davies, regarded by many in the linguistic community as a standard setter, has created a window of more than 1 billion words, gathered from books, magazines, newspapers, academic sources and transcribed interviews.
His corpora (the plural of corpus, a Latin word for a body or collection of writings used for analysis) are the largest free collections of English words on the Internet. Tens of thousands of users search them each month, from linguists, teachers and students to district judges and Supreme Court justices, all trying to make sense of this odd language we call English.
Corpora in courtrooms
Thirty years ago, when lawyers or judges disagreed on a word's meaning, there were two solutions: dictionaries or telephone surveys.
"Both of them are unreliable," said BYU linguistics professor and department chair William Eggington. "Dictionaries are usually way behind the times and usually don't cover the full range of the meaning of the word, and in a dictionary, there's no way to measure frequency, how often this meaning is used. And surveys, they're hit and miss. But then the corpus comes along and changes everything."
Suddenly, instead of relying on stale definitions or unscientific survey methods that left room for doubt, judges and attorneys could turn to hefty databases that painted a much more accurate picture of words in context, he said.
These corpora are even finding their way into high-profile cases, like the March 1 Supreme Court decision in which Chief Justice John Roberts cited corpus data as a foundation for limiting the descriptive ability of "personal" to people, not corporations.
AT&T had been asking for a "personal exemption" so it would not have to reveal certain financial documents. It made perfect sense, the company argued, because legally a business can already be considered a "person."