His billion-word online corpora aid study of language, culture
Davies makes his corpora by downloading and scanning texts from various sources
(in the olden days, they had to be hand-compiled), categorising them by genre,
and feeding them into his software that tags all the words. The resultant
database can then be searched, though it leans a bit toward the technical
side.
Political spin? Yes; by default, the media data include
media-preferred language. Spin is two-edged, though. Davies often tells of
the negative traits associated with "Republican" and the
positive ones with "Democrat" that showed up in the news data, instantly
exposing its bias. Fiction would be closer to language on the ground. Spoken
data goes both ways, since it's largely from national television
transcripts. But the assumption that overanalysis obscures the truth is
backward here.
Tithing? BYU did buy some extra hardware, but most
support came from the NEH. Still taxpayer money, anyway.
A time
waster? Well, we all have our hobbies that our busy-body neighbours might tell
us to abandon to spend more time vegetating with family. Seeing that language
is man's interface with God, family, and the world, I wouldn't
overlook the importance of understanding it better.
As I understand it, Professor Davies is trying to understand words by the
context in which they are used? Interesting endeavor. Best of luck.
Yep I thought about all of these things in jr. high. But I wasn't married. Don't
you have loved ones to spend time with? Shouldn't they be receiving every free
moment of your time? Analyze this for three hours, miss another three hours with
the person who chose to marry you.
I'm just trying to make enough money to feed my family and pay tithing so I can
fund this program at the Y.
I dig the stuff. (Get it?) My linguistics prof. Herr Ludvig would ask questions
from a German point of view. Kind of not important in the big picture, but
really interesting.
Most annoying and overused phrase of 011? "going
Go BYU linguistics!
This sort of reminds me of Richard Wirthlin's "Quantifiably Safe
Rhetoric." The whole dynamic changes when wordsmiths become spin
doctors. Too much analysis leads to political correctness over
truth-telling. And heaven knows we need more truth and less hype.
How is a corpus formed? This article didn't seem to address that key question.