The first thing that has to happen for linguistics to be a remotely
plausible enterprise is that language must be available in permanent
form. By about 3000 B.C. this had happened for Egyptian hieroglyphics,
as well as for other written languages.
In most cases the texts were incidental by-products of commerce and
trade.
Reliable recordings of music and speech had
to wait until the late 19th century, or the mid-20th century if
high-quality reproduction is required.
Data availability is a prerequisite for many forms of scientific
endeavour. For example:
- Indonesia has undoubtedly had a rich literary tradition, but
the climate makes it highly improbable that paper documents
will survive for very long. This limits the potential for
diachronic literary studies.
- Nobody really knows how Classical Greek was pronounced. While it
is possible to make inferences from contemporary
descriptions of the language, from the patterns found in poetry,
and from the spellings of evidently onomatopoeic
words, there are many areas in which doubt must remain.
- Little can be said about the acoustics and physiology of the
European-trained
castrato voice. There is one early recording of a singer from the
Papal chapel, made in [whenever], when he was well past his prime.
While this may be of some interest, one sample is no basis on which to
draw any but the most cautious inferences about such singers.
Early music specialists constantly face the problem of making
intelligent
inferences from many different grossly incommensurable sources, and
must
learn to live with the inevitable uncertainty.
This problem of exploiting limited partial information arises again
and
again in data-intensive linguistics.
By around 1000 B.C. there was a substantial body of authoritative texts
which we now recognise as the Hebrew Bible.
Note that this body of text is a more or less closed collection of
texts
imbued with particular authority, and that major social engineering
would be necessary to add or subtract anything.
Texts which are like this are usually termed "canonical".
This is in marked
contrast to a public library, where the contents are open-ended and
constantly changing. An important question for data-intensive linguistics is whether
language, seen as the object of study, is more like a canon, more like
a public library, or even more like the sort of chat which you
randomly overhear on a bus.
Chris Brew
8/7/1998