This chapter has introduced the basics of probability and statistical
language modelling:
- Events are things which might or might not happen.
- Many processes can be thought of as long sequences of
equivalent trials. Counting outcomes over such sequences
yields estimates of probabilities (see the first sketch
after this list).
- Bayes' theorem lets you unpack probabilities into
contributions from different sources. The resulting
conditional probabilities provide a means for reasoning
probabilistically about causal relationships between
events, even if you are guessing some of the parameters
(the theorem is restated after this list).
- There is a close connection between bigrams, contingency tables
and conditional probabilities (see the bigram sketch below).
- It is often worthwhile to work with simplified models of
probabilistic processes, because they allow you to get
estimates of useful quantities which are otherwise inaccessible.
- In language processing you need to be alert to the consequences
of limited training data, which can mean that the theoretically
ideal answer needs adjustment to work in the real world (one
such adjustment is sketched below).
- Language identification is a relatively simple illustration of
these ideas (a toy version appears below).
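
To make the counting point concrete, here is a minimal Python
sketch of relative-frequency estimation. The function name and
the coin-flip data are invented for illustration, not taken from
the chapter.

    from collections import Counter

    def relative_frequencies(outcomes):
        """Estimate event probabilities by counting over a
        sequence of equivalent trials."""
        counts = Counter(outcomes)
        total = sum(counts.values())
        return {event: n / total for event, n in counts.items()}

    # Twenty coin flips, treated as equivalent trials
    trials = list("HTHHTHTTHHHTHTHTTHHT")
    print(relative_frequencies(trials))  # {'H': 0.55, 'T': 0.45}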
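
For reference, Bayes' theorem in its usual form, with A the
event of interest and B the observed evidence:

    P(A | B) = P(B | A) P(A) / P(B)

Reading the right-hand side as the contribution of A to B,
weighted by how plausible A is to begin with, is what lets you
unpack a probability into contributions from different sources.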
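
The bigram connection can also be made concrete. The sketch
below estimates conditional probabilities P(next | current) from
bigram counts; the cells of the implied contingency table are
the bigram counts, and the row totals are the unigram counts of
the first word. The function name and the example sentence are
invented for illustration.

    from collections import Counter

    def bigram_conditional_probs(words):
        # Each bigram count is one cell of the contingency table;
        # the row totals are the unigram counts of the first word.
        bigrams = Counter(zip(words, words[1:]))
        unigrams = Counter(words[:-1])
        return {(w1, w2): n / unigrams[w1]
                for (w1, w2), n in bigrams.items()}

    words = "the cat sat on the mat".split()
    probs = bigram_conditional_probs(words)
    print(probs[("the", "cat")])  # 0.5: "cat" follows one of the two occurrences of "the"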
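
One standard adjustment for limited training data is add-one
(Laplace) smoothing, which reserves a little probability mass
for events that never occurred in training. Whether or not this
is the particular scheme the chapter favours, it illustrates the
point; the counts below are invented.

    def laplace_probability(count, total, vocabulary_size):
        # Maximum likelihood would give count / total, which is
        # zero for unseen events; adding one to every count fixes that.
        return (count + 1) / (total + vocabulary_size)

    # An unseen event gets a small probability rather than an impossible zero
    print(laplace_probability(0, total=1000, vocabulary_size=5000))   # ~0.00017
    print(laplace_probability(20, total=1000, vocabulary_size=5000))  # 0.0035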
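
Finally, a toy language identifier in the spirit of the chapter:
train a smoothed character-bigram model per language, then pick
the language whose model gives the input the highest log
probability. The model names, training sentences, and helper
functions are all invented for illustration; a real system would
train on far more data.

    import math
    from collections import Counter

    def char_bigram_model(text, vocab_size=256):
        """Smoothed character-bigram model trained on sample text."""
        bigrams = Counter(zip(text, text[1:]))
        unigrams = Counter(text[:-1])

        def log_prob(a, b):
            # Add-one smoothing so unseen bigrams do not zero the score
            return math.log((bigrams[(a, b)] + 1) / (unigrams[a] + vocab_size))

        return log_prob

    def identify(text, models):
        """Return the language whose model scores the text highest."""
        def score(log_prob):
            return sum(log_prob(a, b) for a, b in zip(text, text[1:]))
        return max(models, key=lambda lang: score(models[lang]))

    models = {
        "english": char_bigram_model("the quick brown fox jumps over the lazy dog"),
        "german": char_bigram_model("der schnelle braune fuchs springt ueber den faulen hund"),
    }
    print(identify("the fox", models))  # english, on these toy training samples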
In chapter 9 we add basic information theory to the repertoire
which we have already developed, and show how this tool applies
to a word-clustering problem. Then, in a later chapter, we bring
back the n-gram models introduced in the current chapter,
combining them with information-theoretic ideas to explain the
training algorithm that makes it possible for part-of-speech
taggers and speech recognisers to work as well as they do.