
Bayes rule

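Furthermore, because the joint probability P(sherlock_{n-1}, holmes_{n}) means exactly the same thing as P(holmes_{n}, sherlock_{n-1}), expanding it with the definition of conditional probability in each of the two possible orders gives: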

\begin{displaymath}
P(sherlock_{n-1}\vert holmes_{n}) \times P(holmes_{n}) \equiv 
 P(holmes_{n}\vert sherlock_{n-1}) \times P(sherlock_{n-1})\end{displaymath}

This equivalence works for any pair of words, in the form:

\begin{displaymath}
P(w_{n-1}\vert w_{n}) \times P(w_{n}) \equiv 
 P(w_{n}\vert w_{n-1}) \times P(w_{n-1})\end{displaymath}

You can then divide through by P(w_{n-1}), provided it is not zero, to get the usual form of Bayes' rule. This is:

\begin{displaymath}
P(w_{n}\vert w_{n-1}) = \frac{P(w_{n-1}\vert w_{n}) \times 
 P(w_{n})}{P(w_{n-1})}
 \end{displaymath}
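
As a quick numerical check, the rule can be tried out on bigram counts. The sketch below is only an illustration: the toy corpus and the helper names (p_next_given_prev, p_prev_given_next, p_prev, p_next) are invented here and do not come from the text. Probabilities are estimated as relative frequencies over adjacent word pairs, and exact fractions are used so that the two sides can be compared without rounding error.

```python
from collections import Counter
from fractions import Fraction

# Hypothetical toy corpus, invented purely for illustration.
tokens = ("sherlock holmes met doctor watson and sherlock holmes smiled "
          "at watson while mycroft holmes frowned").split()

# Every adjacent pair (w_{n-1}, w_n) in the corpus.
bigrams = list(zip(tokens, tokens[1:]))
pair_counts = Counter(bigrams)
prev_counts = Counter(prev for prev, _ in bigrams)  # counts of w_{n-1}
next_counts = Counter(nxt for _, nxt in bigrams)    # counts of w_n
total = len(bigrams)

def p_next_given_prev(nxt, prev):
    """Relative-frequency estimate of P(w_n | w_{n-1})."""
    return Fraction(pair_counts[(prev, nxt)], prev_counts[prev])

def p_prev_given_next(prev, nxt):
    """Relative-frequency estimate of P(w_{n-1} | w_n)."""
    return Fraction(pair_counts[(prev, nxt)], next_counts[nxt])

def p_prev(prev):
    """Estimate of P(w_{n-1}): how often the word starts a pair."""
    return Fraction(prev_counts[prev], total)

def p_next(nxt):
    """Estimate of P(w_n): how often the word ends a pair."""
    return Fraction(next_counts[nxt], total)

# Left-hand side of Bayes' rule, read straight off the counts.
direct = p_next_given_prev("holmes", "sherlock")

# Right-hand side: P(w_{n-1} | w_n) * P(w_n) / P(w_{n-1}).
via_bayes = (p_prev_given_next("sherlock", "holmes")
             * p_next("holmes") / p_prev("sherlock"))

assert direct == via_bayes
print(direct, via_bayes, p_prev_given_next("sherlock", "holmes"))
```

In this toy corpus every "sherlock" is followed by "holmes", but one "holmes" is preceded by "mycroft", so the two conditional probabilities differ even though Bayes' rule links them exactly.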

At first sight, all this algebra looks circular, because it only tells you how to calculate one probability on the basis of another that looks nearly identical. To see why the rule is nevertheless useful, it's best to step aside from linguistics for a moment and consider an example from medicine.



Chris Brew
8/7/1998