
Human factors in annotation

Ideally, use the same phoneticians, parse tree annotators and semanticists for all your 100,000,000 words. Where human judgement is involved, make sure there are at least two annotators, and provide explicit guidelines for how the annotation is to be done.

If you want to make claims about how well people can agree on the task, ensure that annotators work independently. If this is impossible, or not scientifically worthwhile, make sure that you define the procedure for resolving disagreements in advance. You might take a majority vote or insist upon unanimity.
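As a rough illustration of the two ideas above, the sketch below measures agreement between two independent annotators and then applies a resolution rule fixed in advance. The agreement measure used here is Cohen's kappa, which is one common chance-corrected choice and is not named in the text; the function names `cohen_kappa` and `resolve` and the part-of-speech labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa).

    Assumes both annotators labelled the same items, in the same order,
    working independently of each other.
    """
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Proportion of items on which the two annotators agree.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Agreement expected by chance, from each annotator's label frequencies.
    freq_a = Counter(labels_a)
    freq_b = Counter(labels_b)
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


def resolve(labels, unanimous=False):
    """Apply a resolution rule chosen before annotation starts.

    Returns the winning label, or None if the rule gives no decision
    (no strict majority, or no unanimity when it is required).
    """
    label, votes = Counter(labels).most_common(1)[0]
    if unanimous:
        return label if votes == len(labels) else None
    return label if votes > len(labels) / 2 else None


annotator_1 = ["N", "V", "N", "N"]
annotator_2 = ["N", "V", "V", "N"]
kappa = cohen_kappa(annotator_1, annotator_2)
decision = resolve(["N", "N", "V"])
```

Returning `None` rather than guessing keeps unresolved items visible, so they can be handled by whatever fallback the guidelines specify.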

If annotators differ, record this in the publicly available form of your corpus. There is no reason to suppress information for the sake of clarity.
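One possible way to keep disagreements in the released corpus is to store every annotator's label next to the resolved one. The following is only a sketch: the record layout, the field names, the annotator identifiers and the use of JSON are all assumptions made for illustration, not a format prescribed here.

```python
import json


def annotation_record(token, labels_by_annotator, resolved):
    """Build one corpus record that preserves each annotator's label.

    `resolved` is the label chosen by the agreed procedure, or None
    if the disagreement was left unresolved.
    """
    distinct = set(labels_by_annotator.values())
    return {
        "token": token,
        "labels": dict(labels_by_annotator),  # nothing is suppressed
        "resolved": resolved,
        "disagreement": len(distinct) > 1,
    }


record = annotation_record("flies", {"ann1": "NOUN", "ann2": "VERB"}, None)
# One JSON object per line is easy to distribute and to filter later.
line = json.dumps(record, sort_keys=True)
```

Users who want only the resolved labels can ignore the extra fields, while those studying annotator behaviour still have the full picture.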



Chris Brew
8/7/1998