
Queries

XKWIC supports quite complicated queries through its Corpus Query Processor (CQP). We list some examples of possible queries here. A separate manual on CQP gives more detail about the available queries and about how they are processed.

"research"
[word = "research"]
Both queries search for all occurrences of the word ``research''.

[word = "research.*"]
Search for all words starting with ``research''.

[lemma = "research"]
Search for all word forms whose lemma is ``research''.

[pos = "JJ"]
Search for all occurrences of words tagged as adjectives (i.e. with the part of speech tag ``JJ'').

[word="research" & pos="JJ|NN"]
Search for all occurrences of the word ``research'' tagged as an adjective or a noun.

[lemma = "research" & pos != "V.*"]
Search for occurrences of the lemma ``research'' whose part-of-speech tag does not start with ``V'' (i.e. which are not tagged as VB: verb, base form; VBD: verb, past tense; VBG: verb, gerund; etc.).

[lemma = "research"] "a|the"
Search for the lemma ``research'' immediately followed by the word ``a'' or the word ``the''.

[pos = "JJ" & word !="such"] [lemma="research"]
Find all adjectives other than ``such'' that immediately precede the lemma ``research''.

[lemma="research"] []* "funding"
The lemma ``research'' followed some time later by the word ``funding''. There is no restriction on the nature or amount of material intervening between ``research'' and ``funding''.

[lemma="research"] []* "funding" within s
As before, but the word ``funding'' must occur in the same sentence as the form of the lemma ``research''.
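
The effect of ``within s'' can be illustrated with a small Python sketch. This is not how CQP works internally: the toy corpus, the (word, lemma) pairs and the helper `spans` are all hypothetical, and the sketch simply enumerates every candidate pair of positions rather than following CQP's own matching strategy.

```python
# Hypothetical toy corpus: each sentence is a list of (word, lemma) pairs.
sentences = [
    [("Research", "research"), ("needs", "need"), ("funding", "funding")],
    [("They", "they"), ("researched", "research"), ("it", "it")],
    [("More", "more"), ("funding", "funding"), ("arrived", "arrive")],
]

def spans(tokens):
    """(start, end) pairs: a token with lemma 'research', then a later word 'funding'."""
    return [
        (i, j)
        for i, (_, lemma) in enumerate(tokens) if lemma == "research"
        for j in range(i + 1, len(tokens)) if tokens[j][0] == "funding"
    ]

# Without "within s": treat the whole corpus as one stream of tokens.
flat = [tok for sentence in sentences for tok in sentence]
unrestricted = spans(flat)

# With "within s": search each sentence separately, so no span crosses a boundary.
restricted = [(n, span) for n, s in enumerate(sentences) for span in spans(s)]
```

In this toy data the unrestricted search also pairs ``researched'' in the second sentence with ``funding'' in the third, while the sentence-restricted search keeps only the pair inside the first sentence.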

Exercise:

Search the BNC for uses of the word ``zap''. Does ``zap'' ever occur as an adjective? Does it occur as a noun? Do you agree that all the occurrences found are indeed nouns?

Solution:

Select the BNC (using the Question Mark button) and launch the query [word="zap" & pos="JJ"]. This returns no matches, so the word ``zap'' never occurs tagged as an adjective. When you launch the query [word="zap" & pos="N.*"] on the BNC, it returns examples like ``it will become illegal to zap food with radiation''. Here ``zap'' is clearly a verb rather than a noun, which suggests that some of the part-of-speech tags are wrong.

Exercise:

Can you see what the difference will be between the following searches: (i) [word = "research.*"], (ii) [lemma = "research"], and (iii) [lemma = "research.*"]? Which ones return the same result, and is this by accident or by necessity?

Solution:

Search (i) will find all words that start with ``research'', including ``research-led'' or ``research-intensive''. Search (ii) will return the words morphologically or inflectionally derived from ``research'', which excludes compounds like ``research-intensive''. Search (iii) will return the words morphologically derived from any lemma that starts with ``research''. Since all these derived forms also happen to start with ``research'', the results of (i) and (iii) will be the same. This is an accident of the word forms involved rather than a necessity.
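
The comparison can be tried out on a handful of tokens with a short Python sketch. The (word, lemma) annotations below are invented for illustration, the helper `select` is hypothetical, and the sketch assumes that a pattern has to match the whole attribute value, which is how the examples in this section behave.

```python
import re

# Hypothetical (word, lemma) annotations; a real lemmatizer may decide differently.
tokens = [
    ("research", "research"),
    ("researching", "research"),
    ("researchers", "researcher"),
    ("research-led", "research-led"),
]

def select(attr_index, pattern):
    """Words whose chosen attribute (0 = word, 1 = lemma) matches the pattern in full."""
    return [t[0] for t in tokens if re.fullmatch(pattern, t[attr_index])]

query_i = select(0, "research.*")    # [word = "research.*"]
query_ii = select(1, "research")     # [lemma = "research"]
query_iii = select(1, "research.*")  # [lemma = "research.*"]
```

On this toy data (i) and (iii) select the same four words, while (ii) selects only ``research'' and ``researching''.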

Chris Brew
8/7/1998