Next: Ngram Statistics Package
Up: Statistical Language Modeling
Previous: CMU-Cambridge Statistical Language Modeling
Contents
Description
The software described here finds pairs of words that co-occur with high
frequency in text. Starting from a corpus of text (several million words is
what we have in mind), the program automatically discovers ordered pairs of
words in which the occurrence of the first word makes the subsequent
appearance of the second word much more likely than it otherwise would be. For
example, the toolkit might discover that the word "patient" augurs the imminent
appearance of "drug". To rank pairs of words, the toolkit uses mutual
information. For more details about mutual information and why triggers might
be useful, see the references.
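To make the ranking idea concrete, here is a minimal Python sketch of scoring ordered word pairs by mutual information. It is not the toolkit's own code: the function name `rank_trigger_pairs`, the `window` and `min_count` parameters, and the choice of average mutual information over a fixed-size history window are all assumptions made for illustration.

```python
import math
from collections import Counter


def rank_trigger_pairs(tokens, window=5, min_count=2):
    """Rank ordered pairs (a, b) by the average mutual information between
    the events "a occurred in the previous `window` words" and
    "the current word is b". Illustrative sketch only."""
    n = len(tokens)
    word_count = Counter(tokens)  # positions where the current word is b
    hist_count = Counter()        # positions whose history contains a
    joint = Counter()             # positions with a in history and b current

    for i, b in enumerate(tokens):
        history = set(tokens[max(0, i - window):i])
        for a in history:
            hist_count[a] += 1
            if a != b:
                joint[(a, b)] += 1

    scores = []
    for (a, b), n11 in joint.items():
        n_a, n_b = hist_count[a], word_count[b]
        # Keep only frequent pairs where a makes b MORE likely than chance.
        if n11 < min_count or n11 * n <= n_a * n_b:
            continue
        n10 = n_a - n11            # a in history, current word is not b
        n01 = n_b - n11            # a not in history, current word is b
        n00 = n - n11 - n10 - n01  # neither event holds
        cells = (
            (n11, n_a, n_b),
            (n10, n_a, n - n_b),
            (n01, n - n_a, n_b),
            (n00, n - n_a, n - n_b),
        )
        # Sum p(x, y) * log2(p(x, y) / (p(x) * p(y))) over the 2x2 table.
        mi = sum(
            (nxy / n) * math.log2(nxy * n / (nx * ny))
            for nxy, nx, ny in cells
            if nxy > 0
        )
        scores.append((mi, a, b))

    scores.sort(reverse=True)  # highest mutual information first
    return scores


if __name__ == "__main__":
    # Tiny hypothetical corpus; a real run would use millions of words.
    corpus = ("the patient took the drug . " * 20 + "a dog ran home . " * 20).split()
    for mi, a, b in rank_trigger_pairs(corpus)[:5]:
        print(f"{a} -> {b}: {mi:.4f} bits")
```

On a toy corpus like this one the scores mean little. The sketch only shows how a 2x2 table of counts for each ordered pair turns into a single ranking score.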
Author: Adam Berger
License: Academic Free
URL: http://www.cs.cmu.edu/~aberger/software
Notes
The Trigger Toolkit is located at $NLP/share/slm/trigger/.
Binary programs: findPairs, rankPairs, writeScripts
Zhang Le
2003-10-26