Next: Ngram Statistics Package Up: Statistical Language Modeling Previous: CMU-Cambridge Statistical Language Modeling   Contents

Trigger Toolkit

Description
The software described here finds pairs of words that co-occur with high frequency in text. Starting from a corpus of text (several million words is what we have in mind), the program automatically discovers ordered pairs of words in which the occurrence of the first word makes the subsequent appearance of the second word much more likely than it otherwise would be. For example, the toolkit might discover that the word "patient" augurs the imminent appearance of "drug." To rank pairs of words, the toolkit uses mutual information. For more details about mutual information and why triggers might be useful, see the references.
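
As a rough illustration of ranking word pairs by mutual information, here is a minimal Python sketch. It is not the toolkit's code and does not reproduce its exact scoring: the function name rank_trigger_pairs, the window and min_count parameters, the fixed-size history window, and the use of pointwise mutual information are all assumptions made for this example.

```python
import math
from collections import Counter


def rank_trigger_pairs(tokens, window=5, min_count=2):
    """Rank ordered pairs (a, b) by pointwise mutual information.

    Assumption for this sketch: "a triggers b" means that a appears among
    the `window` tokens preceding an occurrence of b. The score is
        log2( P(a in history, b next) / (P(a in history) * P(b next)) ).
    """
    n = len(tokens)
    hist_count = Counter()    # positions whose history contains a
    target_count = Counter()  # positions where b is the current word
    pair_count = Counter()    # positions where both events hold

    for i, b in enumerate(tokens):
        target_count[b] += 1
        # Distinct words in the preceding window (empty at position 0).
        history = set(tokens[max(0, i - window):i])
        for a in history:
            hist_count[a] += 1
            if a != b:  # self-triggers are skipped to keep the sketch simple
                pair_count[(a, b)] += 1

    scored = []
    for (a, b), c in pair_count.items():
        if c < min_count:
            continue  # rare pairs give unreliable estimates
        pmi = math.log2(c * n / (hist_count[a] * target_count[b]))
        scored.append(((a, b), pmi))

    # Highest-scoring (most strongly associated) pairs first.
    scored.sort(key=lambda item: item[1], reverse=True)
    return scored
```

Calling rank_trigger_pairs(text.split(), window=5) on a tokenized corpus returns a list of ((first, second), score) entries, strongest association first. On a corpus of the size mentioned above, a larger min_count would likely be needed, since pointwise mutual information tends to overrate pairs that occur only a few times.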

Author: Adam Berger
License: Academic Free
URL: http://www.cs.cmu.edu/~aberger/software
Notes
Trigger Toolkit is located at $NLP/share/slm/trigger/.

Binary programs: findPairs, rankPairs, writeScripts


Zhang Le 2003-10-26