Abstract
Recognizing shallow linguistic patterns, such as basic syntactic relationships between words, is a common task in applied natural language and text processing. The common approach to this task is tedious manual definition of possible pattern structures, often in the form of regular expressions or finite automata. This paper presents a novel memory-based learning method that recognizes shallow patterns in new text based on a bracketed training corpus. The training examples are stored as-is, in efficient data structures. Generalization is performed on-line at recognition time by comparing subsequences of the new text to positive and negative evidence in the corpus. This way, no information in the training data is lost, as can happen in other learning systems that construct a single generalized model at training time. The paper presents experimental results for recognizing noun phrase, subject-verb and verb-object patterns in English.
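To make the idea of recognition-time generalization concrete, the following is a minimal sketch and not the paper's algorithm. It assumes a toy corpus of part-of-speech tag sequences with `[` and `]` marking noun-phrase brackets, and it only matches whole candidate sequences, whereas the abstract describes comparing subsequences. The function names `store_corpus`, `evidence` and `score`, the tag set, and the ratio-based score are all hypothetical choices made for illustration.

```python
def store_corpus(bracketed_sentences):
    """Store each training sentence as-is: its tag tuple plus the set of
    (start, end) spans that were bracketed. No model is built here."""
    memory = []
    for tokens in bracketed_sentences:
        tags, spans, open_at = [], set(), None
        for tok in tokens:
            if tok == "[":
                open_at = len(tags)            # remember where the bracket opens
            elif tok == "]":
                spans.add((open_at, len(tags)))
                open_at = None
            else:
                tags.append(tok)
        memory.append((tuple(tags), spans))
    return memory


def evidence(memory, candidate):
    """Count corpus occurrences of the candidate tag sequence.
    Positive: the occurrence coincides exactly with a bracketed span.
    Negative: the same tags occur but are not bracketed that way."""
    candidate = tuple(candidate)
    n = len(candidate)
    pos = neg = 0
    for tags, spans in memory:
        for i in range(len(tags) - n + 1):
            if tags[i:i + n] == candidate:
                if (i, i + n) in spans:
                    pos += 1
                else:
                    neg += 1
    return pos, neg


def score(memory, candidate):
    """Hypothetical score: share of positive evidence (0.0 if unseen)."""
    pos, neg = evidence(memory, candidate)
    return pos / (pos + neg) if pos + neg else 0.0


# Toy bracketed corpus (assumed tags, not data from the paper).
corpus = [
    ["[", "DT", "NN", "]", "VBD", "[", "DT", "JJ", "NN", "]"],
    ["[", "NN", "]", "VBZ", "[", "DT", "NN", "]"],
]
memory = store_corpus(corpus)
```

In this toy setup, `score(memory, ["DT", "NN"])` draws only on positive evidence, while a bare `["NN"]` is mostly seen inside larger bracketed spans and so scores lower. All counting happens at query time against the stored examples, which is the property the abstract emphasizes.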
Original language | English |
---|---|
Pages (from-to) | 369-390 |
Number of pages | 22 |
Journal | Journal of Experimental and Theoretical Artificial Intelligence |
Volume | 11 |
Issue number | 3 |
State | Published - 1999 |
Bibliographical note
Funding Information: The research work has been funded by the Natural Science Foundation of China under Grant No. 61303181.
Funding
Funders | Funder number |
---|---|
National Natural Science Foundation of China | 61303181 |
Keywords
- Chunking
- Machine learning
- Memory based learning
- Natural language processing
- Noun-phrases
- Sequential patterns
- Shallow parsing
- Statistical language processing