Abstract
Many of the most effective compression methods involve complicated models. Unfortunately, as model complexity increases, so does the cost of storing the model itself. This paper examines a method to reduce the amount of storage needed to represent a Markov model with an extended alphabet, by applying a clustering scheme that brings together similar states. Experiments run on a variety of large natural language texts show that much of the overhead of storing the model can be saved at the cost of a very small loss of compression efficiency.
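The abstract's core idea, merging Markov states with similar conditional distributions so fewer distinct states need to be stored, can be illustrated with a small sketch. The greedy strategy, the total-variation distance, and the `threshold` parameter below are illustrative assumptions, not the paper's actual algorithm.

```python
from collections import defaultdict

def tv_distance(p, q):
    """Total-variation distance between two next-symbol distributions."""
    symbols = set(p) | set(q)
    return 0.5 * sum(abs(p.get(s, 0.0) - q.get(s, 0.0)) for s in symbols)

def cluster_states(model, threshold=0.2):
    """Greedily merge Markov states whose conditional distributions are close.

    `model` maps state -> {symbol: probability}.  Returns (reps, clusters):
    reps maps each original state to its cluster representative, and
    clusters maps each representative to the averaged distribution.
    Only the merged distributions need to be stored, reducing model overhead.
    """
    reps = {}      # state -> representative state
    clusters = {}  # representative -> averaged distribution
    counts = {}    # representative -> number of member states
    for state, dist in model.items():
        for rep, cdist in clusters.items():
            if tv_distance(dist, cdist) <= threshold:
                # Merge: update the running average of the cluster distribution.
                n = counts[rep]
                merged = defaultdict(float)
                for s in set(dist) | set(cdist):
                    merged[s] = (cdist.get(s, 0.0) * n + dist.get(s, 0.0)) / (n + 1)
                clusters[rep] = dict(merged)
                counts[rep] = n + 1
                reps[state] = rep
                break
        else:
            # No close cluster found: this state starts its own cluster.
            clusters[state] = dict(dist)
            counts[state] = 1
            reps[state] = state
    return reps, clusters
```

For example, contexts "th" and "Th" in English text predict nearly identical next-symbol distributions, so a scheme like this would collapse them into one stored state at a small cost in modeling precision.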
| Original language | English |
|---|---|
| Pages (from-to) | 367-376 |
| Number of pages | 10 |
| Journal | Data Compression Conference Proceedings |
| State | Published - 1997 |
| Externally published | Yes |
| Event | Proceedings of the 1997 Data Compression Conference, DCC'97 - Snowbird, UT, USA. Duration: 25 Mar 1997 → 27 Mar 1997 |
Bibliographical note
Funding Information: * The work of the first author (AB) was supported, in part, by NSF Grant IRI-9307895-A01. The author gratefully acknowledges this support. We also wish to acknowledge support given by the Academy of Finland to TR. † To whom all correspondence should be addressed: tel: (773) 702-8268, fax: (773) 702-9861; [email protected], [email protected], and [email protected]
Funding
| Funders | Funder number |
|---|---|
| National Science Foundation | IRI-9307895-A01 |
Fingerprint
Research topics of 'Overhead reduction technique for mega-state compression schemes'.