The performance of data compression on a large static text may be improved if certain variable-length strings are included in the character set for which a code is generated. A new method for extending the alphabet is presented, based on a reduction to a graph-theoretic problem. A related optimization problem is shown to be NP-complete, a fast heuristic is suggested, and experimental results are presented.
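The basic premise, that adding a few variable-length strings to the alphabet can shorten the encoded text, can be illustrated with a small sketch. This is not the graph-theoretic method or the heuristic of the paper: the extra strings are picked by hand, the text is parsed by a simple greedy longest match, and the helper names `huffman_cost` and `tokenize` are hypothetical. The comparison also ignores the cost of storing the code itself.

```python
import heapq
from collections import Counter


def huffman_cost(freqs):
    """Total encoded length, in bits, of a Huffman code for the given symbol counts."""
    heap = list(freqs.values())
    if len(heap) == 1:
        return heap[0]  # a lone symbol still needs one bit per occurrence
    heapq.heapify(heap)
    total = 0
    while len(heap) > 1:
        a = heapq.heappop(heap)
        b = heapq.heappop(heap)
        total += a + b  # each merge adds one bit to every occurrence below it
        heapq.heappush(heap, a + b)
    return total


def tokenize(text, extra):
    """Greedy longest-match parse of text into single characters plus the extra strings."""
    candidates = sorted(extra, key=len, reverse=True)
    tokens = []
    i = 0
    while i < len(text):
        for s in candidates:
            if text.startswith(s, i):
                tokens.append(s)
                i += len(s)
                break
        else:
            tokens.append(text[i])
            i += 1
    return tokens


# Toy text and hand-picked extension strings (assumptions for illustration only).
text = "the cat and the dog and the bird " * 50
extra = ["the ", "and "]

plain_bits = huffman_cost(Counter(text))
extended_bits = huffman_cost(Counter(tokenize(text, extra)))
print(plain_bits, extended_bits)
```

On this toy input the extended alphabet yields a shorter encoding than the plain character alphabet. Which strings to add is the hard part, and that choice is the optimization problem the abstract refers to.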
| Field | Value |
|---|---|
| Title of host publication | Language, Culture, Computation. Computing-Theory and Technology |
| Editors | Nachum Dershowitz, Ephraim Nissan |
| Publisher | Springer Berlin Heidelberg |
| Number of pages | 16 |
| State | Published - 2014 |
| Name | Lecture Notes in Computer Science |