Quasi-distinct parsing and optimal compression methods

Amihood Amir, Yonatan Aumann, Avivit Levy, Yuri Roshko

Research output: Chapter in Book/Report/Conference proceeding; Conference contribution; peer-review

2 Scopus citations


In this paper, the optimality proof of Lempel-Ziv coding is revisited, and a much more general compression optimality theorem is derived. In particular, the property of quasi-distinct parsing is defined. This property is much weaker than the distinct parsing required in the original proof, yet we show that the theorem still holds under it. This provides a better understanding of the optimality proof of Lempel-Ziv coding, together with a new tool for proving the optimality of other compression schemes. To demonstrate a possible use of this generalization, a new coding method, APT coding, is presented. It is based on a principle very different from that of Lempel-Ziv coding and does not directly define any parsing technique. Nevertheless, APT coding is analyzed in this paper and, using the generalized theorem, is shown to be asymptotically optimal up to a constant factor, provided that the APT quasi-distinctness hypothesis holds. Empirical evidence that this hypothesis holds is also given.
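
For readers unfamiliar with the term, a distinct parsing splits a string into phrases that are pairwise different. The sketch below illustrates the classical incremental (LZ78-style) parsing, which has this property for every phrase except possibly the last. It is not taken from the paper and does not implement quasi-distinct parsing or APT coding, whose definitions are not given in this abstract; the function name `lz78_parse` is an illustrative assumption.

```python
def lz78_parse(text):
    """Split text into phrases in the incremental (LZ78-style) manner.

    Each phrase is the shortest prefix of the remaining input that has
    not already appeared as a phrase, so all phrases are distinct,
    except possibly the final one if the input ends mid-phrase.
    """
    seen = set()      # phrases emitted so far
    phrases = []      # the parsing, in order
    current = ""      # phrase currently being extended
    for ch in text:
        current += ch
        if current not in seen:
            # First time this string appears as a phrase: emit it.
            seen.add(current)
            phrases.append(current)
            current = ""
    if current:
        # Leftover input that only repeats an earlier phrase.
        phrases.append(current)
    return phrases


if __name__ == "__main__":
    print(lz78_parse("abab"))  # ['a', 'b', 'ab']
```

Concatenating the returned phrases reproduces the input, and the number of phrases is the quantity that classical optimality arguments for Lempel-Ziv coding bound.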

Original language: English
Title of host publication: Combinatorial Pattern Matching - 20th Annual Symposium, CPM 2009, Proceedings
Number of pages: 14
State: Published - 2009
Event: 20th Annual Symposium on Combinatorial Pattern Matching, CPM 2009 - Lille, France
Duration: 22 Jun 2009 - 24 Jun 2009

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 5577 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 20th Annual Symposium on Combinatorial Pattern Matching, CPM 2009

Bibliographical note

Funding Information:
The first author was partly supported by ISF grant 35/05. The third author is partly supported by Israel Science Foundation grant 347/09.

