TY - GEN
T1 - Optimal partitioning of data chunks in deduplication systems
AU - Hirsch, Michael
AU - Ish-Shalom, Ariel
AU - Klein, Shmuel T.
N1 - Place of conference: Czech Republic
PY - 2013
Y1 - 2013
AB - Deduplication is a special case of data compression in which repeated chunks of data are stored only once. For very large chunks, this process may be applied even if the chunks are similar and not necessarily identical, and then the encoding of duplicate data consists of a sequence of pointers to matching parts. However, not all the pointers are worth being kept, as they incur some storage overhead. A linear, sub-optimal solution of this partition problem is presented, followed by an optimal solution with cubic time complexity and requiring quadratic space.
UR - http://www.scopus.com/inward/record.url?scp=84884615333&partnerID=8YFLogxK
M3 - Conference contribution
AN - SCOPUS:84884615333
SN - 9788001053300
T3 - Proceedings of the Prague Stringology Conference 2013, PSC 2013
SP - 128
EP - 141
BT - Proceedings of the Prague Stringology Conference 2013, PSC 2013
T2 - Prague Stringology Conference 2013, PSC 2013
Y2 - 2 September 2013 through 4 September 2013
ER -