Entry: Incremental encoding
Definition

  1. Applications

  2. References

Incremental encoding, also known as front compression, back compression, or front coding, is a type of delta encoding in which the prefix or suffix that an entry shares with the preceding entry is recorded only as a length, so that the shared characters need not be stored again. The algorithm is particularly well suited to compressing sorted data, e.g., a list of words from a dictionary.

For example:

Input        Common prefix        Compressed output
myxa         no preceding word    0 myxa
myxophyta    'myx'                3 ophyta
myxopod      'myxop'              5 od
nab          no common prefix     0 nab
nabbed       'nab'                3 bed
nabbing      'nabb'               4 ing
nabit        'nab'                3 it
nabk         'nab'                3 k
nabob        'nab'                3 ob
nacarat      'na'                 2 carat
nacelle      'nac'                3 elle
64 bytes                          46 bytes
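
The example above can be reproduced with a short sketch. The function names (`common_prefix_len`, `front_encode`, `front_decode`) are illustrative and do not come from any particular library. The size calculation assumes one byte per character and one byte per prefix length, with no separators counted, which is consistent with the 64-byte and 46-byte totals shown.

```python
def common_prefix_len(a: str, b: str) -> int:
    """Return the number of leading characters that a and b share."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def front_encode(words):
    """Turn a (preferably sorted) word list into (prefix_length, suffix) pairs."""
    pairs = []
    prev = ""  # the first word has no preceding word, so its prefix length is 0
    for word in words:
        k = common_prefix_len(prev, word)
        pairs.append((k, word[k:]))
        prev = word
    return pairs


def front_decode(pairs):
    """Rebuild the word list by reusing the first k characters of the previous word."""
    words = []
    prev = ""
    for k, suffix in pairs:
        word = prev[:k] + suffix
        words.append(word)
        prev = word
    return words


words = ["myxa", "myxophyta", "myxopod", "nab", "nabbed", "nabbing",
         "nabit", "nabk", "nabob", "nacarat", "nacelle"]

encoded = front_encode(words)
# encoded starts with (0, "myxa"), (3, "ophyta"), (5, "od"), (0, "nab"), ...

# Assumed size model: one byte per character, one byte per prefix length.
raw_size = sum(len(w) for w in words)                               # 64
packed_size = sum(len(suffix) for _, suffix in encoded) + len(encoded)  # 46
```

Decoding is strictly sequential: each word can be rebuilt only after the word before it, which is why the scheme suits data that is read in order.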

The encoding used to store the common prefix length itself varies from application to application. Typical techniques are storing the value as a single byte; delta encoding, which stores only the change in the common prefix length from one entry to the next; and various universal codes. Incremental encoding may also be combined with general lossless data compression techniques, such as entropy encoding and dictionary coders, to compress the remaining suffixes.
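
The three ways of storing the prefix length can be sketched as follows, using the prefix lengths from the word-list example. The helper names are hypothetical. The text does not say which universal codes are used in practice; Elias gamma is picked here only as one well-known example, and because it covers positive integers only, the sketch encodes each length plus one (an assumption of this sketch, not something stated in the source).

```python
# Prefix lengths from the word-list example (myxa ... nacelle).
lengths = [0, 3, 5, 0, 3, 4, 3, 3, 3, 2, 3]

# 1. Single byte per value: works as long as every length is below 256.
as_single_bytes = bytes(lengths)


# 2. Delta encoding: store only the change from the previous prefix length.
def delta_lengths(values):
    prev = 0
    deltas = []
    for v in values:
        deltas.append(v - prev)
        prev = v
    return deltas


def undelta_lengths(deltas):
    total = 0
    values = []
    for d in deltas:
        total += d
        values.append(total)
    return values


deltas = delta_lengths(lengths)
# deltas == [0, 3, 2, -5, 3, 1, -1, 0, 0, -1, 1]


# 3. A universal code: Elias gamma, returned as a bit string for readability.
def elias_gamma(n: int) -> str:
    if n < 1:
        raise ValueError("Elias gamma is defined for positive integers only")
    binary = bin(n)[2:]                        # binary digits without the '0b' prefix
    return "0" * (len(binary) - 1) + binary   # unary length prefix, then the digits


# Shift by one so that a prefix length of 0 is representable.
gamma_codes = [elias_gamma(k + 1) for k in lengths]
# gamma_codes[:3] == ["1", "00100", "00110"]
```

The deltas can be negative, so a real format that combines delta encoding with a universal code would also need a mapping from signed to positive integers; that step is left out of this sketch.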

Applications

Incremental encoding is widely used in information retrieval to compress the lexicons used in search indexes. A lexicon lists every word found in the indexed documents, together with a pointer from each word to a list of its locations. Typically, incremental encoding compresses these indexes by about 40%.[1]

As one example, the GNU locate utility uses incremental encoding as a starting point for its index of filenames and directories. It then applies bigram encoding to further shorten popular filepath prefixes.

References

1. Ian H. Witten, Alistair Moffat, Timothy C. Bell. Managing Gigabytes. Second edition. Academic Press. ISBN 1-55860-570-3. Section 4.1: Accessing the lexicon, subsection Front coding, pp. 159–161.

Categories: Lossless compression algorithms, Database index techniques
