[go: up one dir, main page]

Jump to content

LZWL: Difference between revisions

From Wikipedia, the free encyclopedia
Content deleted Content added
SmackBot (talk | contribs)
m Date the maintenance tags or general fixes
Avoyt (talk | contribs)
m Fixed a link.
 
(35 intermediate revisions by 30 users not shown)
Line 1: Line 1:
{{Multiple issues|
{{Notability|date=February 2008}}
{{more citations needed|date=January 2013}}
{{Tone|date=August 2009}}


}}
'''LZWL''' is a syllable-based variant of the character-based [[LZW]] compression algorithm.


'''LZWL''' is a syllable-based variant of the [[Lempel–Ziv–Welch|LZW (Lempel-Ziv-Welch) compression algorithm]], designed to work with syllables derived from any syllable decomposition algorithm. This approach allows LZWL to efficiently process both syllables and words, offering a nuanced method for data compression.
LZWL can work with syllables obtained by all algorithms of decomposition into syllables. This algorithm can be used for words too.


==Syllables==
==Algorithm==
The LZWL algorithm initializes by populating a dictionary with all characters from the alphabet. It then searches for the longest string, '''S''', that exists in both the dictionary and as a prefix of the unencoded portion of the input. The algorithm outputs the identifier of '''S''' and augments the dictionary with a new phrase, which combines '''S''' with the subsequent character in the input. The input position advances by the length of '''S'''. During decoding, LZWL addresses scenarios where the received phrase identifier does not exist in the dictionary by constructing the missing phrase from the concatenation of the last added phrase and its initial character.
According Compact Oxford English Dictionary syllable is defined as: ‘A unit of pronunciation having one vowel sound, with or without surrounding consonants,
and forming all or part of a word.’


===Syllable-Based Adaptation===
As the decomposition to syllables is used in data compression, it is not necessary to decompose words into syllables always correctly.
In its syllable-based adaptation, LZWL employs a list of syllables as its alphabet. The initialization step includes the empty syllable and integrates small, frequently occurring syllables into the dictionary. Identifying '''S''' and encoding its identifier mirrors the original algorithm, with the distinction that '''S''' represents a syllable string. If '''S''' is an empty syllable, the algorithm extracts a syllable '''K''' from the input and encodes '''K''' using methods for new syllables before adding '''K''' to the dictionary and advancing the input position accordingly.

==Algorithm==
Algorithm LZWL can work with syllables obtained by all algorithms of decomposition into syllables. This algorithm can be used for words too.
<br><br>
In the initialization step the dictionary is filled up with all characters from the alphabet. In each next step it is searched for the maximal string '''S''', which is from the dictionary and matches the prefix of the still non-coded part of the input. The number of phrase '''S''' is sent to the output. A new phrase is added to the dictionary. This phrase is created by concatenation of string S and the character that follows '''S''' in file. The actual input position is moved forward by the length of '''S'''.
Decoding has only one situation for solving. We can receive the number of phrase, which is not from the dictionary. In this case we can create that phrase by concatenation of the last added phrase with its first character.
<br><br>
The syllable-based version works over an alphabet of syllables. In the initialization step we add to the dictionary the empty syllable and small syllables from a database of frequent syllables. Finding string '''S''' and coding its number is similar to the character-based version, except that string '''S''' is a string of syllables. The number of phrase '''S''' is encoded to the output. The string '''S''' can be the empty syllable.
<br><br>
If '''S''' is the empty syllable, then we must get from the file one syllable called K and encode K by methods for coding new syllables. Syllable K is added to the dictionary. The position in the file is moved forward by the length of '''S'''. In the case when S is the empty syllable, the input position is moved forward by the length of K.
<br><br>
In adding a phrase to the dictionary there is a difference to the character-based version. The phrase from the next step will be called S1. If '''S''' and S1 are both non-empty syllables, then we add a new phrase to the dictionary. The new phrase is created by the concatenation of S1 with the first syllable of '''S'''. This solution has two advantages: The first is that strings are not created from syllables that appear only once. The second advantage is that we cannot receive in decoder number of phrase that is not from dictionary.


===Dictionary Expansion===
==External links==
A notable variation in the syllable-based LZWL involves dictionary expansion. When both '''S''' and the subsequent string '''S1''' are non-empty syllables, a new phrase is added to the dictionary by concatenating '''S1''' with '''S'''’s initial syllable. This method prevents the formation of strings from syllables that appear only once and ensures the decoder does not encounter undefined phrase identifiers.
*[http://ftp.informatik.rwth-aachen.de/Publications/CEUR-WS/Vol-176/paper5.pdf Detailed description]


==References==
{{Compression Methods}}
{{reflist}}
* [http://www.cs.vsb.cz/dateso/2005/slides/slides6.pps Data Compression Using the LZWL Algorithm]
* Salomon, David; Motta, Giovanni (2010-01-18). ''Handbook of Data Compression''. Springer. ISBN 9781848829039. [https://books.google.com/books?id=QfTzSTcTalAC&pg=PA1 Google Books]. Retrieved 2014-07-11.


==Categories==
{{Comp-sci-stub}}
[[Category:Data compression]]
[[Category:Lossless compression algorithms]]

Latest revision as of 06:46, 15 May 2024

LZWL is a syllable-based variant of the LZW (Lempel-Ziv-Welch) compression algorithm, designed to work with syllables derived from any syllable decomposition algorithm. This approach allows LZWL to efficiently process both syllables and words, offering a nuanced method for data compression.

Algorithm

[edit]

The LZWL algorithm initializes by populating a dictionary with all characters from the alphabet. It then searches for the longest string, S, that exists in both the dictionary and as a prefix of the unencoded portion of the input. The algorithm outputs the identifier of S and augments the dictionary with a new phrase, which combines S with the subsequent character in the input. The input position advances by the length of S. During decoding, LZWL addresses scenarios where the received phrase identifier does not exist in the dictionary by constructing the missing phrase from the concatenation of the last added phrase and its initial character.

Syllable-Based Adaptation

[edit]

In its syllable-based adaptation, LZWL employs a list of syllables as its alphabet. The initialization step includes the empty syllable and integrates small, frequently occurring syllables into the dictionary. Identifying S and encoding its identifier mirrors the original algorithm, with the distinction that S represents a syllable string. If S is an empty syllable, the algorithm extracts a syllable K from the input and encodes K using methods for new syllables before adding K to the dictionary and advancing the input position accordingly.

Dictionary Expansion

[edit]

A notable variation in the syllable-based LZWL involves dictionary expansion. When both S and the subsequent string S1 are non-empty syllables, a new phrase is added to the dictionary by concatenating S1 with S’s initial syllable. This method prevents the formation of strings from syllables that appear only once and ensures the decoder does not encounter undefined phrase identifiers.

References

[edit]

Categories

[edit]