[go: up one dir, main page]

Published on

Run-length Encoding Algorithm in JavaScript

Authors
  • avatar
    Name
    Nik Shevchenko
    Twitter

Abstract

Compression purpose to reduce the size of data and speed up the data transmission process between client and server. To solve the size problem of the data and transmission process, I implement Run-length Encoding (rle) algorithm for JavaScript. This implementation currently works only for binary data (bi-level) and rle is better for this case based on the results of research.

Introduction

Run-length Encoding is a form of lossless data compression in which a stream of data is given the array of numbers (i.e. [0, 0, 0, 1, 1, 0, 0]) and the output is a sequence of counts of consecutive data values in array (i.e. [3, 0, 2, 1, 2, 0]). For example, you want to use Image Segmentation with predictions and completions data and you need to work with binary data between the client (browser) and server. But every GET or POST request with binary data will be to large. Image with the natural size 5000px x 5000px will be too large for good working of your user experience, matrix 5000 x 5000 in binary data ~195MB as an array and ~25MB as a string. I recommend converting an array with numbers to a string because you save 400% (8 bytes / 2 bytes).

Algorithm

RLE works by reducing the physical size of repeating characters. And this repeating characters, called run, is typically encoded into two bytes. The first byte represents the number of characters in the run and is called the run count.

Uncompressed, a character run of 20 0 would normally require 20 bytes to store: [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0] The same string after RLE encoding would require only two bytes: [20, 0]

Method

Compression data:

  1. Check of the current value with the neighbors values, when current value same with neighbor's, then combine the values into one and add a counter to the values.
  2. Go to the next value and if the current value isn't the same with the neighbor's value then save the current value and repeat the first step back.
  3. After processes 1 and 2 finished, then save compression result as an array.
function encode(arr) {
	let encoding = [],
		previous,
		count;

	for (count = 1, previous = arr[0], i = 1; i < arr.length; i++) {
		if (arr[i] !== previous) {
			encoding.push(count, previous);
			count = 1;
			previous = arr[i];
		} else {
			count++;
		}
	}

	/**
	 * Add a last pair
	 */
	encoding.push(count, previous);

	return encoding;
}

Decompression data:

  1. Check the current value with the neighbor's value, neighbor's value is the amount of current value.
  2. Wrote the current value as much as the neighbor's value who has been checked.
  3. Go to the next value and repeat the first step and second step until all back to the original values.
function decode(encoded) {
	let uncompressed = [];

	/**
	 * Create a new array with decoded data
	 */
	encoded.map((element, ind) => {
		if (ind % 2 === 0) {
			uncompressed.push(...Array(element).fill(encoded[ind + 1]));
		}
	});

	return uncompressed;
}

Example of compression data

Input data:

[0, 0, 0, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 0, 0]

Output data:

[7, 0, 3, 1, 5, 0]

We were able to compress the data from 15 elements down to 6. (60%)

Implementation

I create a tiny (117 bytes) javascript library on github: rle-data

If you want to use implementation, you can add lib to your project with npm:

npm i --save-dev rle-data

let rle = require("rle-data");

rle.encode([0, 0, 0, 0, 0, 1, 1, 0, 0]); // encode data
rle.decode([4, 0, 4, 1, 3, 0]); // decode data

References