Thread replies: 33
Thread images: 1

File: zip-1[1].png (1MB, 500x500px)
What is the superior compression algorithm?

The most famous ones are of course zip and rar, but zip is dead and done for any professional work, since it was designed back when computers had 64 KB of RAM and is now outdated as hell.
So what should a neckbeard use to archive his millions of text files, images, and other stuff today if he wants to keep up with the times?
>>
LZMA2 probs
>>
>>61137124
Probably, which means you could just use 7zip, as .7z uses LZMA
>>
Electro-optical attenuators or FETs are the way to go.
>>
How does file compression work?
>>
>>61137114
>he wants to compress already compressed images
>he doesn't know what happens when compressing high entropy files
>>
>>61137231
>>he doesn't know what happens when compressing high entropy files
Nothing. Literally.
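You can check that with a few lines of Python; zlib stands in here for any general-purpose compressor (strictly, the output even grows a little, by the container's overhead):

import os, zlib

data = os.urandom(1_000_000)              # high entropy: uniformly random bytes
packed = zlib.compress(data, 9)           # maximum effort
print(len(packed) - len(data))            # small positive number: it got slightly BIGGER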
>>
>>61137206

It finds repeating strings in the data and replaces them with a dictionary marker instead. For a really, really simple example, let's pretend your file is, at its core, this string

AKGISGUAPWKGISFJWUKGISODSPKGIS

You'll notice KGIS appears 4 times. So when compressing, you can put a dictionary marking in each of those places, a single reference character. Your compressed archive now looks like this

A*GUAPW*FJWU*ODSP*

And a dictionary file says

* = KGIS

Congratulations, you just shaved 16 characters off the file, saving space. When it decompresses, the * is replaced with KGIS and your file is restored to its original form.
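A minimal sketch of that scheme in Python (purely illustrative: the single-character marker and the separate dictionary are simplifications, and real compressors emit binary tokens):

# Toy dictionary compression: swap a repeated substring for a marker.
def compress(text, pattern, marker="*"):
    assert marker not in text                 # marker must not clash with the data
    return text.replace(pattern, marker), {marker: pattern}

def decompress(packed, dictionary):
    for marker, pattern in dictionary.items():
        packed = packed.replace(marker, pattern)
    return packed

data = "AKGISGUAPWKGISFJWUKGISODSPKGIS"
packed, table = compress(data, "KGIS")
print(packed, table)                          # A*GUAPW*FJWU*ODSP* {'*': 'KGIS'}
assert decompress(packed, table) == data      # lossless round trip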
>>
>>61137426
>>61137206

Sorry, I meant to say you saved 12 characters. You replaced 16 characters with 4.
>>
Who cares, they're all worthless until someone implements multithreaded decompression.
>>
>>61137426
>>61137437
Thanks
>>
>if he wants to keep up with the times?

Get an 8 TB drive for $180 and leave all your shit uncompressed.
>>
>>61137426
There's more to it than that, especially in certain applications like image compression.
>>
>>61137671
Not that guy, but image compression is a different subject from general file compression.

Saving the same .jpeg file over and over with any image editing software will drastically reduce the image quality (JPEG is like that by design), but compressing it with WinRAR a million times won't.
However, the size reduction will be pretty small.
>>
>>61137114
>superior compression

That'll be BASE64.

The "64" is because it tries the top 64 different algorithms on each block, using the most efficient each time.
>>
>>61137114
LZMA in the 7zip container
>>
>>61137114
your own custom version of Zlib
>>
>>61137741
>>61137671
Lossless compression (FLAC, 7z, PNG) differs from lossy compression (MP3, JPG)
>>
>>61137480

Computerphile has a great set of videos about compression that go into a lot more detail than the anon who replied to you:

https://www.youtube.com/watch?v=Lto-ajuqW3w

here's a really good one about jpg compression too, which is a totally different ballgame: https://www.youtube.com/watch?v=Q2aEzeMDHMA
>>
>>61137671
The brief history of ZIP:
At first there were two major ideas in compression:
- one (as told earlier): marking repeated sequences
- and two: Huffman coding https://en.wikipedia.org/wiki/Huffman_coding
In short, data is usually structured in bytes (8-bit groups), but if some characters are far more frequent, like an 'a' present in the data in vast amounts, we can mark it not by its 8-bit signature but by a 3- or 4-bit one (example: 1110), less than half its original size. This leads to variable-length markings, which have to be chosen carefully: say 'b' is marked 1111 and 'c' is marked 11110; we would never detect 'c', because 11110 starts with 1111, which is 'b', and the trailing 0 would be read as the start of the next character mark. For this reason there is a fixed Huffman coding table, where the most used characters (and sequence markings) get 7-bit signatures and the least used get 9-bit ones, but it's also possible to build a specialized Huffman table for each piece of data. A file doesn't necessarily consist of a single compression block, either: each block of compressed data can have its own Huffman table. https://en.wikipedia.org/wiki/DEFLATE
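A minimal sketch of that construction in Python, using the textbook greedy Huffman algorithm (illustrative only; this builds a per-input table, not DEFLATE's fixed one):

import heapq
from collections import Counter

def huffman_code(text):
    # One weighted leaf per symbol; repeatedly merge the two lightest
    # subtrees, prepending one bit to every code inside each.
    heap = [[freq, [sym, ""]] for sym, freq in Counter(text).items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        lo, hi = heapq.heappop(heap), heapq.heappop(heap)
        for pair in lo[1:]:
            pair[1] = "0" + pair[1]
        for pair in hi[1:]:
            pair[1] = "1" + pair[1]
        heapq.heappush(heap, [lo[0] + hi[0]] + lo[1:] + hi[1:])
    return dict(heap[0][1:])

print(huffman_code("aaaaaaaabbbc"))   # e.g. {'a': '1', 'b': '01', 'c': '00'}

The result is prefix-free (no code is the start of another), which is exactly what avoids the 1111 vs 11110 ambiguity described above.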
Also there are LZ77 and LZ78:
https://en.wikipedia.org/wiki/LZ77_and_LZ78
LZ77 is the one inside ZIP's DEFLATE. It's basically a moving search window: it sets how far back the algorithm may look in the already-seen data for repeating sequences. By default this is a 32 KB window (legacy reasons; this is the parameter most modern compression software raises, like WinRAR and 7z, and it's quite surprising how little raising it improves efficiency).
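A toy version of that search window in Python ("window" below plays the role of the look-back distance; the brute-force scan is for clarity, where real encoders use hash chains, and DEFLATE also caps match lengths):

def lz77_tokens(data, window=32 * 1024, min_match=3):
    i, out = 0, []
    while i < len(data):
        best_len, best_off = 0, 0
        # look back at most `window` bytes for the longest match
        for j in range(max(0, i - window), i):
            length = 0
            while i + length < len(data) and data[j + length] == data[i + length]:
                length += 1
            if length > best_len:
                best_len, best_off = length, i - j
        if best_len >= min_match:
            out.append(("match", best_off, best_len))   # (distance back, length)
            i += best_len
        else:
            out.append(("lit", data[i]))                # literal byte
            i += 1
    return out

print(lz77_tokens(b"abcabcabcabc"))
# [('lit', 97), ('lit', 98), ('lit', 99), ('match', 3, 9)]
# note the match is longer than its distance: it overlaps itself, which LZ77 allows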
Then there is the ZIP64 extension, a small patch to the file structure that lifts the 32-bit size limitation on files and related directory limits.

damn character limitations...
>>
>>61137382
But the rotational velocidensity will get out of whack
>>
>>61138395
A few years ago I worked on building an in-house compression system for our company, optimized for our data structures. We started by understanding ZIP... boy, it was a long run, but worth every minute.
I'd like to give some spotlight to two major players in what ZIP is today.
First is its maker, Phil Katz:
https://en.wikipedia.org/wiki/Phil_Katz
He was the one who first put the ZIP file format together, with amazing insight into forward and backwards compatibility.
The other is Mark Adler, who together with Jean-loup Gailly made the "zlib" library, used all around the world from games to business software; he still wanders around the internet helping people understand how ZIP really works:
https://stackoverflow.com/questions/20762094/how-are-zlib-gzip-and-zip-related-what-do-they-have-in-common-and-how-are-they
>>
Why does adding one tiny text file to an existing zip archive take a million billion years to complete?
>>
>>61138821
Well, it depends on the software; some redo the whole archive even for the slightest change.
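For what it's worth, Python's standard zipfile module can append ('a' mode) without recompressing what's already there; archive.zip and note.txt are placeholder names:

import zipfile

# 'a' mode appends the new entry and rewrites only the central directory
# at the end of the file; existing compressed entries are left untouched.
with zipfile.ZipFile("archive.zip", "a", compression=zipfile.ZIP_DEFLATED) as zf:
    zf.write("note.txt")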
>>
>>61137114
Depends on what you mean by superior.
Size reduction?
Compression speed? (trust me, this matters more than you think it would, especially for large files)
Decompression speed? (doesn't take as much CPU, but it is a metric)
Something else?

Different algorithms do different things.
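A rough way to measure those trade-offs yourself, comparing zlib levels in Python (big_input.bin is a placeholder path):

import time, zlib

data = open("big_input.bin", "rb").read()       # placeholder input
for level in (1, 6, 9):
    t0 = time.perf_counter()
    packed = zlib.compress(data, level)
    print(f"level {level}: {len(packed)} bytes, {time.perf_counter() - t0:.2f}s")

Typically the higher levels trade noticeably more CPU time for somewhat smaller output.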
>>
>>61137124
>>61137155
>>61137819
the answer is LZMA, which yields the smallest file sizes yet still decompresses faster than bzip2. 7zip uses LZMA in the traditional Windows archive-file way, and xz uses LZMA in the traditional Unix way
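Python even ships LZMA in its standard library, and by default lzma.compress emits the .xz container, i.e. the Unix flavor mentioned above:

import lzma

data = b"some repetitive payload " * 1000
packed = lzma.compress(data)              # .xz container, LZMA2 filter by default
assert lzma.decompress(packed) == data    # lossless round trip
print(len(data), "->", len(packed))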
>>
use 7-zip and tell WinRAR to go fuck itself

use lzma

but really, just play around until you get the size you want.
>>
>>61137611
>just throw more hardware at it XD
do you work for Currysoft by any chance?
>>
>>61140065

Why so sour, bro? What did WinRAR do to you?

>>61140230

You do realize that even if he were to compress all his shit, he would still need to throw more hardware at it in order to download any more, right?

If he's compressing just to compress, then he's not gaining any space. Videos and images are already as well compressed as they can be as distributed, and I doubt he has 1-3 TB worth of text files.

I'm not saying to buy new hardware. I'm saying don't waste time compressing videos/images/pretty much anything, because it's already compressed. Unless you and he just want to waste time and hard drive lifespan on excessive, unnecessary reads/writes and the time it takes to read and write compressed files.
>>
>>61137114
Hash the file.
Whenever you want it back just brute force it.
>>
>>61141221
That is very unreliable. Hashes can produce duplicates.
>>
>>61141236
Yeah, but chances are most collisions you find will be random bullshit, or at least nothing close to the original file.

Good luck "decompressing" anything over any meaningful size tho.
>>
>>61137790
But then you have to store an extra 6 bits of information at the beginning of each block so that you can tell what each block is compressed with.