
Compression algorithms work by ignoring redundant and repetitive

Thread replies: 9
Thread images: 2

File: you right now.jpg (11KB, 225x225px)
Compression algorithms work by ignoring redundant and repetitive data, right?

Can someone remind me how much you can compress images and videos? I can't recall how those worked.
>>
it depends
>>
It's almost the same way: by ignoring multiple copies of the same video frame and only saving the differences between frames.
>>
>>58468688
What about static images?
>>
>>58468727
Either exploit perfectly repeating patterns and a limited number of unique color values in screenshots and pixel art-ish images for lossless compression (like PNG).
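
A rough sketch of the lossless route, assuming a made-up 8-bit grayscale gradient as the test image. It applies a PNG-style "Sub" filter (store each byte as the difference from its left neighbour) and then hands the result to DEFLATE via Python's `zlib`. Real PNG also stores a filter-type byte per row and can pick other filters, so treat this as an illustration and not as a PNG encoder.

```python
import zlib

def sub_filter(row: bytes) -> bytes:
    """PNG-style 'Sub' filter, 1 byte per pixel: each byte minus its left neighbour (mod 256)."""
    out = bytearray(len(row))
    prev = 0
    for i, b in enumerate(row):
        out[i] = (b - prev) % 256
        prev = b
    return bytes(out)

def sub_unfilter(filtered: bytes) -> bytes:
    """Undo sub_filter by adding the differences back up."""
    out = bytearray(len(filtered))
    prev = 0
    for i, d in enumerate(filtered):
        prev = (prev + d) % 256
        out[i] = prev
    return bytes(out)

# Hypothetical test image: 16 rows, each a smooth gradient of 256 grey levels.
WIDTH = 256
rows = [bytes(range(WIDTH)) for _ in range(16)]
raw = b"".join(rows)
filtered = b"".join(sub_filter(r) for r in rows)

packed_raw = zlib.compress(raw, 9)
packed_filtered = zlib.compress(filtered, 9)

# Decoding: inflate, then undo the filter row by row.
inflated = zlib.decompress(packed_filtered)
restored = b"".join(sub_unfilter(inflated[i:i + WIDTH])
                    for i in range(0, len(inflated), WIDTH))

print(len(raw), len(packed_raw), len(packed_filtered))
```

After filtering, the gradient turns into long runs of the same small value, which is exactly the kind of repetition the general-purpose compressor is good at.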

Or, transform the image into some representation where it's easier to tell what parts are important for humans, drop the less important stuff (usually based on some quality parameter) and try to encode what remains afterwards as efficiently as possible, exploiting gaps and biases in the data.

For example, JPEG does this by splitting the image into small blocks, transforming them from pixels to a frequency-domain representation and storing very high frequency detail only coarsely, keeping larger features intact. The blocks are then encoded one by one, exploiting the facts that nearby blocks tend to be similar (delta compression of each block's average value), that there are now many zeroes in the frequency-domain data (run-length encoding) and that the remaining values' distribution is not even: some values are more likely than others, so it pays to encode them with fewer bits even if it makes the rarer ones require more (Huffman coding).
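
To make the last two tricks a bit more concrete, here is a toy sketch of run-length encoding the zeroes and then building a Huffman code over the result. The coefficient lists are invented for illustration, and the `(zero_run, value)` pairs with a `None` end-of-block marker are a simplification, not the actual JPEG bitstream format. The delta step is left out.

```python
import heapq
from collections import Counter

def run_length_zeros(values):
    """Turn a list into (zeros_skipped, nonzero_value) pairs.
    The trailing run of zeros collapses into one end-of-block marker (None)."""
    pairs, run = [], 0
    for v in values:
        if v == 0:
            run += 1
        else:
            pairs.append((run, v))
            run = 0
    pairs.append(None)  # everything after this point is zero
    return pairs

def run_length_decode(pairs, length):
    out = []
    for p in pairs:
        if p is None:
            break
        run, v = p
        out.extend([0] * run)
        out.append(v)
    out.extend([0] * (length - len(out)))
    return out

def huffman_code(symbols):
    """Build a prefix code {symbol: bitstring}; frequent symbols get shorter codes."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, unique tiebreak, {symbol: code so far}).
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        heapq.heappush(heap, (w1 + w2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

# Hypothetical quantized coefficients for three 64-value blocks.
blocks = [
    [26, -3, 0, 0, 2, 0, 0, 0, 0, 1] + [0] * 54,
    [25, 0, 0, 0, 0, 1] + [0] * 58,
    [26, -3] + [0] * 62,
]
stream = [pair for block in blocks for pair in run_length_zeros(block)]
code = huffman_code(stream)
bits = "".join(code[s] for s in stream)
print(len(bits), "bits for", sum(len(b) for b in blocks), "coefficients")
```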

Also, human eyes are more sensitive to contrast and brightness than color so the above is usually performed after breaking down the image into a full-resolution "luminance" component and two lower-res color layers that together encode the color information.
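
A small sketch of that split, assuming the JFIF-style full-range RGB to YCbCr conversion and 2x2 averaging for the color layers (often called 4:2:0). The 4x4 test image and the function names are made up for the example.

```python
def rgb_to_ycbcr(r, g, b):
    """JFIF-style full-range conversion: Y is brightness, Cb/Cr carry the color."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr

def subsample_2x2(plane):
    """Halve resolution in both directions by averaging each 2x2 block.
    Assumes even width and height."""
    h, w = len(plane), len(plane[0])
    return [[(plane[y][x] + plane[y][x + 1]
              + plane[y + 1][x] + plane[y + 1][x + 1]) / 4
             for x in range(0, w, 2)]
            for y in range(0, h, 2)]

# Hypothetical 4x4 RGB image: left half reddish, right half bluish.
image = [[(200, 40, 40) if x < 2 else (40, 40, 200) for x in range(4)]
         for y in range(4)]

converted = [[rgb_to_ycbcr(*px) for px in row] for row in image]
luma = [[px[0] for px in row] for row in converted]          # kept at full resolution
cb_small = subsample_2x2([[px[1] for px in row] for row in converted])
cr_small = subsample_2x2([[px[2] for px in row] for row in converted])

samples_before = 3 * 4 * 4
samples_after = (sum(len(r) for r in luma)
                 + sum(len(r) for r in cb_small)
                 + sum(len(r) for r in cr_small))
print(samples_before, "->", samples_after)
```

Half the samples are gone before any of the block transform stuff even starts, and it is the half people notice least.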

Video codecs do something like the above lossy compression to deliver a "keyframe" once in a while (so it's possible to seek in the video) and then encode the frames after a keyframe as, e.g., differences from the previous one(s), exploiting the fact that videos that aren't massively epilepsy inducing tend to have very high repetition in content between frames.
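
The keyframe plus differences idea can be sketched like this, assuming frames are flat lists of pixel values and that a keyframe is forced every few frames. It is a lossless toy: real codecs add motion compensation and compress the differences lossily. `encode_frames`, `decode_frames` and `keyframe_interval` are names invented for the example.

```python
def encode_frames(frames, keyframe_interval=4):
    """Store a full frame every `keyframe_interval` frames,
    and only the changed pixels (position, new value) in between."""
    encoded, prev = [], None
    for i, frame in enumerate(frames):
        if i % keyframe_interval == 0:
            encoded.append(("key", list(frame)))
        else:
            changes = [(pos, new) for pos, (new, old) in enumerate(zip(frame, prev))
                       if new != old]
            encoded.append(("delta", changes))
        prev = frame
    return encoded

def decode_frames(encoded):
    """Rebuild every frame by replaying the deltas on top of the last decoded frame."""
    frames, current = [], None
    for kind, payload in encoded:
        if kind == "key":
            current = list(payload)
        else:
            current = list(current)
            for pos, value in payload:
                current[pos] = value
        frames.append(current)
    return frames

# Hypothetical clip: a single bright dot moving across an 8-pixel line.
frames = []
for i in range(6):
    frame = [0] * 8
    frame[i] = 255
    frames.append(frame)

encoded = encode_frames(frames)
for kind, payload in encoded:
    print(kind, payload)
```

Seeking works because a player can jump to the nearest keyframe and replay only the deltas after it.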
>>
>>58468926
That was a very nice explanation. Could you elaborate a bit more on how "what's important for humans" is determined?

Also are these techniques used for things like zip files?
>>
>>58469149
Zip and others are general-purpose lossless compressors; they use other techniques that aren't domain-specific and are usually a bit more limited.
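
A quick way to see this for yourself, assuming Python's `zlib` module (DEFLATE, the same family of compression zip files normally use) as a stand-in for the zip tool. The sample inputs are made up: one very repetitive text and one blob of random bytes.

```python
import os
import zlib

text = b"the quick brown fox jumps over the lazy dog. " * 100
noise = os.urandom(len(text))  # random bytes have no redundancy to exploit

packed_text = zlib.compress(text)
packed_noise = zlib.compress(noise)

# Lossless: decompressing gives back exactly what went in.
unpacked_text = zlib.decompress(packed_text)
unpacked_noise = zlib.decompress(packed_noise)

print("text: ", len(text), "->", len(packed_text))
print("noise:", len(noise), "->", len(packed_noise))
```

The text shrinks massively, while the random bytes stay about the same size or get slightly bigger.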
>>
>>58469149
>Could you elaborate a bit more on how "what's important for humans" is determined?
Big things are more important than small ones and density of detail is more important than reproducing the details precisely: Everybody expects to see a slightly grainy pattern in asphalt, concrete, etc. but few care about where each grain is. JPEG sort of handles this through the cute little frequency transform it does to its 8x8 pixel blocks: When it keeps the low-frequency coefficients more precisely it preserves the overall appearance and color of things while the "quantization noise" from the crudely preserved high-frequency detail sort of looks like detail even when it's quite a bit off. The artifacts you can occasionally see near sharp edges in .jpgs are one weakness in this scheme.
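
If it helps, here is a rough sketch of that frequency transform on one 8x8 block: an orthonormal 2D DCT, then rounding each coefficient with a step size that grows with frequency. The `step` function is a made-up stand-in, not a real JPEG quantization table, and JPEG's usual subtract-128 level shift is skipped.

```python
import math

N = 8

def _scale(k):
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(pos, freq):
    return math.cos((2 * pos + 1) * freq * math.pi / (2 * N))

def dct2(block):
    """2D DCT-II: coeffs[u][v] is the amount of vertical frequency u, horizontal frequency v."""
    return [[_scale(u) * _scale(v) * sum(block[y][x] * _basis(y, u) * _basis(x, v)
                                         for y in range(N) for x in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coeffs):
    """Inverse transform: back from frequencies to pixels."""
    return [[sum(_scale(u) * _scale(v) * coeffs[u][v] * _basis(y, u) * _basis(x, v)
                 for u in range(N) for v in range(N))
             for x in range(N)] for y in range(N)]

def step(u, v):
    # Hypothetical step sizes: fine for low frequencies, coarse for high ones.
    return 4 + 6 * (u + v)

def quantize(coeffs):
    return [[round(coeffs[u][v] / step(u, v)) for v in range(N)] for u in range(N)]

def dequantize(q):
    return [[q[u][v] * step(u, v) for v in range(N)] for u in range(N)]

# Hypothetical block: a smooth ramp with a little "grain" on top.
block = [[100 + 5 * x + ((x * 7 + y * 13) % 5) for x in range(N)] for y in range(N)]

q = quantize(dct2(block))
approx = idct2(dequantize(q))
zeros = sum(row.count(0) for row in q)
worst = max(abs(approx[y][x] - block[y][x]) for y in range(N) for x in range(N))
print(zeros, "of 64 coefficients are zero; worst pixel error:", worst)
```

The ramp survives almost untouched because it lives in the finely stored low frequencies; the grain is what gets roughed up.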

Audio codecs do something like the above when they try to at least keep the overall sound level around each frequency band about the same when they can't reproduce the exact waveform. This also makes sense because the human cochlea is pretty much built to split sound into different frequencies and "feel" their intensities.

Zip files (like most things not intended to imperfectly reproduce some imperfectly recorded natural image, sound or shape) are lossless and can't fuck with the data; what goes in comes out the same way. These general purpose compressors usually use a rolling dictionary (a sliding window over recently seen data) to handle often-repeated sequences while being able to adapt to content, then use some probability scheme (chances of encountering a specific token in the dictionary) to encode references to this dictionary as efficiently as possible. They're not very efficient unless compressing something with lots of redundancy like human-readable text, though.
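
A toy version of the rolling dictionary part, in the style of LZ77: every position is either a literal byte or a "go back this far and copy this many bytes" reference. The token format, `window` and `min_match` are choices made for this sketch, and the probability-based coding of the tokens is not shown.

```python
def lz77_encode(data: bytes, window: int = 255, min_match: int = 3):
    """Greedy LZ77-style encoder.
    Tokens are ('lit', byte) or ('ref', distance_back, length)."""
    tokens, i = [], 0
    while i < len(data):
        best_len, best_dist = 0, 0
        # Brute-force search of the recent window for the longest match.
        for j in range(max(0, i - window), i):
            length = 0
            while (i + length < len(data) and length < 255
                   and data[j + length] == data[i + length]):
                length += 1
            if length > best_len:
                best_len, best_dist = length, i - j
        if best_len >= min_match:
            tokens.append(("ref", best_dist, best_len))
            i += best_len
        else:
            tokens.append(("lit", data[i]))
            i += 1
    return tokens

def lz77_decode(tokens) -> bytes:
    out = bytearray()
    for token in tokens:
        if token[0] == "lit":
            out.append(token[1])
        else:
            _, dist, length = token
            # Copy byte by byte so a reference may overlap its own output.
            for _ in range(length):
                out.append(out[-dist])
    return bytes(out)

sample = b"abcabcabcabc"
tokens = lz77_encode(sample)
print(tokens)
print(lz77_decode(tokens) == sample)
```

Real compressors find matches with hash tables instead of this brute-force search, but the output has the same shape.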
>>
File: images.jpg (10KB, 259x194px)
>>58469796
Although I cannot even begin to imagine how that would look in code, the way you explained it was very clear.

Thank you. It was very useful.


This is a 4chan archive - all of the content originated from that site.