An idea that has been brewing for a long time: in image compression, wouldn't a processing pass that subtracts a coarse approximation of the image from the actual image allow for better compression? Is this used in any of the image compression formats in actual use? If not, why not? Why does e.g. JPEG use only tiny local windows/blocks for processing?
Lossy video compression already does this: it creates key frames (I-frames) at regular intervals, and the frames between them (P-frames) are encoded as differences from previously decoded frames. Some codecs also use B-frames, which can depend on multiple reference frames, including future ones.
E.g. a 3-second 320x240 video at 25 fps has 75 frames; with a key frame every second, frames 1, 26, and 51 are stored in full, while the in-between frames only encode the changes and end up much smaller.
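The key-frame-plus-differences scheme can be sketched in a few lines. This is a hypothetical toy (frame contents, sizes, and the single moving patch are made up for illustration), not any real codec:

```python
import numpy as np

# Toy key-frame + delta coding: store frame 0 in full, then only
# per-pixel differences for the frames that follow.
rng = np.random.default_rng(0)
frames = [rng.integers(0, 256, (240, 320)).astype(np.int16)]
for _ in range(24):                    # one second at 25 fps, minus the key frame
    nxt = frames[-1].copy()
    nxt[100:110, 100:110] += 1         # tiny change: most pixels stay identical
    frames.append(nxt)

key = frames[0]
deltas = [b - a for a, b in zip(frames, frames[1:])]

# Decoding replays the deltas on top of the key frame.
decoded = key.copy()
for d in deltas:
    decoded = decoded + d

# Each delta is almost entirely zeros (here only 100 of 76800 pixels
# change per frame), so it entropy-codes to almost nothing.
```

The deltas carry the same information as the full frames but are overwhelmingly zero, which is exactly why the intermediate frames compress so much better than the key frames.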
Now, this technique might work for still images too: have a large image that roughly approximates the original, plus a smaller residual that corrects the errors. Or, even better, multiple smaller images that correct local errors?
I suggest we create a startup immediately. Someone might already have a cool video explaining how their compression algorithm is going to change the web and the world.
Yea! To get the js-css-trickster folks on board, the first version 'obviously' works by splitting an image into its shadow and non-shadow parts offline. The result gets recombined in the browser by overlaying them with some fancy blending modes.
Some micro-benchmarking should be able to show that the two shadows ('obviously' JPEG-compressed) plus the CSS and JS put together are smaller than a plain JPEG of the input image...
Go! Go! Go! Before Hooli steals the idea... ;)
JPEG is quite old. Some of the best still-image compressors today are the I-frame coders inside video codecs. Daala, for example, uses adaptive block sizes and inter-block prediction.
Lots of compression techniques use variations of this idea. For example, all wavelet codecs (e.g. JPEG 2000) effectively do this: they store a coarse approximation of the image plus successively finer detail bands.
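One level of the wavelet decomposition behind such codecs can be shown with the simplest wavelet, Haar. A hand-rolled sketch (JPEG 2000 actually uses other filters, e.g. the 5/3 and 9/7 wavelets, so this only illustrates the principle):

```python
import numpy as np

def haar2d(img):
    """Single-level 2-D Haar split of an even-sized image into a
    half-resolution approximation plus three detail bands."""
    a = img[0::2, 0::2]; b = img[0::2, 1::2]
    c = img[1::2, 0::2]; d = img[1::2, 1::2]
    ll = (a + b + c + d) / 4          # coarse approximation
    lh = (a + b - c - d) / 4          # horizontal detail
    hl = (a - b + c - d) / 4          # vertical detail
    hh = (a - b - c + d) / 4          # diagonal detail
    return ll, lh, hl, hh

def inv_haar2d(ll, lh, hl, hh):
    """Inverse transform: perfect reconstruction of the original."""
    h, w = ll.shape
    out = np.empty((2 * h, 2 * w))
    out[0::2, 0::2] = ll + lh + hl + hh
    out[0::2, 1::2] = ll + lh - hl - hh
    out[1::2, 0::2] = ll - lh + hl - hh
    out[1::2, 1::2] = ll - lh - hl + hh
    return out
```

Recursing on the `ll` band gives the multi-resolution pyramid: exactly the "coarse image plus corrections" structure asked about in the question, with the detail bands playing the role of the residuals.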
As to why it wasn't done for JPEG and its contemporaries? Mostly the cost of hardware implementations, in both compute and memory. Think of all the cheap cameras that used JPEG: 8x8 block coding and an integer approximation of the discrete cosine transform are what made them feasible.
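The reason 8x8 blocks are so cheap is that the whole transform reduces to two fixed 8x8 matrix multiplies per block. A sketch of the 2-D DCT-II used by JPEG (floating point here for clarity; real encoders use scaled-integer versions of the same matrix):

```python
import numpy as np

N = 8
k = np.arange(N)
# Orthonormal DCT-II basis matrix: row k samples a cosine of frequency k.
C = np.sqrt(2.0 / N) * np.cos(np.pi * (2 * k[None, :] + 1) * k[:, None] / (2 * N))
C[0] /= np.sqrt(2)

block = np.arange(64, dtype=float).reshape(8, 8)  # stand-in 8x8 pixel block

coeffs = C @ block @ C.T       # forward 2-D DCT: two small matrix multiplies
restored = C.T @ coeffs @ C    # inverse: C is orthogonal, so C.T undoes it

# For smooth blocks, nearly all the energy lands in the top-left
# (low-frequency) coefficients -- which is what JPEG's quantization exploits.
```

Everything is a fixed-size, constant-memory operation, which is exactly what a late-80s ASIC or a cheap camera SoC could afford; a whole-image transform would have needed buffering the entire frame.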
The problem with this approach is that it does not correspond to the way the brain compresses/processes images. A coarse image gains a lot of extra "features" (e.g. block edges), which the brain then processes as actual features. Going to a coarser representation should instead remove features, not add them.