If I remember correctly, it is very popular in video games because it is faster to load compressed assets from disk and decompress them in memory than loading the uncompressed assets from disk, even on an SSD.
It is used, and so is nearly every other compression technique that has ever existed, somewhere in video games.
I was thinking of using LZ4, but it doesn't really work that well on floating point data, and images are already compressed (PNG, JPEG, and even BCn can't be compressed much further). So idk. The good thing about LZ4 is that it's very simple[0] and decompression is probably faster than memcpy().
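The "already compressed" point is easy to demonstrate with a quick sketch. This uses stdlib zlib as a stand-in for LZ4 (the lz4 bindings are a third-party package); the trend is the same for any general-purpose compressor:

```python
# Redundant data (texture-atlas-style repetition) compresses well;
# high-entropy data (a stand-in for PNG/JPEG payloads) does not.
import os
import zlib

repetitive = b"grass_tile_" * 10_000    # highly redundant bytes
high_entropy = os.urandom(110_000)      # stands in for already-compressed data

for name, data in [("repetitive", repetitive), ("high-entropy", high_entropy)]:
    ratio = len(zlib.compress(data)) / len(data)
    print(f"{name}: {ratio:.2f}x of original size")
```

The repetitive buffer shrinks to a tiny fraction of its size, while the random buffer stays at roughly 1.0x, i.e. compressing it again buys you nothing.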
Yes, it depends on content and won't do much for grainy surfaces. I don't have AAA-quality textures to compare, but I think for a typical textured model it is still worthwhile. E.g. this https://milek7.pl/.stuff/somelocotex.png weighs 16MiB uncompressed, 5.1MiB as PNG, 5.3MiB as DXT5, and 2.1MiB as DXT5-in-LZ4. (mipmaps included in the DXT5 figures)
>PS I think that BC could be massaged at compression time to be more compressible; I think I read something about that. Don't remember.
EDIT: The RAD page states that there are "BC7 blocks that are often very difficult to compress", so that might also be a factor in why my DXT5 test compressed much better than your BC7.
EDIT2: Yeah, with BC7 LZ4 only compresses it down to 4.6MiB.
If you really want to impress your customers, use SQLite to aggregate LZ4-compressed entities. In many AAA games, there can be hundreds of thousands of assets to load & keep track of. If you have to load an entire scene from disk, you could write a simple SQL query to select all assets assigned to the scene (i.e. a SceneAssets mapping table) and then stream them all into memory from the unified database file handle.
The best approach I can think of is to have 1 very small table that is just scene_id + asset_id, and then 1 very humongous table that is asset_id+blob.
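A minimal sketch of that two-table layout using stdlib sqlite3 (table and column names are my own invention, and the blobs would really be LZ4-compressed assets):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # on disk this is the shipped asset file
conn.executescript("""
    CREATE TABLE scene_assets (      -- the very small mapping table
        scene_id INTEGER NOT NULL,
        asset_id INTEGER NOT NULL,
        PRIMARY KEY (scene_id, asset_id)
    );
    CREATE TABLE assets (            -- the very humongous blob table
        asset_id INTEGER PRIMARY KEY,
        data     BLOB NOT NULL       -- e.g. LZ4-compressed texture bytes
    );
""")

# Author-time: register two assets and assign both to scene 1.
conn.executemany("INSERT INTO assets VALUES (?, ?)",
                 [(10, b"\x00" * 64), (11, b"\x01" * 64)])
conn.executemany("INSERT INTO scene_assets VALUES (?, ?)",
                 [(1, 10), (1, 11)])

# Load-time: one query streams every blob the scene needs.
rows = conn.execute("""
    SELECT a.asset_id, a.data
    FROM scene_assets s JOIN assets a USING (asset_id)
    WHERE s.scene_id = ?
    ORDER BY a.asset_id
""", (1,)).fetchall()
print([asset_id for asset_id, _ in rows])  # → [10, 11]
```

Iterating over the cursor instead of fetchall() would let you decompress each blob as it arrives rather than holding everything twice in memory.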
You could further optimize by profiling access patterns during QA testing. There wouldn't be one globally ideal ordering of assets once multiple scenes use varying subsets, but you could certainly group the most commonly co-used assets together via implicit insert ordering during creation. This would help minimize the total number of filesystem block accesses required.
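The insert-ordering idea could look something like this sketch (the access counts are made-up QA profiling output; in the real pipeline you would insert the blobs into the asset table in this order so hot assets land in adjacent database pages):

```python
# Hypothetical profiling output: how often QA runs loaded each asset.
access_counts = {"rock.dds": 3, "grass.dds": 41, "hero.dds": 97, "cloud.dds": 41}

# Insert hottest assets first; ties broken by name for a deterministic order.
insert_order = sorted(access_counts, key=lambda a: (-access_counts[a], a))
print(insert_order)  # → ['hero.dds', 'cloud.dds', 'grass.dds', 'rock.dds']
```

A fancier version would cluster by co-access (assets loaded by the same scenes) rather than raw frequency, but the mechanism is the same: the order you INSERT is, roughly, the order the blobs sit on disk.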
I think one other important IO trick is to make sure you VACUUM the SQLite database before you publish it. Presumably these files are read-only once authored in this context, so you pay the cost once: VACUUM clears out empty pages and defragments the overall file.
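A quick sketch of that publish step: delete some blobs to leave free pages on the freelist (as an authoring pipeline naturally would), then VACUUM and watch the file shrink. Paths and sizes here are illustrative:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "assets.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE assets (asset_id INTEGER PRIMARY KEY, data BLOB)")
conn.executemany("INSERT INTO assets VALUES (?, ?)",
                 [(i, b"x" * 4096) for i in range(256)])
conn.commit()

conn.execute("DELETE FROM assets WHERE asset_id % 2 = 0")  # leaves free pages
conn.commit()
before = os.path.getsize(path)   # file keeps its size; pages sit on freelist

conn.execute("VACUUM")           # rewrites the file, dropping freed pages
after = os.path.getsize(path)
conn.close()
print(after < before)  # → True
```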