If I remember correctly, it is very popular in video games because it is faster to load compressed assets from disk and decompress them in memory than loading the uncompressed assets from disk, even on an SSD.
It is used, and so is nearly every other compression technique that has ever existed, somewhere in video games.
I was thinking of using LZ4, but it doesn't really work that well on floating point data, and images are already compressed (PNG, JPEG, and even BCn can't be compressed much further). So idk. The good thing about LZ4 is that it's very simple[0] and decompression is probably faster than memcpy().
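The "already compressed" point is easy to demonstrate with a quick sketch. This uses stdlib zlib as a stand-in for LZ4 (the lz4 bindings are a third-party package); the trend is the same for any general-purpose compressor:

```python
# Redundant data (texture-atlas-style repetition) compresses well;
# high-entropy data (a stand-in for PNG/JPEG payloads) does not.
import os
import zlib

repetitive = b"grass_tile_" * 10_000    # highly redundant bytes
high_entropy = os.urandom(110_000)      # stands in for already-compressed data

for name, data in [("repetitive", repetitive), ("high-entropy", high_entropy)]:
    ratio = len(zlib.compress(data)) / len(data)
    print(f"{name}: {ratio:.2f}x of original size")
```

The repetitive buffer shrinks to a tiny fraction of its size, while the random buffer stays at roughly 1.0x, i.e. compressing it again buys you nothing.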
Yes, it depends on content and won't do much for grainy surfaces. I don't have AAA-quality textures to compare, but I think for a typical textured model it is still worthwhile. E.g. this https://milek7.pl/.stuff/somelocotex.png weighs 16MiB uncompressed, 5.1MiB as PNG, 5.3MiB as DXT5, and 2.1MiB as DXT5-in-LZ4. (mipmaps included in the DXT5 figures)
>PS I think that BC could be massaged at compression time to be more compressible; I think I read something about that. Don't remember.
EDIT: The RAD page states that there are "BC7 blocks that are often very difficult to compress", so that might also be a factor in why my DXT5 test compressed much better than your BC7.
EDIT2: Yeah, with BC7 LZ4 only compresses it down to 4.6MiB.
If you really want to impress your customers, use SQLite to aggregate LZ4-compressed entities. In many AAA games, there can be hundreds of thousands of assets to load & keep track of. If you have to load an entire scene from disk, you could write a simple SQL query to select all assets assigned to the scene (i.e. a SceneAssets mapping table) and then stream them all into memory from the unified database file handle.
The best approach I can think of is to have 1 very small table that is just scene_id + asset_id, and then 1 very humongous table that is asset_id+blob.
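A minimal sketch of that two-table layout using stdlib sqlite3 (table and column names are my own invention, and the blobs would really be LZ4-compressed assets):

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # on disk this is the shipped asset file
conn.executescript("""
    CREATE TABLE scene_assets (      -- the very small mapping table
        scene_id INTEGER NOT NULL,
        asset_id INTEGER NOT NULL,
        PRIMARY KEY (scene_id, asset_id)
    );
    CREATE TABLE assets (            -- the very humongous blob table
        asset_id INTEGER PRIMARY KEY,
        data     BLOB NOT NULL       -- e.g. LZ4-compressed texture bytes
    );
""")

# Author-time: register two assets and assign both to scene 1.
conn.executemany("INSERT INTO assets VALUES (?, ?)",
                 [(10, b"\x00" * 64), (11, b"\x01" * 64)])
conn.executemany("INSERT INTO scene_assets VALUES (?, ?)",
                 [(1, 10), (1, 11)])

# Load-time: one query streams every blob the scene needs.
rows = conn.execute("""
    SELECT a.asset_id, a.data
    FROM scene_assets s JOIN assets a USING (asset_id)
    WHERE s.scene_id = ?
    ORDER BY a.asset_id
""", (1,)).fetchall()
print([asset_id for asset_id, _ in rows])  # → [10, 11]
```

Iterating over the cursor instead of fetchall() would let you decompress each blob as it arrives rather than holding everything twice in memory.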
You could further optimize by profiling access patterns during QA testing. There wouldn't be one globally ideal ordering of assets once multiple scenes use varying subsets, but you could certainly group the most commonly co-used assets together via implicit insert ordering during creation. This would help minimize the total number of filesystem block accesses required.
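The insert-ordering idea could look something like this sketch (the access counts are made-up QA profiling output; in the real pipeline you would insert the blobs into the asset table in this order so hot assets land in adjacent database pages):

```python
# Hypothetical profiling output: how often QA runs loaded each asset.
access_counts = {"rock.dds": 3, "grass.dds": 41, "hero.dds": 97, "cloud.dds": 41}

# Insert hottest assets first; ties broken by name for a deterministic order.
insert_order = sorted(access_counts, key=lambda a: (-access_counts[a], a))
print(insert_order)  # → ['hero.dds', 'cloud.dds', 'grass.dds', 'rock.dds']
```

A fancier version would cluster by co-access (assets loaded by the same scenes) rather than raw frequency, but the mechanism is the same: the order you INSERT is, roughly, the order the blobs sit on disk.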
I think one other important IO trick is to make sure you VACUUM the SQLite database before you publish it. Presumably these files are read-only once authored in this context, so you pay the cost once: VACUUM clears out empty pages and defragments the overall file.
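A quick sketch of that publish step: delete some blobs to leave free pages on the freelist (as an authoring pipeline naturally would), then VACUUM and watch the file shrink. Paths and sizes here are illustrative:

```python
import os
import sqlite3
import tempfile

path = os.path.join(tempfile.mkdtemp(), "assets.db")
conn = sqlite3.connect(path)
conn.execute("CREATE TABLE assets (asset_id INTEGER PRIMARY KEY, data BLOB)")
conn.executemany("INSERT INTO assets VALUES (?, ?)",
                 [(i, b"x" * 4096) for i in range(256)])
conn.commit()

conn.execute("DELETE FROM assets WHERE asset_id % 2 = 0")  # leaves free pages
conn.commit()
before = os.path.getsize(path)   # file keeps its size; pages sit on freelist

conn.execute("VACUUM")           # rewrites the file, dropping freed pages
after = os.path.getsize(path)
conn.close()
print(after < before)  # → True
```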