FLAC 1.3.4 (xiph.org)
176 points by libele on Feb 23, 2022 | hide | past | favorite | 162 comments


Shoutout to LossyWAV (formerly LossyFLAC) [1]. It's a preprocessor for lossless codecs that shapes noise so that higher compression ratios can be reached. It works much better than it sounds.

1. https://wiki.hydrogenaud.io/index.php?title=LossyWAV


Reminds me of https://pngquant.org/

Same concept, just for making images compress more efficiently as PNG



> It's a preprocessor to lossless codecs shaping noise...

So it's a lossy compression? Why use FLAC then?


Other lossy compression methods generally happen in the frequency domain and, depending on the type of music, can introduce audible distortions.

This preprocessor on the other hand throws away some of the least significant bits to save data. This increases the quantization noise but introduces no other sort of artifacts. The quantization noise can be dithered to fall in the higher frequencies (noise shaping) and is generally not perceptible.
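The idea can be sketched in a few lines. This is a conceptual illustration only, not LossyWAV's actual algorithm (which allocates bits adaptively per block): plain dithered requantization to a coarser grid, so the discarded low bits become zeros that FLAC's residual coder compresses essentially for free.

```python
import random

def truncate_lsbs(samples, keep_bits, total_bits=16):
    """Conceptual sketch (not LossyWAV itself): requantize signed 16-bit
    PCM samples to `keep_bits` of precision with TPDF dither, so the
    discarded low-order bits are all zero and compress trivially."""
    step = 1 << (total_bits - keep_bits)          # quantization step size
    out = []
    for s in samples:
        # triangular (TPDF) dither: sum of two uniforms, spanning +/- step
        d = random.uniform(-step / 2, step / 2) + random.uniform(-step / 2, step / 2)
        q = int(round((s + d) / step)) * step     # round to the coarser grid
        out.append(max(-32768, min(32767, q)))    # clamp to the 16-bit range
    return out
```

Every output sample is a multiple of `step`, and the error stays bounded by the dither plus half a step, which is exactly the "more quantization noise, no other artifacts" tradeoff described above.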


Doesn't 16 bit audio already require dithering to capture the full audible dynamic range? That makes me worry somewhat about cutting more bits and layering dithers on dithers.


While it's good practice, a 16-bit recording is usually not exactly ruined by not dithering it. 96dB dynamic range is a lot.


You always want dithering no matter the bit depth, but 16-bit is actually quite a lot - there's literally no point in better than CD-quality audio. (Except of course that real life is in surround not stereo.)

But yes, if you're doing it repeatedly you'd want an un-dithering filter. Noise reduction tends to do this by accident but it helps if you know what the dither shape was.


Sounds more like trellis quantization in near-lossless H.264, explicitly trading coding cost of details against psychovisual impact of said details.


That's psy-rd (in x264 terms).

Trellis quantization is just a more optimal way to divide numbers - think of it like rounding to nearest instead of down. "optimal" means "optimal rate-distortion tradeoff" and "distortion" means whatever you want it to, but usually it's difference between original and compressed pixels (absolute error/SAD/PSNR).

That can look blurry, because given all alternatives with the same SAD, blurry ones compress more. So psy-rd changes the definition of distortion to add a "has similar amount of noise" factor. That's very far from human optimal (if anything it's SSIM optimal) but it's free detail. Uses the same quantization to get there, though.
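The "rounding to nearest instead of down" intuition can be made concrete with a toy quantizer (hypothetical code, nothing to do with the actual x264 trellis search, which also weighs coding cost):

```python
def quantize(coeffs, q, mode="floor"):
    """Toy quantizer: map coefficients to integer levels with step q.
    'floor' mimics a naive encoder dividing down; 'round' picks the
    nearest level, which is the direction trellis-style RD decisions go."""
    if mode == "floor":
        return [int(c // q) for c in coeffs]
    return [round(c / q) for c in coeffs]

def distortion(coeffs, levels, q):
    # sum of absolute reconstruction error (SAD-style distortion)
    return sum(abs(c - l * q) for c, l in zip(coeffs, levels))
```

With step 8 and coefficients [10, 25, 37], flooring gives levels [1, 3, 4] while rounding gives [1, 3, 5], and the rounded reconstruction has lower total error.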


It's just a hobby project, and the structure isn't too different from how other lossy codecs work. (Or lossless ones - you can construct one of those from any lossy one just by sticking the difference from the original on the end.)

Most codecs sacrifice transparency to reach a bitrate and this one does the opposite.


> Most codecs sacrifice transparency to reach a bitrate and this one does the opposite.

That's what LAME's presets and the Ogg Vorbis quality settings do as well, isn't it?


Yep, but codecs tend to have maximum bitrates either because of design tradeoffs or to work with hardware decoders. MP3's is too small to be perfectly transparent on some things, like cymbals.


> It works out much better than that sounds like.

That... doesn't sound good ;)

Really cool idea, though more comparable to lossy compression I guess.


This reduces the bitrate by a factor of about 3. Even at the lowest setting, around 300 kbps, I can't hear the slightest difference from the original. Impressive, but I wonder what it would sound like if you reduced the bitrate even further. Would it be competitive with MP3?


I mean, you don't need FLAC to get audio quality that people can't distinguish in a double-blind test. Something like 160 kbps (iirc, ±30) Opus does the job as well.

(MP3 is way less efficient, and depending on the encoder it can have characteristic artifacts. Vorbis was already better, but Opus blows that away. Software support for Opus has recently become nearly as good as MP3's. We should either stop considering MP3 the default lossy format or hack it into a container format so audio files can have Opus inside while carrying the familiar mp3 extension for the 90s folk to be happy about.)


Ogg was merely a (very bad) container, you're probably thinking about Vorbis.


Ah yes, I keep mixing those up. Edited, thanks.


This at 300kbps vs opus 300kbps? mp3 is not a good comparison target


Since the decoder complexity of FLAC is much lower than that of Opus, LossyWAV easily wins this competition at 300 kbps. I just wonder what the compression artifacts with LossyWAV would sound like if you pushed the bitrate down further (to e.g. 128 kbps). Sadly the command line tool does not allow that.


Would be interesting. Unlikely to work out, though. On another note, Opus at 128 kbps is transparent (in my tests I can distinguish 100 kbps files at most), so with a safety margin 256 kbps should be enough for all practical purposes (except archiving).

There are worse codecs around, e.g. LDAC (the best Bluetooth codec): 660 kbps vs 990 kbps is very noticeable (I can't set up a comparison the same way between 990 kbps LDAC and 128 kbps Opus).


What are you doing with it that opus decoding complexity is any kind of barrier?


How cheap does storage have to get before people stop bothering to use FLAC?

By my calculations, $65 will buy you a 4TB hard drive which can hold over 6,000 hours of uncompressed CD quality PCM. With FLAC you might squeeze 9,000-12,000 hours in. But, is that really worth the bother?
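The arithmetic checks out; a quick sketch (the ~60% FLAC ratio below is a typical figure for music, not a guarantee):

```python
# CD audio: 44.1 kHz * 2 channels * 2 bytes per sample.
bytes_per_sec = 44_100 * 2 * 2              # 176,400 B/s (~1411 kbps)
drive = 4 * 10**12                          # a 4 TB drive
hours_wav = drive / bytes_per_sec / 3600    # uncompressed PCM hours
hours_flac = hours_wav / 0.6                # assuming FLAC averages ~60% of original
print(round(hours_wav), round(hours_flac))  # roughly 6,300 vs 10,500 hours
```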

I know it’s not obviously relevant to consumers, but an uncompressed PCM .wav file is a format that any high school coder can figure out completely. FLAC on the other hand has had decades of work put into it. It is an engineering marvel. I just wish people would think simpler than “engineering marvel” when it comes to archival formats when the simplest conceivable format works just about as well… :$

Edit: Many people are bringing up MP3/Ogg/Opus, compressing podcasts, how silly it would be to toss out all forms of audio compression.

To be clear, MP3/Ogg/Opus make perfect sense to me. They are complicated, but in return you get 8-12x the content at great (not perfect) quality. FLAC on the other hand is at least as complicated, has much less widespread support, and has an ROI of only 1.5-2x…

As a consumer, I can see that 2x in return for a modest amount of hassle can be an OK deal in constrained situations -like price-gouged mobile storage. As an engineer who has maintained software systems over many years, the complexity of FLAC compared to trivial PCM as an archive format makes me a sad, cranky, old greybeard.


> How cheap does storage have to get before people stop bothering to use FLAC?

Having cheap storage is not a valid reason not to use that space efficiently.

With today's, even last decade's, processors, encoding FLAC at ripping speed is not even a CPU-saturating task, and decoding is merely a blip in the processor's queue. If I can store the same bitstream in a smaller space, why shouldn't I?

If we're not compressing anything, why not forego lossless compression completely? Let's not GZ/XZ our log files. These 20MB files will become 30-40GB, but heck, a 4TB drive is just $65.

Similarly, let's not compress gigabytes, even terabytes of genomic data, scientific outputs and other stuff. They're just ASCII or Unicode strings, a high-schooler can figure these out. Why invest time in lossless compression tools like XZ and ZStd? They just use CPU cycles needlessly.


I’m not clear how you are jumping from “30% of a few hundred gigs is a bad complexity trade off” to “100000% of terabytes is worthless” :p

Thanks for at least arguing against the use case I argued for. Even if you were a bit hyperbolic.

I get that I’m being a cranky old engineer. As someone who has spent a couple decades optimizing and maintaining bespoke high performance file formats, the particular use case of FLAC as an archive format still seems like a bad complexity vs benefit trade.

I’d rather have something that’s trivial for any software to work with everywhere forever. I know FLAC feels like forever. But, so did a lot of dead formats I used to work with.


FLAC isn't an unmaintainable "engineering marvel", the algorithm is pretty simple and all the data is checksummed. That's enough to make it a better archival format than WAV.


Checksumming isn't enough to make something a good archival format. You need to go beyond being able to detect data corruption to being able to correct or cope with data corruption. For a FLAC file, if you flip a bit in the file, then the checksum will detect that the file has been corrupted, but that chunk of the file will be unrecoverable. If you flip a bit in PCM data, likely you won't even hear the difference. If you want to know whether corruption has occurred, you can always run md5sum on it. For FLAC, if you want to be able to cope with data corruption, you'll need to create something like par2 recovery data.
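A quick illustration of the claim about raw PCM (hypothetical helper; real disk damage tends to come in bursts rather than single bits, but the point stands per bit):

```python
def flip_bit(sample, bit):
    """Flip one bit of a signed 16-bit PCM sample and return the new value.
    A low-order flip moves the sample by 1/32768 of full scale -- a single
    inaudible tick -- whereas one flipped bit in a FLAC frame can make the
    whole frame undecodable."""
    u = sample & 0xFFFF              # view the sample as unsigned 16-bit
    u ^= 1 << bit                    # flip the chosen bit
    return u - 0x10000 if u >= 0x8000 else u
```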


This.

My dad is very skeptical of compression, not understanding the concept of lossless and thinking it reduces quality to compress it. Even if you explain it in terms of "so instead of writing 1111111 it writes 7x1, look how much less space that takes! But it's the same information in the end." Even NTFS compression is a checkbox he insisted I disable. So it's wav all the way.
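The "1111111 becomes 7x1" explanation is literally run-length encoding; a toy round-trip shows nothing is lost (a sketch for illustration, not how FLAC actually compresses):

```python
def rle_encode(s):
    """Toy run-length coder in the spirit of '1111111 -> 7x1'."""
    out, i = [], 0
    while i < len(s):
        j = i
        while j < len(s) and s[j] == s[i]:
            j += 1                    # extend the current run
        out.append((j - i, s[i]))     # store (run length, symbol)
        i = j
    return out

def rle_decode(pairs):
    """Expand the runs back -- byte-for-byte identical to the input."""
    return "".join(ch * n for n, ch in pairs)
```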

Recently had hard drive issues and I copied all data to a new drive at painstaking speeds (the hdd was barely limping). I now have no idea which files might have glitches from broken bits or sectors in them.

I guess a checksummed filesystem or keeping hash sums somewhere (the standard utilities make it pretty easy to verify many files with one simple command) is the solution for him. For me... just use a normal format and save half the space and money. Heck, go one step further and use opus: a healthy young individual can't hear the difference at a good bitrate let alone my "audiophile" "listens to too loud music" old man.
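The "keep hash sums somewhere" approach can be sketched with the standard library (a minimal stand-in for `sha256sum * > SUMS` and `sha256sum -c SUMS`):

```python
import hashlib

def manifest(paths):
    """Build a {path: sha256} map -- the same idea as `sha256sum * > SUMS`."""
    return {p: hashlib.sha256(open(p, "rb").read()).hexdigest() for p in paths}

def damaged(saved):
    """Return the files whose current hash no longer matches the manifest."""
    return [p for p, h in saved.items()
            if hashlib.sha256(open(p, "rb").read()).hexdigest() != h]
```

Run `manifest` once after a known-good copy, store the result, and any later `damaged` hit tells you exactly which files picked up glitches during a flaky transfer.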


Some of the older generation have been convinced there's a difference with lossless compression, usually by British salesmen using words like "jitter" and "PRAT". It's sad that by the time you're old enough to afford being an audiophile, you probably don't have high-end hearing anymore.


Actually, in my experience, hi-fi audio is more about the complete sound than the details themselves.

I've played in orchestras and have a respectable system at home, and the biggest differentiator is not the details, but how these details interact and create a bigger, more immersive soundstage. Yes, you can hear subtle sounds of a bow or a cymbal, but the exciting part is how sounds mesh and play with each other.

So, as I age, the excitement of listening to the same or new songs on that system doesn't fade away. I still get the same joy, and shiver when I hear that soundstage.

Actually, jitter is really something about audio CDs burned in CD writers. I still have a Yamaha CRW-F1 CD recorder, and it has a feature that abuses the Red Book standard to record audio with bigger pits, to ease the CD player's job of tracking the disc. It reduces an 80-minute disc to 66 minutes, but with a higher quality CD like a TDK, the sound was noticeably different on the system I mentioned earlier in this comment.

Currently, that set is fed by a much better CD player, and I'm sure its tracking is leaps and bounds ahead of the older player's, but it really made a difference back then. I'm not sure it would make such a difference with today's electronics.


> more about the complete sound rather than details itself

Agreed. My sound system isn't by any stretch audiophile; and I have the hearing of a 65-year-old. I can't point to things I can hear on a FLAC that I can't hear on an MP3; I hear the same notes, and the same instruments, with much the same tonality. But the former has a presence and vibrancy that is lacking in the latter; when I listen to MP3s, my involvement in the music rapidly tails off.

I suspect that you don't have to be able to hear higher harmonics in isolation, for those higher harmonics to affect what you do hear. That is, even if my hearing range cuts off at 12KHz, I can still tell the difference between an unfiltered sound, and a sound that is low-pass-filtered at 15KHz. The difference seems to be clearest with voices.


Try an ABX test: https://wiki.hydrogenaud.io/index.php?title=ABX

There's a reason MP3 is obsolete, though. You probably can't pass an ABX test with high bitrate MP3, but you almost certainly can't with AAC or Opus without extremely critical listening.


What is a “much better” CD player? It’s a digital medium so that doesn’t make any sense. It either reads the disc and the data on it or it doesn’t.

If you’re referring to the DAC portion that does make a difference obviously but that doesn’t have to do with the CD itself or tracking.


No, I'm talking about tracking and digital signal generation for the DAC, not the DAC itself.

The two players have 20 years of development apart. First one was a lower cost Sony, and the latter one is an entry level, yet higher end one (Yamaha CD-S300).

On tracking stability: the first one skips if you knock it lightly, and the Yamaha doesn't care if you bump into it accidentally. Also, newer electronics can switch much faster, and in turn create a clearer eye pattern for the DAC to work on [0].

When you used CRW-F1's audio mode, it elongated the pits and lands, so the digital part had more time to switch properly. This created a clearer eye pattern.

A clearer eye pattern lets the DAC produce a cleaner signal, since it can slice and interpret the incoming bitstream much more reliably, and generate a more "correct" (or clearer, if you pardon the term) analog signal, esp. at higher frequencies.

[0]: https://en.wikipedia.org/wiki/Eye_pattern


I'm sorry but this is nonsense.

Digital data on a Red Book CD is encoded with CIRC error correction, which provides an extra parity byte for every 3 data bytes. In normal operation, the recovery of the digital data is 100%. The DAC receives a stream of 1s and 0s, there is no "clearer eye pattern" for it to "interpret". The data is read or it is not. If it is not, the error is corrected. If it is uncorrectable, for example if there is a particularly large scratch, then - and only then - an attempt will be made to "guess" the signal.

Unless the CD is scratched, a $20 computer optical drive with digital S/PDIF output is completely indistinguishable, to a DAC, from a drive costing 100x as much.

The "tracking stability" you mention in the newer model is simply a larger buffer, so that it has time to recover the read.
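CIRC's actual math is interleaved Reed-Solomon, but the core idea that parity lets you reconstruct damaged data exactly can be shown with a much simpler toy: a single XOR parity byte recovering one erased byte at a known position (an illustration of erasure correction, not the Red Book scheme itself).

```python
def add_parity(block):
    """Append one XOR parity byte to a block. A toy stand-in for CIRC's
    Reed-Solomon parity; real CIRC also interleaves bytes across the disc
    so a scratch hits many blocks lightly instead of one block fatally."""
    p = 0
    for b in block:
        p ^= b
    return block + [p]

def recover(block_with_parity, lost_index):
    """Rebuild a single erased byte whose position is known, by XORing
    everything else together."""
    p = 0
    for i, b in enumerate(block_with_parity):
        if i != lost_index:
            p ^= b
    return p
```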


Fair enough, I wasn't considering the actual mechanics of the signal generation.

That said, I see no use for a CD player anymore, really. I can rip a bit-perfect copy, store it in FLAC, and play it pretty much anywhere. The only place I still use CDs is my car, because it's sometimes more convenient than fiddling with my phone.


I also do my casual listening over FLACs or streaming services in daily life.

OTOH, listening to CDs and vinyl is more of a ritual for me. A good coffee, some good reading material (or nothing), an album I especially like, and an hour for myself. That hi-fi set also has USB/iPod connectivity alongside the CD player (and its DAC is a treat both for MP3s and for streaming from iOS), plus Bluetooth connectivity for very lazy times.

So a vintage set with some modern connectivity, and some older formats for enjoying the music for the sake of music.


FLAC is about future-proofing.

Opus and MP3 are lossy formats. You can’t transcode them to anything else without losing data.

With how cheap storage is these days there’s really no argument against FLAC because you can also transcode it to whatever. Players like Navidrome use ffmpeg to transcode on the fly.

If I put all this effort into ripping CDs and maintaining backups, it’s not going to be in a lossy format that I can’t convert to anything else.

Use ZFS to prevent bit rot.


> Even if you were a bit hyperbolic.

It was an intentional dramatization to showcase some slippery slope, tbh. Genuinely sorry if that sounded too harsh, or rude. That wasn't the intention.

> I’d rather have something that’s trivial for any software to work with everywhere forever. I know FLAC feels like forever. But, so did a lot of dead formats I used to work with.

I understand the fear and pain of obsolescence in the world of file formats. VQF is the first notable example in my mind, but we need to keep in mind that FLAC is proper FOSS, and it'll live on in one way or another. People still keep Commodore 64 KERNALs and emulators alive. Similarly, MiniDisc has a cult following. In the worst case, I can install an old distro in a VM and convert my music to something more modern there.

I personally don't think of FLAC as an "Archival Format". It's more of a bona fide storage format for me. I'm a serious music listener and I tag my archives ambitiously; besides being lossless and providing space advantages, it supports proper tagging via common tools and lets me store a truckload of metadata with my music.

To prevent corruption in my archives, I'd rather store recovery data and take regular backups. There's no other way in my book to keep data safe.


"bad complexity vs benefit trade"

that makes me think of "C compiler" (actually C89 with benign bits of c99/c11) vs "c++ compiler", or "javascript web browser" vs "noscript/basic (x)html browser".

that said, aren't 30TB hard drives around the corner?


Exactly. We are currently debating $10 vs. $6 to store 1000 hours of music.

The only answer to my actual question so far indicates that when it gets down to $1 vs 60c, they’ll just say “Whatever. Leave it uncompressed.”

Thus 90% of the discussion has instead been about 4G streaming lossless real time voice and podcasts to mobile devices with price-gouged storage to be listened to through very lossy Bluetooth compression :p

And, 9% has been about how FLAC as a container format did a good job standardizing metadata. But, that has nothing to do with audio encoding within the container :P


Who says your filesystem isn't compressing data under you? Actually, who says the disk itself isn't?

(Sometimes they both are.)


> Who says your filesystem isn't compressing data under you?

Me, because we don't use transparent compression anywhere.

> Actually, who says the disk itself isn't?

I'm not very sure about that, to be honest. SSDs do that to prolong their life, but the saved space doesn't return as more space on FS level. It just returns as endurance, which is more important in my case.

So, disk's transparent compression doesn't mean very much from a sysadmin/OS operations perspective.


> Me, because we don't use transparent compression anywhere.

Since Fedora 34, a clean install will result in btrfs with zstd compression turned on. (An update of an older install will of course not change the filesystem from underneath the files.)


I generally don't reinstall my OS unless it breaks. My current Debian installation is around 8 years old at this point (it's updated, of course); I last reinstalled only to migrate it to amd64.

Other systems we install are configured on many levels, so even if the defaults are BTRFS w/zstd, it might be either known or changed to suit our needs better.


That's why i use Slackware - does not try to be smart.


What other lossless open source codec is "better" than FLAC? I'm not aware of any.


I don't know about "better", it depends on how you measure that, but wavpack[1] is a worthy competitor and has some features that FLAC doesn't.

1. https://www.wavpack.com/


> Let's not GZ/XZ our log files. These 20MB files will become 30-40GB, but heck, a 4TB drive is just $65.

But when you tar.gz your log files you do not lose information.


When you encode your audio files with the Free Lossless Audio Codec (FLAC), you do not lose any information either.


Yes, it is my bad, I had misread the original comment.




Besides the other good points people have made, I would just like to add that this is the kind of mentality that I really dislike to see. Just because resources are cheap, do we really need to waste it all? Computers and smartphones are faster than ever, yet there are always websites that I find completely unusable on my phone. And some of these websites I actually need to use because they are required by the government... Looking at you, MitID!

I'm inclined to call this The Javascript Mentality, though I don't mean to insult JS users. JS definitely has its place, it's just very much abused, unfortunately.


Ok so what exactly is better than FLAC? I'm not aware of any "usable" lossless encoding algo. Truth is "nobody" bothers to develop open source lossless audio codecs. Of course you can always buy Dolby because there seems to be the "efficiencies" that you are looking for along with locked-in hardware and locked-in software.


By "resources" in this case I was referring to storage space. I could have written my comment more clearly, sorry. I do really like FLAC for saving space, and I hate websites that have 20+ seconds of interaction latency.

Edit: Of course websites being slow has nothing to do with storage space, but it has everything to do with wasting resources – whether that's storage space or CPU cycles.


the "mentality" you are referring to is not limited to JS at all, and has less to do with the particular language. Market dynamics, company incentives and a larger pool of developers among other reasons contribute to that more.


I don't think GP meant "JavaScript Mentality" as "a mentality that's limited to js", but more as "a mentality you see a lot in the js-adjacent world".


well, setting blanket labels in that way is bound to be inaccurate and even inflammatory, as on the surface it sounds like bait to disparage groups around one specific language.

or, a better term would be the 'At least it works' mentality.


Precisely :)


I think of it as the Python Mentality


Right, that's not too far either. I guess the difference is that the abuse of Python doesn't hurt me in my daily life in an obvious way, unlike Javascript.


I'm surprised to hear this perspective. I feel like FLAC is to PCM as PNG is to BMP. Why not just work with FLAC directly? Half the file size in exchange for negligible processing time.

As a practical use case, FLAC is the only way I can fit my song library on my phone. Flagship phones today come with 128GB of storage, about 90GB is actually usable. Every additional 128GB is $100 more.


>Flagship phones today come with 128GB of storage

IMO, one of the biggest scams in the mobile space currently.

My old OnePlus 3T flagship from 2016 came with 128GB of storage and it "only" cost €480. Also, best phone I ever owned by far BTW.

Today, nearly 6 years later, Apple and Samsung flagships (I'm staying away from OnePlus nowadays) are charging huge markups for more than the base 128GB of storage, on phones that already cost €1000+, despite flash storage getting significantly cheaper since 2016.

This is beyond insulting, especially since modern flagships also lack microSD expansion.


> This is beyond insulting, especially since modern flagships also lack microSD expansion.

They'll do everything they can to push people into using their cloud services. It erodes the concept of ownership when you have to ask for permission from another party to access your own stuff. The more dependent they can make you on them the better. Plus when I listen to an MP3 stored on my phone, there's no opportunity for 3rd parties to track what I'm listening to and how often or to push ads at me.


Nothing about downloading an MP3 says they can't put a targeted ad in it; there are podcasts that do that based on your IP. I assume the reason most don't is just how much more trustworthy it sounds when the host reads an ad script live in the episode.

(I swear there was an episode of The Weeds where they read an ad for a therapy service for dogs.)


> Nothing about downloading an MP3 that says they can't put a targeted ad in it; there's podcasts that do that based on your IP. I assume the reason most don't is just how much more trustworthy it sounds when the host reads an ad script live in the episode.

I'm in the UK and have noticed both kinds happening: an ad for LinkedIn (I think) was localised to use a UK example in several episodes of This American Life during the last year. It was spliced into the programs seamlessly and sounded just like any of their usual sponsorship segments.

I'm presuming they just edited a handful of versions of the program for their largest listener regions and there was no need to do anything on the fly except pick which to serve. Of course I paid little attention to the ad as I was distracted more by it being the first time I'd noticed that kind of localisation.


I was watching some old Apple videos one day and back in 2001 Apple asked $500 more just to have a better _optical_ drive in your iBook https://youtu.be/ZxIgyG_7jcI?t=243


It was also for double the RAM.


Same thing with MacBooks: the Air costs +$200 for 256 GB more of storage (going from 256 to 512).


This'll be because since 2016 it's now expected you'll stream more media than you did back then, so not as much storage is needed.


That doesn't explain why prices increased so much while storage offerings stagnated and NAND flash got cheaper.

Any way you slice it it's still price gouging.


Price gouging has a legal meaning and means raising prices in an emergency. Charging more than you’d happen to like isn’t ‘price gouging’.


I meant price gouging from the consumers' perspective, not being pedantic about the legal meaning of it.


So what is price gouging to you? Just being asked to pay more than you’d like to? Then it’s a meaningless term - I’d always like to pay less.


>Just being asked to pay more than you’d like to?

No, I already explained above why I consider it price gouging.


Because they make a profit margin that you'd prefer was smaller? That’s just commerce, and we'd always prefer people charged smaller profit margins.

Are you 'gouging' your employer by asking for the highest salary you can get away with? Of course not. It's a market price.


You missed the point completely.


Funny how a €200 Xiaomi phone has 256 GB of storage these days. No idea, though, whether that is fake storage somehow.


Why would it be fake? NAND flash is cheap, that's why Xiaomi can offer 256GB on cheap phones since few people rush to buy Xiaomi. Apple and Samsung are name brands and can afford to overcharge their customers.


Isn't Xiaomi a name brand? I mean, the name is right there: Xiaomi.


Name Brand generally refers to something not being a White Label Brand. So, "Apple" is a Name Brand, but "TALK WORKS" iPhone chargers (on Amazon) are probably a White Label and, therefore, not a Name Brand.


The market decides; people pay those rates even though there are alternatives available.


There is not much choice when the giants in the space with the majority market share are all pulling the exact same shenanigans.


Also it's 128GB not 128GiB.


FLAC also has no metadata interop issues, since the FLAC container has always specified Vorbis comments.


> Every additional 128GB is $100 more.

Your phone doesn't support some kind of SD cards?


> How cheap does storage have to get before people stop bothering to use FLAC?

It's not about the storage, it's about the transfer.

Streaming services are only just beginning to support lossless at all. Improved lossy codecs like Opus provide indistinguishable-from-lossless audio at 128Kb/s, so the formula is store the audio losslessly, then transcode it to lossy (on the fly, even) for streaming.
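The store-lossless/serve-lossy formula boils down to one ffmpeg invocation per request, which is the pattern servers like Navidrome use. A minimal sketch of building that command (standard ffmpeg flags; the file names are placeholders):

```python
def transcode_cmd(src, dst, bitrate="128k"):
    """Build the ffmpeg command line for FLAC -> Opus transcoding.
    '-c:a libopus' selects the Opus encoder, '-b:a' sets the target
    bitrate; src/dst/bitrate here are illustrative placeholders."""
    return ["ffmpeg", "-i", src, "-c:a", "libopus", "-b:a", bitrate, dst]
```

Run through `subprocess.run(transcode_cmd(...))`, or point the output at a pipe to stream the transcode as it happens.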


I found a use for lossless streaming:

A weakness in the open source landscape I found is that there's a lack of fast codecs. If you want to encode in real time (eg, streaming speech or an artist playing in the moment), and encode per client for positional audio in a 3D world, then Opus becomes a bottleneck.

FLAC according to some preliminary tests helps by being faster than Opus to encode.


Both FLAC and Opus[0][1] have a dial to trade efficiency against CPU time. Have you even tried? I doubt FLAC remains a superior choice at low-complexity Opus settings. The library also has a complexity setting for the encoder.

[0] https://opus-codec.org/docs/opus-tools/opusenc.html

[1] > Set encoding computational complexity (0–10, default: 10). Zero gives the fastest encodes but lower quality, while 10 gives the highest quality but slower encoding.


I have tried it. It works, but the difference in my experience isn't a huge one.

Complexity 0 is about twice as fast as 10. Which isn't bad, but if you want to go really fast it still leaves something to be desired.


I hope this has been measured with the binary optimized for the CPU architecture in question, in order to squeeze reasonable gains out of the likely more portability-focused Opus code, particularly regarding vectorization opportunities.


By "fast to encode" do you mean that in terms of low latency or low CPU usage?

I ask because I know Opus is used quite commonly in real-time applications (e.g. voip), and I remember when researching in the past that it is actually capable of lower inherent latency because of support for smaller frame sizes than some of its competition.

I haven't looked into how expensive it is to encode in terms of CPU time, so I assume maybe you're taking about a bottleneck in terms of the number of simultaneous streams you can support on a single CPU?


Both, primarily the first. Low latency is definitely a requirement, but no issues with Opus on that account.

> I haven't looked into how expensive it is to encode in terms of CPU time, so I assume maybe you're taking about a bottleneck in terms of the number of simultaneous streams you can support on a single CPU?

Yup! I work on https://vircadia.com/ -- we have to compress audio in real time and every user gets their own mix since it depends on their location in 3D space. It turns out to add up pretty fast, and you can't fit that many people into a cheap VPS.

That's why I'm working on FLAC support for it. If bandwidth is plentiful but CPU resources are lacking it's a good alternative to have.

This code originally came from the High Fidelity company, which made their own codec. It's some piece of black magic that cuts down audio by exactly 1/4th and is amazingly fast. But it's closed source.


There's no way their proprietary codec is less black magic than Opus is. I've rarely if ever seen a closed source codec designed by people who really know what they're doing; they're usually just whatever the cowboy they hired thought would be cool. It's not like management is going to be able to tell if it's state of the art or not.


Well, I say "black magic" because at least to me it's a quite mysterious creation. But I'm no audio engineer.

It's a lossy codec that's for some reason fixed rate, and shrinks audio by exactly 1/4th. The resulting quality is very good (works perfectly fine for music), and it's much faster than Opus. It retains stereo and high frequencies and sounds just fine. I'm sure there's a tradeoff somewhere, but it's certainly good enough to hold music events without people complaining.

I think it might be possibly related to codecs like AptX.

So far I've not found anything competitive that we could use -- stuff is either closed, or far more CPU intensive.


Yeah, it sounds like ADPCM with a fixed ratio like that. It's extremely simple and adds no latency, which are both good for streaming.

Usually a fixed compression ratio is a terrible tradeoff, but for streaming it kind-of makes sense.


I don't think it's ADPCM since it sounds really good. I'm not an audiophile, nor have I done any comprehensive testing, but it's the kind of quality a normal person would use for music without complaining.


That was meant to agree with you - AptX is a kind of ADPCM with some other filters added on, so this could be the same thing. That's how it gets low latency.


What are the use-cases for pre-baked positional audio? I.e., where the client can't do the positional audio processing itself.


Not pre-baked. I work on https://vircadia.com/ -- it has to give every user their own audio mix to account for their 3D position.

This means codec costs add up fast. You can't just encode once and stream the same thing to a dozen people.


Oh right, I read that wrong. Thought you did per-voice 3D positional encoding on server but mixing on client.

Interesting use-case, got my brain racing off to find silly solutions.


There are a number of reasons:

* Most FLAC tools support metadata. For the more "bare" formats that's not the case. When converting to a lossy format, I want my metadata intact.

* FLAC can be played directly on most media players (with metadata).

* The format may be complicated, but the source is open and actively maintained, and there's support everywhere. Using FLAC in your software project is a breeze.

* 2x compression is nothing to sneeze at. My archival collection takes up 1TB on my mirrored, backed up NAS. I like that better than it taking up 2TB (which would be 1/6 of my space). Someday when I look at 1TB the way I look at 100GB today, this won't be a concern anymore.


> Someday when I look at 1TB the way I look at 100GB today, this won't be a concern anymore.

I'm halfway there, 1TB collection on a 30TB array. When I repurpose a previously used drive, I've got a bad habit of making a multi-TB image in case I forgot to copy anything off, and then forgetting about it. (I'm in the middle of writing some tools to sort through those.)

Still, what would be the advantage to moving to WAV? There is zero experience-overhead to FLAC. This isn't like some proprietary crapware format that becomes hit or miss due to patents or poor licensing. It's a Free format so common players all seamlessly support it. We're talking like 300kB of binary code, in a world where stuffing 300MB CRUD apps at users is normal.

If I want to transcode to MP3 for some bespoke device, then the FLAC compression makes that process quicker via saved IO bandwidth. There's literally no end-user complexity that I would save by moving to WAV. And if I wanted to develop software to read an audio file, then I'd import the FLAC library just as how one would generally import a WAV library. OP might as well advocate for storing audio as textual numbers line by line, so it could be easily processed with awk.


> Someday when I look at 1TB the way I look at 100GB today, this won't be a concern anymore.

And, there it is. Thank you.


You missed the other points. This one point is the weakest of them all. FLAC is simply better supported for the things people care about.

JPEG2000, for example, never caught on because nobody cares enough about the storage savings, and there were no other compelling features that anyone cared about. But for audio, we want our metadata. Had WAV been built with strong metadata support, FLAC wouldn't have gotten very far. But that ship has sailed.


For me, it's mostly about being able to stream it over a poor quality 4G connection in a speeding car/train. I couldn't care less about the storage; I've got over half of a 120TB array unused, but halving the bandwidth requirement for crappy client connections without having to deal with transcoding is nice.

The storage savings are relevant when I consider cached audio on my phone, but... I'm not really sure how relevant.


Most desktops nowadays use SSDs, and where I live a 4TB SSD costs the equivalent of about $500, while salaries are significantly lower than in the US. Why not use FLAC, given that decoding doesn't require too much processing power?


As a consumer you might really not bother, that's your choice of course.

As a developer I'm working with large amounts of data totalling some dozens of TB compressed. I don't have the option of lossy-compressing it since bit-by-bit reproduction is part of the constraints. I don't want to store it uncompressed since that also increases I/O times. CPU is simply not the bottleneck anywhere along this line. And flac works great. :)


I need an SD card on my phone just to happily fit my highly compressed podcasts, so I don't think we're anywhere near the point of skipping compression.

Even where you don't want to use FLAC because you're worried about its complexity, you should at least throw gzip at your files. Raw PCM is a waste in any kind of computing scenario.
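To put a rough number on that: even gzip, which knows nothing about audio, takes a big bite out of raw PCM. A toy illustration (a pure tone, so the signal repeats exactly and the gain is exaggerated; real music gzips far less well, which is exactly why dedicated codecs like FLAC exist):

```python
import gzip
import math
import struct

# Two seconds of a 440 Hz tone as 16-bit mono PCM at 8 kHz.
# 440/8000 = 11/200, so the samples repeat exactly every 200 samples,
# which DEFLATE's back-references exploit heavily.
pcm = b"".join(
    struct.pack("<h", int(10000 * math.sin(2 * math.pi * 440 * i / 8000)))
    for i in range(16000)
)

packed = gzip.compress(pcm)
print(len(packed) / len(pcm))  # well under 0.5 for this periodic signal
```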


Isn't the 4TB hard drive an engineering marvel itself? Worse, it's an engineering marvel that costs money each time it's produced, while FLAC is mostly done.


Portable devices have to come with more internal storage before file size doesn't matter.

I'm still converting to MP3 before adding to my phone for that reason (it used to be Ogg, which I remain convinced sounded better, but player compatibility wasn't there), because 128GB of expanded internal storage isn't much.


People will stop bothering with FLAC the instant they care about the specifics of the file format more than saving even the most trivial amount of space on their devices. Never, in other words.


Metadata.

FLAC allows for metadata. Those .wav files? Nope.


.wav files are really just https://en.m.wikipedia.org/wiki/Resource_Interchange_File_Fo...

It’s trivial to put a metadata chunk into that. And, trivial to ignore or blindly carry it along if your software doesn’t specifically support it.
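For the curious, here's roughly what that looks like: a RIFF INFO tag is just a LIST chunk appended after the data chunk, with the top-level RIFF size field patched. A minimal sketch in Python (IART is the standard RIFF artist field; error handling omitted):

```python
import struct

def append_info_chunk(path, artist):
    """Append a RIFF LIST/INFO chunk with an IART (artist) field to a WAV."""
    data = artist.encode("ascii") + b"\x00"     # INFO strings are NUL-terminated
    if len(data) % 2:
        data += b"\x00"                          # RIFF chunks are word-aligned
    sub = b"IART" + struct.pack("<I", len(data)) + data
    chunk = b"LIST" + struct.pack("<I", 4 + len(sub)) + b"INFO" + sub
    with open(path, "r+b") as f:
        f.seek(0, 2)                             # append after the last chunk
        f.write(chunk)
        riff_size = f.tell() - 8
        f.seek(4)
        f.write(struct.pack("<I", riff_size))    # patch the RIFF size field
```

Readers that walk chunks properly will skip a LIST chunk they don't understand; the catch, as the thread points out, is that most software never looks for it.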


> And, trivial to ignore or blindly carry it along if your software doesn’t specifically support it.

Which is about as good as not having metadata in the first place.

And since people rarely actually code to spec, and it's usual for WAV files to not have metadata (and, as demonstrated by GP, it's a common belief that WAV files can't have metadata), pessimistically I would absolutely expect choking on WAV metadata to be common.


It's that last statement that makes it completely impractical.

There isn't even a de facto standard for metadata with wav files. Most players/software don't even bother at all.

FLAC pretty firmly settled on Vorbis comments (the Ogg metadata format), and literally everything that supports FLAC uses that same metadata format.


There is no de facto supported standard because WAV files are de facto not used as anything but an intermediate format these days. You could have fifty standards; that wouldn't change much.

WAV files can have any kind of compressed stream inside. A program that used some ancient API like WinMM ACM could theoretically play anything inside a WAV as long as an audio driver for that compressed format (now commonly called a “codec”) was present. In addition, there are all kinds of standard metadata fields (like author and comment), and players in the '90s had no problem displaying them if present. Old Windows Media Player showed some of these status lines by default.

By the way, “driver” was an actual driver, with an .inf file, update of win.ini/system.ini or registry, and probably a reboot required to start using the decoder.

Old SDK tells us to look further into MMREG.H, so let's do just that.

https://github.com/tpn/winddk-8.1/blob/f6e6e4da7d1894536cf1f...

https://github.com/tpn/winddk-8.1/blob/f6e6e4da7d1894536cf1f...

(Last line is quite ironic.)

It's interesting to look at how all those early-'90s ideas, that Universal Tagged Formats would be used to hold anything and everything and be dragged-and-dropped into any and all applications, completely flopped. Mac had those, Windows had those, Amiga had those. You can say that MP4, an international standard, is actually MOV in disguise, but it doesn't support all those early crazy options that could turn the file into a poor man's PowerPoint presentation, a poor man's HyperCard, or a poor man's FMV game. On the other hand, Windows Media Player did support ASF interactivity like “open this link in a browser at this timestamp” and “show the user some text at that timestamp”, and the consequences were so bad that it all had to be killed in the mid-'00s.


There does seem to be at least some de facto standard. Checking my collection for stuff I haven't converted to FLAC yet, I found the Wingspan soundtrack in WAV format with, according to Kid3, an ID3v2.3.0 tag (including cover art) and RIFF INFO. It works with mpv and VLC; I didn't try anything else (well, I tried a couple of command-line ID3 utilities that didn't show the tags). I've seen tagged WAV files in a few other places as well, so it doesn't seem super uncommon. However, checking quickly, it doesn't seem like flac notices the tags when converting.


Arguably there doesn't need to be a de facto standard, because there's a de jure one: RIFF INFO.

But anyway, if you're going to do WAV tagging (and remember software support remains poor) most people are using ID3.


Good luck getting VLC to read an embedded cue sheet for track locations.


You can definitely embed metadata in wav files.

Another fun fact about wav files: they don't necessarily contain LPCM data. The format allows for a variety of encodings, including lossy ones.
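You can see this in the header: the fmt chunk begins with a wFormatTag naming the codec (1 = PCM, 0x0002 = ADPCM, 0x0055 = MP3, per Microsoft's MMREG.H mentioned elsewhere in the thread). A quick sketch of reading it:

```python
import struct

def wav_format_tag(path):
    """Return the wFormatTag from a WAV's fmt chunk
    (1 = PCM, 0x0002 = ADPCM, 0x0055 = MP3, per mmreg.h)."""
    with open(path, "rb") as f:
        riff, _, wave_id = struct.unpack("<4sI4s", f.read(12))
        assert riff == b"RIFF" and wave_id == b"WAVE"
        while True:                                  # walk the chunk list
            cid, size = struct.unpack("<4sI", f.read(8))
            if cid == b"fmt ":
                return struct.unpack("<H", f.read(2))[0]
            f.seek(size + (size & 1), 1)             # chunks are word-aligned
```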


Windows Vista included WAV files with the OS that contained MP3 audio :)

I don't know if newer versions still do that. I'd be a little surprised if they don't, given the previous version.


WAV can’t deal with metadata / ID3s, cover art etc.

Meanwhile everything bar Apple now supports FLAC natively.

No way in hell we ever switch to WAV.


The problem could be solved by placing PCM audio into some other container format. WAV proliferates for historical reasons but perhaps we need a slightly more modern audio container format that will accept any kind of bit stream + metadata, etc. I suppose MKV could fit the bill today. Are there any other good candidates?


It can - it has supported RIFF INFO from the start, and ID3 support exists.

If software writers choose to not extract metadata from WAVs that's up to them... but I guess it amounts to the same thing if you only want to use software that doesn't read tags in WAVs.


In a pinch you can play FLACs using the built-in Files app, actually. Still better off using a different playback app for any long listening, though.


I'd argue (I think many would) that FLAC is substantially less complicated than MP3. Perceptual compression, the frequency domain, and things like the FFT are more complicated, both in implementation and in the math, than run-length encoding, residual/delta encoding, and linear predictive coding. And perceptual compression schemes use both "classes" of compression tools anyway (broadly speaking).
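To make the contrast concrete: FLAC's simplest mode needs remarkably little machinery. One of its fixed predictors just extrapolates a straight line through the previous two samples; for smooth audio the residuals are tiny and take far fewer bits to store (FLAC then Rice-codes them). A sketch:

```python
import math
import statistics

# A block of a smooth signal, as FLAC sees it (440 Hz sine at 44.1 kHz)
s = [int(10000 * math.sin(2 * math.pi * 440 * i / 44100)) for i in range(4096)]

# FLAC's order-2 fixed predictor: predict s[i] as the straight-line
# extrapolation of the two previous samples, keep only the error.
residual = [s[i] - (2 * s[i - 1] - s[i - 2]) for i in range(2, len(s))]

print(statistics.mean(map(abs, s)))         # thousands
print(statistics.mean(map(abs, residual)))  # tens: far fewer bits per value
```

The prediction is exactly invertible, so the decoder reconstructs the samples bit-for-bit from the residuals; that's the whole lossless trick.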


$RESOURCE might be cheap and plentiful today. In the greybeard world you would have been right because everything was trending towards more-for-less. In 20XX you could wake up tomorrow and find out that some bullshit caused a global supply chain issue at the same time that a Proof of $RESOURCE blockchain is causing a scramble for it.


"complexity of FLAC"? As complex as gzipping a file

Sounds like the people who are surprised at their end of month AWS bill because they couldn't bother to optimize anything.


Optimization is about trade offs. My career has been data processing pipelines for commercial game engines. I’m quite familiar with the trade offs of compressing all kinds of formats, including audio.

FLAC is at least as complex as OGG+gzip. That’s reasonable to deal with because there are libraries with many thousands of hours of effort put into them freely available.

But, trivial is trivial. Anyone who has had to deal with format support rot will sing praises of trivial.

My argument is that it’s a disappointing archive format given that any teenage coder can come within 30-50% of it in an afternoon by doing the most trivial thing possible.

Expectedly, few people here are arguing with me about archives. Instead we're talking about streaming podcasts over 4G. Even for high-end music, saying your personal bandwidth ethics/budget necessitates FLAC but not Opus seems more than a bit of a stretch, in my opinion.


> Anyone who has had to deal with format support rot will sing praises of trivial.

Anyone who has had to deal with finding and recovering old data should appreciate the ability to fit significantly more on a disk. More redundancy and cheaper storage means more data survives.

> teenage coder [...] in an afternoon

Finding a FLAC decoder is very unlikely to take longer than that, even 50 years from now.


I see

If your point is that most users of FLAC would be happier with AAC/Opus etc I agree (and with a more significant savings)

And yeah, if you're looking at longer archival times, PCM probably makes more sense. But I think FLAC suffers way less rot than, for example, WMA.


FLAC bitrots less than WAV, because WAV doesn't have checksums.


If you're concerned about bitrot (and I feel everyone should be), you're going to checksum at the filesystem level, so it doesn't make a difference


The more the better. Sometimes there's a bit flip on the way to the streaming server…

https://news.ycombinator.com/item?id=25335936


As someone pointed out above, WAV bitrot results in unreported micro-blips. FLAC bitrot results in reported large-scale failure.
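The mechanism behind that: FLAC's STREAMINFO block stores an MD5 of the raw PCM, so a decoder can report that the audio changed, while a bare WAV reader has nothing to compare against. A toy illustration:

```python
import hashlib
import struct

pcm = bytearray(b"\x00\x00" * 1000)        # 16-bit silence, for simplicity
stored_md5 = hashlib.md5(pcm).hexdigest()  # what FLAC keeps in STREAMINFO

pcm[501] ^= 0x04                           # simulate one flipped bit on disk

# In a bare WAV, that's one sample nudged to 1024 (~3% of full scale):
# it plays as a tiny blip and nobody is told.
blip = struct.unpack_from("<h", pcm, 500)[0]

# With the stored checksum, the corruption is detected instead of played.
corrupted = hashlib.md5(pcm).hexdigest() != stored_md5
```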


It bitrots at the same rate, but you detect that it happened with FLAC


And then you restore it from backup/rip the CD again/download it from the original source again. And test your RAM.


> It is an engineering marvel

There are lots of lossless audio codecs, each with a different set of priorities and trade-offs: compression ratio, decoding and encoding speed, etc. I agree FLAC is a very good codec, especially for consumers, but I am not entirely sure it is a "marvel". Arguably WavPack does some things better, especially doing it patent-free. (Sort of irrelevant now at this point in time.)

The world has moved to streaming, and it is only a matter of time before lossless becomes a standard, as both Apple and Spotify are planning. 5G will improve capacity, so by the end of the 5G migration, sometime in 2030? This is sort of off topic, but HN on one hand doesn't understand 5G and suggests not upgrading to it, while on the other hand it wants more data for the same price and lossless streaming. Well, you can't have both.

I wish we could push the lossless compression ratio down to 33%, or push 128kbps lossy codecs to be truly transparent. But much like the rest of modern audio codec work, all the research is now going into low latency and real time, which seems to be a much, much harder topic.


Gotta give the audiophools lossless so they can listen over compressed Bluetooth audio.


The reason this "audiophool" uses FLAC is because I want my music collection to be stored at the same quality it was when I paid for it; and FLAC is more conservative with storage than PCM.

I don't stream it; and I don't use it on a phone. If I ever need to do those things, I can convert from FLAC to lossy, knowing that I'm starting with something that is as accurate I can get.

That is, I'm implicitly using FLAC as an archive format. Gzipped PCM would work, except that I can't play that directly in a music player. I don't care about a bit-flip rendering a FLAC unplayable; my collection is backed-up.

Basically, I'm not going to pay for a CD full of bits, and then immediately throw away a lot of those bits.


This. It's just silly to throw data away. Who knows when you might need to transcode?


LOL, I didn't write that part, since HN had a long and (what I think was) pointless argument the last time we discussed Bluetooth and audio.


Going from lossless to lossy is better than lossy to lossy.


Right about the time where computers are finally fast enough where we can just have all software run with Electron without caring about the performance hit.

Just because we can doesn't mean we should waste SSD/CPU when there are absolutely no benefits, aside perhaps from saving a few kilobytes of extra decoder code at the cost of multiple extra megabytes per song.


There is no inherent advantage to bigger files. You could losslessly upsample those wav files of yours to some insane precision if all you wanted was bigger files.

If the argument instead is that raw audio is a simpler format, the answer is that everybody wants some sort of container that carry metadata.


What would make life simpler is the industry standardising on at least one format. For example Apple doesn't actually support FLAC, despite it being the most popular lossless format.


Core Audio added support for FLAC in High Sierra, macOS 10.13, in 2017.

Apple has supported "ALAC" for longer, which unlike FLAC uses only integer math and is therefore less power hungry on mobile devices. You can transcode losslessly between FLAC and ALAC.


I can't comment on the "power hungry" part, but FLAC only requires, and (to my knowledge) has only ever required, integer math. Source: just looked at my own FLAC implementation [1].

[1] https://github.com/astoeckel/libfoxenflac/blob/master/foxen/...


The FLAC decoder has been almost completely integer for a long time (and may now be completely so). Apple Music and iOS still do not play FLAC, which is one significant reason why I won't buy an iPhone. Transcoding my entire music library certainly isn't "just works".


iOS supports FLAC, you can play it in Safari and the Files app. It's true you can't add it to the Music app, but you can use another app.

Also, the $9 Apple headphone dongle is objectively superior to most audiophile DACs, which isn't surprising since it has a larger R&D budget than that entire industry. (Same goes for Google's IIRC.)


I would guess these days many people buy audiophile DACs for the amplifier stage, for use with demanding high-impedance headphones. However, there have been some really bad phone DACs in the past; the Qualcomm audio from the Note 4 that I used to own sounded significantly worse than the iPhone using IEMs. What are people measuring with regard to the Apple dongle? Is it "objectively superior" with regard to multiple measurements?

It shouldn't be too hard to build a decent DAC and amp stage, at least for ear buds and IEMs, it's just that many manufacturers (e.g. Samsung / Qualcomm) don't care. Apple sell music and so probably should care more.


https://www.audiosciencereview.com/forum/index.php?threads/r...

https://www.kenrockwell.com/apple/lightning-adapter-audio-qu...

It's still useful to have an amp for very high impedance headphones for electrical reasons, otherwise it can be quiet and lack bass (plus drain the battery faster.)


> Apple Music and iOS still do not play FLAC, which is one significant reason why I won't buy an iPhone.

You can't just install VLC on an iphone?


You can.


That FLAC support is only in CAF containers, IIRC, which means that most FLAC files won't play. But yes, if you transmux it into CAF, it'll work.


On macOS, AVFoundation (and so QuickLook and QuickTime) can open .flac files. The Music app can only play back an artificially limited set of formats and can't play FLAC.


All lossless codecs use only integer math or else they wouldn't be lossless due to hardware/compiler optimization rounding differences.

…is what I want to say, but IIRC Lagarith actually does use floating point so you have to emulate x87 to decode it.


The iPhone 13 Pro starts at 128GB, and you need to pay an extra $500 to get 1TB. For portable players, a 2x reduction in storage space is still significant.


There is also other storage like memory cards for cars or portable players.


…which is more expensive than $65 for 4TB. Thus, the question is framed around price.



