I guess it doesn't matter to human ears with well-mastered 16-bit audio, but the video linked in the OP explains that dithering noise is typically shaped (pushed toward frequencies we're less sensitive to). The psychoacoustic models used for lossy compression also typically put more noise in some of the higher frequencies.
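For anyone curious what "shaped" dither means in practice, here's a rough sketch, not how any particular mastering tool or the video does it. It assumes the simplest possible scheme: TPDF dither plus first-order error feedback, which tilts the quantization noise toward high frequencies. The names `quantize_shaped` and `hf_to_lf_ratio` are made up for this illustration, and real shapers use higher-order filters tuned to hearing sensitivity.

```python
import math
import random


def quantize_shaped(samples, bits=16, seed=0):
    """Quantize floats in [-1, 1) to signed `bits`-bit integers.

    Assumed scheme (illustrative only): TPDF dither plus first-order
    error feedback, so the output noise is e[n] - e[n-1] (high-pass).
    """
    rng = random.Random(seed)
    scale = 2 ** (bits - 1)
    out = []
    err = 0.0  # total error (dither + rounding) made on the previous sample
    for x in samples:
        # Subtract the previous error so it cancels at low frequencies.
        target = x * scale - err
        # Triangular (TPDF) dither spanning (-1, 1) LSB.
        dither = rng.random() - rng.random()
        q = math.floor(target + dither + 0.5)
        q = max(-scale, min(scale - 1, q))  # clip to the integer range
        err = q - target
        out.append(q)
    return out


def hf_to_lf_ratio(noise):
    """Crude tilt measure: energy of first differences (emphasizes highs)
    over energy of pairwise sums (emphasizes lows). About 1 for flat noise."""
    pairs = list(zip(noise, noise[1:]))
    hi = sum((b - a) ** 2 for a, b in pairs)
    lo = sum((b + a) ** 2 for a, b in pairs)
    return hi / lo
```

To try it, quantize a quiet sine to something coarse like 8 bits, subtract the scaled input from the output to get the noise, and pass that to `hf_to_lf_ratio`. Flat (unshaped) noise should land near 1, while this first-order scheme should come out clearly above that, since more of the noise energy sits in the upper part of the spectrum.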