I think the complaint was that they scan your entire doc, for all its content.
This is super common with things like copyright violations and child porn. You store a database of hashes of known copyrighted or pornographic material and compare each uploaded file's hash against it. And specifically, you can use visual-similarity (perceptual) hashing, so that even somewhat perturbed copies still match.
(https://en.wikipedia.org/wiki/PhotoDNA)
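To make the "perturbed documents still match" part concrete, here's a rough sketch of the idea. This is NOT PhotoDNA (that algorithm is proprietary); it's a plain "average hash" used as a stand-in, and the function names, the tiny 4x4 grayscale "images", and the distance threshold are all made up for illustration.

```python
from typing import Iterable, List

Image = List[List[int]]  # grayscale pixels, 0-255 (toy representation)


def average_hash(pixels: Image) -> int:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    bits = 0
    for p in flat:
        bits = (bits << 1) | (1 if p > mean else 0)
    return bits


def hamming_distance(a: int, b: int) -> int:
    """Number of bit positions where the two hashes differ."""
    return bin(a ^ b).count("1")


def matches_known(pixels: Image, known_hashes: Iterable[int],
                  max_distance: int = 2) -> bool:
    """True if the image's hash is close to any hash in the database.

    max_distance is an illustrative threshold, not a recommended value.
    """
    h = average_hash(pixels)
    return any(hamming_distance(h, k) <= max_distance for k in known_hashes)


# Hypothetical "known" image and its hash database.
original = [
    [10, 10, 200, 200],
    [10, 10, 200, 200],
    [200, 200, 10, 10],
    [200, 200, 10, 10],
]
database = {average_hash(original)}

# Same picture with small per-pixel noise, e.g. from recompression.
perturbed = [
    [14, 7, 205, 196],
    [12, 11, 198, 203],
    [197, 204, 9, 13],
    [202, 199, 12, 8],
]

# A different picture (vertical stripes).
unrelated = [[200, 10, 200, 10] for _ in range(4)]

print(matches_known(perturbed, database))
print(matches_known(unrelated, database))
```

An exact cryptographic hash (SHA-256, say) would change completely with that noise, which is why similarity hashes get used for this kind of scanning. Real systems first resize and normalize the image and use much more robust features than a raw brightness threshold.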