
But they still don't get it right: they explicitly allow not identifiable Unicode identifiers. The C20 committee recently also allowed insecure identifiers, completely ignoring the Unicode identifier guidelines. They stated that nobody cares, everybody wants them and making them secure would need the entire Unicode database. Why do they allow noobs into such committees? All that is needed are the normalization tables (tiny), the script list (tiny), and the two XID lists (XID_Start and XID_Continue).


> they explicitly allow not identifiable Unicode identifiers. [...] They stated that nobody cares, everybody wants them and making them secure would need the entire Unicode database.

Could you elaborate? rustc ships with the entire Unicode db and only allows idents made of codepoints that Unicode advertises as allowed in idents.

The closest it comes to walking off the beaten path is a (still unmerged) parser recovery PR that accepts emojis as identifiers if and only if a parse error would otherwise occur, as a way to avoid knock-down errors when someone tries to use them.


For identifier security you don't need the entire Unicode DB. Only Rust or glibc would do that, nobody else. You need the XID_Start/XID_Continue bit lists, a single normalization table if NFC (or two if NFD), the scripts list (ranges mapped to a single byte), and a bit of logic. With confusables I'm not sure.

That's about 2 KB vs. 20 MB.
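
To make the "bit of logic" concrete, here is a rough Python sketch of the three checks listed above. It is an illustration under stated assumptions, not any compiler's actual implementation: `check_identifier`, `script_of`, and `SCRIPT_RANGES` are hypothetical names, and the script table is a tiny block-level approximation that a real tool would generate from Unicode's Scripts.txt. It leans on Python's `str.isidentifier()`, which is based on XID_Start/XID_Continue plus the underscore, and on `unicodedata.normalize` for NFC.

```python
import unicodedata

# Hypothetical, heavily truncated script table: (first, last, script).
# These are rough block-level ranges for illustration only; a real table
# would be generated from Unicode's Scripts.txt.
SCRIPT_RANGES = [
    (0x0041, 0x005A, "Latin"),
    (0x0061, 0x007A, "Latin"),
    (0x00C0, 0x024F, "Latin"),
    (0x0370, 0x03FF, "Greek"),
    (0x0400, 0x04FF, "Cyrillic"),
]


def script_of(ch):
    """Return the script name for ch, or None if it is not in the toy table."""
    cp = ord(ch)
    for first, last, name in SCRIPT_RANGES:
        if first <= cp <= last:
            return name
    return None  # digits, underscore, combining marks, anything unlisted


def check_identifier(name):
    """Return (ok, reason) for a candidate identifier."""
    # Step 1: XID_Start followed by XID_Continue* (Python also allows "_").
    if not name.isidentifier():
        return False, "not XID_Start XID_Continue*"
    # Step 2: require the identifier to already be in NFC.
    if unicodedata.normalize("NFC", name) != name:
        return False, "not NFC-normalized"
    # Step 3: reject identifiers that mix more than one known script.
    scripts = {s for s in map(script_of, name) if s is not None}
    if len(scripts) > 1:
        return False, "mixed scripts: " + ", ".join(sorted(scripts))
    return True, "ok"


if __name__ == "__main__":
    for candidate in ["caf\u00e9", "cafe\u0301", "p\u0430ypal", "1abc"]:
        print(repr(candidate), check_identifier(candidate))
```

The third sample spells "paypal" with a Cyrillic U+0430 in place of the Latin "a", which passes the XID and NFC checks and is only caught by the script check. Confusables within a single script are not covered here.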



