Is this a real cause for concern? Simply don't copy code with strange unicode ch...

mkl · on Nov 1, 2021

The point of the vulnerability is that you can't necessarily see the strange Unicode characters.
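To make that concrete, here is a minimal sketch of a check a review tool could run, assuming the characters of interest are the Unicode bidirectional control characters (the embedding, override, and isolate code points). The function name `find_bidi_controls` is made up for illustration and is not taken from any existing tool.

```python
import unicodedata

# Bidirectional control characters: embeddings, overrides, isolates.
# Assumption: these are the "strange" characters worth flagging.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def find_bidi_controls(source):
    """Return (line, column, character name) for each bidi control found."""
    hits = []
    for line_no, line in enumerate(source.splitlines(), start=1):
        for col, ch in enumerate(line, start=1):
            if ch in BIDI_CONTROLS:
                hits.append((line_no, col, unicodedata.name(ch)))
    return hits
```

Running this over a file's text and failing the review when the list is non-empty would surface characters that an editor may render invisibly. It does nothing about other confusable characters such as homoglyphs.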

samus · on Nov 1, 2021

It's a problem in any environment where people can input Unicode characters. Reviewers might use tools that do not display those characters.

At the same time, one can't just put a blanket ban on Unicode. It exists for a reason. People want to use their native languages to name identifiers, or at least to write comments. Restricting ourselves to ASCII again and thus forcing English on everybody is not a solution.

lixtra · on Nov 1, 2021

> Restricting ourselves to ASCII again and thus forcing English on everybody is not a solution.

Yet most programming languages force everyone to use Western Arabic numerals.

Wouldn’t it be great to use Roman numerals?

And then images in source code are really difficult to handle. Wouldn’t it be nice to compile a Word document with embedded images?

I think I wouldn’t mind staying with ASCII for source code, except for string literals (difficult enough).

samus · on Nov 2, 2021

Arabic numerals can be justified since they are dominant and most commonly used in modern science and technology across the whole world. Supporting additional systems would not introduce too much hassle though. Numbers follow a rigid syntax and generally do not employ free mixing with numerals from other systems.
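As a rough illustration of the "rigid syntax, no free mixing" point, here is a sketch of a numeral parser that accepts any one Unicode decimal digit system but rejects a mix. The function `parse_decimal` is hypothetical; it relies on Unicode decimal digits being laid out as contiguous 0 to 9 runs, so the code point of a system's zero identifies the system.

```python
import unicodedata


def parse_decimal(text):
    """Parse a non-negative integer written in a single decimal digit system."""
    if not text:
        raise ValueError("empty numeral")
    value = 0
    zero = None  # code point of the digit zero in the system seen so far
    for ch in text:
        digit = unicodedata.decimal(ch, None)
        if digit is None:
            raise ValueError(f"not a decimal digit: {ch!r}")
        system_zero = ord(ch) - digit
        if zero is None:
            zero = system_zero
        elif system_zero != zero:
            raise ValueError("mixed numeral systems")
        value = value * 10 + digit
    return value
```

For example, `parse_decimal("123")` and `parse_decimal("\u0661\u0662\u0663")` (Arabic-Indic digits) both give 123, while `parse_decimal("1\u0662")` raises `ValueError`. Non-positional systems such as Roman numerals would need a separate parser.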

What I have in mind actually exists already: Jupyter notebooks, which combine code, text, and resources into a nice JSON ball. Horrible for SCMs and editors without special plugins, of course.