It works at the semantic level (i.e. on what the code actually means) rather than by fingerprinting strings. This means that reordering code segments, renaming variables, and inlining or lifting functions don't affect a match, as long as the code stays semantically equivalent.
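To make the renaming case concrete, here is a toy sketch in Python. It is not the actual implementation, which isn't described here; it is just one simple way to get rename-invariance, assuming comparison on a normalized syntax tree. The helper names (`canonical_form`, `same_up_to_renaming`) are made up for illustration, and the sketch deliberately ignores reordering, inlining, and lifting, which need real semantic analysis.

```python
import ast


def _bound_names(tree):
    """Collect names the snippet itself defines (assignments, args, functions)."""
    bound = set()
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store):
            bound.add(node.id)
        elif isinstance(node, ast.arg):
            bound.add(node.arg)
        elif isinstance(node, ast.FunctionDef):
            bound.add(node.name)
    return bound


class _Renamer(ast.NodeTransformer):
    """Rename locally bound identifiers to v0, v1, ... in order of first use."""

    def __init__(self, bound):
        self.bound = bound
        self.mapping = {}

    def _canon(self, name):
        # Leave builtins and other free names (print, len, ...) untouched.
        if name not in self.bound:
            return name
        return self.mapping.setdefault(name, f"v{len(self.mapping)}")

    def visit_Name(self, node):
        node.id = self._canon(node.id)
        return node

    def visit_arg(self, node):
        node.arg = self._canon(node.arg)
        return node

    def visit_FunctionDef(self, node):
        node.name = self._canon(node.name)
        self.generic_visit(node)
        return node


def canonical_form(source):
    """Hypothetical helper: dump of the AST after identifier normalization."""
    tree = ast.parse(source)
    tree = _Renamer(_bound_names(tree)).visit(tree)
    return ast.dump(tree)


def same_up_to_renaming(a, b):
    """True if two snippets differ only in the names they bind."""
    return canonical_form(a) == canonical_form(b)


original = "def total(xs):\n    acc = 0\n    for x in xs:\n        acc += x\n    return acc\n"
renamed = "def s(values):\n    r = 0\n    for item in values:\n        r += item\n    return r\n"
changed = "def total(xs):\n    acc = 0\n    for x in xs:\n        acc -= x\n    return acc\n"

print(same_up_to_renaming(original, renamed))
print(same_up_to_renaming(original, changed))
```

The first pair differs as strings but normalizes to the same tree; the second pair looks nearly identical as text but does something different, so it doesn't match. A string fingerprint would tend to get both of those the wrong way round.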
How do your techniques compare to Winnowing? http://theory.stanford.edu/~aiken/publications/papers/sigmod...