I wish there was a more universal format for parsers, but I just don't think there enough people who know their stuff.
Take PHP, a language that a lot of people use: the tree-sitter-php extension doesn't support features added in 2019, let alone features added towards the end of 2020.
If you want an up-to-date PHP parser, there's really only one open-source parser[0] that's accurate enough to be used on PHP codebases old and new, and it's written in PHP. Then if you want to parse in a robust fashion you have to adopt a number of hacks to get everything working.
I hadn't encountered LSIF before – can GitHub be configured to use those maps?
We've looked at LSIF before, and decided against it for a few reasons, mostly around COGS, operational overhead, and indexing latency. I gave a talk at last year's FOSDEM [1] going into some of the details. (Caveat that that talk was from when we were using a different open-source library, Semantic, to power fuzzy Code Nav. It's much easier to support new languages using the now-current tree-sitter query approach!)
Take PHP, a language that a lot of people use: the tree-sitter-php extension doesn't support features added in 2019, let alone features added towards the end of 2020.
If you want an up-to-date PHP parser, there's really only one open-source parser[0] that's accurate enough to be used on PHP codebases old and new, and it's written in PHP. Then if you want to parse in a robust fashion you have to adopt a number of hacks to get everything working.
I hadn't encountered LSIF before – can GitHub be configured to use those maps?
[0] https://github.com/nikic/PHP-Parser