This is a motte-and-bailey argument if I've ever seen one. It's true that it's simple code. It's true that it's not going to replace programmers anytime soon. It's true that this is not novel computer science. But this is clearly a novel program and it's also clearly not something that other methods could have done.
"not novel code" here was referring GP's "novel programming task", not the synthesis method. I think we're probably using different definitions of "task". Where you mean it in a very particular sense (this exact piece of code) and I mean it in a more "writing if-else blocks inside a single procedure with no loops and no recursion using functions that are in-scope" sense.
The proper way to determine if there's anything interesting here would be to run GPT-3 on some existing program synthesis benchmarks. Literally any program synthesizer can look super impressive if you just show one working example in a YouTube video. My suspicion is that GPT-3 isn't going to do particularly well on those benchmarks, at least out of the box, and that getting it to work as well as SOTA would require a bunch of non-trivial engineering work.
You have a much rosier view of program synthesis than I do. Could you link a paper that you think is particularly impressive? I know Idris can do trivial inferences interactively, but I don't know anything that can do anything non-trivial that isn't also very slow and very unreliable.
IIUC, the Generalized Program Synthesis Benchmark Suite[1] is still mostly unsolved, including problems like “Given three strings n1, n2, and n3, return true if length(n1) < length(n2) < length(n3), and false otherwise.”
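For reference, here's a direct Python solution to the quoted problem (the function name is my own; the benchmark's job is to get a synthesizer to derive something like this from examples):

```python
# A hand-written solution to the quoted benchmark problem:
# return True iff length(n1) < length(n2) < length(n3).
def compare_string_lengths(n1: str, n2: str, n3: str) -> bool:
    return len(n1) < len(n2) < len(n3)
```

That this one-liner remains hard for synthesizers to find automatically is the point: the search problem, not the target program, is what's difficult.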
My point wasn't that current program synthesis is particularly great, although I do think modern program synthesis tools can probably beat GPT-3 on lots of problems (and I'll allow that the reverse is probably true too...)
My point was that I'm skeptical GPT-3 would do particularly well on those benchmarks without lots of additional effort. And since you can build pretty much anything with enough blood and sweat, the actual question is: would the same amount of effort poured into an alternative approach produce the same or better results, but in a way that's far easier to interpret and extend?
It could work. But the YouTube video alone is more "huh, interesting" than "wow, impressive". If that makes sense.
Well, the key difference is that you don't have to think much to get a code-specialized language model, and when you do train one it's much more general (e.g. inferring constraints from text, using user-provided types, correctly naming variables, being less prone to exponential blowup as sample length grows, etc.). And then the model just gets better over time as AI improves, and all you have to provide is a comparatively cheap bit of compute.
I got the impression from you saying “You can synthesize programs at this level of complexity with a few minutes on a single ten year old laptop using 5-10 year old algorithms.” that you thought this was generally solved at this level of complexity, rather than merely true for an easier example here and there.
Maybe it would be helpful if you gave an example of the simplest Python function you think it won't be able to synthesize, and if/when they release the code-specialized GPT into the API, we can test your prediction.