Hacker News

Thanks - glad you like it! I probably won't get to all of these but let me try a couple:

1. There's a spectrum (sort of) between using full-on RL techniques and just doing sequence modeling. We're trying to pick a reasonable point on that spectrum, one that lets us model whether things have gone well without too much fiddling.

3. It really depends on how closely related the domains are. I think it's safe to say you should expect more transfer of abstract, high-level capabilities than of nitty-gritty, domain-specific details. That's part of why we're excited about training one big model to use all software tools.


