Hacker News

Are there any potential improvements over transformers for interpretability or alignment?


For anything past an 8k context size, we are talking about over a 10x reduction in GPU time for inference tokens, and for training too.

In other words: it's cheaper and faster.
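The claimed speedup follows from how attention scales: softmax attention's per-token cost grows with context length (quadratic over a full sequence), while linear-attention/recurrent alternatives pay a roughly fixed per-token cost. A back-of-envelope sketch, where the hidden size and the simplified cost models are illustrative assumptions rather than measurements of any particular model:

```python
# Back-of-envelope sketch of why sub-quadratic architectures pull ahead
# at long context. d_model and the cost formulas are assumptions for
# illustration, not figures from a real model.

d_model = 1024  # assumed hidden size

def softmax_attn_flops(n, d=d_model):
    # Standard attention: each new token attends over all n cached tokens,
    # so the QK^T and attention-weighted-V steps cost ~n*d multiply-adds each.
    return 2 * n * d

def linear_attn_flops(d=d_model):
    # Linear/recurrent-style attention: a fixed-size state update whose
    # cost is independent of context length (roughly O(d^2) per token).
    return 2 * d * d

for n in (1024, 8192, 65536):
    ratio = softmax_attn_flops(n) / linear_attn_flops()
    print(f"context {n:>6}: per-token attention cost ratio ~ {ratio:.0f}x")
```

Under these toy numbers the ratio is already ~8x at an 8k context and keeps growing linearly with context length, which is roughly the regime the comment is describing.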

Alignment, frankly, is IMO purely a dataset-design and training issue, and has nothing to do with the model architecture.



