Are there any potential improvements over transformers for interpretability or ali... | Hacker News


lukeplato on May 23, 2023 | on: RWKV: Reinventing RNNs for the Transformer Era

Are there any potential improvements over transformers for interpretability or alignment?

pico_creator on May 23, 2023

For anything past 8k context size, we are talking about an over 10x reduction in GPU time, both for inferencing tokens and for training.

Aka it’s cheaper and faster.
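A rough sketch of why the gap would widen with context length, assuming the usual picture that full self-attention mixes every token with every other token (quadratic in context length) while an RNN-style model does one state update per token (linear). The function names are hypothetical and this is a toy step count, not a benchmark: it ignores constants, feed-forward layers, and hardware effects, so it does not reproduce the 10x figure above.

```python
# Toy cost model, NOT a measurement: counts only token-mixing steps.
# All function names here are illustrative assumptions.

def attention_mixing_steps(context_len: int) -> int:
    """Full self-attention: each token interacts with every token (quadratic)."""
    return context_len * context_len


def recurrent_mixing_steps(context_len: int) -> int:
    """RNN-style mixing: one state update per token (linear)."""
    return context_len


def mixing_ratio(context_len: int) -> float:
    """How many times more mixing steps attention needs in this toy model."""
    return attention_mixing_steps(context_len) / recurrent_mixing_steps(context_len)


if __name__ == "__main__":
    for n in (2048, 8192, 32768):
        print(f"context={n}: attention/recurrent step ratio = {mixing_ratio(n):.0f}")
```

In this toy model, doubling the context quadruples the attention step count but only doubles the recurrent one, which is the general shape of the argument for long contexts.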

Alignment is, frankly, IMO purely a dataset design and training issue, and has nothing to do with the model.
