Hacker News

For _K_S, definitely not. We quantized the 3b model with q4_K_M since we were getting good results out of it. Officially, Meta has only talked about quantization for the 405b model and hasn't given any actual guidance on what the "best" quantization should be for the smaller models. With the 1b model we didn't see good results with any of the 4-bit quantizations, so we went with q8_0 as the default.
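(The q4_K_M / q8_0 names above follow the GGUF convention, so assuming the models were quantized with llama.cpp's tools — the comment doesn't say which tool was actually used — the two defaults would be produced roughly like this; all paths and filenames here are illustrative:)

```shell
# Hypothetical llama.cpp workflow; model paths and output names are illustrative.

# 1. Convert the original checkpoint to a float16 GGUF file.
python convert_hf_to_gguf.py ./Llama-3.2-3B-Instruct --outfile llama-3.2-3b-f16.gguf

# 2. 3b default: Q4_K_M (a 4-bit "medium" K-quant mix).
./llama-quantize llama-3.2-3b-f16.gguf llama-3.2-3b-q4_k_m.gguf Q4_K_M

# 3. 1b default: Q8_0 (8-bit), since 4-bit quants hurt quality at this size.
./llama-quantize llama-3.2-1b-f16.gguf llama-3.2-1b-q8_0.gguf Q8_0
```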


