
> And around ~2012, a bunch of researchers have reported you don't even need 2nd-derivative information. You just have to initialize the neural net properly.

This sounds very interesting. How do you properly initialize the weights? Do you have a link to a paper about this?



Check out this paper:

Practical recommendations for gradient-based training of deep architectures, Y. Bengio

http://arxiv.org/abs/1206.5533

There is a section on weight initialization on page 15. In general, this paper has a lot of good information in one place.
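For a concrete sense of what that section covers: one of the schemes associated with Bengio's group is the Glorot/Xavier "normalized" initialization, which scales the uniform sampling interval by the layer's fan-in and fan-out so that activation and gradient variances stay roughly constant across layers. A minimal NumPy sketch (the function name and 784→256 layer sizes are illustrative, not from the paper):

```python
import numpy as np

def glorot_uniform(fan_in, fan_out, rng=None):
    """Sample a (fan_in, fan_out) weight matrix from
    W ~ U[-limit, limit], limit = sqrt(6 / (fan_in + fan_out)),
    the Glorot/Xavier normalized initialization."""
    rng = np.random.default_rng() if rng is None else rng
    limit = np.sqrt(6.0 / (fan_in + fan_out))
    return rng.uniform(-limit, limit, size=(fan_in, fan_out))

# Example: initialize a 784 -> 256 fully connected layer
W = glorot_uniform(784, 256)
```

Biases are typically just initialized to zero under this scheme; the scaling only matters for the weight matrices.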


