Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Assuming with GDM, you mean Google-Deep Mind. They pioneered RL with deep nets as policy function estimator. The deep nets being a result of CNNs and massive improvements in hardware parallelization at the time.

RL was established, at the latest, with Q-learning in 1989: https://en.wikipedia.org/wiki/Q-learning



i didn't say they invented everything; in science you always stand on the shoulders of giants

i still think my original statement is fair


"gdm pioneered rl" is definitely not actually right, but it's correct to assert that they were huge players.

people who knew from context that your statement was broadly not actually right would know what you mean and agree on vibes. people who didn't could reasonably be misled, i think.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: