Assuming that by GDM you mean Google DeepMind: they pioneered RL with deep nets as the policy function estimator. The deep nets themselves were a result of CNNs and massive improvements in hardware parallelization at the time.
"gdm pioneered rl" is definitely not right, but it's correct to say that they were huge players.
people who knew from context that your statement was broadly wrong would know what you mean and agree on vibes. people who didn't could reasonably be misled, i think.
RL was established, at the latest, with Q-learning in 1989: https://en.wikipedia.org/wiki/Q-learning