IMO large social media platforms are actually much more capable of running these experiments than academics. (Disclaimer: I work on ads at one of these companies).
Platforms can accurately determine who engaged with an ad (basic logging on their sites), they have infrastructure to create statistically balanced ad experiments, and can also accurately determine whether a conversion happened (either through a conversion pixel or through data brokers).
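To make "statistically balanced ad experiments" concrete, here's a rough sketch of the usual pattern: deterministically hash each user into treatment or control, then compare conversion rates with a two-proportion z-test. This is illustrative only, not any platform's actual implementation; the function names and the 50/50 split are my own.

```python
import hashlib
from math import sqrt

def assign_bucket(user_id: str, experiment: str, treatment_share: float = 0.5) -> str:
    """Deterministically hash a user into treatment or control.
    (Illustrative sketch, not any platform's real assignment logic.)"""
    h = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    # Map the hash to [0, 1); the same user always lands in the same bucket,
    # and salting with the experiment name keeps experiments independent.
    frac = int(h[:8], 16) / 0xFFFFFFFF
    return "treatment" if frac < treatment_share else "control"

def two_proportion_z(conv_t: int, n_t: int, conv_c: int, n_c: int) -> float:
    """z-statistic for the lift in conversion rate, treatment vs. control."""
    p_t, p_c = conv_t / n_t, conv_c / n_c
    p_pool = (conv_t + conv_c) / (n_t + n_c)
    se = sqrt(p_pool * (1 - p_pool) * (1 / n_t + 1 / n_c))
    return (p_t - p_c) / se
```

With logging on the impression side and a pixel on the conversion side, both inputs to the z-test come straight from the platform's own data, which is exactly why these tests are cheap to run.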
Running these tests on behalf of advertising clients, or for internal research, is fairly standard. If we couldn't prove statistically that our ads were working, I would have left a long time ago.
I think the difference from academic studies that fail to replicate is less about capability and more about conditions: experimentation in adtech is fundamentally straightforward and happens under far more idealized circumstances than most academic research - the ones you mentioned, plus massive sample sizes and easy iteration.
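To put a number on how much the massive sample sizes help, here's a back-of-the-envelope power calculation using the standard two-proportion sample-size formula (~5% significance, ~80% power). The baseline rate and lift below are made-up illustrative values, not real campaign numbers.

```python
from math import ceil

def n_per_arm(p_base: float, lift: float, z_alpha: float = 1.96, z_beta: float = 0.84) -> int:
    """Approximate users needed per arm to detect an absolute lift in
    conversion rate at ~5% significance and ~80% power.
    (Standard two-proportion formula; inputs here are hypothetical.)"""
    p_t = p_base + lift
    var = p_base * (1 - p_base) + p_t * (1 - p_t)
    return ceil((z_alpha + z_beta) ** 2 * var / lift ** 2)

# Detecting a 0.1% absolute lift on a 2% baseline takes roughly 315k users
# per arm - an afternoon of traffic for a large platform, but far beyond
# the sample sizes most academic studies can recruit.
```

That asymmetry, not any methodological cleverness, is most of why adtech experiments land where academic ones struggle.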
Of course, this just reinforces your point that experimentation in adtech is largely not subject to the same issues that have fueled the replication crisis in academia.