This is hard to say definitively. The new Nvidia Vera Rubin chips are 35-50x more efficient on a FLOPS/megawatt basis, and TPUs, ASICs, and AMD chips are making similar, if less dramatic, strides.
So a service run at a loss now could be high-margin on new chips in a year. We also don't really know that they're losing money on the $200/month subscriptions, just that they're compute constrained.
If prices increase, it might be because of a supply crunch rather than unit economics.
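To make the loss-to-margin point concrete, here's a back-of-envelope sketch. Every number in it is invented purely to show the arithmetic (FLOPs per token, power price, chip throughput are all hypothetical), not real figures for any vendor:

```python
# Hypothetical back-of-envelope: how a big FLOPS/megawatt jump changes
# the pure power cost of serving tokens. All numbers are made up.

def power_cost_per_1m_tokens(flops_per_token, flops_per_mw_second, mwh_price_usd):
    """Electricity cost of generating 1M tokens, ignoring capex, staff, etc."""
    mw_seconds = 1_000_000 * flops_per_token / flops_per_mw_second
    return mw_seconds / 3600 * mwh_price_usd  # convert MW-seconds to MWh

# Hypothetical workload: 1e12 FLOPs per token, $100/MWh electricity.
old = power_cost_per_1m_tokens(1e12, 1e15, 100)    # current-gen chips
new = power_cost_per_1m_tokens(1e12, 35e15, 100)   # 35x better FLOPS/MW

print(f"old: ${old:.2f}/1M tokens, new: ${new:.2f}/1M tokens")
```

Same revenue per token, 35x lower power cost, so a service that's underwater on today's hardware could be comfortably profitable on the next generation without touching prices.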
Given the massive costs of training, R&D, and infrastructure build-out, plus the fact that both Anthropic and OpenAI are burning money as quickly as they can raise it, the safe bet is on costs going up.
Honestly, some of this info is quite hard to parse. I think the efficiency gain is ~35x at the system level but ~10x at the hardware level; I think this is due to Nvidia bringing in Groq in addition to chip improvements.
Seems like the real costs and numbers are very hidden right now. These are all private companies, and how much anything costs, and whether anything is profitable, is closely guarded.
That's like saying driving for Uber is profitable if you only take gas mileage into consideration but ignore car maintenance, payments, insurance, and all the other costs of owning a car.
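The analogy in numbers (every figure here is made up for illustration; the point is marginal cost vs. full cost, not real Uber economics):

```python
# Toy illustration: a business can look profitable on marginal cost
# while losing money on full cost. All dollar figures are invented.

fare_per_mile = 1.50
gas_per_mile = 0.15          # the only cost the optimistic view counts
other_costs_per_mile = 1.45  # maintenance, depreciation, insurance, payments

marginal_profit = fare_per_mile - gas_per_mile           # looks healthy
true_profit = marginal_profit - other_costs_per_mile     # actually negative

print(f"marginal: ${marginal_profit:.2f}/mile, all-in: ${true_profit:.2f}/mile")
```

Same trap with AI services: inference compute is the "gas," while training runs, R&D, and datacenter build-out are the car payments that get left out of the per-token math.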
Not sure which exact model you're talking about, but I've run the 30B and the 3.5 32B models, and both can get some things done and can also waste tons of time getting some things completely wrong.
They're fun to mess around with to figure out what they can and can't do, but they're certainly not tools I can count on the way I can count on Codex.