> Those other companies have no flipping hardware to run it "on edge", in a "generic" way, which is the goal
Maybe? This is why I responded to:
> It's everyone's goal. Apple is actively achieving that goal
This is the issue I found disagreeable. Other organizations and individual people are achieving that goal too. Google says Gemini Nano is going on-device, and if the benchmarks are to be believed and it runs at that level, their work so far is also actively achieving that goal. Meta has released multiple distilled models that people have already proven can run inference at the device level, so it cannot be argued that Meta is not actively achieving that goal either. They don't have to release the hardware because they went a different route. I applaud Apple for the M chips; they are super cool, and people are still working on using them so Apple can realize that goal too.
So when you go back to the statement that started this:
> Apple's AI strategy is to put inference (and longer term even learning) on edge devices
Multiple orgs share this strategy, and I can't say that any one of them is clearly ahead of the others. I also can't elevate Apple in that race, because it is not clear that they are truly privacy-focused or that they will keep APIs open.
> You cannot effectively do this on competitor hardware, with good performance, from "budget" to "Pro" lineup, which is a requirement of the goal
Why do you say you cannot do this with good performance? How many tokens per second do you want from a device? Is 30 T/s enough? You can do that on laptops running a small Mixtral.
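For context on whether 30 T/s is "good performance": here's a back-of-envelope sketch (the reply lengths are my own assumed numbers, not from this thread) of how long streaming a full chat response takes at that decode rate.

```python
# Rough check: is 30 tokens/s adequate for an on-device assistant?
# Assumption (mine): a typical chat reply is roughly 100-300 tokens.

def response_latency(tokens: int, tokens_per_sec: float) -> float:
    """Seconds to stream a complete response at a given decode rate."""
    return tokens / tokens_per_sec

for n in (100, 300):
    print(f"{n}-token reply at 30 T/s: {response_latency(n, 30.0):.1f}s")
```

At 30 T/s even a long reply finishes in about ten seconds, and streaming output makes it feel faster still, which is why that rate is a plausible bar for "good performance" on edge hardware.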
> What hardware are they running it on? Are they taking advantage of Apple (or other) hardware in their strategy?
I don't know. I have nothing indicating Apple or Nvidia or anything else in particular. Do you?
> [Regarding the rest]
Sure. My point is that they clearly intend to use bespoke models, which is why I raised the point that not all computation will be feasible on edge for the time being. What prompted this particular line of inquiry is whether a purely edge experience truly enables the best user experience. It's also why I raised Apple's track record with open APIs, and why I put doubt on "actively achieving" and on Apple being privacy-focused. Just to emphasize: that ties back to the reason I commented in the first place.