Yes, I realized after posting that this is probably a better paper to link to: h...

Yes, I realized after posting that this is probably a better paper to link to: http://cseweb.ucsd.edu/~yfreund/papers/bandits.pdf The algorithms for the partial information setting are sometimes surprisingly similar to the algorithms where you see all the results. The algorithm in the paper linked above is essentially the same algorithm but with a small exploration probability. The regret bound gets worse by a factor of sqrt(N), however.