Ditto on not being able to rely on some mechanical turk services. Although the quality of the answers/labels you get back inevitably depends on how well you have laid out the task at hand, we have additionally witnessed poor-quality output that was outright rogue: tasks completed in under a second, or the same answer given by a user no matter the question. Both point to tasks being completed by bots, not humans.
In fact, we resorted to building a turker bot detector and started rejecting tasks completed in suspicious ways. Only once we had built ourselves a trusted population of turkers did we start to get quality data back. I suspect most people don't bother to go back and reject poor-quality answers, and that is why the bots survive.
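For anyone curious, here is a rough sketch of the two signals mentioned above (sub-second completions and one answer repeated across different questions). It is not the detector we actually ran; the `Submission` fields, the `flag_suspicious_workers` name, and the `min_seconds` / `min_tasks` thresholds are all made up for illustration and would need tuning for a real task.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Submission:
    # Hypothetical record layout; real platforms expose different fields.
    worker_id: str
    question_id: str
    answer: str
    seconds_taken: float


def flag_suspicious_workers(submissions, min_seconds=1.0, min_tasks=5):
    """Return {worker_id: [reasons]} for workers matching either heuristic.

    min_seconds and min_tasks are assumed thresholds, not recommended values.
    """
    by_worker = defaultdict(list)
    for sub in submissions:
        by_worker[sub.worker_id].append(sub)

    flagged = {}
    for worker_id, subs in by_worker.items():
        reasons = []

        # Signal 1: at least one task finished faster than a human plausibly could.
        if any(sub.seconds_taken < min_seconds for sub in subs):
            reasons.append("too_fast")

        # Signal 2: the same answer given across enough distinct questions.
        distinct_questions = {sub.question_id for sub in subs}
        distinct_answers = {sub.answer for sub in subs}
        if len(distinct_questions) >= min_tasks and len(distinct_answers) == 1:
            reasons.append("constant_answer")

        if reasons:
            flagged[worker_id] = reasons
    return flagged
```

You would feed it the raw submissions for a batch and review (or reject) the work of whoever comes back flagged. The constant-answer check will also catch honest workers on batches where one label really does dominate, so treat it as a prompt for manual review rather than an automatic rejection.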
Wow, I believe it. We didn't so much have a rogue situation, but you really do have to constrain their actions to just what you want. I tried to find the source, but there was some YouTube video I watched where the guy made a good comment about creating GUIs: you have to put real emphasis on preventing users from doing things you do not want them to do. You can't focus only on features; you also have to think about constraints. I really took that message to heart after experiencing some of the human unpredictability involved in building up training data. It made for some interesting payment debates when we asked them to redo work that was incomplete. Fun stuff.