The big thing Reddit has done is let people see its data (to whatever extent), so that studies can be conducted that tell us what is actually going on.
Right now, when I talk to anyone dealing with trust and safety, the discussions start around the point in a workflow where a case is filed in court.
But we have remarkably little clarity on whether certain kinds of forums result in happier communities, whether they reduce polarization, and so on.
And, in my experience, what we do have is mostly in English and mostly focused on the “global North”.
I used the Reddit Pushshift dataset for my work, and realized that conducting sentiment analysis on content relating to India was … beyond the tooling I could find. Code-mixed language analysis is its own kettle of fish, and a problem I've been thinking about ever since.
Now multiply these limitations across under-resourced communities around the world, and it's concerning.
As I see it, there’s tons of law-related work being done in this space.
But it seems the data, and the code for analyzing it, are mostly held under NDA. And to flag a painful future discussion: they're held by companies in the US, a geopolitical fault line in the making. What happens to Puerto Rico? Kenya? Morocco?
This is the one thing that tech can still do: build the tools that give us an objective view of how people actually behave online, and make that information public as a common good.