We have created a dataset of Hindi-English code-mixed social media text consisting of tweets from Twitter. Each tweet is annotated with its class label, i.e., Hate Speech or Normal Speech.
Due to Twitter's privacy policy, we are releasing only the tweet IDs, not the tweet text. The tweet text can be requested from us by email at [email protected]
This dataset is under development, and we plan to extend it with more sentences in the future.