
KanHope

This repository contains the code for the paper "Hope Speech detection in under-resourced Kannada language" (arXiv:2108.04616).

To train and evaluate the BERT-based models:

  1. Download the corresponding dataset files from Zenodo: https://zenodo.org/record/5006517/
  2. Set the path to 'path_to_repo/KanHope/Dual Channel models/'.

  3. For the models that follow the BERT architecture, open classifier.py and locate the 'read_csv' calls. Set the paths to the train, test, and validation dataframes to wherever the files downloaded from Zenodo are stored (a sketch of these edits appears after this list).

  4. Run test.py for inference.

  5. Under the same directory, run get_predictions.py to view the classification reports and confusion matrix.
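
A minimal sketch of the `read_csv` edits described in step 3. The directory and file names below are placeholders rather than the actual names used in the repository, so point them at wherever the Zenodo files were saved:

```python
# Sketch of the read_csv edits in classifier.py.
# Directory and file names are placeholders; adjust them to match the
# files downloaded from https://zenodo.org/record/5006517/.
import pandas as pd

DATA_DIR = "path_to_repo/KanHope/Dual Channel models/"

train_df = pd.read_csv(DATA_DIR + "train.csv")       # placeholder file name
val_df   = pd.read_csv(DATA_DIR + "validation.csv")  # placeholder file name
test_df  = pd.read_csv(DATA_DIR + "test.csv")        # placeholder file name
```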

For the dual channel BERT model:

  1. Download the English translations of the code-mixed Kannada-English dataset, along with the splits, from Zenodo: https://zenodo.org/record/4904729/
  2. Run dc_classifier.py to train the dual channel BERT model.

  3. For the model names (model1, model2), follow the naming conventions of the pretrained models listed in Hugging Face Transformers:
     - model1: monolingual English language model (for the translated texts).
     - model2: multilingual language model (for the Kannada-English code-mixed text).

  4. Under the same directory, run get_predictions.py to view the classification reports and confusion matrix.

  5. The architecture of the dual channel model is shown in the figure below; a minimal code sketch of the same idea follows the figure.

[Figure: dual channel BERT architecture, with one channel for the English translations and one for the Kannada-English code-mixed text]
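The sketch below illustrates the dual channel idea at a code level. It is an approximation based on the description above, not the exact implementation in dc_classifier.py, and the two model names are only examples of valid Hugging Face identifiers:

```python
# Hedged sketch of a dual channel classifier: one encoder for the English
# translations, one for the Kannada-English code-mixed text, with the
# pooled representations concatenated before a classification head.
import torch
import torch.nn as nn
from transformers import AutoModel

class DualChannelClassifier(nn.Module):
    def __init__(self,
                 model1="bert-base-uncased",             # example: monolingual English model
                 model2="bert-base-multilingual-cased",  # example: multilingual model
                 num_labels=2):
        super().__init__()
        self.encoder_en = AutoModel.from_pretrained(model1)
        self.encoder_cm = AutoModel.from_pretrained(model2)
        hidden = (self.encoder_en.config.hidden_size
                  + self.encoder_cm.config.hidden_size)
        self.classifier = nn.Linear(hidden, num_labels)

    def forward(self, en_inputs, cm_inputs):
        # Each channel encodes its own view of the sentence; the pooled
        # vectors are concatenated and classified jointly.
        h_en = self.encoder_en(**en_inputs).pooler_output
        h_cm = self.encoder_cm(**cm_inputs).pooler_output
        return self.classifier(torch.cat([h_en, h_cm], dim=-1))
```

Here `en_inputs` and `cm_inputs` are the tokenizer outputs for the translated and code-mixed versions of the same sentence, produced by the tokenizers of model1 and model2 respectively.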
This approach could be used for any multilingual dataset. The weights of the fine-tuned models are available on my Hugging Face account, [AdWeeb](https://huggingface.co/AdWeeb).
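
For example, a fine-tuned checkpoint could be loaded from the Hub roughly as follows; the model identifier is a placeholder, since the exact repository names are listed only on the account page:

```python
# Hypothetical example of loading one of the fine-tuned checkpoints.
# "AdWeeb/<model-name>" is a placeholder; substitute an actual model id
# from https://huggingface.co/AdWeeb.
from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "AdWeeb/<model-name>"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForSequenceClassification.from_pretrained(model_id)
```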

We have provided the notebooks for reference.

Experiments, Results, and Discussions

The code and its explanation for all the experiments are present in the Jupyter notebooks. We document interesting findings, results, discussions, and a qualitative analysis in the manuscript.




If you use our dataset and/or find our code useful, please cite our paper:

@misc{hande2021hope,
      title={Hope Speech detection in under-resourced Kannada language}, 
      author={Adeep Hande and Ruba Priyadharshini and Anbukkarasi Sampath and Kingston Pal Thamburaj and Prabakaran Chandran and Bharathi Raja Chakravarthi},
      year={2021},
      eprint={2108.04616},
      archivePrefix={arXiv},
      primaryClass={cs.CL}
}
