This repo contains my final project for Applied Natural Language Processing (University of Trento, a.y. 2023/24).
- Introduction
- Data
- Experiments
  - Preliminary Experiments
  - Finetuning
  - Evaluation (with Transfer learning across domains and Forgetting of previous knowledge)
I construct the first Fassa Ladin-Italian-English parallel corpus, and train a machine translation model on it. More information can be found in the accompanying report.
You can try translating text from English/Italian to Fassa Ladin using the model on Hugging Face Spaces 🦀
The dataset draws from multiple resources in 5 different domains: literature, news, games, laws, and brochures. It is available in the data directory, either as a single file or split into train, validation, in-domain test, and out-of-domain test sets.
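To illustrate how an in-domain / out-of-domain split of this kind can be produced, here is a minimal sketch. It is not the script used to build the dataset: the record fields (`domain`, `lld`, `it`, `en`), the choice of held-out domain, and the split fractions are all assumptions made for the example.

```python
import random


def split_corpus(records, held_out_domain, val_frac=0.1, test_frac=0.1, seed=0):
    """Split parallel records into train / validation / in-domain test /
    out-of-domain test sets.

    Every record from `held_out_domain` goes to the out-of-domain test set;
    the remaining records are shuffled and divided by the given fractions.
    The field name "domain" is hypothetical, not the repository's schema.
    """
    in_domain = [r for r in records if r["domain"] != held_out_domain]
    out_of_domain = [r for r in records if r["domain"] == held_out_domain]

    # A seeded generator keeps the split reproducible across runs.
    rng = random.Random(seed)
    rng.shuffle(in_domain)

    n_val = int(len(in_domain) * val_frac)
    n_test = int(len(in_domain) * test_frac)
    return {
        "validation": in_domain[:n_val],
        "test_in_domain": in_domain[n_val:n_val + n_test],
        "train": in_domain[n_val + n_test:],
        "test_out_of_domain": out_of_domain,
    }


if __name__ == "__main__":
    domains = ["literature", "news", "games", "laws", "brochures"]
    # Placeholder records; real ones would also carry the three sentences.
    records = [{"domain": d, "id": i} for d in domains for i in range(20)]
    splits = split_corpus(records, held_out_domain="laws")
    print({name: len(rows) for name, rows in splits.items()})
```

Holding out a whole domain means the out-of-domain test set measures how well the model generalises to text types it never saw during training.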
Evaluate the performance of the pre-trained models.
Fine-tune the pre-trained models on the Fassa Ladin-Italian-English parallel corpus using two approaches: pivot-based transfer learning and multilingual translation.
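For the multilingual approach, one common convention is to train a single model on several source languages and mark the desired output language with a tag prepended to the source sentence. The sketch below shows only that data-preparation step. The tag format (`>>lld<<`), the field names, and the function name are assumptions for illustration and may differ from what the project actually does.

```python
def tag_examples(triples, target="lld", sources=("it", "en")):
    """Turn parallel triples into (source, target) training pairs.

    Each source sentence is prefixed with a target-language tag so that one
    model can be trained on several source languages at once. The language
    codes and the ">>code<<" tag format are hypothetical choices.
    """
    examples = []
    for triple in triples:
        if target not in triple:
            continue  # skip records that lack the target side
        for src in sources:
            if src in triple:
                examples.append({
                    "source": f">>{target}<< {triple[src]}",
                    "target": triple[target],
                })
    return examples


if __name__ == "__main__":
    # The Ladin side is a placeholder string, not real corpus data.
    triples = [{"it": "Buongiorno", "en": "Good morning", "lld": "<Fassa Ladin sentence>"}]
    for example in tag_examples(triples):
        print(example)
```

Each triple yields one training pair per available source language, so Italian and English inputs share the same Fassa Ladin references.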
Evaluate the models' performance, and investigate transfer learning across domains and forgetting of previous knowledge.
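One plausible way to quantify forgetting is to score a model on the same test set before and after fine-tuning and look at the drop. The sketch below uses a simplified character n-gram F-score, written in the spirit of chrF but not the official implementation, and not necessarily the metric used in the report. All function names are hypothetical.

```python
from collections import Counter


def char_ngrams(text, n):
    """Count character n-grams after normalising whitespace."""
    text = " ".join(text.split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def char_f_score(hypothesis, reference, max_n=6, beta=2.0):
    """Simplified character n-gram F-score in [0, 1].

    Precision and recall are averaged over n-gram orders 1..max_n, then
    combined into an F-beta score (beta=2 weights recall more heavily).
    """
    precisions, recalls = [], []
    for n in range(1, max_n + 1):
        hyp, ref = char_ngrams(hypothesis, n), char_ngrams(reference, n)
        if not hyp or not ref:
            continue  # one of the strings is shorter than n characters
        overlap = sum((hyp & ref).values())  # clipped n-gram matches
        precisions.append(overlap / sum(hyp.values()))
        recalls.append(overlap / sum(ref.values()))
    if not precisions:
        return 0.0
    p = sum(precisions) / len(precisions)
    r = sum(recalls) / len(recalls)
    if p + r == 0:
        return 0.0
    return (1 + beta**2) * p * r / (beta**2 * p + r)


def mean_score(hypotheses, references):
    """Average the sentence-level score over a test set."""
    scores = [char_f_score(h, r) for h, r in zip(hypotheses, references)]
    return sum(scores) / len(scores) if scores else 0.0


def score_drop(before, after):
    """Positive values mean the model got worse after fine-tuning."""
    return before - after
```

The same comparison can be applied to the out-of-domain test set: a score that rises after fine-tuning on other domains suggests that knowledge transfers across domains.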