This is an implementation of TTMR++ (Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval). The project aims to retrieve music using text queries.
Update Soon
This project is under the CC-BY-NC 4.0 license.
- See this repo: Tag-to-Caption Augmentation using Large Language Model
Part of the code is borrowed from the following repos. We would like to thank the authors of these repos for their contribution.
- Modified ResNet: OpenAI CLIP
- Audio Frontend: OpenAI Whisper
- Distributed Data Parallel Training: PyTorch DDP
Please consider citing our paper in your publications if this project helps your research. The BibTeX reference is as follows.
@inproceedings{doh2024enriching,
title={Enriching Music Descriptions with a Finetuned-LLM and Metadata for Text-to-Music Retrieval},
author={Doh, SeungHeon and Lee, Minhee and Jeong, Dasaem and Nam, Juhan},
booktitle={ICASSP 2024-2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)},
year={2024}
}