- Fine-tuning ProtBert: the model was trained just like BERT, so I used the [SEP] token to separate the MHC and peptide sequences and fed the output of the [CLS] token at the beginning into the classifier head. Unfortunately, due to lack of resources I only managed to train for 1.5 epochs, because each epoch took 19 hours. Even so, I achieved an average precision of 90%, a good enough ROC curve, and an F1 score of about 80%, which could improve drastically with 4-5 more epochs.
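As a minimal sketch of the pairing step: ProtBert's tokenizer expects uppercase, space-separated residues, with rare amino acids (U, Z, O, B) mapped to X; the `format_pair` helper below is my own illustrative name, not code from the notebook, and in practice you could equally pass the two sequences to the tokenizer as a text pair and let it insert [SEP] itself.

```python
import re

def format_pair(mhc_seq: str, peptide: str) -> str:
    """Join an MHC sequence and a peptide into one ProtBert-style input:
    uppercase, space-separated residues, rare amino acids mapped to X,
    with [SEP] between the two segments ([CLS] is added by the tokenizer)."""
    def prep(seq: str) -> str:
        return " ".join(re.sub(r"[UZOB]", "X", seq.upper()))
    return prep(mhc_seq) + " [SEP] " + prep(peptide)

# Hypothetical MHC pseudo-sequence paired with a short peptide
print(format_pair("YFAMYQENMAHTDANTLYII", "SIINFEKL"))
# → Y F A M Y Q E N M A H T D A N T L Y I I [SEP] S I I N F E K L
```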
- The other solution, which I didn't have time to try because of how long the first one took, was to embed the sequences with Facebook's ESM model and then feed the embeddings to a neural network. Because of their huge dimensionality, I was going to use PCA to reduce the dimension while keeping the important features of the data.
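The reduction step could be sketched as follows, assuming per-sequence ESM embeddings stacked into a matrix; the 1280-d size matches ESM-1b, and `pca_reduce` is a plain SVD-based PCA I wrote for illustration, not code from the project.

```python
import numpy as np

def pca_reduce(X: np.ndarray, k: int) -> np.ndarray:
    """Project rows of X onto the top-k principal components.
    X: (n_samples, dim) embedding matrix, e.g. one ESM embedding per row."""
    Xc = X - X.mean(axis=0)                      # center each feature
    # SVD of the centered data; rows of Vt are the principal directions,
    # already ordered by explained variance
    U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T                         # shape (n_samples, k)

rng = np.random.default_rng(0)
emb = rng.normal(size=(100, 1280))               # stand-in for ESM embeddings
reduced = pca_reduce(emb, 64)
print(reduced.shape)                             # → (100, 64)
```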
In EDA I researched MHCs and extracted some features from the given MHC type, like the allele group. I then cleaned the data and used it in the BERT notebook to tokenize, train, and finally test the model. The classifier head is a dense layer with ReLU activation and some dropout to prevent overfitting, followed by a sigmoid so the answer comes out as a probability. Because of the large model checkpoint (1.8 GB) I didn't include it in the uploaded files. Lastly, in the evaluate-model notebook I evaluated the model using the test answers produced by the BERT notebook, which are included with the solution.
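The head described above (dense, ReLU, dropout, sigmoid) can be sketched in plain Python; the weights, sizes, and function name here are illustrative stand-ins, not the notebook's actual layers.

```python
import math
import random

def classifier_head(cls_vec, W, b, w_out, b_out, p_drop=0.2, train=False):
    """Dense -> ReLU -> dropout -> sigmoid over the pooled [CLS] vector.
    W, b: dense-layer weights/bias; w_out, b_out: output weights/bias."""
    # Dense layer with ReLU activation
    hidden = [max(0.0, sum(w * x for w, x in zip(row, cls_vec)) + bj)
              for row, bj in zip(W, b)]
    if train:  # inverted dropout, applied only during training
        hidden = [h / (1 - p_drop) if random.random() > p_drop else 0.0
                  for h in hidden]
    logit = sum(w * h for w, h in zip(w_out, hidden)) + b_out
    return 1.0 / (1.0 + math.exp(-logit))  # sigmoid -> probability

# Tiny example: a 4-d [CLS] vector with 3 hidden units (made-up weights)
x = [0.5, -1.0, 0.25, 2.0]
W = [[0.1, 0.2, -0.3, 0.4], [0.0, -0.1, 0.2, 0.3], [0.5, 0.5, 0.5, 0.5]]
p = classifier_head(x, W, b=[0.0, 0.1, -0.2], w_out=[1.0, -1.0, 0.5], b_out=0.0)
print(0.0 < p < 1.0)  # → True
```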