Drug-Discovery

A simple, easy-to-follow tutorial on drug discovery with machine learning. The target protein is PIK3CA (phosphatidylinositol-4,5-bisphosphate 3-kinase catalytic subunit alpha, human), which is overexpressed in breast cancer. First, I fetched the known, experimentally validated inhibitors of this target from ChEMBL. From the SMILES string of each compound I calculated RDKit descriptors, which serve as the features, while the corresponding pIC50 value is the label. Since pIC50 is continuous, I trained a regression model. Below are the regression plots for the train and test data, comparing actual and predicted values along with the R2 score.
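For readers unfamiliar with the label, here is a minimal sketch of how a pIC50 value can be derived, assuming the activity is reported as an IC50 in nanomolar (an assumption, since the units used in the notebooks are not stated here). The function name `ic50_nm_to_pic50` and the example values are hypothetical and are not taken from the notebooks or the ChEMBL data.

```python
import math


def ic50_nm_to_pic50(ic50_nm: float) -> float:
    """Convert an IC50 in nanomolar to pIC50 = -log10(IC50 in molar)."""
    if ic50_nm <= 0:
        raise ValueError("IC50 must be a positive concentration")
    # 1 nM = 1e-9 M, so -log10(ic50_nm * 1e-9) = 9 - log10(ic50_nm)
    return 9.0 - math.log10(ic50_nm)


# Hypothetical IC50 values, for illustration only.
for ic50 in (1.0, 100.0, 10_000.0):
    print(f"IC50 = {ic50:>8.1f} nM -> pIC50 = {ic50_nm_to_pic50(ic50):.2f}")
```

A higher pIC50 means a more potent inhibitor, and the log scale keeps the regression target in a compact numeric range.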

The model seems to be overfitting. In the next tutorial, I will show how to train a graph convolutional network (GCN) for the same task. GCNs are generally considered more expressive for this kind of problem because a graph is a more natural representation of a molecule than a sequence, an image, or tabular data. Later on, I will search for novel compounds that might be potential inhibitors of this target, and I will calculate their drug-likeness properties to shortlist the best candidates.
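The overfitting remark can be checked numerically by comparing R2 on the train and test sets: a large positive gap suggests the model has memorised the training data. The sketch below is a plain-Python illustration, not the code used in the notebooks; the helper names `r2_score` and `overfit_gap` and all pIC50 values are hypothetical.

```python
def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot.

    Assumes y_true is not constant, otherwise SS_tot is zero.
    """
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def overfit_gap(train_true, train_pred, test_true, test_pred):
    """Train R2 minus test R2; a large positive value hints at overfitting."""
    return r2_score(train_true, train_pred) - r2_score(test_true, test_pred)


# Hypothetical pIC50 values, for illustration only.
train_true = [5.0, 6.0, 7.0, 8.0]
train_pred = [5.1, 5.9, 7.0, 8.1]
test_true = [5.5, 6.5, 7.5]
test_pred = [6.5, 6.0, 6.5]

print("train R2:", round(r2_score(train_true, train_pred), 3))
print("test R2:", round(r2_score(test_true, test_pred), 3))
print("gap:", round(overfit_gap(train_true, train_pred, test_true, test_pred), 3))
```

What counts as a large gap depends on the data set, so treat any threshold as a judgment call rather than a fixed rule.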

Files in this repository:

LazyRegressor_pik3ca.ipynb
PIK3CA_target.ipynb
README.md
brc-targets.ipynb

sumone-compbio/drug-discovery
