Phytocytokine Scripts

Aim of the project

The aim of this project was the identification of orthologs of phytocytokines in Zea mays.

Mining of cross-species phytocytokines

To this end, a BLAST database containing 79 predicted plant proteomes retrieved from the Ensembl Plants database (https://plants.ensembl.org/info/data/ftp/index.html) was created using the NCBI BLAST+ 2.2.18 application.

Additionally, 18 previously published phytocytokine amino acid sequences were selected as queries for the BLAST searches.

psiBLAST searches of the queries against the database were performed with five iterations and an E-value threshold of 0.05 to detect distant relationships between proteins. A parallel BLASTp search with the same settings was performed.
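As a rough illustration of these search settings, the sketch below assembles a psiblast argument list in Python without running it. The file names, the database name, and the tabular output format are hypothetical placeholders, and mapping the 0.05 threshold to the `-evalue` option is an assumption; the commands actually used may have differed.

```python
def build_psiblast_cmd(query_fasta, db, out_path, iterations=5, evalue=0.05):
    """Return the argument list for a psiblast run (the command is not executed here)."""
    return [
        "psiblast",
        "-query", query_fasta,               # FASTA file with the query propeptides
        "-db", db,                           # BLAST database built from the proteomes
        "-num_iterations", str(iterations),  # five search rounds
        "-evalue", str(evalue),              # assumed to be the 0.05 threshold
        "-outfmt", "6",                      # tabular output (an assumption)
        "-out", out_path,
    ]


# Hypothetical file and database names, for illustration only
cmd = build_psiblast_cmd("queries.fasta", "plant_proteomes", "psiblast_hits.tsv")
print(" ".join(cmd))
```

The list can be passed to `subprocess.run(cmd, check=True)` on a machine where BLAST+ is installed.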

The following steps were performed with the script "psiblast_ensemble_propeptides_analysis.py" unless noted otherwise. "psiblast_ensemble_propeptides_analysis_v2.py" was used to generate longer motifs; it also fixes some errors of the previous version.

After the BLASTs, the results for all queried propeptides were combined, and unique identifiers were assigned to each hit linking it to the respective query.

The psiBLAST results for all queried propeptides included many redundant hits (a hit found in rounds 1 to 5 appeared in the results five times) and could therefore not be compared directly to the BLASTp results. These redundant hits were removed after sorting all hits by their bit score. Additionally, low-quality hits (protein identity below 10%, query coverage below 25%) were removed by the "process_data" function.
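The deduplication and quality filter can be pictured roughly as follows. This is a minimal sketch and not the "process_data" function itself: the dictionary keys, the toy hit values, and the choice to drop a hit when either threshold is missed are assumptions made for illustration.

```python
def deduplicate_and_filter(hits, min_identity=10.0, min_coverage=25.0):
    """Keep the best-scoring row per (query, subject) pair, then drop low-quality hits."""
    best = {}
    # Sort by bit score, highest first, so the first row seen per pair is the best one
    for hit in sorted(hits, key=lambda h: h["bitscore"], reverse=True):
        key = (hit["query"], hit["subject"])
        if key not in best:
            best[key] = hit
    # Assumption: a hit is removed if it misses either threshold
    return [
        h for h in best.values()
        if h["identity"] >= min_identity and h["coverage"] >= min_coverage
    ]


# Toy rows with invented values; P1 appears twice, as it would after several rounds
hits = [
    {"query": "AtPEP1", "subject": "P1", "bitscore": 50.0, "identity": 40.0, "coverage": 80.0},
    {"query": "AtPEP1", "subject": "P1", "bitscore": 62.0, "identity": 42.0, "coverage": 85.0},
    {"query": "AtPEP1", "subject": "P2", "bitscore": 30.0, "identity": 8.0, "coverage": 60.0},
    {"query": "AtPEP1", "subject": "P3", "bitscore": 28.0, "identity": 35.0, "coverage": 20.0},
]
filtered = deduplicate_and_filter(hits)
print(filtered)
```

In this toy input only the higher-scoring P1 row survives: P2 fails the identity threshold and P3 fails the coverage threshold.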

To assign each hit to its source organism, the script "ensemble_accession_organism.py" was used to link each ID with its source proteome. The resulting table was loaded and linked to the hits with the "create_map" function in the script.

The remaining hits varied strongly in their amino acid sequence length. Therefore, within each group of hits, the hits with more than 75% identity served as a reference, and all hits that were more than 1.5 times longer than the longest reference hit or more than 1.5 times shorter than the shortest reference hit were removed (function "rel_length", length filtering).
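One possible reading of this length filter is sketched below. It is not the "rel_length" function itself: the field names, the toy values, and the decision to leave a group unchanged when it contains no hit above 75% identity are assumptions.

```python
def filter_by_relative_length(hits, identity_cutoff=75.0, factor=1.5):
    """Drop hits much longer or shorter than the high-identity hits of the same group."""
    # Lengths of the reference hits (identity above the cutoff)
    reference = [h["length"] for h in hits if h["identity"] > identity_cutoff]
    if not reference:
        # Assumption: without reference hits the group is left unchanged
        return list(hits)
    upper = max(reference) * factor  # 1.5 times the longest reference hit
    lower = min(reference) / factor  # shortest reference hit divided by 1.5
    return [h for h in hits if lower <= h["length"] <= upper]


# Toy group for a single query; all values are invented
group = [
    {"id": "h1", "length": 90, "identity": 95.0},
    {"id": "h2", "length": 100, "identity": 80.0},
    {"id": "h3", "length": 140, "identity": 40.0},
    {"id": "h4", "length": 400, "identity": 30.0},
    {"id": "h5", "length": 50, "identity": 35.0},
]
kept = filter_by_relative_length(group)
print([h["id"] for h in kept])
```

Here the reference lengths are 90 and 100, so the accepted range is 60 to 150 residues; h4 and h5 fall outside it and are removed.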

In the last step, hits were filtered according to their similarity to the respective peptide hormone of each query. Initial multiple sequence alignments (MSA) of the hits with ClustalOmega 1.2.2 (prepared with function "create_clustal_cmd") indicated that some false-positive hits without an alignment in the region of the peptide hormone remained, which could distort the MSA and introduce large gaps. Therefore, all hits for each queried propeptide were aligned with the ClustalOmega application, and the region containing the peptide hormone was analyzed and scored (functions "read_clustal", "clustal_analysis", and "filter_data"). The similarity of each alignment region containing the peptide hormone was scored based on a BLOSUM62 matrix and low gap costs (gap: -8, extension: -1) using the biopython package.
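The scoring idea can be sketched in plain Python as shown below. This is an illustration under stated assumptions and not the code of the scripts: the function name is hypothetical, -8 is treated as the cost of the first gap column in a run and -1 as the cost of each further column, and a toy substitution table stands in for BLOSUM62 so that the example runs without Biopython.

```python
def score_region(query_region, hit_region, matrix, gap_open=-8, gap_extend=-1):
    """Score two aligned rows column by column with substitution scores and affine gaps."""
    if len(query_region) != len(hit_region):
        raise ValueError("aligned regions must have equal length")
    score = 0
    gap_in = None  # row holding the current gap run: "query", "hit", or None
    for q, h in zip(query_region, hit_region):
        if q == "-" and h == "-":
            continue  # columns with gaps in both rows carry no information
        if q == "-" or h == "-":
            current = "query" if q == "-" else "hit"
            # The first column of a gap run costs gap_open, later columns gap_extend
            score += gap_extend if gap_in == current else gap_open
            gap_in = current
        else:
            score += matrix[(q, h)]
            gap_in = None
    return score


# Toy substitution table (match 4, mismatch -1), NOT BLOSUM62
toy = {(a, b): (4 if a == b else -1) for a in "ACDEG" for b in "ACDEG"}

print(score_region("ACDE", "ACDE", toy))  # identical regions
print(score_region("ACDE", "AGDE", toy))  # one substitution
print(score_region("ACDE", "AC--", toy))  # two-column gap run in the hit
```

With Biopython installed, the real matrix can be loaded with `Bio.Align.substitution_matrices.load("BLOSUM62")`; it should be usable in place of `toy`, since it is also indexed by residue pairs.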

Afterwards, motif diagrams were created with the logomaker 0.8 package (functions "create_fasta" and "create_weblogo").

Figures

Phylogenetic tree

The phylogenetic tree of the subset of organisms was created as follows: 140 complete proteins present in all proteomes of the selected organisms, either as single or multiple copies, were selected using BUSCO 5.1.2 with its standard settings. If multiple copies were present, only the best hit was used to create the phylogenetic tree. The orthologs of each BUSCO protein were aligned via ClustalOmega 1.2.2. The alignments of all proteins were concatenated. A phylogenetic tree of the concatenated multiple sequence alignments was calculated via RAxML 8.2.12 using the standard settings. These procedures were performed with the python script "psiblast_ensemble_propeptides_analysis.py".
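The concatenation step can be sketched as follows. This is a minimal illustration and not the code of the script: the function name, the organism labels, and the toy sequences are hypothetical. Padding a missing organism with gaps is an added safeguard and should not be needed here, since the selected proteins are present in all proteomes.

```python
def concatenate_alignments(alignments, taxa):
    """Join per-protein alignments into one concatenated alignment per organism.

    alignments: list of dicts mapping organism name to aligned sequence,
                one dict per BUSCO protein
    taxa:       organism names, in the order to be kept
    """
    parts = {t: [] for t in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in one alignment must have equal length")
        width = lengths.pop()
        for t in taxa:
            # Safeguard (assumption): pad with gaps if an organism is missing
            parts[t].append(aln.get(t, "-" * width))
    return {t: "".join(p) for t, p in parts.items()}


# Toy alignments with invented sequences
taxa = ["zea_mays", "arabidopsis_thaliana"]
alignments = [
    {"zea_mays": "MK-L", "arabidopsis_thaliana": "MKAL"},
    {"zea_mays": "GT", "arabidopsis_thaliana": "G-"},
]
supermatrix = concatenate_alignments(alignments, taxa)
print(supermatrix)
```

The resulting dictionary holds one row per organism and can be written to a FASTA or PHYLIP file for tree inference.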

Venn diagram

The Venn diagram was created with "venn_diagram.R" using the package "ggvenn".

qPCR graphs

Graphs displaying the fold change of gene expression were created with "3h+24h_foldchange.R".

Motif comparison AtPEP1 and ZmPEP1

Protein sequences of hits for AtPEP1 and ZmPEP1 were aligned together with ClustalOmega. Afterwards, the results were separated based on the query. For each query, the peptide sequence was extracted and a weblogo motif was created with the script "weblogo.py" using the package "logomaker".

PR gene expression analysis

PR gene expression was analyzed with the script "pr_genes_phytocyto_24h.R".

Botrytis Lesion Area analysis

Lesion size analysis, including ANOVA and plotting, was performed with the R script "ANOVA lesion size Botrytis.R".

Cell death quantification

Quantification of cell death caused by the phytocytokines was performed with the script "ANOVA cell death quant.R".
