johon-lituobang/MD

Matrix dissimilarity from the differences of moments and sparsity

Generating a dissimilarity matrix is typically the first step in big data analysis. Although numerous measures exist, such as Euclidean distance, Minkowski distance, Manhattan distance, Bray-Curtis dissimilarity, Jaccard similarity, and Dice dissimilarity, they do not reveal which factors drive the dissimilarity between groups. In this paper, we introduce an approach based on differences in moments and sparsity, and we show that it can delineate the key factors underlying group differences. For example, in biology, mean dissimilarity indicates differences driven by up- or down-regulated gene expression, standard deviation dissimilarity reflects heterogeneity in the response to treatment, and sparsity dissimilarity corresponds to differences caused by the activation or silencing of genes. Through extensive reanalysis of transcriptome, proteome, metabolome, immune profiling, microbiome, and social science datasets, we demonstrate insights not captured in previous studies. For instance, without using metadata such as age, BMI, sex, or biomarkers, COVID-19 patient mortality can be predicted with high accuracy based solely on matrix dissimilarities observed during the first week.
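To make the three components concrete, here is a minimal Python sketch of one possible reading of the idea. It is not the definition used in the manuscript: the function names (`column_summaries`, `dissimilarities`), the use of the population standard deviation, the definition of sparsity as the fraction of exact zeros per feature, and the aggregation by mean absolute difference across features are all assumptions made for illustration. Rows are taken to be samples and columns to be features (for example, genes).

```python
from statistics import mean, pstdev


def column_summaries(matrix):
    """Per-feature mean, standard deviation, and fraction of zeros.

    Assumption: rows are samples and columns are features.
    """
    columns = list(zip(*matrix))
    return {
        "mean": [mean(c) for c in columns],
        "sd": [pstdev(c) for c in columns],  # population SD (an assumption)
        "sparsity": [sum(1 for v in c if v == 0) / len(c) for c in columns],
    }


def dissimilarities(group_a, group_b):
    """Mean absolute per-feature difference for each summary.

    Illustrative aggregation only; the manuscript may define it differently.
    """
    a = column_summaries(group_a)
    b = column_summaries(group_b)
    return {
        key: mean([abs(x - y) for x, y in zip(a[key], b[key])])
        for key in a
    }


# Hypothetical toy data: 2 samples x 3 features per group.
control = [[0, 2, 4],
           [0, 2, 6]]
treated = [[1, 2, 0],
           [3, 2, 0]]

result = dissimilarities(control, treated)
for name, value in result.items():
    print(f"{name} dissimilarity: {value:.3f}")
```

In this toy example the first feature is switched on and the third is switched off in the treated group, so those changes show up in the sparsity component, while the second feature is identical in both groups and contributes nothing to any component.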

This work began two years ago, and I later shared it with several professors for collaboration. It is still ongoing and is being prepared for PNAS. I am introducing the work on YouTube and Quora; if you are interested, please visit https://www.youtube.com/@Iobiomathematics or https://www.quora.com/profile/Tuobang-Li-1/answers . The manuscript has also been deposited on Zenodo: Tuobang Li. (2023). Matrix dissimilarities based on differences in moments and sparsity. Zenodo. https://doi.org/10.5281/zenodo.10406288 . It is also available on ResearchGate: https://www.researchgate.net/publication/377974505_Matrix_dissimilarities_based_on_differences_in_moments_and_sparsity

Feel free to share this work or to contact tl@biomathematics.org; more materials are available on request.
