USING DATA SCIENCE TO PREDICT AND PREVENT CANCER: PAPER BY GRADUATE STUDENT QIAN XIN PUBLISHED IN BIOINFORMATICS
September 6, 2018
HNU computer science student Qian Xin is mining clinical and genomic data in an effort to predict and prevent severe syndromes in patients undergoing cancer treatment.
The paper "DMCM: A Data-adaptive Mutation Clustering Method to Identify Cancer-related Mutation Clusters" by Qian Xin, a master's student supervised by Prof. Lu Xinguo, has been published in "Bioinformatics", one of the top international journals in the field (area 1 of the JCR Mathematical and Computational Biology category). This is the first time a master's student has published a paper in this top-level journal. Hunan University is a separate steering body.
Cancer is one of the major diseases that seriously threaten human life, and one of the biggest invisible killers in modern life. At present, the vast majority of cancer driver studies remain at the level of a single amino acid or nucleotide, or of a single gene. In recent years, however, researchers have gradually found that cancer formation may be related to combinations of multiple somatic mutations; that is, it may be caused by mutations at multiple amino acid or nucleotide positions. The study of cancer driver mutations should therefore focus on driver mutation regions, or driver mutation clusters, on nucleotide or amino acid sequences. As a result, identifying mutation hot spots in amino acid or nucleotide sequences has become a focus of cancer big data analysis.
This paper proposes DMCM (Data-adaptive Mutation Clustering Method), a clustering method based on adaptive kernel density estimation for mining driver mutation patterns. The method improves on the traditional kernel density estimation model, which depends on a fixed kernel bandwidth. First, a data-adaptive kernel bandwidth estimation model is constructed to form an adaptive kernel density estimation model. The model is then used to estimate the mutation density of pan-cancer somatic mutation data; the boundaries of each mutation cluster are determined with a Gaussian distribution model and optimized with the EM algorithm, yielding the final somatic mutation clusters. The experimental results show that DMCM is highly robust and that the identified mutation clusters are of driver significance.
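To make the idea of a data-adaptive bandwidth more concrete, the following is a minimal sketch in Python and not the paper's implementation. It assumes an Abramson-style square-root rule for the per-sample bandwidths, which is a common textbook choice and may differ from the bandwidth model used in DMCM. The function names, the toy mutation positions, and the half-of-peak density threshold are all hypothetical, and the threshold is only a simple stand-in for the Gaussian-model boundary step and its EM refinement, which are not implemented here.

```python
import numpy as np


def gaussian_kde_fixed(x, grid, h):
    """Fixed-bandwidth Gaussian KDE evaluated at each grid point."""
    z = (grid[:, None] - x[None, :]) / h
    return np.exp(-0.5 * z**2).sum(axis=1) / (len(x) * h * np.sqrt(2 * np.pi))


def adaptive_kde(x, grid, h0, alpha=0.5):
    """Adaptive KDE: every sample gets its own bandwidth (assumed Abramson-style rule)."""
    pilot = gaussian_kde_fixed(x, x, h0)        # pilot density at the samples
    g = np.exp(np.mean(np.log(pilot)))          # geometric mean of the pilot density
    h_i = h0 * (pilot / g) ** (-alpha)          # narrower kernels where mutations are dense
    z = (grid[:, None] - x[None, :]) / h_i[None, :]
    kernels = np.exp(-0.5 * z**2) / (h_i[None, :] * np.sqrt(2 * np.pi))
    return kernels.mean(axis=1)


def density_clusters(grid, density, threshold):
    """Return (start, end) intervals of the grid where the density exceeds the threshold."""
    above = density > threshold
    padded = np.concatenate(([False], above, [False]))
    edges = np.flatnonzero(padded[1:] != padded[:-1])   # alternating rise/fall indices
    starts, ends = edges[::2], edges[1::2] - 1
    return [(grid[s], grid[e]) for s, e in zip(starts, ends)]


# Hypothetical toy data: two mutation hot spots plus three isolated mutations.
positions = np.array(
    [98, 99, 100, 100, 101, 102,
     298, 299, 300, 300, 301, 302,
     50, 200, 400],
    dtype=float,
)
grid = np.arange(0, 451, dtype=float)

density = adaptive_kde(positions, grid, h0=5.0)
clusters = density_clusters(grid, density, threshold=0.5 * density.max())
print(clusters)
```

In this sketch, samples in dense regions receive narrower kernels than isolated samples, so hot spots stay sharp while sparse background mutations are smoothed out. A fuller treatment would replace the fixed threshold with a model-based boundary estimate, as the paper describes.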
The results of this work were published online on July 3, 2018 in the leading international journal Bioinformatics. The original paper is available at:
https://academic.oup.com/bioinformatics/advance-article/doi/10.1093/bioinformatics/bty624/5053323?searchresult=1