Accumulating reports demonstrate that lncRNAs have significant roles in human cancers. Using experimental methods to study the relationships between lncRNAs and cancer is time-consuming and costly. In contrast, computational methods enable integration of multi-omics data and provide additional information for data mining.
Researchers from Xishuangbanna Tropical Botanical Garden (XTBG) developed a new method, CRlncRC2, to predict associations of lncRNAs with cancer. The method combines XGBoost, a powerful machine learning algorithm, with Laplacian score feature selection and SMOTE over-sampling.
The researchers addressed the data imbalance problem, which is caused by the relatively small number of known cancer-related lncRNAs available as positive examples, by applying the Synthetic Minority Over-sampling Technique (SMOTE), which balances the classes while aiming to retain all important information.
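The idea behind SMOTE can be illustrated with a minimal sketch. This is not the CRlncRC2 implementation: the function name `smote`, its parameters, and the toy feature vectors are all hypothetical, and the sketch only shows the core step of creating synthetic minority samples by interpolating between a real positive example and one of its nearest positive neighbors.

```python
import random


def smote(minority, n_new, k=5, seed=0):
    """Return n_new synthetic samples interpolated between minority points.

    Illustrative only: `minority` is a list of equal-length feature vectors
    (hypothetical data, not the features used by CRlncRC2).
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority samples to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority samples by squared Euclidean distance to base.
        neighbors = sorted(
            (p for p in minority if p is not base),
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        # Pick a random point on the segment between base and its neighbor.
        gap = rng.random()
        synthetic.append([a + gap * (b - a) for a, b in zip(base, neighbor)])
    return synthetic


# Toy example: 4 positive samples versus 20 negative samples.
positives = [[1.0, 2.0], [1.5, 1.8], [1.2, 2.4], [0.9, 2.1]]
negatives = [[5.0 + i * 0.1, 6.0 - i * 0.1] for i in range(20)]

new_points = smote(positives, n_new=len(negatives) - len(positives), k=3)
balanced_positives = positives + new_points
```

Because every synthetic sample lies on a line segment between two real positive samples, the original positive set is kept unchanged and no negative samples need to be discarded to balance the classes.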
Moreover, CRlncRC2 uses a more powerful machine learning model, extreme gradient boosting machine (XGBoost), to improve its predictive performance. The results show that both XGBoost and SMOTE can help to improve model accuracy and specificity.
After feature engineering, most of the expression and methylation features are retained, indicating their importance for predicting lncRNAs with potential functions in cancer. Using far fewer features, CRlncRC2 achieves a mean AUC 0.04 higher than that of CRlncRC.
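For readers unfamiliar with the metric, AUC (area under the ROC curve) equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it directly from that definition; the function name `roc_auc` and the labels and scores are hypothetical toy values, not results from the study.

```python
def roc_auc(labels, scores):
    """Compute AUC as the fraction of positive/negative pairs ranked correctly.

    Ties between a positive and a negative score count as half a correct pair.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    correct = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return correct / (len(pos) * len(neg))


# Toy example: 1 = cancer-related, 0 = not cancer-related (hypothetical labels).
toy_labels = [1, 1, 0, 0, 1, 0]
toy_scores = [0.9, 0.8, 0.7, 0.3, 0.6, 0.2]
toy_auc = roc_auc(toy_labels, toy_scores)  # 8 of 9 pairs are ranked correctly
```

On this scale, 0.5 corresponds to random ranking and 1.0 to perfect ranking.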
In addition, the top-ranking cancer-related lncRNA candidates predicted by the method are supported by Lnc2Cancer v2.0, literature reports, and statistical data.
The researchers thus concluded that CRlncRC2 is an effective and useful method for lncRNA-cancer association identification.
The study entitled “Identification of Cancer-Related Long Non-Coding RNAs Using XGBoost with High Accuracy” has been published in Frontiers in Genetics.
Contact
LIU Changning, Ph.D., Principal Investigator
Key Laboratory of Tropical Plant Resources and Sustainable Use, Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Menglun 666303, Yunnan, China
E-mail: liuchangning@xtbg.ac.cn