Yan Xu, Yingxi Yang, Hui Wang and Yuanhai Shao* Pages 166 - 174 ( 9 )
Motivation: Lysine malonylation in eukaryote proteins had been found in 2011 through high-throughput proteomic analysis. However, it was poorly understood in prokaryotes. Recent researches have shown that maonylation in E. coli was significantly enriched in protein translation, energy metabolism pathways and fatty acid biosynthesis.
Results: In this work we proposed a predictor to identify the lysine malonylation sites in E. coli through physicochemical properties, binary code and sequence frequency by support vector machine algorithm. The experimentally determined lysine malonylation sites were retrieved from the first and largest malonylome dataset in prokaryotes up to date. The physicochemical properties plus position specific amino acid sequence propensity features got the best results with AUC (the area under the Receive Operating Character curve) 0.7994, MCC (Mathew correlation coefficient) 0.4335 in 10-fold cross-validation. Meanwhile the AUC values were 0.7800, 0.7851 and 0.8050 in 6-fold, 8-fold and LOO (leave-one-out) cross-validation, respectively. All the ROC curves were close to each other which illustrated the robustness and performance of the proposed predictor. We also analyzed the sequence propensities through TwoSampleLogo and found some peptides differences with t-test p<0.01. The predictor had shown better results than those of other methods K-Nearest Neighbors, C4.5 decision tree, Naïve Bayes and Random Forest. Functional analysis showed that malonylated proteins were involved in many transcription activities and diverse biological processes. Meanwhile we also developed an online package which could be freely downloaded https://github.com/Sunmile/ Malonylation E.coli.
Malonylation, support vector machine, post translational modification, E. coli, Receive Operating Character (ROC), Prokaryotes.
Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, Department of Information and Computer Science, University of Science and Technology Beijing, Beijing 100083, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100080, School of Economics and Management, Hainan University, Haikou 570228