Similarity Analysis of Protein Sequences Based on the EMD Method
School of Science, Dalian Jiaotong University, Dalian 116028, China
Department of Computer Science and Technology, Dalian Neusoft University of Information, Dalian 116023, China
Abstract: An Empirical Mode Decomposition (EMD) method for analyzing the similarity of protein
sequences is proposed. A protein sequence is first converted into a signal sequence, which the EMD
method then decomposes into a group of well-behaved Intrinsic Mode Functions (IMFs) and a residue
that is either monotonic or a trend. The similarity of protein sequences can then be compared
conveniently and intuitively through their corresponding residues. The method's suitability is
verified on the cytochrome c protein sequences of seven different species.
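
The decomposition described in the abstract can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the abstract does not say how a protein sequence is converted into a signal, so the hypothetical `to_signal` helper uses Kyte-Doolittle hydropathy values as a stand-in; the envelopes are piecewise linear rather than the cubic splines usually used in EMD; the sifting loop runs for a fixed number of passes instead of using a convergence criterion; and the input string is an arbitrary toy sequence, not one of the cytochrome c sequences.

```python
# Stand-in encoding (an assumption, not the paper's): Kyte-Doolittle hydropathy.
HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def to_signal(sequence):
    """Map each amino-acid letter to a number, giving a discrete signal."""
    return [HYDROPATHY[aa] for aa in sequence]


def local_extrema(x):
    """Return (maxima, minima) as lists of interior indices of x."""
    maxima, minima = [], []
    for i in range(1, len(x) - 1):
        if x[i - 1] < x[i] >= x[i + 1]:
            maxima.append(i)
        elif x[i - 1] > x[i] <= x[i + 1]:
            minima.append(i)
    return maxima, minima


def envelope(x, idx):
    """Piecewise-linear envelope through (i, x[i]) for i in idx.

    The first and last samples are added as knots, a simple boundary
    treatment chosen for this sketch.
    """
    knots = [0] + idx + [len(x) - 1]
    env = [0.0] * len(x)
    for a, b in zip(knots, knots[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = x[a] + t * (x[b] - x[a])
    return env


def emd(signal, max_imfs=10, sift_passes=10):
    """Split signal into a list of IMF-like components and a residue."""
    residue = [float(v) for v in signal]
    imfs = []
    for _ in range(max_imfs):
        maxima, minima = local_extrema(residue)
        if not maxima or not minima:
            break  # too few extrema left: treat what remains as the trend
        h = residue[:]
        for _ in range(sift_passes):
            maxima, minima = local_extrema(h)
            if not maxima or not minima:
                break
            upper = envelope(h, maxima)
            lower = envelope(h, minima)
            # Subtract the mean of the two envelopes (one sifting pass).
            h = [v - (u + l) / 2 for v, u, l in zip(h, upper, lower)]
        imfs.append(h)
        residue = [r - v for r, v in zip(residue, h)]
    return imfs, residue


toy_sequence = "MKTAYIAKQRQISFVKSHFSRQ"  # arbitrary toy string, not real data
imfs, residue = emd(to_signal(toy_sequence))
print(len(imfs), "components extracted; residue length", len(residue))
```

By construction, the extracted components and the residue sum back to the input signal, so the residue can be read as what remains once the oscillatory parts are removed. A practical analysis would more likely rely on an established EMD implementation than on this simplified version.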
Fund:
Project supported by the Educational Commission of Liaoning Province of China (Grant No.
L2012167) and the National Natural Science Foundation of China (Grant Nos. 61273022, 11271060,
U0935004, U1135003, 11071031, 11290143).
Cite this article:
Jihong Zhang, Junsheng Zheng, Fenglan Bai, et al. Similarity Analysis of Protein Sequences Based on the EMD Method[J]. Journal of Fiber Bioengineering and Informatics, 2014, 7(3): 387-395.