
Conference publications

Abstracts

XX conference

E. coli promoter discrimination by PLS-DA

Temlykova E.A., Kamzolova S.G., Dzhelyadin T.R., Sorokin A.A.

Institute of Cell Biophysics RAS, Russia, 142290, Pushchino, Institutskaya str.3, phone: (4967)739319, E-mail: evgenia.teml@gmail.com

1 pp. (accepted)

Accurate and precise promoter prediction on a prokaryotic chromosome remains an unsolved problem. Using projection to latent structures discriminant analysis (PLS-DA), we built three models, each trained to separate the following classes: \begin{itemize} \item \textbf{Model 1:} promoters vs randomised sequences; \item \textbf{Model 2:} promoters vs coding and intergenic sequences; \item \textbf{Model 3:} promoters vs promoter ``islands'' [1]. \end{itemize} PLS-DA is a powerful tool for classifying multivariate data [2]. In PLS-DA each observation is characterised by a number of descriptors, forming matrix X, and a number of responses, forming matrix Y (class identifiers). The technique determines a new system of axes as linear combinations of the descriptors, for both matrices X and Y simultaneously. This procedure reduces dimensionality while preserving the information needed to separate the classes. In this study the electrostatic potential distribution [3] was used as the set of descriptors for the sequences.
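The idea of PLS-DA described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes a two-class problem with labels coded as 0 and 1, uses the standard NIPALS PLS1 algorithm written in NumPy, and replaces the electrostatic potential profiles with synthetic random data. The function names `fit_pls_da` and `predict_pls_da`, the number of latent components, and the 0.5 decision threshold are all hypothetical choices made for illustration.

```python
import numpy as np


def fit_pls_da(X, y, n_components=2):
    """Fit a two-class PLS-DA model with the NIPALS PLS1 algorithm.

    X: (n_samples, n_descriptors) descriptor matrix.
    y: (n_samples,) class identifiers coded as 0 or 1.
    Returns (x_mean, y_mean, coef) for use in predict_pls_da.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xr, yr = X - x_mean, y - y_mean  # centred residual matrices
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xr.T @ yr                # weights: direction of max covariance with y
        w /= np.linalg.norm(w)
        t = Xr @ w                   # scores on the new latent axis
        tt = t @ t
        p = Xr.T @ t / tt            # X loadings
        c = yr @ t / tt              # y loading
        Xr = Xr - np.outer(t, p)     # deflate X
        yr = yr - c * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    # Regression coefficients expressed in the original descriptor space
    coef = W @ np.linalg.solve(P.T @ W, q)
    return x_mean, y_mean, coef


def predict_pls_da(model, X):
    """Assign class 1 when the predicted response exceeds 0.5 (assumed threshold)."""
    x_mean, y_mean, coef = model
    response = (np.asarray(X, dtype=float) - x_mean) @ coef + y_mean
    return (response > 0.5).astype(int)


# Synthetic stand-in for descriptor profiles (NOT real promoter data):
# class 1 differs from class 0 by a shift in the first 10 descriptors.
rng = np.random.default_rng(0)
n_samples, n_descriptors = 200, 50
labels = np.repeat([0, 1], n_samples // 2)
profiles = rng.normal(size=(n_samples, n_descriptors))
profiles[labels == 1, :10] += 1.5

order = rng.permutation(n_samples)
train, test = order[:140], order[140:]
model = fit_pls_da(profiles[train], labels[train], n_components=2)
accuracy = np.mean(predict_pls_da(model, profiles[test]) == labels[test])
print(f"held-out accuracy on synthetic data: {accuracy:.2f}")
```

The latent axes are built one at a time, each chosen to maximise covariance between the descriptors and the class identifier, which is how the method reduces dimensionality while keeping class-relevant variation. The accuracy printed here concerns the synthetic data only and says nothing about the results reported in the abstract.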

The prediction accuracy of each model was around 75%. In combination with textual analysis [1], the accuracy reached 89-95%.

References

1. Shavkunov K.S., Masulis I.S., Tutukina M.N., Deev A.A., Ozoline O.N. Gains and unexpected lessons from genome-scale promoter mapping // Nucleic Acids Research, V.37, 2009, pp.4919-4931

2. Barker M., Rayens W. Partial least squares for discrimination // J. Chemom., V.17, 2003, p.166

3. Polozov R.V., Dzhelyadin T.R., Sorokin A.A., Ivanova N.N., Sivozhelezov V.S., Kamzolova S.G. Electrostatic potentials of DNA. Comparative analysis of promoter and nonpromoter nucleotide sequences // J. Biomol. Struct. Dyn., V.16, N.6, 1999, pp.1135-1143


