Abstract:
Objective This study aimed to develop a pattern-recognition-based system for automatic single nucleotide polymorphism (SNP) detection in diploid fragment sequencing and to improve detection accuracy.
Method The LabWindows/CVI 9.0 platform and the Matlab environment were combined to analyze .ab1 or .scf files generated in diploid PCR fragment sequencing. First, the traces of the four bases G, A, T and C were separated and denoised by one-dimensional discrete wavelet filtering, followed by extraction of typical features at each base position (peak) from the fluorescence curve. A classifier based on a back-propagation neural network was then used for SNP recognition and diagnosis.
Result The established system features a friendly interface, stable operation and support for manual modification. It classifies SNP reliability into six grades. A performance test with 143 SNPs in 26 sequencing fragments from Eucalyptus urophylla demonstrated that our system outperformed three previously reported software packages in detection accuracy, false positive rate and false negative rate.
Conclusion Our system achieves high accuracy without requiring a reference sequence. It could be used for efficient SNP detection in diploid PCR fragment sequencing.
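
The first two Method steps (wavelet denoising of a single base trace, then feature extraction at a peak position) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the study: the Haar wavelet, the single decomposition level, the soft-threshold rule, the threshold value, and the secondary-to-primary peak height ratio used as the example feature. All function names are hypothetical. The back-propagation neural network classifier is not reproduced; in the described system, features of this kind would be passed to that classifier.

```python
import math

SCALE = 1 / math.sqrt(2)  # normalization factor of the Haar wavelet


def haar_dwt(signal):
    """One level of the Haar discrete wavelet transform (even-length input)."""
    approx = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    detail = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return approx, detail


def haar_idwt(approx, detail):
    """Inverse of haar_dwt: rebuild the signal from both coefficient sets."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SCALE)
        out.append((a - d) * SCALE)
    return out


def soft_threshold(coeffs, t):
    """Shrink each coefficient toward zero by t (assumed denoising rule)."""
    return [math.copysign(max(abs(c) - t, 0.0), c) for c in coeffs]


def denoise(trace, threshold):
    """Denoise one base trace by thresholding its detail coefficients."""
    approx, detail = haar_dwt(trace)
    return haar_idwt(approx, soft_threshold(detail, threshold))


def secondary_peak_ratio(heights):
    """Example feature: second-highest peak height over the highest.

    heights maps each base (G, A, T, C) to its peak height at one position.
    A ratio near 1 suggests two overlapping peaks, as expected at a
    heterozygous site; a ratio near 0 suggests a single clean peak.
    """
    ranked = sorted(heights.values(), reverse=True)
    return ranked[1] / ranked[0] if ranked[0] > 0 else 0.0


# Hypothetical values, for illustration only.
noisy_trace = [10.0, 10.2, 10.0, 9.8]
smoothed = denoise(noisy_trace, threshold=1.0)
ratio = secondary_peak_ratio({"G": 100.0, "A": 90.0, "T": 5.0, "C": 3.0})
```

With a threshold of zero, `denoise` returns the input trace unchanged up to floating-point rounding, which is a quick check that the transform pair is consistent. A real trace would normally need several decomposition levels and a data-driven threshold.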