From Bioinformatics Core

  • Developer: Brad Sickler
  • Started: October 2006
  • Completed: March 2007

Overview

Prof. David G. Smith, of the UC Davis Department of Anthropology, is working on QTL linkage studies and population dynamics in Rhesus macaque monkeys. To do this he needs to develop a library with a sufficient number of genetic markers. The R. macaque draft genome is close to completion (~6.1X coverage), but only a small number of markers are currently available. This project was started as a first-of-its-kind effort to use 454 pyrosequencing technology to generate a large pool of candidate SNP/Indel markers.

454’s sequencing technology can obtain hundreds of thousands of sequences, each around 100 bp long, scattered randomly across the entire genome.

Project Goals

  • 1. Analyze the feasibility of using 454-generated sequences to identify potential polymorphisms. Specifically, analyze pre-existing sets of 454 data and simulate a run on the R. macaque genome to verify experimental validity. 454 Distribution | 454 Simulation
  • 2. Develop, test, and tune a computational pipeline for SNP and Indel discovery using 454 sequences against a reference genome.
  • 3. Develop a set of visualization and analysis tools to facilitate candidate selection for sequencing and polymorphism verification. http://mamusnp.genomecenter.ucdavis.edu
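The run simulation mentioned in goal 1 can be illustrated with a minimal sketch. This is not the project's actual simulation code: the function names, the toy genome size, and the read counts are assumptions chosen only for illustration. The sketch places fixed-length reads uniformly at random on a genome and measures the fraction of bases covered by at least one read, then compares it with the standard Lander-Waterman style approximation 1 - exp(-c), where c is the average coverage.

```python
import math
import random


def simulate_fraction_covered(genome_len, n_reads, read_len, seed=0):
    """Place reads uniformly at random and return the fraction of
    genome positions covered by at least one read (hypothetical helper)."""
    rng = random.Random(seed)
    # Difference array: +1 where a read starts, -1 just past where it ends.
    diff = [0] * (genome_len + 1)
    for _ in range(n_reads):
        start = rng.randrange(genome_len - read_len + 1)
        diff[start] += 1
        diff[start + read_len] -= 1
    covered = 0
    depth = 0
    for i in range(genome_len):
        depth += diff[i]  # running read depth at position i
        if depth > 0:
            covered += 1
    return covered / genome_len


def expected_fraction_covered(genome_len, n_reads, read_len):
    """Approximate covered fraction as 1 - exp(-c), with c = N * L / G."""
    c = n_reads * read_len / genome_len
    return 1.0 - math.exp(-c)


if __name__ == "__main__":
    # Toy numbers (assumed): 1 Mb genome, 5,000 reads of 100 bp, so c = 0.5.
    G, N, L = 1_000_000, 5_000, 100
    print("simulated:", round(simulate_fraction_covered(G, N, L), 4))
    print("expected: ", round(expected_fraction_covered(G, N, L), 4))
```

Scaling the same calculation to a full genome and to read counts in the hundreds of thousands gives a rough idea of how much of the reference a single run would touch, which is the kind of question a feasibility analysis needs to answer before candidate polymorphisms can be called against the reference.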

Final Status

All goals were met as of November 2006. The completed pipeline is fully reusable and all generated data and queries are entered in a relational database. Overall, this pipeline identified 22,892 candidate SNPs and 2,923 candidate Indels from two initial 454 runs. Preliminary resequencing results confirm a success rate of over 60% in verifying the SNPs. Tracks were developed for SNPs and Indels in the UCSC genome browser and all SNP results are available online at http://mamusnp.genomecenter.ucdavis.edu.

With the success of the project, Dr. Smith is submitting a 5-year grant proposal to the NIH to develop markers using this method. The methods used to generate these markers have been published in PLoS ONE: MamuSNP: A Resource for Rhesus Macaque (Macaca mulatta) Genomics, 2007.

Time Estimates

285 total working hours.
