Initial analysis of Mozy DNA sequence

In this exercise, we will take a first look at the DNA sequence for Mycobacteriophage Mozy. You will recall that we submitted Mozy DNA to collaborators at VCU for pyrosequencing. After performing the sequencing, our collaborators assembled the reads into four contigs, named contig00001, contig00018, contig00019, and contig00020.

Your task for today is to analyze the four contigs to identify their relationship to other sequences in GenBank. Two phages that are apparently closely related to Mozy are PMC and Tweety.

Use the four Mozy contigs and the bioinformatics tools we have covered in class so far to address the following items. Note that some are tasks rather than questions.

  1. Assuming the four Mozy contigs represent the entire genome, how does the total genome size compare with the sizes predicted from the restriction digests we did earlier in the semester?
  2. Which of the four Mozy contigs are related to the Tweety and/or PMC genome?
  3. Which of the four Mozy contigs is related to the Mycobacterium smegmatis and Rhodococcus ruber genomes? Briefly describe the annotated function of genes that are present in this region of Rhodococcus ruber.
  4. Using the maps of the PMC and Tweety genomes, identify the regions that are conserved in any of the four Mozy contigs. Label the printed genome maps accordingly.
  5. Construct a dot plot showing the relationship of PMC to Tweety.
  6. Construct a dot plot showing the relationship of the four Mozy contigs to PMC.
  7. Construct a dot plot showing the relationship of the four Mozy contigs to Tweety.
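
For items 5 through 7, it may help to see what a dot plot actually computes. The following is a minimal sketch in Python, not the tool used in class: it marks every position where a short word from one sequence also occurs in the other. The function names, the word size of 4, and the short sequences in the usage note are made up for illustration, and the sketch compares the forward strand only.

```python
from collections import defaultdict


def dot_plot_matches(seq_x, seq_y, word=4):
    """Return (i, j) pairs where seq_x[i:i+word] equals seq_y[j:j+word].

    Illustrative helper: forward strand only, exact word matches only.
    """
    # Index every word of length `word` in seq_y by its start positions.
    index = defaultdict(list)
    for j in range(len(seq_y) - word + 1):
        index[seq_y[j:j + word]].append(j)

    # Look up each word of seq_x in that index.
    matches = []
    for i in range(len(seq_x) - word + 1):
        for j in index.get(seq_x[i:i + word], []):
            matches.append((i, j))
    return matches


def render(seq_x, seq_y, word=4):
    """Draw the dot plot as text, with seq_x across and seq_y down."""
    hits = set(dot_plot_matches(seq_x, seq_y, word))
    width = len(seq_x) - word + 1
    rows = []
    for j in range(len(seq_y) - word + 1):
        rows.append("".join("*" if (i, j) in hits else "." for i in range(width)))
    return "\n".join(rows)
```

Calling render("ACGTAC", "ACGTAC") on a made-up sequence compared with itself draws a single unbroken diagonal. Related genomes show diagonal segments wherever they share sequence, and breaks or offsets in the diagonal mark insertions, deletions, or rearrangements. A text rendering like this is only practical for toy sequences; for whole phage genomes, use a dedicated dot plot program with a larger word size.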