April 20

Lab 13: Research Project Week 4

Jess Hastings

Date of Work: 4/18/2017

Rationale: The purpose of this lab was to proofread each part of the poster that had been written and to construct the final product. Overall conclusions were also drawn, and the conclusion section was written as a group.

Methods: 

  1. Obtain Dot Plot from Lathan
  2. Create Results Flow Chart
  3. Draw Overall Conclusions

 Tools used: NCBI Blast, Phamerator

Results:

Looking at the numerical data from the multiple alignment, the following percent similarities can be seen.

 

Figure 1: Percent Similarity From Multiple Alignment

The alignment gave these percent similarities for the nucleotide sequences of the tape measure proteins. The overall similarity of the three sequences was calculated by counting the positions where all three sequences match and dividing by the number of base pairs in the longest sequence. While all the percentages are relatively close, AN and AK share the greatest similarity, which was also seen when looking over the alignment result.
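The calculation described above can be sketched in Python. This is only an illustration of the counting method: the function name and the short aligned rows are made up and are not the real tape measure data.

```python
def three_way_similarity(aligned_a, aligned_b, aligned_c):
    """Percent of alignment columns where all three rows carry the same base,
    divided by the length of the longest ungapped sequence."""
    if not (len(aligned_a) == len(aligned_b) == len(aligned_c)):
        raise ValueError("aligned rows must all have the same length")
    # Count columns where every row has the same base (gap columns never count).
    matches = sum(
        1
        for a, b, c in zip(aligned_a, aligned_b, aligned_c)
        if a == b == c and a != "-"
    )
    # "Largest number of base pairs" is taken to mean the longest sequence
    # once the gap characters are removed.
    longest = max(len(row.replace("-", "")) for row in (aligned_a, aligned_b, aligned_c))
    return 100.0 * matches / longest


# Toy aligned rows, invented for illustration only.
an = "ATGCCGTA"
ak = "ATGACGTA"
am = "ATG--GTA"
print(three_way_similarity(an, ak, am))
```

With these toy rows, six columns match in all three sequences and the longest sequence has eight bases.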

A dot plot was also made of the sequences to visually compare sequences.

 

Figure 2: Dot Plot of AN, AM, and AK cluster tape measure proteins

The stronger similarity between AN and AK can also be seen in this dot plot. The cluster sequences are placed along the axes of the graph at the following positions:

AN: 1-1998 bp

AK: 2009-4576 bp

AM: 4587-5648 bp

 

The AM portion of the dot plot is noticeably lighter, meaning that there is less similarity. The AM sequence shows more similarity to itself than it does to the AN or AK sequences.
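To read a point on the dot plot, it helps to translate its coordinates back to the cluster each one falls in. The sketch below uses the position ranges listed above; the function names are hypothetical, and positions that fall between two ranges are treated as spacers.

```python
# Position ranges of each sequence along the dot plot axes (from the list above).
SEGMENTS = [("AN", 1, 1998), ("AK", 2009, 4576), ("AM", 4587, 5648)]


def cluster_at(position):
    """Return the cluster whose range contains this position, or None."""
    for name, start, end in SEGMENTS:
        if start <= position <= end:
            return name
    return None  # position is in a spacer between sequences or off the plot


def dot_region(x, y):
    """Name the pair of clusters compared at dot plot coordinate (x, y)."""
    return cluster_at(x), cluster_at(y)


print(dot_region(3000, 5000))  # a point comparing AK (x axis) with AM (y axis)
```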

Conclusions:

After comparing the data and using the dot plot, we concluded that the AM sequence was too short to compare well with the AN and AK sequences. The short AM sequence significantly lowered the similarity that could be measured between the sequences. The AM tape measure is 1,062 bp long, while the AN and AK sequences are 1,998 and 2,567 bp long, respectively. Tape measure proteins would need to be more similar in length to get a more accurate comparison.
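The effect of the length difference can be estimated with a short calculation. Assuming the overall similarity is computed as described in the Results (three-way matches divided by the longest sequence), the number of three-way matches can never exceed the length of the shortest sequence, which sets a ceiling on the result. The function name is hypothetical.

```python
# Sequence lengths in base pairs, as reported in the conclusion above.
lengths = {"AN": 1998, "AK": 2567, "AM": 1062}


def max_three_way_similarity(lengths):
    """Upper bound on percent similarity when matches are divided by the longest
    sequence: at most every base of the shortest sequence can match."""
    return 100.0 * min(lengths.values()) / max(lengths.values())


print(round(max_three_way_similarity(lengths), 1))
```

Even a perfect match of the whole AM sequence would therefore give an overall value of only about 41%.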

Future Plans:

After analyzing the data and writing all parts of the poster, the poster was constructed. There are still a few more edits to be made on our poster before next lab period. These changes include:

  • shortening text (introduction and abstract)
  • bulleting results
  • adding in a Phamerator map of Bennie, Courtney3, and Circum
  • inserting a cluster map to better show the cluster comparisons
April 20

Lab 12: Research Project Week 3

Jess Hastings

Date of Work: 4/11/2017

Rationale: The purpose of this lab was to continue collecting information for our research project and begin drawing conclusions. Additionally, the rough draft of our poster was to be turned in at the end of class along with a detailed plan of which student was completing which part of the rest of the project. Lab time was spent finding primary sources and creating a multiple alignment of all three clusters to find the overall similarity.

Methods: 

  1. Obtain all three nucleotide sequences for the alignment (one sequence from each cluster)
  2. In a FASTA file, place each tape measure sequence following the correct format, and give each sequence an easily identifiable name
  3. Upload the FASTA file into Clustal Omega and run alignment
  4. Analyze results of sequence alignment
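A FASTA file like the one in step 2 can also be built with a few lines of Python. This is a minimal sketch: the record names are only examples of "easily identifiable" names, and the sequences are short placeholders rather than the real tape measure sequences.

```python
def to_fasta(records, width=60):
    """Format (name, sequence) pairs as FASTA text: a '>' header line for each
    record followed by the sequence wrapped at `width` characters per line."""
    lines = []
    for name, seq in records:
        lines.append(f">{name}")
        for i in range(0, len(seq), width):
            lines.append(seq[i:i + width])
    return "\n".join(lines) + "\n"


# Placeholder sequences, invented for illustration only.
records = [
    ("AN_Courtney3_tape_measure", "ATGACCGGTA"),
    ("AK_Bennie_tape_measure", "ATGGCCGGTT"),
    ("AM_Circum_tape_measure", "ATGACGGA"),
]
print(to_fasta(records))
```

The returned text can be saved to a file and uploaded to the alignment tool.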

 Tools used: NCBI Blast, Clustal Omega, PubMed, PhagesDB

Results:

The FASTA file had to have all the sequences that were to be aligned in the format shown in Figure 1. Once the FASTA file was uploaded to Clustal Omega the alignment was generated within a few minutes.

Figure 1: FASTA File for Multiple Alignment

The alignment showed the similarity between the AM, AN, and AK clusters based on the phages Circum, Courtney3, and Bennie. When all three sequences matched, the alignment has a star beneath the rows. When only two of the sequences matched, the nucleotide bases are shown without a star. When a sequence did not line up with the others, a “-” is displayed. Figure 2 shows part of the alignment.
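The star row can be reproduced from the aligned rows themselves. The sketch below is a simplified imitation of that marker line, not Clustal Omega's own code, and the three rows are invented for illustration.

```python
def conservation_line(rows):
    """Build the marker line shown under an alignment block: '*' where every
    row has the same base in that column, a space everywhere else."""
    marks = []
    for column in zip(*rows):
        same_base = len(set(column)) == 1 and column[0] != "-"
        marks.append("*" if same_base else " ")
    return "".join(marks)


# Toy aligned rows, invented for illustration only.
rows = ["ATG-C", "ATGAC", "ATCAC"]
for row in rows:
    print(row)
print(conservation_line(rows))
```

Counting the stars in this line gives the number of positions where all three sequences match.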

Figure 2: Multiple Alignment of AN, AM, and AK clusters

By analyzing the results of this alignment, we can deduce the similarity between the three clusters for these three phages.

Also during this lab, multiple primary sources were found to assist our knowledge for this project. All the sources used relate to the tape measure protein and phage cluster categorizing. These sources will be used in our poster to help write the abstract, introduction, and analyze our results.

Sources:

https://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-14-410
https://bmcbioinformatics.biomedcentral.com/articles/10.1186/1471-2105-12-S9-S10
http://bmcgenomics.biomedcentral.com/articles/10.1186/1471-2164-15-855

Conclusions:

By just looking at the multiple alignment, it is noticeable that the AM sequence matches far less with either of the other two sequences. The AN and AK sequences have many more nucleotide base pairs in common. There are a few areas in the latter end of the sequence where all three sequences match more frequently. Upon further investigation of these clusters, this may be a trend that relates to the morphology of the phage.

Future Plans:

Using the information from the multiple alignment, we will have a numerical value that describes the similarity of the clusters. This can be used to draw conclusions about our results and our overall hypothesis. Using the FASTA file, we will also create a dot plot to have a visual comparison of the clusters.

April 19

Lab 11: Research Project Week 2

Jess Hastings

Date of Work: 3/28/2017

Rationale: The purpose of this lab was to continue working on our research project and begin collecting information. In order to do this, our research question has to be clearly defined and a plan of action for the project must be outlined.

Methods: 

  1. Identify two other clusters that have Myoviridae morphology
  2. Identify two published phage genomes within those two clusters
  3. Identify the tape measure gene in each genome and obtain nucleotide sequence
  4. Run an NCBI Blast to see similarity between tape measure genes in the same cluster

 Tools used: DNAMaster, Phamerator, NCBI Blast

Results:

The two clusters containing Myoviridae phages that were used were the AM and AK clusters.

The AM cluster only contains 1 published genome, Circum. While we were unable to compare this sequence with another AM phage to see similarity, the sequence will be used when comparing the different clusters.

Figure 1: NCBI Blast Results – Circum (AM)

Within the AK cluster, the phages Bennie and Greenhouse were compared. When comparing the nucleotide sequences of these two phages' tape measure genes, they were found to have a 93% similarity. This is lower than the 99% similarity that was found between AN phages, but is still a high match.

Figure 2: NCBI Blast results for Bennie (93% match with Greenhouse)

By comparing these two phages in the AK cluster, we were able to confirm that tape measure genes within a cluster share similarities in nucleotide sequence. With this understanding, we then moved on to comparing the different clusters to see if there was any similarity in their tape measure genes.

Conclusions:

The results found during lab answered multiple questions in our research project. By comparing the sequences of two AK cluster phages, we found that they also share a high amount of similarity. This is expected because phages are put into clusters based upon genome similarity. Also, these two phages are of the same morphology. This matters because the tape measure protein is directly related to the length of the phage's tail. We wanted to be sure to compare two different phages that have the same morphology because each morphology is known for having an average length of tail.

Future Plans:

After gathering all the nucleotide sequences from three different clusters and comparing their similarity within each cluster, we will now compare the three different clusters against each other. While we expect there to be some difference in the nucleotide sequence because they are from different clusters, we also expect there to be some similarities. This is expected because the phages are of the same morphology, which means that they all have a similar average tail length, which is determined by the length of the tape measure protein.

March 22

Lab 10: Research Project Week 1

Jess Hastings

Date of Work: 3/21/2017

Rationale: The purpose of this lab was to begin working on our group research projects. As the question is explored more, it will be developed and changed based on our findings. Also, the rough draft of the poster layout was discussed. Additionally, we looked for more online bioinformatics tools that would be helpful in our projects.

Methods: 

  1. Identify the pham of the tape measure gene in Courtney3
  2. Identify other published genes in that pham
  3. Collect the protein sequences of the tape measure from the other published genes
  4. Compare the sequences by aligning them on Gepard

 Tools used: DNAMaster, Phamerator, NCBI Blast

Results:

Using Phamerator, we found that the tape measure gene on Courtney3 was in pham 6177. The other published genomes we found in that pham and researched were:

Maggie

Decurro

Chestnut

Moloch

Multtie

Upon doing this research, we discovered that the tape measure genes in these genomes all had the protein sequence:

MSRTAVLAVRIVTETKEANKGIDDTVSKLDKFERGLDKAALPAAAAGTAVLAFAKKTGDMASIAQQNAGAVDSVFKGNAKTVNEFAATAADKLGLSGSAYQQMASVIGSQLKNMGVPMDQVAGSTNDLIAKGADLAAMFGGTTSDAVDALSSLLRGERDPIEKYGVSINDAAIQAKKAELGLAGLSGEADKNATLTATMALLQKQTADATGQFAREADSAAGAQERANAKIQDAGAKLGSVFLPAMAAAATAAGGMATWASENSTVLLVLAGIIGGVAGAILLINGALKAWRAATAAVAAVQVVLNAVMSANPIGLVVLAIAALVAGLVWAYNNVGWFKDFVDQAFAAIGAVVAAVAQWFQDAWNNAVTFVQAYIEAWSIIINAVFTGIQTAVGAVAQFFTDAWNNAVTFVQAYISAWGIIINAVFTGVQSAVGAVADFFRNAWAVAVAIVAGVIRSWQAGVNAVFNAVGSFISGVVNNVRNVFSSVFNVILGIVTGVIAGVRGAIDGVTSTVQSVASIINGALVAAFNFVASAGRNAFAGITGAIQGVIGWIQNALSWVRNLASGIGNAVGQMLGLGGATAATADAPGLSYFGGGDPGFEGGATSIFGGNTFFGAPAPKAAAPIIVNLTVNGAMDPTAVGKQIYDILLKYLRRNGDVVNGATPWZ

Upon more research, we discovered that all of these genomes had the same tape measure protein because they are all in the same pham. This is part of the way that genes are placed into phams, so there is 100% alignment between the sequences of the tape measure gene for pham 6177.
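A quick way to confirm that a set of collected protein sequences is identical is to check how many distinct sequences it contains. This is a minimal sketch: the function name is hypothetical and the short sequences are placeholders standing in for the full tape measure protein.

```python
def all_identical(sequences):
    """True when every phage in the dictionary has exactly the same sequence."""
    return len(set(sequences.values())) <= 1


# Placeholder sequences (first residues only), used for illustration.
pham_sequences = {
    "Courtney3": "MSRTAVLAVRIV",
    "Maggie": "MSRTAVLAVRIV",
    "Decurro": "MSRTAVLAVRIV",
}
print(all_identical(pham_sequences))
```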

Figure 1: NCBI Blast Results for protein sequence (Courtney3)

Conclusions:

After recognizing this similarity in the pham and doing more research, we realized that we will need to compare different clusters. By comparing different clusters, the genomes will have different tape measure sequences, so they will not be identical. Phages are put into clusters based on the sequence of their genomes. So, in order to compare differences, we must compare different clusters. We now have a better understanding of the branching/categorizing of phages.

Future Plans:

After discovering this, we had to redesign a bit of our research question. We will be looking at multiple tape measure sequences of phages in different phams. These are the sequences we will align and build a consensus sequence of if possible. After discovering the differences in the sequences, we want to look at the functions of the tape measure in each phage and see if there is any correlation.

March 22

Lab 8: Annotating Timinator

Jess Hastings

Date of Work: 2/27/2017

Rationale: The purpose of the lab was to annotate the full genome of Timinator by having each student annotate 5 genes individually. This process would allow us to annotate a whole genome in one class and provide more practice with annotating.

Tools Used: DNAMaster, DNAMaster Quick Start Guide, NCBI Website, Phagesdb Website, Phamerator, Starterator, HHPred

Methods: 

  1. Run an Auto-Annotation of Timinator_Draft in DNAMaster
  2. For each gene in the region, check and, if needed, change the following parts of the annotation:
    • SSC: call start of the gene
    • CP: coding potential found on GeneMark (on phagesdb)
    • SD: score, if it is the best score and if not why, z-score
    • SCS: does it agree with Glimmer and GeneMark
    • Gap: calculated from start/stop location
    • NCBI Blast: protein product on NCBI
    • PhagesDB BLAST: protein product on PhagesDB
    • HHPred: best hit with e value or no good hit
    • LO: longest open reading frame
    • ST: does it agree with Starterator
    • F: function
    • FS: evidence that supports function decision
  3. Fill each part of the annotation into the Timinator Google document
  4. Change status on the home page to completed
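The Gap entry in step 2 comes straight from the stop of the previous gene and the start of the gene being annotated. The sketch below assumes two forward genes with inclusive coordinates; the function name is hypothetical. The first example uses the coordinates of Lore_Draft genes 5 and 6 recorded in the Lab 6 entry.

```python
def gap_between(prev_stop, next_start):
    """Number of bases between two adjacent forward genes (inclusive coordinates).
    A positive value is a gap; a negative value is an overlap of that many bases."""
    return next_start - prev_stop - 1


# Lore_Draft gene 5 stops at 2497 and gene 6 starts at 2549.
print(gap_between(2497, 2549))

# Hypothetical coordinates: the next gene starts 3 bases before the previous stop.
print(gap_between(100, 97))
```

The first call gives the 51 bp gap listed for Lore_Draft gene 6, and the second gives -4, meaning a 4 bp overlap.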

 

Conclusions:

The genes in Timinator that I annotated were: Gene 6, Gene 28, Gene 50, Gene 72, & Gene 78. The result and evidence for each annotation follow.

Gene 6:

Figure 1: Gene 6

 

The following annotation shows my results for gene 6:

Start: 5593bp Stop: 5829bp BKWD
GAP: 127bp Gap
SD Final Value: SD Score: -3.523 (best score) Z-Value: 2.642
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: hypothetical protein BARRETLEMON_6 [Arthrobacter phage BarretLemon] q1:s1 E-Value: 1e-47
CDD: No good hit
PhagesDB BLAST: Function Unknown [Sonny_6] q1:s1 E-Value: 5e-31
HHPred: No good hit
LO: No, Blast q1:s1, Agrees with GeneMark and Glimmer
ST: Agrees with Starterator
F: NKF
FS: NCBI, PhagesDB
Notes:

 

Gene 6 followed the auto annotation. No start changes needed to be made because the call made in the auto annotation agreed with Starterator and the GeneMark coding potential, and produced a q1:s1 match on the NCBI Blast. Even though this was not the longest ORF, I did not change the start because of the q1:s1 match and because this is a reverse gene, so there needed to be a gap for the promoters.

Evidence for gene 6:

Figure 2: Gene 6 Coding Potential from GeneMark

Figure 3: NCBI Blast results for Gene 6

Figure 4: PhagesDB Blast results for Gene 6

Figure 5: HHPred results for Gene 6; no good hit

Gene 28:

Figure 6: Gene 28

Start: 23729bp Stop: 25318bp FWD
GAP: 11bp Overlap
SD Final Value: SD Score: -2.578 (2nd best score) longest ORF Z-Value: 3.143
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: endolysin [Arthrobacter phage BarretLemon] q1:s1 E-Value: 0
CDD: PGRP Superfamily E-Value: 1.18e-7
PhagesDB BLAST: endolysin [BarretLemon 28] q1:s1 E-Value: 0
HHPred: lysin E-Value: 1.4e-23
LO: Yes
ST: Agrees with Starterator
F: endolysin, LysM-like
FS: HHPred, NCBI, PhagesDB
Notes:

Gene 28 also followed the auto annotation given by DNA Master. This gene was forward, and although the call is the 2nd best SD score, it is the longest ORF. Gene 28 has the function endolysin, and was part of the PGRP Superfamily.

Evidence for Gene 28:

Figure 7: Gene 28 Conserved Domain Results from NCBI

Figure 8: Gene 28 Coding Potential from GeneMark

Figure 9: Gene 28 NCBI Results

Figure 10: Gene 28 HHPred Hit

Gene 50:

Figure 11: Gene 50

Start: 37229bp Stop: 37447bp FWD
GAP: 4bp Overlap
SD Final Value: SD Score: -5.179 (Best score) Z-Value: 1.848
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: Hypothetical Protein BarretLemon 50 [Arthrobacter phage BarretLemon] q1:s1 E-Value: 5e-43
CDD: No good hit
PhagesDB BLAST: [BarretLemon 50] E-Value: 4e-35
HHPred: No good hit
LO: Yes
ST: Agrees with Starterator
F: NKF
FS: NCBI, PhagesDB
Notes:

Gene 50 followed the auto annotation given by DNA Master. This gene was forward and the call was the best SD score, longest ORF, and agreed with Starterator. Because of these reasons, I made no change. No function was found for Gene 50.

Evidence for Gene 50:

Figure 12: Gene 50 Coding Potential from GeneMark

Figure 13: Gene 50 PhagesDB Blast Results

Figure 14: Gene 50 NCBI Results

Figure 15: Gene 50 HHPred Hit; no good hit

Gene 72:

Figure 16: Gene 72

Start: 45650bp Stop: 46270bp FWD
GAP: 4bp Overlap
SD Final Value: SD Score: -6.6022 (6th best score) Longest ORF Z-Value: 1.283
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: AlpA-like DNA binding protein [Arthrobacter phage BarretLemon] q1:s1 E-Value: 9e-75
CDD: No good hit
PhagesDB BLAST: AlpA-like DNA binding [BarretLemon_72] E-Value: 9e-59
HHPred: TORI inhibition Protein – DNA binding protein E-Value: 1.7e-11
LO: No, was not best score
ST: Agrees with Starterator
F: dsDNA break-binding protein, AddA-like
FS: PhagesDB, NCBI, HHPred
Notes:

Gene 72 followed the auto annotation produced by DNAMaster and was found to have a function of dsDNA break-binding protein. This call was not the longest ORF because it was not the best SD score. Additionally, this call had a q1:s1. There needs to be a gap for the promoters.

Evidence for Gene 72:

Figure 17: Gene 72 Coding Potential from GeneMark

Figure 18: Gene 72 PhagesDB Results

Figure 19: Gene 72 NCBI Results

Figure 20: Gene 72 HHPred Hit

Gene 78:

Figure 21: Gene 78

Start: 50785bp Stop: 49781bp BKWD
GAP: 50bp Gap
SD Final Value: SD Score: -4.79 (2nd best score) longest ORF, q1:s1 Z-Value: 2.051
CP: The gene is covered
SCS: Disagrees with Glimmer, Agrees with GeneMark
NCBI BLAST: Hypothetical Protein [Arthrobacter phage BarretLemon 78] q1:s1 E-Value: 0
CDD: DUF932 SuperFamily E-Value: 1.39e-52
PhagesDB BLAST: Hypothetical Protein [BarretLemon 78] q1:s1 E-Value: 0
HHPred: No good hit
LO: Yes
ST: longest ORF, q1:s1
F: NKF
FS: NCBI, PhagesDB
Notes:

 

Gene 78 did not follow the auto annotation given by DNA Master. In the auto-annotation, gene 80 was removed because it overlapped with gene one and gene 79. After this shift, I chose to extend gene 78 and move the start call. I made this decision based on the q1:s1 score received in BLAST. I found, when BLASTing the auto annotation, a score of q1:s3. By extending the gene, it was the longest ORF and the 2nd best SD score. This call still has a big enough gap for the promoters because it is a reverse gene.

Evidence for Gene 78:

Figure 22: Gene 78 Conserved Domain Results from NCBI

Figure 23: Gene 78 Coding Potential from GeneMark

Figure 24: Gene 78 NCBI Results

Figure 25: Gene 78 HHPred Hit; no good hit

 

Conclusion:

After annotating all 5 of these genes, I feel mostly comfortable with the annotation practices. The part I am least confident on is annotating reverse genes. I was able to annotate all 5 of these genes within the 2.5 hours of lab time.

Future Plans:

Next week in lab we will double check our annotations of Timinator, and once they are complete then we can begin the process of preparing Timinator to be submitted.

March 20

Lab 9: Research Project Planning

Jess Hastings

Date of Work: 3/13/2017

Rationale: The purpose of this lab was to create groups and brainstorm a project idea for our semester final project.

Methods: 

The following resources were explored to brainstorm research questions:

  1. phagesdb.org >> Glossary
  2. phagesdb.org >> Links
  3. Outside searches

 Tools that will be used: DNAMaster, Phamerator, Gepard, GenBank, ClustalW, NCBI Blast

Conclusions:

Our group is made up of Kayla Wilson and me. We want to explore the tape measure protein for sequence similarities. Our goal is to identify whether there is similar sequence within this gene that is present in all phages. The tape measure gene is directly correlated with the length of the tail. To narrow our question, we want to look at a single pham for comparison. We are planning on using Phamerator to identify initial similarities within the protein. ClustalW is another program that we can use to compare and align multiple sequences. After looking through a pham (or two) and finding whether there are any similarities, we want to make a consensus sequence that serves as an outline for the similarity. If there is time, we can use Gepard to link the similarities to a CDD.

Future Plans:

Going into lab next week, we are going to start by identifying the pham we are going to research. We are going to start with the phage Courtney3 (a published genome) and identify the pham of that tape measure gene. From there, Phamerator and NCBI will be used to see initial similarities between the genes.  If there is time we would also like to look at the location of the tape measure gene in the phage.

 

 

February 22

Lab 7: Conclusion of Lore_Draft

Jess Hastings

Date of Work: 2/22/2017

Rationale: The purpose of this lab was for each group to present the region of Lore_Draft that they annotated and what they found during this process. Presentations were an efficient way to review the whole genome and look for annotations that needed to be double checked. All annotations need to be carefully checked before being sent to NCBI for publishing. We also began learning about the next steps in the process of sending a gene to be published.

Tools Used: DNAMaster

Methods: 

  1. Finish presentations and submit on Canvas under assignment
  2. Present to class the outcome of annotations, be sure to include evidence as to why decisions were made to make changes to the auto annotation or not
  3. Answer questions asked by lab instructors and classmates about decisions
  4. If needed, do further research to confirm annotation

Conclusions:

This lab was very beneficial because it showed a very important part of genome annotation. After each annotation was completed, it needed to be reviewed by someone other than the person who annotated it. This helps to catch mistakes that could have been overlooked while working. As a class, we decided that gene 3 needed to be re-annotated to ensure that the best decisions were made. Also, the original call of gene 21 was deleted because there was a longer ORF in another reading frame that was in the reverse direction. This annotation had not yet been completed. Most of the genes found in Lore matched the auto annotation made by DNAMaster and did not have a known function.

After the genome had been checked, we began to learn about the next steps to publishing the genome. This requires some work in DNAMaster to rename all the genes and upload the final annotations. There are some double checks that can be done in DNAMaster, such as archive, to once again check the decisions made while annotating. Then there are more details that need to be assessed when submitting the genome. The report will have to include a Cover Letter with the name, authors, and interesting things found in the genome.

February 22

Lab 6: Lore_Draft Annotation

Jess Hastings

Date of Work: 2/14/2017

Rationale: The purpose of the lab was to check annotations of the Lore_Draft gene that was annotated during lab five. Additionally, genes that were annotated by other groups but overlapped into our region were annotated and checked against the other group’s annotations for differences. After all genes in the region were annotated, each group put together a PowerPoint presentation with annotation details and proof to present to the class.

Tools Used: DNAMaster, DNAMaster Quick Start Guide, NCBI Website, Phagesdb Website, Phamerator, Starterator, HHPred

Methods: 

  1. Run an Auto-Annotation of Lore_Draft in DNAMaster
  2. For each gene in the region, check and, if needed, change the following parts of the annotation:
    • SSC: call start of the gene
    • CP: coding potential found on GeneMark (on phagesdb)
    • SD: score, if it is the best score and if not why, z-score
    • SCS: does it agree with Glimmer and GeneMark
    • Gap: calculated from start/stop location
    • NCBI Blast: protein product on NCBI
    • PhagesDB BLAST: protein product on PhagesDB
    • HHPred: best hit with e value or no good hit
    • LO: longest open reading frame
    • ST: does it agree with Starterator
    • F: function
    • FS: evidence that supports function decision
  3. Fill each part of the annotation into the Lore_Draft Google document
  4. Change status on the home page to completed
  5. Create a PowerPoint to share annotations and evidence for decisions made with class

 

Conclusions:

Our region of the Lore_Draft genome was bp 1401-2800 which contained genes 4-6. Kayla and I decided to each annotate one gene and gene 6 was annotated by group 3. I annotated gene 5 in our region.

Figure 1: Region of Annotation for Group 1

 

The following annotation shows my results for gene 5:

Start: 2294bp Stop: 2497bp FWD
GAP: 5bp Gap
SD Final Value: SD Score: -3.122 (Best score) Z-Value: 2.909
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: hypothetical protein [Arthrobacter phage Jessica] q1:s1 E-Value: 3e-34
CDD: No good hit
PhagesDB BLAST: Function Unknown [TymAbreu 5] E-Value: 1e-27
HHPred: No good hit
LO: Yes
ST: Agrees with Starterator
F: NKF
FS: NCBI, PhagesDB, HHPred
Notes:

 

Gene 5 was a pretty simple gene to annotate. No start changes needed to be made because the call made in the auto annotation agreed with Starterator and the GeneMark coding potential, and produced a q1:s1 match on the NCBI Blast. However, there was no function found for gene 5. The Blast did not return any conserved domain hits, and HHPred did not have any good hits either. So, gene 5 was labeled as “No Function Known”.

Evidence for these calls can be seen in the following figures:

Figure 2: Coding Potential of Gene 5 from GeneMark

Figure 3: NCBI Blast results for Gene 5

Figure 4: PhagesDB Blast results for Gene 5

Figure 5: HHPred results for Gene 5; no good hit

I also annotated Gene 6 to double check Group 3. I agreed with all of their annotations, which came from the auto-annotation done by DNAMaster. There were no changes made to gene 6. The following annotation is the result:

Start: 2549bp Stop: 3709bp FWD
GAP: 51bp Gap
SD Final Value: SD Score: -4.629 (5th best score) All the others don’t cover all coding potential. Some are under 200 base pairs long. Z-Value: 2.16
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: portal protein [Arthrobacter phage Decurro] q1:s1 E-Value: 0.0
CDD: pfam04860 E-Value: 1.89e-19
PhagesDB BLAST: Portal Protein [Yank_6] E-Value: 0.0
HHPred: 3kdr_A, HK97, family phage portal protein E-Value: 7.1e-42
LO: Yes
ST: Agrees with Starterator
F: portal protein
FS: PhagesDB, NCBI, HHPred
Notes:

Gene 6 is a phage portal protein. This is decided upon because of the Blast results as well as the CDD information gathered. Additionally, HHPred confirmed the results that it was a portal protein.

Evidence for Gene 6:

Figure 6: Gene 6 Conserved Domain Results from NCBI

Figure 7: Gene 6 Coding Potential from GeneMark

Figure 8: Gene 6 NCBI Results

Figure 9: Gene 6 PhagesDB Blast results

After completing the annotation of these genes and compiling the correct evidence, we then put together a PowerPoint to present to the class.

 

February 13

Lab 5: Starterator

Jess Hastings

Date of Work: 2/7/2017

Rationale: The purpose of this lab was to learn the Starterator software and to continue refining skills in gene annotation. A full annotation was performed by each student.

Tools Used: DNAMaster, DNAMaster Quick Start Guide, NCBI Website, Phagesdb Website, Phamerator, Starterator

Methods: 

Starterator:

  1. Enter virtual box, open Starterator
  2. Search that link and gene number that you are annotating
  3. Review the map results and determine if you are calling the gene in the correct location
  4. Scroll to the bottom to confirm that the base pair number matches the call location
  5. If the gene is not called in the correct spot, change the call location and re-annotate

Conclusions: During this Lab I annotated Lore_Draft gene 5. The new system of inputting the annotation information into a google document was very organized and efficient. Starterator is a great program that helps to decide the big question of where to call the gene, which can change everything. This lab cleared up many questions that I had about annotating! I feel much more comfortable with the process now.

 

February 3

Lab 4: Phamerator

Jess Hastings

Date of Work: 1/31/2017

Rationale: The purpose of this lab was to learn the Phamerator software. This software contains all phages that have been recorded, and creates maps that can compare the similarities of genes.

Tools Used: DNAMaster, DNAMaster Quick Start Guide, NCBI Website, Phagesdb Website, Phamerator

Methods: 

Annotation Continued:

  1. To perform a Blast:
    1. Select the gene you would like to BLAST
    2. Select Product and copy the chain of amino acids listed
    3. Go to the NCBI website or phagesdb.org (wherever you would like to perform the blast)
    4. Select Protein Blast
    5. Paste the amino acid chain into the query box
    6. Select BLAST
    7. The Blast may take a little while to run, but after it is complete you can analyze the matches and note the error factor to determine accuracy
  2. After performing a blast on NCBI and PhagesDB, list the closest result, score, and E-value under the BLAST note
  3. Compare NCBI and PhagesDB results to determine the function of the gene that is being annotated, and record the function under the F: note
  4. Under the FS: note, write the reasoning for why that function was selected, referencing the PhagesDB and NCBI results

Phamerator:

  1. Enter virtual box, open Phamerator and search the phage Link
  2. Select the phage Link along with 2 other similar phages
  3. Click “Map” in the upper left hand corner to compare the genomes of these phages

Conclusions: This lab was helpful because Phamerator provides a strong visual to be able to compare phage genomes. Also, a full manual annotation was done on Link Gene 22, using all the technology that we have learned at this point. My annotation results were:

Link Gene 12

Annotation:

Original Glimmer call @bp 6820 has strength 13.98
SSC: 6853-7170 Forward CP: Covers All SD: -5.552 but not best score, -1.414 is more practical SCS: Glimmer called at 6820, GeneMark called at 6853 Gap: 6bp gap
BLAST:
On PhagesDB: Toulouse_11, function unknown, 116, score: 232, error: 8e-62
On NCBI: Hypothetical Protein, score: 237, error: 3e-79
LO: 318bp ST:n/a F:Hypothetical Protein FS: BLAST evalue on NCBI, top 15 all same results, 100% match
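The length reported under LO can be checked directly from the start and stop coordinates. The sketch below assumes inclusive coordinates and works for reverse genes too, where the start is larger than the stop; the function names are hypothetical.

```python
def gene_length(start, stop):
    """Length in base pairs of a gene with inclusive start and stop coordinates.
    Works for forward genes (start < stop) and reverse genes (start > stop)."""
    return abs(stop - start) + 1


def is_whole_codons(start, stop):
    """A complete ORF should have a length that divides evenly into codons."""
    return gene_length(start, stop) % 3 == 0


# Link gene called at 6853-7170 (forward), as recorded in the annotation above.
print(gene_length(6853, 7170), is_whole_codons(6853, 7170))
```

The call at 6853-7170 gives 318 bp, which matches the LO entry and divides evenly into 106 codons.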