April 18

Project Week 4 (4/18/17)

Meredith Kim

April 18, 2017

Rationale of today’s work: Finish our official poster and prepare for next week’s presentation.

Tools used and/or Methods: PhagesDB glossary, Phamerator, DNA Master, NCBI

Results:

Title for our Project: Life is PHAM-tastic!!

BarretLemon Gene #7:

Gap Sequence:

TTGTCTTCTCTTCTCTCTGGCGGGTTGTTCGAACTGCTTGATCAAGTATGGACAGGCTGGCTCGTAAAGCGCAAGCCGAAACACGCGCACGTCCCACAACTGAGCCGCCCGGCACAGCACAGTTGAATCA

C-G Content: 

Promoter: 62.1% —> Promoter has MORE than genome…

Whole Gap: 55.4%

Genome: 60.9%

Boss Lady Gene #7:

Gap Sequence:

TTGTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGCAGCCGCGCGGCACGTCAGAGTTGACTCA

C-G Content:

Promoter: 40.7% —> Promoter has least

Whole Gap: 52.2%

Genome: 61.0%

(1) Beans Gene #6:

Gap Sequence:

ATTCTCGTGTTAATGTGAGCTTGCCTGTTTGGGACACGGCTTGACTGGGGAGTCTGGGAAGGGGCGGCGTCATTGCGACGCCGCCCTTTCTTGTGCCAGCTGATCGAACACGCCCGGCGTCCCACAATCCCGCAGCTGAGCGTTTCACAGTGGAGCTA

C-G Content:

Promoter: 62.1%

Whole Gap: 59.5% —> Gap has least, but promoter still has less than genome

Genome: 63.6%

(1) Brent Gene #6:

Gap Sequence:

ATTCTCGTGTTAATGTGAGCTTGCCTGTTTGGGAAGACGGCTTGACTGGGGAGTCTGGGAAAGGGCGGCGTCATTGCGACGCCGCCCTTTCTTGTGCCAGCTGATCGAACACGCCCGGCGTCCCACACTCGCGCAGCCGGGCGCTTCACAGTGGAGTCA

C-G Content:

Promoter: 51.9% —> Promoter has least

Whole Gap: 61%

Genome: 63.4%

JKerns Gene #7:

Gap Sequence:

TTTTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGGGGCCGCGCGGCACGTCAGAGTTGACTCA

C-G Content:

Promoter: 40.6% —> Promoter has least

Whole Gap: 52.2%

Genome: 61%

Jordan Gene #7:

Gap Sequence:

TTTTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGGGGCCGCGCGGCACGTCAGAGTTGACTCA

C-G Content:

Promoter: 40.7% —> Promoter has least

Whole Gap: 52.2%

Genome: 61.1%

Conclusion: Overall, the promoter sequences had lower C-G content than the overall genome (except for BarretLemon). It would be interesting to graph this data and see whether any patterns exist. Averaging the C-G content across the phages would at least show an overall pattern for the gap sequence, genome, and promoter.
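
As a first step toward that comparison, the averages can be computed directly from the percentages recorded above. The snippet below is a minimal sketch in Python; the dictionary layout and variable names are illustrative only, and the values are copied as recorded from the six entries in this post.

```python
from statistics import mean

# C-G percentages copied from the Results above, in the order:
# BarretLemon, Boss Lady, Beans, Brent, JKerns, Jordan.
gc_recorded = {
    "promoter": [62.1, 40.7, 62.1, 51.9, 40.6, 40.7],
    "whole gap": [55.4, 52.2, 59.5, 61.0, 52.2, 52.2],
    "genome": [60.9, 61.0, 63.6, 63.4, 61.0, 61.1],
}

# Average each region across the six phages, rounded to one decimal place.
averages = {region: round(mean(values), 1) for region, values in gc_recorded.items()}

for region, avg in averages.items():
    print(f"{region}: {avg}%")
```

On these six phages the promoter average comes out near 49.7%, against roughly 55.4% for the whole gap and 61.8% for the genome. That covers only one group member's share of the pham, so it is a preliminary figure.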

Next Steps: Next week, we will be presenting our posters in front of the class, so between now and then, my group and I will practice giving the presentation to make sure it isn’t too long or too short. To compile all of our data, we will make a Google spreadsheet and look for trends. For the poster, we need to make visuals, including graphs, tables, and images from phamerator.com. We also need to complete our abstract, procedure, conclusion, etc.

April 11

Project Research Week 3 (4/11/17)

Meredith Kim

April 11, 2017

Rationale of today’s work: Delegate jobs to each group member and begin researching promoter sequences for the phages containing genes in pham 7442. This pham has 16 members, all of which are in the AO cluster. We will give each member about 5 genes to thoroughly investigate through DNA Master’s promoter prediction and NCBI BLAST. This time, however, we will focus on running the promoter prediction on the gap before the gene. Previously, we ran the promoter prediction on the gene itself, which was not very useful since promoters are typically found upstream of the gene.

Tools used and/or Methods: PhagesDB glossary, Phamerator, DNA Master, NCBI

Results:

Title for our Project: Life is PHAM-tastic!!

Phage Distribution:

Meredith:

  • BarretLemon Gene #7
  • Boss Lady Gene #7
  • Beans Gene #6
  • Brent Gene #6
  • JKerns Gene #7
  • Jordan Gene #7

Taylor:

  • Fanzy Gene #6
  • Jawnski Gene #6
  • Nahla Gene #6
  • Piccoletto Gene #6
  • LeeroyJ Gene #7
  • Timinator Gene #7

Katie:

  • Martha Gene #7
  • Shade Gene #7
  • Sonny Gene #7
  • StevieBAY Gene #7
  • TaeYoung Gene #7
  • Zartrosa Gene #7

For Each Gene:

  1. NCBI Blast Previous Gene
  2. Note the gap between the two genes and whether the previous gene is forward/reverse
  3. Insert a gene into DNA Master in the gap between the previous gene and the gene in the pham we are investigating
  4. Run DNA Master’s promoter prediction on the gap sequence
  5. Copy/Paste the actual sequence of the gap containing the promoter
  6. NCBI Blast the Specified gene in the pham

BarretLemon Gene #7:

Previous gene (#6) is a reverse gene… Gap: 128bp

Blast NCBI for Gene #6: hypothetical protein BARRETLEMON_6, e-value: 1e-48, q:1 s:1, no CDD

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  238  0.678  5927   CTGAGC  0.592  17     5950   TTGAAT  0.732

Gap Sequence:

TTGTCTTCTCTTCTCTCTGGCGGGTTGTTCGAACTGCTTGATCAAGTATGGACAGGCTGGCTCGTAAAGCGCAAGCCGAAACACGCGCACGTCCCACAACTGAGCCGCCCGGCACAGCACAGTTGAATCA

NCBI Blast for Gene #7: scaffolding protein [Arthrobacter phage BarretLemon], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily
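
As a cross-check on the prediction above, the two hexamers reported by DNA Master can be located in the gap sequence and the spacer length confirmed. This is a minimal Python sketch; `find_promoter` is a hypothetical helper written for this notebook, not part of DNA Master, and it relies on plain string search only.

```python
def find_promoter(gap, minus35, minus10, spacer):
    """Return the 0-based offset of the -35 hexamer such that the -10
    hexamer follows after exactly `spacer` bases, or None if absent."""
    start = gap.find(minus35)
    while start != -1:
        pos10 = start + len(minus35) + spacer
        if gap.startswith(minus10, pos10):
            return start
        # Try the next occurrence of the -35 hexamer, if any.
        start = gap.find(minus35, start + 1)
    return None

# BarretLemon gap sequence and predicted hexamers from the entry above.
barretlemon_gap = "TTGTCTTCTCTTCTCTCTGGCGGGTTGTTCGAACTGCTTGATCAAGTATGGACAGGCTGGCTCGTAAAGCGCAAGCCGAAACACGCGCACGTCCCACAACTGAGCCGCCCGGCACAGCACAGTTGAATCA"
offset = find_promoter(barretlemon_gap, "CTGAGC", "TTGAAT", 17)

if offset is None:
    print("hexamer pair not found with that spacing")
else:
    # Slice from the start of -35 through the end of -10.
    print(barretlemon_gap[offset:offset + 6 + 17 + 6])
```

Passing the spacer explicitly matters for gaps where a hexamer occurs more than once, such as the repeated TTCTCT near the start of the Boss Lady gap.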

Boss Lady Gene #7:

Previous gene (#6) is a reverse gene… Gap: 134bp

Blast NCBI for Gene #6: hypothetical protein MARTHA_6 [Arthrobacter phage Martha], e-value: 1e-47, q:1 s:1

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  436  0.657  5817   TTCTCT  0.633  15     5838   TAGAAC  0.72

Gap Sequence:

TTGTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGCAGCCGCGCGGCACGTCAGAGTTGACTCA

NCBI Blast for Gene #7: scaffolding protein [Arthrobacter phage Sonny], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

Beans Gene #6:

Previous gene (#5) is a forward gene… Gap: 156bp

NCBI Blast for gene #5: hypothetical protein SEA_BRENT_5 [Arthrobacter phage Brent], e-value: 1e-53, q:1 s:1, no CDD

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  166  0.682  5563   CTGATC  0.614  17     5586   CACAAT  0.719

Gap Sequence:

ATTCTCGTGTTAATGTGAGCTTGCCTGTTTGGGACACGGCTTGACTGGGGAGTCTGGGAAGGGGCGGCGTCATTGCGACGCCGCCCTTTCTTGTGCCAGCTGATCGAACACGCCCGGCGTCCCACAATCCCGCAGCTGAGCGTTTCACAGTGGAGCTA

NCBI Blast for Gene #6: scaffolding protein [Arthrobacter phage Jawnski], e-value:0.0, q:1 s:1, Mu-Like_Pro superfamily

Brent Gene #6:

Previous gene (#5) is a forward gene… Gap: 157bp

NCBI Blast for Gene #5: hypothetical protein SEA_BRENT_5 [Arthrobacter phage Brent], e-value: 7e-60, q:1 s:1, no CDD

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  182  0.677  5478   TTGCCT  0.77   15     5499   TTGACT  0.625

Gap Sequence:

ATTCTCGTGTTAATGTGAGCTTGCCTGTTTGGGAAGACGGCTTGACTGGGGAGTCTGGGAAAGGGCGGCGTCATTGCGACGCCGCCCTTTCTTGTGCCAGCTGATCGAACACGCCCGGCGTCCCACACTCGCGCAGCCGGGCGCTTCACAGTGGAGTCA

NCBI Blast for gene #6: scaffolding protein [Arthrobacter phage Brent], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

JKerns Gene #7:

Previous gene (#6) is a reverse gene… Gap: 134bp

Blast NCBI for Gene #6: hypothetical protein SONNY_6 [Arthrobacter phage Sonny], e-value: 3e-50, q:1 s:1, no CDD

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  447  0.657  5814   TTCTCT  0.633  15     5835   TAGAAC  0.72

Gap Sequence:

TTTTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGGGGCCGCGCGGCACGTCAGAGTTGACTCA

NCBI Blast for gene #7: scaffolding protein [Arthrobacter phage Sonny], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

Jordan Gene #7:

Previous gene (#6) is a reverse gene… Gap: 134bp

Blast NCBI for Gene #6: hypothetical protein SONNY_6 [Arthrobacter phage Sonny], e-value: 3e-50, q:1 s:1, no CDD

               -35    -35     -35           -10    -10     -10
   #    Score  Pos’n  TTGACA  Score  Space  Pos’n  TATAAT  Score
F  453  0.657  5814   TTCTCT  0.633  15     5835   TAGAAC  0.72

Gap Sequence:

TTTTCTTCTCTTCTCTCGTTGTTCGAACTGTTAGAACAAGTATGGACAGATAGATCGAGATCGCGCAACCACCCAGCAGGCTTGTTCGATCGTGTCATCCCACAACGGGGCCGCGCGGCACGTCAGAGTTGACTCA

NCBI Blast for gene #7: scaffolding protein [Arthrobacter phage Sonny], e-value: 0.0, q:1 s:1,  Mu-Like_Pro superfamily

Conclusion: All the genes in the pham are scaffolding proteins and are preceded by hypothetical proteins. However, there was not a very clear correlation in the promoter sequences that I annotated. The other two members in my team have annotated the other 12 genes, so I need to look at those, too, when making generalizations and looking for a correlation. Our goal is to have all 18 phages in the pham blasted and their promoter sequences predicted.

Next Steps: Next week, we want to compare our results from the blasts and promoter predictions. If time allows, we will begin running various programs to determine the percent CG in the promoter sequences.
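
A percent-CG calculation of this kind is short enough to script directly. The sketch below is plain Python with no external libraries; `gc_percent` is a hypothetical helper, and the example region is the BarretLemon stretch from the start of the predicted -35 hexamer through the end of the -10 hexamer, taken from the gap sequence recorded above. Treating that span as "the promoter" is an assumption made for illustration.

```python
def gc_percent(seq):
    """Percentage of G and C bases in a DNA sequence (case-insensitive)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("empty sequence")
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)

# BarretLemon: -35 hexamer + 17 bp spacer + -10 hexamer (29 bp in total).
promoter = "CTGAGC" + "CGCCCGGCACAGCACAG" + "TTGAAT"
print(f"{gc_percent(promoter):.1f}%")  # prints 62.1%
```

The same function applies unchanged to a whole gap sequence or a whole genome string.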

April 11

Project Research Off-Week (Dia)

Meredith Kim

April 4, 2017

Rationale of today’s work: Delegate jobs to each group member and begin researching promoter sequences for the phages containing genes in pham 7442. This pham has 16 members, all of which are in the AO cluster.

Tools used and/or Methods: PhagesDB glossary, Phamerator, DNA Master, NCBI, ExPASy

Results:

BarretLemon Gene #7:

Previous gene (#6) is a reverse gene… Gap: 128bp

Blast NCBI for Gene #6: hypothetical protein BARRETLEMON_6, e-value: 1e-48, q:1 s:1, no CDD

NCBI Blast for Gene #7: scaffolding protein [Arthrobacter phage BarretLemon], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

Boss Lady Gene #7:

Previous gene (#6) is a reverse gene… Gap: 134bp

Blast NCBI for Gene #6: hypothetical protein MARTHA_6 [Arthrobacter phage Martha], e-value: 1e-47, q:1 s:1

 

NCBI Blast for Gene #7: scaffolding protein [Arthrobacter phage Sonny], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

Beans Gene #6:

Previous gene (#5) is a forward gene… Gap: 156bp

NCBI Blast for gene #5: hypothetical protein SEA_BRENT_5 [Arthrobacter phage Brent], e-value: 1e-53, q:1 s:1, no CDD

NCBI Blast for Gene #6: scaffolding protein [Arthrobacter phage Jawnski], e-value:0.0, q:1 s:1, Mu-Like_Pro superfamily

Brent Gene #6:

Previous gene (#5) is a forward gene… Gap: 157bp

NCBI Blast for Gene #5: hypothetical protein SEA_BRENT_5 [Arthrobacter phage Brent], e-value: 7e-60, q:1 s:1, no CDD

NCBI Blast for gene #6: scaffolding protein [Arthrobacter phage Brent], e-value: 0.0, q:1 s:1, Mu-Like_Pro superfamily

Conclusion: All the genes in the pham are scaffolding proteins. However, there was not a very clear correlation in promoter sequences. We may need to try promoter prediction on the entire genome instead of just on the gap sequence. That way, we can get the most accurate prediction.

Next Steps: We should look at promoter sequences that are not within the actual genes themselves. After all, promoter sequences are typically found before the gene. We need to insert a new “gene” in the gap in DNA Master in order to run Promoter Prediction and further investigate.

March 28

Project Research Day 2 (3/28/17)

Meredith Kim

March 28, 2017

Rationale of today’s work: Continue consolidating ideas and forming a question/hypothesis for the research. We also need to continue thinking about what kind of data we are collecting and find a research article that explores something we are also planning on researching. Our goal for today is to find a relevant article and delegate specific jobs to each group member so that we can continue our research effectively.

Tools used and/or Methods: PhagesDB glossary, Phamerator, DNA Master, NCBI, ExPASy.

Results: We ended up scrapping our previous idea of investigating CDDs. Instead, we are now looking for a correlation between phams and their corresponding promoter sequences. We also drew an outline for how we want to set up our poster. We decided the colors would be teal and grey and that our sections would include: abstract, background, methods, results, conclusions, further research, and acknowledgements. We aim to include enticing images and minimize the volume of words on the poster.

The research we found, entitled “Genomic comparison of 93 Bacillus phages reveals 12 clusters, 14 singletons and remarkable diversity,” was done by Julianne H. Grose, Garrett L. Jensen, Sandra H. Burnett, and Donald P. Breakwell. The study focuses on clusters, which is one of the aspects that our group is researching.

Question: What is the correlation between a pham (all containing the same gene) and its promoter sequence?

  • Pham 7492 (16 members- all 16 members are in cluster AO)
  • This means there are two phages in cluster AO that are NOT in the pham 7492.
  • Look at the gene map in DNA Master and locate the promoter in an intergenic region of the genome.
  • Are there other genes that are called by the same promoter?
  • On DNA Master… DNA –> promoter prediction
  • Want -35 and -10 scores to be as close to 1.0 as possible. (they are ranked on a scale of 0-1.0)
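
One simple way to build intuition for those scores is to count how many positions of a predicted hexamer match the standard consensus (TTGACA for the -35 box, TATAAT for the -10 box). The Python sketch below does only that; it is not DNA Master's scoring formula, which this notebook does not describe, and `consensus_matches` is a hypothetical helper.

```python
MINUS35_CONSENSUS = "TTGACA"
MINUS10_CONSENSUS = "TATAAT"

def consensus_matches(hexamer, consensus):
    """Count positions where a predicted hexamer equals the consensus."""
    if len(hexamer) != len(consensus):
        raise ValueError("sequences must be the same length")
    return sum(a == b for a, b in zip(hexamer.upper(), consensus.upper()))

# Hexamers taken from a DNA Master promoter prediction for BarretLemon.
print(consensus_matches("CTGAGC", MINUS35_CONSENSUS))  # 3 of 6 positions match
print(consensus_matches("TTGAAT", MINUS10_CONSENSUS))  # 4 of 6 positions match
```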

Conclusion: Because we reformed our question, we did not do as much actual researching today, but we did, in fact, get a solid hold of what we are officially doing in our project. In order to better understand the tools we will be using over the next four weeks, I need to read up on how to use the “promoter prediction” function in DNA Master.

Next Steps: Now that we have a solid plan of action for our new question, our next step will be to delegate jobs to each member in the group and begin locating the promoter sequences for each gene in the pham. Once we do this, we can start comparing the similarities and differences of the sequences to find whether or not a correlation exists.

March 28

Project Research Day 1 (3/21/17)

Meredith Kim

March 21, 2017

Rationale of today’s work: Begin forming a plan of action for the group research project. My group includes Katie McMillan and Taylor Kowalski (group 8). Our goal is to brainstorm ideas and create a research question/topic that will require at least 4 weeks of research (make the question in-depth).

Tools used and/or Methods: PhagesDB glossary, Phamerator, DNA Master, NCBI, ExPASy (provides annotation sequences). We also did a lot of brainstorming and looked at posters from previous classes in order to get a better understanding of the project.

Results: We came up with the idea of researching conserved domains (CDD’s), promoters, and any correlation between the two.

  • Look up phages with shared CDD
  • Look at promoter sequence (how similar are they?)
  • Do all phages with a given promoter sequence have the specified CDD?
  • If you put two different phages that have the same CDD in the same medium with the same repressor, will the repressor work on both?
  • Focus our research on one cluster.
  • (If time allows) Pull a random phage and see how its results compare with the results from the phages within the same cluster.
  • Potential application of research: Transcription factor would increase phage production in phages with same CDD (Increased production of lytic phages that kill harmful bacteria)
  • Use Phamerator to look for CDD’s

Conclusion: Our group now has a solid idea of what our project will look like. I can now visualize a tangible way of completing the research and making the poster. The technological aspects of our research need to become more familiar to us, so we will work on learning how to use the various computer programs, such as ExPASy. I am very interested to see if there is, in fact, a correlation between the CDDs and their promoters, and I am confident that our research methods will enable us to arrive at a solid conclusion. We concluded that we would research the cluster that includes Timinator (cluster AO: 18 members).

Next Steps:

  1. Consolidate brainstormed ideas
  2. Begin looking up phages in the same cluster
  3. Research those phages using Phamerator to look up CDD’s
  4. Record promoter sequences for the genes that contain the given CDD
  5. Begin working on a poster outline
  6. Continue investigating online tools that can help with research

March 14

Timinator Day 2 (3/14/17)

Meredith Kim

March 14, 2017

Rationale of today’s work: Continue annotating Timinator. Last week, I mistakenly annotated gene 51 as opposed to gene 53, so today I aim to finish annotating gene 53. We also chose our groups for our research poster projects. My group includes Katie McMillin and Taylor Kowalski.

Tools used and/or Methods: NCBI, PhagesDB, HHPred, Phamerator, Starterator, Glimmer, GeneMark

Results:

GENE 53: Start: 38174bp, Stop: 38467bp, FWD, 4bp Overlap,  SD Score: -4.417 (2nd best score), Longest ORF, Z-Value: 2.236, CP: The gene is not covered, SCS: Agrees with Glimmer, Agrees with GeneMark

GeneMark: leaves some potential uncovered, but covering all the potential would result in too large of an overlap

NCBI Blast: hypothetical protein BARRETLEMON_52 [Arthrobacter phage BarretLemon], q:1 s:1, E-Value: 4e-47, CDD: No good hit

PhagesDB Blast: Timinator_Draft_53, function unknown, E-Value: 3e-51

HHPred: No good hit

Starterator: Agrees with Starterator

Final Annotation for Gene 53:

Start: 38174bp
Stop: 38467bp
FWD
GAP: 4bp Overlap
SD Final Value:
SD Score: -4.417 (2nd best score)
Longest ORF
Z-Value: 2.236
CP: The gene is not covered. Covering all potential would result in too large of an overlap
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: hypothetical protein BARRETLEMON_52 [Arthrobacter phage BarretLemon], q:1 s:1, E-Value: 4e-47
CDD: No good hit
PhagesDB BLAST: Timinator_Draft_53, function unknown, E-Value: 3e-51
HHPred: No good hit
LO: Yes
ST: Agrees with Starterator
F: NKF
FS: NCBI, PhagesDB
Notes:

Conclusion: I did not need to change much in annotating Gene 53. The start that I chose agreed with both Glimmer and GeneMark. Based on the information from NCBI and PhagesDB, I concluded that there was no known function for this specific gene.

Next Steps: Next lab, we need to begin consolidating the genes from the entire class and putting together the cover sheet in order to finalize our research. For genes that multiple people annotated, students should share their results to make sure that both individuals agree upon a valid annotation.

February 24

Lore Annotations Day 3 (2/21/17)

Meredith Kim

February 21, 2017

Rationale of today’s work: Review final Lore annotations and begin compiling genes into final genome for publication.

Tools used and/or Methods: NCBI, PhagesDB, HHPred, Phamerator, Starterator, Glimmer, GeneMark

Results:

Gene # and function:

1   NKF
2   NKF
3   NKF
4   terminase, large subunit
5   NKF
6   portal protein
7   capsid maturation protease
8   NKF
9   NKF
10  major tail protein
11  minor tail protein
12  NKF
13  tape measure protein
14  NKF
15  NKF
16  peptidase
17  hydrolase
18  NKF
19  NKF
20  HTH DNA binding protein, MerR-like
21  helix-turn-helix DNA binding domain
22  helix-turn-helix DNA binding domain
23  helix-turn-helix DNA binding domain
24  NKF
25  NKF
26  HNH endonuclease

Changes that were made:

  • Gene 3 changed start site to 591 due to a better score
  • Gene 21 was an added gene, so we were not able to auto-annotate. Neither GeneMark nor Glimmer called the gene.
  • Gene 24: changed start site. Glimmer’s call does not cover all the typical coding potential, and the final score for GeneMark’s call was higher
  • Gene 26: Neither Glimmer nor GeneMark called the gene on the +2 reading frame. Glimmer called a gene on the +3 reading frame, but the gene did not include a stop codon.

Conclusion and Next Steps: So far, we have only completed the initial steps in the gene publication process. We still need to go back and complete gene 21 and review a few others that were questionable. Then we need to work on a cover page and several other factors that are necessary before we can send them off to be published.

February 23

Lore Annotations Day 2 (2/14/17)

Meredith Kim

February 14, 2017

Rationale of today’s work: Continue working on Lore. Today we aim to complete the annotations for genes 25 and 26.

Tools used and/or Methods: NCBI, PhagesDB, HHPred, Phamerator, Starterator, Glimmer, GeneMark

Results:

GENE 25: Start: 14980bp Stop: 15276bp FWD GAP: 4bp Overlap

According to GeneMark, all typical and atypical coding potential is covered.

NCBI blast: does not provide known function of gene

 

Phages DB blast: does not provide known function

HHPred: does not provide known function, top hit had too high of an e value

Final Annotation for Gene 25:

Start: 14980bp
Stop: 15276bp
FWD
GAP: 4bp Overlap
SD Final Value:
SD Score: -3.554 (Best score)
Z-Value: 2.798
CP: The gene is covered
SCS: Agrees with Glimmer, Agrees with GeneMark
NCBI BLAST: hypothetical protein STRATUS_25 [Arthrobacter phage Stratus], q:1, s:1, E-Value: 4e-46
CDD: No good hit
PhagesDB BLAST: StewieGriff_Draft_25, function unknown, q:1 s:1, E-Value: 2e-51
HHPred: No good hit
LO: No. All coding potential was covered with start position 14980, and LO had lower final score
ST: Agrees with Starterator
F: NKF
FS: NCBI, Phages DB, HHPred
Notes:

GENE 26: Start: 15311bp Stop: 15529bp FWD GAP: 34bp Gap

Original Call by Glimmer: (does not contain stop codon)

Edited call: (contains both start and stop codon)

GeneMark did not cover gene 26 at all. However, there was still some typical and atypical coding potential, which would support the hypothesis that a gene does exist in the region.

NCBI blast: HNHc superfam, supports HNH endonuclease function

Phages DB blast: supports HNH endonuclease function

HHPred: supports HNH endonuclease function

 

Final Annotation for Gene 26:

Start: 15311bp
Stop: 15529bp
FWD
GAP: 34bp Gap
SD Final Value:
SD Score: -5.145 (2nd best score). Best score was -5.062 but only had length of 138 as opposed to 219
Z-Value: 1.489
CP: The gene is not covered. Left some typical and very little atypical uncovered, but they could only be covered using ORF with the large overlap
SCS: Disagrees with Glimmer, Agrees with GeneMark. Neither Glimmer nor GeneMark called the gene on the +2 reading frame
NCBI BLAST: Yank_26, HNH endonuclease, q:1 s:1, E-Value: 7e-56
CDD: HNHc HNH nucleases, E-Value: 1.96e-08
PhagesDB BLAST: Yank_26, HNH endonuclease, q:1 s:1, E-Value: 6e-47
HHPred: HNH endonuclease, E-Value: 5.7E-14
LO: No. LO would result in 14bp overlap and had SD score of -7.650
ST: Starterator calls a deleted gene
F: HNH endonuclease
FS: NCBI, Phages DB, HHPred
Notes: Was originally in the third reading frame, but had to be moved to the second reading frame; the gene called in the third reading frame was not valid
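
The gap, overlap, and length figures in these Lore annotations follow from simple coordinate arithmetic, assuming 1-based inclusive coordinates as the recorded numbers suggest. The Python sketch below uses hypothetical helper names; a negative result from `gap_between` indicates an overlap of that many bases.

```python
def gene_length(start, stop):
    """Length in bp of a gene given 1-based inclusive coordinates."""
    return stop - start + 1

def gap_between(prev_stop, next_start):
    """Bases between two genes; a negative value means they overlap."""
    return next_start - prev_stop - 1

print(gap_between(15276, 15311))  # gene 25 stop, gene 26 start -> 34
print(gap_between(14983, 14980))  # gene 24 stop, gene 25 start -> -4
print(gene_length(15311, 15529))  # gene 26 -> 219
```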

Conclusions and Next Steps: Gene 25 was fairly straightforward, and I did not need to make any changes that deviated from the call by Glimmer. However, for gene 26, I chose a completely different reading frame than the Glimmer call. This resulted in an actual function with good hits by several databases. Another group took gene 22 for us, so we do not need to annotate gene 22. For next lab, I would want to choose one or two other genes from other groups to go back and review annotations. However, after today, most of the Lore genome should be annotated, so we are almost finished with the initial annotation process.

February 23

Lore Annotations Day 1 (2/7/17)

Meredith Kim

February 7, 2017

Rationale of today’s work: Starting initial work on Lore. Today we aim to complete the annotations for genes 23 and 24.

Tools used and/or Methods: NCBI, PhagesDB, HHPred, Phamerator, Starterator, Glimmer, GeneMark

Results:

GENE 23:  Start: 14113bp Stop: 14781bp FWD GAP: 4bp Overlap

According to GeneMark, all typical and atypical potential is covered.

NCBI blast: Identified HTH superfam, supports helix-turn-helix DNA binding function, q:1 s:1

Phages DB blast: does not give any known function for the gene

HHPred: supports helix-turn-helix DNA binding function

Final Annotation for Lore Gene 23:

Start: 14113bp
Stop: 14781bp
FWD
GAP: 4bp Overlap
SD Final Value:
SD Score: -3.246 (Best score)
Z-Value: 2.847
CP: The gene is covered
SCS: Agrees with Glimmer, Disagrees with GeneMark. GeneMark called a reverse gene, but there was more potential in the forward gene that was not called
NCBI BLAST: Helix-turn-helix DNA binding domain protein [Arthrobacter phage Decurro], q:1 s:1, E-Value: 2e-156
CDD: pfam13730, Helix-turn-helix domain, E-Value: 1.09e-06
PhagesDB BLAST: StewieGriff_Draft_23, function unknown, q:1 s:1, E-Value: 1e-134
HHPred: Transcriptional Regulator, E-Value: 5.9E-08
LO: Yes
ST: Agrees with Starterator
F: helix-turn-helix DNA binding domain
FS: NCBI, HHPred, Phamerator
Notes:

 

GENE 24: Start: 14768bp Stop: 14983bp FWD GAP: 14bp

According to GeneMark, all typical and atypical coding potential is covered.

NCBI blast: StewieGriff_Draft_24, no known function, q:1 s:1

Phages DB blast: does not provide known function, no CDD


HHPred: does not provide known function, best hit had too high of an e value

Final Annotation for Lore Gene 24:

Start: 14768bp
Stop: 14983bp
FWD
GAP: 14bp Overlap
SD Final Value:
SD Score: -2.171 (Best score)
Z-Value: 3.342
CP: The gene is covered
SCS: Disagrees with Glimmer, Agrees with GeneMark. Glimmer’s call does not cover all the typical coding potential, and the final score for GeneMark’s call was higher
NCBI BLAST: hypothetical protein SEA_DECURRO_24 [Arthrobacter phage Decurro], q:1 s:12, E-Value: 1e-28
CDD: No good hit
PhagesDB BLAST: StewieGriff_Draft_24, function unknown, q:1, s:1, E-Value: 1e-30
HHPred: No good hit
LO: Yes
ST: Agrees with Starterator
F: NKF
FS: NCBI, Phages DB, HHPred
Notes:

 

 

Conclusions and Next Steps: Not very many changes were made to this gene, and there were no known functions. Next step would be to complete genes 25-26 and then go back and annotate gene 22.