Using the answers that you recorded during the practical, answer the following questions.
1. How many promoter binding sites were predicted by 'Promoter 2.0' in total? 1 2 4 7 12 13
2. If we assume that a score of ≥0.6 is a good (if marginal) prediction, how many of the predicted sites from 'Promoter 2.0' are good predictions? 1 2 6 7 12 13
3. How many promoter binding sites were predicted by 'LDL Promoter'? 1 2 4 11 12 13
4. If we assume that a score of ≥0.99 is a confident prediction, how many of the predicted sites from 'LDL Promoter' are confidently predicted? 1 2 3 11 12 13
5. How many promoter binding sites were identified in total by TSSG? 1 2 4 11 12 13
6. How many of the promoter binding sites identified by TSSG had a predicted TATA box? 1 2 4 11 12 13
7. Comparing 'Promoter 2.0' and TSSG, assuming that a prediction is most likely to be correct if both predictors suggest a promoter in the same region, that a TATA box is essential and that confidence is also important, which base range is most likely to contain the true promoter? 1880 - 1960; 10350 - 10660; 700 - 929; 9450 - 9800
8. How many potential poly-adenylation sites are predicted? 1 2 3
9. What is the E-value of the best hit in BLAST? 0.0 22583 22726 100% 99%
10. Having found the sequence in the nr/nt collection, does the TATA box correspond with the prediction? Yes No
11. From the gene entry, is there any evidence of alternative splicing? Yes No
12. What is the EC number for the encoded protein? AK1 2.7.4.3 2.7.4.6 1.3.2.7 194
13. How many amino acids are in the encoded protein? 18 39 194 9606
14. Which ligands are seen to bind to the protein at Binding sites? ATP only; AMP only; AMP and ATP; ATP, AMP, NMP and LID
15. Does the secondary structure of the protein consist mostly of alpha helices or beta strands? alpha; beta; neither (they are approximately the same)
16. Which, if any, of these PDB codes represent structures of the protein? 1YQV 1Z83 8FAB
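
The selection rule described in question 7 (agreement between both predictors, a TATA box required, higher confidence preferred) can be sketched in code. This is only an illustrative sketch: the `Prediction` class, the `best_consensus` function and all coordinates and scores below are hypothetical and are not taken from the practical's output. It also assumes that each prediction can be treated as a base range, and that candidates are ranked by the 'Promoter 2.0' score alone, since the two tools score on different scales.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Prediction:
    """One predicted promoter region (hypothetical representation)."""
    start: int
    end: int
    score: float
    has_tata: bool = False


def overlaps(a: Prediction, b: Prediction) -> bool:
    """True if the two base ranges share at least one position."""
    return a.start <= b.end and b.start <= a.end


def best_consensus(
    promoter2: List[Prediction],
    tssg: List[Prediction],
    min_score: float = 0.6,
) -> Optional[Tuple[int, int]]:
    """Return the merged base range of the best-supported promoter.

    A candidate needs a 'Promoter 2.0' score >= min_score and an
    overlapping TSSG prediction that has a TATA box. Candidates are
    ranked by the 'Promoter 2.0' score (an assumption of this sketch).
    """
    candidates = []
    for p in promoter2:
        if p.score < min_score:
            continue  # below the "good (if marginal)" threshold
        for t in tssg:
            if t.has_tata and overlaps(p, t):
                merged = (min(p.start, t.start), max(p.end, t.end))
                candidates.append((p.score, merged[0], merged[1]))
    if not candidates:
        return None
    _, start, end = max(candidates)
    return (start, end)


# Made-up example values, chosen only to exercise each rule.
promoter2_hits = [
    Prediction(100, 350, 0.55),    # overlaps a TATA site but scores too low
    Prediction(2000, 2250, 1.10),  # confident and supported by TSSG + TATA
    Prediction(5000, 5250, 0.70),  # overlapping TSSG site has no TATA box
]
tssg_hits = [
    Prediction(120, 170, 6.0, has_tata=True),
    Prediction(2100, 2150, 8.0, has_tata=True),
    Prediction(5100, 5150, 5.0, has_tata=False),
]

consensus = best_consensus(promoter2_hits, tssg_hits)
print(consensus)
```

To apply the same reasoning to your own results, replace the made-up lists with the positions and scores you recorded from each tool during the practical.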