A R N D C Q E G H I L K M F P S T W Y V - A 15 R 19 11 N 17 17 15 D 17 18 15 13 C 19 21 21 22 5 Q 17 16 16 15 22 13 E 17 18 16 14 22 15 13 G 16 20 17 16 20 18 17 12 H 18 15 15 16 20 14 16 19 11 I 18 19 19 19 19 19 19 20 19 12 L 19 20 20 21 23 19 20 21 19 15 11 K 18 14 16 17 22 16 17 19 17 19 20 12 M 18 17 19 20 22 18 19 20 19 15 13 17 11 F 21 21 21 23 21 22 22 22 19 16 15 22 17 8 P 16 17 18 18 20 17 18 18 17 19 20 18 19 22 11 S 16 17 16 17 17 18 17 16 18 18 20 17 19 20 16 15 T 16 18 17 17 19 18 17 17 18 17 19 17 18 20 17 16 14 W 23 15 21 24 25 22 24 24 20 22 19 20 21 17 23 19 22 0 Y 20 21 19 21 17 21 21 22 17 18 18 21 19 10 22 20 20 17 7 V 17 19 19 19 19 19 19 18 19 13 15 19 15 18 18 18 17 23 19 13 - 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 20 0
(We call this the MSA cost matrix, since it is used by the MSA software for optimal multiple sequence alignment.)
What are the costs for aligning the following column ?
/ V \ | V | | L | \ L /
84
86
87
90
/ G \ | S | | - | \ G /
97
101
104
116
2. Please identify the correct statement by checking the box next to the statement:
2b. If 3 sequences are identical, the optimal MSA-cost (see Exercise 0) alignment must cost zero.
2c. If 3 sequences are _not_ identical, the optimal MSA-cost alignment cannot be zero.
2d. Calculating the optimal alignment of 3 long sequences can be done by cutting all 3 sequences in half, aligning the subsequences, and concatenating the 2 resulting alignments.
2e. Given an optimal alignment of sequences 1, 2 and 3, we can immediately obtain the optimal alignments of sequences 1+2, 2+3 and 1+3 by deleting the missing sequence, and gap-only columns.
3. Are the alignments presented in section 1.2, namely
VSN-S and GN-S -SNA- GNA- ---AS -N-S
optimal in terms of the unit cost?
Show answer to Question 3
4. How is the last node of the alignment path corresponding to the alignment from the Introduction denoted ? (cf. Ex. ___11 of the coursetext)
VTISCTGSSSNIGAG-NHVKWYQQLPG VTISCTGTSSNIGS--ITVNWYQQLPG LRLSCSSSGFIFSS--YAMYWVRQAPG LSLTCTVSGTSFDD--YYSTWVRQPPG PEVTCVVVDVSHEDPQVKFNWYVDG-- ATLVCLISDFYPGA--VTVAWKADS-- AALGCLVKDYFPEP--VTVSWNSG--- VSLTCLVKGFYPSD--IAVEWESNG--
Show answer to Question 4
5. Take a look at the hyperlattice in Fig. __3 of the coursetext.
5b. If the path of the shadow line on the back face were (0,0) -> (0,1) -> (1,2) -> (2,2) -> (3,2) -> (4,3) instead, which pairwise projection would be denoted by it?
5c. Can there be a multiple alignment through the hyperlattice, if the path of the shadow on the back face were as in (b), but the other shadows stay the same ?
Show answer to Question 5
6. Let the hyperlattice be spanned by sequences VSNS, SNA, and AS, as in Fig. __1. Please identify the correct statement by checking the corresponding box:
6b. If an alignment path visits only nodes at the surface of the lattice, the alignment has only terminal gaps (i.e. gaps before the first or after the last sequence symbol).
6c. If an alignment path visits no nodes on the surface of the lattice, except for the start ("source") and "sink" nodes, the alignment does not have terminal gaps.
6d. Every path starting at the bottom-left of the lattice corresponds to exactly one alignment.
6e. Some alignments can correspond to more than one path through the lattice.
7. Assume that in Fig. __4 the current visit is in node (2,2) of the 2-dimensional lattice corresponding to the pairwise distance matrix. (The lattice nodes can be identified with the matrix entries). Which are the next nodes that can be calculated according to the dynamic programming scheme?
Show answer to Question 7
8. Again, take a look at Fig. __1. Using unit costs, please answer the following questions.
b. What is the cost to travel along the last edge of the alignment graph?
c. Is there a cheaper edge starting from the "start" node?
d. Is there a cheaper way to reach node (2,1,0)?
9. Apart from the entries in the hyperlattice, which other data structures need to be stored for calculating the multiple alignment using dynamic programming? What is their size?
Show answer to Question 9
10. Please identify the correct statement by checking the box next to the statement.
10b. The space requirement grows proportional to the number of sequences.
10c. The running time of the standard dynamic programming algorithm for multiple sequences grows proportional to 2^l, where l is the length of the longest sequence.
10d. The running time grows proportional to the number of sequences, multiplied by the length of the longest sequence.
10e. If you align one sequence more (e.g. 5 sequences instead of 4), the space requirement and the running time grow by the same amount: If space usage doubles, running time doubles as well.
Return to Chapter 3 Self-Assessment Index