Authors: | Tom Dunham |
---|---|
Date: | 2009-03-31 |
Tuples are immutable sequences: they behave like lists, but cannot be changed once created.
>>> tu = (2, 3, 5, 7)
>>> tu
(2, 3, 5, 7)
>>> tu.append(11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'
You can use immutable types as dictionary keys
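As a minimal sketch of this point, the dictionary below uses (x, y) tuples as keys. The name `grid` and its contents are made up for illustration. A list cannot be used the same way, because it is mutable and therefore not hashable.

```python
# Hypothetical example: map (x, y) coordinates to labels.
grid = {(0, 0): "origin", (10, 10): "top left", (12, 12): "bottom right"}

# Look up a value with a tuple key.
label = grid[(12, 12)]
print(label)

# A list is mutable, so it cannot be hashed and is rejected as a key.
try:
    grid[[0, 0]] = "oops"
    list_key_allowed = True
except TypeError:
    list_key_allowed = False
print(list_key_allowed)
```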
Unpacking
>>> def botright(x, y, w, h):
...     return x+w, y+h
...
>>> botright(10, 10, 2, 2)
(12, 12)
>>> x, y = botright(10, 10, 2, 2)
>>> x
12
>>> y
12
Tuple unpacking is useful in for loops
for codon, aa in [('CTT', 'L'), ('TAG', '_'), ('ACA', 'T'), ('ACG', 'T'), ('ATC', 'I')]:
    ...
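One way to fill in the body of such a loop is sketched below: each (codon, amino acid) tuple is unpacked into two names and stored in a dictionary. The names `pairs` and `table` are illustrative, not part of the original slides.

```python
# The codon / amino-acid pairs used in the loop above.
pairs = [('CTT', 'L'), ('TAG', '_'), ('ACA', 'T'), ('ACG', 'T'), ('ATC', 'I')]

table = {}
for codon, aa in pairs:   # each tuple is unpacked into two names
    table[codon] = aa

print(table['ACA'])
```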
Kind | Input | Result |
---|---|---|
Tuple | ("A", "T") | ('A', 'T') |
Tuple | "C", "G" | ('C', 'G') |
Expression | "C" | 'C' |
Tuple | ("C", ) | ('C',) |
Tuple | tuple('C') | ('C',) |
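The rows above suggest that it is the comma, not the parentheses, that makes a tuple. The short sketch below checks this, and also shows a caveat that is easy to miss: `tuple()` iterates over its argument, so `tuple('C')` only gives a one-element tuple because the string has one character. The variable names are illustrative.

```python
single = ("C",)      # the trailing comma makes a one-element tuple
not_a_tuple = ("C")  # parentheses alone only group: this is the string 'C'
bare = "C", "G"      # the parentheses are optional

print(type(single).__name__)
print(type(not_a_tuple).__name__)

# tuple() iterates over its argument, so a longer string is split
# into one element per character.
split = tuple("CG")
print(split)
```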
>>> foo = []
>>> bar = (1, foo, "three")
>>> bar
(1, [], 'three')
>>> foo.append(222)
>>> bar
(1, [222], 'three')
Remember that containers hold references to objects (identity), not copies: bar contains the very same list object that foo names
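The sketch below repeats the foo/bar session as a script and makes the identity explicit with the `is` operator. It also shows that the tuple itself stays immutable: its slots cannot be reassigned, even though the list it refers to can change. The helper names (`same_object`, `rebinding_allowed`) are added for illustration.

```python
foo = []
bar = (1, foo, "three")

# The tuple stores a reference to the very same list object.
same_object = bar[1] is foo

foo.append(222)              # mutate the list through its original name
seen_through_tuple = bar[1]  # the change is visible through the tuple

# The tuple's own slots still cannot be reassigned.
try:
    bar[1] = [333]
    rebinding_allowed = True
except TypeError:
    rebinding_allowed = False

print(same_object, seen_through_tuple, rebinding_allowed)
```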
We've written a lot of loops like this:
acc = []
for i in sequence:
    newi = do_something_with(i)
    acc.append(newi)
A list comprehension is more concise
acc = [do_something_with(i) for i in sequence]
>>> li = [3, 5, 7, 11]
>>> li
[3, 5, 7, 11]
>>> li2 = [2 * i for i in li]
>>> li2
[6, 10, 14, 22]
>>> li
[3, 5, 7, 11]
Slices
>>> [2 * i for i in li[1:3]]
[10, 14]
Strings
>>> [i.upper() for i in ("foo", "bar", "baz")]
['FOO', 'BAR', 'BAZ']
We've also written a lot of loops following this pattern
acc = []
for i in sequence:
    if test(i):
        newi = do_something_with(i)
        acc.append(newi)
This can be written
acc = [do_something_with(i) for i in sequence if test(i)]
>>> li
[3, 5, 7, 11]
>>> [2 * i for i in li if i > len(li)]
[10, 14, 22]
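To check that the two forms really are equivalent, the sketch below writes the same filter both ways, as an explicit loop and as a comprehension, and compares the results. The names `acc` and `comp` are illustrative.

```python
li = [3, 5, 7, 11]

# Explicit loop with a test: keep only items larger than the list's length.
acc = []
for i in li:
    if i > len(li):
        acc.append(2 * i)

# The same filter and transformation as a list comprehension.
comp = [2 * i for i in li if i > len(li)]

print(acc == comp)
```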
See handout
Rewrite your solution to Q3 from lecture 4 (mapping codon to amino acid) using list comprehensions.
Write a function ispalindromic that tests if a sequence is palindromic (returning True if it is a palindrome and False if not).
Write a function that takes a DNA sequence and returns a list of tuples containing enzyme name and the location of the restriction site for every palindromic restriction site. Use the following list of enzymes:
EcoRI GAATTC
EcoRII CCWGG
BamHI GGATCC
HindIII AAGCTT
TaqI TCGA
NotI GCGGCCGC
HinfI GANTC
Sau3A GATC
PvuII CAGCTG
SmaI CCCGGG
HaeIII GGCC
AluI AGCT
EcoRV GATATC
KpnI GGTACC
PstI CTGCAG
SacI GAGCTC
SalI GTCGAC
ScaI AGTACT
SphI GCATGC
StuI AGGCCT
XbaI TCTAGA
You can use triple-quotes (""") for multi-line strings, and the strip and splitlines string methods to turn this list into a data structure you can use.
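A minimal sketch of that hint, assuming the list is pasted with one enzyme per line and using only the first three entries to keep it short. The names `raw` and `enzymes` are illustrative; this is one possible approach, not the expected solution.

```python
# Only the first three enzymes from the list, one per line.
raw = """
EcoRI GAATTC
EcoRII CCWGG
BamHI GGATCC
"""

# strip() removes the leading and trailing blank lines, splitlines() gives
# one string per line, and split() separates the name from the site.
enzymes = [tuple(line.split()) for line in raw.strip().splitlines()]

print(enzymes)
```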
You may find the pprint function from the pprint module useful to inspect the data structures you build.
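For example, the sketch below pretty-prints a small dictionary. The data in `sites` is invented for illustration, and the narrow `width` is chosen only to force the output onto several lines. `pformat` returns the same text as a string instead of printing it.

```python
from pprint import pprint, pformat

# Hypothetical data: enzyme names mapped to made-up site locations.
sites = {"EcoRI": [12, 340, 1021], "BamHI": [], "TaqI": [7, 88]}

# Print one entry per line by limiting the line width.
pprint(sites, width=30)

# pformat gives the formatted text back as a string.
text = pformat(sites, width=30)
```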
Use this function to load a test sequence (we'll study reading from files in the next section)
def loadseq(fn="coursefiles\\sample.dna"):
    # Read the whole file, remove spaces, and convert to upper case.
    with open(fn) as f:
        return f.read().replace(" ", "").upper()
Turn the list of tuples returned by your function into an index: a dictionary where the keys are enzyme names and the values are lists of locations.
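One possible way to build such an index is sketched below with `dict.setdefault`. The `hits` list is invented sample data standing in for the output of your restriction-site function, so the locations are not real results.

```python
# Hypothetical (enzyme name, location) tuples, standing in for real output.
hits = [("TaqI", 7), ("EcoRI", 12), ("TaqI", 88), ("EcoRI", 340)]

index = {}
for name, loc in hits:
    # setdefault returns the existing list for this name, or stores
    # and returns a new empty list the first time the name is seen.
    index.setdefault(name, []).append(loc)

print(index)
```

Each enzyme name appears once as a key, and its locations keep the order in which they were found.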