Containers: Tuples and Comprehensions

Authors: Tom Dunham
Date: 2009-03-31

Container Types

Tuples

Tuples are immutable sequences: like lists, but they cannot be modified after creation.

>>> tu = (2, 3, 5, 7)
>>> tu
(2, 3, 5, 7)
>>> tu.append(11)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'tuple' object has no attribute 'append'

Tuples - why?

Tuple Unpacking

>>> x, y = botright(10, 10, 2, 2)
>>> x
12
>>> y
12
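
The session above relies on a function that returns a tuple. The definition of botright is not shown in these notes, so the version below is only an assumed sketch: it takes the top-left corner and the size of a rectangle and returns the bottom-right corner as a single tuple, which the caller can keep whole or unpack into two names.

```python
# Hypothetical definition: the original botright is not shown in these notes.
def botright(x, y, width, height):
    """Return the bottom-right corner of a rectangle as an (x, y) tuple."""
    return x + width, y + height   # the comma builds the tuple

corner = botright(10, 10, 2, 2)    # keep the result as a single tuple
x, y = botright(10, 10, 2, 2)      # or unpack it into two names
print(corner)
print(x, y)
```

Returning one tuple and unpacking it at the call site is a common way to return several values from a function.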

Tuple Unpacking

Tuple unpacking is useful in for loops

for codon, aa in [('CTT', 'L'), ('TAG', '_'), ('ACA', 'T'), ('ACG', 'T'), ('ATC', 'I')]:
   ...
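
The loop body is left open above. As one possible illustration, the sketch below uses the same pairs to build a dictionary; the names pairs and table are chosen here for the example and are not from the original.

```python
# Codon / amino-acid pairs taken from the loop above
pairs = [('CTT', 'L'), ('TAG', '_'), ('ACA', 'T'), ('ACG', 'T'), ('ATC', 'I')]

table = {}                   # illustrative name, not from the original
for codon, aa in pairs:      # each 2-tuple is unpacked into two names
    table[codon] = aa

print(table['ACA'])
```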

Syntax

Tuple
>>> ("A", "T")
('A', 'T')
Tuple
>>> "C", "G"
('C', 'G')
Expression
>>> "C"
'C'
Tuple
>>> ("C", )
('C',)
Tuple
>>> tuple('C')
('C',)
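
Two details of the syntax above are easy to miss, so here is a small sketch of them: the comma, not the parentheses, creates a tuple, and tuple() builds its result by iterating over its argument. The variable names are illustrative only.

```python
# Parentheses alone only group an expression; the comma makes the tuple.
not_a_tuple = ("C")
one_tuple = ("C",)

print(type(not_a_tuple).__name__)
print(type(one_tuple).__name__)

# tuple() iterates over its argument, so a string is split into characters.
print(tuple("CG"))
```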

Lists in tuples

>>> foo = []
>>> bar = (1, foo, "three")
>>> bar
(1, [], 'three')
>>> foo.append(222)
>>> bar
(1, [222], 'three')

Remember that containers store references to objects (identity), not copies, so a change made through foo is visible through bar.
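
A short sketch of what this means in practice, reusing foo and bar from the session above: the tuple and the name foo refer to the same list object, and the tuple's immutability only stops its slots from being rebound. The name rebind_error is introduced here for illustration.

```python
foo = []
bar = (1, foo, "three")

# The tuple stores a reference to the same list object, not a copy.
print(bar[1] is foo)

foo.append(222)          # mutating the list is visible through the tuple
print(bar)

# The tuple itself is still immutable: its slots cannot be rebound.
rebind_error = None
try:
    bar[1] = []
except TypeError as err:
    rebind_error = err
    print("TypeError:", err)
```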

Containers: Review

Comprehensions

We've written a lot of loops like this:

acc = []
for i in sequence:
    newi = do_something_with(i)
    acc.append(newi)

A list comprehension is more concise:

acc = [do_something_with(i) for i in sequence]

Example

>>> li = [3, 5, 7, 11]
>>> li
[3, 5, 7, 11]
>>> li2 = [2 * i for i in li]
>>> li2
[6, 10, 14, 22]
>>> li
[3, 5, 7, 11]

Example

Slices

>>> [2 * i for i in li[1:3]]
[10, 14]

Strings

>>> [i.upper() for i in ("foo", "bar", "baz")]
['FOO', 'BAR', 'BAZ']
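
Comprehensions also combine with tuple unpacking. The sketch below is an illustration only, using a shortened list of codon and amino-acid pairs; the names pairs, codons and labels are not from the original.

```python
pairs = [('CTT', 'L'), ('TAG', '_'), ('ACA', 'T')]

# Unpack each 2-tuple in the "for" part of the comprehension.
codons = [codon for codon, aa in pairs]
labels = [codon + "=" + aa for codon, aa in pairs]

print(codons)
print(labels)
```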

Filtering

We've also written a lot of loops following this pattern:

acc = []
for i in sequence:
    if test(i):
        newi = do_something_with(i)
        acc.append(newi)

This can be written as a list comprehension with a condition:

acc = [do_something_with(i) for i in sequence if test(i)]

Example

>>> li
[3, 5, 7, 11]
>>> [2 * i for i in li if i > len(li)]
[10, 14, 22]
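
To make the correspondence concrete, here is the same filter written both ways, as a small sketch. Since len(li) is 4, the value 3 fails the test and is dropped; the names acc and comp are illustrative.

```python
li = [3, 5, 7, 11]

# Explicit loop with a test
acc = []
for i in li:
    if i > len(li):
        acc.append(2 * i)

# Equivalent comprehension with a condition
comp = [2 * i for i in li if i > len(li)]

print(acc == comp)
```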

Exercise

See handout

  1. Rewrite your solution to Q3 from lecture 4 (mapping codon to amino acid) using list comprehensions.

  2. Write a function ispalindromic that tests if a sequence is palindromic (returning True if it is a palindrome and False if not).

  3. Write a function that takes a DNA sequence and returns a list of tuples containing enzyme name and the location of the restriction site for every palindromic restriction site. Use the following list of enzymes:

    EcoRI  GAATTC
    EcoRII  CCWGG
    BamHI  GGATCC
    HindIII  AAGCTT
    TaqI  TCGA
    NotI  GCGGCCGC
    HinfI  GANTC
    Sau3A  GATC
    PvuII  CAGCTG
    SmaI  CCCGGG
    HaeIII  GGCC
    AluI  AGCT
    EcoRV  GATATC
    KpnI  GGTACC
    PstI  CTGCAG
    SacI  GAGCTC
    SalI  GTCGAC
    ScaI  AGTACT
    SphI  GCATGC
    StuI  AGGCCT
    XbaI  TCTAGA
    
    • You can use triple-quotes (""") for multi-line strings, and the strip and splitlines string methods to turn this list into a data structure you can use.

    • You may find the pprint function from the pprint module useful to inspect the data structures you build.

    • Use this function to load a test sequence (we'll study reading from files in the next section):

      def loadseq(fn="coursefiles\\sample.dna"):
          # Read the whole file, remove spaces, and upper-case the sequence
          with open(fn) as f:
              return f.read().replace(" ", "").upper()
      
  4. Turn this list of tuples into an index: a dictionary where the keys are enzyme names and the values are lists of locations.
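
The hints in question 3 about triple-quoted strings, strip, splitlines and pprint can be sketched as follows. This is only one possible approach, shown on a shortened copy of the enzyme list; the names text and enzymes, and the use of split to separate the two columns, are assumptions made for the example.

```python
from pprint import pprint

# A shortened copy of the enzyme list, for illustration only.
text = """
    EcoRI  GAATTC
    BamHI  GGATCC
    TaqI  TCGA
"""

enzymes = []
for line in text.strip().splitlines():
    name, site = line.split()     # split on whitespace, then unpack
    enzymes.append((name, site))

pprint(enzymes)
```

The result is a list of (name, site) tuples, which can then be looped over with tuple unpacking.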