Authors: | Tom Dunham |
---|---|
Date: | 2009-04-05 |
Data is held in tables
Data is manipulated using SQL (Structured Query Language)
Queries the database
Add, change and remove data items
You have mysql on your computer
To connect:
> mysql -u root -p (enter password)
The command mysql starts a process that connects to the MySQL server. - this is called a client process
To return the user you connected as:
mysql> select user(); +----------------+ | user() | +----------------+ | root@localhost | +----------------+ 1 row in set (0.00 sec)
MySQLdb is the standard module to connect Python to MySQL
Try it:
>>> import MySQLdb >>> db = MySQLdb.connect(user="root", passwd="coi") >>> curs = db.cursor() >>> curs.execute("select user()") >>> curs.fetchall() (('root@localhost',),)
The safest, most database neutral way of processing rows is this:
cursor.execute('SELECT ...') while 1: row = cursor.fetchone() if row == None: break # process row...
Since iterators were added in Python 2.2, most database libraries (including MySQLdb) have supported iterable cursors:
cursor.execute('SELECT ...') for row in cursor: # process row...
SQL uses the where clause to filter the rows that are returned:
SELECT field1, field2 FROM table WHERE field1 = 'value'
Often you want to filter for something the user has asked for
To do this, you make value a parameter:
cursor.execute( """SELECT field1, field2 FROM table WHERE field1 = %s""", ('value',))
Always prefer parameterization to creating queries using string formatting. Parameters handle quoting values transparently, avoiding the need to examine a value before passing it to the DB, which saves the programmer work (and avoiding bugs for people named O'Malley). Many databases cache parsed queries to make subsequent calls more efficient - this is ineffective if every query differs by the values of its parameters.
Different databases use different characters as placeholders, you can find out which character by examining the paramstyle attribute of the module.
qmark Question mark style, e.g. ...WHERE name=?
numeric Numeric, positional style, e.g. ...WHERE name=:1
named Named style, e.g. ...WHERE name=:name
format ANSI C printf format codes, e.g. ...WHERE name=%s
pyformat Python extended format codes, e.g. ...WHERE name=%(name)s
This scarcely scratches the surface of transactions.
The ACID properties on Wikipedia http://en.wikipedia.org/wiki/ACID
InnoDB http://www.innodb.com/ http://dev.mysql.com/doc/refman/5.0/en/innodb.html
Bernstein & Newcomer Principles of Transaction Processing
The database can fail at any time - including between you reading the last result and closing the connections
MySQL does not throw an exception here, but it's not forbidden by the spec.
Exceptions are not immediately obvious from a backtrace:
>>> curs.execute("select * from foo") Traceback (most recent call last): File "<stdin>", line 1, in <module> ... _mysql_exceptions.ProgrammingError: (1146, "Table 'meps.foo' doesn't exist")
This is better:
from MySQLdb import DatabaseError, ProgrammingError
So is this (avoids importing them all):
try: ... except db.ProgrammingError, err: ...
db = connect(...) try: curs = db.cursor() try: try: curs.execute(...) db.commit() except: db.rollback() raise finally: curs.close() finally: db.close()
That's too much typing
Python 2.6 includes context managers
Now you can write:
with db: curs.execute(...)
Remember this is an extension.
See handout
Details for connecting to the database will be given in class.
You can use this function to parse rebase_allenz.txt into 8-value tuples:
def rebase_rec(itr): rec = [] for line in itr: m = re.match(r"<([1-8])>(.*)", line) if m: field, val = int(m.group(1)), m.group(2) rec.append(val) if field == 8: yield tuple(rec) rec = []
Write a program that loads all the records in rebase_allenz.txt into the database.
Display the number of records for every source using:
SELECT source, COUNT(*) FROM enzyme GROUP BY source ORDER BY 2 DESC;
Display the names of the enzymes with a particular prototype:
SELECT enzyme_name FROM enzyme WHERE prototype = 'Bsp1407I';
Some more database links
Context Managers http://docs.python.org/reference/datamodel.html#with-statement-context-managers
Paul DeBois on PyDB http://www.kitebird.com/articles/pydbapi.html
AMK on PyDB http://www.amk.ca/python/writing/DB-API.html
The Python DB SIG http://mail.python.org/mailman/listinfo/db-sig
JWZ MySQL at Sabre http://jeremy.zawodny.com/blog/archives/001049.html