The database table you will access (protein) contains data extracted from the Protein Data Bank (PDB). Each row has the following fields:
Field | Type |
---|---|
pdb_code | varchar(4) |
name | varchar(255) |
date | date |
description | text |
source | varchar(255) |
authors | text |
resolution | float |
exp_method | enum('XR','NM','EM','SD','ND','FL','FD','TM') |
n_chains | smallint(6) |
n_residues | smallint(6) |
Your practical tasks are to write Python scripts, using the PEP 249 database API, that do the following:
Count the number of entries in the table. The SQL required to count the rows in the table is:
select count(*) from protein
The results should look like this:
There were 3218 rows
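The counting task can be sketched as below. This is a minimal illustration, not the course's reference solution: it assumes the standard-library sqlite3 module as a stand-in PEP 249 driver and builds a tiny in-memory protein table from a few codes mentioned in this practical, so the count it prints is 3 rather than the real figure. The helper names make_demo_connection and run_count are hypothetical. With the real database, only the connect call should need to change.

```python
import sqlite3  # stand-in PEP 249 module (assumption); swap in the course's driver


def make_demo_connection():
    """Build a tiny in-memory protein table (hypothetical stand-in for the real one)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute("create table protein (pdb_code varchar(4), source varchar(255))")
    cur.executemany(
        "insert into protein values (?, ?)",
        [
            ("1DM5", "Hydra vulgaris"),
            ("1E7W", "LEISHMANIA MAJOR"),
            ("1E92", "LEISHMANIA MAJOR"),
        ],
    )
    conn.commit()
    cur.close()
    return conn


def run_count(conn, sql):
    """Execute a count(*) query and return the single integer it yields."""
    cur = conn.cursor()
    cur.execute(sql)
    (n_rows,) = cur.fetchone()  # count(*) always returns exactly one row
    cur.close()
    return n_rows


conn = make_demo_connection()
n_rows = run_count(conn, "select count(*) from protein")
print("There were %d rows" % n_rows)
conn.close()
```

The key point is that fetchone() returns a one-element tuple for a count(*) query, so the value has to be unpacked before it is printed.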
Extract and print the source, resolution and description for PDB code 1DM5. The SQL required to extract the data is:
select source, resolution, description from protein where pdb_code = '1DM5'
The results should look like this:
PDB Code: 1DM5
Species: Hydra vulgaris
Resolution: 1.93
Description: ANNEXIN XII E105K HOMOHEXAMER CRYSTAL STRUCTURE
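A possible shape for the single-entry lookup is sketched here, again assuming sqlite3 as a stand-in PEP 249 driver and a one-row in-memory table holding the 1DM5 values quoted above. The helper names make_demo_connection and fetch_entry are hypothetical. The PDB code is passed as a query parameter rather than pasted into the SQL string. Note that the placeholder syntax depends on the driver's paramstyle: sqlite3 uses ?, while several MySQL drivers use %s.

```python
import sqlite3  # stand-in PEP 249 module (assumption); swap in the course's driver


def make_demo_connection():
    """Build a one-row in-memory protein table (hypothetical stand-in)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "create table protein (pdb_code varchar(4), source varchar(255), "
        "resolution float, description text)"
    )
    cur.execute(
        "insert into protein values (?, ?, ?, ?)",
        ("1DM5", "Hydra vulgaris", 1.93,
         "ANNEXIN XII E105K HOMOHEXAMER CRYSTAL STRUCTURE"),
    )
    conn.commit()
    cur.close()
    return conn


def fetch_entry(conn, pdb_code):
    """Return (source, resolution, description) for one PDB code, or None."""
    cur = conn.cursor()
    cur.execute(
        "select source, resolution, description from protein where pdb_code = ?",
        (pdb_code,),  # parameters are passed as a sequence, even for one value
    )
    row = cur.fetchone()  # None when no row matches
    cur.close()
    return row


conn = make_demo_connection()
code = "1DM5"
row = fetch_entry(conn, code)
if row is None:
    print("No entry found for %s" % code)
else:
    source, resolution, description = row
    print("PDB Code: %s" % code)
    print("Species: %s" % source)
    print("Resolution: %.2f" % resolution)
    print("Description: %s" % description)
conn.close()
```

Checking for None before unpacking avoids a TypeError when the requested code is not in the table.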
Extract the PDB code, resolution and name for all proteins from the species Leishmania major and print the results on multiple lines. The SQL required to extract the data is:
select pdb_code, resolution, name from protein where source = 'LEISHMANIA MAJOR'
The results should look like this:
1E7W 1.75 DIHYDROFOLATE REDUCTASE
1E92 2.20 PTERIDINE REDUCTASE
1EZR 2.50 HYDROLASE
1LML 1.86 LEISHMANOLYSIN
1OKG 2.10 TRANSFERASE
1R75 1.86 STRUCTURAL GENOMICS, UNKNOWN FUNCTION
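The multi-row query could be handled along the following lines. As before, this is only a sketch: sqlite3 stands in for the real PEP 249 driver, the in-memory table is loaded with the six rows listed above plus one invented row from another species, and the helper names (make_demo_connection, fetch_species, format_row) are hypothetical. One caveat: sqlite3 compares strings case-sensitively by default, whereas a MySQL table with a default collation typically does not, so the demo stores the species name in the same upper-case form that the query uses.

```python
import sqlite3  # stand-in PEP 249 module (assumption); swap in the course's driver

DEMO_ROWS = [
    ("1E7W", 1.75, "DIHYDROFOLATE REDUCTASE", "LEISHMANIA MAJOR"),
    ("1E92", 2.20, "PTERIDINE REDUCTASE", "LEISHMANIA MAJOR"),
    ("1EZR", 2.50, "HYDROLASE", "LEISHMANIA MAJOR"),
    ("1LML", 1.86, "LEISHMANOLYSIN", "LEISHMANIA MAJOR"),
    ("1OKG", 2.10, "TRANSFERASE", "LEISHMANIA MAJOR"),
    ("1R75", 1.86, "STRUCTURAL GENOMICS, UNKNOWN FUNCTION", "LEISHMANIA MAJOR"),
    # Invented row, only here to show that the filter excludes other species.
    ("0XXX", 9.99, "MADE-UP ENTRY", "SOME OTHER SPECIES"),
]


def make_demo_connection():
    """Build a small in-memory protein table (hypothetical stand-in)."""
    conn = sqlite3.connect(":memory:")
    cur = conn.cursor()
    cur.execute(
        "create table protein (pdb_code varchar(4), resolution float, "
        "name varchar(255), source varchar(255))"
    )
    cur.executemany("insert into protein values (?, ?, ?, ?)", DEMO_ROWS)
    conn.commit()
    cur.close()
    return conn


def fetch_species(conn, source):
    """Return all (pdb_code, resolution, name) rows for one source species."""
    cur = conn.cursor()
    cur.execute(
        "select pdb_code, resolution, name from protein where source = ?",
        (source,),
    )
    rows = cur.fetchall()  # a list of tuples, empty if nothing matches
    cur.close()
    return rows


def format_row(row):
    """Format one row with the resolution shown to two decimal places."""
    pdb_code, resolution, name = row
    return "%s %.2f %s" % (pdb_code, resolution, name)


conn = make_demo_connection()
for row in fetch_species(conn, "LEISHMANIA MAJOR"):
    print(format_row(row))  # one entry per line
conn.close()
```

The query has no order by clause, so the order of the rows is whatever the database chooses to return. Add one if a stable order matters.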