Class PDBList

This class provides quick access to the structure lists on the PDB server or its mirrors. The structure lists contain four-letter PDB codes, indicating that structures are new, have been modified or are obsolete. The lists are released on a weekly basis.

It also provides a function to retrieve PDB files from the server. To use it properly, prepare a directory /pdb or the like, where PDB files are stored.

If you want to use this module from behind a proxy, add the proxy variable to your environment, e.g. on Unix: export HTTP_PROXY='http://realproxy.charite.de:888' (this can also be added to ~/.bashrc).

Instance Methods
 
__init__(self, server='ftp://ftp.rcsb.org', pdb='/home/mandrake/rpm/BUILD/biopython-1.47', obsolete_pdb=None)
Initialize the class with the default server or a custom one.
 
download_entire_pdb(self, listfile=None)
Retrieves all PDB entries not present in the local PDB copy.
 
download_obsolete_entries(self, listfile=None)
Retrieves all obsolete PDB entries not present in the local obsolete PDB copy.
 
get_all_entries(self)
Retrieves a big file containing all the PDB entries and some annotation for them.
 
get_all_obsolete(self)
Returns a list of all obsolete entries ever in the PDB.
 
get_recent_changes(self)
Returns three lists of the newest weekly files (added, modified, obsolete).
 
get_seqres_file(self, savefile='pdb_seqres.txt')
Retrieves a (big) file containing all the sequences of PDB entries and writes it to a file.
 
get_status_list(self, url)
Retrieves a list of PDB codes in the weekly PDB status file from the given URL.
string
retrieve_pdb_file(self, pdb_code, obsolete=0, compression='.Z', uncompress='gunzip', pdir=None)
Retrieves a PDB structure file from the PDB server and stores it in a local file tree.
 
update_pdb(self)
This is probably the 'most wanted' function in this module.
Class Variables
  PDB_REF = '\n The Protein Data Bank: a computer-based archi...
  alternative_download_url = 'http://www.rcsb.org/pdb/files/'
Method Details

download_entire_pdb(self, listfile=None)

Retrieves all PDB entries not present in the local PDB copy. Writes a list file containing all PDB codes (optional, if listfile is given).

download_obsolete_entries(self, listfile=None)

Retrieves all obsolete PDB entries not present in the local obsolete PDB copy. Writes a list file containing all PDB codes (optional, if listfile is given).

get_all_entries(self)

Retrieves a big file containing all the PDB entries and some annotation for them. Returns a list of PDB codes in the index file.

get_all_obsolete(self)

Returns a list of all obsolete PDB codes that have ever been in the PDB.

Gets and parses the file from the PDB server in the following format (the first pdb_code column is the one used):

 LIST OF OBSOLETE COORDINATE ENTRIES AND SUCCESSORS
OBSLTE     30-SEP-03 1Q1D      1QZR
OBSLTE     26-SEP-03 1DYV      1UN2
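
The record format above can be parsed with a few lines of Python. The following is a minimal sketch, not the module's own implementation; the function name parse_obsolete_codes and the sample variable are hypothetical, and the code assumes that every record starts with OBSLTE and that the obsolete code is the third whitespace-separated field.

```python
def parse_obsolete_codes(text):
    """Return the first PDB code of every OBSLTE record in `text`."""
    codes = []
    for line in text.splitlines():
        fields = line.split()
        # A record looks like: OBSLTE  30-SEP-03 1Q1D  1QZR
        # fields[2] is the obsolete code; any later fields are successors.
        if len(fields) >= 3 and fields[0] == "OBSLTE":
            codes.append(fields[2])
    return codes


sample = """\
 LIST OF OBSOLETE COORDINATE ENTRIES AND SUCCESSORS
OBSLTE     30-SEP-03 1Q1D      1QZR
OBSLTE     26-SEP-03 1DYV      1UN2
"""

print(parse_obsolete_codes(sample))  # ['1Q1D', '1DYV']
```

The header line is skipped because its first field is not OBSLTE, and the successor codes (1QZR, 1UN2) are ignored.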
        

get_recent_changes(self)

Returns three lists of the newest weekly files (added, modified, obsolete).

Reads the directories with changed entries from the PDB server and returns a tuple of three URLs to the files of new, modified and obsolete entries from the most recent list. The directory with the largest numerical name is used. Returns None if something goes wrong.

Contents of the data/status dir (20031013 would be used):

drwxrwxr-x   2 1002     sysadmin     512 Oct  6 18:28 20031006
drwxrwxr-x   2 1002     sysadmin     512 Oct 14 02:14 20031013
-rw-r--r--   1 1002     sysadmin    1327 Mar 12  2001 README
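
Choosing the directory with the largest numerical name can be sketched as follows. This is an illustration under stated assumptions rather than the module's actual code: the helper name newest_status_dir is hypothetical, and the entry name is assumed to be the last whitespace-separated field of each listing line.

```python
def newest_status_dir(listing):
    """Return the largest all-digit entry name in a listing, or None."""
    numeric = []
    for line in listing.splitlines():
        fields = line.split()
        # Keep only entries whose name consists solely of digits,
        # which skips files such as README.
        if fields and fields[-1].isdigit():
            numeric.append(int(fields[-1]))
    return max(numeric) if numeric else None


listing = """\
drwxrwxr-x   2 1002     sysadmin     512 Oct  6 18:28 20031006
drwxrwxr-x   2 1002     sysadmin     512 Oct 14 02:14 20031013
-rw-r--r--   1 1002     sysadmin    1327 Mar 12  2001 README
"""

print(newest_status_dir(listing))  # 20031013
```

Returning None when no numeric entry is found mirrors the documented behaviour of returning None if something goes wrong.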


        

get_status_list(self, url)

Retrieves a list of PDB codes in the weekly PDB status file from the given URL. Used by get_recent_changes.

Typical contents of the list files parsed by this method:

-rw-r--r--   1 rcsb     rcsb      330156 Oct 14  2003 pdb1cyq.ent
-rw-r--r--   1 rcsb     rcsb      333639 Oct 14  2003 pdb1cz0.ent
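
Extracting the four-letter codes from lines like these might look as follows. This is a hedged sketch, not the method's real implementation; the names PDB_FILE_RE and codes_from_listing are hypothetical, and the code assumes that every relevant line ends in a file name of the form pdbXXXX.ent.

```python
import re

# Matches a trailing file name such as "pdb1cyq.ent" and captures "1cyq".
PDB_FILE_RE = re.compile(r"pdb([0-9a-z]{4})\.ent$", re.IGNORECASE)


def codes_from_listing(listing):
    """Return the PDB codes found at the end of each listing line."""
    codes = []
    for line in listing.splitlines():
        match = PDB_FILE_RE.search(line.strip())
        if match:
            codes.append(match.group(1))
    return codes


status = """\
-rw-r--r--   1 rcsb     rcsb      330156 Oct 14  2003 pdb1cyq.ent
-rw-r--r--   1 rcsb     rcsb      333639 Oct 14  2003 pdb1cz0.ent
"""

print(codes_from_listing(status))  # ['1cyq', '1cz0']
```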
        

retrieve_pdb_file(self, pdb_code, obsolete=0, compression='.Z', uncompress='gunzip', pdir=None)

Retrieves a PDB structure file from the PDB server and stores it in a local file tree. The PDB structure is returned as a single string. If obsolete is 1, the file is saved in a special file tree by default. The compression should be '.Z' or '.gz'. 'uncompress' is the command called to uncompress the files.

Parameters:
  • pdir (string) - put the file in this directory (default: create a PDB-style directory tree)
Returns: string - filename
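
To illustrate what a PDB-style directory tree could look like, here is a small sketch of how a local file name might be derived from a PDB code. It is not taken from the module: the helper local_pdb_path is hypothetical, and both the use of the middle two characters of the code as the subdirectory and the pdbXXXX.ent file name are assumptions based on the common PDB archive layout and the listing names shown on this page.

```python
import os


def local_pdb_path(root, pdb_code, pdir=None):
    """Build a local path for `pdb_code` (hypothetical helper).

    Assumption: files are named pdbXXXX.ent and, unless `pdir` is given,
    are grouped in a subdirectory named after characters 2-3 of the code.
    """
    code = pdb_code.lower()
    if len(code) != 4:
        raise ValueError("expected a four-letter PDB code, got %r" % pdb_code)
    filename = "pdb%s.ent" % code
    if pdir is not None:
        # An explicit directory overrides the tree layout.
        return os.path.join(pdir, filename)
    return os.path.join(root, code[1:3], filename)


print(local_pdb_path("/pdb", "1CYQ"))
print(local_pdb_path("/pdb", "1cyq", pdir="/tmp/structures"))
```

With the default layout, entries that share their middle two characters end up in the same subdirectory, which keeps any single directory from growing too large.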

update_pdb(self)

This is probably the 'most wanted' function in this module. It gets the weekly lists of new and modified PDB entries and automatically downloads the corresponding PDB files. You can run this module as a weekly cron job.
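
The update flow described here can be sketched independently of the network code. The function below is an illustration, not the module's implementation: update_local_copy and its fetch argument are hypothetical names, and fetch stands in for whatever downloads one entry (with this class, that role would presumably be played by retrieve_pdb_file).

```python
def update_local_copy(added, modified, fetch):
    """Call fetch(code) once for each new or modified PDB code.

    Returns the codes that were fetched, in the order they were processed.
    """
    fetched = []
    for code in list(added) + list(modified):
        # A code may appear in both weekly lists; fetch it only once.
        if code not in fetched:
            fetch(code)
            fetched.append(code)
    return fetched


# Stand-in for a real download function: it only records the requests.
downloaded = []
result = update_local_copy(["1cyq", "1cz0"], ["1cz0", "1un2"], downloaded.append)
print(result)  # ['1cyq', '1cz0', '1un2']
```

Passing the download step in as a callable keeps the sketch runnable without a server connection.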


Class Variable Details

PDB_REF

Value:
'''
    The Protein Data Bank: a computer-based archival file for macromolecular structures.
    F.C.Bernstein, T.F.Koetzle, G.J.B.Williams, E.F.Meyer Jr, M.D.Brice, J.R.Rodgers, O.Kennard, T.Shimanouchi, M.Tasumi
    J. Mol. Biol. 112 pp. 535-542 (1977)
    http://www.pdb.org/.
    '''