
Module Bio.expressions.genbank

Martel based parser to read GenBank formatted files.

This is a huge regular expression for GenBank, built using the 'regular expressions on steroids' capabilities of Martel.

Documentation for GenBank format that I found:

o GenBank/EMBL feature tables are described at: http://www.ebi.ac.uk/embl/Documentation/FT_definitions/feature_table.html

o There are also descriptions of different GenBank lines at: http://www.ibc.wustl.edu/standards/gbrel.txt

Function Summary
  define_block(identifier, block_tag, block_data, std_block_tag, std_tag)
Define a Martel grouping which can parse a block of text.

Variable Summary
Group accession = <Martel.Expression.Group instance at 0xb6fb1...
Group accession_block = <Martel.Expression.Group instance at 0...
Group authors_block = <Martel.Expression.Group instance at 0xb...
Group base_count = <Martel.Expression.Group instance at 0xb6f4...
Group base_count_line = <Martel.Expression.Group instance at 0...
Group base_number = <Martel.Expression.Group instance at 0xb6f...
Str big_indent_space = <Martel.Expression.Str instance at 0x...
MaxRepeat blank_space = <Martel.Expression.MaxRepeat instance at 0...
Group comment_block = <Martel.Expression.Group instance at 0xb...
Group consrtm_block = <Martel.Expression.Group instance at 0xb...
Group contig_block = <Martel.Expression.Group instance at 0xb6...
Group contig_location = <Martel.Expression.Group instance at 0...
Group data_file_division = <Martel.Expression.Group instance a...
Group date = <Martel.Expression.Group instance at 0xb6facb2c>
Group db_source_block = <Martel.Expression.Group instance at 0...
Group definition_block = <Martel.Expression.Group instance at ...
list divisions = [<Martel.Expression.Str instance at 0xb6facb...
Group feature = <Martel.Expression.Group instance at 0xb6f45e0...
Group feature_block = <Martel.Expression.Group instance at 0xb...
Group feature_key = <Martel.Expression.Group instance at 0xb6f...
int FEATURE_KEY_INDENT = 5                                                                     
Group feature_key_line = <Martel.Expression.Group instance at ...
int FEATURE_QUALIFIER_INDENT = 21                                                                    
Group features_line = <Martel.Expression.Group instance at 0xb...
ParseRecords format = <Martel.Expression.ParseRecords instance at 0xb...
Group gi = <Martel.Expression.Group instance at 0xb6fb1a6c>
Seq header = <Martel.Expression.Seq instance at 0xb6f4cacc>
int INDENT = 12                                                                    
Group journal_block = <Martel.Expression.Group instance at 0xb...
Group keywords_block = <Martel.Expression.Group instance at 0x...
Group location = <Martel.Expression.Group instance at 0xb6f455...
Group locus = <Martel.Expression.Group instance at 0xb6fac64c>
Group locus_line = <Martel.Expression.Group instance at 0xb6fa...
Group medline_line = <Martel.Expression.Group instance at 0xb6...
HeaderFooter ncbi_format = <Martel.Expression.HeaderFooter instance a...
Group nid = <Martel.Expression.Group instance at 0xb6fb17ac>
Group nid_line = <Martel.Expression.Group instance at 0xb6fb18...
Group organism = <Martel.Expression.Group instance at 0xb6fb77...
Group organism_block = <Martel.Expression.Group instance at 0x...
Group origin_line = <Martel.Expression.Group instance at 0xb6f...
Group pid = <Martel.Expression.Group instance at 0xb6fb18cc>
Group pid_line = <Martel.Expression.Group instance at 0xb6fb19...
Group primary = <Martel.Expression.Group instance at 0xb6f4546...
Group primary_line = <Martel.Expression.Group instance at 0xb6...
Group primary_ref_line = <Martel.Expression.Group instance at ...
Group pubmed_line = <Martel.Expression.Group instance at 0xb6f...
Group qualifier = <Martel.Expression.Group instance at 0xb6f45...
Alt qualifier_space = <Martel.Expression.Alt instance at 0xb...
Str quote = <Martel.Expression.Str instance at 0xb6f458ac>
Group quoted_chars = <Martel.Expression.Group instance at 0xb6...
Seq quoted_string = <Martel.Expression.Seq instance at 0xb6f...
Group record = <Martel.Expression.Group instance at 0xb6f4ca6c...
Group record_end = <Martel.Expression.Group instance at 0xb6f4...
Group reference = <Martel.Expression.Group instance at 0xb6fbe...
Group reference_bases = <Martel.Expression.Group instance at 0...
Group reference_line = <Martel.Expression.Group instance at 0x...
Group reference_num = <Martel.Expression.Group instance at 0xb...
Group region = <Martel.Expression.Group instance at 0xb6fb144c...
Group remark_block = <Martel.Expression.Group instance at 0xb6...
list residue_prefixes = [<Martel.Expression.Str instance at 0...
Group residue_type = <Martel.Expression.Group instance at 0xb6...
list residue_types = [<Martel.Expression.Str instance at 0xb6...
Group segment = <Martel.Expression.Group instance at 0xb6fb71e...
Group segment_line = <Martel.Expression.Group instance at 0xb6...
Group sequence = <Martel.Expression.Group instance at 0xb6f4c1...
Group sequence_entry = <Martel.Expression.Group instance at 0x...
Group sequence_line = <Martel.Expression.Group instance at 0xb...
Group sequence_plus_spaces = <Martel.Expression.Group instance...
Group size = <Martel.Expression.Group instance at 0xb6fac70c>
Str small_indent_space = <Martel.Expression.Str instance at ...
Group source_block = <Martel.Expression.Group instance at 0xb6...
Group taxonomy = <Martel.Expression.Group instance at 0xb6fb75...
Group title_block = <Martel.Expression.Group instance at 0xb6f...
Seq unquoted_string = <Martel.Expression.Seq instance at 0xb...
list valid_divisions = ['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'P...
list valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
list valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rR...
Group version = <Martel.Expression.Group instance at 0xb6fb19a...
Group version_line = <Martel.Expression.Group instance at 0xb6...

Function Details

define_block(identifier, block_tag, block_data, std_block_tag=None, std_tag=None)

Define a Martel grouping which can parse a block of text.

Many of the GenBank lines we'll want to process are grouped into a block like:

IDENTIFIER Blah blah blah

where the text (blah blah blah) can wrap over multiple lines. This function makes it easy to define such blocks consistently.

Arguments:

o identifier - The identifier that begins the block (like DEFINITION).

o block_tag - A callback tag for the entire block.

o block_data - A callback tag for the data in the block (i.e. the stuff you are interested in).

o std_block_tag - A Bio.Std Martel tag used to register the entire block as being a "standard" type of information.

o std_tag - A Bio.Std Martel tag used to register just the information in the block as being "standard".
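
As a rough illustration of the block layout described above, the following sketch uses Python's standard re module rather than Martel. The helper name parse_block is hypothetical, and the assumption that continuation lines are indented by INDENT (12) spaces is taken from the INDENT value listed for this module; it is not a description of how define_block is actually implemented.

```python
import re

INDENT = 12  # value listed for this module's INDENT variable


def parse_block(identifier, text):
    """Return the joined data of the block starting with `identifier`, or None.

    Hypothetical helper: it mimics the block layout with the standard
    `re` module and does not use Martel.
    """
    # First line: the identifier, padding spaces, then data.
    # Continuation lines: at least INDENT spaces, then more data.
    pattern = re.compile(
        r"^%s +(?P<first>.*)\n(?P<rest>(?: {%d}.*\n)*)"
        % (re.escape(identifier), INDENT),
        re.MULTILINE,
    )
    match = pattern.search(text)
    if match is None:
        return None
    lines = [match.group("first")]
    lines.extend(line[INDENT:] for line in match.group("rest").splitlines())
    return " ".join(line.strip() for line in lines)


# Made-up sample text, for illustration only.
sample = (
    "DEFINITION  Blah blah blah\n"
    + " " * INDENT + "blah blah.\n"
    + "ACCESSION   X00000\n"
)
print(parse_block("DEFINITION", sample))
```

The wrapped lines are joined with single spaces, so a caller sees one string per block regardless of how the text was wrapped in the file.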

Variable Details

accession

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb13cc>                       

accession_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb172c>                       

authors_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb7e4c>                       

base_count

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f45eec>                       

base_count_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f45f6c>                       

base_number

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c1ac>                       

big_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb6fac56c>                         

blank_space

Type:
MaxRepeat
Value:
<Martel.Expression.MaxRepeat instance at 0xb6fac50c>                   

comment_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbefcc>                       

consrtm_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbe10c>                       

contig_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c78c>                       

contig_location

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c54c>                       

data_file_division

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6face0c>                       

date

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6facb2c>                       

db_source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb1eac>                       

definition_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb136c>                       

divisions

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb6facb0c>,
 <Martel.Expression.Str instance at 0xb6facbac>,
 <Martel.Expression.Str instance at 0xb6facbcc>,
 <Martel.Expression.Str instance at 0xb6facbec>,
 <Martel.Expression.Str instance at 0xb6facc0c>,
 <Martel.Expression.Str instance at 0xb6facc2c>,
 <Martel.Expression.Str instance at 0xb6facc4c>,
 <Martel.Expression.Str instance at 0xb6facc6c>,
...                                                                    

feature

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f45e0c>                       

feature_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f45e6c>                       

feature_key

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f455ac>                       

FEATURE_KEY_INDENT

Type:
int
Value:
5                                                                     

feature_key_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4584c>                       

FEATURE_QUALIFIER_INDENT

Type:
int
Value:
21                                                                    
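
The two feature indent values correspond to the GenBank feature table layout, where feature keys start after 5 spaces and qualifiers start after 21 spaces. The sketch below shows one way these constants could be used to label lines; classify_feature_line is a hypothetical helper and the sample lines are made up. It does not distinguish a qualifier from wrapped continuation text at the same indent.

```python
FEATURE_KEY_INDENT = 5         # values listed for this module
FEATURE_QUALIFIER_INDENT = 21


def classify_feature_line(line):
    """Hypothetical helper: label a feature-table line by its indentation."""
    stripped = line.lstrip(" ")
    indent = len(line) - len(stripped)
    if indent == FEATURE_KEY_INDENT:
        # A key line holds the feature key, padding, then the location.
        key, _, location = stripped.partition(" ")
        return ("key", key, location.strip())
    if indent == FEATURE_QUALIFIER_INDENT:
        # Qualifier text (or wrapped continuation text) at the deeper indent.
        return ("qualifier", stripped.rstrip())
    return ("other", line.rstrip())


# Made-up sample lines, for illustration only.
lines = [
    " " * 5 + "CDS".ljust(16) + "1..206",
    " " * 21 + '/product="example"',
]
for line in lines:
    print(classify_feature_line(line))
```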

features_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4550c>                       

format

Type:
ParseRecords
Value:
<Martel.Expression.ParseRecords instance at 0xb6f4cbcc>                

gi

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb1a6c>                       

header

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb6f4cacc>                         

INDENT

Type:
int
Value:
12                                                                    

journal_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbe64c>                       

keywords_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb716c>                       

location

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4558c>                       

locus

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fac64c>                       

locus_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6face6c>                       

medline_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbe68c>                       

ncbi_format

Type:
HeaderFooter
Value:
<Martel.Expression.HeaderFooter instance at 0xb6f4caec>                

nid

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb17ac>                       

nid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb182c>                       

organism

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb778c>                       

organism_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb794c>                       

origin_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c12c>                       

pid

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb18cc>                       

pid_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb194c>                       

primary

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4546c>                       

primary_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f450ec>                       

primary_ref_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4532c>                       

pubmed_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbe82c>                       

qualifier

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f45cac>                       

qualifier_space

Type:
Alt
Value:
<Martel.Expression.Alt instance at 0xb6fac5cc>                         

quote

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb6f458ac>                         

quoted_chars

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4590c>                       

quoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb6f45acc>                         

record

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4ca6c>                       

record_end

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c86c>                       

reference

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbecec>                       

reference_bases

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb7a6c>                       

reference_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb7b6c>                       

reference_num

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb79cc>                       

region

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb144c>                       

remark_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbec2c>                       

residue_prefixes

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb6fac7cc>,
 <Martel.Expression.Str instance at 0xb6fac7ec>,
 <Martel.Expression.Str instance at 0xb6fac80c>]                       

residue_type

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fac9cc>                       

residue_types

Type:
list
Value:
[<Martel.Expression.Str instance at 0xb6fac82c>,
 <Martel.Expression.Str instance at 0xb6fac84c>,
 <Martel.Expression.Str instance at 0xb6fac86c>,
 <Martel.Expression.Str instance at 0xb6fac88c>,
 <Martel.Expression.Str instance at 0xb6fac8ac>,
 <Martel.Expression.Str instance at 0xb6fac8cc>,
 <Martel.Expression.Str instance at 0xb6fac8ec>,
 <Martel.Expression.Str instance at 0xb6fac90c>,
...                                                                    

segment

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb71ec>                       

segment_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb742c>                       

sequence

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c16c>                       

sequence_entry

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c4ec>                       

sequence_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c40c>                       

sequence_plus_spaces

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6f4c38c>                       

size

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fac70c>                       

small_indent_space

Type:
Str
Value:
<Martel.Expression.Str instance at 0xb6fac46c>                         

source_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb772c>                       

taxonomy

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb754c>                       

title_block

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fbe3ac>                       

unquoted_string

Type:
Seq
Value:
<Martel.Expression.Seq instance at 0xb6f45bac>                         

valid_divisions

Type:
list
Value:
['PRI', 'ROD', 'MAM', 'VRT', 'INV', 'PLN', 'BCT', 'RNA', 'VRL']        

valid_residue_prefixes

Type:
list
Value:
['ss-', 'ds-', 'ms-']                                                  

valid_residue_types

Type:
list
Value:
['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA', 'scRNA', 'snRNA', 'snoRNA']
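
The valid_* lists of plain strings are paired with lists of expression objects (residue_prefixes, residue_types, divisions). As a sketch of the general idea of turning a list of literals into a single alternation, the following uses the standard re module, not Martel; the alternation helper and the residue_re pattern are assumptions made for illustration.

```python
import re

valid_residue_prefixes = ['ss-', 'ds-', 'ms-']
valid_residue_types = ['DNA', 'RNA', 'mRNA', 'tRNA', 'rRNA', 'uRNA',
                       'scRNA', 'snRNA', 'snoRNA']


def alternation(literals):
    """Hypothetical helper: build a non-capturing alternation of literals."""
    # Longest literals first, so a shorter literal never shadows a longer one.
    ordered = sorted(literals, key=len, reverse=True)
    return "(?:" + "|".join(re.escape(item) for item in ordered) + ")"


# An optional strandedness prefix followed by a residue type.
residue_re = re.compile(
    "(?P<prefix>%s)?(?P<type>%s)"
    % (alternation(valid_residue_prefixes), alternation(valid_residue_types))
)

for text in ("ss-RNA", "DNA", "protein"):
    match = residue_re.fullmatch(text)
    print(text, match.groupdict() if match else None)
```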

version

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb19ac>                       

version_line

Type:
Group
Value:
<Martel.Expression.Group instance at 0xb6fb1bcc>                       

Generated by Epydoc 2.1 on Thu Aug 10 20:01:11 2006 http://epydoc.sf.net