Object used to load SeqRecord objects into a BioSQL database.
|
Initialize with connection information for the database.

Creating a DatabaseLoader object is normally handled via the BioSeqDatabase DBServer object, for example:

    from BioSQL import BioSeqDatabase
    server = BioSeqDatabase.open_database(driver="MySQLdb", user="gbrowse",
                                          passwd="biosql", host="localhost",
                                          db="test_biosql")
    try:
        db = server["test"]
    except KeyError:
        db = server.new_database("test", description="For testing GBrowse")
|
Returns the identifier for the named ontology (PRIVATE). This looks through the ontology table for the given entry name. If it is not found, a row is added for this ontology (using the definition if supplied). In either case, the id corresponding to the provided name is returned, so that you can reference it in another table. |
Get the id that corresponds to a term (PRIVATE). This looks through the term table for the given term. If it is not found, a new id corresponding to this term is created. In either case, the id corresponding to that term is returned, so that you can reference it in another table. The ontology_id should be used to disambiguate the term. |
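The "look up, insert if missing, return the id" pattern described for the ontology and term tables can be sketched as follows. This is a minimal illustration using the standard library's sqlite3 module and a simplified stand-in table, not the real BioSQL schema or the loader's actual code; the helper name get_or_create_id and the column names are assumptions made for the example.

```python
import sqlite3


def get_or_create_id(conn, table, name):
    """Return the primary key for `name` in `table`, inserting a row if needed.

    Hypothetical helper: `table` must be a trusted identifier, because it is
    interpolated into the SQL text. Only `name` is passed as a parameter.
    """
    row = conn.execute(
        f"SELECT id FROM {table} WHERE name = ?", (name,)
    ).fetchone()
    if row is not None:
        return row[0]  # existing entry: reuse its id
    cursor = conn.execute(f"INSERT INTO {table} (name) VALUES (?)", (name,))
    return cursor.lastrowid  # new entry: id assigned by the database


# Simplified stand-in for an ontology table (assumed columns).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE ontology (id INTEGER PRIMARY KEY, name TEXT UNIQUE)")

first = get_or_create_id(conn, "ontology", "Annotation Tags")   # inserts a row
second = get_or_create_id(conn, "ontology", "Annotation Tags")  # finds it again
row_count = conn.execute("SELECT COUNT(*) FROM ontology").fetchone()[0]
```

Either way the caller gets back an id that can be used as a foreign key in another table, and repeated calls with the same name do not create duplicate rows.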
Get the taxon id for this record (PRIVATE). record - a SeqRecord object This searches the taxon/taxon_name tables using the NCBI taxon ID, scientific name and common name to find the matching taxon table entry's id. If the species isn't in the taxon table, and we have at least the NCBI taxon ID, scientific name or common name, at least a minimal stub entry is created in the table. Returns the taxon id (database key for the taxon table, not an NCBI taxon ID), or None if the taxonomy information is missing. See also the BioSQL script load_ncbi_taxonomy.pl which will populate and update the taxon/taxon_name tables with the latest information from the NCBI. |
Map Entrez name terms to those used in taxdump (PRIVATE). We need to make this conversion to match the taxon_name.name_class values used by the BioSQL load_ncbi_taxonomy.pl script. e.g. "ScientificName" -> "scientific name", "EquivalentName" -> "equivalent name", "Synonym" -> "synonym", |
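One way to perform the conversion shown in the examples above is to insert a space before each upper-case letter and lower-case the result. The function below is a sketch under that assumption, with a hypothetical name; it is not necessarily how the private method is implemented.

```python
def fix_name_class(entrez_name):
    """Convert a CamelCase Entrez name class to lower-case, space-separated words."""
    pieces = []
    for letter in entrez_name:
        if letter.isupper():
            # Start a new word at each capital letter.
            pieces.append(" " + letter.lower())
        else:
            pieces.append(letter)
    # The first capital produces a leading space, which strip() removes.
    return "".join(pieces).strip()


converted = [fix_name_class(name)
             for name in ("ScientificName", "EquivalentName", "Synonym")]
```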
Get the taxon id for this record from the NCBI taxon ID (PRIVATE). ncbi_taxon_id - string containing an NCBI taxon id scientific_name - string, used if a stub entry is recorded common_name - string, used if a stub entry is recorded This searches the taxon table using ONLY the NCBI taxon ID to find the matching taxon table entry's ID (database key). If the species isn't in the taxon table, and the fetch_NCBI_taxonomy flag is true, Biopython will attempt to go online using Bio.Entrez to fetch the official NCBI lineage, recursing up the tree until an existing entry is found in the database or the full lineage has been fetched. Otherwise the NCBI taxon ID, scientific name and common name are recorded as a minimal stub entry in the taxon and taxon_name tables. Any partial information about the lineage from the SeqRecord is NOT recorded. This should mean that (re)running the BioSQL script load_ncbi_taxonomy.pl can fill in the taxonomy lineage. Returns the taxon id (database key for the taxon table, not an NCBI taxon ID). |
This is recursive! (PRIVATE). taxonomic_lineage - list of taxonomy dictionaries from Bio.Entrez. The first dictionary in the list is the taxonomy root; the last is the species. Each dictionary includes: - TaxID (string, NCBI taxon id) - Rank (string, e.g. "species", "genus", ..., "phylum", ...) - ScientificName (string) (and that is all at the time of writing). This method will record all the lineage given, returning the taxon id (database key, not NCBI taxon id) of the final entry (the species). |
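The recursion can be pictured with a small in-memory sketch: record the parent lineage first, then the final entry, and return the final entry's key. The dictionary standing in for the taxon table, the function name, and the way keys are assigned are all assumptions for illustration; the dictionary keys follow the spelling used in the description above, and the real method works against the taxon and taxon_name tables.

```python
def record_lineage(taxon_table, lineage):
    """Record every entry in `lineage`; return the key of the last entry.

    taxon_table: dict mapping NCBI taxon id (str) to a row dict (a stand-in
    for the database table). lineage: list of dicts, root first.
    """
    if not lineage:
        return None  # nothing above the root, so the root has no parent
    entry = lineage[-1]
    existing = taxon_table.get(entry["TaxID"])
    if existing is not None:
        return existing["taxon_id"]  # already recorded: stop recursing
    # Record all ancestors first, so the parent key is known.
    parent_id = record_lineage(taxon_table, lineage[:-1])
    taxon_id = len(taxon_table) + 1  # toy key assignment for the sketch
    taxon_table[entry["TaxID"]] = {
        "taxon_id": taxon_id,
        "parent_taxon_id": parent_id,
        "rank": entry["Rank"],
        "name": entry["ScientificName"],
    }
    return taxon_id


table = {}
lineage = [
    {"TaxID": "131567", "Rank": "no rank", "ScientificName": "cellular organisms"},
    {"TaxID": "9605", "Rank": "genus", "ScientificName": "Homo"},
    {"TaxID": "9606", "Rank": "species", "ScientificName": "Homo sapiens"},
]
species_key = record_lineage(table, lineage)
```

Note that the returned value is the table's own key for the species, not the NCBI taxon id.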
Fill the bioentry table with sequence information (PRIVATE). record - SeqRecord object to add to the database. |
Add the effective date of the entry into the database. record - a SeqRecord object with an annotated date bioentry_id - corresponding database identifier |
Record a SeqRecord's sequence and alphabet in the database (PRIVATE). record - a SeqRecord object with a seq property bioentry_id - corresponding database identifier |
Record a SeqRecord's annotated comment in the database (PRIVATE). record - a SeqRecord object with an annotated comment bioentry_id - corresponding database identifier |
Record a SeqRecord's misc annotations in the database (PRIVATE). The annotation strings are recorded in the bioentry_qualifier_value table, except for special cases like the reference, comment and taxonomy which are handled with their own tables. record - a SeqRecord object with an annotations dictionary bioentry_id - corresponding database identifier |
Record a SeqRecord's annotated references in the database (PRIVATE). record - a SeqRecord object with annotated references bioentry_id - corresponding database identifier |
Load the first tables of a seqfeature and return the id (PRIVATE). This loads the "key" of the seqfeature (e.g. CDS, gene) and the basic seqfeature table itself. |
Load all of the locations for a SeqFeature into tables (PRIVATE). This adds the locations related to the SeqFeature into the seqfeature_location table. Fuzzy positions are not handled right now. For a simple location, e.g. (1..2), we have a single table row with seq_start = 1, seq_end = 2, location_rank = 1. For split locations, e.g. (1..2, 3..4, 5..6), we would have three table rows with: start = 1, end = 2, rank = 1; start = 3, end = 4, rank = 2; start = 5, end = 6, rank = 3. |
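The row layout for simple and split locations can be illustrated with a short sketch. The function name and the plain (start, end) tuples are assumptions made for the example; the real method works with SeqFeature location objects and writes to the seqfeature_location table.

```python
def location_rows(parts):
    """Turn a list of (start, end) pairs into one row per part, ranked from 1."""
    return [
        {"start": start, "end": end, "rank": rank}
        for rank, (start, end) in enumerate(parts, start=1)
    ]


simple = location_rows([(1, 2)])                 # one row, rank 1
split = location_rows([(1, 2), (3, 4), (5, 6)])  # three rows, ranks 1 to 3
```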
Add a location of a SeqFeature to the seqfeature_location table (PRIVATE). TODO - Add location_operators to location_qualifier_value. |
Insert the (key, value) pair qualifiers relating to a feature (PRIVATE). Qualifiers should be a dictionary of the form: {key : [value1, value2]} |
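A dictionary of that form can be flattened into one row per value, for instance as below. This is only a sketch: the function name is hypothetical, and numbering the values of each key from 1 is an assumption made for the example rather than a statement about the stored rank values.

```python
def qualifier_rows(qualifiers):
    """Flatten {key: [value1, value2]} into (key, rank, value) tuples."""
    rows = []
    for key, values in qualifiers.items():
        if isinstance(values, str):
            values = [values]  # tolerate a bare string instead of a list
        # Assumed convention: ranks restart at 1 for each key.
        for rank, value in enumerate(values, start=1):
            rows.append((key, rank, value))
    return rows


rows = qualifier_rows({"gene": ["abc"], "note": ["first note", "second note"]})
```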
Add database cross-references of a SeqFeature to the database (PRIVATE). dbxrefs - list of dbxref data from the source file, in the format <database>:<accession> seqfeature_id - int, the identifier for the seqfeature in the seqfeature table Insert dbxref qualifier data for a seqfeature into the seqfeature_dbxref and, if required, dbxref tables. The dbxref_id qualifier/value sets go into the dbxref table as dbname, accession, version tuples, with dbxref.dbxref_id being automatically assigned, and into the seqfeature_dbxref table as seqfeature_id, dbxref_id, and rank tuples. |
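Splitting a <database>:<accession> string into its two parts might look like the sketch below. The function name, the choice to split on the first colon only, and the error handling are assumptions for illustration, not the loader's documented behaviour.

```python
def split_dbxref(value):
    """Split '<database>:<accession>' into a (database, accession) pair."""
    # partition() splits on the first colon only, so any later colons
    # stay inside the accession part.
    db, separator, accession = value.strip().partition(":")
    if not separator or not db or not accession:
        raise ValueError(f"expected <database>:<accession>, got {value!r}")
    return db, accession


pair = split_dbxref("GeneID:12345")
```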
db - string, the name of the external database containing the accession number accession - string, the accession of the dbxref data Finds and returns the dbxref_id for the passed data. The method attempts to find an existing record first, and inserts the data if there is no record.
|
Check for a pre-existing seqfeature_dbxref entry with the passed seqfeature_id and dbxref_id. If one does not exist, insert new data. |
Load any sequence level cross references into the database (PRIVATE). See table bioentry_dbxref. |
Check for a pre-existing bioentry_dbxref entry with the passed bioentry_id and dbxref_id. If one does not exist, insert new data. |
Generated by Epydoc 3.0.1 on Tue Sep 22 19:54:07 2009 | http://epydoc.sourceforge.net |