Dictionary-like indexing of sequence files (PRIVATE).
You are not expected to access this module, or any of its code, directly. This is all handled internally by the Bio.SeqIO.index(...) function which is the public interface for this functionality.
The basic idea is that we scan over a sequence file, looking for new record markers. We then try to extract the string that Bio.SeqIO.parse/read would use as the record id, ideally without actually parsing the full record. Finally, we use a subclassed Python dictionary to record the file offset of each record start against its record id.
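The scanning step can be sketched for FASTA as follows. This is a minimal illustration under stated assumptions, not the module's actual code: `build_fasta_offsets` and `fetch_raw` are hypothetical helper names, the record id is assumed to be the first word after `>`, and a plain dict stands in for the subclassed dictionary.

```python
import io


def build_fasta_offsets(handle):
    """Map each record id to the byte offset of its '>' line.

    Hypothetical helper: only header lines are inspected, so no
    record is fully parsed while the index is built.
    """
    offsets = {}
    while True:
        pos = handle.tell()
        line = handle.readline()
        if not line:
            break  # end of file
        if line.startswith(b">"):
            words = line[1:].split(None, 1)
            if not words:
                raise ValueError("Header without an id at offset %d" % pos)
            record_id = words[0].decode("ascii")
            if record_id in offsets:
                raise ValueError("Duplicate key %r" % record_id)
            offsets[record_id] = pos
    return offsets


def fetch_raw(handle, offset):
    """Return the raw bytes of the record starting at the given offset."""
    handle.seek(offset)
    lines = [handle.readline()]  # the '>' header line
    while True:
        line = handle.readline()
        if not line or line.startswith(b">"):
            break  # next record or end of file
        lines.append(line)
    return b"".join(lines)


data = b">alpha first record\nACGT\nTTGA\n>beta\nGGCC\n"
handle = io.BytesIO(data)
offsets = build_fasta_offsets(handle)
beta_raw = fetch_raw(handle, offsets["beta"])
```

Here the handle is opened in binary mode so that `tell()` and `seek()` work on plain byte offsets.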
Note that this means full parsing is on demand, so any invalid or problem record may not trigger an exception until it is accessed. This is by design.
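The on-demand behaviour can be illustrated with a small read-only mapping. This is a sketch only: `LazyFastaDict` is a hypothetical class, not one of the classes in this module, and the strict `ACGTN` check is an assumption standing in for whatever validation a real parser performs.

```python
import io
from collections.abc import Mapping


class LazyFastaDict(Mapping):
    """Read-only mapping from record id to (title, sequence).

    Hypothetical sketch: only offsets are stored at construction time,
    and a record is parsed and validated when it is looked up.
    """

    def __init__(self, handle):
        self._handle = handle
        self._offsets = {}
        while True:
            pos = handle.tell()
            line = handle.readline()
            if not line:
                break
            if line.startswith(b">"):
                words = line[1:].split(None, 1)
                if not words:
                    raise ValueError("Header without an id at offset %d" % pos)
                self._offsets[words[0].decode("ascii")] = pos

    def __len__(self):
        return len(self._offsets)

    def __iter__(self):
        return iter(self._offsets)

    def __getitem__(self, key):
        self._handle.seek(self._offsets[key])  # KeyError if key is unknown
        title = self._handle.readline()[1:].decode("ascii").strip()
        chunks = []
        while True:
            line = self._handle.readline()
            if not line or line.startswith(b">"):
                break
            chunks.append(line.strip().decode("ascii"))
        seq = "".join(chunks)
        # Assumed validation rule, applied only now that the record is accessed.
        if set(seq) - set("ACGTN"):
            raise ValueError("Invalid sequence in record %r" % key)
        return title, seq


# Indexing succeeds even though the second record is malformed.
records = LazyFastaDict(io.BytesIO(b">good\nACGT\n>bad\nAC!!\n"))
good = records["good"]
try:
    records["bad"]
    bad_error = None
except ValueError as err:
    bad_error = err  # raised only at access time
```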
As a consequence, our dictionary-like objects hold ALL the keys (all the record identifiers) in memory, which shouldn't be a problem even with second generation sequencing data. If this becomes an issue later on, one idea is to store the keys and offsets in a temporary lookup file (e.g. using SQLite or an OBDA style index).
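The SQLite idea could look roughly like the sketch below, which uses Python's standard `sqlite3` module. Nothing here is part of this module: the table name `offset_index`, its column names, and the helpers `save_offsets` and `lookup_offset` are all assumptions made for illustration, and an in-memory database stands in for a temporary file.

```python
import sqlite3


def save_offsets(conn, offsets):
    """Store a {record id: file offset} mapping in a lookup table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS offset_index ("
        "record_id TEXT PRIMARY KEY, file_offset INTEGER NOT NULL)"
    )
    conn.executemany(
        "INSERT INTO offset_index (record_id, file_offset) VALUES (?, ?)",
        offsets.items(),
    )
    conn.commit()


def lookup_offset(conn, record_id):
    """Return the stored offset, mimicking a dictionary lookup."""
    row = conn.execute(
        "SELECT file_offset FROM offset_index WHERE record_id = ?",
        (record_id,),
    ).fetchone()
    if row is None:
        raise KeyError(record_id)
    return row[0]


# ":memory:" keeps the example self-contained; a real index would
# presumably pass a temporary file path instead.
conn = sqlite3.connect(":memory:")
save_offsets(conn, {"alpha": 0, "beta": 30})
beta_offset = lookup_offset(conn, "beta")
```

The primary key on `record_id` also rejects duplicate identifiers at insert time, which matches the dictionary requirement that keys be unique.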
Class | Description |
---|---|
_IndexedSeqFileDict | Read-only dictionary interface to a sequential sequence file. |
_SequentialSeqFileDict | Subclass for easy cases (PRIVATE). |
FastaDict | Indexed dictionary-like access to a FASTA file. |
QualDict | Indexed dictionary-like access to a QUAL file. |
PirDict | Indexed dictionary-like access to a PIR/NBRF file. |
PhdDict | Indexed dictionary-like access to a PHD (PHRED) file. |
AceDict | Indexed dictionary-like access to an ACE file. |
GenBankDict | Indexed dictionary-like access to a GenBank file. |
EmblDict | Indexed dictionary-like access to an EMBL file. |
SwissDict | Indexed dictionary-like access to a SwissProt file. |
IntelliGeneticsDict | Indexed dictionary-like access to an IntelliGenetics file. |
TabDict | Indexed dictionary-like access to a simple tabbed file. |
_FastqSeqFileDict | Subclass for easy cases (PRIVATE). |
FastqSangerDict | Indexed dictionary-like access to a standard Sanger FASTQ file. |
FastqSolexaDict | Indexed dictionary-like access to a Solexa (or early Illumina) FASTQ file. |
FastqIlluminaDict | Indexed dictionary-like access to an Illumina 1.3+ FASTQ file. |
_FormatToIndexedDict =
__package__ =
Generated by Epydoc 3.0.1 on Tue Sep 22 19:53:55 2009 | http://epydoc.sourceforge.net |