Class UnknownSeq
object --+
         |
       Seq --+
             |
            UnknownSeq
A read-only sequence object of known length but unknown contents.
If you have an unknown sequence, you can represent this with a normal
Seq object, for example:
>>> my_seq = Seq("N"*5)
>>> my_seq
Seq('NNNNN', Alphabet())
>>> len(my_seq)
5
>>> print(my_seq)
NNNNN
However, this is rather wasteful of memory (especially for large
sequences), which is where this class is most useful:
>>> unk_five = UnknownSeq(5)
>>> unk_five
UnknownSeq(5, alphabet = Alphabet(), character = '?')
>>> len(unk_five)
5
>>> print(unk_five)
?????
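The saving comes from storing only a length and a placeholder character instead of the full string. The following is a minimal sketch of that idea, not Biopython's implementation; the class name `CompactUnknown` is hypothetical and exists only for illustration.

```python
import sys


class CompactUnknown(object):
    """Hypothetical stand-in: remembers a length and one character only."""

    __slots__ = ("_length", "_character")

    def __init__(self, length, character="?"):
        if length < 0:
            raise ValueError("length must be non-negative")
        if len(character) != 1:
            raise ValueError("character must be a single letter")
        self._length = length
        self._character = character

    def __len__(self):
        # The stated length, without building any string.
        return self._length

    def __str__(self):
        # The full string is only created on demand.
        return self._character * self._length


big = CompactUnknown(10 ** 6, "N")
full = "N" * 10 ** 6

# sys.getsizeof reports each object's own footprint: the compact object
# stays small, while the explicit string grows with its length.
compact_size = sys.getsizeof(big)
string_size = sys.getsizeof(full)
```

Calling `str(big)` still builds the full million-letter string, so the benefit only holds while the sequence is kept in its compact form.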
You can add unknown sequences together, provided their alphabets and
characters are compatible, and get another memory-saving UnknownSeq:
>>> unk_four = UnknownSeq(4)
>>> unk_four
UnknownSeq(4, alphabet = Alphabet(), character = '?')
>>> unk_four + unk_five
UnknownSeq(9, alphabet = Alphabet(), character = '?')
If the alphabet or characters don't match up, the addition gives an
ordinary Seq object:
>>> unk_nnnn = UnknownSeq(4, character = "N")
>>> unk_nnnn
UnknownSeq(4, alphabet = Alphabet(), character = 'N')
>>> unk_nnnn + unk_four
Seq('NNNN????', Alphabet())
Combining with a real Seq gives a new Seq object:
>>> known_seq = Seq("ACGT")
>>> unk_four + known_seq
Seq('????ACGT', Alphabet())
>>> known_seq + unk_four
Seq('ACGT????', Alphabet())
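The addition rule described above can be sketched with plain tuples. This is an illustration under simplifying assumptions, not the library's code: the helper name `add_unknown` is hypothetical, each unknown sequence is modelled as a `(length, character)` pair, and alphabet compatibility is ignored.

```python
def add_unknown(left, right):
    """Combine two (length, character) pairs (hypothetical helper).

    Returns a (length, character) pair when the characters match,
    otherwise the fully expanded string.
    """
    left_len, left_char = left
    right_len, right_char = right
    if left_char == right_char:
        # Same placeholder: the result can stay compact.
        return (left_len + right_len, left_char)
    # Different placeholders: fall back to an explicit string.
    return left_char * left_len + right_char * right_len


same = add_unknown((4, "?"), (5, "?"))    # stays compact
mixed = add_unknown((4, "N"), (4, "?"))   # expands to a string
```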
__init__(self, length, alphabet=Alphabet(), character=None)
Create a new UnknownSeq object.
__str__(self)
Returns the unknown sequence as a full string of the given length.
count(self, sub, start=0, end=2147483647)
Non-overlapping count method, like that of a Python string.
translate(self, **kwargs)
Translate an unknown nucleotide sequence into an unknown protein.
Inherited from Seq: endswith, find, lstrip, rfind, rsplit, rstrip, split, startswith, strip, tomutable, tostring
Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__
Inherited from Seq: data
Inherited from object: __class__
__init__(self, length, alphabet=Alphabet(), character=None)
(Constructor)
Create a new UnknownSeq object.
If character is omitted, it is determined from the alphabet:
"N" for nucleotides, "X" for proteins, and
"?" otherwise.
- Overrides:
object.__init__
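The default-character rule can be written as a small lookup. This sketch is an assumption-laden illustration: the real constructor inspects the alphabet object, whereas the hypothetical helper `default_character` below takes a plain string label.

```python
def default_character(kind):
    """Pick a placeholder from a sequence kind (hypothetical string labels)."""
    if kind in ("dna", "rna", "nucleotide"):
        # Nucleotide sequences use the ambiguity code N.
        return "N"
    if kind == "protein":
        # Protein sequences use X for an unknown residue.
        return "X"
    # Anything else, including a generic alphabet, gets a question mark.
    return "?"
```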
__len__(self)
Returns the stated length of the unknown sequence.
- Overrides:
Seq.__len__
__str__(self)
(Informal representation operator)
Returns the unknown sequence as a full string of the given length.
- Overrides:
object.__str__
__repr__(self)
Returns a (truncated) representation of the sequence for
debugging.
- Overrides:
object.__repr__
- (inherited documentation)
__add__(self, other)
Add another sequence or string to this sequence.
- Overrides:
Seq.__add__
- (inherited documentation)
count(self, sub, start=0, end=2147483647)
Non-overlapping count method, like that of a Python string.
This behaves like the Python string (and Seq object) method of the
same name, which does a non-overlapping count.
Returns an integer, the number of occurrences of substring argument
sub in the (sub)sequence given by [start:end]. Optional arguments start
and end are interpreted as in slice notation.
Arguments:
- sub - a string or another Seq object to look for
- start - optional integer, slice start
- end - optional integer, slice end
>>> "NNNN".count("N")
4
>>> Seq("NNNN").count("N")
4
>>> UnknownSeq(4, character="N").count("N")
4
>>> UnknownSeq(4, character="N").count("A")
0
>>> UnknownSeq(4, character="N").count("AA")
0
However, please note that because Python strings and Seq objects (and
MutableSeq objects) do a non-overlapping search, this may not give the
answer you expect:
>>> UnknownSeq(4, character="N").count("NN")
2
>>> UnknownSeq(4, character="N").count("NNN")
1
- Overrides:
Seq.count
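Because every letter of an unknown sequence is the same, a non-overlapping count can be computed arithmetically instead of by scanning a string. The sketch below illustrates that reasoning; the function name `count_in_unknown` is hypothetical, and it is not claimed to match the library's implementation.

```python
def count_in_unknown(length, character, sub, start=0, end=None):
    """Non-overlapping count of sub in `length` copies of `character`."""
    if not sub:
        raise ValueError("an empty sub is not handled in this sketch")
    # Resolve start/end the same way slice notation would.
    begin, stop, _ = slice(start, end).indices(length)
    window = max(0, stop - begin)
    if set(sub) != {character}:
        # A sub containing any other letter can never match.
        return 0
    # Each match consumes len(sub) letters, so matches cannot overlap.
    return window // len(sub)
```

For a four-letter run of "N", this gives 4 for "N", 2 for "NN" and 1 for "NNN", matching the doctest results above.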
complement(self)
The complement of an unknown nucleotide equals itself.
>>> my_nuc = UnknownSeq(8)
>>> my_nuc
UnknownSeq(8, alphabet = Alphabet(), character = '?')
>>> print(my_nuc)
????????
>>> my_nuc.complement()
UnknownSeq(8, alphabet = Alphabet(), character = '?')
>>> print(my_nuc.complement())
????????
- Overrides:
Seq.complement
reverse_complement(self)
The reverse complement of an unknown nucleotide equals itself.
>>> my_nuc = UnknownSeq(10)
>>> my_nuc
UnknownSeq(10, alphabet = Alphabet(), character = '?')
>>> print(my_nuc)
??????????
>>> my_nuc.reverse_complement()
UnknownSeq(10, alphabet = Alphabet(), character = '?')
>>> print(my_nuc.reverse_complement())
??????????
- Overrides:
Seq.reverse_complement
transcribe(self)
Returns unknown RNA sequence from an unknown DNA sequence.
>>> my_dna = UnknownSeq(10, character="N")
>>> my_dna
UnknownSeq(10, alphabet = Alphabet(), character = 'N')
>>> print(my_dna)
NNNNNNNNNN
>>> my_rna = my_dna.transcribe()
>>> my_rna
UnknownSeq(10, alphabet = RNAAlphabet(), character = 'N')
>>> print(my_rna)
NNNNNNNNNN
- Overrides:
Seq.transcribe
back_transcribe(self)
Returns unknown DNA sequence from an unknown RNA sequence.
>>> my_rna = UnknownSeq(20, character="N")
>>> my_rna
UnknownSeq(20, alphabet = Alphabet(), character = 'N')
>>> print(my_rna)
NNNNNNNNNNNNNNNNNNNN
>>> my_dna = my_rna.back_transcribe()
>>> my_dna
UnknownSeq(20, alphabet = DNAAlphabet(), character = 'N')
>>> print(my_dna)
NNNNNNNNNNNNNNNNNNNN
- Overrides:
Seq.back_transcribe
translate(self, **kwargs)
Translate an unknown nucleotide sequence into an unknown protein.
For example:
>>> my_seq = UnknownSeq(11, character="N")
>>> print(my_seq)
NNNNNNNNNNN
>>> my_protein = my_seq.translate()
>>> my_protein
UnknownSeq(3, alphabet = ProteinAlphabet(), character = 'X')
>>> print(my_protein)
XXX
In comparison, using a normal Seq object:
>>> my_seq = Seq("NNNNNNNNNNN")
>>> print(my_seq)
NNNNNNNNNNN
>>> my_protein = my_seq.translate()
>>> my_protein
Seq('XXX', ExtendedIUPACProtein())
>>> print(my_protein)
XXX
- Overrides:
Seq.translate
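In the example above, 11 unknown nucleotides yield 3 unknown amino acids, which is consistent with counting only complete codons. A minimal sketch of that arithmetic follows; the function name `unknown_protein_length` is hypothetical, and the assumption that trailing bases are dropped is inferred from the doctest rather than from the library source.

```python
def unknown_protein_length(nucleotide_length):
    """Number of complete codons in a nucleotide sequence of this length."""
    if nucleotide_length < 0:
        raise ValueError("length must be non-negative")
    # Three nucleotides per codon; an incomplete final codon is ignored.
    return nucleotide_length // 3
```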