sigcleave |
sigcleave uses the method of von Heijne, as modified in his later book, in which the treatment of positions -1 and -3 in the matrix is slightly altered (see references).
    % sigcleave
    Reports protein signal cleavage sites
    Input sequence(s): tsw:ach2_drome
    Minimum weight [3.5]:
    Output report [ach2_drome.sig]:
    Mandatory qualifiers:
      [-sequence]          seqall     Sequence database USA
       -minweight          float      Minimum scoring weight value for the predicted cleavage site
      [-outfile]           report     Output report file name

    Optional qualifiers:
       -prokaryote         boolean    Specifies the sequence is prokaryotic and changes the default scoring data file name

    Advanced qualifiers: (none)

    General qualifiers:
       -help               boolean    Report command line options. More information on associated and general qualifiers can be found with -help -verbose
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Mandatory qualifiers | | | |
[-sequence] (Parameter 1) | Sequence database USA | Readable sequence(s) | Required |
-minweight | Minimum scoring weight value for the predicted cleavage site | Number from 0.000 to 100.000 | 3.5 |
[-outfile] (Parameter 2) | Output report file name | Report output file | |
Optional qualifiers | | | |
-prokaryote | Specifies the sequence is prokaryotic and changes the default scoring data file name | Boolean value Yes/No | No |
Advanced qualifiers | | | |
(none) | | | |
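The qualifiers listed above can also be supplied on the command line instead of answering the prompts. The following is a minimal Python sketch that assembles such a command from the documented qualifiers only. The helper names `build_sigcleave_command` and `run_sigcleave` are hypothetical, and the sketch assumes that a boolean qualifier such as `-prokaryote` is switched on by naming it without a value.

```python
import shutil
import subprocess


def build_sigcleave_command(sequence, outfile, minweight=3.5, prokaryote=False):
    """Return the argument list for a non-interactive sigcleave run.

    Only qualifiers documented in the table above are used.
    """
    cmd = [
        "sigcleave",
        "-sequence", sequence,         # sequence database USA
        "-minweight", str(minweight),  # documented range: 0.000 to 100.000
        "-outfile", outfile,           # report output file
    ]
    if prokaryote:
        # Assumption: naming a boolean qualifier sets it to Yes.
        cmd.append("-prokaryote")
    return cmd


def run_sigcleave(sequence, outfile, **options):
    """Run sigcleave if it is installed; raise otherwise."""
    if shutil.which("sigcleave") is None:
        raise FileNotFoundError("sigcleave was not found on PATH")
    return subprocess.run(build_sigcleave_command(sequence, outfile, **options))


if __name__ == "__main__":
    # Same inputs as the example session: tsw:ach2_drome, weight 3.5.
    print(" ".join(build_sigcleave_command("tsw:ach2_drome", "ach2_drome.sig")))
```

Building the argument list separately from running it keeps the sketch usable on machines where EMBOSS is not installed.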
    ID   ACH2_DROME     STANDARD;      PRT;   576 AA.
    AC   P17644;
    DT   01-AUG-1990 (REL. 15, CREATED)
    DT   01-AUG-1990 (REL. 15, LAST SEQUENCE UPDATE)
    DT   01-NOV-1997 (REL. 35, LAST ANNOTATION UPDATE)
    DE   ACETYLCHOLINE RECEPTOR PROTEIN, ALPHA-LIKE CHAIN 2 PRECURSOR.
    GN   ACRE OR SAD OR ACR96AB.
    OS   DROSOPHILA MELANOGASTER (FRUIT FLY).
    OC   EUKARYOTA; METAZOA; ARTHROPODA; TRACHEATA; HEXAPODA; INSECTA;
    OC   PTERYGOTA; DIPTERA; BRACHYCERA; MUSCOMORPHA; EPHYDROIDEA;
    OC   DROSOPHILIDAE; DROSOPHILA.
    RN   [1]
    RP   SEQUENCE FROM N.A.
    RC   TISSUE=HEAD;
    RX   MEDLINE; 90301489.
    RA   BAUMANN A., JONAS P., GUNDELFINGER E.D.;
    RT   "Sequence of D alpha 2, a novel alpha-like subunit of Drosophila
    RT   nicotinic acetylcholine receptors.";
    RL   NUCLEIC ACIDS RES. 18:3640-3640(1990).
    RN   [2]
    RP   SEQUENCE FROM N.A.
    RC   TISSUE=HEAD;
    RX   MEDLINE; 90353591.
    RA   JONAS P., BAUMANN A., MERZ B., GUNDELFINGER E.D.;
    RT   "Structure and developmental expression of the D alpha 2 gene
    RT   encoding a novel nicotinic acetylcholine receptor protein of
    RT   Drosophila melanogaster.";
    RL   FEBS LETT. 269:264-268(1990).
    RN   [3]
    RP   SEQUENCE FROM N.A.
    RX   MEDLINE; 90360975.
    RA   SAWRUK E., SCHLOSS P., BETZ H., SCHMITT B.;
    RT   "Heterogeneity of Drosophila nicotinic acetylcholine receptors: SAD,
    RT   a novel developmentally regulated alpha-subunit.";
    RL   EMBO J. 9:2671-2677(1990).
    CC   -!- FUNCTION: AFTER BINDING ACETYLCHOLINE, THE ACHR RESPONDS BY AN
    CC       EXTENSIVE CHANGE IN CONFORMATION THAT AFFECTS ALL SUBUNITS AND
    CC       LEADS TO OPENING OF AN ION-CONDUCTING CHANNEL ACROSS THE PLASMA
    CC       MEMBRANE.
    CC   -!- SUBCELLULAR LOCATION: INTEGRAL MEMBRANE PROTEIN.
    CC   -!- TISSUE SPECIFICITY: CNS IN EMBRYOS.
    CC   -!- DEVELOPMENTAL STAGE: LATE EMBRYONIC AND LATE PUPAL STAGES.
    CC   -!- SIMILARITY: BELONGS TO THE LIGAND-GATED IONIC CHANNELS FAMILY.
    CC   --------------------------------------------------------------------------
    CC   This SWISS-PROT entry is copyright. It is produced through a collaboration
    CC   between the Swiss Institute of Bioinformatics and the EMBL outstation -
    CC   the European Bioinformatics Institute. There are no restrictions on its
    CC   use by non-profit institutions as long as its content is in no way
    CC   modified and this statement is not removed. Usage by and for commercial
    CC   entities requires a license agreement (See http://www.isb-sib.ch/announce/
    CC   or send an email to license@isb-sib.ch).
    CC   --------------------------------------------------------------------------
    DR   EMBL; X52274; G7803; -.
    DR   EMBL; X53583; G8533; -.
    DR   PIR; S11679; ACFFA2.
    DR   FLYBASE; FBgn0000039; nAcR-alpha-96Ab.
    DR   PROSITE; PS00236; NEUROTR_ION_CHANNEL; 1.
    DR   PFAM; PF00065; neur_chan; 1.
    KW   RECEPTOR; POSTSYNAPTIC MEMBRANE; IONIC CHANNEL; GLYCOPROTEIN; SIGNAL;
    KW   TRANSMEMBRANE; MULTIGENE FAMILY.
    FT   SIGNAL        1     41       PROBABLE.
    FT   CHAIN        42    576       ACETYLCHOLINE RECEPTOR PROTEIN, ALPHA-2.
    FT   DOMAIN       42    261       EXTRACELLULAR (POTENTIAL).
    FT   TRANSMEM    262    285       POTENTIAL.
    FT   TRANSMEM    293    311       POTENTIAL.
    FT   TRANSMEM    327    346       POTENTIAL.
    FT   DOMAIN      347    526       CYTOPLASMIC (POTENTIAL).
    FT   TRANSMEM    527    545       POTENTIAL.
    FT   DISULFID    169    183       BY SIMILARITY.
    FT   DISULFID    243    244       ASSOCIATED WITH RECEPTOR ACTIVATION
    FT                                (BY SIMILARITY).
    FT   CARBOHYD     65     65       POTENTIAL.
    FT   CARBOHYD    254    254       POTENTIAL.
    FT   CARBOHYD    570    570       POTENTIAL.
    SQ   SEQUENCE   576 AA;  65506 MW;  7B795689 CRC32;
         MAPGCCTTRP RPIALLAHIW RHCKPLCLLL VLLLLCETVQ ANPDAKRLYD DLLSNYNRLI
         RPVSNNTDTV LVKLGLRLSQ LIDLNLKDQI LTTNVWLEHE WQDHKFKWDP SEYGGVTELY
         VPSEHIWLPD IVLYNNADGE YVVTTMTKAI LHYTGKVVWT PPAIFKSSCE IDVRYFPFDQ
         QTCFMKFGSW TYDGDQIDLK HISQKNDKDN KVEIGIDLRE YYPSVEWDIL GVPAERHEKY
         YPCCAEPYPD IFFNITLRRK TLFYTVNLII PCVGISYLSV LVFYLPADSG EKIALCISIL
         LSQTMFFLLI SEIIPSTSLA LPLLGKYLLF TMLLVGLSVV ITIIILNIHY RKPSTHKMRP
         WIRSFFIKRL PKLLLMRVPK DLLRDLAANK INYGLKFSKT KFGQALMDEM QMNSGGSSPD
         SLRRMQGRVG AGGCNGMHVT TATNRFSGLV GALGGGLSTL SGYNGLPSVL SGLDDSLSDV
         AARKKYPFEL EKAIHNVMFI QHHMQRQDEF NAEDQDWGFV AMVMDRLFLW LFMIASLVGT
         FVILGEAPSL YDDTKAIDVQ LSDVAKQIYN LTEKKN
    //
The output is a standard EMBOSS report file.
The results can be output in one of several styles by using the command-line qualifier -rformat xxx, where 'xxx' is replaced by the name of the required format. The available format names are: embl, genbank, gff, pir, swiss, trace, listfile, dbmotif, diffseq, excel, feattable, motif, regions, seqtable, simple, srs, table, tagseq
See: http://www.uk.embnet.org/Software/EMBOSS/Themes/ReportFormats.html for further information on report formats.
By default sigcleave writes a 'motif' report file.
    ########################################
    # Program: sigcleave
    # Rundate: Thu Nov 07 14:48:24 2002
    # Report_format: motif
    # Report_file: ach2_drome.sig
    ########################################

    #=======================================
    #
    # Sequence: ACH2_DROME     from: 1   to: 576
    # HitCount: 9
    #
    # Reporting scores over 3.50
    #
    #=======================================

    (1) Score 13.739 length 13 at residues 29->41

     Sequence:  LLVLLLLCETVQA
                |           |
               29           41
     mature_peptide: NPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQIL

    (2) Score 3.632 length 13 at residues 308->320

     Sequence:  LLISEIIPSTSLA
                |           |
              308           320
     mature_peptide: LPLLGKYLLFTMLLVGLSVVITIIILNIHYRKPSTHKMRPWIRSFFIKRL

    (3) Score 3.751 length 13 at residues 527->539

     Sequence:  LFLWLFMIASLVG
                |           |
              527           539
     mature_peptide: TFVILGEAPSLYDDTKAIDVQLSDVAKQIYNLTEKKN

    (4) Score 4.026 length 13 at residues 31->43

     Sequence:  VLLLLCETVQANP
                |           |
               31           43
     mature_peptide: DAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQILTT

    (5) Score 5.057 length 13 at residues 24->36

     Sequence:  KPLCLLLVLLLLC
                |           |
               24           36
     mature_peptide: ETVQANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNL

    (6) Score 6.981 length 13 at residues 330->342

     Sequence:  FTMLLVGLSVVIT
                |           |
              330           342
     mature_peptide: IIILNIHYRKPSTHKMRPWIRSFFIKRLPKLLLMRVPKDLLRDLAANKIN

    (7) Score 7.360 length 13 at residues 528->540

     Sequence:  FLWLFMIASLVGT
                |           |
              528           540
     mature_peptide: FVILGEAPSLYDDTKAIDVQLSDVAKQIYNLTEKKN

    (8) Score 10.465 length 13 at residues 28->40

     Sequence:  LLLVLLLLCETVQ
                |           |
               28           40
     mature_peptide: ANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKDQI

    (9) Score 12.135 length 13 at residues 26->38

     Sequence:  LCLLLVLLLLCET
                |           |
               26           38
     mature_peptide: VQANPDAKRLYDDLLSNYNRLIRPVSNNTDTVLVKLGLRLSQLIDLNLKD

    #---------------------------------------
    #---------------------------------------
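For downstream processing it can be convenient to pull the hit lines out of a 'motif' report such as the one above. The following is a minimal Python sketch, not part of EMBOSS. The function name `parse_hits` and the regular expression are assumptions based solely on the hit lines shown in this example report, and other report formats would need different handling.

```python
import re

# Matches hit headers such as:
#   (1) Score 13.739 length 13 at residues 29->41
HIT_RE = re.compile(
    r"\((\d+)\)\s+Score\s+([\d.]+)\s+length\s+(\d+)\s+at\s+residues\s+(\d+)->(\d+)"
)


def parse_hits(report_text):
    """Return the hits of a sigcleave motif report, best score first."""
    hits = []
    for match in HIT_RE.finditer(report_text):
        number, score, length, start, end = match.groups()
        hits.append({
            "hit": int(number),
            "score": float(score),
            "length": int(length),
            "start": int(start),  # first residue of the reported window
            "end": int(end),      # last residue before the predicted cleavage
        })
    return sorted(hits, key=lambda h: h["score"], reverse=True)


# Three hit headers copied from the example report above.
SAMPLE = """
(1) Score 13.739 length 13 at residues 29->41
(2) Score 3.632 length 13 at residues 308->320
(9) Score 12.135 length 13 at residues 26->38
"""

if __name__ == "__main__":
    for hit in parse_hits(SAMPLE):
        print(hit["score"], hit["start"], hit["end"])
```

Sorting by score puts the window ending at residue 41 first, which agrees with the `SIGNAL 1 41` feature in the input entry.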
EMBOSS data files are distributed with the application and stored in the standard EMBOSS data directory, which is defined by the EMBOSS environment variable EMBOSS_DATA.
Users can provide their own data files in their own directories. Project-specific files can be put in the current directory or, for tidier directory listings, in a subdirectory called ".embossdata". Files for all EMBOSS runs can be put in the user's home directory or, again, in a subdirectory called ".embossdata".
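The directories are searched in the following order: the current directory, ".embossdata" under the current directory, the user's home directory, then ".embossdata" under the home directory.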
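The lookup described above can be illustrated with a short Python sketch. It is not EMBOSS code. It assumes that the user locations are tried in the order in which they are mentioned (current directory, its ".embossdata" subdirectory, home directory, its ".embossdata" subdirectory) and that the distributed directory named by EMBOSS_DATA is consulted last. The function names and the file name `my_matrix.dat` are hypothetical.

```python
import os
from pathlib import Path


def candidate_dirs(cwd=None, home=None, environ=None):
    """List the directories to try, most specific first (assumed order)."""
    cwd = Path(cwd) if cwd else Path.cwd()
    home = Path(home) if home else Path.home()
    environ = os.environ if environ is None else environ
    dirs = [cwd, cwd / ".embossdata", home, home / ".embossdata"]
    # Assumption: the distributed data directory is the fallback.
    if environ.get("EMBOSS_DATA"):
        dirs.append(Path(environ["EMBOSS_DATA"]))
    return dirs


def find_data_file(name, **kwargs):
    """Return the first existing copy of a data file, or None."""
    for directory in candidate_dirs(**kwargs):
        path = directory / name
        if path.is_file():
            return path
    return None


if __name__ == "__main__":
    # "my_matrix.dat" is a placeholder, not a real EMBOSS file name.
    print(find_data_file("my_matrix.dat"))
```

Passing `cwd`, `home` and `environ` explicitly makes the search easy to try out in a scratch directory without touching real settings.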
    # Amino acid counts for 161 Eukaryotic Signal Peptides,
    # from von Heijne (1986), Nucl. Acids. Res. 14:4683-4690
    #
    # The cleavage site is between +1 and -1
    # Sample: 161 aligned sequences
    #
    # R -13 -12 -11 -10  -9  -8  -7  -6  -5  -4  -3  -2  -1  +1  +2 Expect
    # - --- --- --- --- --- --- --- --- --- --- --- --- --- --- --- ------
      A  16  13  14  15  20  18  18  17  25  15  47   6  80  18   6   14.5
      C   3   6   9   7   9  14   6   8   5   6  19   3   9   8   3    4.5
      D   0   0   0   0   0   0   0   0   5   3   0   5   0  10  11    8.9
      E   0   0   0   1   0   0   0   0   3   7   0   7   0  13  14   10.0
      F  13   9  11  11   6   7  18  13   4   5   0  13   0   6   4    5.6
      G   4   4   3   6   3  13   3   2  19  34   5   7  39  10   7   12.1
      H   0   0   0   0   0   1   1   0   5   0   0   6   0   4   2    3.4
      I  15  15   8   6  11   5   4   8   5   1  10   5   0   8   7    7.4
      K   0   0   0   1   0   0   1   0   0   4   0   2   0  11   9   11.3
      L  71  68  72  79  78  45  64  49  10  23   8  20   1   8   4   12.1
      M   0   3   7   4   1   6   2   2   0   0   0   1   0   1   2    2.7
      N   0   1   0   1   1   0   0   0   3   3   0  10   0   4   7    7.1
      P   2   0   2   0   0   4   1   8  20  14   0   1   3   0  22    7.4
      Q   0   0   0   1   0   6   1   0  10   8   0  18   3  19  10    6.3
      R   2   0   0   0   0   1   0   0   7   4   0  15   0  12   9    7.6
      S   9   3   8   6  13  10  15  16  26  11  23  17  20  15  10   11.4
      T   2  10   5   4   5  13   7   7  12   6  17   8   6   3  10    9.7
      V  20  25  15  18  13  15  11  27   0  12  32   3   0   8  17   11.1
      W   4   3   3   1   1   2   6   3   1   3   0   9   0   2   0    1.8
      Y   0   1   4   0   0   1   3   1   1   2   0   5   0   1   7    5.6
If you use matrix tables with a different number of residues before or after the cleavage site, you must also set the advanced parameters nval and pval.
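To show how a count matrix of this kind can be turned into a site score, here is a minimal Python sketch. It assumes that each of the 15 window positions (13 before and 2 after the cleavage site, called `NVAL` and `PVAL` here after the parameter names above) contributes ln(count / expected), and that zero counts are replaced by 1. That zero-count rule is an assumption of this sketch. The special treatment of positions -1 and -3 used by sigcleave is not reproduced, so windows containing zero counts may score differently from the program. The function names are hypothetical.

```python
import math

# Eukaryotic counts copied from the data file above: residue, 15 counts
# (positions -13..-1, +1, +2), then the expected count.
MATRIX_TEXT = """
A 16 13 14 15 20 18 18 17 25 15 47 6 80 18 6 14.5
C 3 6 9 7 9 14 6 8 5 6 19 3 9 8 3 4.5
D 0 0 0 0 0 0 0 0 5 3 0 5 0 10 11 8.9
E 0 0 0 1 0 0 0 0 3 7 0 7 0 13 14 10.0
F 13 9 11 11 6 7 18 13 4 5 0 13 0 6 4 5.6
G 4 4 3 6 3 13 3 2 19 34 5 7 39 10 7 12.1
H 0 0 0 0 0 1 1 0 5 0 0 6 0 4 2 3.4
I 15 15 8 6 11 5 4 8 5 1 10 5 0 8 7 7.4
K 0 0 0 1 0 0 1 0 0 4 0 2 0 11 9 11.3
L 71 68 72 79 78 45 64 49 10 23 8 20 1 8 4 12.1
M 0 3 7 4 1 6 2 2 0 0 0 1 0 1 2 2.7
N 0 1 0 1 1 0 0 0 3 3 0 10 0 4 7 7.1
P 2 0 2 0 0 4 1 8 20 14 0 1 3 0 22 7.4
Q 0 0 0 1 0 6 1 0 10 8 0 18 3 19 10 6.3
R 2 0 0 0 0 1 0 0 7 4 0 15 0 12 9 7.6
S 9 3 8 6 13 10 15 16 26 11 23 17 20 15 10 11.4
T 2 10 5 4 5 13 7 7 12 6 17 8 6 3 10 9.7
V 20 25 15 18 13 15 11 27 0 12 32 3 0 8 17 11.1
W 4 3 3 1 1 2 6 3 1 3 0 9 0 2 0 1.8
Y 0 1 4 0 0 1 3 1 1 2 0 5 0 1 7 5.6
"""

NVAL = 13  # residues before the cleavage site
PVAL = 2   # residues after the cleavage site


def load_weights(text, pseudocount=1.0):
    """Convert counts to log-odds weights, one list of 15 per residue."""
    weights = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split()
        counts = [float(x) for x in fields[1:-1]]
        expect = float(fields[-1])
        # Assumption: a zero count is replaced by the pseudocount.
        weights[fields[0]] = [math.log(max(c, pseudocount) / expect) for c in counts]
    return weights


def score_site(seq, last_signal_residue, weights):
    """Score a cleavage after the given 1-based residue number."""
    start = last_signal_residue - NVAL  # 0-based index of position -13
    window = seq[start:start + NVAL + PVAL]
    if start < 0 or len(window) != NVAL + PVAL:
        raise ValueError("window does not fit inside the sequence")
    return sum(weights[aa][i] for i, aa in enumerate(window))


# First 60 residues of ACH2_DROME from the input file above.
SEQ = "MAPGCCTTRPRPIALLAHIWRHCKPLCLLLVLLLLCETVQANPDAKRLYDDLLSNYNRLI"
WEIGHTS = load_weights(MATRIX_TEXT)

if __name__ == "__main__":
    for end in (41, 40):
        print(end, round(score_site(SEQ, end, WEIGHTS), 3))
```

Under these assumptions the windows ending at residues 41 and 40 score close to the 13.739 and 10.465 reported in the example output. Neither window contains a zero count, so the pseudocount choice does not affect them.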
Program name | Description |
---|---|
antigenic | Finds antigenic sites in proteins |
digest | Protein proteolytic enzyme or reagent cleavage digest |
fuzzpro | Protein pattern search |
fuzztran | Protein pattern search after translation |
helixturnhelix | Report nucleic acid binding motifs |
oddcomp | Finds protein sequence regions with a biased composition |
patmatdb | Search a protein sequence with a motif |
patmatmotifs | Search a PROSITE motif database with a protein sequence |
pepcoil | Predicts coiled coil regions |
pestfind | Finds PEST motifs as potential proteolytic cleavage sites |
preg | Regular expression search of a protein sequence |
pscan | Scans proteins using PRINTS |
Original program "SIGCLEAVE" by Peter Rice (EGCG 1989)