<html>

<head>
<title>
PAT, PATTERN
</title>
</head>

<h1 align=center>

PAT, PATTERN

</h1>

<hr size="3">

<font color=#880000>
<b>
NAME
<br>
</b>
</font>

PAT, PATTERN - define sequence pattern.
<br><br>

<font color=#880000>
<b>
SYNOPSIS
<br>
</b>
</font>

PAT = name_11 name_12 ... name_1m / ... / name_k1 name_k2 ... name_kn
<br>
PAT TOL number_of_allowed_errors
<br>
PATTERN = name_11 name_12 ... name_1m / ... / name_k1 name_k2 ... name_kn
<br>
PATTERN TOLERANCE number_of_allowed_errors
<br><br>

<font color=#880000>
<b>
DESCRIPTION
<br>
</b>
</font>

The command PATTERN (garlic version 1.4 or later) may be used to define a
sequence pattern. It is more general than the command SEQUENCE. Each pattern
is a set of name lists, separated by slashes (/). In the following example,
a simple pattern of three residues is defined:
<br><br>

PAT = ASP ASN / PHE TYR TRP / ARG LYS
<br><br>

The first residue may be matched by either ASP or ASN, the second residue
may be matched by PHE, TYR or TRP, while the third residue may be matched
by ARG or LYS.
<br><br>

The macromolecular structure may be searched for the given pattern, using
the command SELECT PATTERN (short form: SEL PAT). The commands RESTRICT
and ADD may also be used to search for the given pattern. For the pattern
defined above, the following fragments will be selected:
<br><br>

ASP PHE ARG
<br>
ASP PHE LYS
<br>
ASP TYR ARG
<br>
ASP TYR LYS
<br>
ASP TRP ARG
<br>
ASP TRP LYS
<br>
ASN PHE ARG
<br>
ASN PHE LYS
<br>
ASN TYR ARG
<br>
ASN TYR LYS
<br>
ASN TRP ARG
<br>
ASN TRP LYS
<br><br>
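The twelve fragments listed above are the Cartesian product of the three name
lists. The following Python sketch is illustrative only and is not part of
garlic; the helper names parse_pattern and expand_pattern are hypothetical.
It enumerates the fragments in the same order as the list above:

```python
from itertools import product


def parse_pattern(text):
    """Split a PAT-style string into one list of residue names per position."""
    return [part.split() for part in text.split("/")]


def expand_pattern(text):
    """Return every fragment the pattern can match, as space-joined names."""
    return [" ".join(combo) for combo in product(*parse_pattern(text))]


fragments = expand_pattern("ASP ASN / PHE TYR TRP / ARG LYS")
for fragment in fragments:
    print(fragment)
```

The number of fragments is the product of the list lengths (2 x 3 x 2 = 12
in this example).
<br><br>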

<font color=#880000>
<b>
WILDCARDS
<br>
</b>
</font>

The command PATTERN may be combined with wildcards. The character * (asterisk)
may be placed at any position in the pattern string, though it does not make
much sense at the first or the last position. Example:
<br><br>

PAT = ARG LYS HIS / * / ARG LYS HIS / * / ARG LYS HIS
<br><br>
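How such a pattern could be matched against a residue sequence may be
sketched in Python. This is not garlic's implementation. It assumes that each
asterisk stands for exactly one residue of any type, and the names
parse_pattern, position_matches and find_matches, as well as the test
sequence, are made up for illustration:

```python
def parse_pattern(text):
    """Split a PAT-style string into one list of residue names per position."""
    return [part.split() for part in text.split("/")]


def position_matches(names, residue):
    """A list containing '*' accepts any residue; otherwise the name must be listed."""
    return "*" in names or residue in names


def find_matches(pattern, sequence):
    """Return the start index of every window of the sequence that fits the pattern."""
    lists = parse_pattern(pattern)
    width = len(lists)
    return [
        start
        for start in range(len(sequence) - width + 1)
        if all(position_matches(lists[i], sequence[start + i]) for i in range(width))
    ]


# Hypothetical seven-residue sequence, used only to exercise the sketch.
sequence = "GLY ARG ALA LYS SER HIS GLY".split()
hits = find_matches("ARG LYS HIS / * / ARG LYS HIS / * / ARG LYS HIS", sequence)
print(hits)
```

Only the window starting at the second residue (ARG ALA LYS SER HIS) fits,
because the first, third and fifth positions must hold ARG, LYS or HIS.
<br><br>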

<font color=#880000>
<b>
PATTERN TOLERANCE
<br>
</b>
</font>

TOLERANCE (short form: TOL) is the only keyword which may be combined with
the command PATTERN. It defines the pattern tolerance, i.e. the maximal
number of errors tolerated while searching for the specified pattern.
<br><br>
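The effect of the tolerance may be sketched in Python. The sketch assumes
that an error is a position whose residue name is missing from the
corresponding list; this page does not define the term more precisely, so
garlic may count errors differently. All function names and the test
sequence are hypothetical:

```python
def parse_pattern(text):
    """Split a PAT-style string into one list of residue names per position."""
    return [part.split() for part in text.split("/")]


def count_errors(lists, window):
    """Count positions whose residue is not in the list (assumed meaning of 'error')."""
    return sum(
        1
        for names, residue in zip(lists, window)
        if "*" not in names and residue not in names
    )


def find_with_tolerance(pattern, sequence, tolerance):
    """Return start indices of windows with at most `tolerance` errors."""
    lists = parse_pattern(pattern)
    width = len(lists)
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        if count_errors(lists, window) <= tolerance:
            hits.append(start)
    return hits


# Hypothetical sequence, used only to exercise the sketch.
sequence = "ASP PHE ARG GLY ASN ALA LYS".split()
pattern = "ASP ASN / PHE TYR TRP / ARG LYS"
strict = find_with_tolerance(pattern, sequence, 0)
relaxed = find_with_tolerance(pattern, sequence, 1)
print(strict, relaxed)
```

With a tolerance of 0 only ASP PHE ARG is found. With a tolerance of 1 the
fragment ASN ALA LYS is found as well, because only its middle residue
deviates from the pattern.
<br><br>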

<font color=#880000>
<b>
DELETIONS AND INSERTIONS
<br>
</b>
</font>

It is not possible to handle deletions and insertions directly. However,
a number of different patterns may be incorporated into a simple script,
which may be used to handle deletions and insertions. The example below is
a screenshot of one such script (loaded into the vi editor). The script
contains three patterns of different length. Each pattern was written as
a single line, but the editor wrapped it because of its length. Note that
many consecutive wildcards are used in this example.
<br><br>

<img src="patterns.gif">

<br><br>
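Pattern lines of different length may also be generated automatically. The
Python sketch below is illustrative only: the function gapped_patterns and
the two name lists are hypothetical and are not taken from the script in the
screenshot. It writes one PAT line per gap length, filling the gap with
consecutive wildcards:

```python
def gapped_patterns(left, right, min_gap, max_gap):
    """Build one PAT line per gap length between two fixed name lists."""
    lines = []
    for gap in range(min_gap, max_gap + 1):
        # One '*' per residue in the gap, e.g. gap=2 gives "/ * / * /".
        parts = [left] + ["*"] * gap + [right]
        lines.append("PAT = " + " / ".join(parts))
    return lines


script_lines = gapped_patterns("ARG LYS", "ASP GLU", 2, 4)
for line in script_lines:
    print(line)
```

In a real script, each generated pattern would presumably be followed by
a search command such as SELECT PATTERN before the next pattern is defined.
<br><br>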

<font color=#880000>
<b>
NOTES
<br>
</b>
</font>

(1) The maximal number of residue names in each list is 30 (more than enough
for the 20 standard residue names). The maximal number of lists in a pattern
is 100. If this is not enough, change MAX_PATT_LENGTH in the file defines.h.
<br><br>

<font color=#880000>
<b>
RELATED COMMANDS
<br>
</b>
</font>

The command SEQUENCE, combined with the keyword = (equality sign), may be
used to define a fixed sequence pattern. Of course, only one residue at
each position may be specified with the command SEQUENCE. The commands
SELECT, ADD and RESTRICT may be used to search the macromolecular structure
for the specified pattern.
<br><br>

<hr size="3">

</html>