Bioplib
Protein Structure C Library
Perform Needleman & Wunsch sequence alignment on two sequences encoded as numeric symbols.
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include "SysDefs.h"
#include "macros.h"
#include "array.h"
#include "general.h"
#include "seq.h"
Data Structures
struct XY
Macros
#define DATAENV "DATADIR" /* Environment variable or assign */
#define MAXBUFF 2048
Functions
BOOL blNumericReadMDM (char *mdmfile)
int blNumericCalcMDMScore (int resa, int resb)
int blNumericAffineAlign (int *seq1, int length1, int *seq2, int length2, BOOL verbose, BOOL identity, int penalty, int penext, int *align1, int *align2, int *align_len)
This code is NOT IN THE PUBLIC DOMAIN, but it may be copied according to the conditions laid out in the accompanying file COPYING.DOC.
The code may be modified as required, but any modifications must be documented so that the person responsible can be identified.
The code may not be sold commercially or included as part of a commercial product except as described in the file COPYING.DOC.
Note, the code herein is very heavily based on code written by Dr. Andrew C.R. Martin while self-employed. Some modifications were made to that original code while employed at University College London. This version, which handles sequences encoded as arrays of numbers rather than as character arrays, was modified from the original version(s) while employed at Reading University.
A simple Needleman & Wunsch dynamic programming alignment of two sequences encoded as numeric symbols. A window is not used, so the routine may be slow on long sequences.
First call blNumericReadMDM() to read the mutation data matrix, then call blNumericAffineAlign() to align the sequences.
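The dynamic programming idea described above can be sketched in a few lines. This is a simplified illustration and not the Bioplib implementation: it assumes a linear gap penalty rather than the affine penalty/penext scheme, scores tokens by identity rather than from a mutation data matrix, and returns only the score with no traceback. All names prefixed with sketch_ are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical identity scoring: `match` for equal tokens, else `mismatch` */
static int sketch_score(int a, int b, int match, int mismatch)
{
   return (a == b) ? match : mismatch;
}

static int sketch_max3(int a, int b, int c)
{
   int m = (a > b) ? a : b;
   return (m > c) ? m : c;
}

/* Global alignment score of two integer-token sequences.
   `gap` is a positive value subtracted for every gap position
   (linear penalty; an assumption made to keep the sketch short).
   Uses two rolling rows: O(len1*len2) time, O(len2) memory.
   Returns 1 on success and stores the score in *score, 0 if out of memory.
*/
int sketch_nw_score(const int *seq1, int len1,
                    const int *seq2, int len2,
                    int match, int mismatch, int gap, int *score)
{
   int *prev, *curr, *tmp;
   int i, j;

   assert(len1 >= 0 && len2 >= 0 && score != NULL);

   prev = malloc((size_t)(len2 + 1) * sizeof(int));
   curr = malloc((size_t)(len2 + 1) * sizeof(int));
   if (prev == NULL || curr == NULL)
   {
      free(prev);
      free(curr);
      return 0;
   }

   /* First row: seq2 prefix aligned entirely against gaps */
   for (j = 0; j <= len2; j++)
      prev[j] = -gap * j;

   for (i = 1; i <= len1; i++)
   {
      curr[0] = -gap * i;   /* seq1 prefix aligned entirely against gaps */
      for (j = 1; j <= len2; j++)
      {
         int diag = prev[j-1] +
                    sketch_score(seq1[i-1], seq2[j-1], match, mismatch);
         int up   = prev[j]   - gap;   /* gap in seq2 */
         int left = curr[j-1] - gap;   /* gap in seq1 */
         curr[j] = sketch_max3(diag, up, left);
      }
      tmp = prev; prev = curr; curr = tmp;   /* roll the rows */
   }

   *score = prev[len2];
   free(prev);
   free(curr);
   return 1;
}
```

Because every cell of the full matrix is visited, the cost grows with the product of the two sequence lengths, which is why an unwindowed alignment becomes slow on long sequences.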
Definition in file NumericAlign.c.
#define DATAENV "DATADIR" /* Environment variable or assign */
Definition at line 110 of file NumericAlign.c.
#define MAXBUFF 2048
Definition at line 112 of file NumericAlign.c.
int blNumericAffineAlign (int *seq1, int length1, int *seq2, int length2, BOOL verbose, BOOL identity, int penalty, int penext, int *align1, int *align2, int *align_len)
Parameters
[in]   *seq1       First sequence of tokens
[in]   length1     First sequence length
[in]   *seq2       Second sequence of tokens
[in]   length2     Second sequence length
[in]   verbose     Display N&W matrix
[in]   identity    Use identity matrix
[in]   penalty     Gap insertion penalty value
[in]   penext      Extension penalty
[out]  *align1     Sequence 1 aligned
[out]  *align2     Sequence 2 aligned
[out]  *align_len  Alignment length
Perform simple N&W alignment of seq1 and seq2. No window is used, so it will be slow for long sequences.
The sequences are supplied as integer arrays containing numeric tokens.
Note that you must allocate sufficient memory for the aligned sequences. The easy way to do this is to ensure that align1 and align2 are of length (length1+length2).
Identical to align.c/affinealign(), but uses integer arrays.
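The memory requirement noted above follows from the worst case, in which every token of one sequence is aligned against a gap in the other, giving length1+length2 columns. A minimal sketch of allocating the two output buffers is shown below; sketch_alloc_align_buffers() is a hypothetical helper and not part of Bioplib.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocate two zero-filled int buffers, each of
   (length1 + length2) elements, for use as the align1 and align2
   output arrays.
   Returns 1 on success, 0 on invalid lengths or allocation failure.
   On failure both output pointers are set to NULL.
*/
int sketch_alloc_align_buffers(int length1, int length2,
                               int **align1, int **align2)
{
   size_t n;

   assert(align1 != NULL && align2 != NULL);

   *align1 = NULL;
   *align2 = NULL;
   if (length1 < 0 || length2 < 0)
      return 0;

   n = (size_t)length1 + (size_t)length2;
   if (n == 0)
      n = 1;                      /* avoid a zero-sized allocation */

   *align1 = calloc(n, sizeof(int));
   *align2 = calloc(n, sizeof(int));
   if (*align1 == NULL || *align2 == NULL)
   {
      free(*align1);
      free(*align2);
      *align1 = NULL;
      *align2 = NULL;
      return 0;
   }
   return 1;
}
```

The resulting buffers could then be passed as the align1 and align2 arguments of blNumericAffineAlign() and released with free() once the alignment has been consumed.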
Definition at line 412 of file NumericAlign.c.
int blNumericCalcMDMScore (int resa, int resb)
Parameters
[in]   resa  First token
[in]   resb  Second token
Calculate the score from the static, globally stored mutation data matrix.
Identical to align.c/CalcMDMScore(), but uses a different static score array and takes integer parameters. These are used as direct lookups into the score array rather than being searched.
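The direct lookup described above can be illustrated as follows. This is a sketch under stated assumptions, not the library's code: the matrix contents and size are made up, tokens are assumed to map to index token-1 (the real offset is not stated here), and out-of-range tokens are assumed to score 0.

```c
#define SKETCH_NTOKENS 4   /* hypothetical number of distinct tokens */

/* Hypothetical, complete (square) score matrix with invented values */
static const int sketch_matrix[SKETCH_NTOKENS][SKETCH_NTOKENS] =
{
   {  2, -1,  0, -1 },
   { -1,  2, -1,  0 },
   {  0, -1,  2, -1 },
   { -1,  0, -1,  2 }
};

/* Score a pair of numeric tokens by indexing the matrix directly.
   Tokens are assumed to run from 1 to SKETCH_NTOKENS; anything else
   (including 0, the insert token) returns 0 in this sketch.
*/
int sketch_lookup_score(int resa, int resb)
{
   if (resa < 1 || resa > SKETCH_NTOKENS ||
       resb < 1 || resb > SKETCH_NTOKENS)
      return 0;

   return sketch_matrix[resa - 1][resb - 1];
}
```

With character sequences each residue letter has to be located in a symbol list before the matrix can be indexed; numeric tokens remove that search, so each score costs a single array access.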
Definition at line 342 of file NumericAlign.c.
BOOL blNumericReadMDM (char *mdmfile)
Parameters
[in]   *mdmfile  Mutation data matrix filename
Read mutation data matrix into static global arrays. The matrix may have comments at the start introduced with a ! in the first column. The matrix must be complete (i.e. a triangular matrix will not work). A line describing the residue types must appear, and may be placed before or after the matrix itself.
Identical to align.c/ReadMDM(), but reads into a different static 2D array and doesn't read a symbol identifier line from the file, as the symbols are numeric and always start from 1 (0 is used as the insert character).
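To make the file layout concrete, here is a minimal sketch of a reader for a complete numeric matrix with leading ! comment lines. It is an illustration only and does not reproduce blNumericReadMDM(): the whitespace-separated integer layout, the SKETCH_MAXDIM limit, the caller-supplied array and the function name are all assumptions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define SKETCH_MAXBUFF 2048   /* same size as MAXBUFF in NumericAlign.c */
#define SKETCH_MAXDIM  32     /* hypothetical upper bound on matrix size */

/* Read a square integer matrix from fp into `matrix`.
   Lines with '!' in the first column are skipped as comments and
   lines containing no numbers are ignored.
   Returns the matrix dimension, or 0 if the data are empty, too large
   or not square (so a triangular matrix is rejected).
*/
int sketch_read_matrix(FILE *fp, int matrix[SKETCH_MAXDIM][SKETCH_MAXDIM])
{
   char buffer[SKETCH_MAXBUFF];
   int  nrows = 0;
   int  ncols = -1;

   assert(fp != NULL);

   while (fgets(buffer, sizeof(buffer), fp) != NULL)
   {
      char *p = buffer;
      char *end;
      int  col = 0;

      if (buffer[0] == '!')
         continue;                        /* comment line */

      for (;;)
      {
         long value = strtol(p, &end, 10);
         if (end == p)
            break;                        /* no further numbers on line */
         if (nrows >= SKETCH_MAXDIM || col >= SKETCH_MAXDIM)
            return 0;                     /* matrix too large */
         matrix[nrows][col++] = (int)value;
         p = end;
      }

      if (col == 0)
         continue;                        /* blank or non-numeric line */
      if (ncols < 0)
         ncols = col;                     /* first row fixes the width */
      else if (col != ncols)
         return 0;                        /* ragged rows, e.g. triangular */
      nrows++;
   }

   return (nrows > 0 && nrows == ncols) ? nrows : 0;
}
```

Rejecting rows of unequal width is one simple way to enforce the requirement that the matrix be complete.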
Definition at line 258 of file NumericAlign.c.