# Smith-Waterman Algorithm in Python

The Smith-Waterman algorithm performs local sequence alignment: it finds the highest-scoring pair of similar regions within two strings. The strings usually represent DNA strands or protein sequences.
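To make the idea concrete, here is a minimal pure-Python sketch of the scoring recurrence behind local alignment. The function name `local_alignment_score`, the default scores, and the linear gap penalty of -1 are assumptions chosen for illustration. The sketch computes only the best score, not the alignment itself, and it is not how any particular library implements the algorithm.

```python
def local_alignment_score(seq_a, seq_b, match=2, mismatch=-1, gap=-1):
    """Return the best local alignment score of two strings (illustrative sketch)."""
    # previous_row[j] is the best score of a local alignment that ends at the
    # previous character of seq_a and the j-th character of seq_b.
    # The first row and the first column are all zeros.
    previous_row = [0] * (len(seq_b) + 1)
    best = 0
    for char_a in seq_a:
        current_row = [0]
        for j, char_b in enumerate(seq_b, start=1):
            diagonal = previous_row[j - 1] + (match if char_a == char_b else mismatch)
            up = previous_row[j] + gap        # char_a aligned against a gap
            left = current_row[j - 1] + gap   # char_b aligned against a gap
            # Clamping at zero lets an alignment start anywhere (local, not global).
            cell = max(0, diagonal, up, left)
            current_row.append(cell)
            best = max(best, cell)
        previous_row = current_row
    return best


print(local_alignment_score("ATCCACAGC", "ATGCAGCGC"))  # 11
```

Keeping only two rows of the table is enough when just the score is needed; recovering the aligned strings would require storing the full table and tracing back from the best cell.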

## Smith-Waterman Algorithm in Python

The `swalign` module provides the tools needed to run the Smith-Waterman algorithm in Python. You can install the `swalign` module using `pip` by executing the following command in the command line.

``````
pip3 install swalign
``````

The above command installs the module for Python 3. To install the module for Python 2, you can use the following command.

``````
pip install swalign
``````
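To confirm that the installation is visible to the interpreter you plan to use, a quick check such as the following can help. It is a minimal sketch that relies only on the standard library; the helper name `is_installed` is hypothetical.

```python
import importlib.util


def is_installed(module_name):
    """Return True if this interpreter can find the named top-level module."""
    return importlib.util.find_spec(module_name) is not None


print("swalign installed:", is_installed("swalign"))
```

If this prints `False`, the module was probably installed for a different Python version than the one running the script.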

After installing the `swalign` module, we will use the following steps to implement the Smith-Waterman algorithm in our Python program.

1. First, we will import the `swalign` module using the `import` statement.

2. To perform the alignment, we must create a nucleotide scoring matrix. In the matrix, we provide a score for each match and mismatch. Commonly, we use 2 for a match score and -1 for a mismatch.

3. To create the nucleotide scoring matrix, we will call `NucleotideScoringMatrix()`. It takes the match score as its first argument and the mismatch score as its second argument. After execution, it returns an `IdentityScoringMatrix` object.

4. Once we have the scoring matrix, we will create a `LocalAlignment` object by calling `LocalAlignment()` with the scoring matrix as its argument.

5. Once we get the `LocalAlignment` object, we can execute the Smith-Waterman algorithm using the `align()` method.

6. The `align()` method, when invoked on a `LocalAlignment` object, takes two strings as its input arguments. As the labels in the output below show, the first string is reported as the reference (`Ref`) and the second as the query (`Query`).

7. After execution, the `align()` method returns an `Alignment` object. The `Alignment` object holds the details of the alignment, such as the score, the number of matches and mismatches, and the CIGAR string, which its `dump()` method prints.
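As an illustration of what the scoring matrix in steps 2 and 3 expresses, the sketch below scores a pair of characters with one value for a match and another for a mismatch. The class `SimpleIdentityScoring` and its `score()` method are hypothetical stand-ins written for this example, not the `swalign` source.

```python
class SimpleIdentityScoring:
    """Hypothetical stand-in: one score for equal characters, another for unequal ones."""

    def __init__(self, match=2, mismatch=-1):
        self.match = match
        self.mismatch = mismatch

    def score(self, one, two):
        # Equal characters earn the match score; anything else earns the mismatch score.
        return self.match if one == two else self.mismatch


scoring = SimpleIdentityScoring(2, -1)
print(scoring.score("A", "A"))  # 2
print(scoring.score("A", "G"))  # -1
```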

You can observe the entire process in the following example.

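``````
import swalign

dna_string = "ATCCACAGC"
reference_string = "ATGCAGCGC"

# Scores for matching and mismatching characters
match_score = 2
mismatch_score = -1
matrix = swalign.NucleotideScoringMatrix(match_score, mismatch_score)

# Build the aligner and run the Smith-Waterman local alignment.
# Note: the output labels the first argument "Ref" and the second "Query".
lalignment_object = swalign.LocalAlignment(matrix)
alignment_object = lalignment_object.align(dna_string, reference_string)

# Print the alignment, score, match counts, and CIGAR string
alignment_object.dump()
``````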

Output:

``````
Query:  1 ATGCAGC-GC 9
          ||.|| | ||
Ref  :  1 ATCCA-CAGC 9

Score: 11
Matches: 7 (70.0%)
Mismatches: 3
CIGAR: 5M1I1M1D2M
``````
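The CIGAR string in the output summarizes the alignment column by column. The following sketch splits it into (length, operation) pairs with a regular expression. The helper `parse_cigar` is hypothetical, and the reading of `M` as an aligned column (match or mismatch), `I` as an insertion, and `D` as a deletion follows the common CIGAR convention rather than anything specific to `swalign`.

```python
import re


def parse_cigar(cigar):
    """Split a CIGAR string such as '5M1I1M1D2M' into (length, operation) pairs."""
    return [(int(length), op) for length, op in re.findall(r"(\d+)([MID])", cigar)]


operations = parse_cigar("5M1I1M1D2M")
print(operations)  # [(5, 'M'), (1, 'I'), (1, 'M'), (1, 'D'), (2, 'M')]

# Each operation covers `length` alignment columns, so the lengths add up to
# the 10 columns shown in the output above.
print(sum(length for length, _ in operations))  # 10
```

Read this way, the eight `M` columns cover the seven matches plus the one substitution (marked `.` in the output), and the single `I` and `D` columns are the two gaps.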

## Conclusion

This article discussed how we can run the Smith-Waterman algorithm in Python using the third-party `swalign` module.