How to Get the Reverse Complement of a DNA Strand Using Python

Aditya Raj Feb 02, 2024
  1. Reverse Complement of a DNA Strand
  2. Get the Reverse Complement of a DNA Strand Using the for Loop in Python
  3. Get the Reverse Complement of a DNA Strand Using the join() Method in Python
  4. Get the Reverse Complement of a DNA Strand Using the get() Method in Python
  5. Get the Reverse Complement of a DNA Strand Using List Comprehension in Python
  6. Get the Reverse Complement of a DNA Strand Using the translate() Method in Python
  7. Get the Reverse Complement of a DNA Strand Using the Biopython Module in Python
  8. Conclusion

Bioinformatics work in Python often involves manipulating DNA sequences, and computing the reverse complement of a strand is a common task. This article discusses various ways to get the reverse complement of a DNA strand using Python.

Reverse Complement of a DNA Strand

A DNA strand is made up of four types of bases, namely Adenine (A), Thymine (T), Guanine (G), and Cytosine (C). Sequence data may also contain other symbols, such as N for an unknown base.

Each DNA strand is represented by a series of letters A, T, G, and C. For instance, ACGTAATTGGCC might be one of the DNA strands.

To get the complement of a DNA strand, we replace A with T, C with G, G with C, and T with A in the original strand. For example, the complement of ACGTAATTGGCC is TGCATTAACCGG.

To get the reverse complement of a DNA strand, we reverse the characters in its complement. Therefore, the reverse complement of ACGTAATTGGCC is GGCCAATTACGT.
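
The two steps described above can be sketched in a few lines. This is only an illustration of the definition, assuming the strand contains just the letters A, T, G, and C; the names COMPLEMENT, complement, and reverse_complement are chosen here for illustration.

```python
# Mapping from each base to its complementary base.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement(strand):
    # Replace every base with its complement, keeping the original order.
    return "".join(COMPLEMENT[base] for base in strand)


def reverse_complement(strand):
    # Reverse the complemented strand with a slice that steps backwards.
    return complement(strand)[::-1]


print(complement("ACGTAATTGGCC"))          # TGCATTAACCGG
print(reverse_complement("ACGTAATTGGCC"))  # GGCCAATTACGT
```

Applying reverse_complement() twice returns the original strand, which is a quick sanity check for any implementation.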

Let us now discuss ways to get the reverse complement of a DNA string using Python.

Get the Reverse Complement of a DNA Strand Using the for Loop in Python

We will use the following steps to get the reverse complement of a DNA strand using a for loop in Python.

  • We will first define an empty string named reversed_strand to store the output string.
  • Then, we will find the length of the input DNA strand using the len() function. The len() function takes the DNA strand’s string and returns the length.
  • After that, we will iterate through the characters of the input DNA strand using a for loop in the reverse order.
  • During iteration, if we encounter the character A, we will append T to reversed_strand. Similarly, we will append A, C, and G to reversed_strand if we encounter T, G, and C, respectively.
  • If we encounter any other character in the input DNA strand, we will append the same character to reversed_strand.
  • After executing the for loop, we will get the reverse complement of the DNA strand in the variable reversed_strand.

You can observe this in the following example.

input_strand = "ACGTAATTGGCC"
reversed_strand = ""
length = len(input_strand)
for i in range(length):
    character = input_strand[length - 1 - i]
    if character == "A":
        reversed_strand = reversed_strand + "T"
    elif character == "T":
        reversed_strand = reversed_strand + "A"
    elif character == "G":
        reversed_strand = reversed_strand + "C"
    elif character == "C":
        reversed_strand = reversed_strand + "G"
    else:
        reversed_strand = reversed_strand + character
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT

Get the Reverse Complement of a DNA Strand Using the join() Method in Python

In the above approach, while creating reversed_strand, a new string is created for each character in the input DNA strand. This can be costly in terms of time and memory if the input DNA strands are too long.

To avoid this, we can use a list to get the reverse complement of a DNA strand using Python.

We will use the following steps to reverse complement a DNA strand using a for loop, a list, and the join() method.

  • First, we will create an empty list named complement_chars to store the characters of the reverse complement of the DNA strand.
  • Then, we will find the length of the input DNA strand using the len() function.
  • After that, we will iterate through the characters of the input DNA strand using a for loop in the reverse order.
  • In iterations, if we encounter the character A, we will append T to complement_chars using the append() method. The append() method, when invoked on complement_chars, takes a character as its input argument and appends it to complement_chars.
  • Similarly, we will append A, C, and G to complement_chars if we encounter T, G, and C, respectively.
  • If we encounter any other character in the input DNA strand, we will append the same character to complement_chars.
  • After executing the for loop, we will get a list of characters of the reverse complement of the input DNA strand in complement_chars.
  • After this, we will use the join() method to obtain the reverse complement of the original DNA strand. When invoked on a separator string, the join() method takes an iterable of strings as its input argument and returns a new string in which the elements of the iterable are concatenated, with the separator string placed between them.
  • To obtain the reverse complement of the DNA strand using the join() method, we will invoke the join() method on an empty string with complement_chars as its input argument. After executing the join() method, we will get the reverse complement of the input DNA strand.
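
The behavior of the join() method described in the steps above can be seen in a small sketch. The list named characters is made up for illustration.

```python
characters = ["G", "G", "C", "C"]

# Joining on an empty string concatenates the elements directly.
print("".join(characters))   # GGCC

# Joining on another string places that string between the elements.
print("-".join(characters))  # G-G-C-C
```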

You can observe this in the following example.

input_strand = "ACGTAATTGGCC"
complement_chars = []
length = len(input_strand)
for i in range(length):
    character = input_strand[length - 1 - i]
    if character == "A":
        complement_chars.append("T")
    elif character == "T":
        complement_chars.append("A")
    elif character == "G":
        complement_chars.append("C")
    elif character == "C":
        complement_chars.append("G")
    else:
        complement_chars.append(character)
reversed_strand = "".join(complement_chars)
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT

Get the Reverse Complement of a DNA Strand Using the get() Method in Python

Instead of using the if-else block in the for loop, we can use a dictionary and the get() method to get the reverse complement of a DNA strand using Python. For this task, we will create the following dictionary.

reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}

The get() method retrieves a value associated with a key in a dictionary. When invoked on a dictionary, the get() method takes the key as its first input argument and an optional value as its second input argument.

If the key is present in the dictionary, it returns the value associated with it. Otherwise, the get() method returns the optional value passed as the second argument.
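
A short sketch of how the get() method behaves with and without the second argument. The character N is used here only as an example of a symbol that is not a key in the dictionary.

```python
reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}

# "A" is a key, so the mapped value is returned.
print(reverse_dict.get("A", "A"))  # T

# "N" is not a key, so the second argument is returned instead.
print(reverse_dict.get("N", "N"))  # N

# Without a second argument, a missing key yields None.
print(reverse_dict.get("N"))       # None
```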

We will use the following steps to get the reverse complement of a DNA strand using the get() method and reverse_dict.

  • First, we will define an empty string named reversed_strand to store the output string.
  • Then, we will find the length of the input DNA strand using the len() function.
  • After that, we will iterate through the characters of the input DNA strand in reverse order using a for loop.
  • During iteration, we will invoke the get() method on reverse_dict with the current character as both its first and its second argument. If the current character is present in reverse_dict, the get() method will return its complement; otherwise, the get() method will return the current character unchanged.
  • We will append the output of the get() method to reversed_strand.
  • After executing the for loop, we will get the reverse complement of the DNA strand in the variable reversed_strand.

You can observe this in the following example.

input_strand = "ACGTAATTGGCC"
reversed_strand = ""
reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}
length = len(input_strand)
for i in range(length):
    character = input_strand[length - 1 - i]
    reversed_strand = reversed_strand + reverse_dict.get(character, character)

print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT

As discussed earlier, the approach of creating strings in the for loop is costly. Therefore, we can use a list and the join() method with the get() method to get the reverse complement of a DNA strand using Python, as shown in the following example.

input_strand = "ACGTAATTGGCC"
reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}
complement_chars = []
length = len(input_strand)
for i in range(length):
    character = input_strand[length - 1 - i]
    complement_chars.append(reverse_dict.get(character, character))
reversed_strand = "".join(complement_chars)
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT

Here, we first created a list of the characters in the reverse complement while iterating over the input DNA strand in reverse order. After that, we created the reverse complement by joining the characters using the join() method.

Get the Reverse Complement of a DNA Strand Using List Comprehension in Python

Instead of using the for loop, you can also use list comprehension to reverse complement a DNA strand using Python.

We will first reverse the input DNA strand using slicing. After that, we will use list comprehension with the get() method and the reverse_dict created in the last example to get a list of characters of the reverse complement.

Once we get the list of characters, we will use the join() method to find the reverse complement of the input DNA strand, as shown in the following example.

input_strand = "ACGTAATTGGCC"
reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}
temp = input_strand[::-1]
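# Pass the character as the default so that unknown symbols are kept instead of becoming None.
complement_chars = [reverse_dict.get(character, character) for character in temp]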
reversed_strand = "".join(complement_chars)
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT
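
As a variation on the example above, the intermediate list and the reversed copy of the string can both be avoided. The following is a sketch, not the only way to do it: it passes a generator expression to join() and uses the built-in reversed() function. The function name reverse_complement and the input containing N are chosen here for illustration.

```python
reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(strand):
    # reversed() yields the characters from last to first without
    # building an intermediate reversed string.
    return "".join(reverse_dict.get(base, base) for base in reversed(strand))


print(reverse_complement("ACGTAATTGGCC"))  # GGCCAATTACGT
print(reverse_complement("ACGTN"))         # NACGT
```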

Get the Reverse Complement of a DNA Strand Using the translate() Method in Python

We can also find the reverse complement of a DNA strand using the translate() method. For this, we will use the following steps.

  • First, we will reverse the input DNA strand using string slicing. After that, we will create a translation table using the maketrans() method.
  • When called with two arguments, the maketrans() method takes two strings of the same length. After execution, it returns a translation table in which each character in the first string is mapped to the character at the same position in the second string.
  • The maketrans() method is a static method of the str class, so it can be called on str itself or on any string, such as the input DNA strand. We will pass "ATCG" as the first input argument and "TAGC" as the second input argument. In this way, each base will be mapped to its complement.
  • After creating the translation table, we will use the translate() method to obtain the reverse complement of the DNA strand.
  • The translate() method, when invoked on a string, takes a translation table as its input argument. After execution, it returns a new string by replacing the characters in the string on which it is invoked according to the translation table; if the mapping of a character is not found in the translation table, it copies the same character to the output string.
  • We will invoke the translate() method on the reversed DNA strand with the translation table as its input argument.
  • After executing the translate() method, we will get the reverse complement of the input DNA strand.

You can observe this in the following example.

input_strand = "ACGTAATTGGCC"
translation_table = str.maketrans("ATCG", "TAGC")
temp = input_strand[::-1]
reversed_strand = temp.translate(translation_table)
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT
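
Sequence data sometimes contains lowercase bases, which the table above leaves unchanged. The following sketch extends the table to cover both cases, assuming that lowercase input should produce lowercase output. The names COMPLEMENT_TABLE and reverse_complement are chosen here for illustration.

```python
# Translation table covering upper- and lowercase bases.
# Characters that are not in the table (such as "N") are left unchanged.
COMPLEMENT_TABLE = str.maketrans("ATCGatcg", "TAGCtagc")


def reverse_complement(strand):
    # Reverse with a slice, then map every base in a single call.
    return strand[::-1].translate(COMPLEMENT_TABLE)


print(reverse_complement("ACGTAATTGGCC"))  # GGCCAATTACGT
print(reverse_complement("acgtNN"))        # NNacgt
```

Building the table once at module level means it is not recreated on every call.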

Get the Reverse Complement of a DNA Strand Using the Biopython Module in Python

We can also use the Biopython module in Python to reverse complement a DNA strand. You can install Biopython with pip3, the package installer for Python, using the following statement.

pip3 install biopython

The Biopython module provides the reverse_complement() method to reverse complement a DNA strand using Python. When invoked on a DNA sequence object, the reverse_complement() method returns the reverse complement of a DNA sequence.

We will use the following steps to obtain the reverse complement of a DNA strand using the reverse_complement() method in Python.

  • First, we create a DNA sequence object from the DNA strand using Seq. Seq is a class in the Bio.Seq module; its constructor takes a string representing the DNA strand as its input and returns a sequence object.
  • After getting the sequence object, we will invoke the reverse_complement() method on it to obtain the reverse complement of the DNA strand, as shown in the following example.

from Bio.Seq import Seq

input_strand = "ACGTAATTGGCC"
sequence = Seq(input_strand)
reversed_strand = sequence.reverse_complement()
print("The input DNA strand is:", input_strand)
print("The reverse complement is:", reversed_strand)

Output:

The input DNA strand is: ACGTAATTGGCC
The reverse complement is: GGCCAATTACGT

Conclusion

In this article, we have discussed various approaches to reverse complement a DNA strand using Python. Out of all these methods, you can choose the approach with the translate() method if you aren’t allowed to use an external library; otherwise, you can use the Biopython module to reverse complement a DNA strand in Python.
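
If you want to compare the pure-Python approaches on your own machine, the timeit module from the standard library can be used. The following is a rough sketch: the function names and the 12,000-base test input are made up for illustration, and the timings will differ between machines and Python versions.

```python
import timeit

reverse_dict = {"A": "T", "T": "A", "G": "C", "C": "G"}
table = str.maketrans("ATCG", "TAGC")


def with_concatenation(strand):
    # Builds the result by repeated string concatenation.
    result = ""
    for base in reversed(strand):
        result = result + reverse_dict.get(base, base)
    return result


def with_join(strand):
    # Collects the complemented characters and joins them once.
    return "".join(reverse_dict.get(base, base) for base in reversed(strand))


def with_translate(strand):
    # Reverses with a slice and maps all characters in one call.
    return strand[::-1].translate(table)


strand = "ACGTAATTGGCC" * 1000  # hypothetical 12,000-base test input

for function in (with_concatenation, with_join, with_translate):
    seconds = timeit.timeit(lambda: function(strand), number=20)
    print(f"{function.__name__}: {seconds:.4f} s for 20 runs")
```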

We hope you enjoyed reading this article. Stay tuned for more informative articles.

Author: Aditya Raj
