Burrows Wheeler Transformation

The Burrows-Wheeler Transform: A Powerful Tool for Data Compression

What is the Burrows-Wheeler Transform?

The Burrows-Wheeler Transform (BWT) is a reversible transformation that rearranges the characters of a string so that identical characters tend to cluster into runs. It does not compress data by itself. Instead, it is used as a pre-processing step in lossless compressors such as bzip2, where the clustered output is easier for later stages to encode.

Motivation: Short Read Mapping

One of the key applications of the BWT is in short read mapping. With the advent of high-throughput sequencing technologies, biologists can now generate millions of short reads from a single DNA sample. These reads need to be mapped to a reference genome to identify genetic variants and other genomic features. The BWT can be used to accelerate this mapping process: an index built from the BWT of the reference genome allows exact matches for a read to be found in time proportional to the length of the read rather than the length of the genome.

How does the Burrows-Wheeler Transform Work?

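The BWT works by forming every cyclic rotation of the input string, sorting those rotations in lexicographic order, and taking the last character of each sorted rotation. Read from top to bottom, those last characters form the transformed string. For example, the rotations of "banana" in sorted order are "abanan", "anaban", "ananab", "banana", "nabana", and "nanaba", so the transformed string is "nnbaaa". In practice, a unique end-of-string marker such as "$" is usually appended first (or the position of the original string among the sorted rotations is stored) so that the transform can be inverted. With the marker, "banana$" becomes "annb$aa".

The BWT has a number of interesting properties. One property is that it is reversible: given the marker or the stored position, the original string can be recovered from the transformed string alone. Another property is that the number of occurrences of any given character in the transformed string is the same as the number of occurrences of that character in the original string, because the transformed string is a permutation of the original. A third property is that characters which precede similar contexts end up next to each other, so the transformed string tends to contain runs of identical characters, which is what makes it easier to compress.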
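The rotate, sort, and take-the-last-column procedure can be sketched in a few lines of Python. This is a minimal illustration rather than a production implementation: the function names bwt and inverse_bwt are hypothetical, the "$" end-of-string marker is an assumption (it must not occur in the input and is taken to sort before every other character), and both functions build the full table of rotations, which is only practical for short strings.

```python
def bwt(text, sentinel="$"):
    """Return the BWT of text (sentinel is assumed not to occur in text)."""
    if sentinel in text:
        raise ValueError("sentinel must not occur in the input")
    s = text + sentinel
    # Build every cyclic rotation of s and sort them lexicographically.
    rotations = sorted(s[i:] + s[:i] for i in range(len(s)))
    # The transform is the last character of each sorted rotation.
    return "".join(rotation[-1] for rotation in rotations)


def inverse_bwt(transformed, sentinel="$"):
    """Recover the original text from the output of bwt()."""
    table = [""] * len(transformed)
    # Prepending the transformed column and re-sorting, once per character,
    # rebuilds the sorted rotation table one column at a time.
    for _ in range(len(transformed)):
        table = sorted(c + row for c, row in zip(transformed, table))
    # The rotation that ends with the sentinel is the original string.
    original = next(row for row in table if row.endswith(sentinel))
    return original[:-1]


if __name__ == "__main__":
    encoded = bwt("banana")
    print(encoded)               # annb$aa
    print(inverse_bwt(encoded))  # banana
```

Real compressors and read mappers typically avoid materializing the rotations and derive the same result from a suffix array instead; the table-based version is used here only because it mirrors the description most directly.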

Conclusion

The Burrows-Wheeler Transform is a powerful tool for data compression and sequence search. The basic algorithm is simple to implement, yet it can have a significant impact on the performance of downstream steps such as run-length coding, entropy coding, and pattern matching. As a result, the BWT is widely used in a variety of applications, including bioinformatics, text processing, and data mining.

