Distributed Indexing Dispatched Alignment* (DIDA*)

DIDA* performs large-scale alignment tasks by distributing the indexing and alignment stages into smaller subtasks over a cluster of compute nodes.

Performance increase by up to 77 percent1

Distributed Indexing Dispatched Alignment* (DIDA*) is a novel distributed and parallel indexing and alignment framework that consists of five major steps to perform the indexing and alignment task: distribute, index, dispatch, align, and merge. The indexing and dispatch steps are performed in parallel. It works by first partitioning the targets into smaller parts using a heuristic balanced cut. Next, DIDA creates an index for each partition. The reads are then “flowed” through a Bloom filter to dispatch the alignment task to the node(s). Finally, the reads are aligned on all partitions in parallel and the partial results are combined together to create the final output.

DIDA is written in C++ and parallelized using OpenMP for multithreaded computing on a single computing node. For distributed computing, DIDA employs a message passing interface (MPI) for inter-process communications. As input, it gets the set of target sequences and the set of queries in FASTA or FASTQ formats, and the default output is SAM format.

Performance Results

The performance of DIDA was measured and evaluated when coupled with popular alignment methods Burrows-Wheeler Aligner* (BWA*), Bowtie2, Novoalign, and ABySS-map on C. elegans, human draft genome, human reference genome, and P. glauca genome. Compared to their baseline performance, when run through the DIDA framework with 12 nodes, BWA, Bowtie2, Novoalign, and ABySS-map use less memory (91 percent, 90 percent, 87 percent, and 91 percent, respectively) and execute faster (55 percent, 74 percent, 77 percent, and 67 percent, respectively) for a draft human genome assembly.1

Download the code ›

Reproduce these results with this optimization recipe ›

Related Codes

Assembly By Short Sequences * (ABySS*) ›


Hamid Mohamadi, Benjamin P. Vandervalk, Anthony Raymond, Shaun D. Jackman, Justin Chu, Clay P. Breshears, and Inanc Birol. "DIDA: Distributed Indexing Dispatched Alignment." PLoS ONE 10, no. 4 (2015). doi: 10.1371/journal.pone.0126409.

Configuration Table

System Overview



Twelve HPC nodes interconnected by 40Gbps Infiniband


Each node has two Intel® Xeon® X5650 processors (2.67 GHz)


Each node has 48GB RAM

Operating System

CentOS 5.4
Intel® Cluster Studio 2013
DIDA ver. 1.0.1, ABySS-map v1.5.2
BWA v0.7.10, Bowtie2 v2.1.0
Novoalign v3.01.02




效能測試中使用的軟體與工作負載可能僅針對 Intel® 微處理器進行最佳化。包括 SYSmark* 與 MobileMark* 在內的效能測試是使用特定電腦系統、零組件、軟體、作業與功能進行量測。這些因素若有任何異動,均可能導致測得結果產生變化。建議您參考其他資訊與效能測試數據,協助您充分評估欲購買產品的性能,包括該產品在搭配其他產品運作時的效能。如需完整的資訊,請參閱 http://www.intel.com.tw/benchmarks