PSAlign is a multiple sequence alignment program based on the shortest preserving alignment formulation proposed in Sze et al. (2006). Given k sequences and a tree, it finds, in polynomial time and without using a heuristic, a multiple alignment that preserves the k-1 pairwise alignments specified by the edges of the tree. By using the consistency-based pairwise alignments from the first stage of TCoffee or ProbCons, PSAlign replaces the heuristic progressive second step of these programs with an exact preserving alignment step.
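To make the notion of "preserving" concrete, the following minimal Python sketch checks whether a multiple alignment preserves a set of pairwise alignments given on tree edges. It is not PSAlign's algorithm and does not compute a shortest preserving alignment. It assumes that "preserves" means the multiple alignment, restricted to the two sequences of an edge, aligns exactly the same residue pairs as the given pairwise alignment; see Sze et al. (2006) for the precise definition. The function names `aligned_pairs` and `preserves` and the toy sequences are hypothetical.

```python
def aligned_pairs(row_a, row_b, gap="-"):
    """Return the set of (position in a, position in b) residue pairs
    that share a column in two gapped rows of equal length."""
    pairs = set()
    pos_a = pos_b = 0
    for a, b in zip(row_a, row_b):
        if a != gap and b != gap:
            pairs.add((pos_a, pos_b))
        if a != gap:
            pos_a += 1
        if b != gap:
            pos_b += 1
    return pairs


def preserves(msa, pairwise):
    """Check (under the assumption stated above) that the multiple
    alignment `msa` aligns the same residue pairs as every pairwise
    alignment in `pairwise`, a dict {(i, j): (row_i, row_j)} keyed by
    tree edges."""
    return all(
        aligned_pairs(msa[i], msa[j]) == aligned_pairs(row_i, row_j)
        for (i, j), (row_i, row_j) in pairwise.items()
    )


# Toy example: k = 3 sequences, tree edges (0, 1) and (1, 2),
# hence k - 1 = 2 pairwise alignments.
pairwise = {
    (0, 1): ("ACGT", "A-GT"),
    (1, 2): ("AGT", "ACT"),
}
good = ["ACGT", "A-GT", "A-CT"]  # keeps both pairwise alignments
bad = ["ACGT", "AG-T", "AC-T"]   # changes the alignment on edge (0, 1)

print(preserves(good, pairwise))
print(preserves(bad, pairwise))
```

Comparing sets of aligned residue pairs, rather than the gapped strings themselves, makes the check insensitive to the order of adjacent columns that contain a gap in one of the two sequences.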
We test PSAlign on three sets of benchmark multiple alignments: BAliBASE 2.01, PREFAB 4.0 and SABmark 1.65. PSAlign outperforms TCoffee on two of the three test sets. Compared to a variant of ProbCons with no iterative refinements, it achieves similar or better accuracy on all but one test set. Compared to ProbCons with iterative refinements, it achieves similar or better accuracy on many sub-categories. Detailed comparisons can be found in Sze et al. (2006).
The PSAlign source code is available for download and can be compiled under Unix, Linux, or Windows (Cygwin). The source code includes modified versions of TCoffee 1.37 and ProbCons 1.10. The following steps will create a directory called psalign. Further instructions are in the README file.
Sze, S.-H., Lu, Y., and Yang, Q. (2006) A polynomial time solvable formulation of multiple sequence alignment. Journal of Computational Biology, 13, 309-319. (Also appears in Proceedings of the 9th Annual International Conference on Research in Computational Molecular Biology (RECOMB'2005), Lecture Notes in Bioinformatics, 3500, 204-216.)