The MAFFT algorithm performs gene-level alignment of either protein or nucleotide sequences. If you want to align a large number of sequences in MegAlign Pro, we recommend using this algorithm. As of the Lasergene 17.3 release (August 2021), MegAlign Pro uses MAFFT 7. For example, this powerful aligner can typically align thousands of viral genomes in under two minutes. The ability to align to a specified reference sequence was added in the Lasergene 17.3.1 release (January 2022).
To run a MAFFT alignment, select two or more sequences and choose Align > (Re)Align Using MAFFT. If you wish to change method options, instead choose Align > Align with Options. Options vary depending on whether the sequences are protein or nucleotide (nucleotide version shown below).
Change settings as desired:
- Use the Reference Sequence drop-down to select either None or one of the sequences that have been chosen for alignment. This option is ideal for the small percentage of MegAlign Pro users who need an extremely high-capacity alignment algorithm.
- Use the Align drop-down menu to select sequences to align or realign.
- In the Using drop-down menu, choose MAFFT.
- Specify the Gap open penalty, the numerical penalty for introducing a gap when calculating alignments. This penalty is applied once per gap and does not take the size of the gap into account. The default is -2.0.
- Specify the Gap extension penalty, which affects the lengths of gaps and must be zero or a negative number. Lowering the magnitude of the gap extension penalty may allow for longer gaps. The default is 0.
- In the Algorithm drop-down menu, either retain the default of Choose algorithm depending upon size [auto], or specify a particular alignment algorithm.
Speed | Description | No. of Seqs | Strategy |
---|---|---|---|
Choose algorithm depending upon size [auto] | This is the default method for most nucleotide sequences. (MAUVE is the default method for sequences >1 Mb in size.) | | |
Very slow | Global homology | < 200 | g-ins-i |
Very slow | One conserved domain | < 200 | l-ins-i. This is the default method used for protein sequences. |
Very slow | Multiple conserved domains | < 200 | e-ins-i |
Slow | Iterative refinement, iterations specified by user | N/A | fft-ns-i |
Medium | Iterative refinement, two iterations | N/A | N/A |
Fast | Progressive | N/A | fft-ns-2 |
Very fast | Progressive | > 2000 | fft-ns-1 |
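For readers who also run MAFFT outside MegAlign Pro, the strategy names in the table correspond to flag combinations of the standalone `mafft` command-line program. The following Python sketch builds such a command. It is only an illustration: the helper name `build_mafft_command` and the file name are hypothetical, and this page does not describe how MegAlign Pro itself invokes MAFFT.

```python
# Standalone MAFFT 7 command-line flags for each strategy named in the table.
# Assumption: these flags describe the standalone `mafft` program only; how
# MegAlign Pro calls MAFFT internally is not documented on this page.
STRATEGY_FLAGS = {
    "auto": ["--auto"],
    "g-ins-i": ["--globalpair", "--maxiterate", "1000"],
    "l-ins-i": ["--localpair", "--maxiterate", "1000"],
    "e-ins-i": ["--genafpair", "--maxiterate", "1000"],
    "fft-ns-i": ["--retree", "2", "--maxiterate", "1000"],
    "fft-ns-2": ["--retree", "2", "--maxiterate", "0"],
    "fft-ns-1": ["--retree", "1", "--maxiterate", "0"],
}


def build_mafft_command(input_fasta, strategy="auto", max_iterations=None):
    """Return an argument list for the standalone mafft program (hypothetical helper)."""
    if strategy not in STRATEGY_FLAGS:
        raise ValueError(f"unknown strategy: {strategy!r}")
    flags = list(STRATEGY_FLAGS[strategy])
    if max_iterations is not None:
        # Only fft-ns-i takes a user-specified number of refinement iterations here.
        if strategy != "fft-ns-i":
            raise ValueError("max_iterations applies only to fft-ns-i")
        flags[flags.index("--maxiterate") + 1] = str(max_iterations)
    return ["mafft", *flags, input_fasta]


# "genomes.fasta" is a placeholder file name.
print(" ".join(build_mafft_command("genomes.fasta", "fft-ns-1")))
# prints: mafft --retree 1 --maxiterate 0 genomes.fasta
```

The returned list can be passed to `subprocess.run` with standard output redirected to a file, since `mafft` writes the alignment to standard output.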
- In the Scoring matrix drop-down menu, choose the desired matrix:
Matrix name | Description |
---|---|
Nucleotide sequences | |
1PAM / k=2 | For closely related sequences. |
20PAM / k=2 | For moderately related sequences. |
200PAM / k=2 | For distantly related sequences. |
Protein sequences | |
BLOSUM30, 45, 62, 80 | (Henikoff & Henikoff, 1992). The BLOSUM series of matrices contains the same values as in some of MegAlign Pro’s other alignment methods, except that the protein ambiguity codes B, Z, and X are excluded in this case. These matrices are ideal for carrying out similarity searches. Choose a higher-numbered BLOSUM matrix for less divergent sequences. |
JTT100, 200 | (Jones et al., 1992). These matrices are similar to PAM matrices and were generated using an algorithm similar to the approach of Dayhoff et al. (1978), but based on a larger set of protein sequences. |
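To make the interaction between the Gap open penalty and Gap extension penalty settings concrete, the following Python sketch scores a single gap under a generic affine gap model. It assumes the open penalty is charged once per gap and the extension penalty once per gapped position. The exact formula MAFFT uses internally may differ, and the function name `gap_score` is hypothetical.

```python
def gap_score(gap_length, gap_open=-2.0, gap_extension=0.0):
    """Score contribution of one gap under a generic affine model.

    Assumption: open penalty charged once per gap, extension penalty
    charged once per gapped position. This is an illustration, not
    MAFFT's actual scoring code.
    """
    if gap_length < 0:
        raise ValueError("gap_length must be non-negative")
    if gap_extension > 0:
        raise ValueError("gap extension penalty must be zero or negative")
    if gap_length == 0:
        return 0.0
    return gap_open + gap_extension * gap_length


# With the defaults listed above (-2.0 and 0), gap length does not matter.
print(gap_score(1))    # -2.0
print(gap_score(50))   # -2.0
# A nonzero extension penalty makes long gaps more costly.
print(gap_score(50, gap_extension=-0.5))  # -27.0
```

Under this model, an extension penalty of 0 makes a 50-position gap cost the same as a 1-position gap, which is why lowering the magnitude of the extension penalty may allow for longer gaps.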
After making your changes:
- Choose Align to use the entered options to perform a multiple sequence alignment.
- Use the Reset to Default button if you would like to reset all values to the MegAlign Pro defaults.
- Select Cancel to leave the dialog without saving any changes.