Multiple Sequence Alignment Tools
4.2. Multiple Sequence Alignment Tools
The main objective of multiple sequence alignment tools is to study groups of related genes or proteins so as to infer the broad evolutionary relationships prevailing between genes. In addition, they help to discover patterns that are shared amongst groups of functionally or structurally related sequences.
A few important multiple sequence alignment tools employed in the study of genes are discussed as under :
(a) CLUSTALW [http://www.ebi.ac.uk]. CLUSTALW refers to a programme specially designed for accomplishing progressive multiple sequence alignment. The underlying principle employed in this programme is that of a phylogenetic* analysis, and essentially involves the various steps as detailed under :
1. First of all, a ‘pair-wise distance matrix’ is generated for all the sequences that need to be aligned, and a ‘guide tree’ is constructed from it by making use of the neighbour-joining algorithm,
2. Next, the most closely related pairs of sequences, i.e., those attached to the terminal branches of the ‘guide tree’, are aligned to each other,
3. The relatively less related sequences are subsequently aligned to obtain definite ‘profiles’ ; and finally all such ‘profiles’ are aligned to each other until a complete dendrogram is achieved.
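The first step above (distance matrix, then guide tree) can be sketched in a few lines of Python. This is only an illustrative sketch and not CLUSTALW's own code: the function names `p_distance`, `distance_matrix` and `neighbour_joining` and the toy sequences are hypothetical, and the distance used here is a simple fraction of mismatched positions between equal-length sequences, whereas CLUSTALW derives its distances from pairwise alignments.

```python
from itertools import combinations


def p_distance(a, b):
    """Fraction of differing positions between two equal-length sequences."""
    if len(a) != len(b):
        raise ValueError("this toy distance assumes equal-length sequences")
    return sum(x != y for x, y in zip(a, b)) / len(a)


def distance_matrix(seqs):
    """Pairwise distances for every pair, stored under both key orders."""
    d = {}
    for i, j in combinations(seqs, 2):
        d[i, j] = d[j, i] = p_distance(seqs[i], seqs[j])
    return d


def neighbour_joining(names, d):
    """Guide-tree topology as nested tuples (branch lengths are omitted)."""
    d = dict(d)  # work on a copy so the caller's matrix is untouched
    nodes = list(names)
    while len(nodes) > 2:
        n = len(nodes)
        # Total distance from each node to all the other nodes.
        r = {i: sum(d[i, k] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises the neighbour-joining Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p] - r[p[0]] - r[p[1]])
        new = (i, j)
        # Distance from the new internal node to every remaining node.
        for k in nodes:
            if k != i and k != j:
                d[new, k] = d[k, new] = 0.5 * (d[i, k] + d[j, k] - d[i, j])
        nodes = [k for k in nodes if k != i and k != j] + [new]
    return tuple(nodes)


# Hypothetical toy input: four short, already equal-length sequences.
toy = {"s1": "ACGTACGT", "s2": "ACGTACGA", "s3": "TCGAACGT", "s4": "TCGAACCT"}
guide_tree = neighbour_joining(list(toy), distance_matrix(toy))
```

The nested tuples returned by `neighbour_joining` give the order in which a progressive aligner would merge sequences and profiles, starting from the innermost pairs.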
In summary, the CLUSTALW programme is very similar to other programmes and methods, such as NTSYS and UPGMA, that make use of molecular marker or morphological data for the ultimate creation of a distance matrix and a dendrogram, except that it specifically uses either DNA or protein sequences.
* Concerning the development of a race or phylum.
It has also been established that CLUSTALW may be exploited for pairwise global protein or nucleic acid sequence alignments amongst ORFs that show a significant BLAST alignment (E < 10⁻¹⁰). It provides a measure of the degree of divergence for each of the large number of pairwise comparisons, which may then be utilized for elaborate and exhaustive comparative genomics research.
(b) PHYLIP. PHYLIP comprises thirty programmes which implement different phylogenetic analysis algorithms, and is hence recognized as an indispensable package. Highlights of PHYLIP are as follows :
(i) Each programme runs almost independently of the others.
(ii) A large proportion of the programmes look for an input file [called ‘infile’], and write an output file [known as ‘outfile’], instead of requiring parameters to be entered, e.g., E-value, number of results to be returned etc., as is the case with BLAST.
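As an illustration of the ‘infile’ convention, the following minimal Python sketch builds the text of a PHYLIP-style input file, assuming the classic layout : a first line giving the number of sequences and the number of characters, followed by one line per sequence in which the name occupies the first ten columns. The function name `to_phylip` and the sequences are hypothetical, and the sketch assumes the sequences are already aligned to equal length.

```python
def to_phylip(seqs):
    """Return aligned sequences as text in the classic PHYLIP layout."""
    lengths = {len(s) for s in seqs.values()}
    if len(lengths) != 1:
        raise ValueError("PHYLIP input expects aligned sequences of equal length")
    # Header line: number of sequences, then number of characters per sequence.
    lines = [f"{len(seqs)} {lengths.pop()}"]
    for name, seq in seqs.items():
        if len(name) > 10:
            raise ValueError("names are limited to 10 characters in this layout")
        # The name is padded with blanks to exactly ten columns.
        lines.append(f"{name:<10}{seq}")
    return "\n".join(lines) + "\n"


# Hypothetical toy alignment of two sequences.
text = to_phylip({"seqA": "ACGT", "seqB": "ACGA"})
```

Saving `text` under the file name `infile` in the working directory is what would allow a programme such as DNAPARS to pick it up without further prompting for a file name.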
DNA and protein sequence analyses are frequently carried out with the help of the PHYLIP programmes as detailed below :
(i) PROTPARS : infers phylogeny* from protein sequence input,
(ii) PROTDIST : helps in the computation of an evolutionary distance matrix from protein sequence input,
(iii) DNAPARS : infers phylogeny from DNA sequence input,
(iv) NEIGHBOR : infers phylogeny from distance matrix data employing either pairwise clustering (UPGMA) or the neighbour-joining algorithm, and
(v) DRAWGRAM : draws a rooted phylogenetic tree based upon the output from one of the phylogeny inference programmes.
(c) Bio Numeric Version 2.0. Importantly, Bio Numeric Version 2.0 refers to a unique software package that may be downloaded against payment and used off-line. In order to perform an elaborate study of genetic similarity or diversity, one may supply the available experimental data in two forms, namely :
(i) In the form of gel or capillary electrophoresis patterns, High Performance Liquid Chromatography (HPLC) profiles, and autoradiograms, and
(ii) In the form of nucleotide sequences of DNA/RNA, and amino acid sequences of proteins respectively.
Mechanism : Bio Numeric Version 2.0 first constructs a similarity matrix by making use of any one of the available algorithms. The resulting similarity matrix is then used for constructing a dendrogram with the aid of one of the available clustering techniques, for instance : UPGMA, Single Linkage, Neighbour-joining, Ward, and Complete Linkage.
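The two stages of this mechanism, a similarity matrix followed by clustering, can be sketched generically in Python with UPGMA as the clustering technique. This is not the package's own implementation : the function names `similarity` and `upgma` are hypothetical, similarity is taken here as the fraction of identical positions, and it is converted to a distance as 1 minus the similarity before clustering.

```python
from itertools import combinations


def similarity(a, b):
    """Fraction of identical positions between two equal-length sequences."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def upgma(names, sim):
    """Cluster by UPGMA; returns the dendrogram topology as nested tuples.

    `sim` maps (a, b) pairs, in the order produced by
    combinations(names, 2), to a similarity between 0 and 1.
    """
    size = {n: 1 for n in names}  # number of leaves under each cluster
    d = {}
    for a, b in combinations(names, 2):
        d[a, b] = d[b, a] = 1.0 - sim[a, b]  # similarity -> distance
    nodes = list(names)
    while len(nodes) > 1:
        # Merge the two clusters that are currently closest.
        a, b = min(combinations(nodes, 2), key=lambda p: d[p])
        new = (a, b)
        size[new] = size[a] + size[b]
        # Average linkage, weighted by the sizes of the merged clusters.
        for k in nodes:
            if k != a and k != b:
                d[new, k] = d[k, new] = (
                    size[a] * d[a, k] + size[b] * d[b, k]
                ) / size[new]
        nodes = [k for k in nodes if k != a and k != b] + [new]
    return nodes[0]


# Hypothetical similarity matrix for four samples.
sim = {("A", "B"): 0.9, ("A", "C"): 0.3, ("A", "D"): 0.3,
       ("B", "C"): 0.3, ("B", "D"): 0.3, ("C", "D"): 0.8}
dendrogram = upgma(["A", "B", "C", "D"], sim)
```

Replacing the size-weighted average in the inner loop with a minimum or a maximum of the two distances would give Single Linkage or Complete Linkage respectively, which is why such packages can offer several clustering techniques over the same matrix.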