Subject: COMP: New WWW-based DNA and protein sequence analysis tools
From: "Frank S. Zollmann" <zollmann.1@osu.edu>
Date: Sun, 9 Apr 1995 23:22:06 -0400

Dear HUM-MOLGENeticists,

1) From: Randall Smith <rsmith@DOT.IMGEN.BCM.TMC.EDU>
Subj.: New WWW-based DNA and protein sequence analysis tools

The Human Genome Center, Baylor College of Medicine, is pleased to
announce two new WWW services which speed the analysis of DNA and
protein sequences.  These services are now available via the "BCM
Search Launcher" Web page:


(The BCM Search Launcher organizes and simplifies molecular
biology-related search and analysis services available on the WWW. It
provides a single point-of-entry for related services, for example, a
single page for launching sequence searches using standard

1) Multiple Sequence Alignment Server

CLUSTAL-W (Thompson, Higgins, and Gibson, 1994), MAP (Huang, 1994),
and PIMA (Smith and Smith, 1992) multiple sequence alignments can now
be run remotely on our server via the Search Launcher.  The server
uses Don Gilbert's readseq program to input sequences in any one of a
variety of formats (e.g., FASTA, GCG, NBRF, EMBL).  Both DNA and
protein multiple alignments can be performed.

2) BEAUTY searches of a new Annotated Protein Sequence Database

The Annotated Sequences database consists of all Entrez protein
sequences containing at least one domain or site (see below). BEAUTY
performs a standard BLAST search of this database and generates a
graphic for each database hit showing the locations of all annotated
domains and sites with respect to the locations of the hits within
each matched sequence.  SRS links to the appropriate records in the
Entrez, PROSITE, BLOCKS, and PRINTS databases are also provided. These
enhancements make it much easier to detect functionally significant
matches in BLAST database searches.
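
As a rough illustration of the idea behind the BEAUTY graphic (relating
a BLAST hit to the annotated domains and sites of the matched
sequence), the sketch below reports which annotated regions overlap a
hit.  This is not the BEAUTY code; the Region type, the function name,
and the example coordinates are all hypothetical, and coordinates are
assumed to be 1-based and inclusive.

```python
from typing import List, NamedTuple


class Region(NamedTuple):
    """An annotated domain or site on a database sequence (hypothetical type)."""
    name: str
    start: int  # 1-based, inclusive
    end: int    # 1-based, inclusive


def overlapping_regions(hit_start: int, hit_end: int,
                        regions: List[Region]) -> List[Region]:
    """Return the annotated regions that share at least one residue with the hit.

    Two inclusive intervals overlap when each one starts no later than
    the other one ends.
    """
    return [r for r in regions if r.start <= hit_end and hit_start <= r.end]


if __name__ == "__main__":
    # Made-up annotations for a single matched sequence.
    annotations = [
        Region("kinase_domain", 50, 300),
        Region("ATP_site", 60, 68),
        Region("SH2", 400, 480),
    ]
    # A made-up BLAST hit covering residues 55-120 of the matched sequence.
    for region in overlapping_regions(55, 120, annotations):
        print(region.name, region.start, region.end)
```

A hit that overlaps an annotated region is the kind of match the
graphic is meant to make easy to spot; a hit that falls between
annotated regions returns an empty list here.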

A database of annotated domains/sites was created by 1) scanning the
NCBI's Entrez database for protein sequence records containing
annotations of domains and sites, and storing the location of all such
regions, 2) matching each Entrez protein sequence against the sequence
motifs in the PROSITE pattern database, and storing the location of
each hit, 3) extracting the locations of the conserved blocks within
the sequences represented in the BLOCKS database, and 4) extracting
the locations of all domains identified in the sequences in the PRINTS
protein fingerprint database.  The Annotated Sequences database
currently contains 55,566 sequences.
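
Step 2 above (matching each sequence against PROSITE patterns and
storing the location of each hit) can be sketched as follows.  This is
a simplified illustration, not the procedure actually used to build the
database: it assumes the standard PROSITE pattern syntax, with
lower-case x for any residue, and does not handle every edge case (for
example, anchors inside square brackets).  The function names are
hypothetical.

```python
import re
from typing import List, Tuple


def prosite_to_regex(pattern: str) -> str:
    """Translate a PROSITE pattern into a Python regular expression.

    Handles: '-' separators, x (any residue), [ABC] (allowed residues),
    {ABC} (forbidden residues), (n) and (n,m) repeats, and the '<' / '>'
    N- and C-terminal anchors.
    """
    parts = []
    for element in pattern.rstrip(".").split("-"):
        element = element.replace("{", "[^").replace("}", "]")  # forbidden set
        element = element.replace("(", "{").replace(")", "}")   # repeat counts
        element = element.replace("x", ".")                     # any residue
        element = element.replace("<", "^").replace(">", "$")   # terminal anchors
        parts.append(element)
    return "".join(parts)


def find_sites(sequence: str, pattern: str) -> List[Tuple[int, int]]:
    """Return (start, end) of every match, 1-based and inclusive.

    A lookahead is used so that overlapping matches are all reported.
    """
    regex = re.compile("(?=(" + prosite_to_regex(pattern) + "))")
    return [(m.start() + 1, m.start() + len(m.group(1)))
            for m in regex.finditer(sequence)]


if __name__ == "__main__":
    # N-glycosylation site pattern, applied to a made-up sequence.
    print(find_sites("MKNGSAANPTQ", "N-{P}-[ST]-{P}."))
```

Storing the returned coordinates alongside each sequence identifier
gives the kind of location table that the remaining steps would add to
from the other annotation sources.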

A more detailed program description is available at:

Kim Worley, Brent Wiese, and Randall Smith
Human Genome Center, Department of Molecular and Human Genetics and
W.M. Keck Center for Computational Biology
Baylor College of Medicine, Houston, TX  77030  USA

kworley@bcm.tmc.edu, brent@bcm.tmc.edu, rsmith@bcm.tmc.edu

