Pfam 



The Pfam database is one of the most important collections of information in the world for classifying proteins. It categorises 75 per cent of known proteins into a library of protein families, a 'periodic table' of biology. The open-access resource was established at the Wellcome Trust Sanger Institute in 1998.

Proteins are built from a number of regions, called domains, which in different combinations can determine the protein's function. Each entry in the Pfam database includes a protein sequence alignment and an accompanying statistical model, called a hidden Markov model. Pfam allows users to analyse sequence data and search for related proteins in the database. It also lets users view the structure and domain architecture of any stored protein, examine which species a protein is found in and look at multiple alignments. In addition, Pfam stores and gives access to information on higher-level groupings of related protein families, known as clans, whose members are related by similarity of sequence or structure, or by a statistical analysis of their associated hidden Markov models.

The database comprises two main collections. Pfam-A contains high-quality entries that have been curated manually. To extend sequence coverage, Pfam-B contains automatically generated entries that are of lower quality but add valuable coverage for regions not yet curated and stored in Pfam-A. (http://www.sanger.ac.uk/resources/databases/pfam.html)
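The ideas of a domain architecture and of clans grouping related families can be illustrated with a small sketch. This is only a toy data model under stated assumptions: the class names, field names, accessions and family names below are hypothetical and made up for illustration, and nothing here reflects Pfam's actual schema or software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Family:
    accession: str              # hypothetical identifier, not a real Pfam accession
    name: str
    clan: Optional[str] = None  # clan name, if the family belongs to one


@dataclass(frozen=True)
class DomainHit:
    family: Family
    start: int  # 1-based start position on the protein sequence
    end: int    # inclusive end position


def domain_architecture(hits):
    """Return family names ordered by where they occur along the protein."""
    return [h.family.name for h in sorted(hits, key=lambda h: h.start)]


def group_by_clan(families):
    """Map each clan name to the sorted names of its member families."""
    clans = {}
    for fam in families:
        if fam.clan is not None:
            clans.setdefault(fam.clan, []).append(fam.name)
    return {clan: sorted(names) for clan, names in clans.items()}


# Made-up example data, not real Pfam entries.
kinase = Family("FAM0001", "kinase_like", clan="clan_A")
binding = Family("FAM0002", "binding_like", clan="clan_A")
repeat = Family("FAM0003", "repeat_like")

hits = [DomainHit(kinase, 120, 380), DomainHit(binding, 10, 95)]
print(domain_architecture(hits))
print(group_by_clan([kinase, binding, repeat]))
```

In this sketch a protein's architecture is simply the order of its domains along the sequence, and a family without a clan is left out of the clan grouping.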