
OMIM Mutations Search

About...

OMIM provides information on 'allelic variants' (mutations), but does not provide cross references to SwissProt.

In addition, the residue numbering used in OMIM is frequently problematic: the residue numbers often do not correspond to those in the corresponding SwissProt file. Generally, applying a constant offset to all the residue numbers corrects this, but in about 8% of entries the correct residue numbers cannot be found in this way, and for about 1% of the data there is a clear change in the numbering scheme within the OMIM entry (although there is no parsable annotation to indicate that this is the case). Residue numbers in this last group are validated as 'probably correct' (see below).
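The constant-offset idea can be illustrated with a minimal Python sketch. This is not the server's actual code: the function name find_best_offset, the max_offset search window and the one-letter residue codes are assumptions made for the example. It tries every offset in the window and keeps the one under which the most OMIM native residues agree with the SwissProt sequence.

```python
from collections import Counter


def find_best_offset(sequence, variants, max_offset=100):
    """Return (offset, n_matched) for the offset that matches most variants.

    sequence   -- SwissProt sequence as one-letter codes (numbered from 1)
    variants   -- list of (omim_resnum, native_residue) pairs
    max_offset -- hypothetical search window; the real limit is not documented
    """
    counts = Counter()
    for resnum, native in variants:
        for offset in range(-max_offset, max_offset + 1):
            pos = resnum + offset
            if 1 <= pos <= len(sequence) and sequence[pos - 1] == native:
                counts[offset] += 1
    if not counts:
        return None, 0
    # Most matches first; ties go to the smallest absolute offset.
    return min(counts.items(), key=lambda kv: (-kv[1], abs(kv[0]), kv[0]))


if __name__ == "__main__":
    seq = "MKTAYIAKQR"
    # OMIM-style numbers that are two lower than the sequence positions.
    omim_variants = [(1, "T"), (3, "Y"), (7, "Q")]
    print(find_best_offset(seq, omim_variants))
```

An entry for which no single offset accounts for all of its variants would fall into the 8% or 1% groups described above.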

On the other hand, SwissProt provides cross-links to mutation data, but the source of these data is not available from the SwissProt file (though links on the web provide it), making it difficult to distinguish disease-related mutations from SNPs.

This server extracts missense mutation data and OMIM reference numbers from OMIM and takes the sequence data, OMIM cross-reference and accession code from SwissProt. The data are linked using a PostgreSQL relational database. The residue numbers from OMIM are then validated against the sequence from SwissProt and the results of the validation are written back into the database.

'Probably correct' residue numbers

'Probably correct' residue numbers arise where a majority of the residues in an OMIM record are found by applying a constant offset (say, 20 residues), but a subset are not found using that offset. If any of this subset then match with an offset of zero, we assume that the numbering scheme within the OMIM entry has changed.
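As a rough illustration of this rule, the Python sketch below labels each variant in one OMIM record. It assumes the majority offset has already been determined. The function names, the one-letter residue codes and the choice to give '?' to exactly those residues that match at offset zero are assumptions for the example, not a description of the server's implementation.

```python
def residue_matches(sequence, resnum, native):
    """True if the 1-based position resnum in sequence holds the native residue."""
    return 1 <= resnum <= len(sequence) and sequence[resnum - 1] == native


def validate_record(sequence, variants, offset):
    """Label each (omim_resnum, native) pair as 't', '?' or 'f'.

    offset is assumed to be the offset that matches the majority of the
    record's variants. Returns (omim_resnum, validated_resnum, label) tuples;
    validated_resnum is None when no match is found.
    """
    results = []
    for resnum, native in variants:
        if residue_matches(sequence, resnum + offset, native):
            # Found with the majority offset: definitely correct.
            results.append((resnum, resnum + offset, "t"))
        elif offset != 0 and residue_matches(sequence, resnum, native):
            # Found only with no offset: the numbering scheme probably changed.
            results.append((resnum, resnum, "?"))
        else:
            results.append((resnum, None, "f"))
    return results


if __name__ == "__main__":
    seq = "MKTAYIAKQR"
    record = [(1, "T"), (3, "Y"), (9, "Q"), (4, "W")]
    for row in validate_record(seq, record, offset=2):
        print(row)
```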

File formats

XML

The XML file is largely self-documenting. An <omim> tag with an id attribute corresponds to an OMIM identifier. Each <omim> tag contains one <sprot> tag whose ac attribute indicates the UniProtKB/SwissProt accession code. The <sprot> tag in turn contains one or more <record> tags, which correspond to the OMIM allelic variant records.

Within each <record> tag, we indicate the residue number as it appeared in OMIM using the <omim_resnum> tag. This has a correct attribute to indicate whether we have validated this residue number as correct with respect to the SwissProt sequence. (The correct attribute is either 't' for true or 'f' for false.) The <resnum> tag indicates our validated residue number with respect to the SwissProt sequence. The valid attribute indicates that this number is definitely correct ('t' for true), definitely incorrect ('f' for false - indicates we were unable to find the residues indicated in OMIM), or probably correct ('?' - see above).

Finally, within the <record> tag, we indicate the native and mutant residues with the <native> and <mutant> tags and provide a <description> tag with the brief descriptive title taken from the OMIM data.
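A file laid out as described above could be read with Python's standard xml.etree.ElementTree module, as in the sketch below. The sample document is invented for illustration: the root element name <mutations>, the identifiers, the three-letter residue codes and the description text are all placeholders, so consult the DTD for the real structure.

```python
import xml.etree.ElementTree as ET

# Invented sample following the tag layout described in the text.
SAMPLE = """<mutations>
  <omim id="000000">
    <sprot ac="P00000">
      <record>
        <omim_resnum correct="f">10</omim_resnum>
        <resnum valid="t">30</resnum>
        <native>ARG</native>
        <mutant>HIS</mutant>
        <description>EXAMPLE DISEASE</description>
      </record>
    </sprot>
  </omim>
</mutations>"""


def _text(parent, tag):
    """Return the stripped text of a child element, or None if absent."""
    value = parent.findtext(tag)
    return value.strip() if value is not None else None


def _attr(parent, tag, name):
    """Return an attribute of a child element, or None if absent."""
    child = parent.find(tag)
    return child.get(name) if child is not None else None


def read_records(xml_text):
    """Flatten every <record> into a dict, one per allelic variant."""
    root = ET.fromstring(xml_text)
    rows = []
    for omim in root.iter("omim"):
        for sprot in omim.findall("sprot"):
            for rec in sprot.findall("record"):
                rows.append({
                    "omim_id": omim.get("id"),
                    "sprot_ac": sprot.get("ac"),
                    "omim_resnum": _text(rec, "omim_resnum"),
                    "omim_correct": _attr(rec, "omim_resnum", "correct"),
                    "resnum": _text(rec, "resnum"),
                    "valid": _attr(rec, "resnum", "valid"),
                    "native": _text(rec, "native"),
                    "mutant": _text(rec, "mutant"),
                    "description": _text(rec, "description"),
                })
    return rows


if __name__ == "__main__":
    for row in read_records(SAMPLE):
        print(row)
```

Filtering the resulting rows on the valid field ('t', '?' or 'f') selects mutations by validation status.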

Download the DTD

CSV

The comma-separated value format contains the following columns:

Draft paper

You can download a draft of the paper.