Query the Kabat sequence database - Full query language
This page allows you to query the Kabat antibody sequence database using an SQL-like query language. The main difference between the language used here and SQL is that clauses within the WHERE statement are combined in reverse Polish notation. Also, since there is only one table in the database, there is no FROM statement.
Unless you are asking complex queries, you will probably find the simple point-and-click interface to KabatMan easier! It only allows fairly simple queries, but covers the majority of the types of query that people ask. To ask more complex queries, you must use the KabatMan Query Language directly in the text entry box below.
05.12.11 Data reading
A small number of sequences with insertions (e.g. 18D-6 light chain, ID 007779) were not being read correctly as the Kabat format was inconsistent. This is now fixed.
25.09.06 LEN() function
The LEN() function can now be used with framework regions as well as CDRs.
24.08.06 New SEQUENCE command
The SEQUENCE command works like PIR but produces a Kabat numbered sequence rather than a PIR file.
You may now use LFR1, LFR2, LFR3, LFR4, HFR1, HFR2, HFR3, HFR4, (in the same way as L1, L2, etc.) to extract the sequences of the 8 framework regions. These may all be used in SELECT or WHERE clauses.
03.04.02 New DATE command
You can now use DATE to extract the year of the earliest publication.
You can now specify the delimiter used between fields in the output. You may also select human subgroups (thanks to Sophie Deret) as has been available from the full query interface for the last year.
There is now a new selection field: subgroup() which takes either L or H as a parameter. This gives the subgroup of either the light or heavy chain. It may be used in SELECT or WHERE clauses. Note that this only works for human sequences.
There is now an option to specify which definition of canonicals you wish to use. The default is pretty much as before (i.e. derived from AbM definitions, although additional classes have been removed as they are now covered by the auto mode). The other two modes, strict and auto, apply strict definitions from Chothia's papers or my own automated definition of conformation and key residues, respectively.
There are now 4 new selection fields: idlight, idheavy, urllight and urlheavy, which allow you to obtain the Kabat accession codes for the light or heavy chain. The url versions return hyperlinks to the Kabat Web page; clicking the link will give you the full raw-data entry from their page.
You can now ask for the light chain and heavy chain Kabat accession numbers which are returned as hyperlinks giving you access to the full data from the Kabat Web site.
The error reporting done by KabatMan is rather poor at present; if you ask a meaningless question, KabatMan is most likely to ignore you altogether rather than give you a sensible error message!
Please read the documentation which covers all you need to know to use the interface.
You can also read the full documentation for the stand-alone version of KabatMan.
In brief, the syntax is:
WHERE column comparison property [column comparison property boolop]
column is a column name in the database
comparison is one of: =, !=, <, >, ≤, ≥, includes
property is something you are trying to match in the column
boolop is one of: and, or
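As an illustration of how the reverse Polish ordering works, here is a minimal Python sketch that assembles a WHERE clause from a list of conditions. The helper name rpn_where is hypothetical and is not part of KabatMan, and antigen is assumed to be a valid column name purely for the example. Only the WHERE part is built, since the rest of the query is not described in this summary.

```python
from typing import List, Tuple

# One condition is (column, comparison, property), as in the syntax summary.
Condition = Tuple[str, str, str]


def rpn_where(conditions: List[Condition], boolops: List[str]) -> str:
    """Build a WHERE clause with conditions combined in reverse Polish order.

    Hypothetical helper: the first condition stands alone, and every later
    condition is followed by the boolean operator that joins it to
    everything before it.
    """
    if not conditions:
        raise ValueError("at least one condition is required")
    if len(boolops) != len(conditions) - 1:
        raise ValueError("need exactly one boolop per additional condition")

    tokens = ["WHERE", *conditions[0]]
    for condition, boolop in zip(conditions[1:], boolops):
        if boolop not in ("and", "or"):
            raise ValueError(f"unknown boolop: {boolop!r}")
        tokens.extend(condition)  # column comparison property
        tokens.append(boolop)     # operator comes AFTER its operands
    # Joining on single spaces keeps spaces around every operator.
    return " ".join(tokens)


if __name__ == "__main__":
    clause = rpn_where(
        [("RES(L23)", "=", "C"), ("antigen", "includes", "lysozyme")],
        ["and"],
    )
    print(clause)  # WHERE RES(L23) = C antigen includes lysozyme and
```

Note that the boolean operator follows the two conditions it combines, rather than sitting between them as it would in SQL.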
Note, in particular,
- Just entering a sequence will not give you any results!
- Spaces must be used around all operators. e.g. you must use RES(L23) = C rather than RES(L23)=C
- If you wish to search for strings containing spaces (e.g. in the antigen field), you must enclose the whole string in inverted commas (double or single).
- If you are comparing strings, you are probably best using the INCludes comparison operator rather than using = for an exact match.
- If you are searching for a sequence, you should use amino acid 1-letter code with no spaces. See the manual!
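The spacing and quoting rules above can be sketched as a small Python helper for anyone generating queries from a script. The function names are hypothetical and are not part of KabatMan, and antigen is assumed to be a valid column name for the example. How to escape a string that contains both kinds of inverted comma is not stated here, so the sketch refuses such input rather than guessing.

```python
def quote_property(value: str) -> str:
    """Wrap a value in inverted commas if it contains whitespace."""
    if not any(ch.isspace() for ch in value):
        return value  # single words need no quoting
    if '"' not in value:
        return f'"{value}"'
    if "'" not in value:
        return f"'{value}'"
    # Escaping rules are not documented on this page, so do not guess.
    raise ValueError("value contains both single and double inverted commas")


def clean_sequence(sequence: str) -> str:
    """Remove all whitespace from a 1-letter-code amino acid sequence."""
    return "".join(sequence.split())


def format_condition(column: str, comparison: str, value: str) -> str:
    """Format one condition with spaces around the comparison operator."""
    return f"{column} {comparison} {quote_property(value)}"


if __name__ == "__main__":
    print(format_condition("RES(L23)", "=", "C"))
    # RES(L23) = C
    print(format_condition("antigen", "includes", "hen egg lysozyme"))
    # antigen includes "hen egg lysozyme"
    print(clean_sequence("QVQ LVQ SG"))
    # QVQLVQSG
```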
Statistics on the information stored in the Kabat database.
Please note, there was a bug in V2.7-V2.9 in doing substring searches (e.g. for a sequence fragment). This is now fixed.
Please cite the following reference in any publications resulting from searches using this software:
Martin, A.C.R. Accessing the Kabat Antibody Sequence Database by Computer. PROTEINS: Structure, Function and Genetics, 25 (1996), 130-133.
Companies may use this public server, but need to be aware that data are not encrypted and it is not secure.
After trialing the system, companies should consider abYsis. A commercial licence will enable you to install a local version of this code together with an integrated database which can also store and analyse proprietary sequence and structure data.
For information on commercial licences, please contact the distributor Ebisu.