The CARD is curated by a group of experts in antimicrobial resistance (AMR) and bioinformatics, who consult outside experts where needed.
The CARD is updated monthly.
Only peer-reviewed, published data that are also associated with a GenBank accession can be included in the CARD. We are currently developing new tools that will allow the use of unpublished sequences and the development of private detection models within the CARD.
The CARD is now more tightly focussed on antimicrobial resistance (AMR) reference sequences and associated detection models. Each sequence curated into the CARD is now associated with both the Antibiotic Resistance Ontology to provide classification and semantic context as well as defined detection models and parameters. The CARD no longer curates entire genome or plasmid sequences. The CARD has additionally abandoned use of internal accessions for sequences and now exclusively uses GenBank accessions.
The CARD is based on reference sequences, so it does not fully annotate genomes. For example, while the CARD contains the canonical NDM-1 sequence from its first report in Klebsiella pneumoniae, it does not record all subsequent instances among other pathogens. We are working towards this level of data; in the meantime, PATRIC may have some of this information.
Yes, the Resistance Gene Identifier can now be downloaded as Linux command-line software.
Yes, the SNP mapping data is now available in the Downloads section, within the card.json and snps.txt files.
The CARD does not contain complete sequences of resistant mutants, because individual mutations are often reported in the literature without the complete mutant gene sequence being deposited in GenBank. Instead, the CARD maintains a complete list of all resistance SNPs relative to a reference sequence, which may be either a reported mutant sequence or a wild-type sequence. As such, it is important that SNP mapping be included in the analysis of any genes that require mutation to confer resistance. This step is included in the Resistance Gene Identifier but not in naive BLAST analyses.
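As a rough illustration of the SNP mapping step described above, the following Python sketch checks a query protein for curated substitutions. It is not the RGI implementation. The function names, the "S83L"-style notation and the assumption that the query has already been aligned to the reference without gaps (so residue positions correspond one-to-one) are all hypothetical simplifications.

```python
import re


def parse_snp(snp):
    """Split a substitution such as 'S83L' into (wild_type, position, mutant).

    The single-letter/position/single-letter notation is an assumption made
    for this sketch.
    """
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", snp)
    if match is None:
        raise ValueError(f"unrecognised SNP notation: {snp!r}")
    wild_type, position, mutant = match.groups()
    return wild_type, int(position), mutant


def find_resistance_snps(query, curated_snps):
    """Return the curated SNPs whose mutant residue is present in the query.

    Assumes (hypothetically) that `query` is a protein sequence whose
    positions match the reference numbering, i.e. an ungapped alignment.
    Positions are 1-based, as in the SNP notation.
    """
    found = []
    for snp in curated_snps:
        _, position, mutant = parse_snp(snp)
        if position <= len(query) and query[position - 1] == mutant:
            found.append(snp)
    return found


# Toy example with made-up sequences and SNPs.
print(find_resistance_snps("MLDLA", ["S2L", "D3N"]))
```

Because the check looks only at the mutant residue, it behaves the same whether the reference is a wild-type sequence or a reported mutant. A real pipeline would need a proper alignment to map positions when the query contains insertions or deletions.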
The CARD does not yet curate MIC data directly, but instead records the resistance profile of resistance genes. This is performed using the categorical confers_resistance_to relationship within the Antibiotic Resistance Ontology, e.g. beta-lactamases confers_resistance_to beta-lactams, as well as the specific confers_resistance_to_drug relationship, e.g. AAC(1) confers_resistance_to_drug apramycin. The latter requires constant curatorial effort and may have gaps; please let us know if you find such missing data within the CARD.
While the CARD systematically curates categorical confers_resistance_to relationships within the Antibiotic Resistance Ontology, e.g. beta-lactamases confers_resistance_to beta-lactams, curation of specific confers_resistance_to_drug relationships, e.g. AAC(1) confers_resistance_to_drug apramycin, is rarely complete due to the volume of literature to curate, variation in MICs for genes among pathogens, and changing clinical breakpoints. As such, curation of confers_resistance_to_drug relationships for accurate prediction of antibiogram is currently inconsistent throughout the CARD and our RGI software is focussed primarily upon accurate prediction of resistome, not antibiogram.
If a hit is PERFECT, the predicted gene perfectly matches a known resistance gene curated in the CARD at the amino acid level (including SNPs if that is part of the detection model). Only published AMR genes, with subsequent submission of sequence to GenBank, with clear evidence of elevated MICs are curated into CARD. However, a PERFECT hit does not indicate if the AMR gene is expressed or if it results in elevated MIC in the pathogen of interest. Activity of AMR genes can be pathogen and strain specific. STRICT hits are not exact matches to a published AMR sequence, but are similar to CARD reference sequences within detection model cut-offs defined by the CARD curators. STRICT hits are likely functional, but those with low percent similarity to the curated CARD reference sequence may require experimental verification.
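The distinction between PERFECT and STRICT hits can be sketched in a few lines of Python. This is only an illustration of the logic described above, not the RGI code: the function name, the generic `score` and `cutoff` parameters, and the decision to return None for anything below the cut-off are assumptions, and the sketch ignores SNP screening for models that require it.

```python
def classify_hit(query_protein, reference_protein, score, cutoff):
    """Classify a predicted gene against one curated reference.

    PERFECT: the amino acid sequences are identical.
    STRICT:  not identical, but the similarity score meets the curated
             detection model cut-off.
    None:    below the cut-off (not discussed further in this sketch).

    `score` and `cutoff` are hypothetical stand-ins for whatever similarity
    measure and threshold a detection model defines.
    """
    if query_protein == reference_protein:
        return "PERFECT"
    if score >= cutoff:
        return "STRICT"
    return None


# Toy example with made-up sequences and scores.
print(classify_hit("MKTAY", "MKTAY", score=900, cutoff=500))
print(classify_hit("MKSAY", "MKTAY", score=650, cutoff=500))
```

Neither label says anything about expression or MIC in a given strain; the classification reflects sequence similarity only.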
Currently the RGI does not analyze metagenomics data, outside of a simple BLASTX algorithm (with SNP screening) available as a beta-test feature in the downloadable, command-line tool. We are actively developing new metagenomics algorithms. The default RGI behaviour attempts to predict complete open reading frames (ORFs) from submitted nucleotide data, which will fail for short metagenomic reads. In addition, metagenomic data exceeds the 20 Mb limit for the web interface. If you intend to use the CARD reference sequences for your own metagenomics pipeline, be sure to either only use references that exclude mutation in their resistance mechanism or use the curated CARD SNP data to add this screening step to your pipeline.
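For a custom metagenomics pipeline, the advice above amounts to separating references that confer resistance by presence alone from those that require mutation. A minimal Python sketch follows; the record layout (dictionaries with "name" and "resistance_snps" keys) is invented for illustration and is not the structure of card.json.

```python
def split_by_mutation_requirement(references):
    """Partition reference records by whether resistance depends on mutation.

    Each record is a hypothetical dictionary with a "resistance_snps" list;
    an empty list means the gene confers resistance by presence alone.
    """
    usable_directly = []
    needs_snp_screen = []
    for record in references:
        if record["resistance_snps"]:
            needs_snp_screen.append(record)
        else:
            usable_directly.append(record)
    return usable_directly, needs_snp_screen


# Toy records with placeholder names.
references = [
    {"name": "geneA", "resistance_snps": []},
    {"name": "geneB", "resistance_snps": ["S83L"]},
]
direct, screened = split_by_mutation_requirement(references)
print([r["name"] for r in direct], [r["name"] for r in screened])
```

References in the second group should only be reported as resistance determinants after the matching reads have been checked for the curated SNPs.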
From the NCBI BLAST Glossary, percent identity is the extent to which two (nucleotide or amino acid) sequences have the same residues at the same positions in an alignment, often expressed as a percentage. The expectation value or expect value represents the number of different alignments with scores equivalent to or better than the observed score that are expected to occur in a database search by chance. The lower the E value, the more significant the score and the alignment.
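Both quantities can be illustrated with a short Python sketch. The percent identity function counts identical, non-gap columns over the full alignment length. The expect value function uses the standard relationship between bit score and E value, E = m * n * 2^(-S'), where m and n are the query and database lengths. The function names are hypothetical, and BLAST itself applies effective (edge-corrected) lengths, so real reported values will differ somewhat.

```python
def percent_identity(aligned_a, aligned_b):
    """Percentage of alignment columns with identical residues.

    Both inputs are rows of the same pairwise alignment, with '-' for gaps.
    Gap columns count towards the alignment length but never as matches.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    if not aligned_a:
        raise ValueError("alignment is empty")
    matches = sum(
        1 for a, b in zip(aligned_a, aligned_b) if a == b and a != "-"
    )
    return 100.0 * matches / len(aligned_a)


def expect_value(bit_score, query_length, database_length):
    """Approximate E value from a bit score: E = m * n * 2**(-bit_score).

    Uses raw lengths rather than BLAST's effective lengths, so this is an
    approximation for illustration only.
    """
    return query_length * database_length * 2.0 ** (-bit_score)


# Toy example: three identical columns out of five.
print(percent_identity("MKT-A", "MKSLA"))
# A higher bit score gives a lower, more significant E value.
print(expect_value(10, 100, 1000), expect_value(20, 100, 1000))
```

The second call shows why a lower E value indicates a more significant alignment: each additional bit of score halves the number of equally good alignments expected by chance.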