RGI Resistance Gene Identifier

RGI Bulk Analysis: Do you want to use RGI to analyze a large number of genomes? It is now available as a downloadable command-line tool in the Download section of the CARD website.

The CARD includes a curated collection of antimicrobial resistance gene and mutation sequences, bioinformatics models for their detection, and software in the form of the Resistance Gene Identifier (RGI) for their detection in genome or protein sequences. It includes a common framework for the sharing of antimicrobial resistance data in the form of the novel Antibiotic Resistance Ontology (ARO), allowing the RGI to catalog resistome predictions by both drug class and resistance mechanism.

The RGI provides a preliminary annotation of your DNA or protein sequence(s), based upon the data available in CARD. The RGI currently analyzes only protein sequences: if genome sequences or assembly contigs are submitted, the RGI first predicts open reading frames using the Prodigal software (ignoring those less than 30 bp) and then analyzes the predicted protein sequences. The RGI currently supports two detection model types: protein homolog models (which use BLAST sequence similarity cut-offs to detect functional homologs of AMR genes) and protein variant models (which use curated SNP matrices to accurately differentiate between susceptible intrinsic genes and intrinsic genes that have acquired mutations conferring AMR).
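The length filter described above can be illustrated with a minimal Python sketch. It is not the RGI or Prodigal implementation: the function name `filter_short_orfs` and the (identifier, sequence) tuple layout are assumptions made for illustration, and only the 30 bp threshold comes from the text.

```python
# Illustrative only: the RGI delegates ORF prediction to Prodigal.
# The 30 bp threshold is taken from the description above; everything
# else (function name, tuple layout) is a hypothetical stand-in.
MIN_ORF_LENGTH_BP = 30


def filter_short_orfs(orfs, min_length_bp=MIN_ORF_LENGTH_BP):
    """Keep only ORFs whose nucleotide length is at least min_length_bp."""
    return [(orf_id, seq) for orf_id, seq in orfs if len(seq) >= min_length_bp]


example_orfs = [
    ("orf_1", "ATG" + "GCT" * 20 + "TAA"),  # 66 bp, kept
    ("orf_2", "ATGGCTTAA"),                 # 9 bp, ignored
]
kept = filter_short_orfs(example_orfs)
print([orf_id for orf_id, _ in kept])
```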

Disclaimer: As the CARD curation evolves, the results of the RGI evolve. RGI targets, reference sequences, and significance cut-offs are under constant curation.

The RGI analyzes sequences under three paradigms – Perfect, Strict, and Loose (a.k.a. Discovery). The Perfect algorithm is most often applied to clinical surveillance as it detects perfect matches to the curated reference sequences and mutations in the CARD. In contrast, the Strict algorithm detects previously unknown variants of known AMR genes, including a secondary screen for key mutations, using detection models with curated similarity cut-offs to ensure the detected variant is likely a functional AMR gene. The Loose algorithm works outside of the detection model cut-offs to provide detection of new, emergent threats and more distant homologs of AMR genes, but will also catalog homologous sequences and spurious partial hits that may not have a role in AMR. Combined with phenotypic screening, the Loose algorithm allows researchers to home in on new AMR genes.
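As a rough illustration of how the three paradigms relate, the sketch below assigns a label to a single hit. It is a simplification under stated assumptions, not the RGI's actual logic: the `Hit` fields are hypothetical names, the curated similarity cut-off is assumed to be a bit score, a perfect match is assumed to mean 100% identity over the full reference length, and the secondary mutation screen used by protein variant models is left out.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    # All field names are hypothetical and chosen for this sketch only.
    percent_identity: float       # identity to the reference sequence, 0-100
    query_length: int             # length of the query protein
    reference_length: int         # length of the curated reference protein
    bitscore: float               # similarity score of the alignment
    model_bitscore_cutoff: float  # assumed form of the curated cut-off


def classify_hit(hit):
    """Return 'Perfect', 'Strict', or 'Loose' for a single hit."""
    is_perfect = (
        hit.percent_identity == 100.0
        and hit.query_length == hit.reference_length
    )
    if is_perfect:
        return "Perfect"
    # Inside the curated cut-off: likely a functional variant.
    if hit.bitscore >= hit.model_bitscore_cutoff:
        return "Strict"
    # Outside the cut-off: a more distant homolog or a spurious hit.
    return "Loose"
```

Checking for a perfect match first means an identical sequence is never reported as merely Strict, which mirrors the ordering of the paradigms in the description above.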

New: The online RGI now caches results for each session for 7 days, allowing you to work on multiple data sets using the web interface.

Upload external RGI JSON results and visualize:

Upload a JSON file containing RGI results generated using the command-line version. File size is limited to 20 MB. Note that Loose hits can be visualized only if they have an e-value of 1e-10 or better.
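Before uploading, it can help to check a results file locally. The sketch below is one possible pre-check, not part of the RGI: the function names are hypothetical, the size limit is assumed to mean 20 × 1024 × 1024 bytes, and "e-10 or better" is assumed to mean an e-value of at most 1e-10. It makes no assumption about the internal layout of the RGI JSON.

```python
import json
import os

# Assumption: the stated 20 Mb upload limit is read as 20 MiB.
MAX_UPLOAD_BYTES = 20 * 1024 * 1024
# Assumption: "e-10 or better" is read as an e-value of at most 1e-10.
LOOSE_EVALUE_LIMIT = 1e-10


def check_json_upload(path, max_bytes=MAX_UPLOAD_BYTES):
    """Return the parsed JSON if the file is within the size limit and well-formed."""
    size = os.path.getsize(path)
    if size > max_bytes:
        raise ValueError(f"file is {size} bytes, over the {max_bytes}-byte limit")
    with open(path, "r", encoding="utf-8") as handle:
        try:
            return json.load(handle)
        except json.JSONDecodeError as err:
            raise ValueError(f"file is not valid JSON: {err}") from err


def is_visualizable(paradigm, evalue):
    """Perfect and Strict hits always qualify; Loose hits need a small e-value."""
    if paradigm != "Loose":
        return True
    return evalue <= LOOSE_EVALUE_LIMIT
```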

Use the RGI:

Nucleotide sequences will undergo ORF calling to generate predicted protein sequences. Short or partial gene sequences are unlikely to work. Examples: JN420336.1, AY123251.1, HQ451074.1, AL123456

Upload a plain text file containing DNA or protein sequence(s) in FASTA format (20 MB limit). The file can contain more than one FASTA-formatted sequence, such as assembly contigs or multiple proteins. Each file will be treated as a single sample.
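To confirm that a file really contains one or more FASTA-formatted sequences before submitting it, a small parser such as the following can be used. This is a minimal sketch with a hypothetical function name (`parse_fasta`); it does not reproduce the validation the RGI itself performs.

```python
def parse_fasta(text):
    """Parse FASTA text into a list of (header, sequence) tuples."""
    records = []
    header, chunks = None, []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # tolerate blank lines between records
        if line.startswith(">"):
            # A new header closes the previous record, if there is one.
            if header is not None:
                records.append((header, "".join(chunks)))
            header, chunks = line[1:], []
        elif header is None:
            raise ValueError("sequence data found before the first '>' header")
        else:
            chunks.append(line)
    if header is not None:
        records.append((header, "".join(chunks)))
    return records


example = """>contig_1
ATGGCTGCT
GCTTAA
>contig_2
ATGAAATAG
"""
for name, seq in parse_fasta(example):
    print(name, len(seq))
```

All records parsed from one file correspond to a single sample, in line with the note above.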