DAVID Bioinformatics Resources
Database for Annotation, Visualization, and Integrated Discovery (DAVID)
The "Elevator Pitch" That Changed Genomics
In the early 2000s, a biologist named Dr. Da Wei Huang had a frustrating problem. He had just run a microarray experiment and had a list of 500 genes that were "differentially expressed." He knew the names of these genes (BRCA1, TP53, AKT1), but he had no idea what they meant together.
Conclusion
The Legacy: Democratizing Bioinformatics
The most important story of DAVID is not about algorithms; it's about accessibility. Before DAVID, you needed a bioinformatics PhD to find the functional themes in a gene list. After DAVID, a first-year graduate student with a web browser could do it in five minutes.
DAVID draws its annotations from sources such as:
- Gene Ontology (GO): Biological Process, Cellular Component, Molecular Function.
- Pathway Databases: KEGG, BioCarta, Reactome, PANTHER.
- Protein Domains: InterPro, SMART, Pfam, PROSITE.
- Disease Associations: OMIM, GAD (Genetic Association Database).
- Tissue Expression: UniGene, ESTs.
- Literature: PubMed Central.
DAVID is commonly applied in several areas:
- Gene expression analysis: DAVID is used to interpret lists of differentially expressed genes from microarray and RNA-seq experiments and to understand their functional significance.
- Systems biology: DAVID is used to study complex biological systems, including protein-protein interactions, genetic interactions, and metabolic networks.
- Cancer research: DAVID is used to annotate gene lists derived from cancer samples, helping to identify potential therapeutic targets and understand the molecular mechanisms of cancer progression.
- Drug discovery: DAVID is used to annotate gene lists from drug-treated cells, helping to identify potential off-target effects and understand the mechanisms of action of small molecules.
As microarray and RNA-seq technologies exploded, producing lists of 2,000 differentially expressed genes became routine. The manual approach became impossible. Researchers needed a way to automate the search for patterns, or "enrichment," within their data.
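To make "enrichment" concrete, here is a minimal sketch of the kind of test involved: a one-sided hypergeometric (Fisher-style) p-value and a fold enrichment for a single annotation term. The function names and the counts are hypothetical, chosen only for illustration. DAVID itself reports a more conservative variant of this statistic (the EASE score), so these numbers should not be expected to match its output.

```python
from math import comb


def hypergeom_upper_tail(k, n, K, N):
    """P(X >= k) when n genes are drawn from a population of N genes,
    K of which carry the annotation term."""
    upper = min(n, K)
    total = comb(N, n)
    favorable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favorable / total


def fold_enrichment(k, n, K, N):
    """Proportion of term genes in the list divided by the proportion in the population."""
    return (k / n) / (K / N)


# Hypothetical counts (not real data): 10 of 500 list genes carry the term,
# versus 100 of 20,000 genes in the background population.
list_hits, list_total = 10, 500
pop_hits, pop_total = 100, 20000

p_value = hypergeom_upper_tail(list_hits, list_total, pop_hits, pop_total)
fold = fold_enrichment(list_hits, list_total, pop_hits, pop_total)
print(f"fold enrichment = {fold:.2f}, p-value = {p_value:.3g}")
```

Running this test once per term across thousands of terms is what makes multiple-testing correction necessary.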
Typical workflow (step-by-step)
- Prepare gene list: one identifier per line; specify species and ID type (Entrez Gene ID, Ensembl, gene symbol, etc.).
- Upload list to DAVID (or paste into input box). Optionally upload a background list.
- Choose annotation categories to include (GO BP/MF/CC, KEGG, Reactome, InterPro, Pfam, OMIM, PharmGKB, UniProt keywords, tissue expression).
- Run Functional Annotation Chart to get enriched terms with p-values, FDR, and fold enrichment.
- Use Functional Annotation Clustering to reduce redundancy across related terms and identify broader biological themes.
- Inspect per-gene annotation table to see which genes drive each enriched term.
- Export results: tables (TSV/CSV), images of visualizations, or session files for later use.
- (Optional) Automate via API for large-scale analyses or integration into pipelines.
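
The first and last steps of this workflow are easy to script locally. The sketch below is a rough illustration, not DAVID's own tooling: it prepares a one-identifier-per-line gene list and filters an exported chart table by FDR and fold enrichment. The helper names, the thresholds, the column headers ("Term", "Fold Enrichment", "FDR", "Genes"), and the sample rows are all assumptions; check them against the header row of your actual export.

```python
import csv
import io


def format_gene_list(genes):
    """Return one identifier per line, stripped, with blanks and duplicates removed."""
    unique = dict.fromkeys(g.strip() for g in genes if g.strip())
    return "\n".join(unique) + "\n"


def filter_chart(tsv_text, max_fdr=0.05, min_fold=1.5):
    """Keep rows passing both thresholds, sorted by ascending FDR.

    Column names are assumed, not guaranteed to match a given export.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    kept = [
        row
        for row in reader
        if float(row["FDR"]) <= max_fdr and float(row["Fold Enrichment"]) >= min_fold
    ]
    kept.sort(key=lambda row: float(row["FDR"]))
    return kept


# Made-up rows standing in for an exported chart (placeholder terms, not real results).
SAMPLE_TSV = (
    "Term\tFold Enrichment\tFDR\tGenes\n"
    "term_A\t3.2\t0.01\tTP53, BRCA1\n"
    "term_B\t1.1\t0.20\tAKT1\n"
    "term_C\t2.0\t0.001\tTP53\n"
)

print(format_gene_list(["TP53", " BRCA1 ", "TP53", ""]), end="")
for row in filter_chart(SAMPLE_TSV):
    print(row["Term"], row["FDR"], row["Genes"])
```

Filtering on both FDR and fold enrichment is a design choice: a term can be statistically significant yet only marginally over-represented, and requiring both keeps the list focused on the stronger themes.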