D2H2: diabetes data and hypothesis hub

Giacomo B. Marino, Nasheath Ahmed, Zhuorui Xie, Kathleen M. Jagodnik, Jason Han, Daniel J.B. Clarke, Alexander Lachmann, Mark P. Keller, Alan D. Attie, Avi Ma'Ayan

Research output: Contribution to journal > Article > peer-review


Motivation: There is a rapid growth in the production of omics datasets collected by the diabetes research community. However, such published data are underutilized for knowledge discovery. To make bioinformatics tools and published omics datasets from the diabetes field more accessible to biomedical researchers, we developed the Diabetes Data and Hypothesis Hub (D2H2).

Results: D2H2 contains hundreds of high-quality curated transcriptomics datasets relevant to diabetes, accessible via a user-friendly web-based portal. The collected and processed datasets are curated from the Gene Expression Omnibus (GEO). Each curated study has a dedicated page that provides data visualization, differential gene expression analysis, and single-gene queries. To enable the investigation of these curated datasets and to provide easy access to bioinformatics tools that serve gene and gene set-related knowledge, we developed the D2H2 chatbot. Utilizing GPT, we prompt users to enter free text about their data analysis needs. Parsing the user prompt, together with specifying information about all D2H2 available tools and workflows, we answer user queries by invoking the most relevant tools via the tools' API. D2H2 also has a hypotheses generation module where gene sets are randomly selected from the bulk RNA-seq precomputed signatures. We then find highly overlapping gene sets extracted from publications listed in PubMed Central with abstract dissimilarity. With the help of GPT, we speculate about a possible explanation of the high overlap between the gene sets. Overall, D2H2 is a platform that provides a suite of bioinformatics tools and curated transcriptomics datasets for hypothesis generation.
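The overlap step of the hypothesis generation module can be illustrated with a minimal sketch. The abstract does not say which overlap statistic D2H2 uses, so the Jaccard index here is an assumption, and the function names, gene sets, and publication labels are hypothetical toy values rather than D2H2 data or its API.

```python
from typing import Dict, Iterable, List, Tuple


def jaccard(a: Iterable[str], b: Iterable[str]) -> float:
    """Jaccard index of two gene sets: |A & B| / |A | B| (0.0 if both are empty)."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def rank_by_overlap(
    query: Iterable[str],
    library: Dict[str, Iterable[str]],
    top_n: int = 3,
) -> List[Tuple[str, int, float]]:
    """Return (label, shared gene count, Jaccard index) for the top_n library sets."""
    query_set = set(query)
    scored = [
        (label, len(query_set & set(genes)), jaccard(query_set, genes))
        for label, genes in library.items()
    ]
    # Highest Jaccard index first.
    scored.sort(key=lambda item: item[2], reverse=True)
    return scored[:top_n]


# Hypothetical signature and publication-derived gene sets (toy values only).
signature = {"INS", "GCG", "PDX1", "MAFA", "SLC2A2"}
library = {
    "paper_A": {"INS", "PDX1", "MAFA", "NKX6-1"},
    "paper_B": {"TNF", "IL6", "INS"},
    "paper_C": {"ALB", "APOA1"},
}

ranking = rank_by_overlap(signature, library)
```

In this toy case `paper_A` ranks first because it shares three of the five signature genes. A real pipeline would likely also test the overlap for statistical significance and then filter for publications whose abstracts are dissimilar, as the abstract describes.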

Original language: English
Article number: vbad178
Journal: Bioinformatics Advances
Issue number: 1
State: Published - 2023
Externally published: Yes

Bibliographical note

Publisher Copyright:
© The Author(s) 2023. Published by Oxford University Press.


This work was supported by the National Institutes of Health (NIH) [RC2DK131995].

Funder: National Institutes of Health
Funder number: RC2DK131995

