SHHS Genetic Analyses - dbGaP

1 post
Austin +0 points · over 5 years ago


I am interested in conducting genetic analyses using the Sleep Heart Health Study (SHHS) data and would appreciate recommendations about where to begin.

I see that there is a dbGaP page for SHHS data, but as far as I can tell it is only available for those recruited from Framingham (https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/study.cgi?study_id=phs000137.v10.p5).

I am hoping to derive my outcome variables from the raw SHHS polysomnography data available.

Here are my questions:

  1. Is there a way to access genetic data for all SHHS participants in dbGaP without individually downloading dbGaP files for each cohort involved in SHHS?
  2. I understand that study IDs are removed before uploading to dbGaP. Is there a central coordinating center that could convert the IDs on the polysomnography data available through SHHS to dbGaP IDs so that I could conduct my analysis? If I can get my hands on relevant genetic info, I would like to link it to polysomnography data without compromising study IDs.

Thanks! Austin

462 posts
mrueschman +0 points · over 5 years ago


Thanks for inquiring and for your interest in NSRR/dbGaP data. I am going to ping a couple others on the NSRR team who are more familiar with dbGaP and ask them to comment if they have additional information.

Here's my best attempt at answering your questions:

  1. Not to my knowledge. I think you are locked into going through Framingham, CHS, and ARIC on dbGaP separately.
  2. We made an effort a couple of years ago to create and attach ID linking files (NSRR <-> dbGaP) to the cohort-level dbGaP repositories. Based on my records, I believe this was completed for Framingham and (maybe) CHS. Once you have access to both data sources, you could assess what you have and then follow up with additional questions. Even if the ID translations are not immediately apparent, we can probably help in figuring out a solution.

Good luck, please let us know if we can assist further!


1 post
briancade +1 point · over 5 years ago

Hi Austin,

Genotype data are available for three parent cohorts in SHHS: ARIC, CHS, and FHS. No joint genotyping data are available, but some consortia data (e.g. publicly released versions of CHARGE WES/WGS and TOPMed WGS) may be more amenable to merging (though these too will be split at a dbGaP study level).

Depending on your goals, it may make sense to work with summary data already deposited in dbGaP. For PSG, your best options are:

  - ARIC: pht004228.v2.p1 (uc6453)
  - CHS: pht003699.v2.p1 (SHHS1_PSG)
  - FHS: pht000395.v9.p11 (sleep1_1998s) [Offspring only, no genotypes for the Omni cohort]

Other datasets also exist (e.g. SHHS2 PSG, questionnaire data, etc.). You can search for the pht number (e.g. 'pht000395') from the main dbGaP page for more information on particular datasets.

If these variables don't suit your needs, then you can use ID translations that have been set up by the NSRR team working with dbGaP and the parent cohorts (I'm not personally involved in this). This is still in progress, and there's no guarantee that links will be in place for all cohorts, as the parent studies need to sign off. This has been coordinated with dbGaP, so there won't be any issues with compromising study IDs. A general introduction is here: https://www.ncbi.nlm.nih.gov/projects/gap/cgi-bin/document.cgi?study_id=phs000287.v6.p1&phd=5259

For CHS, you can use pht005388 (nsrr_chs_id_link2). For FHS, you can use pht007767 (nsrrid_fhsdbgapid_link_shareid). Note here that SHARe IDs are a secondary dbGaP ID set.

You'd then need to work within dbGaP's ID structure. There are subject-sample mapping files that link a single participant ID to one or more sample IDs (e.g. in FHS, the same person's unique dbGaP subject ID may map to different sample IDs for Affy 500k, Omni 5M, etc.).
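To make the two-step linkage above concrete, here is a minimal Python sketch, not an official NSRR or dbGaP tool. It assumes you have already loaded three tables as lists of dicts: your derived PSG outcomes, an ID link table (such as the ones named above), and a subject-sample mapping. The function name and every column name (`nsrrid`, `dbgap_subject_id`, `sample_id`, `ahi`) are hypothetical placeholders; check the data dictionaries of the actual files for the real variable names.

```python
from collections import defaultdict

def link_psg_to_samples(psg_rows, id_link_rows, subject_sample_rows):
    """Attach dbGaP subject and sample IDs to PSG records.

    All column names are hypothetical; rename them to match the real files.
    """
    # Step 1: NSRR ID -> dbGaP subject ID (assumed one-to-one here).
    nsrr_to_subject = {r["nsrrid"]: r["dbgap_subject_id"] for r in id_link_rows}

    # Step 2: dbGaP subject ID -> sample IDs (one-to-many, since one
    # person may have been genotyped on several platforms).
    subject_to_samples = defaultdict(list)
    for r in subject_sample_rows:
        subject_to_samples[r["dbgap_subject_id"]].append(r["sample_id"])

    linked, unlinked = [], []
    for row in psg_rows:
        subject = nsrr_to_subject.get(row["nsrrid"])
        if subject is None:
            # No link available (e.g. the parent study has not signed off).
            unlinked.append(row["nsrrid"])
            continue
        linked.append({
            "dbgap_subject_id": subject,
            "sample_ids": subject_to_samples.get(subject, []),
            "ahi": row["ahi"],
        })
    return linked, unlinked

# Toy data for illustration only; these are not real IDs or values.
psg = [
    {"nsrrid": "N1", "ahi": 12.5},
    {"nsrrid": "N2", "ahi": 3.1},
    {"nsrrid": "N3", "ahi": 7.0},
]
links = [
    {"nsrrid": "N1", "dbgap_subject_id": "S100"},
    {"nsrrid": "N2", "dbgap_subject_id": "S200"},
]
samples = [
    {"dbgap_subject_id": "S100", "sample_id": "A1"},
    {"dbgap_subject_id": "S100", "sample_id": "B1"},
]

linked, unlinked = link_psg_to_samples(psg, links, samples)
```

Keeping the unlinked IDs in a separate list makes it easy to see how many participants drop out at the linking step, and an empty `sample_ids` list flags people who are linked but have no genotype sample in the mapping you loaded.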

We may have similar goals (I've run multiple GWAS on these and other datasets). If you're interested in potentially collaborating, I'd be happy to discuss offline.

Best, Brian Cade