mrueschman
Joined Oct 2013
Bio

Boston, MA

Alexander,

Thanks for raising this issue -- it is an important one. There is a bit of documentation missing that would have helped you understand the missingness in cai4p and oahi. These variables have been filtered and many values have been censored from the dataset. The bigger issue is that we don't have documentation on sleepdata.org that describes which filters have been applied and to which variables. For SHHS, we are mostly in the dark because the original (filtered) analytic datasets were generated 20 years ago and I have not come across the data processing code that would show exactly what was done. The task of reverse engineering all the filters and documenting them has been on my back burner for a while now.

Based on prior experience, I made an educated guess that cai4p was filtered by chstqual (quality of chest signal) and abdoqual (quality of abdomen signal), and this seems to be correct. The signal quality variables in SHHS1 run from 1 (lowest) to 4 (highest), and some quick tinkering led me to this formula:

if chstqual in (3,4) and abdoqual in (3,4) then cai4p_new = 60 * ( carbp4 + carop4 + canbp4 + canop4 ) / slpprdp;

cai4p_new then has 4,406 valid values and 1,398 missing values, like the cai4p variable you are working with.
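
If you want to check this yourself, a minimal sketch along the following lines should let you compare the counts. The dataset name `shhs1` is a placeholder for however you imported the SHHS1 file, and `shhs1_check` and `cai4p_new` are illustrative names rather than NSRR variables.

```sas
data shhs1_check;
    set shhs1;   /* assumed name of your imported SHHS1 dataset */
    /* recompute the index only when both effort signals are rated 3 or 4 */
    if chstqual in (3,4) and abdoqual in (3,4) then
        cai4p_new = 60 * (carbp4 + carop4 + canbp4 + canop4) / slpprdp;
run;

/* N and NMISS for the released and the recomputed variable, side by side */
proc means data=shhs1_check n nmiss min max;
    var cai4p cai4p_new;
run;
```

If the guess about the filter is right, the N and NMISS columns for cai4p and cai4p_new should agree.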

These filters were applied with the mindset of only retaining AHI values where the corresponding scoring signals (e.g. effort channels for indices of central sleep apnea) were of good or better quality. I will work with my colleagues here to try to prioritize writing some documentation that describes this (currently) "hidden" filtering and/or reverse engineering some of these filters and presenting the filtering code alongside the calculation.

Thanks for checking out the site and bringing this topic to the forum!

I dug through the archives and found some SAS code from long ago. I went ahead and tried the below against the CFS dataset here on the NSRR and was able to replicate dayslp_dur_hr and other related variables. At a glance, in comparing with daybed and daywake, the results look correct.

There was a manual correction as well to one of the weekend times -- I left that out of the code snippet.

If you continue to encounter differences and find that the dayslp_dur_hr values do not correctly line up with the component variables, please let us know and we can look to make dataset corrections. Thanks!

SAS macros:

* macro to create time variables from 24-hour times collected as integers;
%macro fixtimes(varin,varout);
        if &varin >=0 then do;
                 &varin.c = trim(left(put(&varin, 8. )));
                 &varout = input(substr(trim(left(reverse(substr(trim(left(reverse(&varin.c)!! '0000' )), 1 , 4 )))), 1 , 2 )
                                   !! ':' !! substr(trim(left(reverse(substr(trim(left(reverse(&varin.c)!! '0000' )), 1 , 4 )))), 3 , 2 ), time5. );
        end;
        format &varout time5. ;
        drop &varin.c;
%mend ;
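
The nested reverse/substr calls in the macro left-pad the integer to four digits (for example, 930 becomes 0930) before splitting it into hours and minutes. As a rough illustration of that idea only, and not the code that produced the dataset, the same padding can be sketched with the z4. format. The variable names `raw`, `padded`, and `t` are made up for this example, and it assumes the input is a whole number between 0 and 2359.

```sas
data _null_;
    do raw = 5, 930, 2245;              /* example integer clock times */
        padded = put(raw, z4.);         /* zero-pad to four digits, e.g. 930 -> 0930 */
        /* insert the colon and read the string as a SAS time value */
        t = input(substr(padded, 1, 2) !! ':' !! substr(padded, 3, 2), time5.);
        put raw= padded= t= time5.;     /* write the values to the log for inspection */
    end;
run;
```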

Transformation code:

*******************************************************************************;
    * SELF-REPORTED SLEEP TIME
    *******************************************************************************;
        * FIX SELF-REPORTED BED AND WAKE TIMES (WEEKDAYS);
        %fixtimes(DAYBED ,DAYSLP_time);
        %fixtimes(DAYWAKE ,DAYWAKE_time);

         * if slptime before midnight and waketime after midnight set date as previous day;
         if hour(DAYSLP_time) > 12 and hour(DAYWAKE_time) <=12 then DAYSLP_time2 = dhms(date()- 1 ,hour(DAYSLP_time),minute(DAYSLP_time), 0 );
         * if both slptime  and waketime before midnight then use current day;
         else if hour(DAYSLP_time) > 12 and hour(DAYWAKE_time) >12 then DAYSLP_time2 = dhms(date() ,hour(DAYSLP_time),minute(DAYSLP_time), 0 );
         * otherwise use current day;
         else DAYSLP_time2 = dhms(date(),hour(DAYSLP_time),minute(DAYSLP_time), 0 );

         * if   waketime at midnight then waketime use next day;
         if  hour(DAYWAKE_time) =0 then DAYWAKE_time2 = dhms(date()+1 ,hour(DAYWAKE_time),minute(DAYWAKE_time), 0 );
         * otherwise use current day for waketime;
         else DAYWAKE_time2 = dhms(date(),hour(DAYWAKE_time),minute(DAYWAKE_time), 0 );

        * calculate midpoint of bed and wake times;
         DAYMID_time2 = DAYSLP_time2 + (DAYWAKE_time2 - DAYSLP_time2)/2;
         DAYMID_time = timepart(DAYMID_time2);

        format daymid_time time5. DAYSLP_time2 DAYWAKE_time2 DAYMID_time2 datetime16.;

         * calculate time in bed as difference between wake and bed time (accounting for change in midnight);
         DAYSLP_dur_mn  = (DAYWAKE_time2 - DAYSLP_time2)/ 60;
         DAYSLP_dur_hr2  = (DAYWAKE_time2 - DAYSLP_time2)/ 3600;
         Label    DAYSLP_dur_mn  ='calculated sleep duration (minutes) during weekday-by DAYWAKE -DAYSLP '
                  DAYSLP_dur_hr  ='calculated sleep duration (hours) during weekday -by DAYWAKE -DAYSLP ';

        drop daywake_time2 dayslp_time2 daymid_time2;
        *****************************************************************************************************************;

        * FIX SELF-REPORTED BED AND WAKE TIMES (WEEKENDS);
        %fixtimes(ENDBED ,ENDSLP_time);
        %fixtimes(ENDWAKE ,ENDWAKE_time);

         * if slptime before midnight and waketime after midnight, set date as previous day;
         if hour(ENDSLP_time) > 12 and hour(ENDWAKE_time) <=12 then ENDSLP_time2 = dhms(date()- 1 ,hour(ENDSLP_time),minute(ENDSLP_time), 0 );
         * if both slptime  and waketime before midnight then use current day;
         else if hour(ENDSLP_time) > 12 and hour(ENDWAKE_time) >12 then ENDSLP_time2 = dhms(date() ,hour(ENDSLP_time),minute(ENDSLP_time), 0 );
         * otherwise use current day;
         else ENDSLP_time2 = dhms(date(),hour(ENDSLP_time),minute(ENDSLP_time), 0 );

         * if waketime at midnight then waketime use next day;
         if  hour(ENDWAKE_time) =0 then ENDWAKE_time2 = dhms(date()+1,hour(ENDWAKE_time),minute(ENDWAKE_time), 0 );
         * otherwise use current day for  wake time;
         else ENDWAKE_time2 = dhms(date(),hour(ENDWAKE_time),minute(ENDWAKE_time), 0 );
         *manual fix for this one (otherwise end_dur_hr is negative);
         if  obf_pptid = 802529 then ENDWAKE_time2 = dhms(date()+1,hour(ENDWAKE_time),minute(ENDWAKE_time), 0 );

        * calculate midpoint of bed and wake times;
         ENDMID_time2 = ENDSLP_time2 + (ENDWAKE_time2 - ENDSLP_time2)/2;
         ENDMID_time = timepart(ENDMID_time2);

         format ENDmid_time time5. ENDSLP_time2 ENDWAKE_time2 ENDMID_time2 datetime16.;

         * calculate time in bed as difference between wake and bed time (accounting for change in midnight);
         END_dur_mn  = (ENDWAKE_time2 - ENDSLP_time2)/ 60;
         END_dur_hr  = (ENDWAKE_time2 - ENDSLP_time2)/ 3600;
         Label    END_dur_mn  ='calculated sleep duration (minutes) during weekend -by ENDWAKE  - ENDSLP '
                  END_dur_hr  ='calculated sleep duration (hours) during weekend-by ENDWAKE  - ENDSLP ';

        drop endwake_time2 endslp_time2 endmid_time2;
        *****************************************************************************************************************;
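
To see where the recomputed weekday duration departs from the released one, a quick check along these lines may help. It is only a sketch: `cfs_times` stands in for the output of the data step that ran the code above, and `cfs_check`, `dur_diff`, and the 0.01-hour tolerance are arbitrary choices for illustration.

```sas
data cfs_check;
    set cfs_times;   /* assumed: output of the data step that ran the code above */
    /* compare only when both the released and recomputed values are present */
    if n(dayslp_dur_hr, dayslp_dur_hr2) = 2 then
        dur_diff = dayslp_dur_hr - dayslp_dur_hr2;
    keep obf_pptid daybed daywake dayslp_dur_hr dayslp_dur_hr2 dur_diff;
run;

/* list only the records where the two versions disagree */
proc print data=cfs_check;
    where abs(dur_diff) > 0.01;   /* tolerance in hours; adjust as needed */
run;
```

Any rows that print are the ones worth checking against daybed and daywake by hand.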