Ok, Michael. Thank you for your prompt reply!
It is mentioned everywhere that shhs1 consists of 5804 records, but there are only 5793 EDF records on the server. Why? Is it possible to download the remaining 11 records?
Thanks a lot for the comprehensive answer!
This isn't a question, just a small note. Maybe it will be useful to someone.
I need to select the records in shhs1 with AHI < 5. There is no AHI variable in shhs1-dataset-0.7.0.csv, but it can be calculated as AHI = cai4p + oahi.
cai4p - https://sleepdata.org/datasets/shhs/variables/cai4p, oahi - https://sleepdata.org/datasets/shhs/variables/oahi.
However, 698 oahi values and 1398 cai4p values are missing. The two variables are defined as follows:
oahi = 60 * ( hrembp4 + hrop4 + hnrbp4 + hnrop4 + oarbp + oarop + oanbp + oanop ) / slpprdp,
cai4p = 60 * ( carbp4 + carop4 + canbp4 + canop4 ) / slpprdp.
So we would expect at least one of these component variables to be missing whenever oahi or cai4p is missing, but none of them are. Thus, we can calculate cai4p, oahi and AHI for every record in shhs1. Here is the Python code for it:
import pandas as pd
data = pd.read_csv('shhs1-dataset-0.7.0.csv')
# Missing values in the precomputed indices shipped in the CSV
print('Missing cai4p values in the CSV:', data['cai4p'].isnull().sum())
print('Missing oahi values in the CSV:', data['oahi'].isnull().sum())
# Recompute both indices (events per hour of sleep) from the raw event counts.
# Note: sum(axis=1) skips NaN by default, so a missing count is treated as 0.
cai4p = 60 * data[['CAROP4', 'CARBP4', 'CANBP4', 'CANOP4']].sum(axis=1) / data['SlpPrdP']
oahi = 60 * data[['HREMBP4', 'HROP4', 'HNRBP4', 'HNROP4', 'OARBP', 'OAROP', 'OANBP', 'OANOP']].sum(axis=1) / data['SlpPrdP']
print('Missing cai4p values after recomputation:', cai4p.isnull().sum())
print('Missing oahi values after recomputation:', oahi.isnull().sum())
AHI = cai4p + oahi
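One thing that may be worth double-checking in the code above: pandas' sum over columns skips NaN by default (skipna=True), so a record with a missing event count still gets a number instead of NaN. Below is a minimal sketch on made-up numbers that compares the default with skipna=False and then picks the records with AHI < 5. The toy table, its values and the helper events_per_hour are hypothetical and only for illustration. The column names are the ones used above, and SlpPrdP is assumed to be sleep time in minutes, as the factor 60 in the formulas suggests.

```python
import numpy as np
import pandas as pd

CENTRAL = ['CARBP4', 'CAROP4', 'CANBP4', 'CANOP4']
OBSTRUCTIVE = ['HREMBP4', 'HROP4', 'HNRBP4', 'HNROP4',
               'OARBP', 'OAROP', 'OANBP', 'OANOP']


def events_per_hour(df, cols, skipna=True):
    """Hypothetical helper: summed event counts per hour of sleep.

    SlpPrdP is assumed to be in minutes (hence the factor 60).
    """
    return 60 * df[cols].sum(axis=1, skipna=skipna) / df['SlpPrdP']


# Made-up toy data (NOT real SHHS values); record 1 has one missing count.
toy = pd.DataFrame({col: [0.0, 0.0, 3.0] for col in CENTRAL})
toy['CARBP4'] = [1.0, np.nan, 3.0]
for col in OBSTRUCTIVE:
    toy[col] = [1.0, 2.0, 5.0]
toy['SlpPrdP'] = [360.0, 480.0, 300.0]

# Default behaviour: the missing count is silently treated as 0.
ahi_lenient = events_per_hour(toy, CENTRAL) + events_per_hour(toy, OBSTRUCTIVE)

# Strict behaviour: any missing count makes the whole index NaN.
ahi_strict = (events_per_hour(toy, CENTRAL, skipna=False)
              + events_per_hour(toy, OBSTRUCTIVE, skipna=False))

print('Missing AHI (default sum):', ahi_lenient.isnull().sum())
print('Missing AHI (skipna=False):', ahi_strict.isnull().sum())

# Select records with AHI < 5; a NaN never satisfies the comparison.
low_lenient = ahi_lenient[ahi_lenient < 5].index.tolist()
low_strict = ahi_strict[ahi_strict < 5].index.tolist()
print('AHI < 5 (default sum):', low_lenient)
print('AHI < 5 (skipna=False):', low_strict)
```

If the strict version reports the same number of missing values on the real CSV as the default one, then the recomputed cai4p, oahi and AHI can be trusted for every record. If not, the records with missing counts probably need to be handled separately before filtering by AHI < 5.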