The OAHI and many other variables are available in the CSV datasets in this folder: https://sleepdata.org/datasets/shhs/files/datasets
You could report the issue to the developer here: https://github.com/Teuniz/EDFbrowser
I'll ask members of our team here if they have used EDFbrowser. We don't have any direct affiliation with Teuniz, the EDFbrowser developer.
Edit: My guess is that EDFbrowser doesn't save/re-write EDF files. I suppose the changes you made using "Organize" would then be saved/loaded with a montage file.
The breakdown of the NSRR SHHS dataset, as originally taken from BioLINCC, goes like:
Thanks for checking out the resource!
Thanks for inquiring about what to do as you bring on another user to your project. Our preferred approach is for your colleague to submit a DAUA for themselves. The contents of the DAUA could mimic your own and should mention your upcoming collaboration.
Thanks for checking out the site and SHHS dataset. I think the variation in the number of commas per line that you're seeing comes from the "Comm" variable/column. "Comm" is a free text field that includes scorer notes about the overnight sleep study quality. Some of these notes include commas. That said, field values that contain commas are enclosed in double quotes, which most CSV parsers should understand. The dataset reads into Excel, SAS, and R correctly for me.
Example snippet from "Comm":
1,0,0,6,"Lot of alpha-delta sleep but not alpha intrusion. Sleeps on back entire time, only change in position when awake. Airflow choppy at times (-1 hr), chest very small amp (- 1 hr), Low baseline saO2 ~92%, Desats into 70's in REM.",0,8,8,8,8,8,8,
Here's a Stack Overflow post that describes handling commas in a CSV file.
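If it helps, here is a minimal Python sketch of the same idea using the standard csv module. The column names "id" and "flag" and the three-column layout are made up for illustration (the real SHHS files have many more columns); only the quoted "Comm" text is taken from the snippet above.

```python
import csv
import io

# Hypothetical three-column excerpt; only the "Comm" text comes from the snippet above.
sample = (
    "id,Comm,flag\n"
    '1,"Airflow choppy at times (-1 hr), chest very small amp (- 1 hr)",0\n'
    "2,No notes,8\n"
)

# csv.reader treats commas inside double quotes as part of the field.
rows = list(csv.reader(io.StringIO(sample)))
field_counts = [len(row) for row in rows]

# Splitting on every comma miscounts the line with the quoted note.
naive_counts = [len(line.split(",")) for line in sample.splitlines()]

print(field_counts)
print(naive_counts)
```

A quote-aware parser gives the same number of fields on every row, while a plain split on commas does not, which is likely the variation you noticed.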
Hope this helps!
Thanks for checking out the resource. The approach you have outlined looks correct to me.
I'll ask the scoring team to comment on the scoring procedures. There is some documentation here: https://sleepdata.org/datasets/shhs/pages/mop/6-00-mop-toc.md
All of these variables (like 'oanba4' or 'oanbp4') are calculated/output from the Compumedics Profusion software. I believe the software links the respiratory events with the arousals itself when spitting out all these numbers.
If we took the difference of 'oanba4' and 'oanbp4' I think we would get a count of "obstructive apneas with arousals but WITHOUT a >= 4% oxygen desaturation (NREM/Supine)". I'm not sure how to get "OAs WITHOUT Arousals, but WITH >= 4% desaturation (NREM/Supine)". These variables will make your head spin!
I think you may have to mine the raw data (EDF/XML) to get at these sorts of things.
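For the difference mentioned above, a rough Python sketch might look like the following. The values are invented for illustration and are not real SHHS records, "nsrrid" is assumed to be the participant identifier column, and the interpretation of the difference is only my reading of the variables, so please check it against the data dictionary.

```python
import csv
import io

# Made-up values for illustration only; "nsrrid" is an assumed ID column.
sample = "nsrrid,oanba4,oanbp4\n1,5,3\n2,0,0\n3,,7\n"

def arousal_only_count(row):
    """Return oanba4 - oanbp4, or None when either value is blank."""
    if row["oanba4"] == "" or row["oanbp4"] == "":
        return None
    return int(row["oanba4"]) - int(row["oanbp4"])

diffs = {
    row["nsrrid"]: arousal_only_count(row)
    for row in csv.DictReader(io.StringIO(sample))
}
print(diffs)
```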
What are you trying to do once you parse the XML annotation files?
I don't use Python, but I did something in R a couple of years ago: https://sleepdata.org/tools/mrueschman-xml-annotation-extractor
Another example in Ruby: https://sleepdata.org/tools/ruby-script-tutorial-05
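Since you asked about Python, here is a rough sketch using the standard library's xml.etree.ElementTree. The element names (ScoredEvent, EventConcept, Start, Duration) and the sample contents are my assumptions about the annotation layout, so check them against one of your own XML files before relying on this.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation excerpt; element names and values are assumptions.
sample = """<PSGAnnotation>
  <ScoredEvents>
    <ScoredEvent>
      <EventConcept>Obstructive apnea</EventConcept>
      <Start>1234.5</Start>
      <Duration>12.0</Duration>
    </ScoredEvent>
    <ScoredEvent>
      <EventConcept>Arousal</EventConcept>
      <Start>1250.0</Start>
      <Duration>4.5</Duration>
    </ScoredEvent>
  </ScoredEvents>
</PSGAnnotation>"""

def read_events(xml_text):
    """Collect one dict per ScoredEvent element, with times as floats (seconds)."""
    root = ET.fromstring(xml_text)
    events = []
    for ev in root.iter("ScoredEvent"):
        events.append({
            "concept": ev.findtext("EventConcept", default=""),
            "start": float(ev.findtext("Start", default="nan")),
            "duration": float(ev.findtext("Duration", default="nan")),
        })
    return events

events = read_events(sample)
for event in events:
    print(event)
```

For a file on disk you would swap ET.fromstring(xml_text) for ET.parse(path).getroot().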
I asked a couple folks here who know more about the CCSHS EDFs to comment.
Yes! Many participants appear in the Visit 1 and Visit 2 datasets, as well as the CVD Outcomes dataset. This allows you to look at data from the same participants for up to ~15 years of follow-up.
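As a rough illustration of linking the visits, the Python sketch below matches rows from two files on a shared participant ID. I'm assuming the ID column is called "nsrrid"; the other column names and all values are invented for the example.

```python
import csv
import io

# Invented excerpts; "nsrrid" is assumed to be the shared participant ID.
visit1 = "nsrrid,age_v1\n1,55\n2,60\n3,48\n"
visit2 = "nsrrid,age_v2\n1,60\n3,53\n"

def index_by_id(text, id_col="nsrrid"):
    """Map each participant ID to its row (as a dict)."""
    return {row[id_col]: row for row in csv.DictReader(io.StringIO(text))}

v1 = index_by_id(visit1)
v2 = index_by_id(visit2)

# Keep only participants present in both visits.
shared_ids = sorted(v1.keys() & v2.keys())
merged = [{**v1[pid], **v2[pid]} for pid in shared_ids]
print(merged)
```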