The first version of the Cleveland Family Study Visit 5 dataset is now available. This dataset accompanies 730 EDF and XML annotation files that are available for more in-depth analysis.
Future dataset iterations for CFS will include more variables, along with data from earlier study visits.
Here's a summary of the release:
0.1.0 Changes
Initial import from family3dd.xls
All non-calculated variables are associated with forms
Redundant identifier variables have been removed from the data dictionary
Domains have been created for all variables originally marked as type: choices
Fixed several outliers and negative or implausible values
Variables have now been associated with forms, where appropriate
Demographics variables and key subscales have been marked as 'commonly used'
Missing values have been stripped from the dataset
Family medical history variables have been removed from this release, pending a more in-depth cleaning
PHI and identifiable variables have either been obfuscated or removed from the dataset
Variables sourced from the baseline_lab_questionnaire form have been updated to match exact questionnaire wording
Special thanks to @mcailler, @kevgleas, and @michellereid for a great job curating and finalizing the CFS dataset and data dictionary! Over 200 issues were closed in the creation of this first release!