
CFS Dataset v0.1.0 released!

mrueschman +0 points · about 6 years ago

The first version of the Cleveland Family Study Visit 5 dataset is now available. This dataset accompanies 730 EDF and XML annotation files that are available for more in-depth analysis.

Future dataset iterations for CFS will include more variables, along with data from earlier study visits.

Here's a summary of the release:

0.1.0 Changes

  • Initial import from family3dd.xls
  • All non-calculated variables are associated with forms
  • Redundant identifier variables have been removed from the data dictionary
  • Domains have been created for all variables originally marked as type: choices
  • Fixed several outliers, negative and implausible values
  • Variables have now been associated with forms, where appropriate
  • Demographics variables and key subscales have been marked as 'commonly used'
  • Missing values have been stripped from the dataset
  • Family medical history variables have been removed from this release, pending a more in-depth cleaning
  • PHI and identifiable variables have either been obfuscated or removed from the dataset
  • Variables sourced from the baseline_lab_questionnaire form have been updated to match exact questionnaire wording
 
remomueller +0 points · about 6 years ago

Special thanks to @mcailler, @kevgleas, and @michellereid for a great job curating and finalizing the CFS dataset and data dictionary! Over 200 issues were closed in the creation of this first release!
