File checksum mismatch errors - CCSHS

Pranavan +0 points · 12 days ago

Hi there,

Good day.

While downloading the CCSHS dataset, I encountered "file checksum mismatch" errors for quite a few files: 71 files failed in total. Please see below:

  failed ccshs-trec-1800891-profusion.xml
         File checksum mismatch, expected: 9069450a1764a63bd61175b2d8c77158
                                   actual: ce1c1edfe712dc2a84e6ff5df8c91a0e

  failed ccshs-trec-1800389.edf
         File checksum mismatch, expected: 5eaa106152f6a511ebdef66b5fa95956
                                   actual: c45e2e20b021c4611abb7a0e44656bc8

  failed ccshs-trec-1800798.edf
         File checksum mismatch, expected: ff30c767eefe00cb9ddbc56d4234f4e3
                                   actual: a7d2b1499edfea722022e12397e9be18

  and all the EDF files after "ccshs-trec-1800798.edf"

I checked the forum but could not find any prior discussion of this issue, so I am creating a new topic. I suspect it could be due to a version incompatibility between Ruby and the NSRR gem. Could you please confirm whether that is the case? Any help resolving this problem would be highly appreciated. The authorisation and version details are below:

Data access is authorised to Philip Terrill (my principal PhD supervisor at the University of Queensland, Australia).

Ruby version: ruby 3.0.2p107 (2021-07-07 revision 0db68f0233) [x64-mingw32]

NSRR gem version: Nsrr 8.0.0

mrueschman +0 points · 10 days ago

Hey Pranavan,

I attempted a fresh run of nsrr download ccshs using NSRR gem 8.0.0 and it completed without any errors both times. I had success with both Ruby 2.7 and 3.0.2p107.

The tool is designed to catch incomplete or interrupted downloads. Did you get these failures on your first run? Perhaps the files downloaded incompletely, and the tool then checked the checksums and found the mismatch. Ideally you could re-run the same command and it would re-download any files whose checksums do not match. All other files will be skipped quickly and reported as "identical" if their checksums match.
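
To double-check a single downloaded file by hand, a minimal sketch along the following lines may help. It assumes the 32-character hex values in the log are MD5 digests (their length suggests this, but it is an assumption). The helper names md5_of_file and checksum_matches are hypothetical and are not part of the NSRR gem.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so large EDF files are not loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare the file's digest with an expected value, ignoring case and surrounding whitespace."""
    return md5_of_file(path) == expected_hex.strip().lower()


# Example (the path is illustrative; the expected value is copied from the log in the first post):
# checksum_matches("ccshs-trec-1800389.edf", "5eaa106152f6a511ebdef66b5fa95956")
```

If this returns False for a file that the tool reported as failed, and the file on disk is also noticeably smaller than expected, a truncated download is the more likely explanation than a Ruby or gem version problem.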