Which specific datasets are you working with?
1) In general, the time-domain heart rate variability indices AVNN, SDNN, SDANN, SDNNIDX and rMSSD are in milliseconds (ms), and pNN50 is a percentage (%). Instantaneous heart rate (IHR) is in beats per minute (bpm). The frequency-domain variables (total power and ultra-low, very low, low and high frequency power) are in milliseconds squared (ms²). The LF/HF ratio is unitless.
2) If you have the ECG signal, you can extract the RR intervals using an open-source QRS detector. One is available here: https://www.physionet.org/physiotools/wag/wqrs-1.htm
3) The XML files, if available, contain the sleep stage annotations: 0 for awake, 1 for N1, 2 for N2, 3 for N3, 4 for N4 and 5 for REM. The annotation window is usually 30 seconds. Sleep onset is the time of the first "non-zero" symbol (the first epoch not scored as awake), and sleep offset is the time of the last "non-zero" symbol.
4) The answer to that question depends on the protocol followed by a particular study.
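To make points 1 and 3 concrete, here is a rough Python sketch, not the code used to produce any particular dataset. The function names `time_domain_hrv` and `sleep_onset_offset` are made up for illustration. It assumes the intervals are already clean NN intervals in ms, that SDNN uses the sample (n-1) standard deviation, that the stages are one code per 30-second epoch starting at time 0, and that each time is taken at the start of its epoch.

```python
import math


def time_domain_hrv(rr_ms):
    """Return AVNN, SDNN, rMSSD (all in ms) and pNN50 (in %) from intervals in ms."""
    n = len(rr_ms)
    if n < 2:
        raise ValueError("need at least two intervals")
    avnn = sum(rr_ms) / n
    # Assumption: sample standard deviation (n - 1 in the denominator).
    sdnn = math.sqrt(sum((x - avnn) ** 2 for x in rr_ms) / (n - 1))
    # Differences between successive intervals, in ms.
    diffs = [b - a for a, b in zip(rr_ms, rr_ms[1:])]
    rmssd = math.sqrt(sum(d * d for d in diffs) / len(diffs))
    # Percentage of successive differences larger than 50 ms.
    pnn50 = 100.0 * sum(abs(d) > 50 for d in diffs) / len(diffs)
    return {"AVNN": avnn, "SDNN": sdnn, "rMSSD": rmssd, "pNN50": pnn50}


def sleep_onset_offset(stages, epoch_s=30):
    """Return (onset_s, offset_s) in seconds from the start of the recording.

    stages holds one code per epoch (0 = awake, non-zero = asleep).
    Both times are the start of the matching epoch; returns None if
    every epoch is scored as awake.
    """
    asleep = [i for i, s in enumerate(stages) if s != 0]
    if not asleep:
        return None
    return asleep[0] * epoch_s, asleep[-1] * epoch_s
```

For example, calling `sleep_onset_offset([0, 0, 1, 2, 3, 5, 2, 0, 0])` gives an onset of 60 s and an offset of 180 s. If you want the end of the last sleep epoch instead of its start, add one epoch length to the offset.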