WB_EEG_QA

Post by WeBrainTool » Mon May 24, 2021 9:35 pm

WB_EEG_QA is a stable tool for quality assessment (QA) of continuous raw EEG data (e.g., resting-state EEG data). Bad data are detected in small windows of each channel using 4 methods, and a number of indices related to data quality are calculated. An overall data quality rating is also provided, with levels A, B, C and D (corresponding to perfect, good, poor and bad). The QA consists of:
[1] The continuous EEG data of each channel are high-pass filtered (>1 Hz) and then segmented into small windows;
[2] Detecting constant or NaN/Inf signals in each window (Method 1). Windows containing any NaN/Inf or with tiny SD/median values (<10^-10) are considered bad windows.
[3] Detecting unusually high or low amplitude using the robust standard deviation across time points in each window (Method 2). If the z-score of the robust time deviation exceeds a threshold or the absolute amplitude exceeds 150 μV, the window is considered bad.
[4] Detecting high-frequency or power-frequency noise in each window by calculating the noise-to-signal ratio (NSR) based on Christian Kothe's method (Method 3). If the z-score of the ratio of the signal above 40/50 Hz (power frequency minus 10 Hz) to the signal below 40/50 Hz exceeds a threshold, or the absolute NSR exceeds 0.5, the window is considered bad.
[5] Detecting low correlations with other channels in each window using Pearson correlation (default) or RANSAC correlation (Method 4). If the maximum absolute correlation of a channel's window with the other channels falls below a threshold, the window is considered bad.
[6] Calculating a number of indices related to the data quality and rating the raw EEG data.
More details about the QA tool can be found in the paper: Zhao et al., 2021, Quantitative signal quality assessment for large-scale continuous scalp EEG with big data perspective, submitted.
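
As an illustration only (this is not the tool's own code), steps [1] and [2] could be sketched in Python as below. The function names segment and is_no_signal are hypothetical, the high-pass filter is left out, and reading "SD/median" as "SD or median absolute deviation" is an assumption.

```python
import math
import statistics


def segment(signal, srate, window_seconds=1.0):
    """Split one channel into non-overlapping windows.

    Trailing samples that do not fill a whole window are dropped
    (an assumption; the post does not say how the tail is handled).
    """
    n = int(round(srate * window_seconds))
    return [signal[i:i + n] for i in range(0, len(signal) - n + 1, n)]


def is_no_signal(window, eps=1e-10):
    """Method 1 sketch: flag NaN/Inf samples or a (near-)constant window."""
    if any(not math.isfinite(x) for x in window):
        return True
    sd = statistics.pstdev(window)
    med = statistics.median(window)
    # Median absolute deviation: assumed meaning of the "median" criterion.
    mad = statistics.median(abs(x - med) for x in window)
    return sd < eps or mad < eps
```

Applied to every channel, the flags from is_no_signal would form a channels × windows mask of the same shape as NoSignalMask.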

Fig. 2: Pipeline of quality assessment of continuous raw EEG data. (1) Raw EEG data with artifacts such as eye blinks and eye movements. (2) The continuous EEG data of each channel are high-pass filtered and then segmented into small windows. Here ‘WindowSeconds’ is the window size (e.g. 1 sec.) over which the following methods are conducted. (3) Detecting constant or NaN/Inf signals in each window (Method 1). (4) Detecting unusually high or low amplitude using the robust standard deviation across time points in each window (Method 2). If the z-score of the robust time deviation exceeds ‘robustDeviationThreshold’ or the absolute amplitude exceeds 150 microvolts (μV), the window is considered bad. (5) Detecting high-frequency or power-frequency noise in each window by calculating the noise-to-signal ratio (NSR) based on Christian Kothe's method (Method 3) (clean_rawdata0.32 https://sccn.ucsd.edu/wiki/Artifact_Sub ... tion_(ASR) ). If the z-score of the ratio of the signal above 40 Hz (power frequency - 10 Hz) to the signal below 40 Hz exceeds ‘highFrequencyNoiseThreshold’, or the absolute NSR exceeds 0.5, the window is considered bad. Note that this step is skipped if the sampling rate is below 2 × power frequency. (6) Detecting low correlations with other channels in each window using Pearson correlation (default) or RANSAC correlation (Method 4). For Pearson correlation, if the maximum correlation of a channel's window with the other channels falls below ‘correlationThreshold’, the window is considered bad. For RANSAC correlation (Bigdely-Shamlo et al., 2015), each window of a channel is predicted using RANSAC interpolation based on a RANSAC fraction of the channels. If the correlation of the prediction with the actual signal falls below ‘ransacCorrelationThreshold’, or the calculation takes too long, the window is marked as bad. This method is time-consuming and requires channel locations. The RANSAC correlation is optional and is not performed by default. (7) Calculating a number of indices related to the data quality and rating the raw EEG data.
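
For readers who want to see the Pearson variant of Method 4 in concrete terms, here is a minimal Python sketch for a single time window. It is not the tool's own code: the names pearson and low_correlation_flags are hypothetical, and treating a constant window as having zero correlation is an assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equally long sample lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # constant window: treated as uncorrelated (assumption)
    return sxy / math.sqrt(sxx * syy)


def low_correlation_flags(windows, correlation_threshold=0.6):
    """windows[c] holds the samples of channel c for one time window.

    A channel is flagged when its maximum absolute correlation with
    every other channel falls below correlation_threshold.
    Needs at least two channels.
    """
    flags = []
    for c, x in enumerate(windows):
        best = max(abs(pearson(x, y))
                   for k, y in enumerate(windows) if k != c)
        flags.append(best < correlation_threshold)
    return flags
```

For example, two channels carrying the same ramp at different gains correlate perfectly with each other and pass, while a third channel that only alternates in sign is flagged.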

Parameters
WindowSeconds: the window size (in seconds, default = 1 sec.) over which the above methods are conducted.
HighPassband: lower edge of the frequency for high pass filtering. Default is 1 Hz.
seleChanns: indices of the selected EEG channels (e.g. ‘[1:4,7:30]’ or ‘all’). Default is ‘all’.
badWindowThreshold: cutoff fraction of bad windows (default = 0.4) for detecting bad channels.
robustDeviationThreshold: Z-score cutoff for robust time deviation in each window (default = 5).
PowerFrequency: power (mains) frequency. Default is 50 Hz (as in China). Note that in the USA the power frequency is 60 Hz.
FrequencyNoiseThreshold: Z-score cutoff for the NSR (signal above power frequency - 10 Hz). Default is 3. If the z-score of the ratio of the signal above 40 Hz (power frequency - 10 Hz) to the signal below 40 Hz exceeds this cutoff, or the absolute NSR exceeds 0.5, the window is considered bad.
flagNotchFilter: if flagNotchFilter = 1, remove 0.5×power frequency noise using notch filtering. Default is off (flagNotchFilter = 0).
correlationThreshold: maximal correlation below which a window is bad (range is (0,1), default = 0.6). If the maximum correlation of a channel's window with the other channels falls below ‘correlationThreshold’, the window is considered bad.
ransacCorrelationThreshold: cutoff correlation below which a window is abnormal with respect to its neighbors (default = [], i.e. RANSAC correlation is not performed).
ransacChannelFraction: fraction of channels for robust reconstruction (default = 0.3).
ransacSampleSize: samples for computing RANSAC (default = 50).
srate: sampling rate of the EEG data. It is detected automatically when the EEG data contain it, but for ASCII/Float .txt files or MATLAB .mat files the user must enter the sampling rate by hand. Default is ‘[]’.

Note:
(1) Assumptions of the QA tool:
- The signal is a structure of continuous data containing at least the data and the sampling rate.
- No segments of the EEG data have been removed.
(2) Quality assessment does NOT change the raw EEG data.
(3) If the EEG data contain no channel locations, or the selected channels have no channel locations, the RANSAC correlation cannot be computed.
(4) If the sampling rate is below 2 × power frequency, the step of detecting high-frequency or power-frequency noise is skipped.
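
Notes (3) and (4), together with the default of ransacCorrelationThreshold, decide which optional steps actually run. A minimal Python sketch of that decision, for illustration only: plan_steps and its argument names are hypothetical and not part of the tool, and None stands in for the empty default ‘[]’.

```python
def plan_steps(srate, power_frequency=50.0, has_chanlocs=False,
               ransac_correlation_threshold=None):
    """Return which optional QA steps would run under the stated rules."""
    if srate is None:
        # For .txt/.mat input the sampling rate must be supplied by hand.
        raise ValueError("sampling rate is required")
    return {
        # Skipped when the sampling rate is below 2 x power frequency.
        "frequency_noise": srate >= 2 * power_frequency,
        # RANSAC needs an explicit threshold and channel locations.
        "ransac": ransac_correlation_threshold is not None and has_chanlocs,
        # Split point for the noise-to-signal ratio (power frequency - 10 Hz).
        "noise_cutoff_hz": power_frequency - 10.0,
    }
```

With the defaults, data sampled at 80 Hz would skip the frequency noise step, while data sampled at 100 Hz or more would run it with a 40 Hz split.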

Outputs
For each subject, a .mat file containing a structure array of QA results is generated (saved as results_QA_*.mat, which holds the QA results and the parameters of each step). At the same time, a table file named QA_table.mat is generated, listing all indices of all subjects (both calculated and skipped subjects) in one table.

results_QA.ONS: Overall ratio of No Signal windows. The ONS ranges from 0 to 1. The ONS = 0 if and only if there are no NaN or constant signals in the data, and the ONS = 1 if all windows contain NaN or constant signals;
results_QA.OHA: Overall ratio of High Amplitude windows. The OHA ranges from 0 to 1. The OHA = 0 if and only if there is no high amplitude window in the data, and the OHA = 1 if all windows are high amplitude bad windows;
results_QA.OFN: Overall ratio of high Frequency Noise windows. The OFN ranges from 0 to 1. The OFN = 0 if and only if there is no high frequency noise window in the data, and the OFN = 1 if all windows are high frequency noise bad windows;
results_QA.OLC: Overall ratio of Low Correlation windows. The OLC ranges from 0 to 1. The OLC = 0 if and only if there is no low correlation window in the data, and the OLC = 1 if all windows are low correlation bad windows;
results_QA.OLRC: Overall ratio of windows of Low RANSAC Correlation (optional). The OLRC = 0 if and only if there is no low RANSAC correlation window in the data, and the OLRC = 1 if all windows are low RANSAC correlation bad windows;
results_QA.badChannels: The indices of bad channels, i.e. channels whose ratio of bad quality windows exceeds a certain threshold (0.4 by default);
results_QA.NBC: No. of Bad Channels;
results_QA.OBC: Overall ratio of Bad Channels. The OBC tends to 0 for no bad channels and to 1 for all bad channels;
results_QA.OBClus: Overall ratio of Bad Clusters, based on the number of connected components of the bad quality windows. This measure describes how the bad quality windows are distributed in the data. OBClus tends to 1 when noise is widespread in the data and to 0 when there are no bad clusters. The lower the OBClus, the smaller the part of the EEG signal that is contaminated. If two EEG datasets have the same ODQ, the one with the lower OBClus has the better quality;
results_QA.ODQ: Overall Data Quality: the overall percentage of good data windows. The ODQ ranges from 0 to 100. The ODQ = 0 if and only if there is no good window in the data, and the ODQ = 100 if all windows are good;
results_QA.DataQualityRating: Overall Data Quality Rating
Level A: ODQ >= 90;
Level B: ODQ >= 80 && ODQ < 90;
Level C: ODQ >= 60 && ODQ < 80;
Level D: ODQ < 60;
results_QA.allMAV: mean absolute value of all windows;
results_QA.badMAV: mean absolute value of bad windows;
results_QA.goodMAV: mean absolute value of good windows;
results_QA.NoSignalMask: a mask of windows with no signals (with dimension channels × windows);
results_QA.AmpliChannelMask: a mask of windows with high amplitudes (with dimension channels × windows);
results_QA.FrequencyNoiseMask: a mask of windows with high frequency (and power frequency, if applicable) noise (with dimension channels × windows);
results_QA.LowCorrelationMask: a mask of windows with low correlations (with dimension channels × windows);
results_QA.RansacBadWindowMask: a mask of windows with RANSAC low correlations (with dimension channels × windows);
results_QA.OverallBadMask: a mask of windows with overall bad signals (with dimension channels × windows);
results_QA.fractionBadWindows: fractions of bad windows for each channel (with dimension channels × 1);
results_QA.badChannelsFromAll: logical value of bad channels from all methods (with dimension channels × 1).
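
To make the relation between the masks and the summary indices concrete, here is a hedged Python sketch that derives a few of them from an overall bad-window mask (channels × windows). It is not the tool's own code: summarize and rate are hypothetical names, and it assumes that a good window is simply one not marked in OverallBadMask and that channel indices are 1-based.

```python
def rate(odq):
    """Map ODQ (0 to 100) to the rating levels listed above."""
    if odq >= 90:
        return "A"
    if odq >= 80:
        return "B"
    if odq >= 60:
        return "C"
    return "D"


def summarize(overall_bad_mask, bad_window_threshold=0.4):
    """overall_bad_mask[c][w] is True when window w of channel c is bad."""
    n_channels = len(overall_bad_mask)
    n_windows = len(overall_bad_mask[0])
    fraction_bad = [sum(row) / n_windows for row in overall_bad_mask]
    # A channel is bad when its fraction of bad windows exceeds the cutoff.
    bad_channels = [i + 1 for i, f in enumerate(fraction_bad)
                    if f > bad_window_threshold]
    total = n_channels * n_windows
    good = total - sum(sum(row) for row in overall_bad_mask)
    odq = 100.0 * good / total
    return {
        "fractionBadWindows": fraction_bad,
        "badChannels": bad_channels,
        "NBC": len(bad_channels),
        "OBC": len(bad_channels) / n_channels,
        "ODQ": odq,
        "DataQualityRating": rate(odq),
    }
```

For 3 channels of 10 windows each, with 0, 5 and 1 bad windows respectively, this gives one bad channel (channel 2), ODQ = 80 and Level B.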

Parameter details:

results_QA.parameters.srate: sampling rate;
results_QA.parameters.WindowSeconds: window size in seconds (default = 1 sec);
results_QA.parameters.HighPassband: lower edge of the frequency for high pass filtering, Hz;
results_QA.parameters.selechanns: indices of the selected channels (e.g. [1:4,7:30] or ‘all’). Default is ‘all’;
results_QA.parameters.badWindowThreshold: cutoff fraction of bad windows;
results_QA.parameters.PowerFrequency: power (mains) frequency. Default is 50 Hz (as in China). Note that in the USA the power frequency is 60 Hz;
results_QA.parameters.robustDeviationThreshold: Z-score cutoff for robust time deviation;
results_QA.parameters.FrequencyNoiseThreshold: Z-score cutoff for noise-to-signal ratio (signal above 40 Hz);
results_QA.parameters.correlationThreshold: maximal correlation below which window is bad (range is (0,1));
results_QA.parameters.chanlocsflag: flag of channel locations; chanlocsflag = 1 means channel locations are available;
results_QA.parameters.chanlocsXYZ: xyz coordinates of selected channels;
results_QA.parameters.chanlocs: channel locations of selected channels;
results_QA.parameters.ransacSampleSize: samples for computing RANSAC (default = 50);
results_QA.parameters.ransacChannelFraction: fraction of channels for robust reconstruction (default = 0.3);
results_QA.parameters.ransacCorrelationThreshold: cutoff correlation below which a window is abnormal with respect to its neighbors (default = [], i.e. RANSAC correlation is not performed).
