Methodology
We present a low-sample pre-screening pipeline for MI-BCI literacy that combines brief resting-state baselines with only the first n = 15 motor imagery trials per subject.
Phase 1: Ground Truth Label Generation (CSP-LDA)
We establish ground-truth MI performance with a subject-specific CSP-LDA decoder following the MetaBCI framework.
- Dataset: PhysioNet EEG Motor Movement/Imagery — 109 subjects, 64 channels, 160 Hz
- ROI: C3, Cz, C4 (+ FC1/FC2, CP1/CP2 for spatial context)
- Filtering: Mu-band 8–13 Hz, post-cue window 0.5–4.0 s
- CSP: Top k=4 + bottom k=4 spatial filters (8 components)
- Classifier: LDA with shrinkage regularization
- Validation: Stratified 5-fold CV within each subject → one literacy score per subject
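The Phase 1 decoder can be sketched with a plain NumPy/scikit-learn implementation. The actual pipeline follows MetaBCI; the k = 4 top/bottom filter count, shrinkage LDA, and stratified 5-fold CV follow the bullets above, while the function names and data shapes here are illustrative:

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import StratifiedKFold

def csp_filters(X, y, k=4):
    """CSP spatial filters: bottom-k + top-k generalized eigenvectors.
    X: (n_trials, n_channels, n_samples); y: binary labels."""
    covs = []
    for c in np.unique(y):
        trials = X[y == c]
        # trace-normalized spatial covariance, averaged over trials
        cov = np.mean([t @ t.T / np.trace(t @ t.T) for t in trials], axis=0)
        covs.append(cov)
    # generalized eigenproblem C1 w = lambda (C1 + C2) w
    evals, evecs = eigh(covs[0], covs[0] + covs[1])
    order = np.argsort(evals)
    picks = np.r_[order[:k], order[-k:]]   # bottom k + top k -> 2k components
    return evecs[:, picks]                 # (n_channels, 2k)

def csp_features(X, W):
    """Log of normalized variance of spatially filtered trials."""
    Z = np.einsum('ck,ncs->nks', W, X)
    var = Z.var(axis=2)
    return np.log(var / var.sum(axis=1, keepdims=True))

def subject_literacy(X, y, k=4, seed=0):
    """Stratified 5-fold CV accuracy -> one literacy score per subject."""
    cv = StratifiedKFold(5, shuffle=True, random_state=seed)
    accs = []
    for tr, te in cv.split(X, y):
        W = csp_filters(X[tr], y[tr], k)
        lda = LinearDiscriminantAnalysis(solver='lsqr', shrinkage='auto')
        lda.fit(csp_features(X[tr], W), y[tr])
        accs.append(lda.score(csp_features(X[te], W), y[te]))
    return float(np.mean(accs))
```

Fitting the CSP filters inside each fold (rather than once on all trials) keeps the per-subject literacy score free of train/test leakage.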
Phase 2: Feature Extraction
We extract neurophysiological markers from the first 15 trials of left-vs-right MI runs (runs 4/8/12) plus brief resting baselines (R01/R02). Two feature families are computed:
2a. Subject-Level Trait & Task-Modulation Features
2b. Early-Trial Decodability Features
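As one concrete example of the family-2a markers, a mu-band ERD/ERS value (in the spirit of the `erdrs_mu_C3`-style features selected later) can be computed per channel as a percent power change from a pre-cue baseline. This is a minimal sketch; the window boundaries and filter order are illustrative assumptions:

```python
import numpy as np
from scipy.signal import butter, filtfilt

def erd_percent(trial, fs, baseline=(0.0, 1.0), task=(1.5, 4.0), band=(8, 13)):
    """Mu-band ERD/ERS for one channel: percent power change from baseline.
    trial: 1-D signal epoched from -1 s relative to cue (index 0 = -1 s),
    so baseline (0, 1) s covers the pre-cue second and task (1.5, 4.0) s
    covers 0.5-3.0 s post-cue."""
    b, a = butter(4, np.array(band) / (fs / 2), btype='band')
    mu = filtfilt(b, a, trial)              # zero-phase mu-band filtering
    power = mu ** 2
    t = np.arange(len(trial)) / fs
    base = power[(t >= baseline[0]) & (t < baseline[1])].mean()
    act = power[(t >= task[0]) & (t < task[1])].mean()
    return 100.0 * (act - base) / base      # negative = desynchronization
```

Negative values indicate mu suppression during imagery, the marker that stronger performers typically show more clearly.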
Phase 2.5: Merged Feature Pipeline
Features from Phase 2a and 2b are merged by subject ID into a single design matrix: 38 total merged features per subject for all 109 subjects.
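The merge step can be sketched with pandas, assuming each phase produces one per-subject table; the `subject` key column is an illustrative name:

```python
import pandas as pd

def merge_feature_tables(trait_df, decod_df, on='subject'):
    """Inner-join the Phase 2a and 2b feature tables on subject ID,
    keeping only subjects present in both tables (109 expected)."""
    merged = trait_df.merge(decod_df, on=on, how='inner', validate='one_to_one')
    return merged.set_index(on)
```

The `validate='one_to_one'` guard makes duplicated subject IDs fail loudly instead of silently inflating the design matrix.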
Feature Selection
To reduce redundancy and control false discoveries:
- Rank-based association: Spearman correlation with 5,000-permutation significance tests
- FDR correction: Benjamini–Hochberg at α = 0.05
- Minimum effect size: |ρ| ≥ 0.1
This retained 12 selected predictors for the final models:
| Feature | ρ | pFDR |
|---|---|---|
| csp_class_separability | 0.454 | 0.002 |
| smr_strength | 0.399 | 0.002 |
| erdrs_mu_C3 | −0.365 | 0.002 |
| rpl | 0.314 | 0.008 |
| + 8 more (resting_rpl_alpha, resting_pse_avg, band_power_beta_C3/C4, erdrs_mu_Cz, snr_mean, mu_erd_imagined_C3, resting_tar) | — | — |
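The selection procedure can be sketched as follows. The function name and exact permutation scheme are illustrative, but the Spearman statistic, 5,000-permutation default, Benjamini-Hochberg step-up, and |ρ| ≥ 0.1 floor follow the bullets above:

```python
import numpy as np
from scipy.stats import spearmanr

def select_features(X, y, n_perm=5000, alpha=0.05, min_rho=0.1, seed=0):
    """Permutation-tested Spearman screening with Benjamini-Hochberg FDR.
    X: (n_subjects, n_features); y: literacy scores. Returns kept indices."""
    rng = np.random.default_rng(seed)
    n, p = X.shape
    rho = np.array([spearmanr(X[:, j], y).correlation for j in range(p)])
    # permutation null: shuffle y, recompute |rho| for every feature
    null = np.empty((n_perm, p))
    for i in range(n_perm):
        yp = rng.permutation(y)
        null[i] = [abs(spearmanr(X[:, j], yp).correlation) for j in range(p)]
    pvals = (1 + (null >= np.abs(rho)).sum(axis=0)) / (n_perm + 1)
    # Benjamini-Hochberg step-up at level alpha
    order = np.argsort(pvals)
    thresh = alpha * np.arange(1, p + 1) / p
    passed = pvals[order] <= thresh
    k = passed.nonzero()[0].max() + 1 if passed.any() else 0
    keep = np.zeros(p, bool)
    keep[order[:k]] = True
    # enforce the minimum effect-size floor on top of FDR control
    return np.where(keep & (np.abs(rho) >= min_rho))[0]
```

The `(1 + hits) / (n_perm + 1)` estimator keeps permutation p-values strictly positive, which the step-up procedure requires.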
Phase 3: BCI Literacy Screening Models
We trained two predictive models on the merged feature vectors (standardized prior to fitting):
Regression
We compared four models (Random Forest, Gradient Boosting, RBF-SVR, Ridge) using 5-fold CV; Random Forest performed best (R² = 0.302, Pearson r = 0.580).
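The comparison can be sketched as below, assuming the merged design matrix `X` and per-subject literacy scores `y`; hyperparameters are scikit-learn defaults, which the actual runs may have tuned:

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.linear_model import Ridge
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

def compare_regressors(X, y, seed=0):
    """5-fold CV R^2 for the four candidate regressors -> {name: mean score}.
    Standardization is fit inside each fold via the pipeline."""
    models = {
        'rf': RandomForestRegressor(random_state=seed),
        'gb': GradientBoostingRegressor(random_state=seed),
        'svr': SVR(kernel='rbf'),
        'ridge': Ridge(),
    }
    cv = KFold(5, shuffle=True, random_state=seed)
    return {name: cross_val_score(make_pipeline(StandardScaler(), est),
                                  X, y, cv=cv, scoring='r2').mean()
            for name, est in models.items()}
```

Wrapping the scaler in the pipeline, rather than standardizing once up front, keeps the fold-wise R² estimates honest.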
Binary Classifier
A Random Forest classifier separates HIGH from LOW performers at a decoding-accuracy threshold of 0.65. Evaluated via leave-one-out cross-validation (LOOCV), it reaches 78.8% accuracy, with stronger detection of low performers (precision 0.833, recall 0.890).
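A minimal sketch of the screening classifier and its LOOCV evaluation (the function name is illustrative); the low-performer metrics are obtained by treating LOW as the positive label:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score, precision_score, recall_score
from sklearn.model_selection import LeaveOneOut, cross_val_predict

def loocv_screen(X, scores, threshold=0.65, seed=0):
    """Binarize literacy at the threshold (1 = HIGH performer) and
    evaluate a Random Forest screen with leave-one-out CV."""
    y = (scores >= threshold).astype(int)
    clf = RandomForestClassifier(random_state=seed)
    pred = cross_val_predict(clf, X, y, cv=LeaveOneOut())
    return {
        'accuracy': accuracy_score(y, pred),
        # pos_label=0: precision/recall for LOW-performer detection
        'low_precision': precision_score(y, pred, pos_label=0),
        'low_recall': recall_score(y, pred, pos_label=0),
    }
```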
Preprocessing
- Re-referencing: Common-average reference
- Filtering: FIR bandpass 1–40 Hz
- Epoching: −1 to +4 s relative to cue onset; ±100 μV artifact rejection
- ICA: Picard, 20 components; Fp1 + AF7/AF8 as EOG proxies
- Laplacian: Subtracted mean of 4 nearest neighbors for ROI channels
- Imputation: Feature-wise median for missing feature groups
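The referencing, filtering, epoching, and rejection steps above can be sketched without MNE as follows; the full pipeline uses MNE-Python, including the Picard ICA and the Laplacian, which are omitted here:

```python
import numpy as np
from scipy.signal import firwin, filtfilt

def preprocess(raw, fs, cue_samples, tmin=-1.0, tmax=4.0, reject_uv=100.0):
    """CAR -> 1-40 Hz FIR bandpass -> epoching -> amplitude rejection.
    raw: (n_channels, n_samples) in microvolts; cue_samples: cue-onset
    sample indices. (ICA and the surface Laplacian are applied with
    MNE-Python in the full pipeline.)"""
    car = raw - raw.mean(axis=0, keepdims=True)         # common-average ref
    taps = firwin(int(fs) + 1, [1.0, 40.0], pass_zero=False, fs=fs)
    filt = filtfilt(taps, [1.0], car, axis=1)           # zero-phase FIR
    lo, hi = int(tmin * fs), int(tmax * fs)
    epochs = np.stack([filt[:, c + lo:c + hi] for c in cue_samples])
    keep = np.abs(epochs).max(axis=(1, 2)) <= reject_uv  # +/-100 uV rule
    return epochs[keep]
```

Applying the amplitude criterion after filtering (as here) rejects fewer epochs than applying it to raw data; either choice should simply be stated and kept consistent.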
System Architecture: HIL Simulation
The current demonstration uses a Hardware-in-the-Loop (HIL) Simulation:
- Signal Source: Pre-recorded PhysioNet data streamed as a "Digital Twin"
- Processing: Feature extraction and ML decoding run in real time
- Justification: Validates end-to-end latency deterministically before human deployment
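The digital-twin replay can be sketched as a paced chunk streamer that also records per-chunk processing latency; the function name and callback interface are illustrative:

```python
import time
import numpy as np

def stream_digital_twin(raw, fs, chunk, process, pace=True):
    """Replay a pre-recorded recording in fixed-size chunks, pacing
    delivery to wall-clock time so end-to-end latency is measured
    deterministically. raw: (n_channels, n_samples); process: callback
    invoked with each (n_channels, chunk) block. Returns per-chunk
    processing latencies in seconds."""
    latencies = []
    period = chunk / fs                       # real-time budget per chunk
    for start in range(0, raw.shape[1] - chunk + 1, chunk):
        t0 = time.perf_counter()
        process(raw[:, start:start + chunk])
        latencies.append(time.perf_counter() - t0)
        if pace:                              # sleep off the remaining budget
            time.sleep(max(0.0, period - latencies[-1]))
    return latencies
```

Any latency exceeding `period` flags a chunk where the pipeline fell behind real time, which is exactly the condition the HIL stage is meant to rule out before human deployment.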
Software Stack
- Python 3.9+
- MNE-Python (EEG Processing)
- MetaBCI (Decoding Pipelines)
- scikit-learn (Machine Learning)
- Flask (API)