Results

Performance of the BCI literacy pre-screening pipeline on 99 subjects with complete labels.

Key Findings

R² (Best Regressor): 0.302
Pearson r: 0.580
Classifier Accuracy: 78.8%
Selected Features: 12

Regression Model Comparison

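Four regressors were compared via 5-fold cross-validation on N = 99 subjects. Random Forest was selected as the best model: it has the lowest RMSE and MAE and the highest R², Pearson r, and Spearman ρ.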

Model               RMSE            MAE             R²              Pearson r   Spearman ρ
Random Forest       0.125 ± 0.020   0.099 ± 0.015   0.302 ± 0.097   0.580       0.474
Gradient Boosting   0.134 ± 0.019   0.107 ± 0.017   0.169 ± 0.201   0.524       0.428
SVM (RBF-SVR)       0.131 ± 0.028   0.105 ± 0.021   0.246 ± 0.112   0.492       0.413
Ridge Regression    0.128 ± 0.025   0.101 ± 0.017   0.269 ± 0.131   0.530       0.428
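The four error and fit metrics in this comparison can be sketched in plain Python as follows. This is a minimal illustration of the standard definitions, not the pipeline's own code; the function names and the toy values are hypothetical, and how the pipeline aggregated each metric across folds is not stated here. It also shows why R² need not equal the square of Pearson r (for Random Forest, 0.302 versus 0.580² ≈ 0.336): R² penalizes errors in scale and offset, while r does not.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def pearson_r(x, y):
    """Pearson correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

# Hypothetical toy data (NOT from the study): predictions are perfectly
# correlated with the targets but compressed toward the middle of the range.
y_true = [0.5, 0.6, 0.7, 0.8]
y_pred = [0.55, 0.60, 0.65, 0.70]

print(f"RMSE = {rmse(y_true, y_pred):.3f}")
print(f"MAE  = {mae(y_true, y_pred):.3f}")
print(f"R^2  = {r_squared(y_true, y_pred):.3f}")
print(f"r    = {pearson_r(y_true, y_pred):.3f}")
```

In this toy case the correlation is perfect while R² is only about 0.7, because the predictions are compressed toward the middle of the range.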

Binary Classifier — HIGH vs LOW Performer Screening

Random Forest classifier. Subjects with decoder accuracy ≥ 0.65 are labeled HIGH; all others are labeled LOW. Evaluated via leave-one-out cross-validation (LOOCV).

LOOCV Accuracy: 78.8%
LOW / HIGH Subjects: 73 / 26
Misclassified: 21 / 99

Confusion Matrix

              Pred LOW   Pred HIGH
Actual LOW    65         8
Actual HIGH   13         13

Per-Class Metrics

Class   n    Precision   Recall   F1
LOW     73   0.833       0.890    0.861
HIGH    26   0.619       0.500    0.553
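The per-class metrics follow directly from the confusion matrix above. The short sketch below recomputes them from the four reported counts; the helper name `per_class_metrics` is illustrative, not part of the pipeline.

```python
# Counts from the LOOCV confusion matrix, keyed as (actual, predicted).
confusion = {
    ("LOW", "LOW"): 65, ("LOW", "HIGH"): 8,
    ("HIGH", "LOW"): 13, ("HIGH", "HIGH"): 13,
}
classes = ("LOW", "HIGH")

def per_class_metrics(confusion, classes, target):
    """Return (precision, recall, F1) for one class treated as positive."""
    tp = confusion[(target, target)]
    fn = sum(confusion[(target, pred)] for pred in classes if pred != target)
    fp = sum(confusion[(actual, target)] for actual in classes if actual != target)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

total = sum(confusion.values())
accuracy = sum(confusion[(c, c)] for c in classes) / total
print(f"accuracy = {accuracy:.3f} ({total - round(accuracy * total)} of {total} misclassified)")

for c in classes:
    precision, recall, f1 = per_class_metrics(confusion, classes, c)
    print(f"{c:<4} precision={precision:.3f} recall={recall:.3f} F1={f1:.3f}")
```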

Selected Features (12 / 38)

Features that survived screening by Spearman correlation with a permutation test, followed by Benjamini–Hochberg FDR correction (α = 0.05):

Feature                  ρ        p (raw)   p (FDR)
csp_class_separability   0.454    < 0.001   0.002
smr_strength             0.399    < 0.001   0.002
erdrs_mu_C3              −0.365   < 0.001   0.002
rpl                      0.314    0.001     0.008
resting_rpl_alpha        0.293    0.004     0.023
resting_pse_avg          −0.284   0.006     0.030
band_power_beta_C3       −0.264   0.008     0.036
band_power_beta_C4       −0.258   0.011     0.038
erdrs_mu_Cz              −0.257   0.011     0.038
mu_erd_imagined_C3       −0.251   0.011     0.038
snr_mean                 −0.247   0.013     0.041
resting_tar              −0.239   0.016     0.046
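The selection procedure can be sketched in plain Python as below. This is an illustration under assumptions, not the pipeline's implementation: the number of permutations, the two-sided test, and the (hits + 1) / (n_perm + 1) p-value convention are choices made here for the example, and all function names are hypothetical. Under that convention a permutation p-value is never exactly zero, so very small raw values are better read as upper bounds.

```python
import random

def ranks(values):
    """1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        for k in range(i, j + 1):
            result[order[k]] = (i + j) / 2 + 1
        i = j + 1
    return result

def pearson(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rho is the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

def permutation_p(x, y, n_perm=2000, seed=0):
    """Two-sided permutation p-value for Spearman rho (assumed settings)."""
    rng = random.Random(seed)
    rx, ry = ranks(x), ranks(y)
    observed = abs(pearson(rx, ry))
    shuffled = list(ry)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the pairing between feature and target
        if abs(pearson(rx, shuffled)) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)

def benjamini_hochberg(p_values):
    """BH-adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # walk from the largest p-value down
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

def select_features(features, target, alpha=0.05):
    """features: dict of name -> list of values; returns surviving names."""
    names = list(features)
    raw = [permutation_p(features[name], target) for name in names]
    adjusted = benjamini_hochberg(raw)
    return [name for name, p in zip(names, adjusted) if p <= alpha]
```

Applied to a hypothetical dict of 38 feature vectors and the per-subject decoder accuracies, `select_features` would return the names whose adjusted p-value is at or below α.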

Interpretation

The Random Forest regression model captures a moderate but meaningful association between early, lightweight EEG features and decoder-derived MI literacy (Pearson's r = 0.580). The binary classifier achieves high recall for the majority LOW class (0.890), so it flags most individuals who are unlikely to achieve reliable control. Sensitivity for high performers is low, however (recall 0.500): the classifier is best interpreted as a conservative filter rather than a definitive "pass" decision.

Base-rate context: The dataset is imbalanced, with 73 of 99 labeled subjects (73.7%) in the LOW class, so a naïve majority-class classifier would reach 73.7% accuracy by always predicting LOW. Our 78.8% is a modest improvement of about 5 percentage points; the main benefit is the correct identification of 13 of 26 high performers, all of whom the naïve rule would miss. R² = 0.302 means that about 70% of inter-subject variance remains unexplained, consistent with the noisy, multifactorial nature of BCI performance.
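The base-rate arithmetic can be checked in a few lines. The sketch uses only the counts and the R² value reported on this page; the variable names are illustrative.

```python
# Class counts and correct predictions as reported in the results.
n_low, n_high = 73, 26
n_total = n_low + n_high
correct_low, correct_high = 65, 13

# A majority-class rule predicts LOW for everyone.
baseline_accuracy = max(n_low, n_high) / n_total
model_accuracy = (correct_low + correct_high) / n_total
gain_pp = 100 * (model_accuracy - baseline_accuracy)

# Recall on the HIGH class: the naive rule never predicts HIGH.
baseline_high_recall = 0 / n_high
model_high_recall = correct_high / n_high

# Share of inter-subject variance not explained by the best regressor.
r_squared = 0.302
unexplained = 1 - r_squared

print(f"baseline accuracy = {baseline_accuracy:.3f}")
print(f"model accuracy    = {model_accuracy:.3f}")
print(f"gain              = {gain_pp:.1f} pp")
print(f"HIGH recall       = {baseline_high_recall:.2f} (baseline) vs {model_high_recall:.2f} (model)")
print(f"unexplained var.  = {unexplained:.3f}")
```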

Feature importance is dominated by CSP class separability and SMR strength, with additional contributions from resting spectral summaries (TAR, PSE, RPLα) and ERD/ERS measures. This suggests that baseline rhythm structure and early-trial separability carry complementary information about BCI literacy.

Limitations

Scope

This study is a proof-of-concept on a single public benchmark. We did not test on live EEG, novel populations, cross-session generalization, or non-MI paradigms. The pipeline is designed for binary left-vs-right hand imagery only.


Per-Subject Ground Truth Explorer

Load individual subject-level decoder accuracy data from the PhysioNet cohort.