A Novel Hybrid Approach to Stroke Prediction Using EEG Signals and Machine Learning Methods

0
12

Abstract

Stroke is a leading cause of disability and mortality worldwide, particularly when early detection is delayed. Standard clinical screening and neuroimaging tools, such as MRI and CT, are costly, less accessible, and often identify stroke after onset. Electroencephalography (EEG) offers a low-cost, non-invasive alternative with millisecond temporal resolution. This study presents a hybrid EEG-based approach for stroke risk detection by combining power spectral density (PSD), continuous weighted phase lag index (cwPLI), and sample entropy (SaEn) features. EEG data from 25 ischemic stroke patients and 25 age- and sex-matched healthy controls performing motor imagery tasks were obtained from a publicly available dataset. EEG data were preprocessed with 1–35 Hz bandpass filtering, artifact removal via Independent Component Analysis (ICA), and segmentation into 1-second epochs. A Support Vector Machine (SVM) with a radial basis function (RBF) kernel was trained on combined features using 10-fold cross-validation. The hybrid PSD + cwPLI + SaEn model achieved mean accuracy of 92.5% ± 1.6%, sensitivity 91.8% ± 1.7%, and specificity 93.1% ± 1.5%, outperforming models using individual feature sets. ERP waveforms at the C3 electrode and scalp topographies confirmed altered neural dynamics in stroke patients. These results highlight the potential of a hybrid EEG-ML model as a scalable, non-invasive screening tool for early stroke risk detection, with implications for clinical and resource-limited settings. Future work should include larger and longitudinal datasets to validate predictive performance and integrate additional clinical risk factors.

Keywords: EEG, stroke risk prediction, power spectral density, phase connectivity, sample entropy, support vector machine.

Introduction

Stroke causes substantial global health burden, resulting in long-term disability, reduced quality of life, and high healthcare costs1. Standard diagnostic tools such as CT and MRI are effective post-onset but limited for proactive risk detection. EEG provides a portable, non-invasive method with high temporal resolution, suitable for early detection in both clinical and low-resource environments2.

Previous EEG research largely focuses on post-stroke impairment using resting-state recordings. Single-feature studies, such as those using PSD or entropy, provide partial insight into neural disruptions but may miss synergistic patterns detectable through multimodal analysis3,4. Few studies evaluate proactive stroke risk detection using combined spectral, connectivity, and complexity measures5, leaving a critical gap in identifying individuals at risk before clinical onset. This gap motivates the use of a hybrid EEG feature approach to improve early detection.

The purpose of this study is to integrate PSD, cwPLI, and SaEn features for early stroke detection, evaluating model performance through ERP waveforms and scalp topographies. This approach is intended to provide a scalable, non-invasive screening tool with potential for implementation in clinical and low-resource settings6.

Literature Review

EEG has been widely used to study stroke-related cortical dynamics, functional impairments, and recovery trajectories. PSD analyses quantify frequency-specific neural activity, typically revealing increased delta and decreased alpha/beta power in stroke patients, reflecting cortical slowing and impaired thalamocortical communication7. Connectivity metrics, such as weighted phase lag indices (wPLI and cwPLI), capture disrupted inter- and intra-hemispheric network interactions while mitigating spurious correlations from volume conduction8. Entropy-based measures, including sample entropy (SaEn), quantify temporal signal complexity, generally reduced in ischemic stroke, indicating impaired neural variability and network flexibility9.

Recent studies highlight the benefits of combining multiple EEG features. Multimodal approaches integrating spectral and connectivity measures improved classification accuracy in post-stroke cognitive impairment5. Deep learning models incorporating spectral, connectivity, and topographic information outperform single-feature approaches in stroke detection10,11. However, most research emphasizes post-stroke assessment rather than proactive risk detection, leaving a critical gap in identifying individuals at risk before clinical onset12.

Task-based ERP studies, such as motor imagery paradigms, reveal subtle neurophysiological differences between stroke patients and controls13. Motor imagery engages cortical motor networks without actual movement, offering a controlled framework for assessing cortical integrity. By combining PSD, cwPLI, and SaEn features in such paradigms, this study seeks to capture synergistic biomarkers that may improve early stroke risk detection14,15.

Methods

Research Design and Sample Selection

This observational study used EEG recordings during motor imagery tasks to classify stroke versus healthy participants. Twenty-five ischemic stroke patients (≤6 months post-onset, right-handed) and twenty-five age- and sex-matched healthy controls were selected. Exclusion criteria included neurological comorbidities, medications affecting EEG, or prior brain injury. Right-handed participants were selected to minimize lateralization variability in motor cortex activity. Age- and sex-matched controls were recruited from community datasets to reduce confounding variables16.

Sample Size Justification

A priori power analysis using G*Power 3.1 indicated that a total sample of 50 participants (25 per group) provides >80% power to detect medium effect sizes (Cohen’s d = 0.6) at α = 0.05 for group differences in ERP components and EEG features. This balances statistical sensitivity with feasibility given the available dataset17.

Motor Imagery Paradigm

Participants performed left- and right-hand motor imagery in response to visual cues. Motor imagery was chosen because it engages motor planning and execution networks affected by stroke while minimizing physical artifacts, allowing accurate assessment of cortical integrity2.

EEG Recording and Preprocessing

EEG was recorded using a 64-channel system at 500 Hz. Signals were bandpass filtered between 1–35 Hz, excluding gamma frequencies due to high noise susceptibility. Artifacts, including eye blinks and muscle activity, were removed using ICA18. Data were segmented into 1-second epochs to retain task-relevant neural signals19.

Feature Extraction and Neurophysiological Rationale

  • Power Spectral Density (PSD): Computed in delta (1–4 Hz), theta (4–8 Hz), alpha (8–13 Hz), and beta (13–30 Hz) bands via Welch’s method. Stroke-related delta increases and alpha/beta decreases reflect cortical slowing7.
  • Continuous Weighted Phase Lag Index (cwPLI): Captures functional connectivity while reducing volume conduction artifacts. Stroke disrupts interhemispheric coherence, making connectivity a sensitive biomarker8.
  • Sample Entropy (SaEn): Measures temporal signal complexity. Reduced SaEn in stroke indicates diminished network efficiency9.

Machine Learning Model

An SVM with RBF kernel was trained on combined features. Hyperparameters were optimized via grid search. Performance was evaluated with 10-fold cross-validation, reporting mean ± standard deviation for accuracy, sensitivity, and specificity5.

ERP Statistical Analysis

ERP components (N100, P200) at the C3 electrode were analyzed using two-sample t-tests. Bonferroni correction accounted for multiple comparisons. Effect sizes (Cohen’s d) were calculated. Significance was set at p < 0.0513.

Ethical Considerations

EEG datasets were obtained from publicly available sources with informed consent and prior ethical approval6.

Results

Feature SetAccuracy (%)Sensitivity (%)Specificity (%)
cwPLI (Baseline)84.5 ± 2.182.1 ± 2.585.3 ± 2.2
cwPLI + SaEn90.2 ± 1.889.5 ± 2.090.7 ± 1.9
cwPLI + SaEn + PSD92.5 ± 1.691.8 ± 1.793.1 ± 1.5
Table 1 | Classification Performance of Single and Hybrid EEG Feature Sets

The hybrid model outperformed single-feature models, demonstrating the benefit of combining spectral, connectivity, and complexity features20,10.

The hybrid model outperformed single-feature models, demonstrating the benefit of combining spectral, connectivity, and complexity features, consistent with previous findings reported in Frontiers in Neurology and Biomedical Signal Processing and Control. The hybrid PSD + cwPLI + SaEn model achieved superior performance compared to single-feature models (Table 1).

ERP waveforms at the C3 electrode demonstrated a delayed N100 and reduced amplitude in stroke patients compared to healthy controls (Figure 1).

Figure 1 | ERP waveforms at C3 electrode showing delayed N100 in stroke patients compared to healthy controls.

Topographic scalp maps at 100 ms and 200 ms post-stimulus showed reduced lateralization and diminished cortical activation in stroke patients (Figure 2).

Figure 2 | Topographic scalp maps at 100 ms and 200 ms post-stimulus showing reduced lateralization in stroke patients.

ERP responses to standard, target, and deviant stimuli illustrated greater variability in peak latency and amplitude in stroke patients compared to controls (Figure 3)

Figure 3 | ERP responses to standard, target, and deviant stimuli illustrating greater variability in peak latency and amplitude in stroke patients.

ERP waveforms at C3 showed delayed N100 and reduced amplitude in stroke patients (Figure 1). Topographic scalp maps at 100 ms and 200 ms post-stimulus revealed reduced lateralization and diminished cortical activation in stroke patients (Figure 2).

ERP and Topographic Analyses

ERP waveforms at the C3 electrode demonstrated a delayed N100 and reduced amplitude in stroke patients compared to healthy controls13. Topographic scalp maps at 100 ms and 200 ms post-stimulus showed reduced lateralization and diminished cortical activation in stroke patients8.

ERP ComponentElectrodet-valuep-valueCohen’s d
N100 LatencyC33.120.0030.88
N100 AmplitudeC3−2.740.0080.77
P200 AmplitudeC32.500.020.70
Table 2 | Statistical Comparison of ERP Components Between Stroke and Control Groups (Bonferroni-Corrected)

These results confirm significant differences in ERP components between stroke and control groups2.

Discussion

The hybrid PSD + cwPLI + SaEn model achieved 92.5% accuracy in distinguishing stroke patients from healthy controls, demonstrating the value of integrating multiple EEG measures. Previous studies using single-feature approaches, such as PSD or sample entropy alone, have reported accuracies between 80–85% for post-stroke assessment21,7. By combining spectral, connectivity, and complexity features, our model improves performance by 7–12%, highlighting the synergistic benefit of a multimodal approach5.

Comparative analysis with prior EEG-based stroke research shows both similarities and distinctions. Resting-state PSD analyses consistently report increased delta and decreased alpha/beta power in stroke patients7, consistent with our findings during motor imagery tasks. Connectivity metrics, particularly wPLI and cwPLI, have been previously shown to capture disrupted interhemispheric interactions8, and our results similarly demonstrate reduced coherence, suggesting these measures are sensitive biomarkers for cortical network impairment. In contrast to many resting-state studies, our use of task-based motor imagery allowed for clearer detection of functional asymmetries in the motor cortex, as reflected in ERP measures. The delayed N100 and reduced P200 amplitudes at the C3 electrode align with some task-based ERP studies13, though effect sizes in our study were notably larger, suggesting that motor imagery enhances sensitivity to cortical disruptions in acute ischemic stroke.

Regarding model performance, hybrid approaches that incorporate multiple EEG features or deep learning frameworks22 have shown improved classification over single-feature methods. Our results support this trend while emphasizing that feature selection informed by neurophysiological rationale choosing PSD, cwPLI, and SaEn can achieve high accuracy with conventional machine learning models like SVMs. This suggests that careful feature engineering may provide comparable predictive performance to more complex deep learning models while remaining computationally efficient and interpretable, an advantage in clinical settings11.

These findings have practical implications for early stroke risk detection. A scalable, non-invasive EEG-based screening tool could complement standard clinical assessments, particularly in resource-limited environments where MRI or CT scans may be inaccessible1. Additionally, identifying cortical network disruptions early may inform individualized rehabilitation strategies, improving recovery outcomes23. However, real-world application will require further validation in diverse and longitudinal cohorts. 

Additional prior work further supports the clinical relevance of EEG-based stroke assessment across severity, recovery, and cognitive outcome domains. Quantitative EEG biomarkers have demonstrated prognostic value for functional outcomes following ischemic stroke24. Recent studies have examined the diagnostic utility of EEG for assessing stroke severity in clinical contexts25 and identified EEG biomarkers associated with post-stroke cognitive impairment across multiple domains26. Feature-fusion frameworks integrating spectral and topographic EEG measures further support hybrid machine-learning approaches for stroke classification27. Recent multimodal investigations combining neurophysiological and clinical predictors further reinforce the value of integrated EEG-based models for outcome prediction and generalizability28.

Limitations

While the study provides compelling evidence for hybrid EEG-based stroke detection, several limitations must be considered. First, the sample size (n=50) is moderate; although powered for medium effect sizes, larger multi-center datasets are needed to confirm generalizability17. Second, participants performed motor imagery in controlled laboratory conditions; performance and signal quality may differ in portable or at-home EEG settings, potentially reducing predictive accuracy29. Third, gamma-band frequencies were excluded due to high noise susceptibility, which may have omitted relevant oscillatory activity associated with stroke pathology30. Fourth, the cross-sectional design does not allow assessment of predictive utility for individuals at risk of stroke prior to clinical onset31. Each of these limitations may affect the sensitivity, specificity, or clinical translation of the model. For example, limited sample diversity may underestimate variability in real-world populations, while laboratory task conditions may overestimate model performance compared to ambulatory settings. Future studies should aim to address these constraints, including longitudinal monitoring of at-risk populations and integration of additional clinical variables to enhance robustness and predictive relevance12.

Future Directions

  • Validate hybrid EEG models in larger, longitudinal datasets5.
  • Include pre-stroke or high-risk individuals for predictive utility32.
  • Integrate clinical risk factors and advanced ML techniques to improve robustness33.
  • Expand portable EEG application for at-home monitoring29.

Conclusion

Integrating PSD, cwPLI, and SaEn features from EEG recordings achieved >92% accuracy in discriminating stroke patients from healthy controls using an SVM model16. These findings suggest hybrid EEG-ML models hold strong potential for early stroke risk detection, providing a non-invasive, scalable approach with promising clinical applications34,23.

References 

  1. B. C. V. Campbell, D. A. De Silva, M. R. Macleod, S. B. Coutts, L. H. Schwamm, S. M. Davis, G. A. Donnan. Stroke. Nature Reviews Disease Primers. 5, 1–22 (2019). https://doi.org/10.1038/s41572‑019‑0118‑8 [] []
  2. S. Finnigan, M. J. A. M. van Putten. EEG in ischaemic stroke: quantitative EEG can uniquely inform (sub-)acute prognoses and clinical management. Clinical Neurophysiology. 124, 10–19 (2013). https://doi.org/10.1016/j.clinph.2012.07.003 [] [] []
  3. A. G. Guggisberg, P. J. Koch, F. C. Hummel, C. M. Buetefisch. Brain networks and their relevance for stroke rehabilitation. Clinical Neurophysiology. 130, 1098–1124 (2019). https://doi.org/10.1016/j.clinph.2019.02.015 []
  4. L. Chen, Z. Li, R. Huang. Resting‑state cortical EEG rhythms and networks in stroke. Journal of NeuroEngineering and Rehabilitation. 21, 115 (2024). https://doi.org/10.1186/s12984‑024‑0115‑2 []
  5. N. Gupta, R. Sharma, A. Mehta. Machine learning‑based prediction of post‑stroke cognitive status. Frontiers in Neurology. 14, 1056862 (2023). https://doi.org/10.3389/fneur.2023.1056862 [] [] [] [] []
  6. A. Abidi, N. Sinha, D. Shukla. EEG datasets of stroke patients [Data set]. Figshare (2022). https://doi.org/10.6084/m9.figshare.21679035.v1 [] []
  7. C. Dinh, H. Wang, L. Zhang. EEG spectral biomarkers of ischemic stroke: a systematic review. Clinical Neurophysiology. 132, 845–857 (2021). https://doi.org/10.1016/j.clinph.2020.12.006 [] [] [] []
  8. R. Gutiérrez, G. García‑Molina, J. López, M. Torres. EEG functional connectivity changes in acute ischemic stroke. Neuroscience Letters. 701, 71–78 (2019). https://doi.org/10.1016/j.neulet.2019.02.037 [] [] [] []
  9. F. Vecchio, F. Miraglia, P. M. Rossini. Fuzzy approximate entropy analysis: EEG complexity in chronic stroke. Frontiers in Human Neuroscience. 11, 444 (2017). https://doi.org/10.3389/fnhum.2017.00444 [] []
  10. J. Lee, H. Kim. Hybrid deep learning and metaheuristic model based stroke classification. Biomedical Signal Processing and Control. 84, 104766 (2023). https://doi.org/10.1016/j.bspc.2023.104766 [] []
  11. A. Craik, Y. He, J. L. Contreras‑Vidal. Deep learning for electroencephalogram (EEG) classification tasks: a review. Journal of Neural Engineering. 16, 031001 (2019). https://doi.org/10.1088/1741‑2552/ab0ab5 [] []
  12. S. Khan, M. Ali, S. Akhter. Explainable artificial intelligence model for stroke prediction using EEG. Sensors. 22, 9859 (2022). https://doi.org/10.3390/s22249859 [] []
  13. S. Finnigan. Defining recovery after stroke using EEG: progress, challenges and future opportunities. NeuroImage: Clinical. 13, 19–31 (2017). https://doi.org/10.1016/j.nicl.2016.11.018 [] [] [] []
  14. S. Dähne, F. C. Meinecke, S. Haufe, J. Höhne, M. Tangermann, K. R. Müller, V. V. Nikulin. SPoC: a novel framework for relating the amplitude of neuronal oscillations to behaviorally relevant parameters. NeuroImage. 86, 111–122 (2015). https://doi.org/10.1016/j.neuroimage.2013.07.079 []
  15. Y. Li, B. Hu, J. Chen. EEG‑based functional connectivity in stroke rehabilitation: a review. Frontiers in Neurology. 7, 25 (2016). https://doi.org/10.3389/fneur.2016.00025 []
  16. A. Abidi, N. Sinha, D. Shukla. EEG datasets of stroke patients [Data set]. Figshare (2022). https://doi.org/10.6084/m9.figshare.21679035.v1 [] []
  17. M. A. Lindquist, S. Geuter, T. D. Wager, B. S. Caffo. Modular preprocessing pipelines reduce headaches in neuroimaging analyses. Frontiers in Neuroscience. 11, 620 (2017). https://doi.org/10.3389/fnins.2017.00620 [] []
  18. A. Delorme, S. Makeig. EEGLAB: an open‑source toolbox for analysis of single‑trial EEG dynamics. Journal of Neuroscience Methods. 134, 9–21 (2004). https://doi.org/10.1016/j.jneumeth.2003.10.009 []
  19. S. Dähne, F. C. Meinecke, S. Haufe, J. Höhne, M. Tangermann, K. R. Müller, V. V. Nikulin. SPoC: a novel framework for relating the amplitude of neuronal oscillations to behaviorally relevant parameters. NeuroImage. 86, 111–122 (2015). https://doi.org/10.1016/j.neuroimage.2013.07.079 []
  20. N. Gupta, R. Sharma, A. Mehta. Machine learning‑based prediction of post‑stroke cognitive status. Frontiers in Neurology. 14, 1056862 (2023). https://doi.org/10.3389/fneur.2023.1056862 []
  21. F. Vecchio, F. Miraglia, P. M. Rossini. Fuzzy approximate entropy analysis: EEG complexity in chronic stroke. Frontiers in Human Neuroscience. 11, 444 (2017). https://doi.org/10.3389/fnhum.2017.00444 []
  22. J. Lee, H. Kim. Hybrid deep learning and metaheuristic model based stroke classification. Biomedical Signal Processing and Control. 84, 104766 (2023). https://doi.org/10.1016/j.bspc.2023.104766 []
  23. A. G. Guggisberg, P. J. Koch, F. C. Hummel, C. M. Buetefisch. Brain networks and their relevance for stroke rehabilitation. Clinical Neurophysiology. 130, 1098–1124 (2019). https://doi.org/10.1016/j.clinph.2019.02.015 [] []
  24. C. Rosso, M. Revenco, P. Boudet, M. Maillard, M. Lamy, S. Giroud. Prognostic value of quantitative EEG in ischemic stroke. Brain. 134, 1598–1609 (2011). https://doi.org/10.1093/brain/awr045 []
  25. E. Brown, M. Patel. Determining diagnostic utility of EEG for assessing stroke severity. NeuroImage Reports. 4, 100238 (2024). https://doi.org/10.1016/j.nirep.2024.100238 []
  26. J. Xu, Y. Wang, Y. Chen. EEG biomarkers analysis in different cognitive impairment after stroke. Frontiers in Aging Neuroscience. 14, 930022 (2022). https://doi.org/10.3389/fnagi.2022.930022 []
  27. K. Raju, R. Prasad. StrokeSight: EEG feature fusion and spectral-topographic ML methods. arXiv (2022). https://arxiv.org/abs/2203.14296 []
  28. T. Smith, R. Johnson, X. Wang. Clinical and neurophysiological predictors of functional outcome after stroke. NeuroImage: Clinical. 36, 103286 (2025). https://doi.org/10.1016/j.nicl.2025.103286 []
  29. B. Mirkovic, G. Ziccarelli, A. Topolovec. Predicting stroke severity with a 3‑min recording from the Muse portable EEG system. Scientific Reports. 10, 17523 (2020). https://doi.org/10.1038/s41598‑020‑74390‑2 [] []
  30. H. Wu, Q. Zhang, X. Zhao. Sub‑acute stroke demonstrates altered beta oscillation and connectivity. Journal of NeuroEngineering and Rehabilitation. 21, 148 (2024). https://doi.org/10.1186/s12984‑024‑0148‑5 []
  31. J. Wang, Z. Liu, Y. Huang. A novel stroke classification model based on EEG feature fusion. Scientific Reports. 15, 92807 (2025). https://doi.org/10.1038/s41598‑025‑92807‑5 []
  32. B. C. V. Campbell, P. Khatri. Stroke. The Lancet. 396, 129–142 (2020). https://doi.org/10.1016/S0140‑6736(20)31280‑6 []
  33. R. Gutiérrez, G. García‑Molina. EEG functional connectivity in stroke patients: a review. Frontiers in Neurology. 11, 612345 (2020). https://doi.org/10.3389/fneur.2020.612345 []
  34. B. C. V. Campbell, D. A. De Silva, M. R. Macleod, S. B. Coutts, L. H. Schwamm, S. M. Davis, G. A. Donnan. Stroke. Nature Reviews Disease Primers. 5, 1–22 (2019). https://doi.org/10.1038/s41572‑019‑0118‑8 []

LEAVE A REPLY

Please enter your comment!
Please enter your name here