Hamidur Rahman, at our partner MDH, successfully completed a PhD on “Artificial Intelligence for Non-Contact-Based Driver Health Monitoring” in 2021. The extended abstract follows, and the publication is available here: http://urn.kb.se/resolve?urn=urn:nbn:se:mdh:diva-53529
In clinical situations, a patient’s physical state is often monitored by sensors attached to the patient, and medical staff are alerted if the patient’s status changes in an undesirable or life-threatening direction. However, in unsupervised situations, such as when driving a vehicle, connecting sensors to the driver is often troublesome and wired sensors may not produce sufficient quality due to factors such as movement and electrical disturbance. Using a camera as a non-contact sensor to extract physiological parameters based on video images offers a new paradigm for monitoring a driver’s health and mental state. Due to the advanced technical features in modern vehicles, driving is now faster, safer and more comfortable than before. To enhance transport security (i.e. to avoid unexpected traffic accidents), it is necessary to consider a vehicle driver as a part of the traffic environment and thus monitor the driver’s health and mental state. Such a monitoring system is commonly developed based on two approaches: driving behaviour-based and physiological parameters-based.
This research work demonstrates a non-contact approach that classifies a driver’s cognitive load based on physiological parameters obtained through a camera system and vehicular data collected from the controller area network (CAN), using image processing, computer vision, machine learning (ML) and deep learning (DL). In this research, a camera is used as a non-contact sensor and pervasive approach for measuring and monitoring the physiological parameters. The contribution of this research study is four-fold: 1) Feature extraction approach to extract physiological parameters (i.e. heart rate [HR], respiration rate [RR], inter-beat interval [IBI], heart rate variability [HRV] and oxygen saturation [SpO2]) using a camera system in several challenging conditions (i.e. illumination, motion, vibration and movement); 2) Feature extraction based on eye-movement parameters (i.e. saccade and fixation); 3) Identification of key vehicular parameters and extraction of useful features from lateral speed (SP), steering wheel angle (SWA), steering wheel reversal rate (SWRR), steering wheel torque (SWT), yaw rate (YR), lanex (LAN) and lateral position (LP); 4) Investigation of ML and DL algorithms for a driver’s cognitive load classification. Here, ML algorithms (i.e. logistic regression [LR], linear discriminant analysis [LDA], support vector machine [SVM], neural networks [NN], k-nearest neighbours [k-NN], decision tree [DT]) and DL algorithms (i.e. convolutional neural networks [CNN], long short-term memory [LSTM] networks and autoencoders [AE]) are used.
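To make the relationship between IBI, HR and HRV concrete, the following is a minimal sketch of common time-domain HRV features computed from a series of inter-beat intervals. It is not the thesis's feature extraction pipeline: the function name `hrv_features`, the choice of SDNN and RMSSD as features, and the example values are all assumptions made for illustration.

```python
import math
import statistics


def hrv_features(ibi_ms):
    """Return basic time-domain HRV features from inter-beat intervals (ms).

    Hypothetical helper for illustration; the thesis may use other features.
    """
    if len(ibi_ms) < 3:
        raise ValueError("need at least three inter-beat intervals")
    # Differences between successive intervals, used by RMSSD.
    diffs = [b - a for a, b in zip(ibi_ms, ibi_ms[1:])]
    squared_diffs = [d * d for d in diffs]
    return {
        # Mean heart rate in beats per minute (60 000 ms per minute).
        "mean_hr_bpm": 60000.0 / statistics.fmean(ibi_ms),
        # SDNN: sample standard deviation of the intervals.
        "sdnn_ms": statistics.stdev(ibi_ms),
        # RMSSD: root mean square of successive differences.
        "rmssd_ms": math.sqrt(statistics.fmean(squared_diffs)),
    }


# Made-up intervals, not data from the study.
example_ibi = [800, 810, 790, 805, 795]
features = hrv_features(example_ibi)
print(features)
```

In a camera-based setting the intervals would come from peaks detected in the video-derived pulse signal rather than from a wired sensor; the feature arithmetic itself is the same.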
One of the major contributions of this research work is that physiological parameters were extracted using a camera. According to the results, feature extraction based on physiological parameters using a camera achieved the highest correlation coefficient of .96 for both HR and SpO2 compared to a reference system. The Bland-Altman plots showed 95% agreement considering the correlation between the camera and the reference wired sensors. For IBI, the achieved quality index was 97.5% considering a 100 ms R-peak error. The correlation coefficients for 13 eye-movement features between the non-contact approach and the reference eye-tracking system ranged from .82 to .95.
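As an illustration of how a camera-based measurement can be compared with a reference sensor, here is a minimal sketch that computes a Pearson correlation coefficient and the Bland-Altman bias with 95% limits of agreement (bias ± 1.96 standard deviations of the differences). The function names and the heart-rate values are assumptions for the example, not the thesis's code or data.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def bland_altman(x, y):
    """Return (bias, lower limit, upper limit) of agreement between x and y."""
    diffs = [a - b for a, b in zip(x, y)]
    bias = statistics.fmean(diffs)
    sd = statistics.stdev(diffs)
    # 95% limits of agreement, assuming roughly normal differences.
    return bias, bias - 1.96 * sd, bias + 1.96 * sd


# Made-up paired readings in beats per minute, not data from the study.
camera_hr = [72.0, 75.0, 80.0, 68.0, 90.0]
reference_hr = [71.0, 76.0, 79.0, 69.0, 91.0]

r = pearson_r(camera_hr, reference_hr)
bias, lower, upper = bland_altman(camera_hr, reference_hr)
print(f"r = {r:.3f}, bias = {bias:.2f}, limits = [{lower:.2f}, {upper:.2f}]")
```

The two measures answer different questions: the correlation shows whether the two systems move together, while the limits of agreement show how far individual camera readings may deviate from the reference.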
For cognitive load classification using both the physiological and vehicular parameters, two separate studies were conducted: Study 1 with the 1-back task and Study 2 with the 2-back task. Finally, the highest average accuracy achieved in terms of cognitive load classification was 94% for Study 1 and 82% for Study 2 using LR algorithms considering the HRV parameter. The highest average classification accuracy of cognitive load was 92% using SVM considering saccade and fixation parameters. In both cases, k-fold cross-validation was used for the validation, where the value of k was 10. The classification accuracies using CNN, LSTM and autoencoder were 91%, 90%, and 90.3%, respectively.
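The 10-fold cross-validation mentioned above can be sketched as follows. This is only an illustration of the validation scheme: the classifier is a deliberately simple stand-in (a threshold halfway between the two class means on a single feature), not the LR or SVM models used in the thesis, and the function names, the random seed and the synthetic feature values are all assumptions.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Shuffle sample indices and split them into k disjoint folds."""
    if n_samples < k:
        raise ValueError("need at least k samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def fit_threshold(values, labels):
    """Stand-in classifier: threshold at the midpoint of the class means."""
    low = [v for v, y in zip(values, labels) if y == 0]
    high = [v for v, y in zip(values, labels) if y == 1]
    mean_low = sum(low) / len(low)
    mean_high = sum(high) / len(high)
    direction = 1 if mean_high >= mean_low else -1
    return (mean_low + mean_high) / 2, direction


def predict(model, value):
    threshold, direction = model
    return 1 if direction * (value - threshold) > 0 else 0


def cross_validate(values, labels, k=10, seed=0):
    """Average accuracy over k folds, training on the other k - 1 folds."""
    accuracies = []
    for test_idx in k_fold_indices(len(values), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(values)) if i not in held_out]
        model = fit_threshold([values[i] for i in train_idx],
                              [labels[i] for i in train_idx])
        correct = sum(predict(model, values[i]) == labels[i] for i in test_idx)
        accuracies.append(correct / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Synthetic single-feature data (0 = low load, 1 = high load), not study data.
rng = random.Random(1)
values = [rng.gauss(50, 5) for _ in range(50)] + [rng.gauss(35, 5) for _ in range(50)]
labels = [0] * 50 + [1] * 50

folds = k_fold_indices(len(values), k=10)
mean_accuracy = cross_validate(values, labels, k=10)
print(f"mean 10-fold accuracy: {mean_accuracy:.2f}")
```

Each sample is used for testing exactly once, and the reported figure is the mean of the ten per-fold accuracies. The folds here are not stratified by class; whether the thesis used stratification is not stated in the abstract.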
This research study shows that such a non-contact-based approach using ML, DL, image processing and computer vision is suitable for monitoring a driver’s cognitive state.
Place, publisher, year, edition, pages
Västerås: Mälardalen University, 2021.
Mälardalen University Press Dissertations, ISSN 1651-4238; 330