Inclusive Human Intention Prediction with Wearable Sensors: Machine Learning Techniques for the Reaching Task Use Case

Human intention prediction is gaining importance with the increase in human–robot interaction challenges in several contexts, such as industrial and clinical ones. This paper compares the performance of Linear Discriminant Analysis (LDA) and Random Forest (RF) in predicting the intention of moving towards a target during reaching movements, using data from ten subjects wearing four electromagnetic sensors. LDA and RF prediction accuracy is compared with respect to observation-sample size, presence of noise, and training and prediction time. Both algorithms achieved good accuracy, which improves as the sample size increases, although LDA gives better results on the current dataset.


Introduction
Predicting human intentions by collecting and analyzing body signals is one of the main goals in human-robot interaction [1]. Accurate and real-time recognition of human motion intention could help in achieving suitable human-machine coordination [2] for both interactive robotic interfaces, like collaborative robots, and diagnostic systems, such as rehabilitation devices [3].
Several kinds of sensors are currently used to detect body signals, such as surface electromyography [2,4], electroencephalography [3], and accelerometers. In recent years, research on human movement pattern recognition with the support of wearable sensors has been widely conducted [2,3,5], also considering the effect of sensor positioning on the obtained data [6,7]. Indeed, wearable sensors allow noninvasive motion detection, full integration with commercially available devices [1], the acquisition of acceleration and velocity to reconstruct the detected movement [8], and adaptation to inter- and intra-individual variability [9]. Since body signals are strongly affected by a lack of repeatability [2] and motion is subject-dependent, predicting human intention becomes more challenging in specific scenarios, such as the clinical environment, where a pathological subject can present peculiar motion patterns. In particular, laboratory-based optical motion analysis systems are widely adopted for the periodic assessment of stroke condition during rehabilitation [10], providing multiple bio-signals that are useful for recognizing pathological symptoms and improving the healing rate of rehabilitation [11]. Therefore, knowledge of the natural behavior and movement patterns expected of a healthy subject becomes crucial for a correct evaluation.
Among all possible movements, the reaching task is fundamental for the activities of daily living [12] because of the relevance of its functional aim.
Since many different strategies can be used to perform the same task [5,13], predictive models and machine learning algorithms are particularly suitable for analyzing the signals and predicting movement intention [2]. Developing suitable working methodologies is necessary, and machine learning techniques can cope with the limitations of small datasets. The literature provides various examples of machine learning techniques applied to human motion analysis. For instance, in [14] Linear Discriminant Analysis (LDA), Support Vector Machine and k-Nearest Neighbor algorithms were applied to the identification of natural hand gestures, whereas Li et al. [15] exploited the Random Forest (RF) algorithm to discriminate eight different motions of the upper limb.
This study aims to compare LDA and RF machine learning techniques' performance in predicting the subject's intention of moving towards a specific direction or target in the illustrative scenario of a reaching movement, using data gathered from wearable electromagnetic sensors.

Participants
A convenience sample of ten healthy subjects (nine right-handed) was recruited from January to October 2009. Inclusion criteria were: (i) age over 18 years, and (ii) no current or previous neurological or orthopedic pathology of the upper arm. The study was approved by the CPP Ile de France 8 ethical committee; recruited subjects gave their written informed consent to participate, and procedures were conducted according to the Declaration of Helsinki.

Protocol
Testing sessions were performed in the morning, under the same environmental conditions. In each session, after a preliminary trial for familiarization with the procedure, the operator asked the subject to perform six repetitions of a unilateral sitting reaching movement, three with the right arm and three with the left arm. As depicted in [16], each subject was asked to perform the movement three times for each combination of direction (internal, middle, external), height (high, low), and distance (close, far).
The order in which the targets were presented to the subjects was standardized: close-middle (CM), far-internal (FI), high-external (HE), far-middle (FM), close-external (CE), high-internal (HI), close-internal (CI), far-external (FE), high-middle (HM). The subjects were required to touch each target with the provided pointer and then return to the initial position, moving at a comfortable speed.

Experimental Setup
Subjects were seated on a chair, adjusted so that the table was at navel level. They wore a wrist splint to which a pointer was rigidly attached to simulate an extended index finger. The subjects' trunk was fixed to the chair back with a wide strap. For each subject, four electromagnetic sensors were placed by a trained operator on the (i) acromion, (ii) upper third of the humerus, (iii) wrist dorsum, and (iv) manubrium, respectively. During the acquisitions, the Polhemus Fastrak electromagnetic tracking system was used, which provides the position and orientation of each sensor as timestamped vector triplets (X, Y, Z) and (α, β, γ) at an output frequency of 30 Hz [17]. The system presents a Root Mean Square (RMS) static accuracy of 0.8 mm for the X, Y, and Z receiver position and 0.15° for the receiver orientation, with resolutions of 0.0005 cm/cm of range and 0.025°, respectively.
Nine targets were positioned along three directions: (i) middle, on a parasagittal line emanating from the subject's shoulder, (ii) internal, and (iii) external, inclined at ±45° with respect to the parasagittal line. Each target consisted of a strip of red tape, 10 mm wide, on a vertical stick 15 mm in diameter. Figure 1 depicts the experimental setup. Distances between targets and subject were parametrized with respect to the anatomical upper limb length, i.e., the distance between the acromion and the end of the pointer. Two distances were considered: (a) far, corresponding to 90% of the total upper limb length, and (b) close, equal to 65% of the upper limb length. Six targets were placed 70 mm above the table level, and three were placed above the distal sensor, at the same height as the acromion from the table surface.

Data Treatment
Data were processed in the MATLAB environment using only the information coming from a first sample of the acquired data. To create a dataset comparable across trials and subjects, the acquired signals were trimmed to the actual movement portion. To identify the motion starting and ending points, the absolute value of the hand velocity was considered. Therefore, the absolute value of the hand position was calculated for each instant t, and the velocity was then computed according to a two-point derivative approximation. Following the literature [18,19], the velocity was filtered with a fourth-order, zero-phase, low-pass Butterworth filter to remove noise, with a cutoff frequency of 3 Hz [20]. The subject's resting condition was identified as the mean value of the first and last ten acquired data samples, corresponding to an interval of 0.33 s. The starting and ending points of the movement were automatically selected by custom-made code as the first and last time instants at which the absolute value of the position first derivative exceeds a selected threshold. This threshold was iteratively identified by comparing the variance in the observation sample with the variance the subject presents at rest: if the variance is higher than 5 × 10−3 mm, the threshold values are reduced by 1 × 10−3 mm. Acquired data were then normalized in amplitude with respect to subject anthropometric quantities, computed for each subject from the hand, arm and shoulder positions.
The relative shoulder-to-trunk, arm-to-shoulder and hand-to-arm sensor distances during the subject's resting phase were computed for each trial. For each subject, the average values of these nine quantities were calculated and used as reference values for data normalization. To simulate data coming from accelerometers placed on the subjects, the second derivatives of the sensors' positions were computed, applying the two-point derivative twice and filtering the result.
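The accelerometer simulation can be sketched as below; the function name is a hypothetical placeholder, and applying the same 3 Hz Butterworth filter used for the velocity signal is an assumption.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 30.0  # sampling frequency [Hz]

def simulated_acceleration(pos):
    """Approximate accelerometer data from a position signal.

    Applies the two-point derivative twice, then low-pass filters the
    result (fourth-order, zero-phase Butterworth, 3 Hz cutoff).
    """
    acc = np.gradient(np.gradient(pos, 1.0 / FS), 1.0 / FS)
    b, a = butter(4, 3.0 / (FS / 2.0), btype="low")
    return filtfilt(b, a, acc)
```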
Linear and angular position, velocity and acceleration signals were analyzed to identify a set of features for the implementation of the machine learning algorithms, and a portion of the overall motion was considered as the Observation Window (OW). Since the motion duration is unknown in advance, two approaches, subject- and trial-dependent, were used to evaluate the OW size: (i) custom window, computing an observation time for each trial using the information on the motion length, and (ii) average window, exploiting all the available data of all the subjects to compute a fixed OW. The evaluated features are the minimum, maximum and root-mean-square of: (i) the sensor position (SP) components, (ii) the sensor velocity modulus, i.e., the first derivative of SP, (iii) the sensor acceleration modulus, i.e., the second derivative of SP, and (iv) the Euler angles. For each trial and subject, the computed features were rescaled to [−0.80, +0.80].
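The feature computation over an OW can be sketched as follows. The exact axis over which the rescaling was performed is not specified in the text, so the per-vector min–max rescaling shown here is an assumption, as are the function names.

```python
import numpy as np

def ow_features(signal):
    """Minimum, maximum and root-mean-square of a signal over the OW."""
    return np.array([signal.min(),
                     signal.max(),
                     np.sqrt(np.mean(signal ** 2))])

def rescale(features, lo=-0.80, hi=0.80):
    """Linearly rescale a feature vector to the [-0.80, +0.80] interval."""
    fmin, fmax = features.min(), features.max()
    return lo + (features - fmin) * (hi - lo) / (fmax - fmin)
```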
LDA and RF algorithms were implemented and trained using, respectively, 85% and 90% of the data; in both cases, the data chosen for the training phase were randomly selected. In the testing phase, the remaining 15% and 10% of the dataset were used for LDA and RF, respectively. Twenty different combinations of these parameters were tested; Table 1 depicts the features selected in the conducted tests. Two different conditions were evaluated for the OW size: 1/10 and 1/7 of the total motion time length. The first ten tests use the average window approach, whereas the remaining tests use the custom window one. For each test, both machine learning algorithms were evaluated. To analyze the algorithms' robustness, the same tests were repeated after adding Gaussian noise (noise sample power 0.04 dBW, load impedance 0.4 ohm) to the recorded data before computing the features. LDA and RF prediction accuracy was computed and compared with respect to data sample size, number of considered features, and OW type. Accuracy was computed according to [21,22] and, for RF, an out-of-bag (OOB) approach was also used.

Table 2 depicts all the obtained results, averaged over 200 consecutive tests. Considering an OW equal to 1/10 of the total movement (average time length of 0.27 s), LDA presents, in the best case, an intention prediction accuracy of 86.13%, with a Standard Deviation (SD) of 0.036, and RF an accuracy of 73.73%, with an SD of 0.015. Increasing the sample to 1/7 of the motion (average time length of 0.37 s), the intention prediction accuracy rises to 92.80% with an SD of 0.027 for LDA, and to 84.60% with an SD of 0.010 for RF. Comparing the results obtained with and without noise, LDA presents a maximum difference of −1.53% and an average difference of −0.61%; these values decrease to −1.28% and −0.43%, respectively, for RF.
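The evaluation scheme can be sketched with scikit-learn as below. The feature matrix here is purely synthetic (the real dataset is not reproduced); only the train/test fractions and the repetition over random splits follow the protocol described above.

```python
import numpy as np
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Synthetic stand-in for the real feature matrix: 9 target classes,
# feature vectors already rescaled to [-0.80, +0.80]
X = rng.uniform(-0.80, 0.80, size=(540, 12))
y = rng.integers(0, 9, size=540)
X[:, 0] += 0.1 * y  # make the classes weakly separable

def mean_accuracy(model, test_size, n_repeats=20):
    """Average accuracy over repeated random train/test splits."""
    scores = []
    for seed in range(n_repeats):
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=seed)
        scores.append(model.fit(X_tr, y_tr).score(X_te, y_te))
    return float(np.mean(scores))

# LDA: 85% training / 15% testing; RF: 90% / 10%
acc_lda = mean_accuracy(LinearDiscriminantAnalysis(), test_size=0.15)
acc_rf = mean_accuracy(RandomForestClassifier(n_estimators=40),
                       test_size=0.10)
```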
Finally, the RF algorithm demands an average training time of 1.14 s (range: 0.87–1.88 s), which decreases to an average of 0.078 s (range: 0.035–0.28 s) for LDA. The prediction time was computed only for the tests with the highest accuracy, i.e., tests 1, 2, 6 and 7. The average prediction time is 31 × 10−4 s (range: 30 × 10−4–33 × 10−4 s) for RF and 11 × 10−5 s (range: 10 × 10−5–12 × 10−5 s) for LDA.

Discussion
Comparing the intention prediction performance of the algorithms with respect to the OW size, better results were obtained with larger windows. Nevertheless, a reasonable limit should be imposed on the window size to avoid the intention being predicted only when the movement is close to its end. For the features based on Euler angles, sensor position and velocity, as the window width increases from 1/10 to 1/7, accuracy improves by more than 10% for both algorithms in all the tests. When acceleration features are considered, the improvement achieved by a wider OW is about five percentage points, with an SD close to 1%. This behavior can likely be explained by the fact that the features calculated on the acceleration are affected by the noise generated by the double derivation. Nevertheless, the derived acceleration provides a qualitative estimate of the results that the algorithms could achieve when processing acceleration data from accelerometers or inertial measurement units (IMU). This interpretation is supported by the results on the noise-added data: in the tests where acceleration features are considered, noise does not significantly affect the accuracy, arguably because the computed acceleration signal is already noisy.

Focusing on the time dimension, LDA exhibits considerably shorter training times than RF. To decrease the RF training time, the number of trees in the forest can be reduced: a preliminary analysis revealed that, beyond about 40 trees, the accuracy of the algorithm tends to a horizontal asymptote. Likewise, LDA presents significantly shorter prediction times than RF in all the tests.
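The trade-off between forest size and accuracy can be explored with a sweep like the one below; the dataset generated here is synthetic and the specific parameter values are illustrative assumptions, not the study's data.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

# Synthetic 9-class problem standing in for the reaching dataset
X, y = make_classification(n_samples=400, n_features=12, n_informative=6,
                           n_classes=9, n_clusters_per_class=1,
                           random_state=0)

# Cross-validated accuracy as a function of forest size: beyond a few
# tens of trees the curve typically flattens, so trees (and hence
# training time) can be cut with little accuracy loss
scores = {}
for n_trees in (5, 10, 20, 40, 80):
    rf = RandomForestClassifier(n_estimators=n_trees, random_state=0)
    scores[n_trees] = cross_val_score(rf, X, y, cv=5).mean()
```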

Conclusions
This paper investigates the human reaching movement, comparing the performance of LDA and RF in predicting subjects' intention of moving towards a specific direction or target when analyzing data gathered from wearable electromagnetic sensors. A campaign on ten healthy subjects was performed, and features computed on measured and derived signals were evaluated. The analyses revealed that the OW size is a crucial quantity: the wider the window, the better the prediction performance. The introduction of noise does not significantly affect the prediction performance of either algorithm when acceleration features are also considered. Both machine learning techniques demonstrate good accuracy, although LDA presents more promising results in terms of accuracy, training time and prediction time on the current dataset.
Further experimental campaigns, including different kinds of sensors or positioning strategies, are currently under evaluation. Indeed, actual acceleration data gathered from accelerometers and/or IMUs would allow an experimental validation of the hypotheses about the acceleration features. Moreover, the use of different wearable sensors could make the acquisition system less invasive for the subject and more flexible, promoting, for instance, the use of widespread and cheaper electronic devices, such as smartphones or smartwatches.