A time series autoencoder for load identification via dimensionality reduction of sensor recordings

Current progress in sensor technology is paving the way toward satisfactory solutions to challenging engineering problems, such as system identification and Structural Health Monitoring (SHM). In civil engineering, SHM is often based on the analysis of vibrational recordings, represented by time histories of displacements and/or accelerations, collected through pervasive sensor networks and shaped as Multivariate Time Series (MTS). Despite the great advances in soft computing techniques such as neural networks, inverse problems featuring regression tasks on raw vibrational measurements are still challenging. Developing dimensionality reduction tools, able to infer complex correlations within and across the recorded time series, is therefore of paramount importance. In this work, we designed an AutoEncoder (AE) capable of condensing MTS-shaped data into a reduced format featuring only a few latent variables. The obtained reduced data representation facilitates the solution of inverse problems, such as the identification of the parameters governing the dynamic load applied to a structural system. Numerical examples, aimed at the identification of the loading conditions on a shear-type building, are reported to assess the effectiveness of the proposed procedure.


Introduction
Data collected by pervasive sensor networks have to be processed, since they are usually unmanageable in their raw form. Their dimension is the principal obstacle to their use, while their information content is typically highly redundant. Synthetic features such as spectral peak frequencies, usually exploited when the acquired data are shaped as Time Series (TS), are extracted to solve engineering tasks like load identification and Structural Health Monitoring (SHM) [1,2]. Deep Learning (DL) allows features to be extracted from the data according to the required task, avoiding any preliminary feature design [3-6]. Among DL techniques, AutoEncoders (AEs) are a special type of Neural Network (NN) able to obtain a reduced data representation [7], also called latent representation, without specifying the task the reduced data representation must be used for.
The NN architecture employed by an AE is usually deep or, in other words, involves multiple sequential transformations. The advantages of employing AEs are manifold: (i) no feature engineering is necessary; (ii) the obtained reduced data representation can be used for different tasks; (iii) once the number of latent variables is set, they provide the most informative data representation or, at least, the one that best allows the data to be reconstructed. Thanks to their reduced number, latent variables are often interpretable, but only at the price of knowing something about what lies behind the variability of the collected data [8].
In the following, a novel TS AE is proposed for the dimensionality reduction of pseudo-experimental Multivariate Time Series (MTS) recordings of the displacement response of a two-storey shear building. The effectiveness of the dimensionality reduction is judged by the ability of the AE to reconstruct the input signals from their latent representation. Although no task-oriented feature engineering is performed a priori, the obtained reduced data representation allows the identification of the load conditions applied to the building.

Methodology: A Deep Autoencoder for Load Identification
A Neural Network (NN) is a collection of units, called neurons. In its basic form, each neuron performs a linear combination of its input $V \in \mathbb{R}^{L}$ (which reads $v_b$ for the AE input channels, see below) via a weight vector $\omega$, and then applies a nonlinear activation function $\zeta$. If a set of $L$ neurons, called a layer, is applied to $V$, the output becomes a vector $U(V, \Omega) \in \mathbb{R}^{L}$, where $\Omega = [\omega_1, \ldots, \omega_L]$. Many layers can be stacked one after another, making the NN architecture deep.
A special type of NN layer is the convolutional one, which allows correlations within and across the inputs to be inferred, whenever the inputs are shaped as a collection of one-dimensional arrays. In this work, each input is an MTS $V = [v_1, \ldots, v_N] \in \mathbb{R}^{L \times N}$ acquired by a sensor system employing $N$ sensors, each sampling $L$ displacement values within a time interval $(0, T)$. The output $U(V, \Omega) = [u_1, \ldots, u_{N_{out}}]$ of a one-dimensional convolutional layer then reads
$$u_n = \zeta\left(\sum_{b=1}^{N} \omega_n^b * v_b\right), \quad n = 1, \ldots, N_{out}, \tag{1}$$
where: $* : \mathbb{R}^{H_{out}} \times \mathbb{R}^{L} \to \mathbb{R}^{L}$ is the discrete convolution operator [9]; $\Omega_n = [\omega_n^1, \ldots, \omega_n^N] \in \mathbb{R}^{H_{out} \times N}$ are the weights applied to $v_b$ (with $b = 1, \ldots, N$); $\Omega = [\Omega_1, \ldots, \Omega_{N_{out}}] \in \mathbb{R}^{H_{out} \times N \times N_{out}}$ collects all the layer weights; $H_{out}$ is the kernel dimension; $N$ also represents the number of channels of the input layer; $N_{out}$ is the number of channels of the output layer.
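As a minimal sketch of the layer just described, the NumPy snippet below applies a one-dimensional convolutional layer to an MTS. The function name `conv1d_layer`, the `tanh` activation, the absence of a bias term, the kernel size and number of output channels, and the zero-padded "same" mode (chosen so that every output channel keeps length L, as the mapping of the convolution operator suggests) are assumptions made for illustration, not details taken from the text.

```python
import numpy as np

def conv1d_layer(V, Omega, activation=np.tanh):
    """Apply a one-dimensional convolutional layer.

    V     : (L, N) input MTS, one column per channel v_b.
    Omega : (H_out, N, N_out) kernels; Omega[:, b, n] acts on v_b.
    Returns U with shape (L, N_out).
    """
    L, N = V.shape
    H_out, N_in, N_out = Omega.shape
    assert N_in == N, "kernel and input channel counts must match"
    assert H_out <= L, "kernel longer than the signal"
    U = np.empty((L, N_out))
    for n in range(N_out):
        # Sum the convolutions of every input channel with its own kernel.
        acc = np.zeros(L)
        for b in range(N):
            acc += np.convolve(V[:, b], Omega[:, b, n], mode="same")
        U[:, n] = activation(acc)
    return U

rng = np.random.default_rng(0)
V = rng.standard_normal((250, 2))             # L = 250 samples, N = 2 sensors
Omega = 0.1 * rng.standard_normal((5, 2, 8))  # hypothetical H_out = 5, N_out = 8
U = conv1d_layer(V, Omega)
print(U.shape)  # (250, 8)
```

Most deep learning libraries implement this layer as a cross-correlation, i.e., without flipping the kernel; since the kernels are learned, the two conventions differ only by a reversal of the weights.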
One-dimensional convolutional layers are the building blocks of the proposed AE, which is composed of an encoder enc and a decoder dec. The encoder maps the input $V$ into a latent representation $z = z(V) \in \mathbb{R}^{P}$, with $P \ll (L \times N)$, while the decoder maps $z$ into a two-dimensional array $U = U(z) \in \mathbb{R}^{L \times N}$. Since $U$ is shaped as $V$, we can force the AE to reconstruct $V$ from $z$ by defining as loss function the reconstruction error $c(V, U)$ (Equation (2)), to be minimised by the NN during the training, which consists in tuning the weights $\Omega$ ruling the layer operations. The latent representation $z$ can be used to solve a regression problem, involving the identification of the parameter vector $\eta \in \mathbb{R}^{Q}$ governing, e.g., the loadings applied to the structure. If the decoder can (almost perfectly) reconstruct $V$ starting from $z$, then $z$ condenses all the relevant information of $V$. As shown in Figure 1, an NN-based regression model $r$ is employed to retrieve $\eta$ starting from $z$, thereby accomplishing the load identification task. To train $r$, a loss function $c_r(\eta, u_r)$ is defined as done in Equation (2), where $u_r \in \mathbb{R}^{Q}$ is the prediction of $r$. The training of the AE and of $r$ takes place sequentially, first minimising $c(V, U)$, and then minimising $c_r(\eta, u_r)$. A popular first-order stochastic gradient descent algorithm, called Adam [10], has been employed for both training tasks.
Eng. Proc. 2020, 2, 34
Figure 1. Proposed procedure for the regression of $\eta$ on $z$. First (black part), the AE is trained by minimising $c(V, U)$; next (orange part), $r$ is trained by minimising $c_r(\eta, u_r)$. TS AE stands for Time Series AutoEncoder.
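The sequential training described above can be sketched on a toy problem. This is not the convolutional architecture of this work: linear maps stand in for the encoder, the decoder and the regression model, a mean squared error is assumed for both $c$ and $c_r$, and the data are synthetic. All names, sizes and hyperparameters (`W_e`, `W_d`, `W_r`, learning rate, number of iterations) are hypothetical; only the two-stage logic and the use of Adam follow the text.

```python
import numpy as np

rng = np.random.default_rng(0)

def adam_update(p, g, s, lr=1e-2, b1=0.9, b2=0.999, eps=1e-8):
    """One in-place Adam step on parameter p, given gradient g and state s."""
    s["t"] += 1
    s["m"] = b1 * s["m"] + (1 - b1) * g
    s["v"] = b2 * s["v"] + (1 - b2) * g**2
    m_hat = s["m"] / (1 - b1 ** s["t"])
    v_hat = s["v"] / (1 - b2 ** s["t"])
    p -= lr * m_hat / (np.sqrt(v_hat) + eps)

def adam_state(p):
    return {"t": 0, "m": np.zeros_like(p), "v": np.zeros_like(p)}

def mse(a, b):
    return float(np.mean((a - b) ** 2))

# Synthetic data: n flattened signals of size D, driven by Q = 2 parameters.
n, D, Q, P = 256, 20, 2, 2
eta = rng.uniform(-1.0, 1.0, size=(n, Q))
X = eta @ rng.standard_normal((Q, D))

# Stage 1: train encoder W_e and decoder W_d on the reconstruction loss c.
W_e = 0.1 * rng.standard_normal((D, P))
W_d = 0.1 * rng.standard_normal((P, D))
s_e, s_d = adam_state(W_e), adam_state(W_d)
c_start = mse(X @ W_e @ W_d, X)
for _ in range(500):
    Z = X @ W_e                              # latent representation
    dU = 2.0 * (Z @ W_d - X) / X.size        # gradient of c w.r.t. the output
    g_d = Z.T @ dU
    g_e = X.T @ (dU @ W_d.T)
    adam_update(W_d, g_d, s_d)
    adam_update(W_e, g_e, s_e)
c_end = mse(X @ W_e @ W_d, X)

# Stage 2: keep the encoder fixed and train the regressor W_r on c_r.
Z = X @ W_e
W_r = np.zeros((P, Q))
s_r = adam_state(W_r)
cr_start = mse(Z @ W_r, eta)
for _ in range(500):
    g_r = Z.T @ (2.0 * (Z @ W_r - eta) / eta.size)
    adam_update(W_r, g_r, s_r)
cr_end = mse(Z @ W_r, eta)

print(f"c:   {c_start:.4f} -> {c_end:.4f}")
print(f"c_r: {cr_start:.4f} -> {cr_end:.4f}")
```

Keeping the encoder fixed in the second stage is what makes the procedure sequential: the regressor only sees the latent variables, so its accuracy depends on how much information the first stage has retained.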

Results and Discussion
The lateral displacements of a two-storey building, shown in Figure 2, are monitored by a sensor system employing two sensors (one per floor), recording $L$ samples within the time window $(0, T)$. The output of the monitoring system is therefore an MTS $V \in \mathbb{R}^{L \times N}$, with $L = 250$ and $N = 2$. The dynamic response of the structure is simulated by means of a two-dimensional shear building model wherein, due to the distribution of mass and load-bearing elements, torsional effects have been disregarded. Damping has not been modelled, since it has a negligible effect on the identification of continuously excited structures [11,12]. We assumed that the applied lateral loads consist of forces applied at the floor levels, featuring a sinusoidal time dependence, ruled by the parameter $\varphi$, and an amplitude increasing linearly along the building height, governed by the parameter $\alpha$, i.e., $A_c = 0.5\, c\, \alpha \sin(2\pi\varphi t)$ with $c = 1, 2$. Therefore, the parameter vector $\eta = \{\alpha, \varphi\}$ is sufficient to fully describe the loading conditions. A uniform probability density function was associated with each parameter: $\mathcal{U}_\alpha(\alpha) = \ldots$
A dataset collecting 12,000 MTSs has been assembled to train the AE and $r$; 4000 additional MTSs, forming the validation set, have then been employed to avoid overfitting. The training dataset is processed several times (epochs); if the loss function computed on the validation set has not decreased for 50 epochs in a row, the training is stopped early. A test set gathering 512 MTSs has then been employed to verify the reconstruction capacity of the AE and the performance of the proposed load identification procedure.
The reconstruction capacity has been evaluated through two error measures, employing either a standardised $L_2$ norm or a standardised $L_\infty$ norm. The error measures have been computed for each reconstructed signal, and standardisation has been done by dividing the reconstruction error (either the $L_2$ or the $L_\infty$ norm) by the standard deviation of the original signal.
Without standardisation, small inaccuracies in reconstructing large displacements would have counted more than large inaccuracies at smaller scales.
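The effect of standardisation can be checked with a short NumPy sketch. The exact definitions of the two norms are not reproduced here, so the plain Euclidean norm and the maximum absolute error are assumed; the function name `standardised_errors`, the test signals and the time window $T = 1$ s are hypothetical.

```python
import numpy as np

def standardised_errors(v, u):
    """Standardised L2 and L-infinity reconstruction errors of u against v."""
    v = np.asarray(v, dtype=float)
    u = np.asarray(u, dtype=float)
    err = u - v
    sigma = np.std(v)                       # standard deviation of the original signal
    e_l2 = np.linalg.norm(err) / sigma      # Euclidean norm of the error
    e_linf = np.max(np.abs(err)) / sigma    # largest pointwise error
    return e_l2, e_linf

t = np.linspace(0.0, 1.0, 250)              # L = 250 samples, hypothetical T = 1 s
v = np.sin(2 * np.pi * 3.0 * t)             # stand-in for a recorded signal
u = v + 0.05 * np.cos(2 * np.pi * 7.0 * t)  # stand-in for its reconstruction

small = standardised_errors(v, u)
large = standardised_errors(100 * v, 100 * u)   # same shapes, 100x the amplitude
raw_small = np.linalg.norm(u - v)
raw_large = np.linalg.norm(100 * u - 100 * v)
print(small, large, raw_small, raw_large)
```

The standardised errors are identical for the two amplitude levels, whereas the raw error grows with the amplitude, which is the imbalance that standardisation is meant to remove.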
A thorough investigation has been carried out to study how the number $P$ of latent variables and the parameter $\varphi$ ruling the time dependence of the loading affect the reconstruction capacity of the AE; conversely, no correlation between the reconstruction error and $\alpha$ has been found in our experiments. Indeed, the mean value and the spread of the reconstruction error cannot be modelled as a function of $\alpha$, but rather as a function of $\varphi$. Figures 3 and 4 depict the reconstruction error measured, respectively, by the standardised $L_2$ and $L_\infty$ norms, for input signals taken from the test set. The graphs for $P = 5$ (not reported for brevity) are analogous to those obtained for $P = 6$, although they show slightly higher values of the reconstruction error. Increasing $P$ does not lead to a monotonic enhancement of the AE reconstruction capacity, despite the intuition that a larger latent space should make reconstruction easier: even if a larger $P$ does not allow more information on the system to be retained, a more redundant representation would not be expected to be detrimental. A clear relation between the error and $\varphi$ can be observed. Looking at the standardised $L_2$ norm, the reconstruction capacity of the AE seems worse when $\varphi \approx f_1^{\mathrm{str}}$ and $\varphi \approx f_2^{\mathrm{str}}$. This result is not surprising: the beats produced in the displacement recordings, when $\varphi$ is close to the structural frequencies of the building, are additional signal characteristics that the AE struggles to account for. Focusing on the standardised $L_\infty$ norm, the reconstruction error is still large for $\varphi \approx f_2^{\mathrm{str}}$, while it gets smaller for $\varphi \approx f_1^{\mathrm{str}}$.
In Figure 5, a qualitative assessment of the reconstruction capacity of the AE is reported, to better highlight the meaning of the two error norms: the good signal reconstruction obtained for $\varphi \approx f_1^{\mathrm{str}}$ points toward the $L_\infty$ norm as a more appropriate error measure. On the other hand, we are convinced that both error measures give meaningful information, because the standardised $L_2$ norm addresses inaccuracies in reproducing the frequency content of the input signal, while the standardised $L_\infty$ norm highlights the inability to capture its peaks. Still referring to Figure 5, we observe that the amplitude of the signal in Figure 5a is an order of magnitude greater than that in Figure 5b, despite $\alpha = 702$ N in the first case and $\alpha = 4341$ N in the second. The reason is that we are exciting an undamped dynamic system with $\varphi$ closer to $f_1^{\mathrm{str}}$ in Figure 5a than to $f_2^{\mathrm{str}}$ in Figure 5b.
On the basis of the obtained latent representation $z$, we performed the regression of the parameters $\eta$ governing the loading conditions. As shown in Figure 6b, the regression of the load frequency $\varphi$ has been rather successfully accomplished: the graph has been obtained with the latent space dimension featuring the highest reconstruction capacity, namely $P = 4$. An analogous result has been obtained for the regression of the load amplitude $\alpha$, shown in Figure 6a, confirming that the proposed strategy, involving dimensionality reduction of the input and the use of a regression model, allows a correct load identification for the case at hand. It is also worth mentioning that the largest errors in the $\varphi$ prediction have been obtained for the frequency range featuring the highest reconstruction error in the $L_\infty$ norm.
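The amplitude comparison made for Figure 5 can be illustrated on a toy undamped two-storey shear building. The floor mass `m`, the storey stiffness `k` and the two load frequencies below are hypothetical, since the actual structural properties and the values of $\varphi$ used in Figure 5 are not given here; only the load shape $A_c = 0.5\, c\, \alpha$ and the two values of $\alpha$ come from the text. The sketch computes the steady-state (particular) response only, ignoring the transient part that produces the beats.

```python
import numpy as np

m, k = 1.0e3, 1.0e6                            # hypothetical floor mass [kg] and storey stiffness [N/m]
M = m * np.eye(2)
K = k * np.array([[2.0, -1.0], [-1.0, 1.0]])   # stiffness matrix of a two-storey shear building

# Structural frequencies in Hz; M is a multiple of the identity, so K / m is symmetric.
f_str = np.sqrt(np.linalg.eigvalsh(K / m)) / (2 * np.pi)

def steady_state_amplitude(alpha, phi):
    """Floor amplitudes of the undamped steady-state response to A_c."""
    F = 0.5 * alpha * np.array([1.0, 2.0])     # load amplitudes for c = 1, 2
    w = 2 * np.pi * phi                        # circular load frequency
    return np.abs(np.linalg.solve(K - w**2 * M, F))

amp_a = steady_state_amplitude(alpha=702.0, phi=3.0)    # phi close to f_str[0]
amp_b = steady_state_amplitude(alpha=4341.0, phi=7.0)   # phi farther from f_str[1]
print(f_str, amp_a.max(), amp_b.max())
```

With these assumed values, the smaller load produces the larger displacements because its frequency lies much closer to a structural frequency, which is the mechanism invoked above for Figure 5.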

Conclusions
The use of a time series autoencoder was proposed for the dimensionality reduction of sensor recordings, typically acquired for the SHM of civil structures. Thanks to the obtained latent representation, the regression of the parameters governing the loading conditions can be successfully carried out. Two error norms have been used to quantitatively assess the signal reconstruction capacity of the autoencoder, evaluated for different dimensions of the latent space. The capability of the autoencoder to reconstruct the input signals has also been assessed qualitatively, through comparison of the input and reconstructed signals in the less accurate cases.
In future work, we aim to understand the role of the latent space dimension in the autoencoder reconstruction capacity, and to investigate how to set it automatically and optimally.