Evaluation of Generative Modeling Techniques for Frequency Responses

During microwave design, it is of practical interest to gain insight into the statistical variability of a device's frequency response with respect to several sources of variation. Unfortunately, acquiring the frequency response can be particularly time-consuming or expensive, which makes uncertainty quantification infeasible for complex networks. Generative modeling techniques based on machine learning can reduce the computational load by learning the underlying stochastic process from a few instances of the device response and generating new ones through an inexpensive sampling strategy. In this way, an arbitrary number of frequency responses can be drawn from a probability distribution that resembles the original one. In this paper, Gaussian Process Latent Variable Models (GP-LVM) and Variational Autoencoders (VAE) are evaluated as modeling algorithms within a generative framework. The framework includes a Vector Fitting (VF) pre-processing step, which guarantees stability and reciprocity of the S-matrices by converting them into a suitable rational model. Both GP-LVM and VAE are tested on the S-parameter responses of two linear multi-port network examples.


Introduction
In recent years, efficient and accurate uncertainty quantification methods have become a critical resource for the design of modern RF and microwave circuits. Indeed, due to increasing integration and miniaturization, the performance of such circuits is largely affected by the tolerances of the manufacturing process. Typically, uncertainty quantification requires a large number of statistical samples (or instances), which are time-consuming and costly to obtain. Hence, several stochastic modeling techniques have been presented to overcome these limitations, for example based on the Polynomial Chaos (PC) expansion [1] or on Stochastic Reduced Order Models (SROM) [2].
Recently, novel techniques based on generative modeling have been presented in the literature [3,4]. The main advantage of such methods is that uncertainty quantification can be performed accurately and efficiently, independently of how many geometrical or electrical parameters of the system under study are subject to stochastic effects. Starting from a limited set of training data, generative models can efficiently generate a large set of frequency responses whose distribution closely matches that of the system under study. The training data is a small set of frequency responses, which can be obtained via simulations or measurements.
In this paper, we analyze a generative approach that can employ either the GP-LVM or the VAE as generative model. Unlike the technique in [4], which directly models the sampled frequency response via the VAE, we investigate a two-step approach, as proposed in [3]. First, a suitable rational model of the training data is obtained via the Vector Fitting (VF) algorithm [5]. Then, the generative model is trained to describe and reproduce the stochastic variations of the rational model's parameters. Note that the stability and reciprocity of the system under study are guaranteed by the VF characterization, while passivity is enforced by rejection. We analyze the capability of both the GP-LVM and the VAE to reproduce the data yielded by the rational representation, as well as complex relations between the frequency response and the design parameters.
The paper is organized as follows: the proposed approach and generative algorithms are presented in Section 2, while their validation is carried out in Section 3 by means of two pertinent numerical examples. Conclusions are drawn in Section 4.

Methodology
Our technique follows the workflow in Figure 1. The first step converts a few initial scattering parameter (S-parameter) instances, which constitute the training set, into rational form coefficients. This is done by means of the VF algorithm, as described in Section 2.1. Then, the generative modeling algorithm is trained to reproduce the probability distribution of the rational coefficients, given a prior distribution on the latent space. Next, new S-parameter instances are obtained by drawing samples from the model's latent space and evaluating the corresponding output of the rational model. New instances are drawn until a suitable number of passive responses is reached, with non-passive ones being rejected. In order to evaluate the accuracy of the proposed method, Section 2.4 presents a suitable similarity metric, which compares the distribution of a large set of instances drawn from the computed generative model with that of a set drawn from the system under study (also called validation set).
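The rejection step can be illustrated with a minimal sketch. It rests on several assumptions that are not stated in the text: the device is a two-port, passivity is checked only at the sampled frequencies by requiring the largest singular value of the S-matrix not to exceed one, and `toy_generate` is a hypothetical stand-in for "sample the latent space and evaluate the rational model".

```python
import math
import random


def max_singular_value_2x2(s11, s12, s21, s22):
    """Largest singular value of a 2x2 complex matrix, in closed form."""
    # The eigenvalues of S^H S have sum ||S||_F^2 and product |det S|^2.
    frob_sq = abs(s11) ** 2 + abs(s12) ** 2 + abs(s21) ** 2 + abs(s22) ** 2
    det_sq = abs(s11 * s22 - s12 * s21) ** 2
    disc = max(frob_sq ** 2 - 4.0 * det_sq, 0.0)  # guard against round-off
    return math.sqrt(0.5 * (frob_sq + math.sqrt(disc)))


def is_passive(response, tol=1e-9):
    """response: list of (s11, s12, s21, s22) tuples, one per frequency."""
    return all(max_singular_value_2x2(*s) <= 1.0 + tol for s in response)


def draw_passive(generate, n_passive, max_draws=100_000):
    """Call generate() until n_passive passive responses are collected."""
    accepted = []
    for _ in range(max_draws):
        candidate = generate()
        if is_passive(candidate):
            accepted.append(candidate)
            if len(accepted) == n_passive:
                return accepted
    raise RuntimeError("draw budget exhausted before enough passive responses")


# Hypothetical generator: a transmission-only two-port with a random gain,
# repeated at three frequency points. It is passive only when gain <= 1.
rng = random.Random(0)


def toy_generate():
    gain = rng.uniform(0.8, 1.2)
    return [(0j, gain + 0j, gain + 0j, 0j)] * 3


samples = draw_passive(toy_generate, n_passive=5)
```

A check on a finite frequency grid is only a necessary condition for the passivity of the underlying rational model, so a denser grid or a dedicated passivity test may be preferable in practice.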

Vector Fitting
Starting from a set of S-matrix frequency samples, the VF algorithm [5] computes a rational model in pole/residue form as
$$S(s) = \sum_{i=1}^{N} \frac{r_i}{s - a_i}$$
where $s$ is the complex frequency variable and $N$ is the order of the model, while $a_i$ and $r_i$ are the poles and residues, respectively. The set of poles and residues fully determines the frequency behaviour of each S-matrix element. Since the location of the poles fluctuates among different realizations of the frequency response, all instances are modeled with a common pole set $a_i$. As a result, each instance is represented only by a set of residues for each S-matrix element. Once a new set of residues is produced by the generative model, the S-parameters can be extracted from the corresponding rational form by evaluating it at the desired frequency points.
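As an illustration of the last step, the following sketch evaluates a pole/residue model of a single S-matrix element on a frequency grid. The numerical values of the poles and residues are hypothetical and chosen only to show that two instances share the poles and differ in their residues; the function name `eval_rational` is likewise an assumption.

```python
import math


def eval_rational(poles, residues, freqs_hz):
    """Evaluate sum_i r_i / (s - a_i) at s = j*2*pi*f for each frequency f."""
    response = []
    for f in freqs_hz:
        s = 1j * 2.0 * math.pi * f
        response.append(sum(r / (s - a) for a, r in zip(poles, residues)))
    return response


# Hypothetical common pole set: one stable complex-conjugate pair.
poles = [-1.0e9 + 6.0e9j, -1.0e9 - 6.0e9j]

# Two hypothetical instances that differ only in their residues.
residues_a = [2.0e8 + 1.0e8j, 2.0e8 - 1.0e8j]
residues_b = [2.2e8 + 0.9e8j, 2.2e8 - 0.9e8j]

freqs = [0.0, 0.5e9, 1.0e9]
s21_a = eval_rational(poles, residues_a, freqs)
s21_b = eval_rational(poles, residues_b, freqs)
```

Because the poles are fixed, the response is linear in the residues, which is what allows the generative model to operate on the residues alone.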

Gaussian Process Latent Variable Model
In this framework, the purpose of the generative model is to reproduce the distribution of the observed residue data $Y$, given a distribution of variables $X$ in a latent space of lower dimensionality. Thus, the objective is to extract the marginal probability $p(Y)$ from the joint probability of $Y$ and $X$:
$$p(Y) = \int p(Y|X)\, p(X)\, dX$$
where $p(Y|X)$ is the likelihood function, while $p(X)$ is the prior of the latent variables. Assuming that the underlying stochastic process of the S-parameters is Gaussian, a standard normal density can be assigned to the prior, $p(X) = \mathcal{N}(0, I)$. The GP-LVM [7] maps the latent space to the observed space using Gaussian Processes (GPs): each dimension $d$ of the data vectors in $Y$ is modeled by an independent GP, such that the likelihood function becomes
$$p(Y|X) = \prod_{d=1}^{D} \mathcal{N}(y_d \,|\, 0, \Sigma)$$
where $D$ is the number of data dimensions, $y_d$ is the vector of observations of the $d$-th residue in $Y$, and $\Sigma$ is the matrix of a specified kernel function evaluated on the latent points. For this application, the Automatic Relevance Determination (ARD) kernel is chosen, while the variational lower bound [7] is maximized to approximate the marginal likelihood $p(Y)$ and to optimize the GP hyperparameters accordingly. The trained GP-LVM predicts a new instance of residue data by evaluating the GPs on a sample drawn from the latent space prior.
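The text names the ARD kernel without giving its expression. A minimal sketch is given below, assuming the common squared-exponential ARD form $k(x, x') = \sigma_f^2 \exp(-\tfrac{1}{2}\sum_q w_q (x_q - x'_q)^2)$, in which each latent dimension $q$ has its own weight $w_q$; the function names and numerical values are hypothetical.

```python
import math


def ard_se_kernel(x1, x2, variance, weights):
    """Squared-exponential ARD kernel between two latent points."""
    sq_dist = sum(w * (a - b) ** 2 for a, b, w in zip(x1, x2, weights))
    return variance * math.exp(-0.5 * sq_dist)


def kernel_matrix(latent_points, variance, weights):
    """Kernel matrix evaluated on a list of latent points."""
    return [
        [ard_se_kernel(xi, xj, variance, weights) for xj in latent_points]
        for xi in latent_points
    ]


# Hypothetical 2-D latent points. The second weight is zero, so the second
# latent dimension has no influence on the kernel (it is "switched off").
latent = [[0.0, 0.0], [1.0, 0.0], [0.0, 5.0]]
sigma_matrix = kernel_matrix(latent, variance=2.0, weights=[1.0, 0.0])
```

During training, weights that shrink towards zero indicate latent dimensions that the data does not need, which is why this kernel is often paired with latent variable models.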

Variational Autoencoder
The Variational Autoencoder [8] provides a different method of modeling the marginal probability $p(Y)$, while assuming the same prior for the latent space variables: $p(X) = \mathcal{N}(0, I)$. Unlike the GP-LVM, the VAE maps the latent space to the observed space by simultaneously learning the posterior distribution $p(X|Y)$ and the likelihood $p(Y|X)$. This technique leverages Bayesian inference to approximate $p(X|Y)$ with a parameterized variational distribution $q_\phi(X|Y)$, where $\phi$ is a suitable parameter of the model. The posterior distribution $p(X|Y)$ can then be approximated by minimizing a dissimilarity metric between $p$ and $q$. For this purpose, the distance is usually defined as the Kullback-Leibler divergence $D_{KL}(q\|p)$:
$$D_{KL}\big(q_\phi(X|Y)\,\|\,p(X|Y)\big) = \mathbb{E}_{q_\phi(X|Y)}\left[\log \frac{q_\phi(X|Y)}{p(X|Y)}\right]$$
At the same time, the conditional likelihood $p_\theta(Y|X)$ can be parameterized over a suitable model parameter, indicated with $\theta$, and optimized by maximizing the marginal log-likelihood $\log p(Y)$. These two tasks can be combined into a single maximization of a variational lower bound $\mathcal{L}$:
$$\mathcal{L}(\theta, \phi) = \mathbb{E}_{q_\phi(X|Y)}\big[\log p_\theta(Y|X)\big] - D_{KL}\big(q_\phi(X|Y)\,\|\,p(X)\big)$$
where $\theta$, $\phi$ are the model parameters. The latent space mapping is modeled by a neural network in an encoder-decoder architecture (Figure 2). The encoder maps an input instance $y$ to the distribution $q(X|Y)$, defined by its mean ($\mu_x$) and standard deviation ($\sigma_x$), while the decoder converts a sample $x^*$ from the posterior $q(X|Y)$ into an instance of the observable space, reconstructing the initial input $y$. Adding an intermediate sampling operation allows one to produce $x^*$ by drawing a sample from a standard Gaussian distribution:
$$x^*_d = \mu_{x,d} + \sigma_{x,d}\,\epsilon_d, \qquad \epsilon_d \sim \mathcal{N}(0, 1)$$
for each latent space dimension $d$. This strategy, known as the "reparametrization trick", enables backpropagation, so that the network can be trained by maximizing the variational lower bound [8]. Indeed, the training pushes the encoder distribution towards the prior $p(X)$ by reducing the KL divergence. In this manner, the network can generate a new instance by directly drawing a sample from the Gaussian prior and evaluating the corresponding decoder output.
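The sampling step and the KL term can be sketched without any neural network library. The snippet below assumes a diagonal Gaussian encoder output, for which the KL divergence to the standard normal prior has the closed form $\tfrac{1}{2}\sum_d (\mu_d^2 + \sigma_d^2 - 1 - \log \sigma_d^2)$; the function names and the numerical values of the mean and standard deviation are hypothetical.

```python
import math
import random


def reparameterize(mu, sigma, rng):
    """Return x* = mu + sigma * eps with eps ~ N(0, 1), per latent dimension."""
    return [m + s * rng.gauss(0.0, 1.0) for m, s in zip(mu, sigma)]


def kl_to_standard_normal(mu, sigma):
    """Closed-form KL( N(mu, diag(sigma^2)) || N(0, I) )."""
    return 0.5 * sum(
        m * m + s * s - 1.0 - 2.0 * math.log(s) for m, s in zip(mu, sigma)
    )


# Hypothetical encoder output for one training instance (2-D latent space).
mu_x = [0.3, -1.2]
sigma_x = [0.5, 0.8]

rng = random.Random(42)
x_star = reparameterize(mu_x, sigma_x, rng)    # sample passed to the decoder
kl_term = kl_to_standard_normal(mu_x, sigma_x)  # penalty in the lower bound
```

Since the randomness enters only through the auxiliary noise, $x^*$ is a deterministic and differentiable function of the mean and standard deviation, which is what makes gradient-based training possible.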

Similarity Metric
The similarity between the distribution of the generated samples and the original one is estimated by the Cramér-von Mises (CM) statistic [9]. The CM distance is computed on two sets of S-parameter values at the same frequency, drawn respectively from the generated samples and from the validation set. Lower CM scores indicate higher similarity between the two distributions. Note that, in our problem setting, the CM score is computed separately for each frequency sample of the computed S-parameters.
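A minimal sketch of a per-frequency score is given below. It assumes the standard rank-based two-sample form of the statistic, real-valued inputs without ties, and that the score is applied to a scalar quantity such as the magnitude of an S-parameter; the text does not specify these details, and the numerical data are hypothetical.

```python
def cvm_two_sample(x, y):
    """Two-sample Cramer-von Mises statistic (rank form, no tie handling)."""
    n, m = len(x), len(y)
    # Tag each value with its sample of origin and sort the pooled data.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    u = 0.0
    seen = [0, 0]  # number of values met so far from each sample
    for rank, (_, origin) in enumerate(pooled, start=1):
        seen[origin] += 1
        weight = n if origin == 0 else m
        # (pooled rank) - (rank within own sample), squared
        u += weight * (rank - seen[origin]) ** 2
    return u / (n * m * (n + m)) - (4.0 * m * n - 1.0) / (6.0 * (m + n))


# Hypothetical |S21| values: one inner list per frequency point.
generated = [[0.91, 0.93, 0.95, 0.97], [0.40, 0.46, 0.52, 0.58]]
validation = [[0.92, 0.94, 0.96, 0.98], [0.60, 0.66, 0.72, 0.78]]

cm_scores = [cvm_two_sample(g, v) for g, v in zip(generated, validation)]
```

In this toy data the two samples interleave at the first frequency and are fully separated at the second, so the second score is the larger of the two.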

Results and Discussion
The proposed modeling framework is tested on two microstrip devices: a pair of coupled transmission lines [3] and a folded-stub notch filter [10]. Their responses are computed with the simulator ADS Momentum [11], using the settings in Table 1. Training and validation sets are obtained by varying several design parameters, whose values are chosen by sampling independent Gaussian distributions: each distribution has the same standard deviation and is centered on the nominal value of its parameter. The VF algorithm, as described in Section 2.1, converts the training instances into the corresponding rational form, which is then fed to the generative model. The order of the rational model, which corresponds to the number of poles and residues, is reported in Table 1. After training the GP-LVM and the VAE, as indicated in Sections 2.2 and 2.3, new S-matrix instances are generated until 1000 passive frequency responses are reached. The complete workflow is repeated 10 times on different training sets for a more accurate validation. Figure 3 illustrates the results for the $S_{21}$ parameter: 50 samples are drawn from the validation set and from the learnt distributions of the GP-LVM and the VAE, respectively. The generated samples closely match the original ones, demonstrating a comparable accuracy of the two generative models. Finally, Figure 4 reports the CM score for the generated $S_{21}$ parameter. In Example 1, the GP-LVM shows a lower score, i.e., better accuracy, across the whole frequency band. On the other hand, the VAE appears more accurate in Example 2. It is worth noting that the folded-stub filter presents a wide-band and highly variable frequency response, which causes higher CM scores for both generative models in Example 2. Since neither model is more accurate in both examples, both the GP-LVM and the VAE can be valuable when the framework is applied to new microwave devices.