posted on 2019-02-01, 10:18 authored by Kangrui Wang
The pursuit of the correlation structure of a high-dimensional random construct underlies my doctoral studies. This thesis reports on the development of methodologies for learning functional relationships between variables, given high-dimensional, discontinuous data that exhibit a non-stationary correlation structure. These methodologies tie in with the methods needed to undertake such difficult correlation learning, and with the intuitive graphical representation of the learnt correlation as networks. The developed methods are presented in an application-ready format, in which the relevant inference is typically undertaken with Markov Chain Monte Carlo methods. I have developed Bayesian methodologies for the supervised learning of the functional relationship between a system vector and a tensor-valued observable that affects the system vector, given real training data consisting of known pairs of values of these variables. This function is learnt probabilistically by modelling it with a high-dimensional Gaussian Process (GP), so that the likelihood is parametrised by multiple covariance matrices. I have built on the method of nesting GPs of different dimensionalities to render covariance kernels non-stationary, by treating each kernel hyper-parameter as a realisation from a scalar-valued GP. The inner layer of this learning strategy is then built of scalar-valued GPs, which are nested within a tensor-valued GP, and inference is done with Metropolis-within-Gibbs.
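The idea of drawing a kernel hyper-parameter from a scalar-valued GP can be pictured with a small numerical sketch. The code below is only an illustration under stated assumptions and is not the construction used in the thesis: it takes a one-dimensional input, draws a log length scale from a scalar GP with a squared-exponential kernel, and feeds it into the Gibbs non-stationary kernel, which is swapped in here as a familiar stand-in. All function names and prior settings are hypothetical, and the Metropolis-within-Gibbs inference step is omitted.

```python
import numpy as np

def se_kernel(x, y, amplitude, length_scale):
    """Stationary squared-exponential covariance between 1-D input arrays."""
    diff = x[:, None] - y[None, :]
    return amplitude**2 * np.exp(-0.5 * (diff / length_scale) ** 2)

def sample_length_scales(x, rng, mean_log=0.0, amplitude=0.5, length_scale=2.0):
    """Inner layer: draw log length scales from a scalar-valued GP.

    The prior mean, amplitude and length scale are assumed values
    chosen for illustration only.
    """
    # Small jitter keeps the covariance numerically positive definite.
    cov = se_kernel(x, x, amplitude, length_scale) + 1e-6 * np.eye(x.size)
    log_ell = rng.multivariate_normal(np.full(x.size, mean_log), cov)
    return np.exp(log_ell)  # exponentiate so every length scale is positive

def gibbs_kernel(x, ell):
    """Outer layer: non-stationary covariance with input-dependent length scales."""
    sq_sum = ell[:, None] ** 2 + ell[None, :] ** 2
    prefactor = np.sqrt(2.0 * np.outer(ell, ell) / sq_sum)
    diff = x[:, None] - x[None, :]
    return prefactor * np.exp(-diff**2 / sq_sum)

rng = np.random.default_rng(0)
x = np.linspace(0.0, 10.0, 50)
ell = sample_length_scales(x, rng)   # one realisation of the hyper-parameter
K = gibbs_kernel(x, ell)             # covariance matrix built from that realisation
```

Each new draw of `ell` yields a different covariance matrix `K`, which is the sense in which the nesting makes the outer kernel non-stationary in this toy setting.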
It is natural that this interest extends to learning the correlation structure of multivariate, rectangularly-shaped data, which is manifest in the sought graphical model of the data. I determine objective uncertainties in the learning of such a graphical model; these uncertainties allow me to quantify the correlation between a pair of such datasets by computing the distance between the (posterior probability densities of the) learnt graphical models of the respective datasets. Applications include the learning of the very large human disease-symptom network, and the computation of the distance between the vino-chemical graphical models of red and white Portuguese wines.
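A distance between two posterior densities can be illustrated as follows. The abstract does not name the distance that is used, so the Hellinger distance below is an assumption chosen for illustration, and the two input arrays are hypothetical posterior probabilities defined over the same finite set of candidate graphs.

```python
import numpy as np

def hellinger_distance(p, q):
    """Hellinger distance between two discrete distributions on a common support.

    Inputs are non-negative weights; they are normalised to sum to one.
    The result lies in [0, 1]: 0 for identical distributions, 1 for
    distributions with disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape:
        raise ValueError("p and q must be defined on the same support")
    p = p / p.sum()
    q = q / q.sum()
    bhattacharyya = np.sum(np.sqrt(p * q))
    # Clip at zero to guard against tiny negative values from round-off.
    return float(np.sqrt(max(0.0, 1.0 - bhattacharyya)))

# Hypothetical posterior weights over four candidate graphs for two datasets.
posterior_a = [0.70, 0.20, 0.05, 0.05]
posterior_b = [0.10, 0.20, 0.30, 0.40]
distance = hellinger_distance(posterior_a, posterior_b)
```

A bounded, symmetric measure of this kind makes distances between different pairs of datasets directly comparable, which is one reason it is a convenient choice for a sketch.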