Developing Machine Learning-Based Algorithms: Classification and Regression
The curse of dimensionality causes well-known and widely discussed problems for machine learning methods. It has been hypothesised that using the Manhattan distance, or even fractional lp quasinorms (p < 1), can help to overcome this curse in classification problems. In this thesis (chapter 3), this hypothesis is tested systematically, and we demonstrate that fractional norms and quasinorms do not help to overcome the curse of dimensionality.

A second strand of the thesis investigates a series of linear regression models based on different loss functions, in order to analyse the robustness of the coefficients across all models under consideration. This led us to propose a new, robust Piecewise Quadratic of Subquadratic growth (PQSQ) regression model (chapter 4). The method combines the advantages of the PQSQ-L1 and PQSQ-L2 loss functions, yielding the PQSQ-Huber method.

The thesis also investigates linear regression models in the presence of multicollinearity and outliers in a dataset. Under multicollinearity, the Ordinary Least Squares (OLS) estimator is unstable and displays a large variance of coefficients, or its solution may not even exist. Several regularisation methods, including ridge regression (RR), reduce the variance of the OLS coefficients at the cost of introducing some bias. However, ridge regression is itself based on the minimisation of a quadratic loss function, which is sensitive to outliers. We therefore propose novel robust ridge regression estimators based on the PQSQ function (chapter 5).
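As a rough illustration of the distance-concentration effect studied in chapter 3 (not the thesis's experimental protocol), the sketch below compares the relative contrast between the farthest and nearest neighbour under lp distances for p = 0.5, 1 and 2; the function lp_distance, the uniform random data, and the chosen dimension are all illustrative assumptions.

```python
import numpy as np

def lp_distance(x, y, p):
    """Distance induced by the l_p (quasi)norm; a quasinorm for 0 < p < 1."""
    return np.sum(np.abs(x - y) ** p) ** (1.0 / p)

rng = np.random.default_rng(0)
dim = 1000                       # illustrative high dimension (assumption)
points = rng.random((200, dim))  # 200 uniform random points
query = rng.random(dim)

for p in (0.5, 1.0, 2.0):
    d = np.array([lp_distance(query, pt, p) for pt in points])
    # Relative contrast (d_max - d_min) / d_min; its vanishing with growing
    # dimension is one standard symptom of the curse of dimensionality.
    print(f"p={p}: relative contrast = {(d.max() - d.min()) / d.min():.4f}")
```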
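The PQSQ-Huber method blends L2-like behaviour for small residuals with L1-like behaviour for large ones, in the spirit of the classical Huber loss. As a reference point only, here is the standard Huber loss rather than the thesis's piecewise-quadratic approximation of it; the threshold `delta` is an illustrative parameter, not a value taken from the thesis.

```python
import numpy as np

def huber_loss(residuals, delta=1.0):
    """Huber loss: quadratic (L2-like) near zero, linear (L1-like) in the tails."""
    r = np.abs(residuals)
    return np.where(r <= delta, 0.5 * residuals**2, delta * (r - 0.5 * delta))
```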
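For context on the baseline that the robust estimators of chapter 5 build on, the following is a minimal sketch of the standard closed-form ridge regression estimator, not the proposed PQSQ-based robust estimator; the regularisation parameter `lam` is assumed given.

```python
import numpy as np

def ridge_coefficients(X, y, lam):
    """Standard RR estimator: beta = (X^T X + lam * I)^{-1} X^T y."""
    p = X.shape[1]
    return np.linalg.solve(X.T @ X + lam * np.eye(p), X.T @ y)
```

Shrinking the coefficients this way stabilises the solution under multicollinearity, but the quadratic loss it minimises remains sensitive to outliers, which is the gap the PQSQ-based estimators address.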
History
Supervisor(s)
- Alexander Gorban; Evgeny Mirkes
Date of award
- 2023-03-06
Author affiliation
- School of Computing and Mathematical Sciences
Awarding institution
- University of Leicester
Qualification level
- Doctoral
Qualification name
- PhD