University of Leicester

Linear Dimensionality Reduction: What Is Better?

journal contribution
posted on 2025-07-28, 14:12 authored by Mohit Baliyan, Evgeny Mirkes
This research paper focuses on dimensionality reduction, which is a major subproblem in any data processing operation. Dimensionality reduction based on principal components is the most widely used approach. Our paper examines three heuristics, namely Kaiser’s rule, the broken stick, and the conditional number rule, for selecting informative principal components when using principal component analysis to reduce high-dimensional data to lower dimensions. This study uses 22 classification datasets and three classifiers, namely Fisher’s discriminant classifier, logistic regression, and k-nearest neighbors, to test the effectiveness of the three heuristics. The results show that there is no universal answer to the best intrinsic dimension, but the conditional number heuristic performs better on average. This means that the conditional number heuristic is the best candidate for automatic data pre-processing.
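
The abstract names the three heuristics but does not define them. The sketch below uses the commonly cited textbook formulations, which may differ in detail from the paper's own code: Kaiser's rule keeps components whose eigenvalue exceeds the mean eigenvalue, the broken stick keeps leading components whose variance share exceeds the expected length of the corresponding piece of a randomly broken stick, and the conditional number rule keeps components whose eigenvalue is at least the largest eigenvalue divided by a bound. The function names, the use of the correlation matrix, and the default bound of 10 are assumptions made for illustration, not details taken from this record.

```python
import numpy as np


def correlation_eigenvalues(X):
    """Eigenvalues of the correlation matrix of X (rows = samples), descending.

    Assumes no column of X is constant; a constant column makes the
    correlation undefined.
    """
    corr = np.corrcoef(X, rowvar=False)
    eigvals = np.linalg.eigvalsh(corr)[::-1]  # eigvalsh returns ascending order
    return np.clip(eigvals, 0.0, None)        # guard against tiny negative values


def kaiser_rule(eigvals):
    """Count components whose eigenvalue exceeds the mean eigenvalue."""
    eigvals = np.asarray(eigvals, dtype=float)
    return int(np.sum(eigvals > eigvals.mean()))


def broken_stick(eigvals):
    """Count leading components whose variance share beats the broken-stick share."""
    eigvals = np.asarray(eigvals, dtype=float)
    p = len(eigvals)
    fractions = eigvals / eigvals.sum()
    # Expected share of the i-th longest piece: (1/p) * sum_{j=i}^{p} 1/j
    expected = np.array(
        [np.sum(1.0 / np.arange(i, p + 1)) / p for i in range(1, p + 1)]
    )
    failing = np.nonzero(fractions <= expected)[0]
    return int(failing[0]) if failing.size else p


def conditional_number_rule(eigvals, max_condition=10.0):
    """Count components with eigenvalue >= largest eigenvalue / max_condition.

    max_condition=10.0 is an assumed default, not a value stated in the abstract.
    """
    eigvals = np.asarray(eigvals, dtype=float)
    return int(np.sum(eigvals >= eigvals[0] / max_condition))


if __name__ == "__main__":
    # Synthetic data only: 200 samples, 8 features with a few correlated columns.
    rng = np.random.default_rng(0)
    latent = rng.normal(size=(200, 3))
    X = latent @ rng.normal(size=(3, 8)) + 0.5 * rng.normal(size=(200, 8))
    ev = correlation_eigenvalues(X)
    print("Kaiser:", kaiser_rule(ev))
    print("Broken stick:", broken_stick(ev))
    print("Conditional number:", conditional_number_rule(ev))
```

Each function returns a number of components to retain, so the three estimates can be compared directly on the same eigenvalue spectrum before any classifier is trained on the reduced data.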

History

Author affiliation

College of Science & Engineering, Computing & Mathematical Sciences

Version

  • VoR (Version of Record)

Published in

Data

Volume

10

Issue

5

Pagination

70 - 70

Publisher

MDPI AG

eissn

2306-5729

Copyright date

2025

Available date

2025-07-28

Language

en

Deposited by

Dr Evgeny Mirkes

Deposit date

2025-06-20

Data Access Statement

All datasets used in this study can be found in the UCI data repository. References are presented in Table 1. The code and copy of the datasets can be found in [90].
