Correction of AI systems by linear discriminants: Probabilistic foundations

Grechuk, B; Gorban, A; Golubkov, A; Mirkes, E; Tyukin, I

correctionAI.pdf (633.63 kB)

journal contribution

posted on 2019-05-21, 09:55 authored by B Grechuk, A Gorban, A Golubkov, E Mirkes, I Tyukin

Artificial Intelligence (AI) systems sometimes make errors and will continue to make them from time to time. These errors are usually unexpected and can have dramatic consequences. The intensive development of AI and its practical applications makes the problem of errors more important. Total re-engineering of a system can create new errors and is not always possible because of the resources involved. The important challenge is to develop fast methods that correct errors without damaging existing skills. We formulate the technical requirements for ‘ideal’ correctors. Such correctors include binary classifiers that separate situations with a high risk of error from situations in which the AI system works properly. Surprisingly, for essentially high-dimensional data such methods are possible: a simple linear Fisher discriminant can separate the situations with errors from correctly solved tasks even for exponentially large samples. The paper presents the probabilistic basis for fast non-destructive correction of AI systems. A series of new stochastic separation theorems is proven. These theorems provide new instruments for fast non-iterative correction of errors of legacy AI systems. The new approaches become efficient in high dimensions, for correction of high-dimensional systems in a high-dimensional world (i.e. for processing of essentially high-dimensional data by large systems). We prove that this separability property holds for a wide class of distributions, including log-concave distributions and distributions with a special ‘SMeared Absolute Continuity’ (SmAC) property defined through relations between the volume and probability of sets of vanishing volume. These classes are much wider than the Gaussian distributions. The requirement of independence and identical distribution of data is significantly relaxed. The results are supported by computational analysis of empirical data sets.
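The separability claim in the abstract can be illustrated numerically. The following is a minimal sketch and not the authors' code. It assumes a Fisher-separability criterion of the form (x, y) < alpha * (x, x) on centred, whitened data, uses synthetic Gaussian samples, and picks alpha = 0.8 arbitrarily; the function names `whiten` and `fisher_separable_fraction` are hypothetical.

```python
import numpy as np


def whiten(X):
    """Centre the sample and rescale it to identity empirical covariance."""
    Xc = X - X.mean(axis=0)
    cov = np.cov(Xc, rowvar=False)
    eigval, eigvec = np.linalg.eigh(cov)
    # Rotate onto the principal axes, then scale each axis to unit variance.
    return Xc @ eigvec / np.sqrt(eigval)


def fisher_separable_fraction(X, alpha=0.8):
    """Fraction of points x with (x, y) < alpha * (x, x) for every other y.

    The criterion and the value of alpha are assumptions made for this
    illustration, not taken from the record above.
    """
    Z = whiten(X)
    gram = Z @ Z.T                        # all pairwise inner products
    thresholds = alpha * np.diag(gram)    # alpha * (x, x) for each point
    np.fill_diagonal(gram, -np.inf)       # ignore each point's product with itself
    return float(np.mean(gram.max(axis=1) < thresholds))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n_points = 2000
    for dim in (2, 20, 200):
        sample = rng.standard_normal((n_points, dim))
        frac = fisher_separable_fraction(sample)
        print(f"dim={dim:4d}  separable fraction={frac:.3f}")
```

Under these assumptions, the fraction of points that a single linear functional separates from the rest of the sample should grow with the dimension. This is the effect a one-shot linear corrector would rely on: a single error can be isolated without retraining the underlying system.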

Funding

ANG and IYT were supported by Innovate UK (KTP009890 and KTP010522) and the Ministry of Education and Science of the Russian Federation (Project 14.Y26.31.0022). BG thanks the University of Leicester for granting him academic study leave to do this research. The proposed corrector methodology was implemented and successfully tested on video stream data and security tasks in collaboration with industrial partners Apical, ARM, and VMS, with the support of Innovate UK. We are grateful to them and personally to I. Romanenko, R. Burton, and K. Sofeikov.

History

Citation

Information Sciences, 2018, 466, pp. 303-322

Author affiliation

/Organisation/COLLEGE OF SCIENCE AND ENGINEERING/Department of Mathematics

Version

AM (Accepted Manuscript)

Published in

Information Sciences

Publisher

Elsevier

issn

0020-0255

Acceptance date

2018-07-22

Copyright date

2019

Available date

2019-07-25

Publisher DOI

https://doi.org/10.1016/j.ins.2018.07.040

Publisher version

https://www.sciencedirect.com/science/article/pii/S0020025518305607?via=ihub

Notes

The file associated with this record is under embargo until 12 months after publication, in accordance with the publisher's self-archiving policy. The full text may be available through the publisher links provided above.

Language

en

Keywords

Big data; Non-iterative learning; Error correction; Measure concentration; Blessing of dimensionality; Linear discriminant

Licence

CC BY-NC-ND 4.0
