posted on 2019-10-17, 08:52 authored by P Jones, E Mirkes, T Yates, C Edwardson, M Catt, M Davies, K Khunti, A Rowlands
Few methods for classifying physical activity from accelerometer data have been tested
using an independent dataset for cross-validation, and even fewer using multiple independent
datasets. The aim of this study was to evaluate whether unsupervised machine learning was a viable
approach for the development of a reusable clustering model that was generalisable to independent
datasets. We used two labelled adult laboratory datasets to generate a k-means clustering model.
To assess its generalised application, we applied the stored clustering model to three independent
labelled datasets: two laboratory and one free-living. Based on the development labelled data, the
ten clusters were collapsed into four activity categories: sedentary, standing/mixed/slow ambulatory,
brisk ambulatory, and running. The percentages of each activity type contained in these categories
were 89%, 83%, 78%, and 96%, respectively. In the laboratory independent datasets, the consistency
of activity types within the clusters dropped, but remained above 70% for the sedentary clusters
and above 85% for the running and ambulatory clusters. Acceleration features were similar within each
cluster across samples. The clusters created reflected activity types known to be associated with
health and were reasonably robust when applied to diverse independent datasets. This suggests that
an unsupervised approach is potentially useful for analysing free-living accelerometer data.
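The workflow described above (apply a stored clustering model to new data, collapse clusters into activity categories, then check how consistent the labels are within each category) can be illustrated with a minimal sketch. It is not the authors' implementation: the centroids, feature values, and labels below are made-up toy data, the majority-vote rule for mapping clusters to categories is an assumption, and "purity" is assumed here to mean the percentage of samples in a category whose label matches that category.

```python
import math
from collections import Counter, defaultdict


def assign_cluster(features, centroids):
    """Return the index of the nearest stored centroid (Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: math.dist(features, centroids[i]))


def majority_labels(cluster_ids, labels):
    """Map each cluster to its most common label (assumed collapsing rule)."""
    counts = defaultdict(Counter)
    for cluster, label in zip(cluster_ids, labels):
        counts[cluster][label] += 1
    return {c: cnt.most_common(1)[0][0] for c, cnt in counts.items()}


def category_purity(cluster_ids, labels, cluster_to_category):
    """Percentage of samples per category whose label equals the category."""
    total, match = Counter(), Counter()
    for cluster, label in zip(cluster_ids, labels):
        category = cluster_to_category[cluster]
        total[category] += 1
        match[category] += int(label == category)
    return {cat: 100.0 * match[cat] / total[cat] for cat in total}


# Hypothetical stored model: three centroids over two made-up features.
centroids = [(0.0, 0.1), (1.0, 0.5), (4.0, 2.0)]

# Toy "development" data: used only to decide what each cluster represents.
dev_X = [(0.0, 0.1), (0.1, 0.2), (0.2, 0.1), (1.0, 0.4),
         (1.1, 0.6), (0.9, 0.5), (4.0, 2.1), (3.9, 1.9)]
dev_y = ["sedentary", "sedentary", "walking", "walking",
         "walking", "sedentary", "running", "running"]
dev_clusters = [assign_cluster(x, centroids) for x in dev_X]
mapping = majority_labels(dev_clusters, dev_y)

# Toy "independent" data: the stored centroids and mapping are reused as-is.
new_X = [(0.1, 0.1), (1.2, 0.5), (3.5, 2.0), (0.3, 0.2)]
new_y = ["sedentary", "walking", "running", "walking"]
new_clusters = [assign_cluster(x, centroids) for x in new_X]
new_purity = category_purity(new_clusters, new_y, mapping)

print(mapping)
print(new_purity)
```

The point of the design is that nothing is refitted on the independent data: both the centroids and the cluster-to-category mapping come from the development data, so any drop in purity reflects how well the stored model generalises.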
Funding
The data collection of dataset 1 was funded by a research grant awarded by Unilever Discover to the
School of Sport and Health Sciences, University of Exeter. This research was supported by the National Institute
for Health Research (NIHR) Leicester Biomedical Research Centre, and the NIHR Collaboration for Leadership
in Applied Health Research and Care–East Midlands. The views expressed are those of the authors and not
necessarily those of the NHS, the NIHR, or the Department of Health.
Citation
Sensors, 2019, 19, 4504
Author affiliation
College of Life Sciences, School of Medicine, Diabetes Research Centre
The following are available online at http://www.mdpi.com/1424-8220/19/20/4504/s1,
Table S1: Summary of Time Domain Features Utilised in Previous Studies, Table S2: Summary of Frequency
Domain Features Utilised in Previous Studies, Table S3: Average cluster purity and event purity.