University of Leicester

Data Temperature Informed Streaming for Optimising Large-Scale Multi-Tiered Storage

journal contribution
posted on 2024-05-24, 10:40 authored by Dominic Davies-Tagg, Ashiq Anjum, Ali Zahir, Lu Liu, Muhammad Usman Yaseen, Nick Antonopoulos
Data temperature is a response to the ever-growing amount of data. These data have to be stored, but it has been observed that only a small portion of the data are accessed frequently at any one time. This leads to the concept of hot and cold data. Cold data can be migrated away from high-performance nodes to free up performance for higher-priority data. Existing studies classify hot and cold data primarily on the basis of data age and usage frequency. We present this as a limitation in the current implementation of data temperature, because age automatically assumes that all new data have priority and usage is purely reactive. We propose new variables and conditions that enable smarter decisions about which data are hot or cold and allow greater user control over data location and movement. We identify new metadata variables and user-defined variables to extend the current data temperature value. We further establish rules and conditions for limiting unnecessary movement of the data, which helps to prevent wasted input/output (I/O) costs. We also propose a hybrid algorithm that combines existing variables and new variables and conditions into a single data temperature. The proposed system provides higher accuracy, increases performance, and gives greater user control for optimal positioning of data within multi-tiered storage solutions.
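
The idea of folding several variables into a single data temperature, and of limiting unnecessary movement between tiers, can be illustrated with a minimal sketch. This is not the algorithm from the paper: the field names, the weights, the normalisation, and the two thresholds are all hypothetical choices made only to show the general shape of such a scheme.

```python
from dataclasses import dataclass


@dataclass
class FileStats:
    # All fields are illustrative assumptions, not the paper's variables.
    age_days: float          # time since the data were created
    accesses_per_day: float  # recent usage frequency
    user_priority: float     # user-defined weight in [0, 1]
    current_tier: str        # "hot" or "cold"


def temperature(s: FileStats, w_age=0.3, w_use=0.4, w_user=0.3) -> float:
    """Combine age, usage and a user-defined value into one score in [0, 1].

    The weights are hypothetical; they only need to sum to 1.
    """
    recency = 1.0 / (1.0 + s.age_days)                      # newer -> closer to 1
    usage = s.accesses_per_day / (1.0 + s.accesses_per_day)  # busier -> closer to 1
    return w_age * recency + w_use * usage + w_user * s.user_priority


def target_tier(s: FileStats, promote_at=0.6, demote_at=0.4) -> str:
    """Pick a tier, using a gap between thresholds to avoid needless moves.

    Data whose score falls between demote_at and promote_at stay where
    they are, so small fluctuations do not trigger I/O-costly migrations.
    """
    t = temperature(s)
    if s.current_tier == "cold" and t >= promote_at:
        return "hot"
    if s.current_tier == "hot" and t <= demote_at:
        return "cold"
    return s.current_tier


if __name__ == "__main__":
    new_unused = FileStats(0, 0, 0.0, "cold")
    old_busy = FileStats(365, 9, 1.0, "cold")
    print(target_tier(new_unused), target_tier(old_busy))
```

In this sketch a brand-new but unused item does not become hot on age alone, while an old item that is heavily used and flagged by the user does; the gap between the two thresholds stands in for the kind of rule that prevents data from bouncing between tiers.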

Author affiliation

College of Science & Engineering Comp' & Math' Sciences

Version

  • VoR (Version of Record)

Published in

Big Data Mining and Analytics

Volume

7

Issue

2

Pagination

371 - 398

Publisher

Tsinghua University Press

ISSN

2096-0654

eISSN

2097-406X

Copyright date

2024

Available date

2024-05-24

Language

en

Deposited by

Professor Ashiq Anjum

Deposit date

2024-05-23
