University of Leicester
Browse

CPT+: Decreasing the time/space complexity of the Compact Prediction Tree

Download (800.74 kB)
conference contribution
posted on 2015-05-07, 10:25 authored by T. Gueniche, P. Fournier-Viger, Rajeev Raman, V. Tseng
Predicting next items of sequences of symbols has many applications in a wide range of domains. Several sequence prediction models have been proposed such as DG, All-k-order markov and PPM. Recently, a model named Compact Prediction Tree (CPT) has been proposed. It relies on a tree structure and a more complex prediction algorithm to offer considerably more accurate predictions than many state-of-the-art prediction models. However, an important limitation of CPT is its high time and space complexity. In this article, we address this issue by proposing three novel strategies to reduce CPT’s size and prediction time, and increase its accuracy. Experimental results on seven real life datasets show that the resulting model (CPT+) is up to 98 times more compact and 4.5 times faster than CPT, and has the best overall accuracy when compared to six state-of-the-art models from the literature: All-K-order Markov, CPT, DG, Lz78, PPM and TDAG.

Funding

Supported by a NSERC grant from the Government of Canada.

History

Citation

Lecture Notes in Computer Science Volume 9078, 2015, pp 625-636

Author affiliation

/Organisation/COLLEGE OF SCIENCE AND ENGINEERING/Department of Computer Science

Source

The 19th Pacific-Asia Conference on Knowledge Discovery and Data Mining, Ho Chi Minh City

Version

  • AM (Accepted Manuscript)

Published in

Lecture Notes in Computer Science Volume 9078

Publisher

Springer

issn

0302-9743

isbn

978-3-319-18031-1;978-3-319-18032-8

Copyright date

1007

Available date

2016-05-09

Publisher version

http://link.springer.com/chapter/10.1007/978-3-319-18032-8_49

Notes

Accepted as "REGULAR PAPER WITH LONG PRESENTATION" 27 of 405 submisisons accepted in this category.

Temporal coverage: start date

2015-05-19

Temporal coverage: end date

2015-05-22

Language

en

Usage metrics

    University of Leicester Publications

    Categories

    No categories selected

    Exports

    RefWorks
    BibTeX
    Ref. manager
    Endnote
    DataCite
    NLM
    DC