Efficient Learning on High-dimensional Operational Data
RISE - Research Institutes of Sweden (2017-2019). KTH Royal Institute of Technology, Sweden.
KTH Royal Institute of Technology, Sweden.
RISE - Research Institutes of Sweden (2017-2019), ICT, SICS. KTH Royal Institute of Technology, Sweden. ORCID iD: 0000-0001-6039-8493
2019 (English). In: 15th International Conference on Network and Service Management, CNSM 2019, Institute of Electrical and Electronics Engineers Inc., 2019. Conference paper, Published paper (Refereed)
Abstract [en]

In networked systems engineering, operational data gathered from sensors or logs can be used to build data-driven functions for performance prediction, anomaly detection, and other operational tasks. The number of data sources used for this purpose determines the dimensionality of the feature space for learning and can reach millions for medium-sized systems. Learning on a high-dimensional space generally incurs high communication and computational costs. In this work, we apply and compare a range of methods, including feature selection, Principal Component Analysis (PCA), and autoencoders, with the objective of reducing the dimensionality of the feature space while maintaining prediction accuracy relative to learning on the full space. We conduct the study using traces gathered from a testbed at KTH that runs a video-on-demand service and a key-value store under dynamic load. Our results suggest that the dimensionality of the feature space of operational data can be reduced significantly, by one to two orders of magnitude in our scenarios, while maintaining prediction accuracy. The findings confirm the Manifold Hypothesis in machine learning, which states that real-world data sets tend to occupy a small subspace of the full feature space. In addition, we investigate the tradeoff between prediction accuracy and prediction overhead, which is crucial for applying the results to operational systems.
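
As a minimal illustration of one of the method families named in the abstract, the following Python sketch performs univariate feature selection by ranking features on their absolute Pearson correlation with the prediction target. It is not the paper's method or data: the synthetic data set, the function names (`pearson`, `select_top_k`, `make_synthetic`), and the choice of correlation as the ranking criterion are all assumptions made for illustration.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / math.sqrt(vx * vy)


def select_top_k(rows, target, k):
    """Return the sorted indices of the k features most correlated with the target."""
    n_features = len(rows[0])
    scores = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        scores.append((abs(pearson(column, target)), j))
    scores.sort(reverse=True)  # highest absolute correlation first
    return sorted(j for _, j in scores[:k])


def make_synthetic(n_samples=200, n_features=50, informative=(3, 17), seed=0):
    """Hypothetical data: the target depends on two features; the rest are noise."""
    rng = random.Random(seed)
    rows = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_samples)]
    a, b = informative
    target = [2.0 * r[a] - 3.0 * r[b] + rng.gauss(0, 0.1) for r in rows]
    return rows, target


rows, target = make_synthetic()
selected = select_top_k(rows, target, k=2)
# Project every sample onto the selected features only.
reduced = [[row[j] for j in selected] for row in rows]
print(selected, len(reduced[0]))
```

In this toy setting the 50-dimensional feature space is reduced to two columns. On real operational data, the number of retained features `k` would have to be tuned against the resulting prediction accuracy, which is the tradeoff the abstract describes.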

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019.
Keywords [en]
Data-driven engineering, Dimensionality reduction, Machine learning, ML, Anomaly detection, Dynamic loads, Forecasting, Principal component analysis, Video on demand, Computational costs, Data driven, High dimensionality, Operational systems, Performance prediction, Prediction accuracy, Principle component analysis, Video on demand services, Learning systems
National Category
Natural Sciences
Identifiers
URN: urn:nbn:se:ri:diva-44696
DOI: 10.23919/CNSM46954.2019.9012741
Scopus ID: 2-s2.0-85081966035
ISBN: 9783903176249 (print)
OAI: oai:DiVA.org:ri-44696
DiVA, id: diva2:1417788
Conference
15th International Conference on Network and Service Management, CNSM 2019, 21 October 2019 through 25 October 2019
Note

Funding details: VINNOVA. Funding text: The authors are grateful to Erik Ylipaa with RISE AI, as well as to Andreas Johnsson and Christofer Flinta with Ericsson Research, for fruitful discussions around this work. This research has been partially supported by the Swedish Governmental Agency for Innovation Systems, VINNOVA, through project AutoDC and by the KTH Software Research Center CASTOR.

Available from: 2020-03-30. Created: 2020-03-30. Last updated: 2020-03-30. Bibliographically approved.

Authority records BETA

Stadler, Rolf

Search in DiVA

By author/editor
Stadler, Rolf
By organisation
RISE - Research Institutes of Sweden (2017-2019)SICS
Natural Sciences

Search outside of DiVA

GoogleGoogle Scholar

doi
isbn
urn-nbn

Altmetric score

doi
isbn
urn-nbn
Total: 2 hits
CiteExportLink to record
Permanent link

Direct link
Cite
Citation style
  • apa
  • ieee
  • modern-language-association-8th-edition
  • vancouver
  • Other style
More styles
Language
  • de-DE
  • en-GB
  • en-US
  • fi-FI
  • nn-NO
  • nn-NB
  • sv-SE
  • Other locale
More languages
Output format
  • html
  • text
  • asciidoc
  • rtf
v. 2.35.10