Publications (10 of 20)
Shahab, F., Stadler, R., Johnsson, A. & Flinta, C. (2019). Demonstration: Predicting distributions of service metrics. In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019. Paper presented at 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, 8 April 2019 through 12 April 2019 (pp. 745-746). Institute of Electrical and Electronics Engineers Inc.
2019 (English). In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 745-746. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to predict conditional distributions of service metrics is key to understanding end-to-end service behavior. From conditional distributions, other metrics can be derived, such as expected values and quantiles, which are essential for assessing SLA conformance. Our demonstrator predicts conditional distributions and estimates derived metrics in real time, using infrastructure measurements. The distributions are modeled as Gaussian mixtures whose parameters are estimated using a mixture density network. The predictions are produced for a Video-on-Demand service that runs on a testbed at KTH.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019
Keywords
Machine Learning, Service Engineering, Service Management, Forecasting, Learning systems, Video on demand, Conditional distribution, End-to-end service, Expected values, Gaussian mixtures, Mixture density, Video on demand services, Telecommunication services
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39271 (URN); 2-s2.0-85067047473 (Scopus ID); 9783903176157 (ISBN)
Conference
2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, 8 April 2019 through 12 April 2019
Available from: 2019-07-03 Created: 2019-07-03 Last updated: 2019-07-03 Bibliographically approved
Moradi, F., Stadler, R. & Johnsson, A. (2019). Performance prediction in dynamic clouds using transfer learning. In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019. Paper presented at 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, 8 April 2019 through 12 April 2019 (pp. 242-250). Institute of Electrical and Electronics Engineers Inc.
2019 (English). In: 2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, Institute of Electrical and Electronics Engineers Inc., 2019, p. 242-250. Conference paper, Published paper (Refereed)
Abstract [en]

Learning a performance model for a cloud service is challenging since its operational environment changes during execution, which requires re-training of the model in order to maintain prediction accuracy. Training a new model from scratch generally involves extensive new measurements and often generates a data-collection overhead that negatively affects the service performance. In this paper, we investigate an approach for re-training neural-network models, which is based on transfer learning. Under this approach, a limited number of neural-network layers are re-trained while others remain unchanged. We study the accuracy of the re-trained model and the efficiency of the method with respect to the number of re-trained layers and the number of new measurements. The evaluation is performed using traces collected from a testbed that runs a Video-on-Demand service and a Key-Value Store under various load conditions. We study model re-training after changes in load pattern, infrastructure configuration, service configuration, and target metric. We find that our method significantly reduces the number of new measurements required to compute a new model after a change. The reduction exceeds an order of magnitude in most cases.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2019
Keywords
Machine Learning, Neural Networks, Performance Prediction, Service Management, Transfer Learning, Forecasting, Learning systems, Video on demand, Neural network model, Operational environments, Prediction accuracy, Service configuration, Video on demand services, Network layers
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39272 (URN); 2-s2.0-85067071723 (Scopus ID); 9783903176157 (ISBN)
Conference
2019 IFIP/IEEE Symposium on Integrated Network and Service Management, IM 2019, 8 April 2019 through 12 April 2019
Note

Funding details: VINNOVA; Funding text 1: The authors are grateful to Jawwad Ahmed and Christofer Flinta, both at Ericsson Research, and Forough Shahab Samani at KTH for fruitful discussions around this work. This research has been partially supported by the Swedish Governmental Agency for Innovation Systems, VINNOVA, through projects Celtic SENDATE EXTEND and ITEA3 AutoDC.

Available from: 2019-07-03 Created: 2019-07-03 Last updated: 2019-07-03 Bibliographically approved
Yanggratoke, R., Ahmed, J., Ardelius, J., Flinta, C., Johnsson, A., Gillblad, D. & Stadler, R. (2018). A service-agnostic method for predicting service metrics in real time. International Journal of Network Management, 28(2), Article ID e1991.
2018 (English). In: International Journal of Network Management, ISSN 1055-7148, E-ISSN 1099-1190, Vol. 28, no 2, article id e1991. Article in journal (Refereed) Published
Abstract [en]

We predict performance metrics of cloud services using statistical learning, whereby the behaviour of a system is learned from observations. Specifically, we collect device and network statistics from a cloud testbed and apply regression methods to predict, in real-time, client-side service metrics for video streaming and key-value store services. Results from intensive evaluation on our testbed indicate that our method accurately predicts service metrics in real time (mean absolute error below 16% for video frame rate and read latency, for instance). Further, our method is service agnostic in the sense that it takes as input operating systems and network statistics instead of service-specific metrics. We show that feature set reduction significantly improves the prediction accuracy in our case, while simultaneously reducing model computation time. We find that the prediction accuracy decreases when, instead of a single service, both services run on the same testbed simultaneously or when the network quality on the path between the server cluster and the client deteriorates. Finally, we discuss the design and implementation of a real-time analytics engine, which processes streams of device statistics and service metrics from testbed sensors and produces model predictions through online learning. 

Keywords
cloud computing, machine learning, quality of service, real-time network analytics, statistical learning, Forecasting, Learning systems, Regression analysis, Statistics, Testbeds, Video streaming, Design and implementations, Mean absolute error, Network statistics, Performance metrics, Prediction accuracy, Real time network, Real-time analytics, Distributed computer systems
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-33516 (URN); 10.1002/nem.1991 (DOI); 2-s2.0-85029351383 (Scopus ID)
Note

Funding details: VINNOVA; Funding details: 2013-03895, VINNOVA; This research has been supported by the Swedish Governmental Agency for Innovation Systems, VINNOVA, under grant 2013-03895.

Available from: 2018-03-23 Created: 2018-03-23 Last updated: 2018-08-17 Bibliographically approved
Ahmed, J. I., Josefsson, T., Johnsson, A., Flinta, C., Moradi, F., Pasquini, R. & Stadler, R. (2018). Automated diagnostic of virtualized service performance degradation. In: IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World, NOMS 2018. Paper presented at 2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018, 23 April 2018 through 27 April 2018.
2018 (English). In: IEEE/IFIP Network Operations and Management Symposium: Cognitive Management in a Cyber World, NOMS 2018, 2018. Conference paper, Published paper (Refereed)
Abstract [en]

Service assurance for cloud applications is a challenging task and is an active area of research for academia and industry. One promising approach is to utilize machine learning for service quality prediction and fault detection so that suitable mitigation actions can be executed. In our previous work, we have shown how to predict service-level metrics in real time just from operational data gathered at the server side. This gives the service provider early indications on whether the platform can support the current load demand. This paper provides the logical next step where we extend our work by proposing an automated detection and diagnostic capability for the performance faults manifesting themselves in cloud and datacenter environments. This is a crucial task to maintain the smooth operation of running services and minimize downtime. We demonstrate the effectiveness of our approach, which exploits the interpretative capabilities of Self-Organizing Maps (SOMs) to automatically detect and localize different performance faults for cloud services.

Keywords
Fault detection, Fault localization, Machine learning, Service quality, System statistics, Video streaming, Conformal mapping, Learning systems, Quality of service, Self organizing maps, Automated detection, Automated diagnostics, Cloud applications, Self organizing maps(soms), Virtualized services
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:ri:diva-37294 (URN); 10.1109/NOMS.2018.8406234 (DOI); 2-s2.0-85050672220 (Scopus ID); 9781538634165 (ISBN)
Conference
2018 IEEE/IFIP Network Operations and Management Symposium, NOMS 2018, 23 April 2018 through 27 April 2018
Available from: 2019-01-18 Created: 2019-01-18 Last updated: 2019-03-28 Bibliographically approved
Samani, F. S. & Stadler, R. (2018). Predicting Distributions of Service Metrics using Neural Networks. In: 14th International Conference on Network and Service Management, CNSM 2018 and Workshops, 1st International Workshop on High-Precision Networks Operations and Control, HiPNet 2018 and 1st Workshop on Segment Routing and Service Function Chaining, SR+SFC 2018. Paper presented at 14th International Conference on Network and Service Management, CNSM 2018 and Workshops, 1st International Workshop on High-Precision Networks Operations and Control, HiPNet 2018 and 1st Workshop on Segment Routing and Service Function Chaining, SR+SFC 2018, 5 November 2018 through 9 November 2018 (pp. 45-53).
2018 (English). In: 14th International Conference on Network and Service Management, CNSM 2018 and Workshops, 1st International Workshop on High-Precision Networks Operations and Control, HiPNet 2018 and 1st Workshop on Segment Routing and Service Function Chaining, SR+SFC 2018, 2018, p. 45-53. Conference paper, Published paper (Refereed)
Abstract [en]

We predict the conditional distributions of service metrics, such as response time or frame rate, from infrastructure measurements in a cloud environment. From such distributions, key statistics of the service metrics, including mean, variance, or percentiles can be computed, which are essential for predicting SLA conformance or enabling service assurance. We model the distributions as Gaussian mixtures, whose parameters we predict using mixture density networks, a class of neural networks. We apply the method to a VoD service and a KV store running on our lab testbed. The results validate the effectiveness of the method when applied to operational data. In the case of predicting the mean of the frame rate or response time, the accuracy matches that of random forest, a baseline model.
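
As a rough illustration of how such statistics follow from a Gaussian mixture (a minimal sketch, not the authors' implementation): once the mixture weights, means and standard deviations are known, which in the paper are predicted by a mixture density network, the mean and variance have closed forms and a percentile can be obtained by numerically inverting the mixture CDF. The function names, the bisection search and the 10-sigma search bracket are assumptions made for this example.

```python
import math


def mixture_mean(weights, means):
    """Mean of a Gaussian mixture: sum_k w_k * mu_k."""
    return sum(w * m for w, m in zip(weights, means))


def mixture_variance(weights, means, stds):
    """Variance via the second moment: sum_k w_k * (sigma_k^2 + mu_k^2) - mean^2."""
    mean = mixture_mean(weights, means)
    second_moment = sum(
        w * (s * s + m * m) for w, m, s in zip(weights, means, stds)
    )
    return second_moment - mean * mean


def mixture_cdf(x, weights, means, stds):
    """CDF of the mixture: weighted sum of the component normal CDFs."""
    return sum(
        w * 0.5 * (1.0 + math.erf((x - m) / (s * math.sqrt(2.0))))
        for w, m, s in zip(weights, means, stds)
    )


def mixture_quantile(q, weights, means, stds, iterations=100):
    """Quantile for 0 < q < 1, found by bisection on the monotone mixture CDF.

    The search bracket spans 10 standard deviations around every component
    (an assumption of this sketch that covers all but extreme tail quantiles).
    """
    lo = min(m - 10.0 * s for m, s in zip(means, stds))
    hi = max(m + 10.0 * s for m, s in zip(means, stds))
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if mixture_cdf(mid, weights, means, stds) < q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

For example, `mixture_quantile(0.95, weights, means, stds)` would give the 95th percentile of a predicted response-time distribution, which could then be compared against an SLA threshold.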

Keywords
Generative Models, Machine Learning, Network Management, Service Engineering, Decision trees, Learning systems, Routing algorithms, Baseline models, Cloud environments, Conditional distribution, Gaussian mixtures, Generative model, Operational data, Service assurance, Forecasting
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-37760 (URN); 2-s2.0-85060906697 (Scopus ID); 9783903176140 (ISBN)
Conference
14th International Conference on Network and Service Management, CNSM 2018 and Workshops, 1st International Workshop on High-Precision Networks Operations and Control, HiPNet 2018 and 1st Workshop on Segment Routing and Service Function Chaining, SR+SFC 2018, 5 November 2018 through 9 November 2018
Note

Funding details: VINNOVA; Funding text 1: The authors are grateful to Erik Ylipää with RISE SICS, as well as to Andreas Johnsson, Farnaz Moradi, Christofer Flinta, and Jawaad Ahmed with Ericsson Research for fruitful discussion around this work. This research has been partially supported by the Swedish Governmental Agency for Innovation Systems, VINNOVA, through project SENDATE-EXTEND.

Available from: 2019-02-11 Created: 2019-02-11 Last updated: 2019-08-08 Bibliographically approved
Pasquini, R. & Stadler, R. (2017). Learning end-to-end application QoS from openflow switch statistics. In: 2017 IEEE Conference on Network Softwarization: Softwarization Sustaining a Hyper-Connected World: en Route to 5G, NetSoft 2017. Paper presented at 2017 IEEE Conference on Network Softwarization, NetSoft 2017, 3 July 2017 through 7 July 2017. Institute of Electrical and Electronics Engineers Inc.
2017 (English). In: 2017 IEEE Conference on Network Softwarization: Softwarization Sustaining a Hyper-Connected World: en Route to 5G, NetSoft 2017, Institute of Electrical and Electronics Engineers Inc., 2017. Conference paper, Published paper (Refereed)
Abstract [en]

We use statistical learning to estimate end-to-end QoS metrics from device statistics, collected from a server cluster and an OpenFlow network. The results from our testbed, which runs a video-on-demand service and a key-value store, demonstrate that the learned models can estimate QoS metrics like frame rate or response time with errors below 10% for a given client. Interestingly, we find that service-level QoS metrics seem "encoded" in network statistics and it suffices to collect OpenFlow per-port statistics to achieve accurate estimation at small overhead for data collection and model computation.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2017
Keywords
Machine Learning, Network Analytics, Open-Flow, Quality of Service, Software-Defined Networking, Learning systems, Software defined networking, Statistics, Video on demand, Accurate estimation, End-to-end application, Model computation, Open flow, Openflow networks, Openflow switches, Statistical learning, Video on demand services
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-38065 (URN); 10.1109/NETSOFT.2017.8004198 (DOI); 2-s2.0-85029372779 (Scopus ID); 9781509060085 (ISBN)
Conference
2017 IEEE Conference on Network Softwarization, NetSoft 2017, 3 July 2017 through 7 July 2017
Available from: 2019-03-15 Created: 2019-03-15 Last updated: 2019-03-19 Bibliographically approved
Stadler, R., Pasquini, R. & Fodor, V. (2017). Learning from Network Device Statistics. Journal of Network and Systems Management, 25(4), 672-698
2017 (English). In: Journal of Network and Systems Management, ISSN 1064-7570, E-ISSN 1573-7705, Vol. 25, no 4, p. 672-698. Article in journal (Refereed) Published
Abstract [en]

We estimate end-to-end service metrics from network device statistics. Our approach is based upon statistical, supervised learning, whereby the mapping from device-level to service-level metrics is learned from observations, i.e., through monitoring the system. The approach enables end-to-end performance prediction without requiring an explicit model of the system, which is different from traditional engineering techniques that use stochastic modeling and simulation. The fact that end-to-end service metrics can be estimated from local network statistics with good accuracy in the scenarios we consider suggests that service-level properties are “encoded” in network-level statistics. We show that the set of network statistics needed for estimation can be reduced to a set of measurements along the network path between client and service backend, with little loss in estimation accuracy. The reported work is largely experimental and its results have been obtained through testbed measurements from a video streaming service and a KV store over an OpenFlow network.

Keywords
End-to-end performance Prediction, Feature selection, Machine learning, Network analytics, Network management, OpenFlow, Statistical learning, Feature extraction, Learning systems, Stochastic models, Stochastic systems, Video streaming, End-to-end performance, End-to-end service, Network statistics, Testbed measurements, Traditional engineerings, Video streaming services, Statistics
National Category
Electrical Engineering, Electronic Engineering, Information Engineering
Identifiers
urn:nbn:se:ri:diva-31335 (URN); 10.1007/s10922-017-9426-z (DOI); 2-s2.0-85029795404 (Scopus ID)
Available from: 2017-10-06 Created: 2017-10-06 Last updated: 2019-01-22 Bibliographically approved
Ahmed, J., Johnsson, A., Moradi, F., Pasquini, R., Flinta, C. & Stadler, R. (2017). Online approach to performance fault localization for cloud and datacenter services. In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management. Paper presented at 15th IFIP/IEEE International Symposium on Integrated Network and Service Management, IM 2017, 8 May 2017 through 12 May 2017 (pp. 873-874). Institute of Electrical and Electronics Engineers Inc.
2017 (English). In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management, Institute of Electrical and Electronics Engineers Inc., 2017, p. 873-874. Conference paper, Published paper (Refereed)
Abstract [en]

Automated detection and diagnosis of the performance faults in cloud and datacenter environments is a crucial task to maintain smooth operation of different services and minimize downtime. We demonstrate an effective machine learning approach based on detecting metric correlation stability violations (CSV) for automated localization of performance faults for datacenter services running under dynamic load conditions.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2017
Keywords
Dynamic loads, Learning systems, Automated detection, Datacenter, Different services, Fault localization, Load condition, Machine learning approaches, Fault detection
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-38070 (URN); 10.23919/INM.2017.7987390 (DOI); 2-s2.0-85029446145 (Scopus ID); 9783901882890 (ISBN)
Conference
15th IFIP/IEEE International Symposium on Integrated Network and Service Management, IM 2017, 8 May 2017 through 12 May 2017
Available from: 2019-03-15 Created: 2019-03-15 Last updated: 2019-03-19 Bibliographically approved
Pasquini, R., Moradi, F., Ahmed, J., Johnsson, A., Flinta, C. & Stadler, R. (2017). Predicting SLA conformance for cluster-based services. In: 2017 IFIP Networking Conference, IFIP Networking 2017 and Workshops. Paper presented at 2017 IFIP Networking Conference and Workshops, IFIP Networking 2017, 12 June 2017 through 16 June 2017 (pp. 1-2).
2017 (English). In: 2017 IFIP Networking Conference, IFIP Networking 2017 and Workshops, 2017, p. 1-2. Conference paper, Published paper (Refereed)
Abstract [en]

The ability to predict conformance or violation for given Service-level Agreements (SLAs) is critical for service assurance. We demonstrate a prototype for real-time conformance prediction based on the concept of the capacity region, which abstracts the underlying ICT infrastructure with respect to the load it can carry for a given SLA. The capacity region is estimated through measurements and statistical learning. We demonstrate prediction for a key-value store (Voldemort) that runs on a server cluster located at KTH.

Keywords
Capacity Region, Feasible Region, Real-time Prediction, Service-level Agreement (SLA), Statistical Learning, Network architecture, Quality of service, Capacity regions, Feasible regions, Service Level Agreement (SLA), Forecasting
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-34627 (URN); 10.23919/IFIPNetworking.2017.8264873 (DOI); 2-s2.0-85050566272 (Scopus ID); 9783901882944 (ISBN)
Conference
2017 IFIP Networking Conference and Workshops, IFIP Networking 2017, 12 June 2017 through 16 June 2017
Note

Funding details: VINNOVA; Funding details: VR, Vetenskapsrådet

Available from: 2018-08-14 Created: 2018-08-14 Last updated: 2019-02-06 Bibliographically approved
Flinta, C., Johnsson, A., Ahmed, J., Moradi, F., Pasquini, R. & Stadler, R. (2017). Real-time resource prediction engine for cloud management. In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management. Paper presented at 15th IFIP/IEEE International Symposium on Integrated Network and Service Management, IM 2017, 8 May 2017 through 12 May 2017 (pp. 877-878). Institute of Electrical and Electronics Engineers Inc.
2017 (English). In: Proceedings of the IM 2017 - 2017 IFIP/IEEE International Symposium on Integrated Network and Service Management, Institute of Electrical and Electronics Engineers Inc., 2017, p. 877-878. Conference paper, Published paper (Refereed)
Abstract [en]

Predicting resource requirements for cloud services is critical for dimensioning, anomaly detection and service assurance. We demonstrate a system for real-time estimation of the needed amount of infrastructure resources, such as CPU and memory, for a given service. Statistical learning methods on server statistics and load parameters of the service are used for learning a resource prediction model. The model can be used as a guideline for service deployment and for real-time identification of resource bottlenecks. © 2017 IFIP.

Place, publisher, year, edition, pages
Institute of Electrical and Electronics Engineers Inc., 2017
Keywords
Cloud managements, Infrastructure resources, Real-time estimation, Real-time identification, Resource prediction, Resource requirements, Service deployment, Statistical learning methods, Forecasting
National Category
Engineering and Technology
Identifiers
urn:nbn:se:ri:diva-38069 (URN); 10.23919/INM.2017.7987392 (DOI); 2-s2.0-85029437876 (Scopus ID); 9783901882890 (ISBN)
Conference
15th IFIP/IEEE International Symposium on Integrated Network and Service Management, IM 2017, 8 May 2017 through 12 May 2017
Available from: 2019-03-15 Created: 2019-03-15 Last updated: 2019-03-19 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0000-0001-6039-8493
