Publications (10 of 12)
Oz, I., Bhatti, M. K. K., Popov, K. & Brorsson, M. (2019). Regression-Based Prediction for Task-Based Program Performance. Journal of Circuits, Systems and Computers, 28(4), Article ID 1950060.
2019 (English). In: Journal of Circuits, Systems and Computers, ISSN 0218-1266, Vol. 28, no. 4, article id 1950060. Article in journal (Refereed). Published.
Abstract [en]

As multicore systems evolve by increasing the number of parallel execution units, parallel programming models have been released to exploit parallelism in applications. The task-based programming model uses task abstractions to specify parallel tasks and schedules tasks onto processors at runtime. To achieve the highest performance, it is necessary to identify which runtime configuration is needed and how processor cores must be shared among tasks. Exploring the design space for all possible scheduling and runtime options, especially for large input data, becomes infeasible and calls for statistical modeling. Regression-based modeling determines the effects of multiple factors on a response variable and makes predictions based on statistical analysis. In this work, we propose a regression-based modeling approach to predict task-based program performance for different scheduling parameters with variable data size. We execute a set of task-based programs while varying the runtime parameters, and conduct a systematic measurement of the factors influencing execution time. Our approach uses executions with different configurations for a set of input data, and derives regression models to predict execution time for larger input data. Our results show that the regression models provide accurate predictions for validation inputs, with a mean error rate as low as 6.3% and 14% on average across four task-based programs.
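The regression idea above can be sketched in a few lines. This is an illustrative model, not the paper's: the two factors (parallel work per core and a per-core overhead term) and the synthetic measurements below are assumptions chosen so the example is self-contained.

```python
import numpy as np

# Synthetic measurements: execution time for (core count, input size) pairs.
# The ground truth t = size/(1e6 * cores) + 0.1 * cores stands in for real runs.
cores = np.array([1, 2, 4, 8, 1, 2, 4, 8], dtype=float)
size = np.array([1e6, 1e6, 1e6, 1e6, 4e6, 4e6, 4e6, 4e6])
time = size / (1e6 * cores) + 0.1 * cores

# Regress execution time on an intercept, the parallel work per core,
# and a per-core scheduling-overhead term.
X = np.column_stack([np.ones_like(cores), size / cores, cores])
coef, *_ = np.linalg.lstsq(X, time, rcond=None)

# Predict for an input larger than any training run: 8e6 elements on 8 cores.
pred = float(np.array([1.0, 8e6 / 8, 8.0]) @ coef)
```

Fitting on small inputs and predicting a larger one mirrors the setup of extrapolating to larger data sizes; a real study would validate the model against held-out runs.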

Keywords
Performance prediction, regression, task-based programs, Computer systems programming, Forecasting, Input output programs, Parallel processing systems, Parallel programming, Regression analysis, Scheduling, Parallel programming model, Regression-based model, Run-time configuration, Scheduling parameters, Task-based, Task-based programming, Multicore programming
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-34593 (URN), 10.1142/S0218126619500609 (DOI), 2-s2.0-85049081368 (Scopus ID)
Available from: 2018-08-14. Created: 2018-08-14. Last updated: 2019-07-01. Bibliographically approved.
Bhatti, M. K., Oz, I., Amin, S., Mushtaq, M., Farooq, U., Popov, K. & Brorsson, M. (2018). Locality-aware task scheduling for homogeneous parallel computing systems. Computing, 100(6), 557-595
2018 (English). In: Computing, ISSN 0010-485X, E-ISSN 1436-5057, Vol. 100, no. 6, p. 557-595. Article in journal (Refereed). Published.
Abstract [en]

In systems with a complex many-core cache hierarchy, exploiting data locality can significantly reduce the execution time and energy consumption of parallel applications. Locality can be exploited at various hardware and software layers. For instance, by implementing private and shared caches in a multi-level fashion, recent hardware designs are already optimised for locality. However, this is all wasted if software scheduling does not cast the execution in a manner that promotes the locality available in the programs themselves. Since programs for parallel systems consist of tasks executed simultaneously, task scheduling becomes crucial for performance in multi-level cache architectures. This paper presents a heuristic algorithm for homogeneous multi-core systems called locality-aware task scheduling (LeTS). The LeTS heuristic is a work-conserving algorithm that takes into account both locality and load balancing in order to reduce the execution time of target applications. The working principle of LeTS is based on two distinctive phases, namely the working task group formation phase (WTG-FP) and the working task group ordering phase (WTG-OP). The WTG-FP forms groups of tasks in order to capture data reuse across tasks, while the WTG-OP determines an optimal order of execution for task groups that minimizes the reuse distance of shared data between tasks. We have performed experiments using randomly generated task graphs, varying three major performance parameters: (1) communication-to-computation ratio (CCR), between 0.1 and 1.0; (2) application size, i.e., task graphs comprising 50, 100, and 300 tasks per graph; and (3) number of cores, with 2-, 4-, 8-, and 16-core execution scenarios. We have also performed experiments using selected real-world applications. The LeTS heuristic reduces the overall execution time of applications by exploiting inter-task data locality. Results show that LeTS outperforms state-of-the-art algorithms in amortizing inter-task communication cost.
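The group-formation phase can be illustrated with a toy pass that unions tasks whose data footprints overlap. The task set and the overlap rule below are hypothetical stand-ins, not the paper's WTG-FP heuristic.

```python
# Hypothetical tasks and the data blocks each one touches.
tasks = {"t1": {"A"}, "t2": {"A", "B"}, "t3": {"B"}, "t4": {"C"}}

def form_groups(tasks):
    """Greedy WTG-FP-style pass: put a task into the first group whose
    data footprint it overlaps, so data reuse stays within a group."""
    groups = []
    for name, data in tasks.items():
        for g in groups:
            if g["data"] & data:          # shared block -> same group
                g["tasks"].append(name)
                g["data"] |= data
                break
        else:
            groups.append({"tasks": [name], "data": set(data)})
    return groups

groups = form_groups(tasks)
# t1, t2, t3 share blocks A/B transitively; t4 stands alone.
```

A full scheduler would then order the groups (the WTG-OP phase) to shrink the reuse distance of shared data, and balance them across cores.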

Keywords
Directed acyclic graph (DAG), Embedded systems, Homogeneous systems, Multicore scheduling, Parallel computing, Runtime resource management, Directed graphs, Distributed computer systems, Energy utilization, Hardware, Heuristic algorithms, Multitasking, Optimization, Parallel processing systems, Scheduling algorithms, Homogeneous system, Multi core, Multi-level cache architecture, Parallel computing system, Performance parameters, Resource management, State-of-the-art algorithms, Scheduling
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-33981 (URN), 10.1007/s00607-017-0581-6 (DOI), 2-s2.0-85032798462 (Scopus ID)
Available from: 2018-07-03. Created: 2018-07-03. Last updated: 2018-08-14. Bibliographically approved.
Al-Shishtawy, A., Asif Fayyaz, M., Popov, K. & Vlassov, V. (2010). Achieving Robust Self-Management for Large-Scale Distributed Applications (7ed.). Kista, Sweden
2010 (English). Report (Other academic).
Abstract [en]

Autonomic managers are the main architectural building blocks for constructing self-management capabilities of computing systems and applications. One of the major challenges in developing self-managing applications is the robustness of the management elements that form autonomic managers. We believe that transparent handling of the effects of resource churn (joins/leaves/failures) on management should be an essential feature of a platform for self-managing large-scale dynamic distributed applications, because it facilitates the development of robust autonomic managers and hence improves the robustness of self-managing applications. This feature can be achieved by providing a robust management element abstraction that hides churn from the programmer. In this paper, we present a generic approach to achieving robust services that is based on finite state machine replication with dynamic reconfiguration of replica sets. We contribute a decentralized algorithm that maintains the set of nodes hosting service replicas in the presence of churn. We use this approach to implement robust management elements as robust services that can operate despite churn. Our proposed decentralized algorithm uses peer-to-peer replica placement schemes to automate replicated state machine migration in order to tolerate churn. The algorithm exploits the lookup and failure detection facilities of a structured overlay network to manage the set of active replicas. Using the proposed approach, we can achieve a long-running and highly available service, without human intervention, in the presence of resource churn. In order to validate and evaluate our approach, we have implemented a prototype that includes the proposed algorithm.
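The replica-placement idea can be sketched as choosing the k nodes that follow a service identifier on a ring, so a failed replica is automatically replaced by the next live node found by overlay lookup. The ring size, node ids, and k below are illustrative assumptions; this is not the paper's exact algorithm.

```python
RING = 2 ** 16  # illustrative identifier-space size

def replica_set(service_id, nodes, k=3):
    """Return the k live nodes clockwise from service_id on the ring."""
    return sorted(nodes, key=lambda n: (n - service_id) % RING)[:k]

nodes = {100, 2000, 30000, 45000, 60000}
before = replica_set(7000, nodes)
after = replica_set(7000, nodes - {30000})  # node 30000 churns out
# The set heals itself: the next node on the ring replaces the failed replica.
```

Because membership is recomputed from the ring alone, no central coordinator is needed when nodes join, leave, or fail.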

Place, publisher, year, edition, pages
Kista, Sweden, 2010. Edition: 7
Series
SICS Technical Report, ISSN 1100-3154 ; 2010:02
Keywords
autonomic computing, distributed systems, self-management, replicated state machines, service migration, peer-to-peer
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23666 (URN)
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-14. Bibliographically approved.
De Palma, N., Parlavanzas, N., Popov, K., Brand, P. & Vlassov, V. (2009). Tools for Autonomic Computing (10ed.). In: 5th International Conference on Autonomic and Autonomous Systems (ICAS 2009). Paper presented at 5th International Conference on Autonomic and Autonomous Systems (ICAS 2009), 21-25 April 2009, Valencia, Spain (pp. 313-320). IEEE Computer Society.
2009 (English). In: 5th International Conference on Autonomic and Autonomous Systems (ICAS 2009), IEEE Computer Society, 2009, 10, p. 313-320. Conference paper, Published paper (Refereed).
Place, publisher, year, edition, pages
IEEE Computer Society, 2009 Edition: 10
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-23669 (URN)
Conference
5th International Conference on Autonomic and Autonomous Systems (ICAS 2009), 21-25 April 2009, Valencia, Spain.
Projects
Grid4All
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-14. Bibliographically approved.
Faxén, K.-F., Popov, K., Janson, S. & Albertsson, L. (2008). Embla - Data Dependence Profiling for Parallel Programming (10ed.). In: Proceedings of the 2008 International Conference on Complex, Intelligent and Software Intensive Systems. Paper presented at 2008 International Conference on Complex, Intelligent and Software Intensive Systems (pp. 780-785).
2008 (English). In: Proceedings of the 2008 International Conference on Complex, Intelligent and Software Intensive Systems, 2008, 10, p. 780-785. Conference paper, Published paper (Refereed).
Abstract [en]

With the proliferation of multicore processors, there is an urgent need for tools and methodologies supporting the parallelization of existing applications. In this paper, we present a novel tool for aiding programmers in parallelizing programs. The tool, Embla, is based on the Valgrind framework and allows the user to discover the data dependences in a sequential program, thereby exposing opportunities for parallelization. Embla performs an off-line dynamic analysis and records dependences as they arise during program execution. It reports an optimistic view of parallelizable sequences, ignoring dependences that do not arise during execution. Moreover, since the tool instruments the machine code of the program, it is largely language independent. Since Embla finds the dependences that occur for particular executions, the confidence one would assign to its results depends on whether different executions yield different (bad) or largely the same (good) dependences. We present a preliminary investigation into this issue using 84 different inputs to the SPEC CPU 2006 benchmark 403.gcc. The results indicate that there is a strong correlation between coverage and finding dependences; executing the entire program is likely to reveal all dependences.
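The heart of such dynamic dependence recording can be sketched over a tiny memory trace: remember the last writer of each address, and emit a read-after-write edge whenever a later instruction reads that address. The trace format and instruction ids below are hypothetical, not Valgrind's or Embla's interfaces.

```python
# Hypothetical trace entries: (instruction id, op, address).
trace = [
    (1, "W", 0x10),
    (2, "R", 0x10),  # reads what 1 wrote
    (3, "W", 0x20),
    (4, "R", 0x20),  # reads what 3 wrote
    (4, "R", 0x10),  # also reads what 1 wrote
]

last_writer = {}
deps = set()
for insn, op, addr in trace:
    if op == "W":
        last_writer[addr] = insn
    elif addr in last_writer and last_writer[addr] != insn:
        deps.add((last_writer[addr], insn))  # RAW edge: writer -> reader
```

Instruction pairs with no edge between them are candidates for parallel execution, which is exactly the optimistic, execution-specific view the abstract describes.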

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-22255 (URN)
Conference
2008 International Conference on Complex, Intelligent and Software Intensive Systems
Projects
Embla
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-14. Bibliographically approved.
Nimar, G., Vlassov, V. & Popov, K. (2005). Practical Experience in Building an Agent System for Semantics-Based Provision and Selection of Grid Services (1ed.). Paper presented at the Sixth International Conference on Parallel Processing and Applied Mathematics.
2005 (English). Conference paper, Published paper (Refereed).
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-21099 (URN)
Conference
Sixth International Conference on Parallel Processing and Applied Mathematics
Projects
GES3
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-14. Bibliographically approved.
Groleau, W., Vlassov, V. & Popov, K. (2005). Towards Semantics-Based Resource Discovery for the Grid (1ed.). Paper presented at the CoreGRID Integration Workshop.
2005 (English). Conference paper, Published paper (Refereed).
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-21100 (URN)
Conference
CoreGRID Integration Workshop
Projects
GES3
Note

Preliminary proceedings.

Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-14. Bibliographically approved.
Popov, K., Brand, P. & Haridi, S. (2003). An efficient marshaling framework for distributed systems (2ed.). Paper presented at PaCT 2003: 7th Int'l Conference on Parallel Computing Technologies, 15-19 Sept 2003, Nizhni Novgorod, Russia.
2003 (English). Conference paper, Published paper (Refereed).
Abstract [en]

An efficient (un)marshaling framework is presented. It is designed for distributed applications implemented in languages such as C++. A marshaler/unmarshaler pair converts arbitrary structured data between its host and network representations. This technology can also be used for persistent storage. Our framework simplifies the design of efficient and flexible marshalers. Network latency is reduced by concurrent execution of (un)marshaling and network operations. The framework is used in Mozart, a distributed programming system that implements Oz, a multi-paradigm concurrent language. Mozart, including the implementation of the framework, is available at http://www.mozart-oz.org.
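The part of marshaling that preserves arbitrary structure can be sketched with a table of already-serialized nodes, so shared substructures cross the wire once and later occurrences become back references. The token format below is an illustrative assumption, not Mozart's wire format.

```python
def marshal(value, out, seen=None):
    """Serialize nested lists and ints into a flat token stream,
    emitting a REF token for any node serialized earlier."""
    seen = {} if seen is None else seen
    if id(value) in seen:
        out.append(("REF", seen[id(value)]))   # back reference, not a copy
    elif isinstance(value, list):
        seen[id(value)] = len(seen)            # register before recursing
        out.append(("LIST", len(value)))
        for item in value:
            marshal(item, out, seen)
    else:
        out.append(("INT", value))
    return out

shared = [1, 2]
stream = marshal([shared, shared], [])
# The second occurrence of `shared` is a single REF token.
```

Registering a node before recursing into it is also what lets a marshaler of this shape terminate on cyclic data.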

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-22429 (URN)
Conference
PaCT 2003: 7th Int'l Conference on Parallel Computing Technologies, 15-19 Sept 2003, Nizhni Novgorod, Russia
Note

http://www.sics.se/~kost/pubs/marshaling-PaCT03.ps

Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-20. Bibliographically approved.
Popov, K., Vlassov, V., Rafea, M., Holmgren, F., Brand, P. & Haridi, S. (2003). Parallel agent-based simulation on a cluster of workstations (1ed.). Paper presented at Euro-Par 2003: International Conference on Parallel and Distributed Computing, 26-29 Aug 2003, Klagenfurt, Austria.
2003 (English). Conference paper, Published paper (Refereed).
Abstract [en]

We discuss a parallel implementation of an agent-based simulation. Our approach makes it possible to adapt a sequential simulator for large-scale simulation on a cluster of workstations. We target discrete-time simulation models that capture the behavior of the Web. The real-world phenomenon of emergent aggregate behavior of the Internet population is studied. The system distributes data among workstations, which allows large-scale simulations infeasible on a stand-alone computer. The model properties cause traffic between workstations proportional to partition sizes. Network latency is hidden by concurrent simulation of multiple users. The system is implemented in Mozart, which provides multithreading, dataflow variables, component-based software development, and network transparency. Currently we can simulate up to 10^6 Web users on 10^4 Web sites using a cluster of 16 computers, which takes a few seconds per simulation step; for a problem of the same size, parallel simulation offers speedups between 11 and 14.
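The partitioned discrete-time scheme can be sketched as follows: users are split across partitions, each step every partition advances its own users, and messages for users held elsewhere are batched and delivered the next step. The partition layout and the ring-neighbor gossip rule below are hypothetical stand-ins for the simulated user behaviour.

```python
partitions = {0: [0, 1], 1: [2, 3]}   # partition id -> user ids it owns
state = {u: 0 for u in range(4)}      # per-user state (messages received)

def step(mail):
    """One discrete time step: deliver last step's mail, emit new mail."""
    next_mail = []
    for users in partitions.values():
        for u in users:
            state[u] += sum(v for (to, v) in mail if to == u)
            next_mail.append(((u + 1) % 4, 1))  # gossip to ring neighbor
    return next_mail

mail = step([])    # step 1: everyone gossips; nothing delivered yet
mail = step(mail)  # step 2: each user receives one message
```

Batching cross-partition mail per step is what keeps inter-workstation traffic proportional to partition sizes, as the abstract notes.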

Publisher
p. 11
Series
LNCS ; 2790
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-22430 (URN), 10.1007/b12024 (DOI), 978-3-540-40788-1 (ISBN)
Conference
Euro-Par 2003: International Conference on Parallel and Distributed Computing, 26-29 Aug 2003, Klagenfurt, Austria
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-20. Bibliographically approved.
Popov, K., Rafea, M., Haridi, S., Holmgren, F., Brand, P. & Vlassov, V. (2003). Parallel agent-based simulation on a cluster of workstations (1ed.). Parallel Processing Letters, 13(4), 629-641.
2003 (English). In: Parallel Processing Letters, ISSN 0129-6264, Vol. 13, no. 4, p. 629-641. Article in journal (Refereed). Published.
Abstract [en]

We discuss a parallel implementation of an agent-based simulation. Our approach makes it possible to adapt a sequential simulator for large-scale simulation on a cluster of workstations. We target discrete-time simulation models that capture the behavior of Web users and Web sites. Web users are connected with each other in a graph resembling the social network. Web sites are also connected in a similar graph. Users are stateful entities. At each time step, they exhibit certain behaviour such as visiting bookmarked sites, exchanging information about Web sites in the "word-of-mouth" style, and updating bookmarks. The real-world phenomenon of emergent aggregate behavior of the Internet population is studied. The system distributes data among workstations, which allows large-scale simulations infeasible on a stand-alone computer. The model properties cause traffic between workstations proportional to partition sizes. Network latency is hidden by concurrent simulation of multiple users. The system is implemented in Mozart, which provides multithreading, dataflow variables, component-based software development, and network transparency. Currently we can simulate up to 1 million Web users on 100 million Web sites using a cluster of 16 computers, which takes a few seconds per simulation step; for a problem of the same size, parallel simulation offers speedups between 11 and 14.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-22388 (URN), 10.1142/S0129626403001562 (DOI)
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2018-08-20. Bibliographically approved.
Identifiers
ORCID iD: orcid.org/0000-0002-9431-5139