  • 1.
    Balador, Ali
    et al.
    RISE Research Institutes of Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Pettersson, M
    Artificial Intelligence Enabled Distributed Edge Computing for Internet of Things. 2022. In: ERCIM News, ISSN 0926-4981, E-ISSN 1564-0094, no. 129, pp. 41-42. Journal article (Other academic)
  • 2.
    Balador, Ali
    et al.
    RISE Research Institutes of Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Pettersson, Mats
    Blekinge Institute of Technology, Sweden.
    Kaya, Ilhan
    Organize Sanayi Bolgesi, Turkey.
    DAIS Project - Distributed Artificial Intelligence Systems: Objectives and Challenges. 2023. In: ACM SIGAda Ada Letters, ISSN 1094-3641, E-ISSN 1557-9476, Vol. 42, no. 2, pp. 96-98. Journal article (Refereed)
    Abstract [en]

    DAIS is a step forward in the area of artificial intelligence and edge computing. DAIS intends to create a complete framework for self-organizing, energy efficient and private-by-design distributed AI. DAIS is a European project with a consortium of 47 partners from 11 countries coordinated by RISE Research Institute of Sweden.

  • 3.
    Geraeinejad, V.
    et al.
    University of Tehran, Iran.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Modarressi, M.
    University of Tehran, Iran; School of Computer Science, Iran.
    Daneshtalab, M.
    University of Tehran, Iran; Tallinn University of Technology, Estonia.
    RoCo-NAS: Robust and Compact Neural Architecture Search. 2021. In: Proceedings of the International Joint Conference on Neural Networks, Vol. July. Journal article (Refereed)
    Abstract [en]

    Deep model compression has been studied widely, and state-of-the-art methods can now achieve high compression ratios with minimum accuracy loss. Recent advances in adversarial attacks reveal the inherent vulnerability of deep neural networks to slightly perturbed images called adversarial examples. Since then, extensive efforts have been performed to enhance deep networks’ robustness via specialized loss functions and learning algorithms. Previous works suggest that network size and robustness against adversarial examples contradict on most occasions. In this paper, we investigate how to optimize compactness and robustness to adversarial attacks of neural network architectures while maintaining the accuracy using multi-objective neural architecture search. We propose the use of previously generated adversarial examples as an objective to evaluate the robustness of our models in addition to the number of floating-point operations to assess model complexity i.e. compactness. Experiments on some recent neural architecture search algorithms show that due to their limited search space they fail to find robust and compact architectures. By creating a novel neural architecture search (RoCo-NAS), we were able to evolve an architecture that is up to 7% more accurate against adversarial samples than its more complex architecture counterpart. Thus, the results show inherently robust architectures regardless of their size. This opens up a new range of possibilities for the exploration and design of deep neural networks using automatic architecture search.

  • 4.
    Loni, M.
    et al.
    Mälardalen University, Sweden.
    Sinaei, Sima
    Mälardalen University, Sweden.
    Zoljodi, A.
    Shiraz University of Technology, Iran.
    Daneshtalab, M.
    Mälardalen University, Sweden.
    Sjödin, M.
    Mälardalen University, Sweden.
    DeepMaker: A multi-objective optimization framework for deep neural networks in embedded systems. 2020. In: Microprocessors and microsystems, ISSN 0141-9331, E-ISSN 1872-9436, Vol. 73. Journal article (Refereed)
    Abstract [en]

    Deep Neural Networks (DNNs) are compute-intensive learning models with growing applicability in a wide range of domains. Due to their computational complexity, DNNs benefit from implementations that utilize custom hardware accelerators to meet performance and response time as well as classification accuracy constraints. In this paper, we propose the DeepMaker framework, which aims to automatically design a set of highly robust DNN architectures for embedded devices as the closest processing unit to the sensors. DeepMaker explores and prunes the design space to find improved neural architectures. Our proposed framework takes advantage of a multi-objective evolutionary approach that exploits a pruned design space inspired by a dense architecture. DeepMaker considers the accuracy along with the network size factor as two objectives to build a highly optimized network that fits within limited computational resource budgets while delivering an acceptable accuracy level. In comparison with the best result on the CIFAR-10 dataset, a network generated by DeepMaker presents up to a 26.4x compression rate while losing only 4% accuracy. In addition, DeepMaker maps the generated CNN onto programmable commodity devices, including ARM Processor, High-Performance CPU, GPU, and FPGA.
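
The two-objective trade-off described in this abstract (accuracy versus network size) can be illustrated with a non-dominated filter. This is only a minimal sketch of the general Pareto-front idea, not DeepMaker's evolutionary search; the function names and the candidate values are hypothetical and are not results from the paper.

```python
from typing import List, Tuple

# Each candidate is (error_rate, parameter_count); both objectives are minimised.
Candidate = Tuple[float, float]

def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b in both objectives and strictly better in one."""
    return a[0] <= b[0] and a[1] <= b[1] and (a[0] < b[0] or a[1] < b[1])

def pareto_front(candidates: List[Candidate]) -> List[Candidate]:
    """Keep only the candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]

# Hypothetical (error, parameter-count) pairs, made up for illustration.
networks = [(0.10, 5.0e6), (0.12, 1.0e6), (0.15, 0.9e6), (0.11, 6.0e6), (0.20, 2.0e6)]
front = pareto_front(networks)
```

A multi-objective search would typically return such a front and leave the final choice (for example, the smallest network within an accuracy budget) to the designer.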

  • 5.
    Loni, M.
    et al.
    Mälardalen University, Sweden.
    Zoljodi, A.
    Shiraz University of Technology, Iran.
    Sinaei, Sima
    Mälardalen University, Sweden.
    Daneshtalab, M.
    Mälardalen University, Sweden.
    Sjödin, M.
    Mälardalen University, Sweden.
    NeuroPower: Designing Energy Efficient Convolutional Neural Network Architecture for Embedded Systems. 2019. In: Lecture Notes in Computer Science, ISSN 0302-9743, E-ISSN 1611-3349, Vol. 11727 LNCS, pp. 208-222. Journal article (Refereed)
    Abstract [en]

    Convolutional Neural Networks (CNNs) suffer from energy-hungry implementation due to their computation and memory intensive processing patterns. This problem is made even more significant by the proliferation of CNNs on embedded platforms. To overcome this problem, we offer NeuroPower as an automatic framework that designs a highly optimized and energy efficient set of CNN architectures for embedded systems. NeuroPower explores and prunes the design space to find an improved set of neural architectures. Toward this aim, a multi-objective optimization strategy is integrated to solve the Neural Architecture Search (NAS) problem by near-optimal tuning of network hyperparameters. The main objectives of the optimization algorithm are network accuracy and the number of parameters in the network. The evaluation results show the effectiveness of NeuroPower on energy consumption, compacting rate and inference time compared to other cutting-edge approaches. In comparison with the best results on the CIFAR-10/CIFAR-100 datasets, a network generated by NeuroPower presents up to 2.1x/1.56x compression rate, 1.59x/3.46x speedup and 1.52x/1.82x power saving while losing 2.4%/0.6% accuracy, respectively.

  • 6.
    Mirsalari, S. A.
    et al.
    University of Tehran, Iran.
    Nazari, N.
    University of Tehran, Iran.
    Ansarmohammadi, S. A.
    University of Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, Sweden.
    Salehi, M. E.
    University of Tehran, Iran.
    Daneshtalab, M.
    Mälardalen University, Sweden; Tallinn University of Technology, Estonia.
    ELC-ECG: Efficient LSTM cell for ECG classification based on quantized architecture. 2021. In: Proceedings - IEEE International Symposium on Circuits and Systems, ISSN 0271-4310, Vol. May. Journal article (Refereed)
    Abstract [en]

    Long Short-Term Memory (LSTM) is one of the most popular and effective Recurrent Neural Network (RNN) models used for sequence learning in applications such as ECG signal classification. Complex LSTMs could hardly be deployed on resource-limited bio-medical wearable devices due to the huge amount of computations and memory requirements. Binary LSTMs are introduced to cope with this problem. However, naive binarization leads to significant accuracy loss in ECG classification. In this paper, we propose an efficient LSTM cell along with a novel hardware architecture for ECG classification. By deploying 5-level binarized inputs and just 1-level binarization for weights, output, and in-memory cell activations, the delay of one LSTM cell operation is reduced 50x with about 0.004% accuracy loss in comparison with full precision design of ECG classification.

  • 7.
    Mirsalari, S. A.
    et al.
    University of Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, Sweden.
    Salehi, M. E.
    University of Tehran, Iran.
    Daneshtalab, M.
    Mälardalen University, Sweden.
    MuBiNN: Multi-level binarized recurrent neural network for EEG signal classification. 2020. In: Proceedings - IEEE International Symposium on Circuits and Systems, ISSN 0271-4310, Vol. October. Journal article (Refereed)
    Abstract [en]

    Recurrent Neural Networks (RNNs) are widely used for learning sequences in applications such as EEG classification. Complex RNNs can hardly be deployed on wearable devices due to their computation and memory-intensive processing patterns. Generally, a reduction in precision leads to much greater efficiency, and binarized RNNs are introduced as energy-efficient solutions. However, naive binarization methods lead to significant accuracy loss in EEG classification. In this paper, we propose a multi-level binarized LSTM, which significantly reduces computations while ensuring an accuracy very close to that of the full precision LSTM. Our method reduces the delay of the 3-bit LSTM cell operation 47× with less than 0.01% accuracy loss.
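
To make the idea of multi-level binarization concrete, the sketch below approximates a weight vector as a sum of scaled sign vectors using greedy residual binarization. This is a generic scheme chosen for illustration and is an assumption; the abstract does not specify the exact quantizer used in the paper. All names and the sample weights are hypothetical.

```python
from typing import List, Tuple

def multilevel_binarize(w: List[float], levels: int) -> List[Tuple[float, List[int]]]:
    """Greedy residual binarization: w is approximated by sum_k alpha_k * b_k, b_k in {-1, +1}."""
    residual = list(w)
    terms = []
    for _ in range(levels):
        # Scale for this level: mean absolute value of what is still unexplained.
        alpha = sum(abs(r) for r in residual) / len(residual)
        # Binary code for this level: the sign of each residual entry.
        signs = [1 if r >= 0 else -1 for r in residual]
        terms.append((alpha, signs))
        residual = [r - alpha * s for r, s in zip(residual, signs)]
    return terms

def reconstruct(terms: List[Tuple[float, List[int]]], n: int) -> List[float]:
    """Sum the scaled sign vectors back into a real-valued approximation."""
    out = [0.0] * n
    for alpha, signs in terms:
        for i, s in enumerate(signs):
            out[i] += alpha * s
    return out

def squared_error(w: List[float], approx: List[float]) -> float:
    return sum((a - b) ** 2 for a, b in zip(w, approx))

weights = [0.9, -0.4, 0.1, -0.7]  # hypothetical weights, not from the paper
errors = [
    squared_error(weights, reconstruct(multilevel_binarize(weights, k), len(weights)))
    for k in (1, 2, 3)
]
```

Each extra level subtracts a non-negative amount from the squared error, so `errors` never increases with the number of levels; the cost is one additional binary vector and one scale factor per level.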

  • 8.
    Mirsalari, Seyed Ahmad
    et al.
    University of Tehran, Iran.
    Nazari, Najmeh
    University of Tehran, Iran.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Salehi, Mostafa E.
    University of Tehran, Iran; School of Computer Science, Institute for Research in Fundamental Sciences, Iran.
    Daneshtalab, Masoud
    Mälardalen University, Sweden.
    FaCT-LSTM: Fast and Compact Ternary Architecture for LSTM Recurrent Neural Networks. 2022. In: IEEE design & test, ISSN 2168-2356, E-ISSN 2168-2364, Vol. 39, no. 3, pp. 45-53. Journal article (Refereed)
    Abstract [en]

    This article proposes a Fast and Compact Ternary LSTM (FaCTLSTM), which bridges the accuracy gap between the full-precision and quantized neural networks.

  • 9.
    Mishchenko, Kateryna
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Mohammadi, Samaneh
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Mohammadi, Mohammadreza
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Hyperparameters Optimization for Federated Learning System: Speech Emotion Recognition Case Study. 2023. In: 2023 Eighth International Conference on Fog and Mobile Edge Computing (FMEC), IEEE, 2023, pp. 80-86. Conference paper (Refereed)
    Abstract [en]

    Federated Learning (FL) has emerged as a promising, massively distributed way to train a joint deep model across numerous edge devices, ensuring user data privacy by retaining it on the device. In FL, Hyperparameters (HP) significantly affect the training overhead regarding computation and transmission time, computation and transmission load, as well as model accuracy. This paper presents a novel approach where Hyperparameters Optimization (HPO) is used to optimize the performance of the FL model for a Speech Emotion Recognition (SER) application. To solve this problem, both Single-Objective Optimization (SOO) and Multi-Objective Optimization (MOO) models are developed and evaluated. The optimization model includes two objectives: accuracy and total execution time. Numerical results show that optimal HP settings allow for improving both the accuracy of the model and its computation time. The proposed method assists FL system designers in finding an optimal parameter setup, allowing them to carry out model design and development efficiently depending on their goals.

  • 10.
    Mohammadi, Mohammadreza
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system. University of Padova, Italy.
    Allocca, Roberto
    University of Naples Federico II, Italy.
    Eklund, David
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Shrestha, Rakesh
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Privacy-preserving Federated Learning System for Fatigue Detection. 2023. In: Proceedings of the 2023 IEEE International Conference on Cyber Security and Resilience, CSR 2023, pp. 624-629. Journal article (Refereed)
    Abstract [en]

    Drowsiness affects the driver’s cognitive abilities, which are all important for safe driving. Fatigue detection is a critical technique to avoid traffic accidents. Data sharing among vehicles can be used to optimize fatigue detection models and ensure driving safety. However, data privacy issues hinder the sharing process. To tackle these challenges, we propose a Federated Learning (FL) approach for fatigue-driving behavior monitoring. However, in the FL system, the private information of the drivers might be leaked. In this paper, we propose to combine the concept of differential privacy (DP) with Federated Learning for the fatigue detection application, in which artificial noise is added to parameters at the drivers’ side before aggregating. This approach will ensure the privacy of drivers’ data and the convergence of the federated learning algorithms. In this paper, the privacy level in the system is determined in order to achieve a balance between the noise scale and the model’s accuracy. In addition, we have evaluated our model’s resistance against a model inversion attack. The effectiveness of the attack is measured by the Mean Squared Error (MSE) between the reconstructed data point and the training data. The proposed approach, compared to the non-DP case, has a 6% accuracy loss while decreasing the effectiveness of the attacks by increasing the MSE from 5.0 to 7.0, so a balance between accuracy and noise scale is achieved.
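
The mechanism in this abstract (noise added to client parameters before aggregation, with attack quality measured by MSE) can be sketched as follows. This is a minimal illustration that assumes norm clipping plus Gaussian noise and a plain average on the server; the paper's actual noise calibration and aggregation rule are not given in the abstract, and every name and value below is hypothetical.

```python
import random
from typing import List

def clip(update: List[float], max_norm: float) -> List[float]:
    """Scale the update so its L2 norm is at most max_norm (bounds each client's influence)."""
    norm = sum(x * x for x in update) ** 0.5
    scale = min(1.0, max_norm / norm) if norm > 0 else 1.0
    return [x * scale for x in update]

def add_gaussian_noise(update: List[float], sigma: float, rng: random.Random) -> List[float]:
    """Client-side step: perturb every parameter with zero-mean Gaussian noise."""
    return [x + rng.gauss(0.0, sigma) for x in update]

def aggregate(updates: List[List[float]]) -> List[float]:
    """Server-side step: coordinate-wise average of the (noisy) client updates."""
    n = len(updates)
    return [sum(col) / n for col in zip(*updates)]

def mse(a: List[float], b: List[float]) -> float:
    """Mean squared error, e.g. between a reconstructed sample and the original."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
client_updates = [[0.2, -0.1, 0.4], [0.1, 0.0, 0.3], [0.3, -0.2, 0.5]]  # made-up values
noisy = [add_gaussian_noise(clip(u, max_norm=1.0), sigma=0.05, rng=rng)
         for u in client_updates]
global_update = aggregate(noisy)
```

A larger `sigma` hides individual updates better but moves `global_update` further from the noise-free average, which is the accuracy versus noise-scale balance the abstract refers to.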

  • 11.
    Mohammadi, Mohammadreza
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Shrestha, Rakesh
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Salcines, Alberto
    Tst, Spain.
    Pampliega, David
    Schneider Electric, Spain.
    Clemente, Raul
    Schneider Electric, Spain.
    Sanz, Ana Lourdes
    Schneider Electric, Spain.
    Anomaly Detection Using LSTM-Autoencoder in Smart Grid: A Federated Learning Approach. 2023. In: ACM International Conference Proceeding Series, Association for Computing Machinery, 2023, pp. 48-54. Conference paper (Refereed)
    Abstract [en]

    Anomaly detection is critical in industrial systems such as smart grid systems to guarantee their safe and effective operation. The smart grid stations contain sensitive data, and they are concerned about sharing it with a third-party server to establish a centralized anomaly detection system. Federated Learning (FL) is a feasible solution to these problems for enhancing anomaly detection in smart grid systems. This study describes a method for developing an unsupervised, FL-based anomaly detection system using a synthetic dataset based on real-world grid system behavior. The paper investigates the use of a long short-term memory autoencoder (LSTM-AE) trained with FL for anomaly detection. For more accurate identification, this research explores the performance of integrating LSTM-AE with one-class support vector machine (OC-SVM) and isolation forest (IF) and compares their results with a threshold-based anomaly detection approach. Moreover, an approach is described for generating synthetic anomalies with different levels of difficulty to evaluate the robustness of the anomaly detection FL model. The FL models’ results are compared with the centralized version of the models as a baseline, and the results show that the FL models outperformed the centralized approach, detecting more outlier data and achieving a 99% F1-Score.
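
The threshold-based baseline mentioned in this abstract can be sketched as a rule on autoencoder reconstruction errors. The rule used here (mean plus `k` standard deviations of the errors seen on normal data) is an assumption made for illustration; the abstract does not say how the threshold is chosen, and the error values are made up.

```python
from statistics import mean, stdev
from typing import List

def fit_threshold(normal_errors: List[float], k: float = 3.0) -> float:
    """Set the threshold k sample standard deviations above the mean error on normal data."""
    return mean(normal_errors) + k * stdev(normal_errors)

def flag_anomalies(errors: List[float], threshold: float) -> List[bool]:
    """A sample is flagged when the autoencoder reconstructs it worse than the threshold."""
    return [e > threshold for e in errors]

# Hypothetical reconstruction errors; in practice they would come from the trained LSTM-AE.
normal_errors = [0.10, 0.12, 0.09, 0.11, 0.10, 0.13]
threshold = fit_threshold(normal_errors)
flags = flag_anomalies([0.11, 0.50, 0.12], threshold)
```

Replacing `flag_anomalies` with a one-class SVM or an isolation forest fitted on the same errors or latent features would correspond to the combined variants the abstract compares against this baseline.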

  • 12.
    Mohammadi, Samaneh
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Balador, Ali
    Mälardalen University, Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Flammini, Francesco
    Mälardalen University, Sweden.
    Balancing privacy and performance in federated learning: A systematic literature review on methods and metrics. 2024. In: Journal of Parallel and Distributed Computing, ISSN 0743-7315, E-ISSN 1096-0848, Vol. 192, article id 104918. Journal article (Refereed)
    Abstract [en]

    Federated learning (FL), as a novel paradigm in Artificial Intelligence (AI), ensures enhanced privacy by eliminating data centralization and brings learning directly to the edge of the user’s device. Nevertheless, new privacy issues have been raised, particularly during training and the exchange of parameters between servers and clients. While several privacy-preserving FL solutions have been developed to mitigate potential breaches in FL architectures, their integration poses its own set of challenges. Incorporating these privacy-preserving mechanisms into FL at the edge computing level can increase both communication and computational overheads, which may, in turn, compromise data utility and learning performance metrics. This paper provides a systematic literature review on essential methods and metrics to support the most appropriate trade-offs between FL privacy and other performance-related application requirements such as accuracy, loss, convergence time, utility, communication, and computation overhead. We aim to provide an extensive overview of recent privacy-preserving mechanisms in FL used across various applications, placing a particular focus on quantitative privacy assessment approaches in FL and the necessity of achieving a balance between privacy and the other requirements of real-world FL applications. This review collects, classifies, and discusses relevant papers in a structured manner, emphasizing challenges, open issues, and promising research directions.

  • 13.
    Mohammadi, Samaneh
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Mohammadi, Mohammadreza
    RISE Research Institutes of Sweden, Digitala system, Industriella system. University of Padua, Italy.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Balador, Ali
    Mälardalen University, Sweden.
    Nowroozi, Ehsan
    Queen’s University Belfast, UK.
    Flammini, Francesco
    Mälardalen University, Sweden.
    Conti, Mauro
    University of Padua, Italy.
    Balancing Privacy and Accuracy in Federated Learning for Speech Emotion Recognition. 2023. In: ACSIS Annals of Computer Science and Information Systems, Vol. 35, pp. 191-199. Journal article (Refereed)
    Abstract [en]

    Speech Emotion Recognition (SER) is a valuable technology that identifies human emotions from spoken language, enabling the development of context-aware and personalized intelligent systems. To protect user privacy, Federated Learning (FL) has been introduced, enabling local training of models on user devices. However, FL raises concerns about the potential exposure of sensitive information from local model parameters, which is especially critical in applications like SER that involve personal voice data. Local Differential Privacy (LDP) has prevented privacy leaks in image and video data. However, it encounters notable accuracy degradation when applied to speech data, especially in the presence of high noise levels. In this paper, we propose an approach called LDP-FL with CSS, which combines LDP with a novel client selection strategy (CSS). By leveraging CSS, we aim to improve the representativeness of updates and mitigate the adverse effects of noise on SER accuracy while ensuring client privacy through LDP. Furthermore, we conducted model inversion attacks to evaluate the robustness of LDP-FL in preserving privacy. These attacks involved an adversary attempting to reconstruct individuals' voice samples using the output labels provided by the SER model. The evaluation results reveal that LDP-FL with CSS achieved an accuracy of 65-70%, which is 4% lower than the initial SER model accuracy. Furthermore, LDP-FL demonstrated exceptional resilience against model inversion attacks, outperforming the non-LDP method by a factor of 10. Overall, our analysis emphasizes the importance of achieving a balance between privacy and accuracy in accordance with the requirements of the SER application.

  • 14.
    Mohammadi, Samaneh
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Balador, Ali
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Flammini, Francesco
    Mälardalen University, Sweden.
    Optimized Paillier Homomorphic Encryption in Federated Learning for Speech Emotion Recognition. 2023. In: 2023 IEEE 47th Annual Computers, Software, and Applications Conference (COMPSAC), 2023, pp. 1021-1022. Conference paper (Refereed)
    Abstract [en]

    Federated Learning (FL) is an approach to distributed machine learning that enables collaborative model training on end devices. FL enhances privacy as devices only share local model parameters instead of raw data with a central server. However, the central server or eavesdroppers could extract sensitive information from these shared parameters. This issue is crucial in applications like speech emotion recognition (SER) that deal with personal voice data. To address this, we propose Optimized Paillier Homomorphic Encryption (OPHE) for SER applications in FL. Paillier homomorphic encryption enables computations on ciphertext, preserving privacy but with high computation and communication overhead. The proposed OPHE method can reduce this overhead by combining Paillier homomorphic encryption with pruning. We employ OPHE in one of the use cases of a large research project (DAIS) funded by the European Commission using a public SER dataset.
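
The property that makes Paillier encryption usable for federated aggregation is additive homomorphism: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The sketch below is a textbook toy version with tiny hard-coded primes and the common choice g = n + 1; it is not the OPHE method, omits the pruning step, and is far too small to be secure. The integers stand in for quantized model updates and are hypothetical.

```python
import math
import random

def keygen(p: int, q: int):
    """Build a toy Paillier key pair from two primes (generator fixed to g = n + 1)."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse; this shortcut is valid because g = n + 1
    return n, (lam, mu)

def encrypt(n: int, m: int, rng: random.Random) -> int:
    """c = (n + 1)^m * r^n mod n^2, with random r coprime to n."""
    n2 = n * n
    while True:
        r = rng.randrange(1, n)
        if math.gcd(r, n) == 1:
            break
    return (pow(n + 1, m, n2) * pow(r, n, n2)) % n2

def decrypt(n: int, priv, c: int) -> int:
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    lam, mu = priv
    x = pow(c, lam, n * n)
    return ((x - 1) // n * mu) % n

def add_encrypted(n: int, c1: int, c2: int) -> int:
    """Multiplying ciphertexts adds the underlying plaintexts modulo n."""
    return (c1 * c2) % (n * n)

rng = random.Random(0)
n, priv = keygen(293, 433)        # toy primes, for illustration only
c1 = encrypt(n, 17, rng)          # client 1's quantized update (made up)
c2 = encrypt(n, 25, rng)          # client 2's quantized update (made up)
total = decrypt(n, priv, add_encrypted(n, c1, c2))
```

In this setting a server could compute `add_encrypted` over all client ciphertexts without seeing any individual update. Every parameter needs its own modular exponentiation, which is the overhead the abstract proposes to cut by pruning before encryption.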

  • 15.
    Mohammadi, Samaneh
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system. Mälardalen University, Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Balador, Ali
    Mälardalen University, Sweden.
    Flammini, Francesco
    Mälardalen University, Sweden.
    Secure and Efficient Federated Learning by Combining Homomorphic Encryption and Gradient Pruning in Speech Emotion Recognition. 2023. In: ISPEC 2023: Information Security Practice and Experience: International Conference on Information Security Practice and Experience / [ed] Weizhi Meng, Zheng Yan & Vincenzo Piuri, Springer Nature Singapore, 2023, pp. 1-16. Chapter in book, part of anthology (Refereed)
    Abstract [en]

    Speech Emotion Recognition (SER) detects human emotions expressed in spoken language. SER is highly valuable in diverse fields; however, privacy concerns arise when analyzing speech data, as it reveals sensitive information like biometric identity. To address this, Federated Learning (FL) has been developed, allowing models to be trained locally and just sharing model parameters with servers. However, FL introduces new privacy concerns when transmitting local model parameters between clients and servers, as third parties could exploit these parameters and disclose sensitive information. In this paper, we introduce a novel approach called Secure and Efficient Federated Learning (SEFL) for SER applications. Our proposed method combines Paillier homomorphic encryption (PHE) with a novel gradient pruning technique. This approach enhances privacy and maintains confidentiality in FL setups for SER applications while minimizing communication and computation overhead and ensuring model accuracy. As far as we know, this is the first paper that implements PHE in FL setup for SER applications. Using a public SER dataset, we evaluated the SEFL method. Results show substantial efficiency gains with a key size of 1024, reducing computation time by up to 25% and communication traffic by up to 70%. Importantly, these improvements have minimal impact on accuracy, effectively meeting the requirements of SER applications.

  • 16.
    Nazari, N.
    et al.
    University of Tehran, Iran.
    Mirsalari, S. A.
    University of Tehran, Iran.
    Sinaei, Sima
    Mälardalen University, Sweden.
    Salehi, M. E.
    University of Tehran, Iran.
    Daneshtalab, M.
    Mälardalen University, Sweden.
    Multi-level Binarized LSTM in EEG Classification for Wearable Devices. 2020. In: Proceedings - 2020 28th Euromicro International Conference on Parallel, Distributed and Network-Based Processing, PDP 2020, pp. 175-181. Journal article (Refereed)
    Abstract [en]

    Long Short-Term Memory (LSTM) is widely used in various sequential applications. Complex LSTMs can hardly be deployed on wearable and resource-limited devices due to the huge amount of computations and memory requirements. Binary LSTMs are introduced to cope with this problem; however, they lead to significant accuracy loss in some applications such as EEG classification, which is essential to deploy on wearable devices. In this paper, we propose an efficient multi-level binarized LSTM which has significantly reduced computations while ensuring an accuracy very close to that of the full precision LSTM. By deploying 5-level binarized weights and inputs, our method reduces the area and delay of the MAC operation by about 31× and 27× in 65nm technology, respectively, with less than 0.01% accuracy loss. In contrast to many compute-intensive deep-learning approaches, the proposed algorithm is lightweight, and therefore brings performance efficiency with accurate LSTM-based EEG classification to real-time wearable devices.

  • 17.
    Shrestha, Rakesh
    et al.
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Mishra, Ashutosh
    Yonsei University, South Korea.
    Bajracharya, Rojeena
    Mälardalen University, Sweden.
    Sinaei, Sima
    RISE Research Institutes of Sweden, Digitala system, Industriella system.
    Kim, Shiho
    Yonsei University, South Korea.
    6G Network for Connecting CPS and Industrial IoT (IIoT): Chapter 2. 2023. In: Cyber-Physical Systems for Industrial Transformation / [ed] Gunasekaran Manogaran, Nour Eldeen Mahmoud Khalifa, Mohamed Loey, Mohamed Hamed N. Taha, CRC Press, 2023. Chapter in book, part of anthology (Other academic)
    Abstract [en]

    The IoT comprises billions of intelligent devices that interact, gather, and share data via sensors and actuators. The Industrial IoT (IIoT), specifically used in industry and production, is used in automation and rapid production of goods based on machine learning techniques. Similarly, Cyber-Physical System (CPS) plays a vital role in the next-generation industry. The CPSs are intelligent systems that interconnect the physical world through embedded systems, sensors, actuators with the cyberworld. We require a communication backbone for interconnecting and information processing, which 6G networks can fulfill. The 6G has a higher capacity and improved characteristics than previous cellular networks, accelerating the applications and deployments of 6G-based IIoT networks in industry platforms. This chapter discusses how the 6G networks can help interconnect the CPS and IIoT through smart connection, digital twinning, and immersive technology.

  • 18.
    Sinaei, Sima
    et al.
    Mälardalen University, Sweden.
    Daneshtalab, M.
    Mälardalen University, Sweden.
    Hardware acceleration for recurrent neural networks. 2020. Book (Other academic)
    Abstract [en]

    This chapter focuses on the LSTM model and is concerned with the design of a high-performance and energy-efficient solution to implement deep learning inference. The chapter is organized as follows: Section 2.1 introduces Recurrent Neural Networks (RNNs). In this section, Long Short Term Memory (LSTM) and Gated Recurrent Unit (GRU) network models are discussed as special kinds of RNNs. Section 2.2 discusses inference acceleration with hardware. In Section 2.3, a survey of various FPGA designs is presented in the context of the results of previous related works, after which Section 2.4 concludes the chapter.

  • 19.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    An MOPSO method for mapping multimedia applications onto MP-SoC architectures. 2011. In: Canadian Conference on Electrical and Computer Engineering (CCECE), ISSN 0840-7789, pp. 001361-001364. Journal article (Refereed)
    Abstract [en]

    System level modeling and design space exploration has an important role in Multi processor embedded system on chip design. Y-chart modeling is a well-known method for solving design space exploration problem. One of the most important stages in Y-chart approach is mapping an application onto architecture. In this paper, MOPSO algorithm has been proposed to obtain optimized solutions for mapping. The proposed method is tested using the MJPEG application as a case study in terms of accuracy and efficiency. Simulation results show that proposed algorithm will provide the designer with accurate solutions with a considerable reduction in design time. Finally a number of multi objective optimization results are simulated and verified by the Sesame framework.

  • 20.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Multi-objective algorithms for the application mapping problem in heterogeneous multiprocessor embedded system design. 2019. In: Journal of Supercomputing, ISSN 0920-8542, E-ISSN 1573-0484, Vol. 75, no. 8, pp. 4150-4176. Journal article (Refereed)
    Abstract [en]

    Design at the Electronic System-Level tackles the increasing complexity of embedded systems by raising the level of abstraction in system specification and modeling. Two important steps in this process are evaluation of a single design configuration and design space exploration. The exponential size of the design space, along with the complex task of simulating a single design point, makes it impossible to explore the design space efficiently in almost all MPSoC design situations. In order to overcome this problem, one or both of the main steps of the design process (i.e., simulation and exploration) must be accelerated. In this paper, for the first part of the design process, high-level analytical models for application mapping and evaluation are presented in order to accelerate the evaluation of a single design configuration. In the second part of the design process, two multi-objective optimization algorithms that are based on particle swarm optimization and simulated annealing have been proposed for performing design space exploration. Considering multimedia applications as case studies, each of these methods produces a set of near-optimal points. Simulation results show that the proposed methods can lead to near-optimal design configurations with acceptable accuracy in a reasonable time. 

  • 21.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Novel Heuristic Mapping Algorithms for Design Space Exploration of Multiprocessor Embedded Architectures. 2016. In: Proceedings - 24th Euromicro International Conference on Parallel, Distributed, and Network-Based Processing, PDP 2016, pp. 801-804. Journal article (Refereed)
    Abstract [en]

    Electronic system-level design plays an important role in multiprocessor embedded system-on-chip design. Two important steps in this process are the evaluation of a single design configuration and design space exploration. In the first part of the design process, simple high-level analytical models for application mapping and evaluation are used and modified in order to accelerate the evaluation of a single design configuration. Using the analytical model, the design space is pruned and explored at high speed with low accuracy. In the second part of the design process, two multi-objective optimization algorithms based on Particle Swarm Optimization and Simulated Annealing are proposed to explore the pruned design space with higher accuracy, taking advantage of low-level architectural simulation engines. The results obtained by the proposed algorithms provide the designer with more accurate solutions within an acceptable time. With the MJPEG application as the case study, each of these methods produces a set of near-optimal points. Simulation results show that the proposed methods can lead to near-optimal design configurations with acceptable accuracy in a reasonable time. © 2016 IEEE.

  • 22.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Run-time mapping algorithm for dynamic workloads on heterogeneous MPSoCs platforms. 2018. In: Proceedings - 21st Euromicro Conference on Digital System Design, DSD 2018, pp. 373-380. Journal article (Refereed)
    Abstract [en]

    Task mapping exploration plays an important role in the high performance achieved by heterogeneous multi-processor system-on-chip (MPSoC) platforms. The dynamism of application workloads in modern MPSoC-based embedded systems is steadily growing. Different applications now execute concurrently and compete for resources in such systems. This paper presents a novel run-time mapping algorithm for multimedia applications. The objective of application mapping is to minimize execution time within a predefined energy consumption budget. The algorithm is divided into two phases: design-time and run-time. During design-time, application clustering is combined with design space exploration, and a set of mapping rules is then extracted using Association Rule Mining techniques. During run-time, feature extraction and application classification are performed based on the rule sets. The proposed algorithm is evaluated on a heterogeneous MPSoC system with several applications that have different communication and computation behaviors. The experimental results show that, during run-time, the proposed algorithm classified applications correctly and accurately selected the best resources for mapping. The results demonstrate the effectiveness of the proposed algorithm.

  • 23.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Run-time mapping algorithm for dynamic workloads using association rule mining. 2018. In: Journal of systems architecture, ISSN 1383-7621, E-ISSN 1873-6165, Vol. 91. Journal article (Refereed)
    Abstract [en]

    Task mapping exploration plays an important role in the high performance achieved by heterogeneous multi-processor system-on-chip (MPSoC) platforms. The dynamism of application workloads in modern MPSoC-based embedded systems is steadily growing. Different applications now execute concurrently and compete for resources in such systems. To cope with the dynamism of application workloads at run-time and improve the efficiency of the underlying system architecture, this paper presents a hybrid task mapping algorithm for multimedia applications that consists of two phases: design-time and run-time. During design-time, static mapping exploration is performed, the applications are clustered based on their efficient mappings, and a set of mapping rules is then extracted using Association Rule Mining techniques. During run-time, when a new application enters the system, it is classified into one of the existing clusters using the rule sets extracted in the design-time phase. The objective of application mapping is to minimize execution time within a predefined energy consumption budget. A heterogeneous MPSoC system is used to evaluate the proposed algorithm. The experimental results show that, during run-time, the proposed algorithm selects resources for mapping that are suitable with regard to energy consumption and execution time.

  • 24.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Tree-based algorithm for design space exploration and mapping application onto heterogeneous platforms. 2018. In: 2017 19th International Symposium on Computer Architecture and Digital Systems, CADS 2017, Vol. January. Journal article (Refereed)
    Abstract [en]

    Mapping application tasks onto a given set of heterogeneous processors is known as one of the most significant problems in the system-level design of embedded systems. The huge number of mapping configurations, together with the complexity of evaluating a mapping, makes finding optimal solutions very time-consuming. This paper proposes a novel tree-based exploration algorithm to solve the mapping problem. The algorithm prunes the design space so that it can be explored in less time while still including desirable points. The proposed algorithm performs exploration in multiple levels and, in each level, uses a Genetic Algorithm to search among mapping configurations. Simulation results show that multi-level exploration finds near-optimal mappings efficiently, with more than 91% accuracy in less time.

  • 25.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Fatemi, O.
    University of Tehran, Iran.
    Pimentel, A. D.
    University of Amsterdam, Netherlands.
    Run-time mapping algorithm for dynamic workloads using process merging transformations. 2018. In: Proceedings - 2017 17th International Conference on Embedded Computer Systems: Architectures, Modeling, and Simulation, SAMOS 2017, Vol. January, pp. 188-195. Journal article (Refereed)
    Abstract [en]

    Exploration of task mappings plays an important role in achieving high performance on heterogeneous multi-processor system-on-chip (MPSoC) platforms. The application workloads in modern MPSoC-based embedded systems are becoming increasingly dynamic. Different applications execute concurrently and contend for resources in such systems. In this paper, a run-time algorithm is proposed to analytically evaluate the system throughput of to-be-executed applications (modelled as Kahn Process Networks, KPNs) in order to quickly determine a proper resource binding for these applications. Merging transformations are applied to the KPNs to capture the cases in which the number of processes in the KPN is larger than the number of available processing resources, thereby modeling the effects of binding multiple processes to a single processor. We evaluated our algorithm using a heterogeneous MPSoC system with several applications. Our experimental results show that, during run-time, the performance of the selected mapping with regard to available resources is close to the optimal performance obtained by exhaustive search and simulation. The results therefore confirm that our algorithm is effective.

  • 26.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Pimentel, A. D.
    University of Amsterdam, Netherlands.
    Fatemi, O.
    University of Tehran, Iran.
    Run-time resource allocation for embedded Multiprocessor System-on-Chip using tree-based design space exploration. 2017. In: Proceedings - 2017 12th IEEE International Conference on Design and Technology of Integrated Systems in Nanoscale Era, DTIS 2017. Journal article (Refereed)
    Abstract [en]

    The dynamic nature of application workloads in modern MPSoC-based embedded systems is growing. To cope with the dynamism of application workloads at run-time and to improve the efficiency of the underlying system architecture, this paper presents a novel run-time resource allocation algorithm for multimedia applications with the objective of minimizing energy consumption for predefined deadlines. This algorithm is based on a novel tree-based design space exploration (DSE) method, which is performed in two phases: design-time and run-time. During design-time, application clustering is combined with the tree-based DSE. During run-time, feature extraction and application classification are performed based on well-known machine learning techniques. We evaluated our algorithm using a heterogeneous MPSoC system with several applications that have different communication and computation behaviors. Our experimental results show that, during run-time, more than 91% of the applications were classified correctly by our proposed algorithm to select the best resources for allocation. The results therefore confirm that our algorithm is effective.

  • 27.
    Sinaei, Sima
    et al.
    University of Tehran, Iran.
    Rad, R. S.
    Sirjan University of Technology, Iran.
    Ghodsvali, E.
    Amir Kabir University, Iran.
    Ada in real-time embedded system. 2013. In: Research Journal of Applied Sciences, Engineering and Technology, ISSN 2040-7459, E-ISSN 2040-7467, Vol. 5, no. 14, pp. 3803-3809. Journal article (Refereed)
    Abstract [en]

    Ada plays an important role in the real-time, embedded, and safety-critical areas. It is the only ISO-standard, object-oriented, concurrent, real-time programming language. Ada is commonly used in application areas such as defense embedded systems, where reliability and efficiency are essential. One of Ada's main characteristics compared with other programming languages is that it was developed from the ground up with capabilities that meet real-time requirements. This study shows why Ada is used as the new standard for real-time programming languages, and explains the basic characteristics of real-time programming systems in general and how they are addressed in Ada.
