Building open-domain conversational systems (or chatbots) that produce convincing responses is a recognized challenge. Recent state-of-the-art (SoTA) transformer-based models for the generation of natural language dialogue have demonstrated impressive performance in simulating human-like, single-turn conversations in English. This work investigates, through an empirical study, the potential for transfer learning of such models to the Swedish language. DialoGPT, an English-language pre-trained model, is adapted by training on three different Swedish conversational datasets obtained from publicly available sources: Reddit, Familjeliv and the GDC. Perplexity (an automated intrinsic metric) and human-evaluation surveys were used to assess the performance of the fine-tuned models. We also compare the DialoGPT experiments with an attention-based seq2seq baseline model trained on the GDC dataset. The results indicate that the capacity for transfer learning can be exploited with considerable success. Human evaluators asked to score the simulated dialogues judged over 57% of the chatbot responses to be human-like for the model trained on the largest (Swedish) dataset. The results support the hypothesis that deep monolingual models learn some abstractions which generalize across languages. We contribute the code, datasets and model checkpoints and host the demos on the HuggingFace platform.
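As a rough illustration of the kind of adaptation and intrinsic evaluation described above, the sketch below fine-tunes a public DialoGPT checkpoint on Swedish dialogue text with the HuggingFace transformers library and reports held-out perplexity. The file names, the dialogue format (turns separated by the EOS token), and all hyperparameters are illustrative assumptions, not the paper's actual setup.

```python
# Minimal sketch: adapt an English DialoGPT checkpoint to Swedish dialogue data
# and evaluate perplexity on a held-out set. Paths, data format and
# hyperparameters are hypothetical placeholders.
import math
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tok = AutoTokenizer.from_pretrained("microsoft/DialoGPT-medium")
model = AutoModelForCausalLM.from_pretrained("microsoft/DialoGPT-medium")

def encode(lines, max_len=128):
    # One dialogue per line, turns separated by the EOS token.
    return [tok(l.strip(), truncation=True, max_length=max_len,
                return_tensors="pt")["input_ids"][0] for l in lines]

train_ids = encode(open("swedish_dialogues_train.txt", encoding="utf-8"))
valid_ids = encode(open("swedish_dialogues_valid.txt", encoding="utf-8"))

def batches(ids, bs=8):
    for i in range(0, len(ids), bs):
        # pad with EOS for simplicity (padding positions also contribute to the loss here)
        yield torch.nn.utils.rnn.pad_sequence(ids[i:i + bs], batch_first=True,
                                              padding_value=tok.eos_token_id)

opt = torch.optim.AdamW(model.parameters(), lr=5e-5)
model.train()
for epoch in range(3):
    for x in batches(train_ids):
        loss = model(input_ids=x, labels=x).loss   # causal LM loss on the dialogue
        loss.backward()
        opt.step()
        opt.zero_grad()

# Perplexity = exp(mean negative log-likelihood) on held-out dialogues.
model.eval()
losses = []
with torch.no_grad():
    for x in batches(valid_ids):
        losses.append(model(input_ids=x, labels=x).loss.item())
print("perplexity:", math.exp(sum(losses) / len(losses)))
```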
We present the design and evaluation of a 3.5-year embedded sensing deployment at the Mithræum of Circus Maximus, a UNESCO-protected underground archaeological site in Rome (Italy). Unique to our work is the use of energy harvesting through thermal and kinetic energy sources. The extreme scarcity and erratic availability of energy, however, pose great challenges in system software, embedded hardware, and energy management. We tackle them by testing, for the first time in a multi-year deployment, existing solutions in intermittent computing, low-power hardware, and energy harvesting. Through three major design iterations, we find that these solutions operate as isolated silos and lack integration into a complete system, performing suboptimally. In contrast, we demonstrate the efficient performance of a hardware/software co-design featuring accurate energy management and capturing the coupling between energy sources and sensed quantities. Installing a battery-operated system alongside also allows us to perform a comparative study of energy harvesting in a demanding setting. Although energy harvesting reduces energy availability and thus lowers the data yield to about 22% of that provided by batteries, our system provides a comparable level of insight into environmental conditions and structural health of the site. Further, unlike existing energy-harvesting deployments that are limited to a few months of operation in the best cases, our system has run with zero maintenance for almost two years, including 3 months of site inaccessibility due to a COVID-19 lockdown.
Intermittently powered embedded devices ensure forward progress of programs through state checkpointing in non-volatile memory. Checkpointing is, however, expensive in energy and adds to the execution times. To minimize this overhead, we present DICE, a system that renders differential checkpointing profitable on these devices. DICE is unique because it is a software-only technique and efficient because it only operates in volatile main memory to evaluate the differential. DICE may be integrated with reactive (Hibernus) or proactive (MementOS, HarvOS) checkpointing systems, and arbitrary code can be enabled with DICE using automatic code-instrumentation requiring no additional programmer effort. By reducing the cost of checkpoints, DICE cuts the peak energy demand of these devices, allowing operation with energy buffers that are one-eighth of the size originally required, thus leading to benefits such as smaller device footprints and faster recharging to operational voltage level. The impact on final performance is striking: with DICE, Hibernus requires one order of magnitude fewer checkpoints and one order of magnitude shorter time to complete a workload in real-world settings.
Transiently powered computers (TPCs) form the foundation of the battery-less Internet of Things, using energy harvesting and small capacitors to power their operation. This kind of power supply is characterized by extreme variations in supply voltage, as capacitors charge when harvesting energy and discharge when computing. We experimentally find that these variations cause marked fluctuations in clock speed and power consumption. Such a deceptively minor observation is overlooked in existing literature. Systems are thus designed and parameterized in overly conservative ways, missing out on a number of optimizations. We demonstrate instead that it is possible to accurately model and concretely capitalize on these fluctuations. We derive an energy model as a function of supply voltage and demonstrate its use in two settings. First, we develop EPIC, a compile-time energy analysis tool. We use it to substitute for the constant-power assumption in existing analysis techniques, giving programmers accurate information on the worst-case energy consumption of programs. When using EPIC with existing TPC system support, run-time energy efficiency drastically improves, eventually leading up to a 350% speedup in the time to complete a fixed workload. Further, when using EPIC with existing debugging tools, it avoids unnecessary program changes that hurt energy efficiency. Next, we extend the MSPsim emulator and explore its use in parameterizing a different TPC system support. The improvements in energy efficiency yield a speedup of more than 1000% in the time to complete a fixed workload.
The Concordance Index (C-index) is a commonly used metric in Survival Analysis for evaluating the performance of a prediction model. In this paper, we propose a decomposition of the C-index into a weighted harmonic mean of two quantities: one for ranking observed events versus other observed events, and the other for ranking observed events versus censored cases. This decomposition enables a finer-grained analysis of the relative strengths and weaknesses between different survival prediction methods. The usefulness of this decomposition is demonstrated through benchmark comparisons against classical models and state-of-the-art methods, together with the new variational generative neural-network-based method (SurVED) proposed in this paper. The performance of the models is assessed using four publicly available datasets with varying levels of censoring. Using the C-index decomposition and synthetic censoring, the analysis shows that deep learning models utilize the observed events more effectively than other models. This allows them to keep a stable C-index across different censoring levels. In contrast to such deep learning methods, classical machine learning models deteriorate when the censoring level decreases, owing to their inability to improve on ranking observed events versus other events.
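To make the decomposition concrete, the sketch below computes the two pairwise components (events vs. other events, events vs. censored cases) and recombines them as a harmonic mean weighted by concordant-pair counts, which reproduces the overall C-index. The exact weighting used in the paper may differ; treat this as an illustration of the idea, with toy data, rather than the authors' definition.

```python
# Sketch of a C-index decomposition into event-vs-event and event-vs-censored
# components, recombined as a weighted harmonic mean. Toy data only.
import numpy as np

def c_index_parts(time, event, risk):
    """`risk` is a predicted risk score (higher risk = expected earlier event)."""
    ee_conc = ee_comp = ec_conc = ec_comp = 0.0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue                      # comparable pairs are anchored at an observed event
        for j in range(n):
            if i == j or time[j] <= time[i]:
                continue                  # j must outlive i for the pair to be comparable
            conc = 1.0 if risk[i] > risk[j] else (0.5 if risk[i] == risk[j] else 0.0)
            if event[j]:
                ee_comp += 1; ee_conc += conc
            else:
                ec_comp += 1; ec_conc += conc
    c_ee = ee_conc / ee_comp              # ranking events vs. other events
    c_ec = ec_conc / ec_comp              # ranking events vs. censored cases
    # Harmonic mean weighted by concordant-pair counts equals the overall C-index.
    c_all = (ee_conc + ec_conc) / (ee_conc / c_ee + ec_conc / c_ec)
    return c_ee, c_ec, c_all

time  = np.array([2.0, 3.0, 4.0, 5.0, 6.0, 7.0])
event = np.array([1, 0, 1, 1, 0, 0])      # 1 = observed event, 0 = censored
risk  = np.array([0.9, 0.4, 0.7, 0.5, 0.3, 0.2])
c_ee, c_ec, c_all = c_index_parts(time, event, risk)
print(f"C_ee={c_ee:.3f}  C_ec={c_ec:.3f}  weighted harmonic mean={c_all:.3f}")
```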
Word sense disambiguation (WSD) is a core task in computational linguistics that involves interpreting polysemous words in context by identifying senses from a predefined sense inventory. Despite the dominance of BERT and its derivatives in WSD evaluation benchmarks, their effectiveness in encoding and retrieving word senses, especially in languages other than English, remains relatively unexplored. This paper provides a detailed quantitative analysis, comparing various BERT-based models for Russian, and examines two primary WSD strategies: fine-tuning and feature-based nearest-neighbor classification. The best results are obtained with the ruBERT model coupled with the feature-based nearest neighbor strategy. This approach adeptly captures even fine-grained meanings with limited data and diverse sense distributions.
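A compact sketch of the feature-based nearest-neighbour strategy described above: embed each occurrence of an ambiguous word with a BERT encoder, average labelled examples into sense centroids, and assign new occurrences by cosine similarity. The model identifier (taken here to correspond to ruBERT), the toy Russian sentences and the two-sense inventory are illustrative assumptions.

```python
# Feature-based nearest-neighbour WSD sketch with contextual BERT embeddings.
import torch
from transformers import AutoModel, AutoTokenizer

name = "DeepPavlov/rubert-base-cased"     # assumed ruBERT checkpoint
tok = AutoTokenizer.from_pretrained(name)
enc = AutoModel.from_pretrained(name)

def word_vector(sentence, word):
    """Mean of the hidden states of the sub-tokens that make up `word`."""
    ids = tok(sentence, return_tensors="pt")
    with torch.no_grad():
        hidden = enc(**ids).last_hidden_state[0]              # (seq_len, dim)
    pieces = tok(word, add_special_tokens=False)["input_ids"]
    toks = ids["input_ids"][0].tolist()
    for i in range(len(toks) - len(pieces) + 1):               # locate the word's pieces
        if toks[i:i + len(pieces)] == pieces:
            return hidden[i:i + len(pieces)].mean(dim=0)
    return hidden.mean(dim=0)                                  # fallback: sentence mean

# Toy sense inventory for the polysemous word "лук" (onion vs. bow).
labelled = {
    "onion": ["Я нарезал лук для супа.", "На сковороде жарится лук."],
    "bow":   ["Он натянул лук и выпустил стрелу.", "Охотник взял лук и стрелы."],
}
centroids = {s: torch.stack([word_vector(x, "лук") for x in xs]).mean(dim=0)
             for s, xs in labelled.items()}

query = word_vector("Для этого блюда нужен свежий лук.", "лук")
scores = {s: torch.cosine_similarity(query, c, dim=0).item() for s, c in centroids.items()}
print(max(scores, key=scores.get))        # expected: "onion"
```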
Recent advances in Deep Learning have led to a significant performance increase on several NLP tasks; however, the models are becoming more and more computationally demanding. Therefore, this paper tackles the domain of computationally efficient algorithms for NLP tasks. In particular, it investigates distributed representations of n-gram statistics of texts. The representations are formed using hyperdimensional computing-enabled embedding. These representations then serve as features, which are used as input to standard classifiers. We investigate the applicability of the embedding on one large and three small standard datasets for classification tasks using nine classifiers. The embedding achieved on-par F1 scores while decreasing the time and memory requirements severalfold compared to conventional n-gram statistics; e.g., for one of the classifiers on a small dataset, the memory reduction was 6.18 times, while train and test speed-ups were 4.62 and 3.84 times, respectively. For many classifiers on the large dataset, memory reduction was ca. 100 times and train and test speed-ups were over 100 times. Importantly, the usage of distributed representations formed via hyperdimensional computing decouples the dimensionality of the representation from the n-gram size, thus opening room for trade-offs.
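The following sketch shows the general mechanism behind such embeddings: random bipolar hypervectors for characters, positions within an n-gram encoded by cyclic shifts, n-grams bound by elementwise multiplication, and all n-grams of a text bundled by summation into one fixed-size feature vector. The dimensionality, alphabet and n-gram size are illustrative choices, not the paper's exact settings.

```python
# Hyperdimensional-computing sketch of distributed n-gram statistics.
import numpy as np

D, N = 10_000, 3                           # hypervector dimension, n-gram size (assumed)
rng = np.random.default_rng(0)
alphabet = "abcdefghijklmnopqrstuvwxyz "
item_memory = {c: rng.choice([-1, 1], size=D) for c in alphabet}

def text_hv(text):
    text = "".join(c for c in text.lower() if c in item_memory)
    acc = np.zeros(D)
    for i in range(len(text) - N + 1):
        ngram = np.ones(D)
        for pos, ch in enumerate(text[i:i + N]):
            # bind each character to its position via a cyclic shift (permutation)
            ngram *= np.roll(item_memory[ch], pos)
        acc += ngram                        # bundle all n-grams of the text
    return acc                              # fixed-size feature vector for any classifier

a = text_hv("the quick brown fox jumps over the lazy dog")
b = text_hv("the quick brown fox leaped over a lazy dog")
c = text_hv("completely unrelated text about hyperdimensional computing")
cos = lambda x, y: x @ y / (np.linalg.norm(x) * np.linalg.norm(y))
print(round(cos(a, b), 2), round(cos(a, c), 2))   # similar texts score higher
```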
Body maps are visual documents where somatic experiences can be drawn onto a graphical representation of an outline of the human body. They can capture complex and non-explicit emotions and somatic felt sensations, elaborating narratives that cannot simply be spoken. We present an illustrative example of "how-to" complete a body map, together with four case studies that provide examples of using body maps in design research. We identify five uses of body maps as generative tools for soma-based design: sampling bodily experience, heightening bodily self-awareness, understanding changing bodily experience over time, identifying patterns of bodily experience, and transferring somatic experiential qualities into physical designs. The different requirements for scaffolding the use of body maps in user-centred design versus first-person autobiographical design research are discussed. We provide this Pictorial as a resource for designers and researchers who wish to integrate body maps into their practice.
Body area networks (BANs), cloud computing, and machine learning are platforms that can potentially enable advanced healthcare outside the hospital. By applying distributed sensors and drug delivery devices on/in our body and connecting to such communication and decision-making technology, a system for remote diagnostics and therapy is achieved with additional autoregulation capabilities. Challenges with such autarchic on-body healthcare schemes relate to integrity and safety, as well as to the interfacing and transduction of electronic signals into biochemical signals, and vice versa. Here, we report a BAN comprising flexible on-body organic bioelectronic sensors and actuators that utilizes two parallel pathways for communication and decision-making. Data recorded from strain sensors detecting body motion are both securely transferred to the cloud for machine learning and improved decision-making, and sent through the body using a secure body-coupled communication protocol to auto-actuate delivery of neurotransmitters, all within seconds. We conclude that highly stable and accurate sensing—from multiple sensors—is needed to enable robust decision-making and limit the frequency of retraining. The holistic platform resembles the self-regulatory properties of the nervous system, i.e., the ability to sense, communicate, decide, and react accordingly, thus operating as a digital nervous system.
Simulation of the real world is a widely researched topic in various fields. The automotive industry in particular is very dependent on real-world simulations, since these simulations are needed in order to prove the safety of advanced driver assistance systems (ADAS) and autonomous driving (AD). In this paper we propose a deep learning based model for simulating the outputs from production sensors used in autonomous vehicles. We introduce an improved Recurrent Conditional Generative Adversarial Network (RC-GAN) consisting of Recurrent Neural Networks (RNNs) that use Long Short-Term Memory (LSTM) in both the generator and the discriminator networks in order to generate production sensor errors that exhibit long-term temporal correlations. The network is trained in a sequence-to-sequence fashion where we condition the output from the model on sequences describing the surrounding environment. This enables the model to capture spatial and temporal dependencies, and the model is used to generate synthetic time series describing the errors in a production sensor, which can be used for more realistic simulations. The model is trained on a data set collected from real roads with various traffic settings, and yields significantly better results as compared to previous works.
The report provides a knowledge base on the digital transformation in the water industry, its vision and potential. Key success factors are pointed out, and challenges with workforce competence, data management and cybersecurity are outlined. A catalogue with ten examples of successful digital applications is provided for inspiration.
We present the experimental evaluation of different security mechanisms applied to persistent state in intermittent computing. Whenever executions become intermittent because of energy scarcity, systems employ persistent state on non-volatile memories (NVMs) to ensure forward progress of applications. Persistent state spans the operating system and network stack, as well as applications. While a device is off recharging energy buffers, persistent state on NVMs may be subject to security threats such as stealing sensitive information or tampering with configuration data, which may ultimately corrupt the device state and render the system unusable. Based on modern platforms of the Cortex-M* series, we experimentally investigate the impact on typical intermittent computing workloads of different means to protect persistent state, including software and hardware implementations of staple encryption algorithms and the use of ARM TrustZone protection mechanisms. Our results indicate that i) software implementations bear a significant overhead in energy and time, sometimes harming forward progress, but also retain the advantage of modularity and easier updates; ii) hardware implementations offer much lower overhead than their software counterparts, but require a deeper understanding of their internals to gauge their applicability in given application scenarios; and iii) TrustZone shows almost negligible overhead, yet it requires different memory management and is only effective as long as attackers cannot directly access the NVMs.
This paper focuses on providing a secure and trustworthy solution for virtual machine (VM) migration within an existing cloud provider domain and/or to other federated cloud providers. We mainly address the infrastructure-as-a-service (IaaS) cloud service model, extending and complementing previous Trusted Computing techniques for secure VM launch to the VM migration case. The VM migration solution proposed in this paper uses a Trust_Token-based approach to guarantee that user VMs can only be migrated to, and hosted on, trustworthy and/or compliant cloud platforms. The possibility to also check the compliance of cloud platforms against pre-defined baseline configurations makes our solution compatible with existing, widely accepted, standards-based, security-focused cloud frameworks such as FedRAMP. Our proposed solution can be used for both inter- and intra-cloud VM migrations. Different from previous schemes, our solution does not depend on an active (online) trusted third party; that is, the trusted third party only performs the platform certification and is not involved in the actual VM migration process. We use the Tamarin prover to perform a formal security analysis of the proposed migration protocol and show that our protocol is safe under the Dolev-Yao intruder model. Finally, we show how our proposed mechanisms fulfill major security and trust requirements for secure VM migration in cloud environments.
Meeting the security and privacy needs for IoT data is equally important in the newly introduced intermediary fog computing layer as it was in its predecessor, the cloud, but accomplishing such security is critical and challenging. While security assurance of the fog-layer devices is imperative due to their exposure to the public Internet, it becomes even more complex than in the cloud layer, as it involves a large number of heterogeneous devices deployed hierarchically. Manual audit and certification schemes are unsuitable for a large number of fog nodes, which prevents the involved stakeholders from using manual security assurance schemes altogether. However, scalable and feasible security assurance can be provided by introducing automated and continuous monitoring and auditing of fog nodes to ensure a trusted, updated and vulnerability-free fog layer. This paper presents such a solution in the form of an automated Fog Node Audit and Certification scheme (FoNAC), which guarantees a secure fog layer through the proposed fog-layer assurance mechanism. FoNAC leverages Trusted Platform Module (TPM 2.0) capabilities to evaluate/audit the platform integrity of the operating fog nodes and grants a certificate to each node after a successful security audit. FoNAC security is also validated through a formal security analysis performed using AVISPA under the Dolev-Yao intruder model. The security analysis of FoNAC shows its resistance against cyber-attacks such as impersonation, replay, forgery, Denial of Service (DoS) and man-in-the-middle (MITM) attacks.
Introduction: Individuals recovering from COVID-19 often experience a range of post-recovery symptoms. However, the literature on post-COVID-19 symptoms reveals conflicting results, necessitating a heightened focus on longitudinal studies to comprehend the trajectory of impairments over time. Our study aimed to investigate changes in long-term impairments among individuals infected with COVID-19 and explore potential predictors influencing these changes. Methods: We conducted a web survey targeting individuals who had been infected with COVID-19 at four time-points: T0 (baseline), T1 (three months), T2 (six months), and T3 (twelve months). The survey included contextual factors, factors related to body functions and structures, and post-COVID impairments. The longitudinal sample included 213 individuals (with a mean age of 48.92 years). Linear mixed models were employed to analyze changes in post-COVID impairments over time and identify impacting factors. Results: Findings revealed a general decline in post-COVID impairments over time, with each symptom exhibiting a dynamic pattern of fluctuations. Factors such as initial infection severity, education level, and work status were significantly associated with the levels of impairments. Discussion: The study emphasizes that post-COVID impairments are not static but exhibit variations over time. Personalized care, especially for vulnerable populations, is crucial. The results underscore the need for long-term monitoring and multidisciplinary treatment approaches. Targeted support and interventions are highlighted for individuals with severe initial infections and those in socioeconomically disadvantaged groups.
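For readers unfamiliar with the analysis, the snippet below shows the general shape of a linear mixed model for repeated measures of this kind: a random intercept per participant and fixed effects for time and the predictors mentioned above. The column names and input file are hypothetical placeholders, not the study's data or exact model specification.

```python
# Sketch of a linear mixed model for longitudinal post-COVID impairments,
# with a random intercept per participant. Column names are hypothetical.
import pandas as pd
import statsmodels.formula.api as smf

# long format: one row per participant and time-point (T0, T1, T2, T3)
df = pd.read_csv("post_covid_long.csv")

model = smf.mixedlm(
    "impairment_score ~ time + initial_severity + education + work_status",
    data=df,
    groups=df["participant_id"],       # random intercept for each participant
)
result = model.fit(reml=True)
print(result.summary())                # fixed effects: change over time and predictors
```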
Background: The COVID-19 pandemic has triggered a global mental health crisis. Yet, we know little about the lasting effects of COVID-19 infection on mental health. This prospective longitudinal study aimed to investigate the trajectories of mental health changes in individuals infected with COVID-19 and to identify potential predictors that may influence these changes. Methods: A web survey targeting individuals who had been infected with COVID-19 was used at three time-points: T0 (baseline), T1 (six months), and T2 (twelve months). The survey included demographics, questions related to COVID-19 status, previous psychiatric diagnosis, post-COVID impairments, fatigue, and standardized measures of depression, anxiety, and insomnia. Linear mixed models were used to examine changes in depression, anxiety, and insomnia over time and to identify factors that impacted trajectories of mental health outcomes. Results: A total of 236 individuals completed assessments and were included in the longitudinal sample. The participants' ages ranged between 19 and 81 years (M = 48.71, SD = 10.74). The results revealed notable changes in mental health outcomes over time. The trajectory of depression showed significant improvement over time, while the trends in anxiety and insomnia did not exhibit significant changes. Younger participants and individuals who experienced severe COVID-19 infection in the acute phase were identified as high-risk groups with the worst mental ill-health. The main predictors of the changes in mental health outcomes were fatigue and post-COVID impairments. Conclusions: The findings of our study suggest that mental health outcomes following COVID-19 infection exhibit a dynamic pattern over time. The study provides valuable insights into the mental health trajectory following COVID-19 infection, emphasizing the need for ongoing assessment, support, and interventions tailored to the evolving mental health needs of this population.
Through a soma design process, we explored how to design a shape-changing car seat as a point of interaction between the car and the driver. We developed a low-fidelity prototyping tool to support this design work and describe our experiences of using this tool in a workshop with a car manufacturer. We share the co-designed patterns that we developed: re-engaging in driving; dis-engaging from driving; saying farewell; and being held while turning. Our analysis contributes design knowledge on how we should design for a car seat to ‘touch’ larger, potentially heavier parts of the body including the back, shoulders, hips, and bottom. The non-habitual experience of shape-changing elements in the driver seat helped pinpoint the link between somatic experience and intelligent rational behaviour in driving tasks. Relevant meaning-making processes arose when the two were aligned, improving on the holistic coming together of driver, car, and the road travelled.
A multiphysics Simulation-Driven Design approach has been undertaken to augment the OCP Leopard Server thermal management and heat recovery hardware with the Nexalus hybrid liquid-cooled sealed server technology. Independent testing at RISE Research Institutes of Sweden has shown that up to 98% heat recovery is achievable at water temperatures up to and exceeding 65°C. The improved design could maintain the elevated water temperature over a range of CPU workloads, from 8% to 75%. Importantly, the design solution achieves this within an architecture that is 1OU in height, half that of the original stock 2OU server, potentially doubling the compute density of a rack.
Dynamic Adaptive Streaming over HTTP (DASH) is a standard for delivering video in segments and adapting each segment's bitrate (quality) to adjust to changing and limited network bandwidth. We study segment prefetching, informed by machine learning predictions of the bitrates of client segment requests, implemented at the network edge. We formulate this client segment request prediction problem as a supervised learning problem of predicting the bitrate of a client's next segment request, in order to prefetch it at the mobile edge, with the objective of jointly improving the video streaming experience for the users and network bandwidth utilization for the service provider. Extensive evaluations showed a segment request prediction accuracy of close to 90%, reduced video segment access delay with a cache hit ratio of 58%, and reduced transport network load, lowering the backhaul link utilization by 60.91%.
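To illustrate the supervised formulation, the sketch below trains a classifier to predict the bitrate level of a client's next segment request from recent request history, which an edge cache could then use to prefetch that representation. The feature set and the synthetic data are illustrative assumptions, not the paper's dataset or model.

```python
# Sketch: next-segment bitrate prediction as supervised learning for edge prefetching.
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, levels = 5000, [0, 1, 2, 3]            # four bitrate levels (e.g. 360p .. 1080p)

# Hypothetical features: last two requested levels and a throughput estimate.
prev1 = rng.choice(levels, n)
prev2 = rng.choice(levels, n)
throughput = rng.normal(loc=2.0 + prev1, scale=0.5, size=n)
X = np.column_stack([prev1, prev2, throughput])
# Synthetic target: next requested level loosely follows history and throughput.
y = np.clip(np.round(0.7 * prev1 + 0.3 * (throughput - 2.0)), 0, 3)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
clf = RandomForestClassifier(n_estimators=100, random_state=0).fit(X_tr, y_tr)
print("next-segment prediction accuracy:", clf.score(X_te, y_te))
# A confident prediction would trigger prefetching of that representation
# into the edge cache before the client actually requests it.
```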
The ongoing roll-out of 5G networks paves the way for many fascinating applications such as virtual reality (VR), augmented reality (AR), and autonomous driving. Moreover, 5G enables billions of devices to transfer an unprecedented amount of data at the same time. This transformation calls for novel technologies like multi-access edge computing (MEC) to satisfy the stringent latency and bitrate requirements of the mentioned applications. The main challenge pertaining to MEC is that the edge MEC nodes are usually characterized by scarce computational resources compared to the core or cloud, raising the challenge of efficiently utilizing the edge resources while ensuring that the service requirements are satisfied. When users' mobility is also considered, another challenge arises: minimizing the service interruption for users whose service requests are represented as service function chains (SFCs) composed of virtualized network functions (VNFs) instantiated on the MEC nodes or on the cloud. In this paper, we study the problem of joint user association, SFC placement, and resource allocation, employing mixed-integer linear programming (MILP) techniques. The objectives of this MILP-based problem formulation are to minimize (i) the service provisioning cost, (ii) the transport network utilization, and (iii) the service interruption. Moreover, a heuristic algorithm is proposed to tackle the scalability issue of the MILP-based algorithms. Finally, comprehensive experiments are performed to draw a comparison between these approaches.
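The toy MILP below conveys the flavour of such a placement formulation: binary variables assign each VNF of each chain to an edge node or the cloud, subject to node capacity, while minimising a provisioning cost plus a latency penalty. The sets, costs, capacities and the single objective term are made-up simplifications of the paper's full model (which also covers user association, transport utilisation and service interruption).

```python
# Toy MILP sketch in the spirit of joint SFC/VNF placement (simplified).
import pulp

chains = {"sfc1": ["fw", "dpi"], "sfc2": ["fw", "enc", "dpi"]}
nodes = {"edge1": 4, "edge2": 3, "cloud": 100}            # CPU capacity (assumed)
cost = {"edge1": 3.0, "edge2": 3.0, "cloud": 1.0}         # cost per CPU unit (assumed)
latency_penalty = {"edge1": 0.0, "edge2": 0.0, "cloud": 5.0}
cpu = {"fw": 1, "dpi": 2, "enc": 1}                       # CPU demand per VNF

prob = pulp.LpProblem("sfc_placement", pulp.LpMinimize)
x = pulp.LpVariable.dicts(
    "x", [(c, f, n) for c, fs in chains.items() for f in fs for n in nodes],
    cat="Binary")

# Objective: computation cost plus a latency penalty for cloud placement.
prob += pulp.lpSum((cost[n] * cpu[f] + latency_penalty[n]) * x[c, f, n]
                   for c, fs in chains.items() for f in fs for n in nodes)
# Each VNF of each chain is placed exactly once.
for c, fs in chains.items():
    for f in fs:
        prob += pulp.lpSum(x[c, f, n] for n in nodes) == 1
# Node capacity constraints.
for n, cap in nodes.items():
    prob += pulp.lpSum(cpu[f] * x[c, f, n] for c, fs in chains.items() for f in fs) <= cap

prob.solve(pulp.PULP_CBC_CMD(msg=False))
for (c, f, n), var in x.items():
    if var.value() > 0.5:
        print(c, f, "->", n)
```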
Streaming high-quality video over dynamic radio networks is challenging. Dynamic adaptive streaming over HTTP (DASH) is a standard for delivering video in segments, and adapting its quality to adjust to a changing and limited network bandwidth. We present a machine learning-based predictive pre-fetching and caching approach for DASH video streaming, implemented at the multi-access edge computing server. We use ensemble methods for machine learning (ML) based segment request prediction and an integer linear programming (ILP) technique for pre-fetching decisions. Our approach reduces video segment access delay with a cache-hit ratio of 60% and alleviates transport network load by reducing the backhaul link utilization by 69%. We validate the ML model and the pre-fetching algorithm, and present the trade-offs involved in pre-fetching and caching for resource-constrained scenarios.
For most modern data centres, it is of high value to select practical methods for improving energy efficiency and reducing energy waste. IT equipment and cooling systems are the two most significant energy consumers in data centres; thus the energy efficiency of any data centre mainly relies on the energy efficiency of its computational and cooling systems. Existing techniques for optimising the energy usage of both these systems have to be compared. However, such experiments cannot be conducted in real plants as they may harm the electronic equipment. This paper proposes a modelling toolbox which enables building models of data centres of any scale and configuration with relative ease. The toolbox is implemented as a set of building blocks which model individual components of a typical data centre, such as processors, local fans, servers, and units of cooling systems. It provides methods for adjusting the internal parameters of the building blocks, and contains constructors that use the building blocks to assemble models of data centre systems at different levels, from a single server to the server room. The data centre model is meant to accurately estimate the energy consumption as well as the evolution of the temperature of all computational nodes and the air temperature inside the data centre. The constructed model can substitute for the real data centre when examining the performance of different energy-saving strategies in dynamic mode: the model provides information about data centre operating states at each time point (as model outputs) and takes values of adjustable parameters as control signals from the system implementing the energy-saving algorithm (as model inputs). For Module 1 of the SICS ICE data centre located in Luleå, Sweden, the model was constructed from the building blocks. After adjusting the internal parameters of the building blocks, the model demonstrated behaviour quite close to the real data from the SICS ICE data centre. The model is therefore suitable to use as a substitute for the real data centre. Some examples of using the model for testing energy-saving strategies are presented at the end of the paper.
The “Voight-Kampff” Generative AI Authorship Verification task aims to determine whether a text was generated by an AI or written by a human. As in its fictional inspiration, the Voight-Kampff task structures AI detection as a builder-breaker challenge: the builders, participants in the PAN lab, submit software to detect AI-written text, and the breakers, participants in the ELOQUENT lab, submit AI-written text with the goal of fooling the builders. We formulate the task in a way that is reminiscent of a traditional authorship verification problem, where given a pair of texts, their human or machine authorship is to be inferred. For this first task installment, we further restrict the problem so that each pair is guaranteed to contain one human and one machine text. Hence the task description reads: Given two texts, one authored by a human, one by a machine: pick out the human. In total, we evaluated 43 detection systems (30 participant submissions and 13 baselines), ranging from linear classifiers to perplexity-based zero-shot systems. We tested them on 70 individual test set variants organized in 14 base collections, each designed around different constraints such as short texts, Unicode obfuscations, or language switching. The top systems achieve very high scores, proving themselves not perfect but sufficiently robust across a wide range of specialized testing regimes. Code used for creating the datasets and evaluating the systems, baselines, and data are available on GitHub.
This report focuses on the intersection of Foreign Information Manipulation and Interference (FIMI) and Large Language Models. The aim is to give a non-technical, comprehensive understanding of how weaknesses in the language models can be used for creating malicious content to be used in FIMI.
Distance-bounding anonymous credentials could be used for any location proofs that do not need to identify the prover and thus could make even notoriously invasive mechanisms such as location-based services privacy-preserving. There is, however, no secure distance-bounding protocol for general attribute-based anonymous credentials. Brands and Chaum’s (EUROCRYPT’93) protocol combining distance-bounding and Schnorr identification comes close, but does not fulfill the requirements of modern distance-bounding protocols. For that, we need a secure distance-bounding zero-knowledge proof-of-knowledge resisting mafia fraud, distance fraud, distance hijacking and terrorist fraud. Our approach is another attempt toward combining distance bounding and Schnorr to construct a distance-bounding zero-knowledge proof-of-knowledge. We construct such a protocol and prove it secure in the (extended) DFKO model for distance bounding. We also performed a symbolic verification of security properties needed for resisting these attacks, implemented in Tamarin. Encouraged by results from Singh et al. (NDSS’19), we take advantage of lessened constraints on how much can be sent in the fast phase of the distance-bounding protocol and achieve a more efficient protocol. We also provide a version that does not rely on being able to send more than one bit at a time which yields the same properties except for (full) terrorist fraud resistance.
There is a growing consensus around the transformative and innovative power of Artificial Intelligence (AI) technology. AI will transform which products are launched and how new business models will be developed to support them. Despite this, little research exists today that systematically explores how AI will change and support various aspects of innovation management. To address this question, this article proposes a holistic, multi-dimensional AI maturity model that describes the essential conditions and capabilities necessary to integrate AI into current systems, and guides organisations on their journey to AI maturity. It explores how various elements of the innovation management system can be enabled by AI at different maturity stages. Two key experimentation stages are identified, 1) an initial stage that focuses on optimisation and incremental innovation, and 2) a higher maturity stage where AI becomes an enabler of radical innovation. We conclude that AI technologies can be applied to democratise and distribute innovation across organisations.
Molecular property prediction is essential in chemistry, especially for drug discovery applications. However, available molecular property data is often limited, encouraging the transfer of information from related data. Transfer learning has had a tremendous impact in fields like Computer Vision and Natural Language Processing, signaling its potential in molecular property prediction. We present a pre-training procedure for molecular representation learning using reaction data and use it to pre-train a SMILES Transformer. We fine-tune and evaluate the pre-trained model on 12 molecular property prediction tasks from MoleculeNet within physical chemistry, biophysics, and physiology, and show a statistically significant positive effect on 5 of the 12 tasks compared to a non-pre-trained baseline model.
The aim of the pre-study The Circular Car (Den cirkulära bilen) was to begin building concrete visions that enable Sweden to have a circularly adapted car fleet with fossil-free and climate-neutral transport by 2045, and to build a solid base for a stage 2 project, which in turn will provide support and capacity for actors to accelerate the circular car value chain. The project gathered 13 partners from across the value chain and jointly laid the foundation for further work in a continuation project, an application that has generated interest from a large number of partners, both existing and new. Within the study, kick-off meetings and workshops were held in which the partners met digitally and key questions were explored. Interviews were conducted with the partners in which the opportunities and challenges of the transition were discussed. Study visits were carried out where knowledge was shared and collaboration enabled. A physical workshop was held with all partners, jointly examining trends and possible future scenarios across the whole system. This provided a good basis for the continued work on stage 2. The pre-study has generated great interest from actors across the value chain, created new contacts and opportunities for collaboration, and marked the start of a joint learning journey towards real change. The study has initiated broad work across the value chain linked to shared questions and future outlooks, which enables joint work towards a broad transition and has clarified the need for measures that shift the entire system. This is seen as a good foundation for a stage 2 project with the prerequisites to realize the circular car value chain.
To enhance the computational efficiency of quantized Transformers, we replace the dot-product and Softmax-based attention with an alternative mechanism involving addition and ReLU activation only. This side-steps the expansion to double precision often required by matrix multiplication and avoids costly Softmax evaluations but maintains much of the core functionality of conventional dot-product attention. It can enable more efficient execution and support larger quantized Transformer models on resource-constrained hardware or alternative arithmetic systems like homomorphic encryption. Training experiments on four common benchmark tasks show test set prediction scores comparable to those of conventional Transformers with dot-product attention. Our scaling experiments also suggest significant computational savings, both in plaintext and under encryption. The ReLU and addition-based attention mechanism introduced in this paper may enable privacy-preserving AI applications operating under homomorphic encryption by avoiding the costly multiplication of encrypted variables.
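The snippet below is an illustrative stand-in only and does not reproduce the paper's exact mechanism: it contrasts standard dot-product/softmax attention with one possible "addition plus ReLU" variant in which pairwise scores are formed by broadcasting and adding query and key features, applying ReLU, and using the resulting non-negative weights in place of a softmax. The particular scoring rule and normalisation are assumptions made for the sketch.

```python
# Illustrative comparison of softmax attention with an additive/ReLU variant.
import numpy as np

def softmax_attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(Q.shape[-1])              # multiplications + exponentials
    w = np.exp(scores - scores.max(axis=-1, keepdims=True))
    w /= w.sum(axis=-1, keepdims=True)
    return w @ V

def relu_add_attention(Q, K, V, eps=1e-6):
    # score[i, j] = sum_d ReLU(Q[i, d] + K[j, d]): additions and ReLU only
    scores = np.maximum(Q[:, None, :] + K[None, :, :], 0.0).sum(-1)
    # simple sum normalisation instead of softmax (value aggregation kept as a
    # weighted sum for readability; the paper targets quantized/encrypted settings)
    w = scores / (scores.sum(axis=-1, keepdims=True) + eps)
    return w @ V

rng = np.random.default_rng(0)
Q, K, V = rng.normal(size=(5, 8)), rng.normal(size=(5, 8)), rng.normal(size=(5, 8))
print(softmax_attention(Q, K, V).shape, relu_add_attention(Q, K, V).shape)
```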
Adequate privacy protection is crucial for implementing modern AI algorithms in medicine. With Fully Homomorphic Encryption (FHE), a party without access to the secret key can perform calculations and advanced analytics on encrypted data without seeing either the input data or the results. FHE can therefore work as an enabler for situations where computations are carried out by parties that are denied plain-text access to sensitive data. This is a scenario often found with digital services that process personal health-related data or medical data originating from a healthcare provider, for example, when the service is delivered by a third-party service provider located in the cloud. There are practical challenges to be aware of when working with FHE. The current work aims to improve accessibility and reduce barriers to entry by providing code examples and recommendations to aid developers working with health data in developing FHE-based applications. HEIDA is available on the GitHub repository: https://github.com/rickardbrannvall/HEIDA.
People living with type 1 diabetes often use several apps and devices that help them collect and analyse data for better monitoring and management of their disease. When such health-related data is analysed in the cloud, one must always carefully consider privacy protection and adhere to laws regulating the use of personal data. In this paper we present our experience from the pilot Vinter competition 2021-22 organised by Vinnova. The competition focused on digital services that handle sensitive diabetes-related data. The architecture that we proposed for the competition is discussed in the context of a hypothetical cloud-based service that calculates diabetes self-care metrics under strong privacy preservation. It is based on Fully Homomorphic Encryption (FHE), a technology that makes computation on encrypted data possible. Our solution promotes safe key management and data life-cycle control. Our benchmarking experiment demonstrates execution times that scale well for the implementation of personalised health services. We argue that this technology has great potential for AI-based health applications, opens up new markets for third-party providers of such services, and will ultimately promote patient health and a trustworthy digital society.
This study investigates the use of transfer learning and modular design for adapting a pre-trained model to optimize energy efficiency and heat reuse in edge data centers while meeting local conditions, such as alternative heat management and hardware configurations. A Physics-Informed Data-Driven Recurrent Neural Network (PIDD RNN) is trained on a small scale-model experiment of a six-server data center to control cooling fans and maintain the exhaust chamber temperature within safe limits. The model features a hierarchical regularizing structure that reduces the degrees of freedom by connecting parameters for related modules in the system. With an RMSE of 1.69, the PIDD RNN outperforms both a conventional RNN (RMSE: 3.18) and a State Space Model (RMSE: 2.66). We investigate how this design facilitates transfer learning when the model is fine-tuned over a few epochs on a small dataset from a second set-up with a server located in a wind tunnel. The transferred model outperforms a model trained from scratch over hundreds of epochs.
Finding synergies between heat-producing and heat-consuming actors in an economy provides opportunities for more efficient energy utilization and reduction of overall power consumption. We propose to use low-grade heat recovered from data centers directly in food processing industries, for example for the drying of fruit and berries. This study analyses how the heat output of industrial IT load on servers can dry apples in a small-scale experimental set-up. To keep the temperature of the server exhaust airflow near a desired set-point, we use a model predictive controller (MPC) re-purposed for the drying experiment set-up from a previous work that used machine learning models for cluster thermal management. Conditions of, for example, 37 °C for 8 hours of drying can thus be obtained, with results very similar to conventional drying of apples. The proposed solution increases the value output of the electricity used in a data center by capturing and using the excess heat that would otherwise be exhausted. The results from our experiments show that drying foods with excess heat from a data center is possible, with the potential to strengthen the food processing industry and contribute to food self-sufficiency in northern Sweden.
Cooling of IT equipment consumes a large proportion of a modern data centre's energy budget and is therefore an important target for optimal control. This study analyses a scaled-down system of six servers with cooling fans by implementing a minimal data-driven time-series model in TensorFlow/Keras, a modern software package popular for deep learning. The model is inspired by the physical laws of heat exchange, but with all parameters obtained by optimisation. It is encoded as a customised Recurrent Neural Network and exposed to the time-series data via n-step Prediction Error Minimisation (PEM). The Digital Twin of the physical system thus obtained is then used directly to construct a Model Predictive Control (MPC) type regulator that executes in real time. The MPC is then compared in simulation with a self-tuning PID controller that adjusts its parameters on-line by gradient descent.
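To give a feel for the idea of a physics-inspired recurrence fitted by n-step prediction error minimisation, the sketch below uses a simple discretised heat-exchange law for exhaust temperature and recovers its coefficients from noisy synthetic data. The model structure, data and parameter values are simplified assumptions; the paper encodes a related model as a customised RNN in TensorFlow/Keras rather than in plain NumPy/SciPy.

```python
# Sketch: fit a heat-exchange-inspired recurrence by n-step prediction error minimisation.
import numpy as np
from scipy.optimize import minimize

dt = 1.0
def simulate(theta, T0, power, fan, T_in):
    a, b = theta                       # heating and fan-cooling coefficients
    T = np.empty(len(power)); T[0] = T0
    for t in range(len(power) - 1):
        # dT/dt = a * P - b * fan * (T - T_in): heat in from load, heat out via airflow
        T[t + 1] = T[t] + dt * (a * power[t] - b * fan[t] * (T[t] - T_in[t]))
    return T

# Synthetic "measurements" generated from the same structure plus noise.
rng = np.random.default_rng(0)
n = 300
power = 50 + 30 * rng.random(n)
fan = 0.4 + 0.4 * rng.random(n)
T_in = 22 + rng.normal(0, 0.2, n)
meas = simulate((0.02, 0.15), 30.0, power, fan, T_in) + rng.normal(0, 0.1, n)

def n_step_pem(theta, horizon=20):
    # average squared error of rolling horizon-step-ahead predictions
    err = 0.0
    for s in range(0, n - horizon, horizon):
        pred = simulate(theta, meas[s], power[s:s + horizon],
                        fan[s:s + horizon], T_in[s:s + horizon])
        err += np.mean((pred - meas[s:s + horizon]) ** 2)
    return err

theta_hat = minimize(n_step_pem, x0=[0.01, 0.1], method="Nelder-Mead").x
print("fitted coefficients:", theta_hat)   # should recover roughly (0.02, 0.15)
```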
Low latency requirements are expected to increase with 5G telecommunications, driving data and compute to edge data centers located in cities near end users. This article presents a testbed for such data centers that has been built at the RISE ICE Datacenter in northern Sweden in order to perform full-stack experiments on load balancing, cooling, micro-grid interactions and the use of renewable energy sources. The system is described with details on both the hardware components and the software implementations used for data collection and control. A use case for off-grid operation is presented to demonstrate how the test lab can be used for experiments on edge data center design, control and autonomous operation.
This report investigates the problem of where to place the computation workload in an edge-cloud network topology, considering the trade-off between the location-specific cost of computation and data communication.
This article investigates the problem of where to place the computation workload in an edge-cloud network topology considering the trade-off between the location-specific cost of computation and data communication. For this purpose, a Monte Carlo simulation model is defined that accounts for different workload types, their distribution across time and location, as well as correlation structure. Results confirm and quantify the intuition that optimization can be achieved by distributing a part of cloud computation to make efficient use of resources in an edge data center network, with operational energy savings of 4–6% and up to 50% reduction in its claim for cloud capacity.
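The sketch below conveys the Monte Carlo flavour of such an analysis: sample correlated diurnal demand per site, serve what fits within local edge capacity, overflow the remainder to the cloud, and compare cost against an all-cloud baseline. Capacities, prices and the workload distribution are made-up illustrations, not the article's calibrated model.

```python
# Monte Carlo sketch of the edge-vs-cloud workload placement trade-off.
import numpy as np

rng = np.random.default_rng(0)
sites, hours, runs = 10, 24, 1000
edge_capacity = 80.0                       # compute units per site and hour (assumed)
cost_edge, cost_cloud = 0.8, 1.0           # cost per compute unit (assumed)

def one_run():
    # correlated diurnal demand: shared daily profile plus site-level noise
    profile = 60 + 40 * np.sin(np.linspace(0, 2 * np.pi, hours))
    demand = np.maximum(profile + rng.normal(0, 15, size=(sites, hours)), 0)
    at_edge = np.minimum(demand, edge_capacity)
    overflow = demand - at_edge
    split_cost = cost_edge * at_edge.sum() + cost_cloud * overflow.sum()
    cloud_cost = cost_cloud * demand.sum()
    return split_cost / cloud_cost, overflow.sum() / demand.sum()

ratios, cloud_share = zip(*(one_run() for _ in range(runs)))
print(f"mean cost relative to all-cloud: {np.mean(ratios):.3f}")
print(f"mean share of load still sent to cloud: {np.mean(cloud_share):.3f}")
```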
In this paper we generate word meta-embeddings from already existing embeddings using cross-encoding. Previous approaches can only work with words that exist in each source embedding, while the architecture presented here drops this requirement. We demonstrate the method using two pre-trained embeddings, namely GloVe and fastText. Furthermore, we propose additional improvements to the training process of the meta-embedding. Results on six standard tests for word similarity show that the trained meta-embedding outperforms the original embeddings. Moreover, this performance can be further increased with the proposed improvements, resulting in performance competitive with previously reported results.
The Internet of Things is expanding, and since IoT devices and IoT networks are used in many crucial areas of modern societies, ranging from security and military applications to healthcare monitoring and production efficiency, the need to secure these devices is of great importance. Intrusion detection systems (IDS) play a significant role in securing IoT networks, as their goal is to detect intruders that have gained access to one or several IoT nodes. While most IDS have been designed to detect a specific or at most a few attacks, the DETONAR framework detects multiple attacks. However, it runs on a designated sniffer network, which adds additional cost in terms of hardware and maintenance. In this paper, we propose DETONAR-Light, adapting DETONAR to run using data collected at a border router rather than on sniffer logs. Our experiments show that this is possible almost without any decrease in detection and attack classification rates for many attacks.
A prominent approach to solving combinatorial optimization problems on parallel hardware is Ising machines, i.e., hardware implementations of networks of interacting binary spin variables. Most Ising machines leverage second-order interactions, although important classes of optimization problems, such as satisfiability problems, map more seamlessly to Ising networks with higher-order interactions. Here, we demonstrate that higher-order Ising machines can solve satisfiability problems more resource-efficiently in terms of the number of spin variables and their connections when compared to traditional second-order Ising machines. Further, on a benchmark dataset of Boolean k-satisfiability problems, our results show that higher-order Ising machines implemented with coupled oscillators rapidly find solutions that are better than those of second-order Ising machines, thus improving the current state of the art for Ising machines.
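The natural mapping from k-SAT to higher-order spin interactions can be made explicit: with spins s_i in {-1, +1} (s_i = +1 meaning variable i is True), a clause with literals indexed by j contributes the energy term prod_j (1 - c_j s_j)/2, where c_j = +1 for a positive literal and -1 for a negated one. This product is a degree-k interaction that equals 1 exactly when the clause is violated, so the total energy counts unsatisfied clauses and its ground states are the satisfying assignments. The tiny formula below is an illustrative example, not a benchmark instance from the paper.

```python
# Sketch: encoding k-SAT clauses as higher-order Ising interaction terms.
import itertools
import numpy as np

# (x1 or not x2 or x3) and (not x1 or x2), as (variable index, sign) pairs
clauses = [[(0, +1), (1, -1), (2, +1)],
           [(0, -1), (1, +1)]]

def energy(spins, clauses):
    # each clause contributes a degree-k product term that is 1 iff the clause is violated
    return sum(np.prod([(1 - c * spins[i]) / 2 for i, c in clause]) for clause in clauses)

# brute-force check: a configuration has energy 0 exactly when it satisfies the formula
for spins in itertools.product([-1, 1], repeat=3):
    sat = all(any(spins[i] == c for i, c in clause) for clause in clauses)
    assert (energy(spins, clauses) == 0) == sat
print("energy function reproduces clause satisfaction for all assignments")
```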
Stream processing has been an active research field for more than 20 years, but it is now witnessing its prime time due to recent successful efforts by the research community and numerous worldwide open-source communities. The goal of this tutorial is threefold. First, we aim to review and highlight noteworthy past research findings, which were largely ignored until very recently. Second, we intend to underline the differences between early ('00-'10) and modern ('11-'18) streaming systems, and how those systems have evolved through the years. Most importantly, we wish to turn the attention of the database community to recent trends: streaming systems are no longer used only for classic stream processing workloads, namely window aggregates and joins. Instead, modern streaming systems are being increasingly used to deploy general event-driven applications in a scalable fashion, challenging the design decisions, architecture and intended use of existing stream processing systems.
The 6TiSCH architecture has been gaining traction as a promising solution to ensure reliability and security for communication in applications for the Industrial Internet of Things (IIoT). While many different aspects of the architecture have been investigated in the literature, an in-depth analysis of the security features included in its design is still missing. In this paper, we assess the security vulnerabilities of the 6top protocol, a core component of the 6TiSCH architecture that enables network nodes to negotiate communication resources. Our analysis highlights two possible attacks against the 6top protocol that can impair network performance and reliability in a significant manner. To prove the feasibility of the attacks in practice, we implemented both of them on the Contiki-NG operating system and tested their effectiveness on a simple deployment with three Zolertia RE-Mote sensor nodes. Also, we carried out a set of simulations using Cooja in order to assess their impact on larger networks. Our results show that both attacks reduce reliability in the overall network and increase the energy consumption of the network nodes.
The current advancements in open domain text generation have been spearheaded by Transformer-based large language models. Leveraging efficient parallelization and vast training datasets, these models achieve unparalleled text generation capabilities. Even so, current models are known to suffer from deficiencies such as repetitive texts, looping issues, and lack of robustness. While adversarial training through generative adversarial networks (GAN) is a proposed solution, earlier research in this direction has predominantly focused on older architectures, or narrow tasks. As a result, this approach is not yet compatible with modern language models for open-ended text generation, leading to diminished interest within the broader research community. We propose a computationally efficient GAN approach for sequential data that utilizes the parallelization capabilities of Transformer models. Our method revolves around generating multiple branching sequences from each training sample, while also incorporating the typical next-step prediction loss on the original data. In this way, we achieve a dense reward and loss signal for both the generator and the discriminator, resulting in a stable training dynamic. We apply our training method to pre-trained language models, using data from their original training set but less than 0.01% of the available data. A comprehensive human evaluation shows that our method significantly improves the quality of texts generated by the model while avoiding the previously reported sparsity problems of GAN approaches. Even our smaller models outperform larger original baseline models with more than 16 times the number of parameters. Finally, we corroborate previous claims that perplexity on held-out data is not a sufficient metric for measuring the quality of generated texts.
Extracting semantically useful natural language sentence representations from pre-trained deep neural networks such as Transformers remains a challenge. We first demonstrate that pre-training objectives impose a significant task bias onto the final layers of models, with a layer-wise survey of the Semantic Textual Similarity (STS) correlations for multiple common Transformer language models. We then propose a new self-supervised method called Contrastive Tension (CT) to counter such biases. CT frames the training objective as a noise-contrastive task between the final layer representations of two independent models, in turn making the final layer representations suitable for feature extraction. Results from multiple common unsupervised and supervised STS tasks indicate that CT outperforms the previous State Of The Art (SOTA), and when combining CT with supervised data we improve upon previous SOTA results by large margins.
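The sketch below conveys the gist of the noise-contrastive setup: two independently updated copies of a pre-trained encoder are trained on the dot products of their final-layer sentence representations, with identical sentences as positive pairs and mismatched sentences as negatives. The encoder choice, mean pooling, batch construction and hyperparameters are assumptions of this sketch rather than the paper's exact training recipe.

```python
# Compact PyTorch sketch of a Contrastive-Tension-style objective.
import torch
from transformers import AutoModel, AutoTokenizer

name = "distilbert-base-uncased"           # assumed encoder for illustration
tok = AutoTokenizer.from_pretrained(name)
model1 = AutoModel.from_pretrained(name)   # two independent copies of the same model
model2 = AutoModel.from_pretrained(name)

def embed(model, sentences):
    batch = tok(sentences, padding=True, truncation=True, return_tensors="pt")
    hidden = model(**batch).last_hidden_state            # final-layer states
    mask = batch["attention_mask"].unsqueeze(-1)
    return (hidden * mask).sum(1) / mask.sum(1)           # mean pooling

sentences = ["a cat sits on the mat", "the weather is nice today",
             "stocks fell sharply on monday", "he plays the violin beautifully"]

opt = torch.optim.Adam(list(model1.parameters()) + list(model2.parameters()), lr=1e-5)
bce = torch.nn.BCEWithLogitsLoss()

for step in range(2):                                     # tiny demo loop
    e1 = embed(model1, sentences)
    e2 = embed(model2, sentences)
    logits = e1 @ e2.T                                    # all pairwise dot products
    labels = torch.eye(len(sentences))                    # same sentence -> positive pair
    loss = bce(logits, labels)
    loss.backward()
    opt.step()
    opt.zero_grad()
    print(f"step {step}: loss {loss.item():.4f}")
```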
The introduction of immensely large Causal Language Models (CLMs) has rejuvenated interest in open-ended text generation. However, controlling the generative process for these Transformer-based models is largely an unsolved problem. Earlier work has explored either plug-and-play decoding strategies or more powerful but blunt approaches such as prompting. There hence currently exists a trade-off between fine-grained control and the capability for more expressive high-level instructions. To alleviate this trade-off, we propose an encoder-decoder architecture that enables intermediate text prompts at arbitrary time steps. We propose a resource-efficient method for converting a pre-trained CLM into this architecture, and demonstrate its potential on various experiments, including the novel task of contextualized word inclusion. Our method provides strong results on multiple experimental settings, proving itself to be both expressive and versatile.
We propose a portfolio of exact and metaheuristic methods for the rich examination timetabling problem introduced by Battistutta et al. (in: Hebrard, Musliu (eds) 17th International conference on the integration of constraint programming, artificial intelligence, and operations research (CPAIOR-2020), LNCS, vol 12296. Springer, Berlin, pp 69–81, 2020). The problem includes several real-world features that arise in Italian universities, such as examinations split into two parts, possible requirements of multiple rooms for a single examination, and unavailabilities and preferences for periods and rooms. We developed a CP model encoded in the MiniZinc modeling language and solved it with Gecode, as well as two MIP models solved with Gurobi. The first MIP model is encoded natively and the second one again in MiniZinc. Finally, we extended the metaheuristic method based on simulated annealing of Battistutta et al. by introducing a new neighborhood relation. We compare the different techniques on the real-world instances provided by Battistutta et al., which have been slightly refined by correcting some semantic issues. Finally, we developed a solution checker that is publicly available, together with all instances and solutions, for inspection and future comparisons.