Publications (10 of 15)
Brännvall, R., Adomaitis, L., Görnerup, O. & Sedrati, A. (2025). Technical Report for the Forgotten-by-Design Project: Targeted Obfuscation for Machine Learning. arXiv (Cornell University)
2025 (English). In: arXiv (Cornell University). Article in journal (Other academic), Published
Abstract [en]

The right to privacy, enshrined in various human rights declarations, faces new challenges in the age of artificial intelligence (AI). This paper explores the concept of the Right to be Forgotten (RTBF) within AI systems, contrasting it with traditional data erasure methods. We introduce Forgotten by Design, a proactive approach to privacy preservation that integrates instance-specific obfuscation techniques during the AI model training process. Unlike machine unlearning, which modifies models post-training, our method prevents sensitive data from being embedded in the first place. Using the LIRA membership inference attack, we identify vulnerable data points and propose defenses that combine additive gradient noise and weighting schemes. Our experiments on the CIFAR-10 dataset demonstrate that our techniques reduce privacy risks by at least an order of magnitude while maintaining model accuracy (at 95% significance). Additionally, we present visualization methods for the privacy-utility trade-off, providing a clear framework for balancing privacy risk and model accuracy. This work contributes to the development of privacy-preserving AI systems that align with human cognitive processes of motivated forgetting, offering a robust framework for safeguarding sensitive information and ensuring compliance with privacy regulations.

Place, publisher, year, edition, pages
Cornell University, 2025
National Category
Computer Sciences
Identifiers
urn:nbn:se:ri:diva-78991 (URN); 10.48550/arxiv.2501.11525 (DOI)
Available from: 2025-09-22. Created: 2025-09-22. Last updated: 2025-12-11. Bibliographically approved
Li, N., Zahra, S., de Brito, M. M., Flynn, C. M., Görnerup, O., Worou, K., . . . Nivre, J. (2024). Using LLMs to Build a Database of Climate Extreme Impacts. In: ClimateNLP 2024 - 1st Workshop on Natural Language Processing Meets Climate Change, Proceedings of the Workshop. Paper presented at 1st Workshop on Natural Language Processing Meets Climate Change, ClimateNLP 2024, Bangkok, Thailand, 16 August 2024 (pp. 93-110). Association for Computational Linguistics (ACL)
2024 (English). In: ClimateNLP 2024 - 1st Workshop on Natural Language Processing Meets Climate Change, Proceedings of the Workshop, Association for Computational Linguistics (ACL), 2024, p. 93-110. Conference paper, Published paper (Refereed)
Abstract [en]

To better understand how extreme climate events impact society, we need to increase the availability of accurate and comprehensive information about these impacts. We propose a method for building large-scale databases of climate extreme impacts from online textual sources, using LLMs for information extraction in combination with more traditional NLP techniques to improve accuracy and consistency. We evaluate the method against a small benchmark database created by human experts and find that extraction accuracy varies for different types of information. We compare three different LLMs and find that, while the commercial GPT-4 model gives the best performance overall, the open-source models Mistral and Mixtral are competitive for some types of information.

Place, publisher, year, edition, pages
Association for Computational Linguistics (ACL), 2024
Keywords
Computational linguistics; Database systems; Open systems; Benchmark database; Climate event; Climate extremes; Comprehensive information; Extraction accuracy; Extreme climates; Human expert; Large-scale database; Open-source model; Performance; Data accuracy
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-76190 (URN); 2-s2.0-85204502136 (Scopus ID)
Conference
1st Workshop on Natural Language Processing Meets Climate Change, ClimateNLP 2024. Bangkok, Thailand. 16 August 2024
Note

The research presented in this paper was supported by the Swedish Research Council (grants no. 2022-02909, 2022-03448 and 2022-06599). Ni Li is supported by the VUB Research Council in the framework of a EUTOPIA inter-university co-tutelle PhD program between the Vrije Universiteit Brussel, Belgium, and the Technische Universität Dresden, Germany. The EUTOPIA alliance is part of the European Universities Initiatives co-funded by the European Union. The experiments with the open-source LLMs were enabled by the National Academic Infrastructure for Supercomputing in Sweden (NAISS), partially funded by the Swedish Research Council through grant agreement no. 2022-06725. We thank NAISS for providing computational resources under Project 2024/22-211.

Available from: 2024-11-18. Created: 2024-11-18. Last updated: 2025-12-08. Bibliographically approved
Holst, A., Bouguelia, M.-R., Görnerup, O., Pashami, S., Al-Shishtawy, A., Falkman, G., . . . Soliman, A. (2019). Eliciting structure in data. In: CEUR Workshop Proceedings. Paper presented at 2019 Joint ACM IUI Workshops, ACMIUI-WS 2019, 20 March 2019.
2019 (English). In: CEUR Workshop Proceedings, 2019. Conference paper, Published paper (Refereed)
Abstract [en]

This paper demonstrates how to explore and visualize different types of structure in data, including clusters, anomalies, causal relations, and higher order relations. The methods are developed with the goal of being as automatic as possible and applicable to massive, streaming, and distributed data. Finally, a decentralized learning scheme is discussed, enabling finding structure in the data without collecting the data centrally.

Keywords
Anomaly detection, Causal inference, Clustering, Distributed analytics, Higher-order structure, Information visualization, Information systems, User interfaces, Causal inferences, Data acquisition
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-38261 (URN); 2-s2.0-85063227224 (Scopus ID)
Conference
2019 Joint ACM IUI Workshops, ACMIUI-WS 2019, 20 March 2019
Available from: 2019-04-02. Created: 2019-04-02. Last updated: 2025-09-23. Bibliographically approved
Boman, M., Ben Abdesslem, F., Forsell, E., Gillblad, D., Görnerup, O., Isacsson, N., . . . Kaldo, V. (2019). Learning machines in Internet-delivered psychological treatment. Progress in Artificial Intelligence, 8(4), 475-485
2019 (English). In: Progress in Artificial Intelligence, ISSN 2192-6352, E-ISSN 2192-6360, Vol. 8, no 4, p. 475-485. Article in journal (Refereed), Published
Abstract [en]

A learning machine, in the form of a gating network that governs a finite number of different machine learning methods, is described at the conceptual level with examples of concrete prediction subtasks. A historical data set with data from over 5000 patients in Internet-based psychological treatment will be used to equip healthcare staff with decision support for questions pertaining to ongoing and future cases in clinical care for depression, social anxiety, and panic disorder. The organizational knowledge graph is used to inform the weight adjustment of the gating network and for routing subtasks to the different methods employed locally for prediction. The result is an operational model for assisting therapists in their clinical work, about to be subjected to validation in a clinical trial.

Place, publisher, year, edition, pages
Springer Verlag, 2019
Keywords
Ensemble learning, Gating network, Internet-based psychological treatment, Learning machine, Machine learning, Decision support systems, Learning systems, Conceptual levels, Decision supports, Learning machines, Machine learning methods, Operational model, Organizational knowledge, Psychological treatments, Patient treatment
National Category
Natural Sciences
Identifiers
urn:nbn:se:ri:diva-39062 (URN); 10.1007/s13748-019-00192-0 (DOI); 2-s2.0-85066625908 (Scopus ID)
Available from: 2019-06-26. Created: 2019-06-26. Last updated: 2025-09-23. Bibliographically approved
Kreuger, P., Steinert, R., Görnerup, O. & Gillblad, D. (2018). Distributed dynamic load balancing with applications in radio access networks. International Journal of Network Management, 28(2)
2018 (English). In: International Journal of Network Management, ISSN 1055-7148, E-ISSN 1099-1190, Vol. 28, no 2. Article in journal (Refereed), Published
Abstract [en]

Managing and balancing load in distributed systems remains a challenging problem in resource management, especially in networked systems where scalability concerns favour distributed and dynamic approaches. Distributed methods can also integrate well with centralised control paradigms if they provide high-level usage statistics and control interfaces for supporting and deploying centralised policy decisions. We present a general method to compute target values for an arbitrary metric on the local system state and show that autonomous rebalancing actions based on the target values can be used to reliably and robustly improve the balance for metrics based on probabilistic risk estimates. To balance the trade-off between balancing efficiency and cost, we introduce two methods of deriving rebalancing actuations from the computed targets that depend on parameters that directly affect the trade-off. This enables policy-level control of the distributed mechanism based on collected metric statistics from network elements. Evaluation results based on cellular radio access network simulations indicate that load balancing based on probabilistic overload risk metrics provides more robust balancing solutions with fewer handovers compared to a baseline setting based on average load.

Place, publisher, year, edition, pages
John Wiley & Sons, 2018
Keywords
Self-organising heterogeneous networks; Distributed dynamic load balancing; Methods/control theories; Network Management/Wireless & mobile networks
National Category
Computer Sciences
Identifiers
urn:nbn:se:ri:diva-32825 (URN); 10.1002/nem.2014 (DOI); 2-s2.0-85036539033 (Scopus ID)
Funder
Swedish Foundation for Strategic Research, RIT15-0075; EU, Horizon 2020, 671639
Available from: 2017-12-05. Created: 2017-12-05. Last updated: 2025-09-23. Bibliographically approved
Görnerup, O. & Gillblad, D. (2018). Streaming word similarity mining on the cheap. Paper presented at Conference on Empirical Methods in Natural Language Processing (EMNLP).
2018 (English). Conference paper, Published paper (Other academic)
Abstract [en]

Accurately and efficiently estimating word similarities from text is fundamental in natural language processing. In this paper, we propose a fast and lightweight method for estimating similarities from streams by explicitly counting second-order co-occurrences. The method rests on the observation that words that are highly correlated with respect to such counts are also highly similar with respect to first-order co-occurrences. Using buffers of co-occurred words per word to count second-order co-occurrences, we can then estimate similarities in a single pass over data without having to do prohibitively expensive similarity calculations. We demonstrate that this approach is scalable, converges rapidly, behaves robustly under parameter changes, and that it captures word similarities on par with those given by state-of-the-art word embeddings.
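
The buffer-based counting described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: the function name `second_order_counts`, the fixed-size buffer, and the rule that two words are counted together when both have co-occurred with the same word are assumptions made for illustration only.

```python
from collections import defaultdict, deque


def second_order_counts(pairs, buffer_size=4):
    """Count second-order co-occurrences in one pass over (word, context) pairs.

    Assumption: two context words are counted together whenever both have
    been seen next to the same word while the earlier one is still buffered.
    """
    buffers = defaultdict(lambda: deque(maxlen=buffer_size))  # recent contexts per word
    counts = defaultdict(int)                                 # unordered word pair -> count
    for word, context in pairs:
        for earlier in buffers[word]:
            if earlier != context:
                counts[tuple(sorted((earlier, context)))] += 1
        buffers[word].append(context)
    return dict(counts)


# Hypothetical stream of first-order co-occurrences.
stream = [
    ("drink", "tea"), ("drink", "coffee"),
    ("hot", "tea"), ("hot", "coffee"),
    ("drive", "car"),
]
print(second_order_counts(stream))
```

Because each buffer is bounded, memory per word stays constant regardless of stream length, which is the property a single-pass method needs.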

National Category
Natural Language Processing
Identifiers
urn:nbn:se:ri:diva-35186 (URN)
Conference
Conference on Empirical Methods in Natural Language Processing (EMNLP)
Available from: 2018-09-18. Created: 2018-09-18. Last updated: 2025-09-23. Bibliographically approved
Görnerup, O., Gillblad, D. & Vasiloudis, T. (2017). Domain-Agnostic Discovery of Similarities and Concepts at Scale (7ed.). Knowledge and Information Systems, 51, 531-560
2017 (English). In: Knowledge and Information Systems, ISSN 0219-1377, E-ISSN 0219-3116, Vol. 51, p. 531-560. Article in journal (Refereed), Published
Abstract [en]

Appropriately defining and efficiently calculating similarities from large data sets are often essential in data mining, both for gaining understanding of data and generating processes, and for building tractable representations. Given a set of objects and their correlations, we here rely on the premise that each object is characterized by its context, i.e. its correlations to the other objects. The similarity between two objects can then be expressed in terms of the similarity between their contexts. In this way, similarity pertains to the general notion that objects are similar if they are exchangeable in the data. We propose a scalable approach for calculating all relevant similarities among objects by relating them in a correlation graph that is transformed to a similarity graph. These graphs can express rich structural properties among objects. Specifically, we show that concepts - abstractions of objects - are constituted by groups of similar objects that can be discovered by clustering the objects in the similarity graph. These principles and methods are applicable in a wide range of fields, and will here be demonstrated in three domains: computational linguistics, music and molecular biology, where the numbers of objects and correlations range from small to very large.
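
The idea that objects are similar when their contexts are similar can be sketched as follows. This is a minimal illustration, not the method from the paper: the cosine measure, the threshold, the nested-dictionary graph representation, and the names `context_similarity` and `similarity_graph` are all assumptions.

```python
from math import sqrt


def context_similarity(ctx_a, ctx_b):
    """Cosine similarity between two sparse context vectors (dicts)."""
    shared = set(ctx_a) & set(ctx_b)
    dot = sum(ctx_a[k] * ctx_b[k] for k in shared)
    norm_a = sqrt(sum(v * v for v in ctx_a.values()))
    norm_b = sqrt(sum(v * v for v in ctx_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similarity_graph(correlation_graph, threshold=0.5):
    """Turn {object: {neighbour: correlation}} into {(a, b): similarity}.

    Assumption: an edge is kept when the cosine similarity of the two
    objects' contexts reaches the (hypothetical) threshold.
    """
    objects = sorted(correlation_graph)
    edges = {}
    for i, a in enumerate(objects):
        for b in objects[i + 1:]:
            s = context_similarity(correlation_graph[a], correlation_graph[b])
            if s >= threshold:
                edges[(a, b)] = s
    return edges


# Hypothetical toy correlation graph.
correlations = {
    "cat": {"purrs": 1.0, "sleeps": 0.5},
    "kitten": {"purrs": 1.0, "sleeps": 0.5},
    "car": {"drives": 1.0},
}
print(similarity_graph(correlations))
```

In this toy graph "cat" and "kitten" are linked because they are exchangeable with respect to their contexts, while "car" shares no context with either. The all-pairs loop is quadratic and only suitable for illustration; it does not reflect the scalability claimed in the abstract.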

Place, publisher, year, edition, pages
Springer, 2017 Edition: 7
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24561 (URN); 10.1007/s10115-016-0984-2 (DOI); 2-s2.0-84984793995 (Scopus ID)
Note

This paper is an extended version of Görnerup, O., Gillblad, D. and Vasiloudis, T. (2015), Knowing an object by the company it keeps: A domain-agnostic scheme for similarity discovery, in "IEEE International Conference on Data Mining (ICDM 2015)".

Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2025-09-23. Bibliographically approved
Kreuger, P., Görnerup, O., Gillblad, D., Lundborg, T., Corcoran, D. & Ermedahl, A. (2015). Autonomous load balancing of heterogeneous networks (11ed.). In: 2015 IEEE 81st Vehicular Technology Conference (VTC Spring). Paper presented at 81st IEEE Vehicular Technology Conference (VTC Spring 2015), May 11-14, 2015, Glasgow, UK. Article ID 7145712.
2015 (English). In: 2015 IEEE 81st Vehicular Technology Conference (VTC Spring), 2015, 11, article id 7145712. Conference paper, Published paper (Refereed)
Abstract [en]

This paper presents a method for load balancing heterogeneous networks by dynamically assigning values to the LTE cell range expansion (CRE) parameter. The method records handover events online and adapts flexibly to changes in terminal traffic and mobility by maintaining statistical estimators that are used to support autonomous assignment decisions. The proposed approach has low overhead and is highly scalable due to a modularised and completely distributed design that exploits self-organisation based on local inter-cell interactions. An advanced simulator that incorporates terminal traffic patterns and mobility models with a radio access network simulator has been developed to validate and evaluate the method.

Series
IEEE Vehicular Technology Conference, ISSN 1550-2252
Keywords
autonomous network management, self-organising heterogeneous networks, distributed algorithms, statistical modelling
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24457 (URN); 10.1109/VTCSpring.2015.7145712 (DOI); 2-s2.0-84940399308 (Scopus ID); 978-1-4799-8088-8 (ISBN)
Conference
81st IEEE Vehicular Technology Conference (VTC Spring 2015), May 11-14, 2015, Glasgow, UK
Projects
HetNet
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2025-09-23. Bibliographically approved
Görnerup, O., Gillblad, D. & Vasiloudis, T. (2015). Knowing an Object by the Company It Keeps: A Domain-Agnostic Scheme for Similarity Discovery (18ed.). In: 2015 IEEE International Conference on Data Mining. Paper presented at 15th IEEE International Conference on Data Mining (ICDM 2015), November 14-17, 2015, Atlantic City, US (pp. 121-130). Article ID 7373316.
2015 (English). In: 2015 IEEE International Conference on Data Mining, 2015, 18, p. 121-130, article id 7373316. Conference paper, Published paper (Refereed)
Abstract [en]

Appropriately defining and then efficiently calculating similarities from large data sets are often essential in data mining, both for building tractable representations and for gaining understanding of data and generating processes. Here we rely on the premise that given a set of objects and their correlations, each object is characterized by its context, i.e. its correlations to the other objects, and that the similarity between two objects therefore can be expressed in terms of the similarity between their respective contexts. Resting on this principle, we propose a data-driven and highly scalable approach for discovering similarities from large data sets by representing objects and their relations as a correlation graph that is transformed to a similarity graph. Together these graphs can express rich structural properties among objects. Specifically, we show that concepts - representations of abstract ideas and notions - are constituted by groups of similar objects that can be identified by clustering the objects in the similarity graph. These principles and methods are applicable in a wide range of domains, and will here be demonstrated for three distinct types of objects: codons, artists and words, where the numbers of objects and correlations range from small to very large.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24463 (URN); 10.1109/ICDM.2015.85 (DOI); 2-s2.0-84963516560 (Scopus ID); 978-1-4673-9504-5 (ISBN)
Conference
15th IEEE International Conference on Data Mining (ICDM 2015), November 14-17, 2015, Atlantic City, US
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2025-09-23. Bibliographically approved
Kreuger, P., Gillblad, D., Görnerup, O., Corcoran, D., Lundborg, T. & Ermedahl, A. (2015). Methods, Nodes and system for enabling redistribution of cell load (13ed.).
2015 (English). Patent (Other (popular science, discussion, etc.))
Abstract [en]

Patent for a distributed load-balancing mechanism for LTE, developed by SICS in collaboration with Ericsson DURA.

Publisher
p. 49
Keywords
load balancing, cell range expansion, autonomous radio access management, distributed algorithms, LTE
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-24526 (URN)
Projects
HetNet
Available from: 2016-10-31. Created: 2016-10-31. Last updated: 2025-09-23. Bibliographically approved
Organisations
Identifiers
ORCID iD: orcid.org/0000-0001-9244-4546