Publications (4 of 4)
Fahria, K., Mowla, N. & Stenberg, S. (2025). Human-Centric Ground Truth Evaluation and Acceptance (Hu-GTVA): An Oversight by Design process for RAG-LLM evaluation. Borås: RISE Research Institutes of Sweden
2025 (English) Report (Other academic)
Abstract [en]

We present Hu-GTVA, a framework for human-grounded test case generation, validation, and acceptance designed to create contextual ground truths for the evaluation of retrieval-augmented generation (RAG) systems in high-stakes public-sector contexts. The framework addresses the challenge of aligning RAG system evaluations with expert-grounded domain knowledge by combining automated test case generation, structured expert annotation, and dual-review protocols. We demonstrate its application in collaboration with the Swedish National Financial Management Authority (ESV), where it supports the evaluation of Konsekvenshjälpen, a RAG-LLM system for regulatory impact assessment assistance. Hu-GTVA takes conceptual motivation from both the principle of Oversight by Design and the regulatory requirement of Human Oversight under Article 14 of the EU AI Act. Oversight by Design emphasizes integrating oversight considerations already during the design phase, while Human Oversight defines who, when, and what must be governed to ensure accountable AI use. Drawing from both, Hu-GTVA introduces structured expert review, acceptance criteria, and quantitative agreement metrics to bring human judgment into the evaluation process before deployment. Designed for modularity and domain adaptability, the framework can be extended to other high-risk settings such as healthcare or critical infrastructure. Hu-GTVA offers a reproducible and human-centered pre-hoc RAG-LLM evaluation pipeline.

Place, publisher, year, edition, pages
Borås: RISE Research Institutes of Sweden, 2025. p. 30
Series
RISE Rapport
Keywords
Human oversight, oversight by design, ground truth, RAG, RAG LLM, RAG evaluation, RAGChecker, RAGAS, TruLens
National Category
Computer Sciences
Identifiers
urn:nbn:se:ri:diva-80077 (URN), 978-91-90109-25-0 (ISBN)
Available from: 2025-12-22 Created: 2025-12-22 Last updated: 2026-01-22 Bibliographically approved
Fahria, K. & Mowla, N. (2025). Testing AI in relation to Traditional Software Testing: A Comparative Overview. RISE Research Institutes of Sweden
2025 (English) Report (Other academic)
Place, publisher, year, edition, pages
RISE Research Institutes of Sweden, 2025. p. 13
Series
RISE Rapport ; 2025:40
National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-78272 (URN), 978-91-90036-27-3 (ISBN)
Available from: 2025-03-27 Created: 2025-03-27 Last updated: 2026-01-22 Bibliographically approved
Fahria, K., Kabir, F., Mowla, N. & Fakhrul Abedin, S. (2025). Towards Explainable Automotive Intrusion Detection: A Chunk-based Framework for CAN Traffic. Paper presented at Swedish National Computer Networking and Cloud Computing Workshop (SNCNW), arranged at University West in Trollhättan, June 10-11, 2025.
2025 (English) Conference paper, Published paper (Other academic)
Abstract [en]

In this work, we propose an explainable intrusion detection framework for Controller Area Network (CAN) bus traffic using the ROAD dataset. By segmenting raw traffic into fixed-size chunks, we extract features that capture timing behavior, entropy, payload statistics, and CAN ID survival rates. We evaluate three classifiers: Decision Tree, Random Forest (with TreeSHAP), and Feedforward Neural Network (with KernelSHAP). The framework extracts multi-level features from CAN traffic, revealing through explainability that tree models detect protocol anomalies while neural networks capture signal-level distortions, underscoring the role of model choice in explainable IDS design.

National Category
Computer and Information Sciences
Identifiers
urn:nbn:se:ri:diva-78747 (URN)
Conference
Swedish National Computer Networking and Cloud Computing Workshop (SNCNW), arranged at University West in Trollhättan, June 10-11, 2025
Note

This work is supported by the EU project Citcom.AI, Vinnova INTERSTICE project (reference number: 2024-00661), and VINNOVA FFI Project MAGIC (reference number: 2024-03687). This work is also partially supported by KKS Research Profile NIIT, and Data Communication Security Laboratory at Ewha Womans University, South Korea.

Available from: 2025-08-15 Created: 2025-08-15 Last updated: 2026-01-22 Bibliographically approved
Mowla, N. (2024). From AI Act to Structured Testing of AI Systems. RISE Research Institutes of Sweden
2024 (English) Report (Other academic)
Abstract [en]

The Citcom.AI RISE testing approach is a step towards structured AI system evaluation and testing under the AI Act's regulatory framework. It establishes a definition of context in the scenario of different AI application domains, AI subfields, and use cases. In particular, a systematic evaluation, from defining the context and application to detailed risk assessments, linking each AI application to corresponding testing standards and methodologies, is presented. The approach translates the AI Act's high-level regulatory requirements for different AI system risk levels into appropriate technical testing techniques for achieving trustworthiness across different domains and AI subfields, promoting responsible AI deployment and fostering trust in AI applications.

Place, publisher, year, edition, pages
RISE Research Institutes of Sweden, 2024. p. 12
Series
RISE Rapport ; 2024:84
Keywords
Testing AI, AI Act, AI systems, Standards, Context
National Category
Computer Systems; Computer Sciences
Identifiers
urn:nbn:se:ri:diva-76073 (URN), 9789189971462 (ISBN)
Available from: 2024-11-13 Created: 2024-11-13 Last updated: 2026-01-22 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0009-0004-8393-1683
