Publications (4 of 4)
Fahria, K., Mowla, N. & Stenberg, S. (2025). Human-Centric Ground Truth Evaluation and Acceptance (Hu-GTVA): An Oversight by Design process for RAG-LLM evaluation. Borås: RISE Research Institutes of Sweden
2025 (English) Report (Other academic)
Abstract [en]

We present Hu-GTVA, a framework for human-grounded test case generation, validation, and acceptance designed to create contextual ground truths for the evaluation of retrieval-augmented generation (RAG) systems in high-stakes public-sector contexts. The framework addresses the challenge of aligning RAG system evaluations with expert-grounded domain knowledge by combining automated test case generation, structured expert annotation, and dual-review protocols. We demonstrate its application in collaboration with the Swedish National Financial Management Authority (ESV), where it supports the evaluation of Konsekvenshjälpen, a RAG-LLM system for regulatory impact assessment assistance. Hu-GTVA takes conceptual motivation from both the principle of Oversight by Design and the regulatory requirement of Human Oversight under Article 14 of the EU AI Act. Oversight by Design emphasizes integrating oversight considerations already during the design phase, while Human Oversight defines who, when, and what must be governed to ensure accountable AI use. Drawing from both, Hu-GTVA introduces structured expert review, acceptance criteria, and quantitative agreement metrics to bring human judgment into the evaluation process before deployment. Designed for modularity and domain adaptability, the framework can be extended to other high-risk settings such as healthcare or critical infrastructure. Hu-GTVA offers a reproducible and human-centered pre-hoc RAG-LLM evaluation pipeline.

Place, publisher, year, edition, pages
Borås: RISE Research Institutes of Sweden, 2025. p. 30
Series
RISE Rapport
Keywords
Human oversight, oversight by design, ground truth, RAG, RAG LLM, RAG evaluation, RAGChecker, RAGAS, TruLens
Identifiers
urn:nbn:se:ri:diva-80077 (URN) 978-91-90109-25-0 (ISBN)
Available from: 2025-12-22 Created: 2025-12-22 Last updated: 2026-01-22 Bibliographically approved
Fahria, K. & Mowla, N. (2025). Testing AI in relation to Traditional Software Testing: A Comparative Overview. RISE Research Institutes of Sweden
2025 (English) Report (Other academic)
Place, publisher, year, edition, pages
RISE Research Institutes of Sweden, 2025. p. 13
Series
RISE Rapport ; 2025:40
Identifiers
urn:nbn:se:ri:diva-78272 (URN) 978-91-90036-27-3 (ISBN)
Available from: 2025-03-27 Created: 2025-03-27 Last updated: 2026-01-22 Bibliographically approved
Fahria, K., Kabir, F., Mowla, N. & Fakhrul Abedin, S. (2025). Towards Explainable Automotive Intrusion Detection: A Chunk-based Framework for CAN Traffic. Paper presented at Swedish National Computer Networking and Cloud Computing Workshop (SNCNW), arranged at University West in Trollhättan, June 10-11, 2025.
2025 (English) Conference paper, Published paper (Other academic)
Abstract [en]

In this work, we propose an explainable intrusion detection framework for Controller Area Network bus traffic using the ROAD dataset. By segmenting raw traffic into fixed-size chunks, we extract features that capture timing behavior, entropy, payload statistics, and CAN ID survival rates. We evaluate three classifiers, Decision Tree, Random Forest (with TreeSHAP), and Feedforward Neural Network (with KernelSHAP). The framework extracts multi-level features from CAN traffic, revealing through explainability that tree models detect protocol anomalies while neural networks capture signal-level distortions, underscoring the role of model choice in explainable IDS design.
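
The chunking step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the `Frame` type, the `chunk_features` function, the feature names, and the choice to drop a trailing partial chunk are all assumptions made for illustration. It covers only timing, CAN ID entropy, and a simple payload statistic; CAN ID survival rates are not attempted because the abstract does not define them.

```python
import math
from collections import Counter
from statistics import mean, pstdev
from typing import NamedTuple


class Frame(NamedTuple):
    """Hypothetical minimal CAN frame record (not the ROAD dataset schema)."""
    timestamp: float  # seconds
    can_id: int
    payload: bytes


def shannon_entropy(values) -> float:
    """Shannon entropy in bits of the empirical distribution of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return sum((c / total) * math.log2(total / c) for c in counts.values())


def chunk_features(frames, chunk_size):
    """Split frames into fixed-size chunks and compute per-chunk features.

    Assumption: a trailing chunk shorter than chunk_size is dropped.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    features = []
    for start in range(0, len(frames) - chunk_size + 1, chunk_size):
        chunk = frames[start:start + chunk_size]
        # Inter-arrival times between consecutive frames in the chunk.
        gaps = [b.timestamp - a.timestamp for a, b in zip(chunk, chunk[1:])]
        # All payload bytes in the chunk, flattened to integers 0..255.
        payload_bytes = [byte for f in chunk for byte in f.payload]
        features.append({
            "mean_gap": mean(gaps) if gaps else 0.0,
            "std_gap": pstdev(gaps) if gaps else 0.0,
            "id_entropy": shannon_entropy(f.can_id for f in chunk),
            "payload_mean": mean(payload_bytes) if payload_bytes else 0.0,
        })
    return features
```

Each resulting dictionary is one feature row per chunk, which could then be passed to a classifier of the kind the abstract evaluates.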

Identifiers
urn:nbn:se:ri:diva-78747 (URN)
Conference
Swedish National Computer Networking and Cloud Computing Workshop (SNCNW), arranged at University West in Trollhättan, June 10-11, 2025
Note

This work is supported by the EU project Citcom.AI, Vinnova INTERSTICE project (reference number: 2024-00661), and VINNOVA FFI Project MAGIC (reference number: 2024-03687). This work is also partially supported by KKS Research Profile NIIT, and Data Communication Security Laboratory at Ewha Womans University, South Korea.

Available from: 2025-08-15 Created: 2025-08-15 Last updated: 2026-01-22 Bibliographically approved
Mowla, N. (2024). From AI Act to Structured Testing of AI Systems. RISE Research Institutes of Sweden
2024 (English) Report (Other academic)
Abstract [en]

The Citcom.AI RISE testing approach is a step towards structured AI system evaluation and testing under the AI Act's regulatory framework. It establishes a definition of context in the scenario of different AI application domains, AI subfields, and use cases. In particular, a systematic evaluation, from defining the context and application to detailed risk assessments, linking each AI application to corresponding testing standards and methodologies, is presented. The approach translates the AI Act's high-level regulatory requirements for different AI system risk levels to appropriate technical testing techniques for achieving trustworthiness across different domains and AI subfields, promoting responsible AI deployment and fostering trust in AI applications.

Place, publisher, year, edition, pages
RISE Research Institutes of Sweden, 2024. p. 12
Series
RISE Rapport ; 2024:84
Keywords
Testing AI, AI Act, AI systems, Standards, Context
Identifiers
urn:nbn:se:ri:diva-76073 (URN) 9789189971462 (ISBN)
Available from: 2024-11-13 Created: 2024-11-13 Last updated: 2026-01-22 Bibliographically approved
Identifiers
ORCID iD: orcid.org/0009-0004-8393-1683