Publications

Conference paper

Jiang J, Leofante F, Rago A, Toni F et al., 2023,

Formalising the robustness of counterfactual explanations for neural networks

, 37th AAAI Conference on Artificial Intelligence (AAAI 2023), Publisher: Association for the Advancement of Artificial Intelligence, Pages: 14901-14909, ISSN: 2374-3468

The use of counterfactual explanations (CFXs) is an increasingly popular explanation strategy for machine learning models. However, recent studies have shown that these explanations may not be robust to changes in the underlying model (e.g., following retraining), which raises questions about their reliability in real-world applications. Existing attempts towards solving this problem are heuristic, and the robustness to model changes of the resulting CFXs is evaluated with only a small number of retrained models, failing to provide exhaustive guarantees. To remedy this, we propose the first notion to formally and deterministically assess the robustness (to model changes) of CFXs for neural networks, that we call ∆-robustness. We introduce an abstraction framework based on interval neural networks to verify the ∆-robustness of CFXs against a possibly infinite set of changes to the model parameters, i.e., weights and biases. We then demonstrate the utility of this approach in two distinct ways. First, we analyse the ∆-robustness of a number of CFX generation methods from the literature and show that they unanimously host significant deficiencies in this regard. Second, we demonstrate how embedding ∆-robustness within existing methods can provide CFXs which are provably robust.

Conference paper

Potyka N, Yin X, Toni F, 2023,

Explaining random forests using bipolar argumentation and Markov networks

, AAAI 23, Pages: 9458-9460, ISSN: 2159-5399

Random forests are decision tree ensembles that can be used to solve a variety of machine learning problems. However, as the number of trees and their individual size can be large, their decision making process is often incomprehensible. We show that their decision process can be naturally represented as an argumentation problem, which allows creating global explanations via argumentative reasoning. We generalize sufficient and necessary argumentative explanations using a Markov network encoding, discuss the relevance of these explanations and establish relationships to families of abductive explanations from the literature. As the complexity of the explanation problems is high, we present an efficient approximation algorithm with probabilistic approximation guarantees.

Conference paper

Nguyen H-T, Goebel R, Toni F, Stathis K, Satoh K et al., 2023,

How well do SOTA legal reasoning models support abductive reasoning?

, Logic Programming and Legal Reasoning Workshop@ICLP2023

We examine how well the state-of-the-art (SOTA) models used in legal reasoning support abductive reasoning tasks. Abductive reasoning is a form of logical inference in which a hypothesis is formulated from a set of observations, and that hypothesis is used to explain the observations. The ability to formulate such hypotheses is important for lawyers and legal scholars as it helps them articulate logical arguments, interpret laws, and develop legal theories. Our motivation is to consider the belief that deep learning models, especially large language models (LLMs), will soon replace lawyers because they perform well on tasks related to legal text processing. But to do so, we believe, requires some form of abductive hypothesis formation. In other words, while LLMs become more popular and powerful, we want to investigate their capacity for abductive reasoning. To pursue this goal, we start by building a logic-augmented dataset for abductive reasoning with 498,697 samples and then use it to evaluate the performance of a SOTA model in the legal field. Our experimental results show that although these models can perform well on tasks related to some aspects of legal text processing, they still fall short in supporting abductive reasoning tasks.

Journal article

Rago A, Russo F, Albini E, Toni F, Baroni P et al., 2023,

Explaining classifiers’ outputs with causal models and argumentation

, Journal of Applied Logics, Vol: 10, Pages: 421-449, ISSN: 2631-9810

We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for models’ outputs. The conceptualisation is based on reinterpreting properties of semantics of AFs as explanation moulds, which are means for characterising argumentative relations. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement in bipolar AFs, showing how the extracted bipolar AFs may be used as relation-based explanations for the outputs of causal models. We then evaluate our method empirically when the causal models represent (Bayesian and neural network) machine learning models for classification. The results show advantages over a popular approach from the literature, both in highlighting specific relationships between feature and classification variables and in generating counterfactual explanations with respect to a commonly used metric.

Journal article

Albini E, Rago A, Baroni P, Toni F et al., 2023,

Achieving descriptive accuracy in explanations via argumentation: the case of probabilistic classifiers

, Frontiers in Artificial Intelligence, Vol: 6, Pages: 1-18, ISSN: 2624-8212

The pursuit of trust in and fairness of AI systems in order to enable human-centric goals has been gathering pace of late, often supported by the use of explanations for the outputs of these systems. Several properties of explanations have been highlighted as critical for achieving trustworthy and fair AI systems, but one that has thus far been overlooked is that of descriptive accuracy (DA), i.e., that the explanation contents are in correspondence with the internal working of the explained system. Indeed, the violation of this core property would lead to the paradoxical situation of systems producing explanations which are not suitably related to how the system actually works: clearly this may hinder user trust. Further, if explanations violate DA then they can be deceitful, resulting in an unfair behavior toward the users. Crucial as the DA property appears to be, it has been somehow overlooked in the XAI literature to date. To address this problem, we consider the questions of formalizing DA and of analyzing its satisfaction by explanation methods. We provide formal definitions of naive, structural and dialectical DA, using the family of probabilistic classifiers as the context for our analysis. We evaluate the satisfaction of our given notions of DA by several explanation methods, amounting to two popular feature-attribution methods from the literature, variants thereof and a novel form of explanation that we propose. We conduct experiments with a varied selection of concrete probabilistic classifiers and highlight the importance, with a user study, of our most demanding notion of dialectical DA, which our novel method satisfies by design and others may violate. We thus demonstrate how DA could be a critical component in achieving trustworthy and fair systems, in line with the principles of human-centric AI.

Conference paper

Nguyen HT, Goebel R, Toni F, Stathis K, Satoh K et al., 2023,

LawGiBa – Combining GPT, knowledge bases, and logic programming in a legal assistance system

, JURIX 2023: The Thirty-sixth Annual Conference, Maastricht, the Netherlands, 18–20 December 2023, Publisher: IOS Press, Pages: 371-374, ISSN: 0922-6389

We present LawGiBa, a proof-of-concept demonstration system for legal assistance that combines GPT, legal knowledge bases, and Prolog’s logic programming structure to provide explanations for legal queries. This novel combination effectively and feasibly addresses the hallucination issue of large language models (LLMs) in critical domains, such as law. Through this system, we demonstrate how incorporating a legal knowledge base and logical reasoning can enhance the accuracy and reliability of legal advice provided by AI models like GPT. Though our work is primarily a demonstration, it provides a framework to explore how knowledge bases and logic programming structures can be further integrated with generative AI systems, to achieve improved results across various natural languages and legal systems.

Conference paper

Jiang J, Lan J, Leofante F, Rago A, Toni F et al., 2023,

Provably Robust and Plausible Counterfactual Explanations for Neural Networks via Robust Optimisation

, Publisher: PMLR, Pages: 582-597

Conference paper

Albini E, Rago A, Baroni P, Toni F et al., 2022,

Descriptive accuracy in explanations: the case of probabilistic classifiers

, 15th International Conference on Scalable Uncertainty Management (SUM 2022), Publisher: Springer, Pages: 279-294

A user receiving an explanation for outcomes produced by an artificially intelligent system expects that it satisfies the key property of descriptive accuracy (DA), i.e. that the explanation contents are in correspondence with the internal working of the system. Crucial as this property appears to be, it has been somehow overlooked in the XAI literature to date. To address this problem, we consider the questions of formalising DA and of analysing its satisfaction by explanation methods. We provide formal definitions of naive, structural and dialectical DA, using the family of probabilistic classifiers as the context for our analysis. We evaluate the satisfaction of our given notions of DA by several explanation methods, amounting to two popular feature-attribution methods from the literature and a novel form of explanation that we propose, and complement our analysis with experiments carried out on a varied selection of concrete probabilistic classifiers.

Conference paper

Proietti M, Toni F, 2022,

Learning assumption-based argumentation frameworks

, 31st International Conference on Inductive Logic Programming (ILP 2022)

We propose a novel approach to logic-based learning which generates assumption-based argumentation (ABA) frameworks from positive and negative examples, using a given background knowledge. These ABA frameworks can be mapped onto logic programs with negation as failure that may be non-stratified. Whereas existing argumentation-based methods learn exceptions to general rules by interpreting the exceptions as rebuttal attacks, our approach interprets them as undercutting attacks. Our learning technique is based on the use of transformation rules, including some adapted from logic program transformation rules (notably folding) as well as others, such as rote learning and assumption introduction. We present a general strategy that applies the transformation rules in a suitable order to learn stratified frameworks, and we also propose a variant that handles the non-stratified case. We illustrate the benefits of our approach with a number of examples, which show that, on one hand, we are able to easily reconstruct other logic-based learning approaches and, on the other hand, we can work out in a very simple and natural way problems that seem to be hard for existing techniques.

Conference paper

Potyka N, Yin X, Toni F, 2022,

On the tradeoff between correctness and completeness in argumentative explainable AI

, 1st International Workshop on Argumentation for eXplainable AI, Publisher: CEUR Workshop Proceedings, Pages: 1-8, ISSN: 1613-0073

Explainable AI aims at making the decisions of autonomous systems human-understandable. Argumentation frameworks are a natural tool for this purpose. Among them, bipolar abstract argumentation frameworks seem well suited to explain the effect of features on a classification decision and their formal properties can potentially be used to derive formal guarantees for explanations. Two particular interesting properties are correctness (if the explanation says that X affects Y, then X affects Y) and completeness (if X affects Y, then the explanation says that X affects Y). The reinforcement property of bipolar argumentation frameworks has been used as a natural correctness counterpart in previous work. Applied to the classification context, it basically states that attacking features should decrease and supporting features should increase the confidence of a classifier. In this short discussion paper, we revisit this idea, discuss potential limitations when considering reinforcement without a corresponding completeness property and how these limitations can potentially be overcome.

Conference paper

Rago A, Baroni P, Toni F, 2022,

Explaining causal models with argumentation: the case of bi-variate reinforcement

, 19th International Conference on Principles of Knowledge Representation and Reasoning (KR 2022), Publisher: IJCAI Organisation, Pages: 505-509, ISSN: 2334-1033

Causal models are playing an increasingly important role in machine learning, particularly in the realm of explainable AI. We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for the models’ outputs. The conceptualisation is based on reinterpreting desirable properties of semantics of AFs as explanation moulds, which are means for characterising the relations in the causal model argumentatively. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement as an explanation mould to forge bipolar AFs as explanations for the outputs of causal models. We perform a theoretical evaluation of these argumentative explanations, examining whether they satisfy a range of desirable explanatory and argumentative properties.

Conference paper

Jiang J, Rago A, Toni F, 2022,

Should counterfactual explanations always be data instances?

, XLoKR 2022: The Third Workshop on Explainable Logic-Based Knowledge Representation

Counterfactual explanations (CEs) are an increasingly popular way of explaining machine learning classifiers. Predominantly, they amount to data instances pointing to potential changes to the inputs that would lead to alternative outputs. In this position paper we question the widespread assumption that CEs should always be data instances, and argue instead that in some cases they may be better understood in terms of special types of relations between input features and classification variables. We illustrate how a special type of these relations, amounting to critical influences, can characterise and guide the search for data instances deemed suitable as CEs. These relations also provide compact indications of which input features - rather than their specific values in data instances - have counterfactual value.

Conference paper

Rago A, Russo F, Albini E, Baroni P, Toni F et al., 2022,

Forging argumentative explanations from causal models

, Proceedings of the 5th Workshop on Advances in Argumentation in Artificial Intelligence 2021 co-located with the 20th International Conference of the Italian Association for Artificial Intelligence (AIxIA 2021), Publisher: CEUR Workshop Proceedings, Pages: 1-15, ISSN: 1613-0073

We introduce a conceptualisation for generating argumentation frameworks (AFs) from causal models for the purpose of forging explanations for models' outputs. The conceptualisation is based on reinterpreting properties of semantics of AFs as explanation moulds, which are means for characterising argumentative relations. We demonstrate our methodology by reinterpreting the property of bi-variate reinforcement in bipolar AFs, showing how the extracted bipolar AFs may be used as relation-based explanations for the outputs of causal models.

Conference paper

Sukpanichnant P, Rago A, Lertvittayakumjorn P, Toni F et al., 2021,

LRP-based argumentative explanations for neural networks

, XAI.it 2021 - Italian Workshop on Explainable Artificial Intelligence, Pages: 71-84, ISSN: 1613-0073

In recent years, there have been many attempts to combine XAI with the field of symbolic AI in order to generate explanations for neural networks that are more interpretable and better align with human reasoning, with one prominent candidate for this synergy being the sub-field of computational argumentation. One method is to represent neural networks with quantitative bipolar argumentation frameworks (QBAFs) equipped with a particular semantics. The resulting QBAF can then be viewed as an explanation for the associated neural network. In this paper, we explore a novel LRP-based semantics under a new QBAF variant, namely neural QBAFs (nQBAFs). Since an nQBAF of a neural network is typically large, the nQBAF must be simplified before being used as an explanation. Our empirical evaluation indicates that the manner of this simplification is all important for the quality of the resulting explanation.

Journal article

Rago A, Cocarascu O, Bechlivanidis C, Lagnado D, Toni F et al., 2021,

Argumentative explanations for interactive recommendations

, Artificial Intelligence, Vol: 296, Pages: 1-22, ISSN: 0004-3702

A significant challenge for recommender systems (RSs), and in fact for AI systems in general, is the systematic definition of explanations for outputs in such a way that both the explanations and the systems themselves are able to adapt to their human users' needs. In this paper we propose an RS hosting a vast repertoire of explanations, which are customisable to users in their content and format, and thus able to adapt to users' explanatory requirements, while being reasonably effective (proven empirically). Our RS is built on a graphical chassis, allowing the extraction of argumentation scaffolding, from which diverse and varied argumentative explanations for recommendations can be obtained. These recommendations are interactive because they can be questioned by users and they support adaptive feedback mechanisms designed to allow the RS to self-improve (proven theoretically). Finally, we undertake user studies in which we vary the characteristics of the argumentative explanations, showing users' general preferences for more information, but also that their tastes are diverse, thus highlighting the need for our adaptable RS.
