Interactive Explainability of Machine Learning applied to language tasks
==========================================================================
**Project context**: The thesis takes place within the Descartes project (www.cnrsatcreate.cnrs.fr/descartes/), a large France-Singapore collaboration on applying AI to urban systems. The project will generate large amounts of data about artificial systems deployed in the wild, part of which will be textual (expert reports, user reactions, news coverage, social media conversations). Natural language processing (NLP) models can help access that voluminous information, but operators, policy makers and public institutions have a strong need to understand the reasons behind the models’ behaviours and the information they extract, so that they can evaluate potential issues (accuracy, fairness, biases). This thesis will investigate methods designed to explain machine learning systems typically used in NLP, while integrating an interactive process with the users of the system.
**Thesis subject**: Modern machine-learning-based AI systems, while achieving good results on many tasks, still appear as “black-box” models, in which it is difficult to trace the path from the input (a text, an image, a set of sensor measurements) to the decision (classification of a document, an image, a situation). Explainability raises two distinct problems: (1) what is a good explanation, and specifically what is a good explanation in the context of textual models? and (2) how can existing explanation methods scale to the kind of models used in NLP tasks?
Regarding (1), existing methods for image classification or tabular data tend to rely on extracting a set of pixels or features that is sufficient to generate the prediction, or that increases its probability. This is less straightforward for textual input, which consists of words whose meanings are interrelated in a given context (for instance, “good” in a review could indicate that the review is positive … unless it is preceded by “not”). The first problem of this thesis will therefore be to provide humanly acceptable explanations of simple text classifiers, such as those foreseen for the detection tasks in the dedicated sub-project of Descartes.
Regarding (2), modern NLP models are based on very large and complex architectures, such as the transformer family. Logically sufficient or causally satisfying explanations are difficult to obtain in such cases, as both kinds of methods suffer from scalability problems. We will therefore explore heuristics, based on our solution to the first problem, that guide an interactive procedure between the explainee (the person requesting the explanation) and the ML system whose predictions should be explained. We will evaluate the procedure with the users targeted by the project’s use cases. Brian Lim from NUS Singapore will help design the validation experiments.
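To make the “good” / “not good” example concrete, here is a minimal sketch of a leave-one-out (occlusion) attribution applied to a toy classifier. Everything in it is an assumption made for illustration: the word lists, the `score` function and the `occlusion_attributions` helper are hypothetical and are not models or methods from the Descartes project. It only shows why a per-word feature attribution can be misleading when word meanings depend on context.

```python
# Toy illustration only: a hypothetical lexicon-based scorer,
# not a classifier used in the project.
POSITIVE = {"good", "great"}
NEGATIVE = {"bad", "poor"}
NEGATORS = {"not", "never"}


def score(tokens):
    """Sentiment score; a negator flips the polarity of the word right after it."""
    total = 0
    negate = False
    for tok in tokens:
        if tok in NEGATORS:
            negate = True
            continue
        polarity = (tok in POSITIVE) - (tok in NEGATIVE)
        total += -polarity if negate else polarity
        negate = False  # negation only reaches the immediately following word
    return total


def occlusion_attributions(tokens):
    """Leave-one-out attribution: change in score when each token is removed."""
    base = score(tokens)
    return [
        (tok, base - score(tokens[:i] + tokens[i + 1:]))
        for i, tok in enumerate(tokens)
    ]


if __name__ == "__main__":
    for text in ("the food was good", "the food was not good"):
        print(text, "->", occlusion_attributions(text.split()))
```

In this sketch, “good” receives a positive attribution in the first sentence and a negative one in the second, although the word itself is unchanged: the attribution of a word depends on its neighbours, which is one reason why a set of isolated words may not be an acceptable explanation for a text classifier.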
**Competences for the student**: A background in Computer Science and/or Machine Learning. Familiarity, or a willingness to acquire familiarity, with both model-based and model-agnostic explanation paradigms that use either logical or statistical methods. Familiarity with NLP / dialogue would be a plus. Given the nature of the project, the student should be open to working in a cross-disciplinary environment and have good English communication skills.
**Supervision**: The thesis will take place within the France-Singapore collaboration, with advisors from both sides. The student will be registered at the University of Toulouse and be part of the IRIT lab, but is expected to spend a good part of the thesis in Singapore at the partner lab, with funding provided by the Descartes project.
The thesis will be supervised on the French side by Nicholas Asher and Philippe Muller, both NLP experts on text and conversation analysis, and co-advised by Nancy Chen from A*STAR, an expert in NLP and dialogue, and Brian Lim at the National University of Singapore (NUS), an expert on human-computer interaction. The French advisors will also spend time at NUS during the thesis.
Contact: nicholas.asher@irit.fr, philippe.muller@irit.fr, nfychen@i2r.a-star.edu.sg
References:
- Descartes project: www.cnrsatcreate.cnrs.fr/descartes/
- Marina Danilevsky, Kun Qian, Ranit Aharonov, Yannis Katsis, Ban Kawas, Prithviraj Sen. A Survey of the State of Explainable AI for Natural Language Processing. AACL-IJCNLP 2020. aclanthology.org/2020.aacl-main.46/
- Tim Miller. Explanation in artificial intelligence: Insights from the social sciences. Artificial Intelligence 267:1-38 (2019).
- Christoph Molnar. Interpretable Machine Learning. christophm.github.io/interpretable-ml-book/
- Alexey Ignatiev, Nina Narodytska, Joao Marques-Silva. On Relating Explanations and Adversarial Examples. NeurIPS 2019, 15857–15867.
- A. Shrikumar, P. Greenside, A. Kundaje. Learning important features through propagating activation differences. Proceedings of the 34th International Conference on Machine Learning, Volume 70, 3145–3153. JMLR.org, 2017.
- M. T. Ribeiro, S. Singh, C. Guestrin. Why should I trust you?: Explaining the predictions of any classifier. ACM SIGKDD 2016.