PhD: A Generic and Model-Agnostic Evaluation Framework for Decision-Making Tasks (e.g. Task-Oriented Dialogue) in Brittany, France

March 27, 2024

We are looking for an enthusiastic and talented student, interested in novel technologies and their evolution, to work on a generic and model-agnostic evaluation framework for decision-making tasks such as task-oriented dialogue.

The candidate will join the NADIA (Natural Dialogue Interaction) team in the R&D division of Orange in Lannion, Brittany, France (a charming tourist town by the sea: http://www.bretagne-cotedegranitrose.com/ , https://www.lannion.bzh/). The PhD student will also enrol at the University of Marseille and will be attached to the LIS laboratory (Laboratoire d’Informatique et Systèmes).

The candidate must hold a Master’s degree in Computer Science, Mathematics, Engineering or a related field, with a strong mathematical background (optimisation, statistics and probability theory). Programming skills are absolutely necessary. An understanding of autoregressive generative models is very important.

Experience in Machine Learning, Deep Learning and AI is desired, as well as mastery of deep learning libraries such as Torch, PyTorch and TensorFlow. Fluent English is required; knowledge of French is not essential.

To apply please go to: https://orange.jobs/jobs/v3/offers/133680?lang=en

PhD: A Generic and Model-Agnostic Evaluation Framework for Decision-Making Tasks (e.g. Task-Oriented Dialogue)

Abstract:

Orange has implemented various bot solutions and records large amounts of human-machine and human-to-human conversations for customer care. Often, these conversations are not rigorously evaluated.

Large language models (LLMs) are a breakthrough in many natural language processing (NLP) tasks, including the development of agents capable of solving complex tasks [6]. Moreover, chatbot development has become accessible to everyone, and solutions are likely to proliferate in the near future. The boundaries between NLP tasks and between domains (e.g. tourism, restaurant, retail, technical support, etc.) are blurring.

Evaluating these solutions is becoming a real need: the scope of evaluation must be broadened, and evaluation methods must be made transferable across tasks and domains.

Previous work studied the correlation between objective and subjective metrics (indicators) for evaluating conversations [3] and text generation [6]. Other work predicted the quality of the conversation [1]. Model-agnostic scores are proposed in [4] to compare the behaviour of two dialogue systems. Recently, the DSTC (Dialogue System Technology Challenge) has focused on the evaluation of dialogues [5].

Inspired by game theory [7], interpretability [2] and self-driving cars [8], we would like to infer the strategy followed by the dialogue system [10] in order to evaluate it from distinct perspectives. We will work on both public datasets (WebShop, ALFWorld,…) and private Orange datasets (Technical Assistance, Commercial Bots, etc.).

References:

[1] Rojas-Barahona, Lina M. (2020). Is the User Enjoying the Conversation? A Case Study on the Impact on the Reward Function. In Proceedings of the NeurIPS workshop HLDS2020.

[2] Michele Cafagna, Lina M. Rojas-Barahona, Kees van Deemter, Albert Gatt. Interpreting Vision and Language generative models with semantic visual priors. In Frontiers in AI, special issue: Explainable AI in Natural Language.

[3] Marilyn A. Walker, Diane J. Litman, Candace A. Kamm, and Alicia Abella. 1997a. PARADISE: A framework for evaluating spoken dialogue agents. ACL and EACL, pages 271–280, Madrid, Spain.

[4] Ultes, Stefan, and Wolfgang Maier. “Similarity scoring for dialogue behaviour comparison.” SIGDIAL. 2020.

[5] Mehri, Shikib, et al. “Interactive evaluation of dialog track at DSTC9.” arXiv preprint arXiv:2207.14403 (2022).

[6] Shunyu Yao, Jeffrey Zhao, Dian Yu, Nan Du, Izhak Shafran, Karthik Narasimhan, and Yuan Cao. 2022. ReAct: Synergizing reasoning and acting in language models. arXiv preprint arXiv:2210.03629.

[7] Yu, Xiaopeng, et al. “Model-based opponent modeling.” Advances in Neural Information Processing Systems 35 (2022): 28208-28221.

[8] Teng, Siyu, et al. “Motion planning for autonomous driving: The state of the art and future perspectives.” IEEE Transactions on Intelligent Vehicles (2023).

Profile:

– Experience in Machine Learning and AI, especially deep learning and Generative AI.

– You have a strong command of mathematics (optimisation, statistics, probability, etc.).

– Programming skills are absolutely necessary.

– Fluent English level is required, while knowing French is not essential.

– You are curious about innovation and able to follow advances in the state of the art.

– You have good teamwork skills and are able to work on multidisciplinary projects, collaborating with others while working autonomously on your own tasks.

– You master deep learning libraries such as Torch, PyTorch and TensorFlow.

– You have good communication and writing skills: you are able to write scientific papers and reports, and to prepare oral presentations.

Training (Master’s in Computer Science, Mathematics, Statistics, Engineering, …):

The candidate must hold a Master’s degree in Computer Science, Mathematics, Engineering or a related field, with a strong mathematical background (optimisation, statistics and probability theory).

Experience (internships, …):

A first experience with deep learning approaches, for example during an internship, is desirable.

Orange is a major player in innovation in the information and communication technology sector. Innovation is a key growth driver for the Orange group.

In the Technology and Global Innovation division, whose ambition is to drive innovation at Orange and reinforce its technological leadership, you will work in the NADIA (Natural Dialogue Interaction) team. The team develops dialogue systems and pursues research on natural-language dialogue through machine learning: supervised, unsupervised and semi-supervised learning (including, of course, generative AI) and reinforcement learning.

Lina Rojas-Barahona
INNOV/DATA-AI/AITT/NADIA
Senior Research Scientist AI&Dialogue
lina.rojas@orange.com
Google Scholar: https://scholar.google.com/citations?user=n42dh0cAAAAJ&hl=en&oi=ao