Can Machines Read Between the Lines?
Associate Professor: INOUE Naoya
Natural Language Processing, Language Understanding, Commonsense Reasoning
Language Model, Deep Learning, AI, Explainability, Data Science, Inference, Argumentation Analysis
Skills and background we are looking for in prospective students
Required: Passion for creating machines that can understand human language. Preferred: (a) basic knowledge of linear algebra, probability, statistics, and algorithms, and (b) experience with programming and Linux.
What you can expect to learn in this laboratory
In the lab, you will actively discuss ideas with lab members, come up with innovative ideas of your own, implement them as computational models, and evaluate them quantitatively. After a number of trials, you will obtain interesting insights. We then write a paper and submit it to an academic conference, polishing the idea further while communicating with researchers worldwide. Through these activities, you will acquire many general-purpose skills in addition to expertise in natural language processing research and related areas. You will learn how to think critically, how to dive into unknown fields, how to plan, how to present your work, how to program, and how to work as a team.
【Job category of graduates】Academia, Information Technology
We study how to create machines that can understand human language. Our research field is called Natural Language Processing (NLP), and it covers a wide variety of research topics. Our focus is to equip machines with the ability to “infer”: to make implicit things explicit and to read between the lines. Examples of our research topics include the following:
1. Reading Comprehension Models that Can Think Logically and Concisely Explain Their Own Decisions
Recent advances in Deep Learning have had a large impact on NLP, resulting in more accurate NLP systems. However, for these systems to be more robust to unseen input texts, it is crucial that they arrive at an answer through logical reasoning. Furthermore, when the systems are deployed in the real world, they are required to explain their own decisions as well as answer the given questions (explainability). How can we design a computational model that does this? How can we train such a model with a machine learning algorithm? How do we imitate human reasoning? We proposed a self-supervised learning algorithm for creating such an explainable reading comprehension model.
Figure: Self-training of an explainable reading comprehension model 
2. Designing and Creating Large-scale Benchmark Datasets for Natural Language Understanding
To objectively measure research progress, we need a quantitative evaluation measure that indicates the quality of our models. Under what conditions can we say that a model has successfully read between the lines? How can we measure this quantitatively? Can we create such a benchmark dataset at scale? We proposed a crowdsourcing approach to this problem.
3. Argumentation Analysis and Assessment
When we write argumentative texts (e.g., essays), we usually leave implicit the background knowledge that we expect the reader to have. For machines to fully understand such texts, it is crucial to fill in this background knowledge automatically. We study the problem of analyzing argumentative texts as one important application of our technology. The research questions here include: How can we recognize the implicit logical structure of arguments? How can we assess argumentative texts and recognize their weaknesses? How can we create a benchmark dataset for argument analysis?
In addition to these research topics, we are also studying “inference” in other contexts: (a) multi-modal NLP systems that use non-textual information such as vision, sound, and physics, (b) story understanding, and (c) involving humans to obtain more robust AI models (human-in-the-loop). Our lab started on April 1, 2022, so everything is flexible and being built from scratch. Please join us if one of these topics sounds interesting to you. I need your help to kick-start our lab!
- Naoya Inoue, Harsh Trivedi, Steven Sinha, Niranjan Balasubramanian and Kentaro Inui. Summarize-then-Answer: Generating Concise Explanations for Multi-hop Reading Comprehension. In Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP2021), pp. 6064-6080, 2021.
- Naoya Inoue, Pontus Stenetorp and Kentaro Inui. R4C: A Benchmark for Evaluating RC Systems to Get the Right Answer for the Right Reason. In Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (ACL2020), pp. 6740-6750, 2020.
- Farjana Sultana Mim, Naoya Inoue, Shoichi Naito, Keshav Singh and Kentaro Inui. LPAttack: A Feasible Annotation Scheme for Capturing Logic Pattern of Attacks in Arguments. In Proceedings of the 13th Language Resources and Evaluation Conference (LREC2022), pp. 2446-2459, June 2022.
CPU/GPU Cluster Machines
I respect you and will do the best I can to bring out the best in you. I will help you become an independent researcher: someone who can plan a research project that is truly enjoyable to you and make progress on it yourself. I also encourage you to present your work at academic conferences and to collaborate with researchers worldwide so that you become a global researcher. Our lab will have a wide variety of study and reading groups as well as weekly one-on-one meetings. We communicate in English so that our lab is a global environment.