Understanding Language by Computer
Associate Professor: SHIRAI Kiyoaki
Natural Language Processing, Machine Learning, Artificial Intelligence
Statistical Natural Language Processing, Support for Web Access, NLP Application
Skills and background we are looking for in prospective students
Interest in human language, a desire to learn natural language processing, and fundamental knowledge of algorithms and automata
What you can expect to learn in this laboratory
How to find new problems in natural language processing through a comprehensive survey of previous work. How to explore solutions to your own research questions by learning the necessary fundamental techniques and methods of natural language processing. Writing and presentation skills for communicating your research outcomes, developed by publishing papers at domestic and international conferences and giving presentations both in the university and at conferences.
【Job category of graduates】 Information Technology
Natural Language Processing (NLP) is a technology that uses computers to understand the languages we use every day, process huge amounts of text, and provide new services. NLP has great potential to enrich our lives, but it is difficult for a computer to understand human language. Our laboratory tackles such difficult problems.
The major research themes of our laboratory are as follows.
(1) Natural language analysis based on a large corpus
“Natural language analysis” is the process of understanding the meaning of a sentence. In general, a huge amount of knowledge and rules is required to understand sentences, but it is difficult to prepare such knowledge exhaustively. We study techniques that acquire statistical information from a large collection of texts (a corpus) and use it for accurate natural language analysis.
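As a rough illustration of what "acquiring statistical information from a corpus" can mean, the sketch below estimates word-to-word transition probabilities by relative frequency. It is only a minimal example under stated assumptions: the three-sentence `toy_corpus` and the function name `bigram_probabilities` are made up for illustration, and the laboratory's actual methods are not described in this text.

```python
from collections import Counter


def bigram_probabilities(sentences):
    """Estimate P(next_word | word) by relative frequency over tokenized sentences."""
    context_counts = Counter()  # how often each word appears with a following word
    bigram_counts = Counter()   # how often each (word, next_word) pair appears
    for tokens in sentences:
        for first, second in zip(tokens, tokens[1:]):
            context_counts[first] += 1
            bigram_counts[(first, second)] += 1
    return {
        pair: count / context_counts[pair[0]]
        for pair, count in bigram_counts.items()
    }


# Hypothetical toy corpus; a real corpus would contain far more text.
toy_corpus = [
    ["i", "eat", "rice"],
    ["i", "eat", "bread"],
    ["i", "drink", "tea"],
]

probs = bigram_probabilities(toy_corpus)
for pair, p in sorted(probs.items()):
    print(pair, round(p, 3))
```

With more text, statistics of this kind can stand in for hand-written rules, for example by preferring the more frequent of two competing readings of a sentence.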
(2) Support for Web Access
This theme covers techniques that help people search the Web. One example is an interactive question answering (QA) system. A QA system accepts a question in natural language and searches the Web for its answer. We aim to develop a QA system that can interact with the user in order to find the answer precisely. For example, when a user asks an ambiguous question such as “What is the country that won the World Cup?”, the QA system asks the user “Which sport do you mean?” or “When was it held?” to arrive at the correct answer.
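The World Cup dialogue can be sketched as a loop that keeps asking about whichever attribute still separates the candidate answers. This is a simplified illustration, not the laboratory's system: it replaces Web search with a small hand-written table (`WORLD_CUP_RESULTS`), and the function `next_step` and the attribute names are assumptions introduced here.

```python
# Hypothetical in-memory table standing in for answers retrieved from the Web.
WORLD_CUP_RESULTS = [
    {"sport": "football", "year": 2014, "winner": "Germany"},
    {"sport": "football", "year": 2018, "winner": "France"},
    {"sport": "rugby", "year": 2015, "winner": "New Zealand"},
    {"sport": "rugby", "year": 2019, "winner": "South Africa"},
]

# Attributes the system may ask about, in order of preference.
ATTRIBUTES = ("sport", "year")


def next_step(candidates, known):
    """Return ("answer", winner) if the answer is unique, else ("ask", attribute)."""
    remaining = [
        c for c in candidates
        if all(c[key] == value for key, value in known.items())
    ]
    winners = {c["winner"] for c in remaining}
    if len(winners) == 1:
        return ("answer", winners.pop())
    for attribute in ATTRIBUTES:
        values = {c[attribute] for c in remaining}
        if attribute not in known and len(values) > 1:
            return ("ask", attribute)  # this attribute still distinguishes candidates
    return ("answer", None)  # nothing matches, or nothing left to ask


print(next_step(WORLD_CUP_RESULTS, {}))
print(next_step(WORLD_CUP_RESULTS, {"sport": "football"}))
print(next_step(WORLD_CUP_RESULTS, {"sport": "football", "year": 2018}))
```

Each "ask" result corresponds to a clarifying question such as “Which sport do you mean?”, and the user's reply is added to `known` before the next call.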
(3) Opinion Mining
Nowadays, people often post reviews of products or services on Web media such as blogs and social networks. The opinions of others are useful for people who want to buy a new product. In opinion mining, for a given target (a product or service), we analyze users’ reviews, judge whether each user expresses a positive or negative opinion, and reveal the reputation of the target.
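The pipeline described here (judge each review, then aggregate) can be illustrated with a deliberately simple word-counting classifier. This is a sketch under strong assumptions: the word lists `POSITIVE_WORDS` and `NEGATIVE_WORDS` and the sample reviews are invented for illustration, and counting lexicon words is a stand-in for the statistical classifiers typically used in practice.

```python
from collections import Counter

# Hypothetical, tiny polarity lexicon for illustration only.
POSITIVE_WORDS = {"good", "great", "excellent", "love"}
NEGATIVE_WORDS = {"bad", "poor", "terrible", "hate"}


def review_polarity(review):
    """Label one review as positive, negative, or neutral by counting lexicon words."""
    tokens = [token.strip(".,!?") for token in review.lower().split()]
    score = sum(t in POSITIVE_WORDS for t in tokens) - sum(t in NEGATIVE_WORDS for t in tokens)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def reputation(reviews):
    """Summarize a target's reputation as counts of each polarity label."""
    return Counter(review_polarity(review) for review in reviews)


sample_reviews = [
    "The battery life is great!",
    "Terrible screen and poor sound.",
    "It arrived on Monday.",
]
summary = reputation(sample_reviews)
print(summary)
```

A word-counting approach like this misses negation and sarcasm, which is one reason opinion mining remains a research topic.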
(4) NLP Applications
We develop a variety of NLP application systems. One example is a free conversation system, a computer system that can engage in open-ended conversation or chat with a user. Another is a system that visualizes cooking recipes to help beginners understand complicated cooking actions.
- Aye Aye Mar, Kiyoaki Shirai. Automatic Construction of Annotated Corpus with Implicit Aspect. The Thirteenth Edition of the Language Resources and Evaluation Conference, pp. 6985-6991, 2022.
- Shangzhuang Han, Kiyoaki Shirai. Unsupervised Word Sense Disambiguation based on Word Embedding and Collocation. The 13th International Conference on Agents and Artificial Intelligence, Volume 2, pp. 1218-1225, 2021.
- Kiyoaki Shirai, Yunmin Xiang. Over-sampling Methods for Polarity Classification of Imbalanced Microblog Texts. The 33rd Pacific Asia Conference on Language, Information and Computation, pp. 248-256, 2019.
We carry out several activities in our laboratory to enhance students’ ability to find and solve new problems, as well as their presentation and communication skills. First, we hold a regular seminar to study previous work, in which a student introduces a related paper to the other laboratory members. We also hold a regular seminar to discuss the students’ own research. In addition, students frequently meet with their supervisor to discuss their research progress and future direction.