SCIDOCA 2024

Aims and Scope

The recent proliferation of scientific papers and technical documents has become an obstacle to the efficient acquisition of new information in various fields. It is almost impossible for individual researchers to check and read all related documents, and even retrieving relevant documents is becoming increasingly difficult. This workshop gathers researchers and experts who work on scientific document analysis from various perspectives, and invites technical paper presentations and system demonstrations that cover any aspect of scientific document analysis.

Important Dates (Time zone: AOE (Anywhere on Earth))

Submission Deadline for Long Papers: ~~January 31, 2024~~ February 29, 2024 (Firm deadline)
Submission Deadline for Short Papers: ~~January 31, 2024~~ February 29, 2024 (Firm deadline)
Notification of Acceptance (Long+Short Papers): March 15, 2024
Camera-Ready due (Long+Short Papers): March 22, 2024
Hard deadline for LNAI proceedings: March 25, 2024

Registration

Please register for the workshop at the registration page of JSAI International Symposia on AI 2024.

Program

Day 1 (May 28, 2024) (Place: ROOM S (52))

14:20: Workshop opening
14:30-18:00: Session 1 (SC: Dr. Vu Tran)

14:30: Texylon: Dataset of Log-to-Description and Description-to-Log Generation for Text Analytics Tools
Masato Nakata, Kosuke Morita, Hirotaka Kameko, and Shinsuke Mori

15:00: Tokenization Adaptation for Robust Chemical Named Entity Recognition with Limited Domain-Specific Text
Tuan An Dao, Hiroki Teranishi, Yuji Matsumoto, Akiko Aizawa

Coffee break

15:30: A Practical and Customizable System for Named Entity Recognition and Relation Extraction in Materials Science Publications
Van-Thuy Phi and Yuji Matsumoto

16:00: From Vietnamese to English: Advancing VQA with Cross-Linguistic Mapping (online)
Tung Le, Dung Vo, and Huy Tien Nguyen

Coffee break
17:00: Invited talk 1 (SC: Prof. Le-Minh Nguyen)
Speaker: Prof. Long Tran-Thanh (Deputy-Head and Director of Research at the Department of Computer Science, University of Warwick, UK)

Title: Do LLM Agents Behave Strategically?

Abstract: The rapid advancement of large language model (LLM) based AI systems has been disruptive to many sectors of our lives, and soon they will act as autonomous agents on our behalf in dealing with complex real-world problems while interacting with other agents and humans. However, it is unclear how multiple LLM agents would influence each other’s behaviour when they interact. This is especially true if they are programmed to be strategic (i.e., selfish, or malicious) on behalf of their human owners/creators. These strategic behaviours, if not mitigated efficiently, will cause societal, financial, ethical, and safety disasters. In this talk, I will discuss a number of key research challenges and problems that need to be addressed to avoid such disasters. These include: (i) last round/last iterate convergence in non-cooperative multi-agent learning; (ii) efficient learning with limited verifications against strategic manipulators; and (iii) truthful machine learning. While it is still unknown how these challenges should be addressed in the multi-LLM-agent setting, I will demonstrate how they have been addressed within the multi-agent research community for simpler agent models, drawing some intuitions for future work on LLM agents.

Short bio: Long is currently the Deputy-Head and the Director of Research at the Department of Computer Science, University of Warwick, UK. He is also the university’s Chair of Digital Research Spotlight. Long has been doing active research in a number of key areas of Artificial Intelligence and multi-agent systems, mainly focusing on multi-armed bandits, game theory, and incentive engineering, and their applications to AI for Social Good. He has published more than 80 papers at peer-reviewed A* conferences in AI/ML (including AAAI, AAMAS, CVPR, ECAI, IJCAI, NeurIPS, UAI) and journals (JAAMAS, AIJ), and has received a number of prestigious national/international awards, including 2 best paper honourable mention awards at top-tier AI conferences (AAAI, ECAI), 2 Best PhD Thesis Awards (one in the UK and one in Europe), and, as a co-recipient, the 2021 AIJ Prominent Paper Award (for one of the 2 most influential papers published in the Artificial Intelligence Journal between 2014 and 2021).
18:00: Day 1 closing

Day 2 (May 29, 2024) (Place: ROOM S (52))

11:00: Invited talk 2 (SC: Prof. Yuji Matsumoto)
Speaker: Assoc. Prof. Mori Junichiro (Graduate School of Information Science and Technology, University of Tokyo)

Title: Large-scale scholarly data analysis: network analysis perspective

Abstract: I was involved as a co-researcher in the JST・CREST project "Knowledge discovery from large-scale literature information based on structural understanding" under the leadership of Prof. Yuji Matsumoto. Therein, our particular focus was on "Knowledge discovery based on the structural relationships of large-scale citation networks and literature texts." In this talk, I will present the latest research outcomes of several ongoing follow-up projects related to large-scale literature data analysis. Specifically, in the current JST・CREST project "Human Computation foundations for collaboration between humans and AI," we are working on supporting scientific activities based on literature data analysis. Additionally, in the NEDO project "Forecasting scientific and technological trends based on the fusion of pre-trained Language models and network models," we are working on predicting the impact of scientific research using large-scale literature data, with the aim of supporting science and technology policy making. Lastly, in our university's collaborative project with industry, "Technology Informatics," we are working on knowledge extraction from literature data to support R&D activities. Through this recent research and its findings, I would like to share insights, particularly on the foundational techniques and applications of large-scale literature data analysis based on network analysis approaches.

Short bio: Junichiro Mori is an Associate Professor at the Graduate School of Information Science and Technology, the University of Tokyo. He obtained his doctoral and master's degrees in Information Science and Technology from the University of Tokyo. His research has focused on Artificial Intelligence, particularly in the fields of user modeling, information extraction, and social network analysis. His current research interests include data mining with graphs, social network analysis, and representation learning.
12:00: Lunch break
14:00-15:25: Session 2 (SC: Dr. Vu Tran)

14:00: Hallucination for large language modeling: a comprehensive review
Dang Hoang Anh, Vu Tran, and Nguyen Le Minh

14:30: Vietnamese Elementary Math Reasoning using Large Language Model with Refined Translation and Dense-retrieved Chain-of-thought
Nguyen-Khang Le, Dieu-Hien Nguyen, Dinh-Truong Do, Chau Nguyen, and Minh Le Nguyen

15:00: A Framework for Enhancing Statute Law Retrieval using Large Language Models
Trang Ngoc Anh Pham, Dinh-Truong Do, and Minh Le Nguyen

Coffee break
15:30-17:00: Session 3 (SC: Dr. Vu Tran)

15:30: Improving LLM Prompting with Ensemble of Instructions: A Case Study on Sentiment Analysis
Vu Tran and Tomoko Matsui

16:00: Enhancing Document Retrieval in COVID-19 Research: Leveraging Large Language Models for Hidden Relation Extraction
Hoang-An Trieu, Dinh-Truong Do, Chau Nguyen, Vu Tran, and Minh Le Nguyen

16:30: Semantic Parsing for Question and Answering Over DBLP Database with Large Language Models
Le-Minh Nguyen, Le-Nguyen Khang, Kieu Que Anh, Nguyen Dieu Hien, and Yukari Nagai

17:00: Workshop closing

Topics

Relevant topics include, but are not limited to, the following:

text analysis
document structure analysis
logical structure analysis
figure and table analysis
citation analysis of scientific and technical documents
scientific information assimilation
summarization and visualization
knowledge discovery/mining from scientific papers and data
similar document retrieval
entity and relation linking between documents and knowledge base
survey generation
resources for scientific documents analysis
document understanding in general
NLP systems aiming for scientific documents including tagging, parsing, coreference, etc.

Submissions

There are two classes of submissions:

Long paper on original and completed work, including concrete evaluation and analysis wherever appropriate; and
Short paper on a small, focused contribution, work in progress, a negative result, or an opinion piece.

The page limit is 14 pages including references for long papers, and 7 pages including references for short papers. (Reviewers will be told that there is no penalty for writing a shorter submission.)

All submissions should be written in English and submitted as PDF files formatted according to the Springer Verlag LNCS style, which can be obtained from here. Papers should be anonymized. If you use a Word file, please follow the format instructions, then convert the file to PDF and submit it at the paper submission page.

For both classes, in addition to original unpublished work, we also accept papers that have already been published or presented at other venues. Such submissions should also be anonymized and will be reviewed by the program committee.

You can submit your paper at https://easychair.org/conferences/?conf=scidoca2024 . If you are unable to submit a paper through the EasyChair system, please send an email to "nguyenml[at]jaist.ac.jp".

If a paper is accepted, at least one author of the paper must register for the workshop and present it. Please register for the workshop at the registration page.

Workshop Chairs

Minh Le Nguyen, Japan Advanced Institute of Science and Technology
Yuji Matsumoto, RIKEN Center for Advanced Intelligence Project (Advisor)

Program Committee Members

Nguyen Le Minh, Japan Advanced Institute of Science and Technology
Noriki Nishida, RIKEN Center for Advanced Intelligence Project
Vu Tran, The Institute of Statistical Mathematics
Yusuke Miyao, The University of Tokyo
Yuji Matsumoto, RIKEN Center for Advanced Intelligence Project
Yoshinobu Kano, Shizuoka University
Akiko Aizawa, National Institute of Informatics
Ken Satoh, National Institute of Informatics and Sokendai
Junichiro Mori, The University of Tokyo
Kentaro Inui, Tohoku University
Nguyen Ha Thanh, National Institute of Informatics
Nguyen Minh Phuong, Japan Advanced Institute of Science and Technology

Please send any inquiries concerning the workshop to "nguyenml[at]jaist.ac.jp".

SCIDOCA 2024 home page https://www.jaist.ac.jp/event/SCIDOCA/2024/

Eighth International Workshop on SCIentific DOCument Analysis
(SCIDOCA 2024)
associated with JSAI-isAI 2024

