Third International Workshop on SCIentific DOCument Analysis
associated with JSAI International Symposia on AI 2018 (IsAI-2018)

Workshop: November 12 - 13, 2018

Raiosha, Hiyoshi Campus of Keio University, Yokohama, Kanagawa

Aims and Scope

The recent proliferation of scientific papers and technical documents has become an obstacle to the efficient acquisition of new information in various fields. It is almost impossible for individual researchers to check and read all related documents. Even retrieving relevant documents is becoming harder and harder. This workshop gathers researchers and experts who work on scientific document analysis from various perspectives, and invites technical paper presentations and system demonstrations that cover any aspect of scientific document analysis.

Important Dates

Paper submission deadline: September 29, 2018 (revised; originally September 15, 2018)
Camera-ready due: October 20, 2018 (11:59pm PST; UTC-8)
Workshop: November 12-13, 2018


Please register for the workshop at the registration page of the JSAI International Symposia on AI 2018.


Relevant topics include, but are not limited to, the following:

  • text analysis
  • document structure analysis
  • logical structure analysis
  • figure and table analysis
  • citation analysis of scientific and technical documents
  • scientific information assimilation
  • summarization and visualization
  • knowledge discovery/mining from scientific papers and data
  • similar document retrieval
  • entity and relation linking between documents and knowledge base
  • survey generation
  • resources for scientific document analysis
  • document understanding in general
  • NLP systems for scientific documents, including tagging, parsing, coreference, etc.

Invited Speaker

Miles Crawford, Allen Institute for Artificial Intelligence

Title: Semantic Scholar's Document Analysis at Scale

Abstract: The mission of the Semantic Scholar project is to provide a high-quality, comprehensive tool for finding and evaluating scholarly papers. To achieve this, we've had to scale up complex document analysis and aggregation to power a high-traffic, publicly available product.

This talk covers some of the recent research projects and extraction capabilities created by the Semantic Scholar team. We'll also describe how we've combined research and engineering efforts with techniques from industry to create a machine-learning driven tool that can operate at high scale, containing over forty million searchable documents and serving over two million people every month.



There are two classes of submissions:
  • Long paper on original and completed work, including concrete evaluation and analysis wherever appropriate; and
  • Short paper on a small, focused contribution, work in progress, a negative result, or an opinion piece.

The page limits are up to 14 pages including references for long papers, and up to 7 pages including references for short papers. (Reviewers will be told that there is no penalty for writing a shorter submission.)

All submissions should be written in English and submitted as a PDF formatted according to the Springer Verlag LNCS style, which can be obtained from here. Papers should be anonymized. If you use a Word file, please follow the formatting instructions, then convert the file to PDF and submit it at the paper submission page.

For both classes, in addition to original unpublished work, we also accept papers that have already been published or presented at other venues. Such submissions should also be anonymized and will be reviewed by the program committee.

In general, accepted papers will not be archived. The papers are distributed to the participants of the workshop on a USB flash drive. If the authors wish to make their paper publicly available, we will also provide a link to the PDF on this webpage. Otherwise, we do not upload the papers to the web. Unpublished submissions in both the long and short paper tracks are considered candidates for the LNAI post-proceedings (authors may decline the invitation if they wish). Papers will be archived only in these post-proceedings.

You can submit your paper at . If you are unable to submit a paper through the EasyChair system for any reason, please send an email to "nomura[at]"

If a paper is accepted, at least one author of the paper must register for the workshop and present it. Please register for the workshop at the registration page.

Post Proceedings

Selected papers will be published as post-proceedings in the Springer Verlag "Lecture Notes in Artificial Intelligence" series after a second round of review following the workshop.

SCIDOCA2018 Program (November 12, 2018)

Long paper (L): 20min presentation + 10min Q&A
Short paper (S): 15min presentation + 5min Q&A
  • 10:00-10:10: Opening
  • 10:10-11:10: Invited Talk:
    • Miles Crawford. Semantic Scholar's Document Analysis at Scale
  • 11:30-12:30: Relation Extraction:
    • (L) Qin Dai, Naoya Inoue, Paul Reisert and Kentaro Inui. Scientific Knowledge Acquisition via the Interaction between Relation Extraction and Knowledge Graph Completion
    • (L) Biswanath Barik and Björn Gambäck. Causal Relation Identification in Complex Event Structures Using Convolutional Neural Networks
  • 12:30-14:00: Lunch
  • 14:00-15:00: Terms and Keywords:
    • (S) Thaer M. Dieb, Hiroyuki Oka and Masashi Ishii. Linking Polymers Names Abbreviations to their Definitions in Related Scientific Documents
    • (S) Kimitaka Asatani, Junichiro Mori and Ichiro Sakata. Keyword extraction using citation network: Verification of network-based method
    • (S) Shuhei Kondo, Yuji Matsumoto and Hiroyuki Shindo. Translating Chemical Substance Names using Attentional Encoder-Decoder
  • 15:30-16:50: Table and Graph Analysis:
    • (L) Keisuke Goto, Hiroyuki Shindo and Yuji Matsumoto. Line Detection Considering Spatial Context for Reading Line Charts
    • (L) Mako Akeda and Yoshinobu Kano. Brain Function and Coordinate Extraction from Neuroscience Full Text Papers
    • (S) Hiroyuki Oka, Hiroyuki Shindo, Keisuke Goto, Yuji Matsumoto, Atsushi Yoshizawa, Isao Kuwajima and Masashi Ishii. Automatic extraction of polymer data from tables in xml

SCIDOCA2018 Program (November 13, 2018)

  • 9:40-11:00: Classification and Clustering:
    • (L) Sungchul Choi and Jaeyoung Kim. Patent Document Clustering with Deep Embeddings
    • (L) Vu Tran, Minh Le Nguyen and Ken Satoh. Combining Lexical and Latent Features for Legal Case Retrieval Task
    • (S) Koichiro Watanabe, Shuntaro Yada and Kyo Kageura. Characteristics of Sentences with References in Scholarly Papers: An Explorative Analysis
  • 11:20-12:30: Extraction of Information Resources:
    • (L) Peter David. Quantity and Unit Extraction for Scientific Document Analysis
    • (S) Akira Suzuki and Masashi Ishii. Constructing a “Unit dictionary” from scientific articles
    • (S) Akiko Aizawa, Takeshi Sagara, Kenichi Iwatsuki and Goran Topic. Construction of a New ACL Anthology Corpus for Deeper Analysis of Scientific Papers
  • 12:30-12:40: Closing

Workshop Chairs

Yuji Matsumoto, Nara Institute of Science and Technology, Japan
Shoshin Nomura, NII, Japan

Program Committee Members

Takeshi Abekawa, NII
Akiko Aizawa, NII
Naoya Inoue, Tohoku University
Kentaro Inui, Tohoku University
Yoshinobu Kano, Shizuoka University
Yusuke Miyao, NII
Junichiro Mori, University of Tokyo
Hidetsugu Nanba, Hiroshima City University
Ken Satoh, NII
Hiroyuki Shindo, NAIST
Yoshimasa Tsuruoka, University of Tokyo
Minh Le Nguyen, JAIST
Pontus Stenetorp, University College London

For any inquiries concerning the workshop, please send an email to "nomura[at]"

