Second International Workshop on SCIentific DOCument Analysis
(SCIDOCA 2017)
associated with JSAI International Symposia on AI 2017 (IsAI-2017)

Workshop: November 14 - 15, 2017

Aims and Scope

The recent proliferation of scientific papers and technical documents has become an obstacle to the efficient acquisition of new information in various fields. It is almost impossible for individual researchers to check and read all related documents, and even retrieving relevant documents is becoming harder. This workshop gathers researchers and experts working on scientific document analysis from various perspectives, and invites technical paper presentations and system demonstrations covering any aspect of scientific document analysis.

Important Dates

Workshop: November 14 - 15, 2017

Submission Deadline: September 29, 2017 (11:59pm PST; UTC-8)
Notification: October 11, 2017
Camera-ready due: October 20, 2017 (11:59pm PST; UTC-8)


Please register for the workshop at the registration page of the JSAI International Symposia on AI 2017.


Relevant topics include, but are not limited to, the following:

  • text analysis
  • document structure analysis
  • logical structure analysis
  • figure and table analysis
  • citation analysis of scientific and technical documents
  • scientific information assimilation
  • summarization and visualization
  • knowledge discovery/mining from scientific papers and data
  • similar document retrieval
  • entity and relation linking between documents and knowledge base
  • survey generation
  • resources for scientific document analysis
  • document understanding in general
  • NLP systems for scientific documents, including tagging, parsing, coreference, etc.


There are two classes of submissions:
  • Long paper on original and completed work, including concrete evaluation and analysis wherever appropriate; and
  • Short paper on a small, focused contribution, work in progress, a negative result, or an opinion piece.

The page limit is 14 pages including references for long papers, and 7 pages including references for short papers. (Reviewers will be told that there is no penalty for writing a shorter submission.)

All submissions should be written in English and submitted as PDF files formatted according to the Springer Verlag LNCS style, which can be obtained from here. The paper should be anonymized. If you use a Word file, please follow the formatting instructions, then convert the file to PDF and submit it at the paper submission page.

For both classes, in addition to original unpublished work, we also accept papers that have already been published or presented at other venues. Such submissions should also be anonymized, and will be reviewed by the program committee.

In general, accepted papers will not be archived. The papers will be distributed to workshop participants on a USB flash drive. If the authors wish to make their paper publicly available, we will also provide a link to the PDF on this webpage. Otherwise, we will not upload the papers to the web. Unpublished submissions in both the long and short paper tracks are considered candidates for the LNAI post-proceedings (authors may decline the invitation if they wish). Papers will be archived only through these post-proceedings.

You can submit your paper at . If you are unable to submit a paper through the EasyChair system for any reason, please send an email to "nomura[at]"

If a paper is accepted, at least one author of the paper must register for the workshop and present it. Please register for the workshop at the registration page.

Post Proceedings

Selected papers will be published as post-proceedings in the Springer Verlag "Lecture Notes in Artificial Intelligence" series, following a second round of review after the workshop.

SCIDOCA2017 Program (November 14, 2017)

  • 13:30-13:40: Opening
  • 13:40-14:40: Invited Talk
    • Simone Teufel, University of Cambridge, United Kingdom.

    Title: Do Future Work sections have a purpose? (and other global scientometric questions)

    Abstract: How should the community of scientific text processing set its sights (and goals) for the near future? One possibility is to look at global scientometric questions; questions that historians of science ask, such as: What are the most contested ideas in a particular research area at this moment? Which new ideas emerged in the past 5 years in that area? Maybe we could even ask where the most innovative research happens at the moment. At first, such questions might seem unanswerable and vague, but given a large text base, I believe that it is possible to answer important aspects of these questions objectively and quantifiably. In doing so, we won't be able to ignore entailment and inference in scientific writing. This is hard, but exciting; and recent developments in NLP, such as better parsing and more robust entailment, should help us some of the way here.
    As a test case and thought experiment, I will ask the question of which purpose, if any, future work sections fulfill in a scientific paper.
  • 14:40-15:00: Break
  • 15:00-16:30: Long papers 1
    • Yating Zhang, Xinran Liu, and Yuji Matsumoto. Domain Adaptation for Sentence Classification: A Study on Structured Abstract Generation
    • Qin Dai, Naoya Inoue, Paul Reisert, and Kentaro Inui. Leveraging Document-specific Information for Identifying Relations in Scientific Articles
    • Hayato Hashimoto, Kazutoshi Shinoda, Hikaru Yokono, and Akiko Aizawa. Automatic Generation of Review Matrices as Multi-document Summarization of Scientific Papers
  • 16:30-16:50: Break
  • 16:50-17:50: Long papers 2
    • Truong-Son Nguyen, Le Minh Nguyen, and Ken Satoh. Improving entailment recognition in legal texts using corpus generation
    • Shuhei Kondo, and Yuji Matsumoto. Species-Metabolites Relation Extraction using Distant Supervision with KNApSAcK Core Database

SCIDOCA2017 Program (November 15, 2017)

  • 10:30-11:50: Short papers 1
    • Yoshinobu Kano. Towards Text Mining for Meta-analysis of Neuroscience Papers
    • Kimitaka Asatani, Ochi Masanao, Junichiro Mori, and Ichiro Sakata. Predicting future citation from the temporal information of citation network
    • Vu Duc Tran, Minh Le Nguyen, and Ken Satoh. Similarity with Document Components Vector Representations by Context Expansion from Document Structures
    • Masaharu Yoshioka, and Shinjiro Hara. Construction of In-house Papers/Figures Database System for a Particular Research Domain using PDFs - Application of Nano-crystal Device Development Domain -
  • 11:50-13:30: Lunch
  • 13:30-14:50: Short papers 2
    • Lai Dac Viet, Nguyen Le Minh, Sinh Vu Trong and Ken Satoh. ConvAMR: Abstract meaning representation parsing for legal document
    • Karen Sargsyan and Karine Mazmanian. BioChemQA: A New Challenge Dataset for Finding Answers in Biological Publications
    • Atsuki Sawayama, Jun Suzuki, Hiroyuki Shindo, and Yuji Matsumoto. Improving Named Entity Recognition in Tasks with Non-immediate Response Setting
    • Kazutaka Kinugawa and Yoshimasa Tsuruoka. A Neural Hierarchical Extractive Summarizer for Academic Papers
  • 14:50-15:00: Closing

Workshop Chairs

Yuji Matsumoto, Nara Institute of Science and Technology, Japan
Hiroshi Noji, Nara Institute of Science and Technology, Japan

Program Committee Members

Takeshi Abekawa, NII
Akiko Aizawa, NII
Naoya Inoue, Tohoku University
Kentaro Inui, Tohoku University
Yoshinobu Kano, Shizuoka University
Yusuke Miyao, NII
Junichiro Mori, University of Tokyo
Hidetsugu Nanba, Hiroshima City University
Shoshin Nomura, NII
Ken Satoh, NII
Hiroyuki Shindo, NAIST
Yoshimasa Tsuruoka, University of Tokyo
Minh Le Nguyen, JAIST
Pontus Stenetorp, University College London

For any inquiries concerning the workshop, please send an email to "noji[at]"
