Retrieve-Revise-Refine: A novel framework for retrieval of concise entailing legal article set

  • Researchers propose a three-stage framework, Retrieve-Revise-Refine, designed for legal article set retrieval: the task of retrieving a concise (i.e., precise and compact) set of entailing legal articles.
  • They evaluate the framework on two datasets and observe improvements in macro F2 score of 3.17% and 4.24%, respectively, over the previous state-of-the-art methods.
  • Their ablation studies and subsequent analysis clarify the role each stage plays within the framework.

Artificial Intelligence (AI) continues to redefine the boundaries of legal technology, offering promise in automating advanced tasks such as legal question answering and consultation. In the domain of statute law, a central challenge is retrieving the concise set of legal articles that entail a query, a task essential to enhancing these advanced applications. This task is referred to as entailing legal article set retrieval or, more briefly, legal article set retrieval.

Retrieving entailing legal article sets differs markedly from traditional information retrieval (IR) in two main respects. First, whereas traditional IR returns a ranked list of articles, legal article set retrieval seeks a concise set of articles. The legal queries and legal articles themselves add to the difficulty: they are inherently complex and steeped in specialized legal language, demanding a retrieval system with deeper legal reasoning and linking capacity. Second, while traditional IR primarily ranks candidates by relevance, this task requires that the retrieved articles not merely relate to a query but jointly entail its contents or its negation. These characteristics set the task apart from the broader goals and methods of traditional IR.

Previous research in legal article set retrieval has predominantly employed two approaches. The first approach combines classical IR models with fine-tuned language models (LMs), and then ensembles the retrieval results to consolidate the final retrieved sets. Meanwhile, the second approach uses classical IR models exclusively for preliminary candidate filtering, which prepares inputs for further LM fine-tuning; the final results are often ensembled from various fine-tuned LMs.

To address the task of legal article set retrieval, a team of researchers from the Japan Advanced Institute of Science and Technology (JAIST), led by Professor Le-Minh Nguyen and including doctoral student Chau Nguyen, has proposed a framework called Retrieve-Revise-Refine. The framework is designed to pinpoint the concise set of legal articles that entail either a query or its negation, advancing the current understanding of this task. It combines small LMs and large LMs to improve the accuracy of the retrieved articles (i.e., precision) while limiting the loss in coverage (i.e., recall). The framework consists of three stages:

  1. Retrieve: An ensemble of multiple small LMs, fine-tuned with various tailored strategies, retrieves as many of the entailing articles as possible.
  2. Revise: Large LMs assess the validity of the query with respect to each combination of articles from the top retrieval results, aiming to derive a more compact subset of entailing legal articles.
  3. Refine: The outputs of the second stage are distilled further, using insights derived from the small LMs' predictions to refine the predictions of the large LMs.
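
The three stages can be illustrated with a minimal Python sketch. This is not the authors' implementation. Every name in it (retrieve, revise, refine, the toy scorers, the toy judge, and the sample articles) is hypothetical. Word-overlap functions stand in for the fine-tuned small LMs, and a simple coverage check stands in for the large LM. The selection rule in revise (smallest accepted combination) and the rule in refine (add back high-confidence candidates, or fall back to them when nothing is accepted) are assumptions chosen only to make the data flow concrete.

```python
from itertools import combinations
from typing import Callable, Dict, FrozenSet, List, Sequence

# Hypothetical stand-ins: a Scorer plays the role of a fine-tuned small LM,
# a Judge plays the role of a large LM asked whether articles entail the query.
Scorer = Callable[[str, str], float]
Judge = Callable[[str, Sequence[str]], bool]


def retrieve(query: str, articles: Dict[str, str],
             scorers: List[Scorer], top_k: int) -> Dict[str, float]:
    """Stage 1: union of every scorer's top-k, keeping the best score per article."""
    candidates: Dict[str, float] = {}
    for scorer in scorers:
        scored = sorted(((scorer(query, text), aid) for aid, text in articles.items()),
                        reverse=True)
        for score, aid in scored[:top_k]:
            candidates[aid] = max(score, candidates.get(aid, score))
    return candidates


def revise(query: str, articles: Dict[str, str], candidates: Dict[str, float],
           judge: Judge, max_size: int = 2) -> FrozenSet[str]:
    """Stage 2 (assumed rule): return the smallest combination the judge accepts."""
    ranked = sorted(candidates, key=lambda aid: candidates[aid], reverse=True)
    for size in range(1, max_size + 1):
        for combo in combinations(ranked, size):
            if judge(query, [articles[aid] for aid in combo]):
                return frozenset(combo)
    return frozenset()


def refine(candidates: Dict[str, float], revised: FrozenSet[str],
           keep_threshold: float) -> FrozenSet[str]:
    """Stage 3 (assumed rule): small-LM confidence adjusts the judge's output."""
    if not candidates:
        return revised
    confident = {aid for aid, score in candidates.items() if score >= keep_threshold}
    if not revised:
        # The judge accepted nothing: fall back to the small-LM predictions.
        best = max(candidates, key=lambda aid: candidates[aid])
        return frozenset(confident | {best})
    return frozenset(revised | confident)


def words(text: str) -> set:
    return set(text.lower().split())


def overlap(query: str, article: str) -> float:
    """Toy scorer: fraction of query words that appear in the article."""
    return len(words(query) & words(article)) / len(words(query))


def jaccard(query: str, article: str) -> float:
    """Toy scorer: Jaccard similarity between query words and article words."""
    return len(words(query) & words(article)) / len(words(query) | words(article))


def covers_query(query: str, texts: Sequence[str]) -> bool:
    """Toy judge: accepts when the articles together mention every query word."""
    return words(query) <= words(" ".join(texts))


# Made-up articles and query, for illustration only.
articles = {
    "A1": "a minor needs consent of a guardian",
    "A2": "a contract without consent may be rescinded",
    "A3": "ownership transfers on delivery",
}
query = "minor contract consent rescinded"

candidates = retrieve(query, articles, [overlap, jaccard], top_k=2)
revised = revise(query, articles, candidates, covers_query)
result = refine(candidates, revised, keep_threshold=0.9)
print(sorted(result))  # ['A1', 'A2']
```

In this toy run neither A1 nor A2 alone covers the query, so the second stage accepts only the pair, which mirrors the requirement that the retrieved articles jointly entail the query.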

In empirical evaluation, the proposed framework achieved state-of-the-art results on two datasets, improving the macro F2 score by 3.17% and 4.24%, respectively, over the previous best methods.
Their study was published online in Information Processing & Management.
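
For readers unfamiliar with the metric, the macro F2 score behind these figures can be sketched as follows. The F-beta formula with beta = 2, (1 + β²)·P·R / (β²·P + R), is standard and weights recall more heavily than precision. Averaging the per-query scores is an assumption about how "macro" is applied here, and the function names and article IDs are illustrative only.

```python
from typing import Iterable, Set, Tuple


def f_beta(retrieved: Set[str], relevant: Set[str], beta: float = 2.0) -> float:
    """F-beta for one query; beta = 2 counts recall more heavily than precision."""
    hits = len(retrieved & relevant)
    if hits == 0:
        return 0.0
    precision = hits / len(retrieved)
    recall = hits / len(relevant)
    b2 = beta * beta
    return (1 + b2) * precision * recall / (b2 * precision + recall)


def macro_f2(pairs: Iterable[Tuple[Set[str], Set[str]]]) -> float:
    """Assumed macro averaging: mean of the per-query F2 scores."""
    scores = [f_beta(retrieved, relevant) for retrieved, relevant in pairs]
    return sum(scores) / len(scores) if scores else 0.0


gold = {"A1", "A2"}
one_extra = f_beta({"A1", "A2", "A3"}, gold)  # one superfluous article: 10/11
one_missing = f_beta({"A1"}, gold)            # one missing article: 5/9
print(round(one_extra, 3), round(one_missing, 3))  # 0.909 0.556
print(round(macro_f2([({"A1", "A2", "A3"}, gold), ({"A1"}, gold)]), 3))  # 0.732
```

The two single-query cases show why the framework tries to gain precision without giving up recall: under F2, returning one extra article costs far less than missing one entailing article.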

Figure 1: Overall architecture of the Retrieve-Revise-Refine framework.
Image credit: Nguyen Le Minh from JAIST.
Image license: Original Content
Usage restrictions: Cannot be reused without permission

Reference

Title of original paper: Retrieve-Revise-Refine: A novel framework for retrieval of concise entailing legal article set
Authors: Chau Nguyen, Phuong Nguyen, Le-Minh Nguyen
Journal: Information Processing & Management
DOI: 10.1016/j.ipm.2024.103949

Funding information

This work was supported in part by AOARD grant FA23862214039.

November 27, 2024
