Publications
Recent publications from our team
Papers
SPECTRA: Faster Large Language Model Inference with Optimized Internal and External Speculation
Authors: Nguyen-Khang Le, Truong Dinh Do, Le-Minh Nguyen
ACL 2025
Abstract: Inference with modern Large Language Models (LLMs) is both computationally expensive and time-consuming. Speculative decoding has emerged as a promising solution, but existing approaches face key limitations: training-based methods require a draft model that is challenging to obtain and lacks generalizability, while training-free methods offer limited speedup gains. In this work, we present Spectra, a novel framework for accelerating LLM inference without the need for additional training or modification to the original LLM. Spectra introduces two new techniques for efficiently utilizing internal and external speculation, each outperforming corresponding state-of-the-art (SOTA) methods independently. When combined, these techniques achieve up to a 4.08x speedup across various benchmarks and LLM architectures, significantly surpassing existing training-free approaches. The implementation of Spectra is publicly available.

Our Team

Prof. Nguyen Le Minh
Director of Research Centre for Interpretable AI at JAIST
MSc. Le Nguyen Khang
PhD Student at Nguyen's Lab
MSc. Do Dinh Truong
PhD Student at Nguyen's Lab

Contact Us
We are seeking students passionate about Natural Language Processing (NLP) and Deep Learning.
Location:
IS Building Ⅲ 7F, 1 Chome-1 Asahidai, Nomi, Ishikawa, Japan
Email:
nguyenml[at]jaist.ac.jp
Call:
+81 761-51-1221