Published on Jan 03, 2023
The Language Model Self-Improved (LMSI) technique, developed by researchers at Google and the University of Illinois at Urbana-Champaign (UIUC), fine-tunes a large language model (LLM) on a dataset generated by the model itself. Using LMSI, the researchers improved the LLM's performance on six benchmarks and set new state-of-the-art accuracy records on four of them.
The team started from a pre-trained 540B-parameter PaLM model. Unlabeled training questions, augmented with chain-of-thought prompts, were fed to the model, which generated candidate answers; the resulting question/answer pairs were then used as a fine-tuning dataset. To evaluate the fine-tuned model, the team used a suite of benchmark datasets covering three natural language processing (NLP) tasks: arithmetic reasoning, commonsense reasoning, and natural language inference.
Chain-of-thought (CoT) prompting augments the question given to a language model by prepending an example question, the reasoning steps used to solve it, and the resulting answer. Google's PaLM model, when used with CoT prompting, achieves state-of-the-art few-shot performance on several reasoning benchmarks. Given this few-shot performance, the LMSI researchers were interested in exploring whether PaLM could improve further on additional datasets.
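The prompt construction described above can be sketched in a few lines. This is a minimal illustration, not the exact exemplar used by the researchers; the worked example and the target question below are assumptions chosen for clarity.

```python
# Sketch of chain-of-thought (CoT) prompting: prepend a worked example
# (question, reasoning steps, answer) to the target question before
# sending the combined text to a language model.

COT_EXEMPLAR = (
    "Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls "
    "each. How many tennis balls does he have now?\n"
    "A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. "
    "5 + 6 = 11. The answer is 11.\n\n"
)

def build_cot_prompt(question: str) -> str:
    """Prepend the CoT exemplar so the model imitates its reasoning style."""
    return COT_EXEMPLAR + f"Q: {question}\nA:"

prompt = build_cot_prompt(
    "If there are 3 cars and each car has 4 wheels, how many wheels are there?"
)
```

The trailing `A:` cues the model to continue with its own reasoning steps and final answer, mirroring the exemplar's format.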
However, the challenge with fine-tuning is the same as with any supervised learning problem: obtaining a labeled dataset. The key idea in LMSI is to generate this dataset using the PaLM model itself. By prepending CoT examples and applying prompts such as "let's think step-by-step," the team augmented questions from a training dataset and fed them to PaLM, which sampled multiple candidate answers for each question. The highest-confidence answers were then selected via self-consistency, that is, by majority voting over the sampled reasoning paths. The original PaLM model was then fine-tuned on the resulting question/answer dataset.
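The self-consistency selection step can be sketched as a majority vote over the final answers extracted from the sampled reasoning paths. The agreement threshold and function name below are illustrative assumptions, not values from the paper.

```python
from collections import Counter

def self_consistency_select(sampled_answers, min_agreement=0.5):
    """Majority-vote over final answers from multiple sampled CoT paths.

    Returns the most common answer if its share of the votes meets the
    agreement threshold (a proxy for model confidence); otherwise None,
    meaning the question is dropped from the fine-tuning dataset.
    """
    counts = Counter(sampled_answers)
    answer, votes = counts.most_common(1)[0]
    if votes / len(sampled_answers) >= min_agreement:
        return answer
    return None

# Five sampled reasoning paths agree strongly on "11": keep the pair.
kept = self_consistency_select(["11", "11", "12", "11", "11"])
# No answer dominates: drop the question from the generated dataset.
dropped = self_consistency_select(["7", "8", "9", "10"])
```

Only question/answer pairs that survive this filter are used to fine-tune the model, which is what keeps the self-generated labels reasonably clean.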
Aside from fine-tuning the 540B PaLM model, the team also investigated knowledge distillation, using the generated dataset to fine-tune smaller versions of PaLM. A fine-tuned 62B-parameter model outperformed the pre-trained 540B-parameter model, while a fine-tuned 8B-parameter model outperformed the pre-trained 62B-parameter model.
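For distillation, the same self-generated question/answer pairs serve as training records for a smaller student model. The record layout below is a hedged assumption for illustration; the actual fine-tuning format used by the researchers may differ.

```python
# Sketch: format self-generated (question, CoT answer) pairs into
# input/target training records for fine-tuning a smaller student model.
# The field names and prompt layout are illustrative assumptions.

def to_training_record(question: str, cot_answer: str) -> dict:
    """One supervised fine-tuning example: prompt in, reasoning + answer out."""
    return {
        "input": f"Q: {question}\nA:",
        "target": f" {cot_answer}",
    }

# Pairs that survived the self-consistency filter (example data).
generated_pairs = [
    ("What is 2 + 3?", "2 + 3 = 5. The answer is 5."),
]
records = [to_training_record(q, a) for q, a in generated_pairs]
```

These records could then be passed to any standard sequence-to-sequence fine-tuning loop; the larger model's reasoning style is what the smaller model learns to imitate.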