AI Research Highlights | Week 42, 2023
1. Take a Step Back: Evoking Reasoning via Abstraction in Large Language Models
In this paper, DeepMind researchers proposed Step-Back Prompting as a way to enhance LLMs' capacity for reasoning. The approach consists of two steps: abstraction, then reasoning. It was inspired by how humans abstract a problem before reasoning about it, which simplifies the problem and lowers the likelihood of errors in the subsequent reasoning steps.
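The two steps can be sketched as two chained model calls. This is a minimal illustration, not the authors' code: the `llm` callable and the prompt wording are assumptions, and any function that maps a prompt string to a completion string can be plugged in.

```python
from typing import Callable


def step_back_answer(question: str, llm: Callable[[str], str]) -> str:
    """Answer a question with an abstraction step followed by a reasoning step.

    `llm` is a hypothetical callable (prompt -> completion); the prompt
    wording below is illustrative only.
    """
    # Step 1 (abstraction): ask for the general principle behind the question.
    abstraction_prompt = (
        "What general concept or principle is needed to answer the "
        f"following question?\nQuestion: {question}"
    )
    principle = llm(abstraction_prompt)

    # Step 2 (reasoning): answer the original question using that principle.
    reasoning_prompt = (
        f"Principle: {principle}\n"
        "Using the principle above, answer the question step by step.\n"
        f"Question: {question}"
    )
    return llm(reasoning_prompt)
```

Passing the model as a callable keeps the sketch independent of any particular API; in practice `llm` would wrap whichever client is in use.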
2. MemGPT: Towards LLMs as Operating Systems
Researchers from UC Berkeley proposed MemGPT, an OS-inspired LLM system for virtual context management. Drawing on virtual memory paging, MemGPT divides the virtual context managed by the LLM into a main context and an external context, which allows for unbounded context. It can be used to build conversational agents that remember, reflect, and evolve dynamically through long-term interactions with their users. MemGPT performed well on evaluation tasks covering document analysis and conversational agents. The project can be found here.
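The paging idea can be illustrated with a toy data structure. This is a sketch under stated assumptions, not MemGPT's implementation: the class name `PagedContext`, the message-count capacity, and the keyword-based `recall` are all hypothetical stand-ins for the real system's mechanisms.

```python
from collections import deque
from typing import Deque, List


class PagedContext:
    """Toy main/external context split (hypothetical, for illustration)."""

    def __init__(self, main_capacity: int) -> None:
        if main_capacity < 1:
            raise ValueError("main_capacity must be at least 1")
        self.main_capacity = main_capacity
        self.main: Deque[str] = deque()   # bounded, like the model's prompt window
        self.external: List[str] = []     # unbounded, like out-of-context storage

    def append(self, message: str) -> None:
        """Add a message; evict the oldest ones to external storage if full."""
        self.main.append(message)
        while len(self.main) > self.main_capacity:
            self.external.append(self.main.popleft())

    def recall(self, keyword: str) -> List[str]:
        """Return evicted messages containing the keyword (case-insensitive)."""
        needle = keyword.lower()
        return [m for m in self.external if needle in m.lower()]
```

A short usage example: with a capacity of two messages, the oldest message is paged out but can still be retrieved.

```python
ctx = PagedContext(main_capacity=2)
for msg in ["My name is Ada", "I like tea", "What is the weather?"]:
    ctx.append(msg)
print(list(ctx.main))      # the two most recent messages
print(ctx.recall("name"))  # the evicted message is still searchable
```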
3. NEWTON: Are Large Language Models Capable of Physical Reasoning?
The authors introduced NEWTON, a repository, pipeline, and benchmark designed to evaluate the physical reasoning capability of LLMs. NEWTON addresses the limited exploration of LLMs' physical reasoning abilities, specifically with respect to the attributes that are crucial for comprehending everyday objects. The project can be found here.
4. Catastrophic Jailbreak of Open-source LLMs via Exploiting Generation
Researchers from Princeton University proposed the generation exploitation attack, an extremely simple approach that disrupts model alignment solely by manipulating variations of decoding methods. They also proposed an alignment method that explores diverse generation strategies and can reasonably reduce the misalignment rate under their attack. The project can be found here.
5. Large Language Models can Learn Rules
In this paper, researchers presented Hypotheses-to-Theories (HtT), a framework with two stages, induction and deduction, that learns explicit rules and applies them to reasoning problems. Experiments showed that HtT outperformed baseline prompting methods on numerical reasoning and relational reasoning problems.
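One way to picture the two stages is sketched below. This is an illustration under assumptions, not the paper's procedure: the `propose` callable (which stands in for a model that returns the rules it used plus an answer), the function names, and the count and accuracy thresholds are all hypothetical.

```python
from collections import defaultdict
from typing import Callable, Iterable, List, Tuple


def induce_rules(
    examples: Iterable[Tuple[str, str]],
    propose: Callable[[str], Tuple[List[str], str]],
    min_count: int = 2,
    min_accuracy: float = 0.5,
) -> List[str]:
    """Induction: keep rules that are used often and mostly lead to correct answers.

    `propose` is a hypothetical stand-in for a model call returning
    (rules_used, answer); the thresholds are illustrative assumptions.
    """
    used = defaultdict(int)
    correct = defaultdict(int)
    for question, gold in examples:
        rules, answer = propose(question)
        for rule in set(rules):
            used[rule] += 1
            if answer == gold:
                correct[rule] += 1
    return sorted(
        r for r in used
        if used[r] >= min_count and correct[r] / used[r] >= min_accuracy
    )


def deduction_prompt(library: List[str], question: str) -> str:
    """Deduction: build a prompt that asks the model to reason with the learned rules."""
    rules = "\n".join(f"- {r}" for r in library)
    return (
        f"Rules:\n{rules}\n"
        "Use only the rules above to answer.\n"
        f"Question: {question}"
    )
```

The rule library is plain text, so the deduction stage needs nothing beyond an ordinary prompt.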
6. Diversity of Thought Improves Reasoning Abilities of Large Language Models
In this paper, the authors proposed a method called DIV-SE, along with a cost-effective alternative, IDIV-SE, that automatically improve prompt diversity by soliciting feedback from the LLM to ideate approaches. They pointed out that there is considerable room for improvement in using the LLM as a guide for improving the prompt.
7. Promptor: A Conversational and Autonomous Prompt Generation Agent for Intelligent Text Entry Techniques
In this paper, researchers from Cambridge introduced Promptor, a conversational prompt-generation agent designed to engage proactively with designers. The results show that Promptor-designed prompts achieved a 35% increase in similarity and a 22% increase in coherence over those written by designers.
8. Learn From Model Beyond Fine-Tuning: A Survey
A group of researchers provided a comprehensive review of learning from models (LFM), covering methods such as fine-tuning, model distillation, model reuse, meta-learning, and model editing. The relevant papers they discussed in this article can be found here.
9. Impact of Co-occurrence on Factual Knowledge of Large Language Models
LLMs often generate factually incorrect information. In this paper, the authors investigated the impact of co-occurrence statistics in the pre-training corpora on the factual knowledge of LLMs and identified a co-occurrence bias. They suggested further research on mitigating co-occurrence bias to ensure the reliability of language models.
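To make "co-occurrence statistics" concrete, the toy sketch below counts how many documents mention two terms together. It is an assumption-laden illustration, not the paper's measurement: the function name, the substring matching, and the three-sentence corpus are all made up for the example.

```python
from typing import Iterable


def cooccurrence_count(docs: Iterable[str], subject: str, obj: str) -> int:
    """Count documents in which both terms appear (case-insensitive substring match)."""
    s, o = subject.lower(), obj.lower()
    return sum(1 for d in docs if s in d.lower() and o in d.lower())


# Illustrative corpus (invented for this example).
corpus = [
    "Toronto is the largest city in Canada.",
    "The Canada tour starts in Toronto.",
    "Ottawa is the capital of Canada.",
]
toronto = cooccurrence_count(corpus, "Canada", "Toronto")
ottawa = cooccurrence_count(corpus, "Canada", "Ottawa")
```

In this made-up corpus "Canada" appears with "Toronto" more often than with "Ottawa", which is the kind of frequency imbalance that could pull a model toward the more frequent but incorrect answer to a question about the capital.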
*The researchers behind the publications deserve full credit for their work.