Dr. Yanjun Qi

Here are Papers I Reviewed. I am science curious.

Reviews Indexed

FMAdapt

Recent Readings for Adaptation of Foundation Models (since 2022) (Index of Posts):

No.	Read Date	Title and Information	We Read @
1	2024, Apr, 23	LLM fine tuning	2024-S25
2	2024, Apr, 16	MultiAgent LLMs	2024-S23
3	2024, Apr, 11	LLM Agents	2024-S22
4	2024, Apr, 9	Self-exam LLM and reasoning	2024-S21
5	2024, Apr, 4	Prompt Engineering	2024-S20
6	2024, Mar, 26	Model editing and Disgorgement	2024-S17
7	2024, Mar, 21	Domain Centered FMs	2024-S16
8	2024, Mar, 14	Knowledge Augmented FMs	2024-S14

Here is a detailed list of posts!

[1]: LLM fine tuning

read on: 23 Apr 2024
Alignment

In this session, our readings cover:

Required Readings:

Recent Large Language Models Reshaping the Open-Source Arena

https://deci.ai/blog/list-of-large-language-models-in-open-source/
The release of Meta’s Llama model and the subsequent release of Llama 2 in 2023 kickstarted an explosion of open-source language models, with better and more innovative models appearing on what seems like a daily basis. Here we dive into the ocean of open-source possibilities to curate a select list of the most intriguing and influential models making waves in recent months, including Qwen1.5, Yi, Smaug, Mixtral-8x7B-v0.1, DBRX, SOLAR-10.7B-v1.0, Tulu 2, WizardLM, Starling 7B, OLMo-7B, Gemma, and DeciLM-7B.
Plus the newly available DBRX model: https://www.databricks.com/blog/introducing-dbrx-new-state-art-open-llm

Instruction Tuning for Large Language Models: A Survey

https://arxiv.org/abs/2308.10792
Shengyu Zhang, Linfeng Dong, Xiaoya Li, Sen Zhang, Xiaofei Sun, Shuhe Wang, Jiwei Li, Runyi Hu, Tianwei Zhang, Fei Wu, Guoyin Wang
This paper surveys research works in the quickly advancing field of instruction tuning (IT), a crucial technique to enhance the capabilities and controllability of large language models (LLMs). Instruction tuning refers to the process of further training LLMs on a dataset consisting of (instruction, output) pairs in a supervised fashion, which bridges the gap between the next-word prediction objective of LLMs and the users’ objective of having LLMs adhere to human instructions. In this work, we make a systematic review of the literature, including the general methodology of IT, the construction of IT datasets, the training of IT models, and applications to different modalities, domains and applications, along with an analysis of aspects that influence the outcome of IT (e.g., generation of instruction outputs, size of the instruction dataset, etc). We also review the potential pitfalls of IT and criticism against it, along with efforts pointing out current deficiencies of existing strategies, and suggest some avenues for fruitful research. Project page: this http URL
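
To make the (instruction, output) format concrete, here is a minimal sketch of how one pair might be turned into a supervised training example. The prompt template, the whitespace "tokenizer", and the choice to compute the loss only on output tokens are all assumptions made for illustration; they are not taken from the survey.

```python
from typing import Dict, List, Tuple

# Hypothetical prompt template; real instruction-tuning datasets each use their own.
TEMPLATE = "### Instruction:\n{instruction}\n\n### Response:\n"


def build_example(pair: Dict[str, str]) -> Tuple[List[str], List[int]]:
    """Turn one (instruction, output) pair into tokens plus a loss mask.

    Whitespace splitting stands in for a real tokenizer (an assumption).
    mask[i] == 1 marks tokens that would contribute to the supervised loss.
    """
    prompt_tokens = TEMPLATE.format(instruction=pair["instruction"]).split()
    output_tokens = pair["output"].split() + ["<eos>"]
    tokens = prompt_tokens + output_tokens
    # Assumed convention: train on the output only, not on the instruction.
    mask = [0] * len(prompt_tokens) + [1] * len(output_tokens)
    return tokens, mask


pairs = [
    {"instruction": "Translate to French: cat", "output": "chat"},
    {"instruction": "Add 2 and 3.", "output": "The answer is 5."},
]
examples = [build_example(p) for p in pairs]
```

Masking the instruction tokens keeps the next-word objective focused on producing the response, which is one common way to connect next-word prediction to instruction following.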

Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models

https://arxiv.org/abs/2203.06904
Despite the success, the process of fine-tuning large-scale PLMs brings prohibitive adaptation costs. In fact, fine-tuning all the parameters of a colossal model and retaining separate instances for different tasks are practically infeasible. This necessitates a new branch of research focusing on the parameter-efficient adaptation of PLMs, dubbed delta tuning in this paper. In contrast with standard fine-tuning, delta tuning only fine-tunes a small portion of the model parameters while keeping the rest untouched, largely reducing both the computation and storage costs. Recent studies have demonstrated that a series of delta tuning methods with distinct tuned parameter selection could achieve performance on a par with full-parameter fine-tuning, suggesting a new promising way of stimulating large-scale PLMs. In this paper, we first formally describe the problem of delta tuning and then comprehensively review recent delta tuning approaches. We also propose a unified categorization criterion that divides existing delta tuning methods into three groups: addition-based, specification-based, and reparameterization-based methods. Though initially proposed as an efficient method to steer large models, we believe that some of the fascinating evidence discovered along with delta tuning could help further reveal the mechanisms of PLMs and even deep neural networks. To this end, we discuss the theoretical principles underlying the effectiveness of delta tuning and propose frameworks to interpret delta tuning from the perspective of optimization and optimal control, respectively. Furthermore, we provide a holistic empirical study of representative methods, where results on over 100 NLP tasks demonstrate a comprehensive performance comparison of different approaches. The experimental results also cover the analysis of combinatorial, scaling and transferable properties of delta tuning.
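
As a rough illustration of the reparameterization-based group, the sketch below freezes a weight matrix and learns only a low-rank update, in the style of LoRA. It is a toy in pure Python, not the paper's code: the helper names, the initialization scale, and the matrix sizes are assumptions chosen for readability.

```python
import random
from typing import List, Tuple

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain matrix product of two nested lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def add(a: Matrix, b: Matrix) -> Matrix:
    """Element-wise sum of two matrices of the same shape."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def init_low_rank(d_out: int, d_in: int, rank: int, seed: int = 0) -> Tuple[Matrix, Matrix]:
    """Create the trainable factors B (d_out x rank) and A (rank x d_in)."""
    rng = random.Random(seed)
    # LoRA-style init: A is small and random, B is zero, so B @ A starts at zero.
    a = [[rng.gauss(0.0, 0.01) for _ in range(d_in)] for _ in range(rank)]
    b = [[0.0] * rank for _ in range(d_out)]
    return b, a


def effective_weight(w_frozen: Matrix, b: Matrix, a: Matrix) -> Matrix:
    """Frozen weight plus the low-rank delta; only b and a would be trained."""
    return add(w_frozen, matmul(b, a))


def trainable_fraction(d_out: int, d_in: int, rank: int) -> float:
    """Share of parameters that are trained, relative to the full matrix."""
    return rank * (d_out + d_in) / (d_out * d_in)


w = [[1.0, 2.0], [3.0, 4.0]]
b, a = init_low_rank(2, 2, rank=1)
w_eff = effective_weight(w, b, a)
```

For a 1024 x 1024 layer with rank 8, `trainable_fraction(1024, 1024, 8)` is 0.015625, so under these assumptions only about 1.6% of that layer's parameters are tuned while the rest stay untouched.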

[2]: MultiAgent LLMs

read on: 16 Apr 2024
Agent

In this session, our readings cover:

Required Readings:

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

Taicheng Guo, Xiuying Chen, Yaqi Wang, Ruidi Chang, Shichao Pei, Nitesh V. Chawla, Olaf Wiest, Xiangliang Zhang
Large Language Models (LLMs) have achieved remarkable success across a wide array of tasks. Due to the impressive planning and reasoning abilities of LLMs, they have been used as autonomous agents to do many tasks automatically. Recently, based on the development of using one LLM as a single planning or decision-making agent, LLM-based multi-agent systems have achieved considerable progress in complex problem-solving and world simulation. To provide the community with an overview of this dynamic field, we present this survey to offer an in-depth discussion on the essential aspects of multi-agent systems based on LLMs, as well as the challenges. Our goal is for readers to gain substantial insights on the following questions: What domains and environments do LLM-based multi-agents simulate? How are these agents profiled and how do they communicate? What mechanisms contribute to the growth of agents’ capacities? For those interested in delving into this field of study, we also summarize the commonly used datasets or benchmarks for them to have convenient access. To keep researchers updated on the latest studies, we maintain an open-source GitHub repository, dedicated to outlining the research on LLM-based multi-agent systems.

[3]: LLM Agents

read on: 11 Apr 2024
Agent

Required Readings:

A Survey on Large Language Model based Autonomous Agents

https://arxiv.org/abs/2308.11432
Autonomous agents have long been a prominent research focus in both academic and industry communities. Previous research in this field often focuses on training agents with limited knowledge within isolated environments, which diverges significantly from human learning processes and thus makes it hard for the agents to reach human-like decisions. Recently, through the acquisition of vast amounts of web knowledge, large language models (LLMs) have demonstrated remarkable potential in achieving human-level intelligence. This has sparked an upsurge in studies investigating LLM-based autonomous agents. In this paper, we present a comprehensive survey of these studies, delivering a systematic review of the field of LLM-based autonomous agents from a holistic perspective. More specifically, we first discuss the construction of LLM-based autonomous agents, for which we propose a unified framework that encompasses a majority of the previous work. Then, we present a comprehensive overview of the diverse applications of LLM-based autonomous agents in the fields of social science, natural science, and engineering. Finally, we delve into the evaluation strategies commonly used for LLM-based autonomous agents. Based on the previous studies, we also present several challenges and future directions in this field. To keep track of this field and continuously update our survey, we maintain a repository of relevant references at this https URL.

[4]: Self-exam LLM and reasoning

read on: 09 Apr 2024
Reasoning

In this session, our readings cover:

Required Readings:

Augmented Language Models: a Survey

Grégoire Mialon, Roberto Dessì, Maria Lomeli, Christoforos Nalmpantis, Ram Pasunuru, Roberta Raileanu, Baptiste Rozière, Timo Schick, Jane Dwivedi-Yu, Asli Celikyilmaz, Edouard Grave, Yann LeCun, Thomas Scialom
This survey reviews works in which language models (LMs) are augmented with reasoning skills and the ability to use tools. The former is defined as decomposing a potentially complex task into simpler subtasks while the latter consists in calling external modules such as a code interpreter. LMs can leverage these augmentations separately or in combination via heuristics, or learn to do so from demonstrations. While adhering to a standard missing tokens prediction objective, such augmented LMs can use various, possibly non-parametric external modules to expand their context processing ability, thus departing from the pure language modeling paradigm. We therefore refer to them as Augmented Language Models (ALMs). The missing token objective allows ALMs to learn to reason, use tools, and even act, while still performing standard natural language tasks and even outperforming most regular LMs on several benchmarks. In this work, after reviewing current advances in ALMs, we conclude that this new research direction has the potential to address common limitations of traditional LMs such as interpretability, consistency, and scalability issues.

Self-Consistency Improves Chain of Thought Reasoning in Language Models

https://arxiv.org/abs/2203.11171
Chain-of-thought prompting combined with pre-trained large language models has achieved encouraging results on complex reasoning tasks. In this paper, we propose a new decoding strategy, self-consistency, to replace the naive greedy decoding used in chain-of-thought prompting. It first samples a diverse set of reasoning paths instead of only taking the greedy one, and then selects the most consistent answer by marginalizing out the sampled reasoning paths. Self-consistency leverages the intuition that a complex reasoning problem typically admits multiple different ways of thinking leading to its unique correct answer. Our extensive empirical evaluation shows that self-consistency boosts the performance of chain-of-thought prompting with a striking margin on a range of popular arithmetic and commonsense reasoning benchmarks, including GSM8K (+17.9%), SVAMP (+11.0%), AQuA (+12.2%), StrategyQA (+6.4%) and ARC-challenge (+3.9%).
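
The selection step described above can be sketched as a vote over the final answers of several sampled reasoning paths. This is a minimal sketch that uses an unweighted majority vote as one simple way to aggregate; the sampled paths are hard-coded stand-ins for what a model would return with temperature sampling, and the function name is hypothetical.

```python
from collections import Counter
from typing import Iterable, Tuple


def self_consistent_answer(samples: Iterable[Tuple[str, str]]) -> Tuple[str, float]:
    """Pick the most frequent final answer across sampled reasoning paths.

    Each sample is (reasoning_text, final_answer). The reasoning text is
    ignored when voting; only the final answers are counted.
    Returns the winning answer and its share of the votes.
    """
    votes = Counter(answer.strip() for _, answer in samples)
    if not votes:
        raise ValueError("need at least one sampled reasoning path")
    answer, count = votes.most_common(1)[0]
    return answer, count / sum(votes.values())


# Stand-in samples for the question "What is (3 + 4) * 2?" (illustrative only).
sampled_paths = [
    ("3 + 4 = 7, then 7 * 2 = 14", "14"),
    ("2 * 3 = 6 and 2 * 4 = 8, so 6 + 8 = 14", "14"),
    ("4 * 2 = 8, then 3 + 8 = 11", "11"),
]
best_answer, vote_share = self_consistent_answer(sampled_paths)
```

Two different reasoning paths reach 14 and one faulty path reaches 11, so the vote returns 14. Ties are broken by whichever answer was sampled first, which is a simplification of this sketch.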

If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

https://arxiv.org/abs/2401.00812
Ke Yang, Jiateng Liu, John Wu, Chaoqi Yang, Yi R. Fung, Sha Li, Zixuan Huang, Xu Cao, Xingyao Wang, Yiquan Wang, Heng Ji, Chengxiang Zhai
The prominent large language models (LLMs) of today differ from past language models not only in size, but also in the fact that they are trained on a combination of natural language and formal language (code). As a medium between humans and computers, code translates high-level goals into executable steps, featuring standard syntax, logical consistency, abstraction, and modularity. In this survey, we present an overview of the various benefits of integrating code into LLMs’ training data. Specifically, beyond enhancing LLMs in code generation, we observe that these unique properties of code help (i) unlock the reasoning ability of LLMs, enabling their applications to a range of more complex natural language tasks; (ii) steer LLMs to produce structured and precise intermediate steps, which can then be connected to external execution ends through function calls; and (iii) take advantage of code compilation and execution environment, which also provides diverse feedback for model improvement. In addition, we trace how these profound capabilities of LLMs, brought by code, have led to their emergence as intelligent agents (IAs) in situations where the ability to understand instructions, decompose goals, plan and execute actions, and refine from feedback are crucial to their success on downstream tasks. Finally, we present several key challenges and future directions of empowering LLMs with code.

[5]: Prompt Engineering

read on: 04 Apr 2024
Prompting

In this session, our readings cover:

Required Readings:

Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review

https://arxiv.org/abs/2310.14735
Banghao Chen, Zhaofeng Zhang, Nicolas Langrené, Shengxin Zhu
This paper delves into the pivotal role of prompt engineering in unleashing the capabilities of Large Language Models (LLMs). Prompt engineering is the process of structuring input text for LLMs and is a technique integral to optimizing the efficacy of LLMs. This survey elucidates foundational principles of prompt engineering, such as role-prompting, one-shot, and few-shot prompting, as well as more advanced methodologies such as the chain-of-thought and tree-of-thoughts prompting. The paper sheds light on how external assistance in the form of plugins can assist in this task, and reduce machine hallucination by retrieving external knowledge. We subsequently delineate prospective directions in prompt engineering research, emphasizing the need for a deeper understanding of structures and the role of agents in Artificial Intelligence-Generated Content (AIGC) tools. We discuss how to assess the efficacy of prompt methods from different perspectives and using different methods. Finally, we gather information about the application of prompt engineering in such fields as education and programming, showing its transformative potential. This comprehensive survey aims to serve as a friendly guide for anyone venturing through the big world of LLMs and prompt engineering.
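
As a small illustration of structuring input text, the sketch below assembles a prompt that combines role-prompting, few-shot examples, and an optional chain-of-thought cue. The template layout and the function name are hypothetical; the review does not prescribe a specific format.

```python
from typing import List, Tuple


def build_prompt(
    role: str,
    examples: List[Tuple[str, str]],
    question: str,
    chain_of_thought: bool = False,
) -> str:
    """Assemble a role + few-shot prompt as plain text.

    `examples` holds (question, answer) demonstrations; one pair gives
    one-shot prompting, several give few-shot prompting.
    """
    lines = [f"You are {role}.", ""]
    for demo_question, demo_answer in examples:
        lines += [f"Q: {demo_question}", f"A: {demo_answer}", ""]
    lines.append(f"Q: {question}")
    # A widely used zero-shot chain-of-thought cue; other phrasings exist.
    lines.append("A: Let's think step by step." if chain_of_thought else "A:")
    return "\n".join(lines)


prompt = build_prompt(
    role="a careful math tutor",
    examples=[("What is 2 + 2?", "4")],
    question="What is 3 + 5?",
    chain_of_thought=True,
)
```

The resulting string would be sent to whichever model is in use; keeping the builder separate from the model call makes it easy to compare prompt variants on the same questions.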

[6]: Model editing and Disgorgement

read on: 26 Mar 2024
ModelEdit

In this session, our readings cover:

Required Readings:

Editing Large Language Models: Problems, Methods, and Opportunities

https://arxiv.org/abs/2305.13172
Yunzhi Yao, Peng Wang, Bozhong Tian, Siyuan Cheng, Zhoubo Li, Shumin Deng, Huajun Chen, Ningyu Zhang
Despite the ability to train capable LLMs, the methodology for maintaining their relevancy and rectifying errors remains elusive. To this end, the past few years have witnessed a surge in techniques for editing LLMs, the objective of which is to efficiently alter the behavior of LLMs within a specific domain without negatively impacting performance across other inputs. This paper embarks on a deep exploration of the problems, methods, and opportunities related to model editing for LLMs. In particular, we provide an exhaustive overview of the task definition and challenges associated with model editing, along with an in-depth empirical analysis of the most progressive methods currently at our disposal. We also build a new benchmark dataset to facilitate a more robust evaluation and pinpoint enduring issues intrinsic to existing techniques. Our objective is to provide valuable insights into the effectiveness and feasibility of each editing technique, thereby assisting the community in making informed decisions on the selection of the most appropriate method for a specific task or context. Code and datasets are available at this https URL.
Comments: EMNLP 2023. Updated with new experiments

[7]: Domain Centered FMs

read on: 21 Mar 2024
DomainAdapt

In this session, our readings cover:

Required Readings:

Large Language Models for Software Engineering: A Systematic Literature Review

Large Language Models (LLMs) have significantly impacted numerous domains, including Software Engineering (SE). Many recent publications have explored LLMs applied to various SE tasks. Nevertheless, a comprehensive understanding of the application, effects, and possible limitations of LLMs on SE is still in its early stages. To bridge this gap, we conducted a systematic literature review on LLM4SE, with a particular focus on understanding how LLMs can be exploited to optimize processes and outcomes. We collect and analyze 229 research papers from 2017 to 2023 to answer four key research questions (RQs). In RQ1, we categorize different LLMs that have been employed in SE tasks, characterizing their distinctive features and uses. In RQ2, we analyze the methods used in data collection, preprocessing, and application highlighting the role of well-curated datasets for successful LLM for SE implementation. RQ3 investigates the strategies employed to optimize and evaluate the performance of LLMs in SE. Finally, RQ4 examines the specific SE tasks where LLMs have shown success to date, illustrating their practical contributions to the field. From the answers to these RQs, we discuss the current state-of-the-art and trends, identifying gaps in existing research, and flagging promising areas for future study.

[8]: Knowledge Augmented FMs

read on: 14 Mar 2024
RAG

In this session, our readings cover:

Required Readings:

Retrieval-Augmented Generation for AI-Generated Content: A Survey

https://arxiv.org/abs/2402.19473v1
The development of Artificial Intelligence Generated Content (AIGC) has been facilitated by advancements in model algorithms, scalable foundation model architectures, and the availability of ample high-quality datasets. While AIGC has achieved remarkable performance, it still faces challenges, such as the difficulty of maintaining up-to-date and long-tail knowledge, the risk of data leakage, and the high costs associated with training and inference. Retrieval-Augmented Generation (RAG) has recently emerged as a paradigm to address such challenges. In particular, RAG introduces the information retrieval process, which enhances AIGC results by retrieving relevant objects from available data stores, leading to greater accuracy and robustness. In this paper, we comprehensively review existing efforts that integrate RAG techniques into AIGC scenarios. We first classify RAG foundations according to how the retriever augments the generator. We distill the fundamental abstractions of the augmentation methodologies for various retrievers and generators. This unified perspective encompasses all RAG scenarios, illuminating advancements and pivotal technologies that help with potential future progress. We also summarize additional enhancement methods for RAG, facilitating effective engineering and implementation of RAG systems. Then from another view, we survey practical applications of RAG across different modalities and tasks, offering valuable references for researchers and practitioners. Furthermore, we introduce the benchmarks for RAG, discuss the limitations of current RAG systems, and suggest potential directions for future research. Project: this https URL

Retrieval-Augmented Generation for Large Language Models: A Survey

https://arxiv.org/abs/2312.10997
Large language models (LLMs) demonstrate powerful capabilities, but they still face challenges in practical applications, such as hallucinations, slow knowledge updates, and lack of transparency in answers. Retrieval-Augmented Generation (RAG) refers to the retrieval of relevant information from external knowledge bases before answering questions with LLMs. RAG has been demonstrated to significantly enhance answer accuracy and reduce model hallucination, particularly for knowledge-intensive tasks. By citing sources, users can verify the accuracy of answers and increase trust in model outputs. It also facilitates knowledge updates and the introduction of domain-specific knowledge. RAG effectively combines the parameterized knowledge of LLMs with non-parameterized external knowledge bases, making it one of the most important methods for implementing large language models. This paper outlines the development paradigms of RAG in the era of LLMs, summarizing three paradigms: Naive RAG, Advanced RAG, and Modular RAG. It then provides a summary and organization of the three main components of RAG: retriever, generator, and augmentation methods, along with key technologies in each component. Furthermore, it discusses how to evaluate the effectiveness of RAG models, introducing two evaluation methods for RAG, emphasizing key metrics and abilities for evaluation, and presenting the latest automatic evaluation framework. Finally, potential future research directions are introduced from three aspects: vertical optimization, horizontal scalability, and the technical stack and ecosystem of RAG.
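
The retrieve-then-answer flow can be sketched in a few lines. This is a toy version of the naive setup, assuming a word-overlap scorer in place of a real retriever and stopping at the prompt that would be handed to a generator; the function names and the prompt wording are hypothetical.

```python
import re
from typing import List, Set


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, documents: List[str], k: int = 2) -> List[str]:
    """Return the k documents sharing the most words with the query.

    Word overlap is a stand-in for a sparse or dense retriever (an assumption).
    """
    query_words = tokenize(query)
    ranked = sorted(documents, key=lambda d: len(query_words & tokenize(d)), reverse=True)
    return ranked[:k]


def build_rag_prompt(query: str, documents: List[str], k: int = 2) -> str:
    """Place retrieved passages ahead of the question, numbered for citation."""
    context = retrieve(query, documents, k)
    numbered = "\n".join(f"[{i}] {doc}" for i, doc in enumerate(context, start=1))
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"{numbered}\n\nQuestion: {query}\nAnswer:"
    )


knowledge_base = [
    "LoRA adds low-rank matrices to frozen weights.",
    "Self-consistency samples several reasoning paths and votes.",
    "Retrieval-augmented generation retrieves documents before answering.",
]
question = "How does self-consistency pick an answer?"
rag_prompt = build_rag_prompt(question, knowledge_base, k=1)
```

Numbering the passages is what lets a reader check an answer against its sources, and updating `knowledge_base` changes what the model sees without retraining it.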

Even More

A Survey of Table Reasoning with Large Language Models

Xuanliang Zhang, Dingzirui Wang, Longxu Dou, Qingfu Zhu, Wanxiang Che
https://arxiv.org/abs/2402.08259
Table reasoning aims to generate the answer to a question, following the user requirement, according to the provided table and optionally a text description of the table, effectively improving the efficiency of obtaining information. Recently, using Large Language Models (LLMs) has become the mainstream method for table reasoning, because it not only significantly reduces the annotation cost but also exceeds the performance of previous methods. However, existing research still lacks a summary of LLM-based table reasoning works. As a result, questions about which techniques can improve table reasoning performance in the era of LLMs, why LLMs excel at table reasoning, and how to enhance table reasoning abilities in the future remain largely unexplored. This gap significantly limits progress in research. To answer the above questions and advance table reasoning research with LLMs, we present this survey to analyze existing research, inspiring future work. In this paper, we analyze the mainstream techniques used to improve table reasoning performance in the LLM era, and the advantages of LLMs compared to pre-LLMs for solving table reasoning. We provide research directions from both the improvement of existing methods and the expansion of practical applications to inspire future research.

Here is a name list of posts!

- LLM fine tuning (29 minute read)
- MultiAgent LLMs (16 minute read)
- LLM Agents (23 minute read)
- Self-exam LLM and reasoning (19 minute read)
- Prompt Engineering (26 minute read)
- Model editing and Disgorgement (19 minute read)
- Domain Centered FMs (23 minute read)
- Knowledge Augmented FMs (17 minute read)

Here is a detailed list of posts!

[1]: LLM fine tuning

Required Readings:

Recent Large Language Models Reshaping the Open-Source Arena

Instruction Tuning for Large Language Models: A Survey

Delta tuning: A comprehensive study of parameter efficient methods for pre-trained language models

More Readings:

Gemini: A Family of Highly Capable Multimodal Models

QLoRA: Efficient Finetuning of Quantized LLMs

related: LoRA: Low-Rank Adaptation of Large Language Models

Astraios: Parameter-Efficient Instruction Tuning Code Large Language Models

[2]: MultiAgent LLMs

Required Readings:

Large Language Model based Multi-Agents: A Survey of Progress and Challenges

More Readings:

Understanding the planning of LLM agents: A survey

LLM Agents can Autonomously Hack Websites

Agent-FLAN: Designing Data and Methods of Effective Agent Tuning for Large Language Models

Humanoid Locomotion as Next Token Prediction

[3]: LLM Agents

Required Readings:

A Survey on Large Language Model based Autonomous Agents

More Readings:

Position Paper: Agent AI Towards a Holistic Intelligence

Tool Use in LLMs

Practices for Governing Agentic AI Systems

Emergent autonomous scientific research capabilities of large language models

What Makes a Dialog Agent Useful?

[4]: Self-exam LLM and reasoning

Required Readings:

Augmented Language Models: a Survey

Self-Consistency Improves Chain of Thought Reasoning in Language Models

If LLM Is the Wizard, Then Code Is the Wand: A Survey on How Code Empowers Large Language Models to Serve as Intelligent Agents

More Readings:

ReAct: Synergizing Reasoning and Acting in Language Models

Towards Reasoning in Large Language Models: A Survey

Large Language Models Can Self-Improve

Orca 2: Teaching Small Language Models How to Reason

[5]: Prompt Engineering

Required Readings:

Unleashing the potential of prompt engineering in Large Language Models: a comprehensive review

More Readings:

Skeleton-of-Thought: Large Language Models Can Do Parallel Decoding

Topologies of Reasoning: Demystifying Chains, Trees, and Graphs of Thoughts

[6]: Model editing and Disgorgement

Required Readings:

Editing Large Language Models: Problems, Methods, and Opportunities

More Readings:

Tuning Language Models by Proxy

A Survey of Machine Unlearning

AI Model Disgorgement: Methods and Choices

[7]: Domain Centered FMs

Required Readings:

Large Language Models for Software Engineering: A Systematic Literature Review

More Readings:

Large language models generate functional protein sequences across diverse families

Large Language Models in Law: A Survey

ChemLLM: A Chemical Large Language Model

FunSearch: Making new discoveries in mathematical sciences using Large Language Models

Transforming the future of music creation

Segment Anything

EMO: Emote Portrait Alive - Generating Expressive Portrait Videos with Audio2Video Diffusion Model under Weak Conditions

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

BloombergGPT: A Large Language Model for Finance

Emu Video: Factorizing Text-to-Video Generation by Explicit Image Conditioning

[8]: Knowledge Augmented FMs

Required Readings:

Retrieval-Augmented Generation for AI-Generated Content: A Survey

Retrieval-Augmented Generation for Large Language Models: A Survey

More Readings:

Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models

A Comprehensive Study of Knowledge Editing for Large Language Models

Even More

A Survey of Table Reasoning with Large Language Models
