About me

My name is Yu-Sheng (Ethan) Su. I am currently a Research Fellow hosted by Eric Xing at CMU / MBZUAI, working on large-scale pre-trained language models (LLMs). I received my Ph.D. in 2023 from the Department of Computer Science and Technology at Tsinghua University. Throughout my Ph.D. (2019 to 2023), I had the privilege of being advised by Zhiyuan Liu and of joining the THUNLP Lab hosted by Maosong Sun. I also work closely with a start-up team, ModelBest. For further details on my academic research and experience, please refer to my [Google Scholar].

On the Job Market

I’m on the job market, looking for academic and industrial research positions related to LLMs. [Google Scholar]

Research

My research spans natural language processing and machine learning, with a specific focus on large language models (LLMs). I am particularly interested in how to better pre-train, fine-tune/instruction-tune, and evaluate LLMs, and how to advance them in real-world scenarios. My research broadly covers the following topics:

Retrieval-Augmented LLMs. Equip LLMs with the capability to leverage externally retrieved information, thereby enhancing their understanding and increasing their trustworthiness. (CokeBERT, CSS-LM) [Talk20230723]
(Fine-tune/Instruction-tune) Computation-Efficient Tuning. Develop theories, tools, and algorithms to tune LLMs, enabling them to better understand human instructions and efficiently adapt to downstream tasks in a computation-friendly manner (e.g., parameter-efficient tuning methods, instruction tuning). (Prompt Transferability, IPT, Parameter-efficient Fine-tuning Survey, APET) [Talk20230723, Talk20240307]

Recently, I have been focusing more on:

Autonomous Agents. Develop agents (based on LLMs) that can autonomously interact with the external environment (or humans) to self-improve and drive long-horizon decision-making, thereby accomplishing more complex tasks in the real world. (AgentVerse, XAgent, Tool Learning, ChatDev) Note: I am currently exploring ways to transform an assistant agent into an autonomous agent. [Talk20240228, Talk20240307]

AI Alignment. Interpretability - Model Emotion, Scalable Oversight - ChatEval

News

[May. 2024] Thrilled to announce that ChatDev was accepted by ACL 2024.
[May. 2024] I will attend the ICLR conference (Vienna, Austria). Let’s grab a coffee. Feel free to DM me on Twitter.
[Mar. 2024] Invited Research Talk at Microsoft Research - AI Frontiers hosted by Guoqing Zheng. Topic: Efficient Tuning of Large-scale Pre-trained Language Models (LLMs) [slide]
[Feb. 2024] Invited Research Talk at Microsoft Research - Semantic Machines hosted by Ben Van Durme. Topic: Next-Generation Co-pilot: From Assistant Agent to Autonomy Agent [slide]
[Jan. 2024] Thrilled to announce that AgentVerse and ChatEval were accepted by ICLR 2024.
[Jan. 2024] Invited Research Talk at Alibaba DAMO Academy. Topic: From Assistant Agents (Co-pilots) to Autonomous Agents [slide].
[Dec. 2023] I will attend the EMNLP (Singapore) and NeurIPS (New Orleans, U.S.A.) conferences. Feel free to DM me on Twitter or email yushengsu.thu@gmail.com and say “Hi”.
[Oct. 2023] Thrilled to announce that Exploring the Impact of Model Scaling on Parameter-efficient Tuning Methods was accepted by EMNLP 2023 as a main conference paper.
[Oct. 2023] Thrilled to announce that we propose a next-generation AI agent, XAgent, that can accomplish more challenging tasks in the world.
[Aug. 2023] I got my Ph.D.! Thanks to all those who trusted and supported me.
[Jul. 2023] Invited Research Talk at CMU/MBZUAI hosted by Eric P. Xing. Topic: Efficient Adaptation of Large-scale Pre-trained Language Models [slide].
[May. 2023] AgentVerse was published. It provides a flexible framework that simplifies the process of building LLM-based agents to accomplish various tasks in the real world.
[Apr. 2023] Tool Learning survey was published. It demonstrates how recently proposed LLMs leverage the emerging ability to comprehend, create, and manipulate tools, thereby assisting humans in accomplishing their intended objectives.
[Jan. 2023] Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models was accepted by Nature Machine Intelligence (Cover Article).
[Sep. 2022] Advancement of Foundation Models. Invited Talk (on-line) @ SAP - AI Research, Headquarters, Germany.
[Jul. 2022] I will give oral presentations of our two accepted papers at NAACL 2022 (Seattle, USA). Feel free to DM me on Twitter or email yushengsu.thu@gmail.com and say “Hi”.
[Apr. 2022] Transferability of Prompt Tuning and Knowledge Inheritance were accepted to NAACL 2022.

Publications

Communicative agents for software development
Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, Maosong Sun
ACL 2024. [pdf] [code]
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
Yusheng Su^*, Weize Chen^*, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou ( ^* indicates equal contribution)
ICLR 2024. [pdf] [code]
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu
ICLR 2024. [pdf] [code]
Exploring the Impact of Model Scaling on Parameter-efficient Tuning Methods
Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Xingzhi Sun, Guotong Xie, Zhiyuan Liu, Maosong Sun
EMNLP 2023. [pdf] [code]
Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models
Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun.
Nature Machine Intelligence 2023 (Cover Article). [pdf] [code]
On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
NAACL 2022 (Oral). [pdf] [code] [BibTex] [slide] [video]
Knowledge Inheritance for Pre-trained Language Models
Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
NAACL 2022 (Oral). [pdf] [code]
Exploring Low-dimensional Intrinsic Task Subspace via Prompt Tuning
Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou
ACL 2022 Findings. [pdf] [code]
CPM: A large-scale Generative Chinese Pre-trained Language Model
Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
AI OPEN 2021. [pdf] [code]
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Maosong Sun
WWW 2021 Workshop, IEEE/TASLP 2021. [pdf] [code] [slide]
CokeBERT: Contextual Knowledge Selection and Embedding Towards Enhanced Pre-Trained Language Models
Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun
EMNLP 2020 Findings, AI OPEN 2021. [pdf] [pdf] [code]

Under Review or Preprint Version

Human Emotion Knowledge Representation Emerges in Large Language Models and Supports Discrete Emotion Inference
Yusheng Su^*, Ming Li^*, Hsiu-Yuan Huang, Jiali Cheng, Xin Hu, Xinmiao Zhang, Huadong Wang, Yujia Qin, Xiaozhi Wang, Zhiyuan Liu, Dan Zhang ( ^* indicates equal contribution)
(Submitted to Nature Human Behaviour 2023). [pdf] [code] (Refactoring in progress; user-friendly toolkit coming soon)
Tool Learning with Foundation Models
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun
ArXiv 2023. [pdf] [code]

Efficient Training Projects

(Leader/Co-leader) Prompt Transferability. This system assists users in building a prompt bank, allowing them to save well-trained prompts. It also enables swift access and reuse of these prompts whenever the user requires them on unseen tasks and heterogeneous models.

Agents Projects

(Leader/Co-leader) AgentVerse. AgentVerse provides a framework that streamlines the process of developing custom multi-agent systems using LLMs in user-defined environments. This facilitates the design of more efficient multi-agent systems that can be applied to real-world applications. [Youtube1], [Youtube2]

(Member) XAgent. XAgent makes more effective decisions and executes efficient actions to accomplish tasks with an unprecedented degree of autonomy. [Youtube1], [Youtube2]

(Member) ChatDev. ChatDev creates customized software from natural-language ideas through LLM-powered multi-agent collaboration. [Youtube1], [Youtube2] [Andrew Ng’s talk]

(Member) Tool Learning. Tool learning for LLMs, with open-source solutions for ChatGPT-Plugins.

LLM Pre-training Projects

(Member) CPM-X. The first Chinese large-scale pre-training project, which released a series of LLMs in 2020-2021.

Talks

[Mar. 2024] Invited Research Talk at Microsoft Research - AI Frontiers hosted by Guoqing Zheng. Topic: Efficient Tuning of Large-scale Pre-trained Language Models (LLMs) [slide]
[Feb. 2024] Invited Research Talk at Microsoft Research - Semantic Machines hosted by Ben Van Durme. Topic: Next-Generation Co-pilot: From Assistant Agent to Autonomy Agent [slide]
[Jan. 2024] Invited Research Talk at Alibaba DAMO Academy. Topic: From Assistant Agents (Co-pilots) to Autonomous Agents [slide]
[Jul. 2023] Invited Research Talk at CMU/MBZUAI hosted by Eric P. Xing. Topic: Efficient Adaptation of Large-scale Pre-trained Language Models [slide].
[Sep. 2022] Invited Talk (on-line) @ SAP - AI Research, Headquarters, Germany. Topic: Advancement of Foundation Models
[Jul. 2022] Oral Talk @ NAACL 2022, [slide] [video]
[Apr. 2021] Spotlight Talks (on-line) @ WWW 2021 (Self-Supervised Learning Workshop)

Professional Services

Reviewer (Since 2021): ACL, NAACL, AACL, ACL Rolling Review, EMNLP, COLING, ICLR, ICML, IJCAI, AAAI

Work Experience

Tsinghua University - NLP Lab. (Beijing) 2019 - 2023

Ph.D. NLP Group (hosted by Maosong Sun), AI, Computer Science Department
Advised by Zhiyuan Liu.

MediaTek. (Taiwan) 2018 - 2019

Deep/Machine Learning Engineer Intern
Advised by Jing-Han Wang.

Microsoft. (Taiwan) 2015 - 2016

Research and Development Intern
Advised by Kuang-Chao Yeh and Gordon Chang.

Pre-doctoral Student Mentoring

(2021-2023) Chi-Min Chan: Tsinghua University (BS) -> Hong Kong University of Science and Technology (HKUST) (MS)
(2022-2023) Jiali Cheng: University of North Carolina (MS->PhD)
(2022-2023) Yu Xia: Peking University (MS) -> Tsinghua University (PhD)
(2022-2023) Xiuyuan Huang: University of Science and Technology Beijing (BS) -> Peking University (MS)

Yusheng Su