About me

My name is Yu-Sheng (Ethan) Su. I am currently a Research Fellow hosted by Eric Xing at CMU / MBZUAI, where I work on large-scale pre-trained language models (LLMs). I completed my Ph.D. in 2023 in the Department of Computer Science and Technology at Tsinghua University. Throughout my Ph.D. (2019 to 2023), I had the privilege of being advised by Zhiyuan Liu and of being a member of the THUNLP Lab, led by Maosong Sun. I also work closely with a start-up team, ModelBest. For further details on my academic research and experience, please refer to my [Google Scholar].

On the Job Market

I’m on the job market, looking for academic and industrial research positions related to LLMs. [Google Scholar]

Research

My research spans natural language processing and machine learning, with a specific focus on large language models (LLMs). I am particularly interested in how to better pre-train, fine-tune/instruction-tune, and evaluate LLMs, and in how to advance them in real-world scenarios. My research therefore broadly covers the following topics:

  • Retrieval-Augmented LLM: Equip LLMs with the capability to leverage externally retrieved information, thereby enhancing their understanding and increasing their trustworthiness. (CokeBERT, CSS-LM) [Talk20230723]

  • (Fine-tune/Instruction-tune) Computationally Efficient Tuning: Develop theories, tools, and algorithms to tune LLMs (e.g., parameter-efficient tuning methods, instruction tuning), enabling them to better understand human instructions and efficiently adapt to downstream tasks in a computation-friendly manner. (Prompt Transferability, IPT, Parameter-efficient Fine-tuning Survey, APET) [Talk20230723, Talk20240307]

Recently, I have been focusing more on:

  • Autonomous Agent: Develop LLM-based agents that can autonomously interact with the external environment (or humans) to self-improve and drive long-horizon decision-making, thereby accomplishing more complex tasks in the real world. (AgentVerse, XAgent, Tool Learning, ChatDev) Note: I am currently exploring ways to transform an assistant agent into an autonomous agent. [Talk20240228, Talk20240307]

News

Publications

  • AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
    Yusheng Su*, Weize Chen*, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou (* indicates equal contribution)
    ICLR 2024. [pdf] [code]

  • ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
    Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu
    ICLR 2024. [pdf] [code]

  • Exploring the Impact of Model Scaling on Parameter-efficient Tuning Methods
    Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Xingzhi Sun, Guotong Xie, Zhiyuan Liu, Maosong Sun
    EMNLP 2023. [pdf] [code]

  • Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models
    Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun
    Nature Machine Intelligence 2023 (Cover Article). [pdf] [code]

  • On Transferability of Prompt Tuning for Natural Language Processing
    Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
    NAACL 2022 (Oral). [pdf] [code] [BibTex] [slide] [video]

  • Knowledge Inheritance for Pre-trained Language Models
    Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
    NAACL 2022 (Oral). [pdf] [code]

  • Exploring Low-dimensional Intrinsic Task Subspace via Prompt Tuning
    Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou
    ACL 2022 Findings. [pdf] [code]

  • CPM: A Large-scale Generative Chinese Pre-trained Language Model
    Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
    AI Open 2021. [pdf] [code]

  • CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
    Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Maosong Sun
    WWW 2021 Workshop, IEEE/TASLP 2021. [pdf] [code] [slide]

  • CokeBERT: Contextual Knowledge Selection and Embedding Towards Enhanced Pre-Trained Language Models
    Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun
    EMNLP 2020 Findings, AI Open 2021. [pdf] [pdf] [code]

Under Review or Preprint Version

  • Human Emotion Knowledge Representation Emerges in Large Language Models and Supports Discrete Emotion Inference
    Yusheng Su*, Ming Li*, Hsiu-Yuan Huang, Jiali Cheng, Xin Hu, Xinmiao Zhang, Huadong Wang, Yujia Qin, Xiaozhi Wang, Zhiyuan Liu, Dan Zhang (* indicates equal contribution)
    (Submitted to Nature Human Behaviour 2023). [pdf] [code] (Refactoring; a user-friendly toolkit is coming soon)

  • Tool Learning with Foundation Models
    Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun
    arXiv 2023. [pdf] [code]

  • Communicative Agents for Software Development
    Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, Maosong Sun
    arXiv 2023. [pdf] [code]

Efficient Training Projects

  • (Leader/Co-leader) Prompt Transferability. This system helps users build a prompt bank in which they can save well-trained prompts, and then quickly retrieve and reuse those prompts on unseen tasks and heterogeneous models.

Agents Projects

  • (Leader/Co-leader) AgentVerse. AgentVerse provides a framework that streamlines the process of developing custom multi-agent systems using LLMs in user-defined environments. This facilitates the design of more efficient multi-agent systems that can be applied to real-world applications. [Youtube1], [Youtube2]

  • (Member) XAgent. XAgent makes more effective decisions and executes efficient actions to accomplish tasks with an unprecedented degree of autonomy. [Youtube1], [Youtube2]

  • (Member) ChatDev. ChatDev creates customized software from natural-language ideas through LLM-powered multi-agent collaboration. [Youtube1], [Youtube2], [Andrew Ng’s talk]

  • (Member) Tool Learning. Tool learning for LLMs: open-source solutions in the style of ChatGPT Plugins.

LLM Pre-training Projects

  • (Member) CPM-X. The first Chinese large-scale pre-trained language model project, which released a series of LLMs in 2020-2021.

Talks

Professional Services

Reviewer (since 2021): ACL, NAACL, AACL, ACL Rolling Review, EMNLP, COLING, ICLR, ICML, IJCAI, AAAI

Work Experiences

Tsinghua University - NLP Lab. (Beijing) 2019 - 2023

MediaTek. (Taiwan) 2018 - 2019

  • Deep/Machine Learning Engineer Intern
  • Advised by Jing-Han Wang.

Microsoft. (Taiwan) 2015 - 2016

Pre-doctoral Student Mentoring

  • (2021-2023) Chi-Min Chan: Tsinghua University (BS) -> Hong Kong University of Science and Technology (HKUST) (MS)
  • (2022-2023) Jiali Cheng: University of North Carolina (MS->PhD)
  • (2022-2023) Yu Xia: Peking University (MS) -> Tsinghua University (PhD)
  • (2022-2023) Xiuyuan Huang: University of Science and Technology Beijing (BS) -> Peking University (MS)