About me
My name is Yu-Sheng (Ethan) Su. I am a research scientist on the AMD GenAI team (which is hiring 🔥), where I work on large-scale foundation models, focusing on data, model architecture, and training efficiency optimization. Before joining AMD, I was a Postdoctoral Researcher hosted by Eric Xing at CMU / MBZUAI. I completed my Ph.D. in the Department of Computer Science and Technology at Tsinghua University. Throughout my Ph.D. (2019 to 2023), I had the privilege of being advised by Zhiyuan Liu as a member of the THUNLP Lab led by Maosong Sun. I also worked closely with several LLM start-up teams, including ModelBest and LLM360.
Hiring
AMD’s GenAI team focuses on building a series of foundation models and is hiring 🔥 for multiple roles, including Principal Research Scientist, Applied Research Scientist, and Research Intern. [Click here] to learn more, and feel free to reach out to me if you’re interested.
Research
My work and research focus on advancing models toward AGI. I therefore concentrate on three key areas: (1) scaling data volume and quality, (2) enhancing the robustness of model architectures, and (3) optimizing training efficiency, in order to push beyond the boundaries of current state-of-the-art model capabilities and enable rapid, iterative development [Google Scholar] [GitHub].
Talks
- [Mar. 2024] Invited Research Talk at Microsoft Research - AI Frontiers hosted by Guoqing Zheng. Topic: Efficient Tuning of Large-scale Pre-trained Language Models (LLMs) [slide]
- [Feb. 2024] Invited Research Talk at Microsoft Research - Semantic Machines hosted by Ben Van Durme. Topic: Next-Generation Co-pilot: From Assistant Agent to Autonomous Agent [slide]
- [Jan. 2024] Invited Research Talk at Alibaba DAMO Academy hosted by Ting-En Lin. Topic: From Assistant Agents (Co-pilots) to Autonomous Agents [slide]
- [Jul. 2023] Invited Research Talk at CMU/MBZUAI hosted by Eric P. Xing. Topic: Efficient Adaptation of Large-scale Pre-trained Language Models [slide].
- [Sep. 2022] Invited Talk (online) at SAP AI Research, Headquarters, Germany. Topic: Advancement of Foundation Models
News
- [Sep. 2024] Andrew Ng highlighted our work, ChatDev, in his talk at Sequoia Capital.
- [Aug. 2024] Joined the AMD GenAI team. We are hiring 🔥.
- [May 2024] Thrilled to announce that ChatDev was accepted to ACL 2024.
- [May 2024] I will attend ICLR 2024 (Vienna, Austria). Let’s grab a coffee; feel free to DM me via Twitter.
- [Mar. 2024] Invited Research Talk at Microsoft Research - AI Frontiers hosted by Guoqing Zheng. Topic: Efficient Tuning of Large-scale Pre-trained Language Models (LLMs) [slide]
- [Feb. 2024] Invited Research Talk at Microsoft Research - Semantic Machines hosted by Ben Van Durme. Topic: Next-Generation Co-pilot: From Assistant Agent to Autonomous Agent [slide]
- [Jan. 2024] Thrilled to announce that AgentVerse and ChatEval were accepted to ICLR 2024.
- [Jan. 2024] Invited Research Talk at Alibaba DAMO Academy. Topic: From Assistant Agents (Co-pilots) to Autonomous Agents [slide].
- [Dec. 2023] I will attend EMNLP (Singapore) and NeurIPS (New Orleans, USA). Feel free to DM me via Twitter / yushengsu.thu@gmail.com and say hi.
- [Oct. 2023] Thrilled to announce that Exploring the Impact of Model Scaling on Parameter-efficient Tuning Methods was accepted to the EMNLP 2023 main conference.
- [Oct. 2023] Thrilled to announce our next-generation AI agent, XAgent, which can accomplish more challenging tasks in the real world.
- [Aug. 2023] I got my Ph.D.! Thanks to everyone who trusted and supported me.
- [Jul. 2023] Invited Research Talk at CMU/MBZUAI hosted by Eric P. Xing. Topic: Efficient Adaptation of Large-scale Pre-trained Language Models [slide].
- [May 2023] AgentVerse was released. It provides a flexible framework that simplifies the process of building LLM-based agents to accomplish various real-world tasks.
- [Apr. 2023] Our Tool Learning survey was released. It demonstrates how recently proposed LLMs leverage emergent abilities to comprehend, create, and manipulate tools, thereby assisting humans in accomplishing their intended objectives.
- [Jan. 2023] Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models was accepted by Nature Machine Intelligence (Cover Article).
- [Sep. 2022] Invited Talk (online) at SAP AI Research, Headquarters, Germany. Topic: Advancement of Foundation Models.
- [Jul. 2022] I will give oral presentations of our two accepted papers at NAACL 2022 (Seattle, USA). Feel free to DM me via Twitter / yushengsu.thu@gmail.com and say hi.
- [Apr. 2022] Transferability of Prompt Tuning and Knowledge Inheritance were accepted to NAACL 2022.
Publications
ChatDev: Communicative Agents for Software Development
Chen Qian, Xin Cong, Cheng Yang, Weize Chen, Yusheng Su, Juyuan Xu, Zhiyuan Liu, Maosong Sun
ACL 2024. [pdf] [code]
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors in Agents
Yusheng Su*, Weize Chen*, Jingwei Zuo, Cheng Yang, Chenfei Yuan, Chen Qian, Chi-Min Chan, Yujia Qin, Yaxi Lu, Ruobing Xie, Zhiyuan Liu, Maosong Sun, Jie Zhou (* indicates equal contribution)
ICLR 2024. [pdf] [code]
ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate
Chi-Min Chan, Weize Chen, Yusheng Su, Jianxuan Yu, Wei Xue, Shanghang Zhang, Jie Fu, Zhiyuan Liu
ICLR 2024. [pdf] [code]
Exploring the Impact of Model Scaling on Parameter-efficient Tuning Methods
Yusheng Su, Chi-Min Chan, Jiali Cheng, Yujia Qin, Yankai Lin, Shengding Hu, Zonghan Yang, Ning Ding, Xingzhi Sun, Guotong Xie, Zhiyuan Liu, Maosong Sun
EMNLP 2023. [pdf] [code]
Parameter-efficient Fine-tuning of Large-scale Pre-trained Language Models
Ning Ding, Yujia Qin, Guang Yang, Fuchao Wei, Zonghan Yang, Yusheng Su, Shengding Hu, Yulin Chen, Chi-Min Chan, Weize Chen, Jing Yi, Weilin Zhao, Xiaozhi Wang, Zhiyuan Liu, Hai-Tao Zheng, Jianfei Chen, Yang Liu, Jie Tang, Juanzi Li, Maosong Sun
Nature Machine Intelligence 2023 (Cover Article). [pdf] [code]
On Transferability of Prompt Tuning for Natural Language Processing
Yusheng Su, Xiaozhi Wang, Yujia Qin, Chi-Min Chan, Yankai Lin, Zhiyuan Liu, Peng Li, Juanzi Li, Lei Hou, Maosong Sun, Jie Zhou
NAACL 2022 (Oral). [pdf] [code] [BibTex] [slide] [video]
Knowledge Inheritance for Pre-trained Language Models
Yujia Qin, Yankai Lin, Jing Yi, Jiajie Zhang, Xu Han, Zhengyan Zhang, Yusheng Su, Zhiyuan Liu, Peng Li, Maosong Sun, Jie Zhou
NAACL 2022 (Oral). [pdf] [code]
Exploring Low-dimensional Intrinsic Task Subspace via Prompt Tuning
Yujia Qin, Xiaozhi Wang, Yusheng Su, Yankai Lin, Ning Ding, Zhiyuan Liu, Juanzi Li, Lei Hou, Peng Li, Maosong Sun, Jie Zhou
ACL 2022 Findings. [pdf] [code]
CPM: A Large-scale Generative Chinese Pre-trained Language Model
Zhengyan Zhang, Xu Han, Hao Zhou, Pei Ke, Yuxian Gu, Deming Ye, Yujia Qin, Yusheng Su, Haozhe Ji, Jian Guan, Fanchao Qi, Xiaozhi Wang, Yanan Zheng, Guoyang Zeng, Huanqi Cao, Shengqi Chen, Daixuan Li, Zhenbo Sun, Zhiyuan Liu, Minlie Huang, Wentao Han, Jie Tang, Juanzi Li, Xiaoyan Zhu, Maosong Sun
AI OPEN 2021. [pdf] [code]
CSS-LM: A Contrastive Framework for Semi-supervised Fine-tuning of Pre-trained Language Models
Yusheng Su, Xu Han, Yankai Lin, Zhengyan Zhang, Zhiyuan Liu, Peng Li, Maosong Sun
WWW 2021 Workshop, IEEE/TASLP 2021. [pdf] [code] [slide]
CokeBERT: Contextual Knowledge Selection and Embedding Towards Enhanced Pre-Trained Language Models
Yusheng Su, Xu Han, Zhengyan Zhang, Peng Li, Zhiyuan Liu, Yankai Lin, Jie Zhou, Maosong Sun
EMNLP 2020 Findings, AI OPEN 2021. [pdf] [pdf] [code]
Under Review or Preprints
Human Emotion Knowledge Representation Emerges in Large Language Models and Supports Discrete Emotion Inference
Yusheng Su*, Ming Li*, Hsiu-Yuan Huang, Jiali Cheng, Xin Hu, Xinmiao Zhang, Huadong Wang, Yujia Qin, Xiaozhi Wang, Zhiyuan Liu, Dan Zhang (* indicates equal contribution)
(Submitted to Nature Human Behaviour 2023). [pdf] [code] (Refactoring; a user-friendly toolkit is coming soon)
Tool Learning with Foundation Models
Yujia Qin, Shengding Hu, Yankai Lin, Weize Chen, Ning Ding, Ganqu Cui, Zheni Zeng, Yufei Huang, Chaojun Xiao, Chi Han, Yi Ren Fung, Yusheng Su, Huadong Wang, Cheng Qian, Runchu Tian, Kunlun Zhu, Shihao Liang, Xingyu Shen, Bokai Xu, Zhen Zhang, Yining Ye, Bowen Li, Ziwei Tang, Jing Yi, Yuzhang Zhu, Zhenning Dai, Lan Yan, Xin Cong, Yaxi Lu, Weilin Zhao, Yuxiang Huang, Junxi Yan, Xu Han, Xian Sun, Dahai Li, Jason Phang, Cheng Yang, Tongshuang Wu, Heng Ji, Zhiyuan Liu, Maosong Sun
ArXiv 2023. [pdf] [code]
Efficient Training Projects
- (Leader/Co-leader) Prompt Transferability. This system helps users build a prompt bank for saving well-trained prompts, and enables swift access to and reuse of these prompts on unseen tasks and heterogeneous models.
Agents Projects
- (Leader/Co-leader) AgentVerse. AgentVerse provides a framework that streamlines the process of developing custom multi-agent systems using LLMs in user-defined environments, facilitating the design of more efficient multi-agent systems for real-world applications. [NVIDIA’s Official Blog], [YouTube1], [YouTube2]
- (Member) XAgent. XAgent makes more effective decisions and executes efficient actions to accomplish tasks with an unprecedented degree of autonomy. [YouTube1], [YouTube2]
- (Member) ChatDev. ChatDev creates customized software from natural language ideas through LLM-powered multi-agent collaboration. [YouTube1], [YouTube2], [Andrew Ng highlights our work at Sequoia Capital talk]
- (Member) Tool Learning. Tool learning for LLMs; an open-source alternative to ChatGPT Plugins.
LLM Pre-training Projects
- (Member) CPM-X. The first large-scale Chinese pre-training project, which released a series of LLMs in 2020-2021.
Experiences
Sailing Lab - CMU (U.S.) & MBZUAI (U.A.E.), 2023 - 2024
- Postdoctoral Researcher
- Advised and hosted by Eric P. Xing
THUNLP Lab - Tsinghua University (China), 2019 - 2023
- Ph.D.
- Advised by Zhiyuan Liu
- Hosted by Maosong Sun
MediaTek (Taiwan), 2018 - 2019
- Deep/Machine Learning Engineer Intern
- Advised by Jing-Han Wang.
Microsoft, 2015 - 2016
- Research and Development Intern
- Advised by Kuang-Chao Yeh and Gordon Chang.
Professional Services
Reviewer (Since 2021): ACL, NAACL, AACL, ACL Rolling Review, EMNLP, COLING, ICLR, ICML, IJCAI, AAAI, NeurIPS
Pre-doctoral Student Mentoring
- (2021-2023) Chi-Min Chan: Tsinghua University (BS) -> Hong Kong University of Science and Technology (HKUST) (MS)
- (2022-2023) Jiali Cheng: University of North Carolina (MS -> PhD)
- (2022-2023) Yu Xia: Peking University (MS) -> Tsinghua University (PhD)
- (2022-2023) Xiuyuan Huang: University of Science and Technology Beijing (BS) -> Peking University (MS)