Liwen Zhu 朱丽雯
Algorithm Research Engineer
Tencent AI Lab (also as AI Platform), Beijing / Shenzhen
I am an Algorithm Research Engineer at Tencent AI Lab. My research interests span Reinforcement Learning, Large Language Model Agents, and Game AI.
Currently, I am working on the JueWu AI project, building RL agents for Honor of Kings that surpass professional players, and developing Companion AI deployed at 100M+ DAU scale. I also explore LLM-based strategic reasoning in complex game environments.
Previously, I worked on RLHF and LLM post-training at Tencent WXG. I received my M.S. from Peking University (advised by Prof. Zongqing Lu and Prof. Yonghong Tian).
liwenzhu [at] pku.edu.cn
Experience
Algorithm Research Engineer — RL, LLM Agent, Game AI
- JueWu AI: PPO-based RL for Honor of Kings, surpassing KPL pros. Showcased at a national competition.
- Companion AI: Offline RL covering 80+ heroes, deployed at 100M+ DAU.
- LLM Game Agent: Strategic reasoning and planning in complex strategy games.

Algorithm Research Engineer — LLM Post-training, RLHF
- Built Reward Model pipeline for WeLM with gap alignment algorithm.
- Led PPO-based RLHF alignment across multiple SFT versions; reduced reward hacking.

Research Intern — Deep RL, Cloud Computing
- RL-based online job scheduling outperforming FIFO and SJF heuristics.
- Multi-task auxiliary learning to augment state representations.

Algorithm Intern — RL, NLP
- Modeled relation extraction as an RL problem for extracting entity relations from financial news.
- Aspect-based sentiment analysis with Docker deployment on ES.

Algorithm Intern — Multi-Agent RL, Traffic Signal Control
- Decentralized traffic signal control via meta-learning with intrinsic rewards.
- Developed Shenzhen traffic simulation dataset and PengBo RL Platform.

Selected Publications
- Multi-Agent Coordination via Multi-Level Communication
  Ziluo Ding, Zeyuan Liu, Zhirui Fang, Kefan Su, Liwen Zhu, Zongqing Lu
  NeurIPS 2024
  [Paper]
- From Unknown to Known: An AI Coaching Problem in Open-World Environments
  Xuejie Liu, Anji Liu, Zhengxinyue, Liwen Zhu, Zihao Wang, Xiaojuan Tang, Haowei Lin, Haobo Fu, Yitao Liang
  EMNLP 2024
  [Paper]
- An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing
  Liwen Zhu*, Hang Dong*, Zhao Shan, Bo Qiao, Yonghong Tian
  arXiv 2024
  [Paper]
Patents
- Liwen Zhu, Guohua Wang, Zefeng Weng, Hai Li, Qianben Chen. A Method of Enterprise Relationship Inference for Financial Big Data.
- Yonghong Tian, Liwen Zhu, Peixi Peng, Wen Gao. An Intrinsic Rewarded Meta-Reinforcement Learning Method for Traffic Signal Control.
Education
M.S. in Computer Application Technology, GPA: 3.77 (Top 10%)
Advisors: Zongqing Lu, Yonghong Tian
Lab: Multimedia Learning Group, Institute of Digital Media (NELVT)
B.E. in Intelligent Traffic System, GPA: 3.91 (Top 1%)
Honors & Awards
| 2022 | Tencent Open Source Collaboration Award |
| 2022 | Star of Tomorrow Honor, Microsoft Research Asia |
| 2019 | Outstanding Graduates of Beijing |
| 2019 | Beijing Excellent Undergraduate Thesis |
| 2019 | Meritorious Winner, Mathematical Contest in Modeling (MCM/ICM) |
| 2018 | National Scholarship, Ministry of Education of China |
| 2018 | Grand Prize, iCAN International Innovation and Entrepreneurship Competition |
| 2018 | Silver Award, “Chuang Qing Chun” Capital Entrepreneurship Competition |
Academic Service
- Reviewer: ICML, NeurIPS, ICLR 2022/2023/2024/2025
- Invited Speaker: The 9th World Radar Expo (WRE), Nanjing, 2021
- Volunteer: IEEE MIPR 2020; New Generation AI Academician Summit Forum 2019