Liwen Zhu 朱丽雯

Algorithm Research Engineer

Tencent AI Lab (also known as AI Platform), Beijing / Shenzhen

I am an Algorithm Research Engineer at Tencent AI Lab. My research interests span Reinforcement Learning, Large Language Model Agents, and Game AI.

Currently, I am working on the JueWu AI project, building RL agents for Honor of Kings that surpass professional players, and developing Companion AI deployed at 100M+ DAU scale. I also explore LLM-based strategic reasoning in complex game environments.

Previously, I worked on RLHF and LLM post-training at Tencent WXG. I received my M.S. from Peking University (advised by Prof. Zongqing Lu and Prof. Yonghong Tian).

liwenzhu [at] pku.edu.cn

Experience

Oct 2023 – Present

Tencent TEG / AI Lab (also known as AI Platform), Shenzhen
Algorithm Research Engineer — RL, LLM Agent, Game AI

JueWu AI: PPO-based RL for Honor of Kings, surpassing KPL pros. Showcased in a national competition.
Companion AI: Offline RL covering 80+ heroes, deployed at 100M+ DAU.
LLM Game Agent: Strategic reasoning and planning in complex strategy games.

Jul 2022 – Oct 2023

Tencent WXG / Pattern Recognition Center, Beijing
Algorithm Research Engineer — LLM Post-training, RLHF

Built Reward Model pipeline for WeLM with gap alignment algorithm.
Led PPO-based RLHF alignment across multiple SFT versions; reduced reward hacking.

Oct 2021 – Mar 2022

Microsoft Research Asia, Beijing (Star of Tomorrow)
Research Intern — Deep RL, Cloud Computing

RL-based online job scheduling outperforming FIFO and SJF heuristics.
Multi-task auxiliary learning to augment state representations.

May 2021 – Jul 2022

Tencent, Beijing / Shenzhen
Algorithm Intern — RL, NLP

Modeled relation extraction as an RL problem, targeting entity relations in financial news.
Aspect-based sentiment analysis with Docker deployment on ES.

Jul 2019 – Jul 2022

Pengcheng Laboratory, Shenzhen
Algorithm Intern — Multi-Agent RL, Traffic Signal Control

Decentralized traffic signal control via meta-learning with intrinsic rewards.
Developed Shenzhen traffic simulation dataset and PengBo RL Platform.

Selected Publications

Meta Variationally Intrinsic Motivated Reinforcement Learning for Decentralized Traffic Signal Control
Liwen Zhu, Peixi Peng, Zongqing Lu, Yonghong Tian
IEEE Transactions on Knowledge and Data Engineering (TKDE), 2022
[Paper] [Code]
Enhance Reasoning for Large Language Models in the Game Werewolf
Shuang Wu*, Liwen Zhu*, Tao Yang, Shiwei Xu, Qiang Fu, Wei Yang, Haobo Fu
arXiv 2024
[Paper] [Code]
MTLight: Efficient Multi-Task Reinforcement Learning for Traffic Signal Control
Liwen Zhu, Peixi Peng, Zongqing Lu, Yonghong Tian
ICLR 2022 Workshop (Gamification and Multiagent Solutions)
[Paper] [Code]
Multi-Agent Coordination via Multi-Level Communication
Ziluo Ding, Zeyuan Liu, Zhirui Fang, Kefan Su, Liwen Zhu, Zongqing Lu
NeurIPS 2024
[Paper]
From Unknown to Known: An AI Coaching Problem in Open-World Environments
Xuejie Liu, Anji Liu, Zhengxinyue, Liwen Zhu, Zihao Wang, Xiaojuan Tang, Haowei Lin, Haobo Fu, Yitao Liang
EMNLP 2024
[Paper]
An Advanced Reinforcement Learning Framework for Online Scheduling of Deferrable Workloads in Cloud Computing
Liwen Zhu*, Hang Dong*, Zhao Shan, Bo Qiao, Yonghong Tian
arXiv 2024
[Paper]

Patents

Liwen Zhu, Guohua Wang, Zefeng Weng, Hai Li, Qianben Chen. A Method of Enterprise Relationship Inference for Financial Big Data.
Yonghong Tian, Liwen Zhu, Peixi Peng, Wen Gao. An Intrinsic Rewarded Meta-Reinforcement Learning Method for Traffic Signal Control.

Education

Sep 2019 – Jun 2022

Peking University, Beijing
M.S. in Computer Application Technology, GPA: 3.77 (Top 10%)
Advisors: Zongqing Lu, Yonghong Tian
Lab: Multimedia Learning Group, Institute of Digital Media (NELVT)

Sep 2015 – Jun 2019

Beijing University of Technology, Beijing
B.E. in Intelligent Traffic System, GPA: 3.91 (Top 1%)

Honors & Awards

2022	Tencent Open Source Collaboration Award
2022	Star of Tomorrow Honor, Microsoft Research Asia
2019	Outstanding Graduate of Beijing
2019	Beijing Excellent Undergraduate Thesis
2019	Meritorious Winner, Mathematical Contest in Modeling (MCM/ICM)
2018	National Scholarship, Ministry of Education of China
2018	Grand Prize, iCAN International Innovation and Entrepreneurship Competition
2018	Silver Award, “Chuang Qing Chun” Capital Entrepreneurship Competition

Academic Service

Reviewer: ICML, NeurIPS, ICLR 2022/2023/2024/2025
Invited Speaker: The 9th World Radar Expo (WRE), Nanjing, 2021
Volunteer: IEEE MIPR 2020; New Generation AI Academician Summit Forum 2019