Hengshuai Yao

I studied reinforcement learning at the Reinforcement Learning and Artificial Intelligence (RLAI) lab from 2008 to 2014 as a Ph.D. student in the Department of Computing Science, University of Alberta. My thesis is on model-based reinforcement learning with linear function approximation. During my Ph.D. studies, I worked with Csaba Szepesvari, Rich Sutton, Dale Schuurmans, and Davood Rafiei on reinforcement learning theory, algorithms, and web applications. I joined the NCSoft game studio in San Francisco in 2016 to work on reinforcement learning in games. I moved back to Canada and joined Huawei in 2017.

RL Lab:

Our lab focuses on reinforcement learning for practical problems, ranging from problems with large state-action spaces to robot safety, as well as on fundamental algorithm research.

Thesis

Publication

Here is the test performance of driving with a low quantile, $Q_{0.1}$, in CARLA (a 3D driving simulator) under a safety-based reward function as more and more samples become available. This motivates “Distributional Reinforcement Learning”!

Reviewing

AAAI, ICML, NIPS, UAI, ADPRL, AISTATS, CIKM, WWW.

RoboCup Soccer

I was a member of TsinghuAeolus (the soccer simulation team at Tsinghua University) for the World Cup Simulation League. Here is a game of our team, TsinghuAeolus, playing against Everest in the final match at RoboCup 2003. TsinghuAeolus was based on hierarchical reinforcement learning. The decisions a player has to make include dribbling, passing, running, shooting, goalkeeping, positioning (team position), resting, defending, etc. A high-level policy learns how to select a decision, and a low-level controller executes the selected decision. Both the high-level policy and the low-level controller were learned using reinforcement learning.
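
The two-level structure can be illustrated with a small sketch. This is not the TsinghuAeolus code: the toy task (a player at some distance from the goal), the two decisions, the tabular Q-learning rule for the high-level policy, and the hand-coded `low_level_controller` are all assumptions made for illustration. In the actual team, the low-level controllers were learned as well.

```python
import random

DECISIONS = ["dribble", "shoot"]  # hypothetical subset of the decisions listed above


class HighLevelPolicy:
    """Epsilon-greedy tabular Q-learning over decisions (an illustrative assumption)."""

    def __init__(self, decisions, epsilon=0.2, alpha=0.5, gamma=0.9, seed=0):
        self.decisions = decisions
        self.epsilon, self.alpha, self.gamma = epsilon, alpha, gamma
        self.q = {}  # maps (state, decision) to an estimated value
        self.rng = random.Random(seed)

    def value(self, state, decision):
        return self.q.get((state, decision), 0.0)

    def greedy(self, state):
        # max() returns the first maximal element, so ties go to the earlier decision.
        return max(self.decisions, key=lambda d: self.value(state, d))

    def select(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.decisions)
        return self.greedy(state)

    def update(self, state, decision, reward, next_state, done):
        target = reward
        if not done:
            target += self.gamma * self.value(next_state, self.greedy(next_state))
        old = self.value(state, decision)
        self.q[(state, decision)] = old + self.alpha * (target - old)


def low_level_controller(distance, decision):
    """Hand-coded stand-in that executes a decision.

    Returns (next_distance, reward, done). Shooting ends the episode and
    only scores from distance 0; dribbling moves one step closer.
    """
    if decision == "shoot":
        return distance, (1.0 if distance == 0 else 0.0), True
    return max(distance - 1, 0), 0.0, False


def train(episodes=500, max_steps=20):
    policy = HighLevelPolicy(DECISIONS)
    for _ in range(episodes):
        distance = 3  # every episode starts three steps from the goal
        for _ in range(max_steps):
            decision = policy.select(distance)
            nxt, reward, done = low_level_controller(distance, decision)
            policy.update(distance, decision, reward, nxt, done)
            distance = nxt
            if done:
                break
    return policy


policy = train()
```

After training, `policy.greedy(distance)` gives the decision the high-level policy prefers at that distance. In this toy task it should learn to dribble toward the goal and shoot only once it gets there.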

Tetris

I developed a policy iteration algorithm to play Tetris. In this version of the game, there are only the hard shapes, “S” and “Z”. The player was trained on data collected by playing the game randomly.
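
For readers unfamiliar with policy iteration, here is a minimal textbook sketch on a two-state toy problem. It is not the Tetris player described above: it assumes a small tabular MDP with a known model, whereas the Tetris player was trained from sampled game data. The names `P`, `R`, and `policy_iteration` are hypothetical and chosen only for this example.

```python
def policy_iteration(P, R, gamma=0.9, tol=1e-8):
    """Tabular policy iteration.

    P[s][a] is a list of (probability, next_state) pairs and
    R[s][a] is the expected immediate reward for action a in state s.
    """
    n_states = len(P)
    policy = [0] * n_states
    V = [0.0] * n_states

    def q_value(s, a):
        return R[s][a] + gamma * sum(p * V[s2] for p, s2 in P[s][a])

    while True:
        # Policy evaluation: sweep until the values of the current policy settle.
        while True:
            delta = 0.0
            for s in range(n_states):
                v = q_value(s, policy[s])
                delta = max(delta, abs(v - V[s]))
                V[s] = v
            if delta < tol:
                break

        # Policy improvement: switch to a strictly better action where one exists.
        stable = True
        for s in range(n_states):
            q = [q_value(s, a) for a in range(len(P[s]))]
            best = max(range(len(q)), key=q.__getitem__)
            if q[best] > q[policy[s]] + 1e-12:
                policy[s] = best
                stable = False
        if stable:
            return policy, V


# Toy MDP (an assumption): in state 0, action 1 moves to state 1 with reward 1;
# in state 1, action 0 stays put with reward 2. All other moves give reward 0.
P = [
    [[(1.0, 0)], [(1.0, 1)]],
    [[(1.0, 1)], [(1.0, 0)]],
]
R = [
    [0.0, 1.0],
    [2.0, 0.0],
]

policy, V = policy_iteration(P, R)
```

The loop alternates between evaluating the current policy and improving it greedily, and stops when no state changes its action.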

Contact

my first name at gmail