Academic Seminar
Warm Start Actor-Critic: From Approximation Error to Suboptimality Gap

Speaker: Prof. Junshan Zhang, University of California, Davis; IEEE Fellow

Time: 10:00 - 11:00, Sunday, August 6, 2023

Venue: Conference Room 304, Building 1, National Supercomputing Center in Changsha


Abstract: Conventional reinforcement learning (RL) techniques face the formidable challenges of high sample complexity and heavy computational load, which hinder RL's applicability in real-world tasks. To tackle this challenge, Warm-Start RL is emerging as a promising new paradigm: the basic idea is to accelerate online learning by starting from an initial policy trained offline. Owing to this knowledge transfer from an initial policy, Warm-Start RL has been successfully applied in AlphaZero and ChatGPT, demonstrating its great potential to speed up online learning. Despite these remarkable successes, a fundamental understanding of Warm-Start RL is lacking. The primary objective of this study is to quantify the impact of function approximation errors on the suboptimality gap of Warm-Start RL, focusing on the widely used Actor-Critic method. For the unbiased case, we give sufficient conditions on how good the warm-start policy needs to be in order to achieve fast convergence. For the biased case, our findings reveal that a 'good' warm-start policy (obtained by offline training) may be insufficient, and that bias reduction during online learning also plays an essential role in lowering the suboptimality gap. We then investigate bias reduction using adaptive ensemble learning and planning.
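For context, the sketch below illustrates the warm-start idea in its simplest form: a tabular actor-critic whose actor is initialized from an offline-trained policy instead of from scratch, and then refined with standard online TD-based updates. It is only an illustration of the paradigm under assumed settings (a toy random MDP, arbitrary hyperparameters, and a random table standing in for the offline policy); it is not the speaker's algorithm or theoretical analysis.

```python
# Minimal, illustrative sketch of warm-start actor-critic on a toy MDP.
# All environment details and hyperparameters here are assumptions for
# illustration only; this is not the method analyzed in the talk.
import numpy as np

rng = np.random.default_rng(0)

# Toy MDP: random transition kernel P[s, a] -> next-state distribution,
# and random rewards R[s, a] (assumed).
n_states, n_actions, gamma = 5, 3, 0.95
P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
R = rng.uniform(0, 1, size=(n_states, n_actions))

def softmax(x):
    z = x - x.max()
    e = np.exp(z)
    return e / e.sum()

# "Warm start": the actor's logits are initialized from an offline-trained
# policy (stood in here by a random table); a cold start would use zeros.
offline_logits = rng.normal(size=(n_states, n_actions))
theta = offline_logits.copy()   # actor parameters (tabular logits)
V = np.zeros(n_states)          # critic: state-value estimates

alpha_actor, alpha_critic = 0.05, 0.1
s = 0
for step in range(20_000):
    probs = softmax(theta[s])
    a = rng.choice(n_actions, p=probs)
    s_next = rng.choice(n_states, p=P[s, a])
    r = R[s, a]

    # Critic: TD(0) update of the state-value estimate.
    td_error = r + gamma * V[s_next] - V[s]
    V[s] += alpha_critic * td_error

    # Actor: policy-gradient step using the TD error as the advantage
    # estimate; grad of log softmax w.r.t. logits is (one-hot(a) - probs).
    grad_log_pi = -probs
    grad_log_pi[a] += 1.0
    theta[s] += alpha_actor * td_error * grad_log_pi

    s = s_next

print("greedy action per state:", theta.argmax(axis=1))
```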


About the speaker: Prof. Junshan Zhang is a professor at the University of California, Davis. His research interests include communication networks, the Internet of Things (IoT), fog computing, social networks, and smart grids. His current research focuses on fundamental problems in information networks and data science, including fog computing and its applications in IoT and 5G, IoT data privacy and security, optimization and control of mobile social networks, cognitive radio networks, and stochastic modeling and control of smart grids.

Prof. Zhang is an IEEE Fellow. He has served as TPC co-chair of several major conferences in communication networks, including IEEE INFOCOM 2012 and ACM MOBIHOC 2015, and as chair of ACM/IEEE SEC 2017, WiOPT 2016, and the IEEE Communication Theory Workshop 2007. He is a Distinguished Lecturer of the IEEE Communications Society. He is the Editor-in-Chief of IEEE Transactions on Wireless Communications, and has also served as a guest editor for IEEE/ACM Transactions on Networking and as an editor for IEEE Network Magazine, among other journals and magazines.


Invited by: Prof. Kenli Li


Contact: Guoqing Xiao