Research Interests
I am a research scientist on the RL and reasoning team at OpenAI, working on large-scale RL training, reasoning models, and SWE agents. I am a core contributor to o1 and o3. Previously, I spent 3.5 years at DeepMind working on fundamental research in RL and multi-armed bandits. See my talk at the Stanford RL forum on information-directed sampling for exploration.

Selected Publications
Leveraging Demonstrations to Improve Online Learning: Quality Matters
Botao Hao, Rahul Jain, Tor Lattimore, Benjamin Van Roy, Zheng Wen
ICML 2023. [arXiv]

Regret Bounds for Information-Directed Reinforcement Learning
Botao Hao, Tor Lattimore
NeurIPS 2022. [arXiv]

Contextual Information-Directed Sampling
Botao Hao, Tor Lattimore, Chao Qin
ICML 2022. [arXiv]

Online Sparse Reinforcement Learning
Botao Hao, Tor Lattimore, Csaba Szepesvári, Mengdi Wang
AISTATS 2021. [arXiv] [poster]

Information Directed Sampling for Sparse Linear Bandits
Botao Hao, Tor Lattimore, Wei Deng
NeurIPS 2021 (spotlight). [Proceedings] [slides]

High-Dimensional Sparse Linear Bandits
Botao Hao, Tor Lattimore, Mengdi Wang
NeurIPS 2020. [arXiv] [slides] [poster]

Adaptive Exploration in Linear Contextual Bandit
Botao Hao, Tor Lattimore, Csaba Szepesvári
AISTATS 2020. [arXiv] [slides]

Sparse and Low-rank Tensor Estimation via Cubic Sketchings
Botao Hao, Anru Zhang, Guang Cheng
IEEE Transactions on Information Theory (2020). [arXiv] [slides]
Accepted in part to AISTATS 2020.

Bootstrapping Upper Confidence Bound
Botao Hao, Yasin Abbasi-Yadkori, Zheng Wen, Guang Cheng
NeurIPS 2019. [arXiv] [poster]

Simultaneous Clustering and Estimation of Heterogeneous Graphical Models
Botao Hao, Will Wei Sun, Yufeng Liu, Guang Cheng
Journal of Machine Learning Research. [pdf] [slides]
