Sitemap
A list of all the posts and pages found on the site. For you robots out there, an XML version is available for digesting as well.
Pages
Posts
Future Blog Post
Published:
This post will show up by default. To disable scheduling of future posts, edit config.yml and set future: false.
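For reference, a minimal sketch of the relevant _config.yml setting (the future key is the standard Jekyll option; the comment is illustrative):

# _config.yml
# When false, Jekyll excludes posts dated in the future from the build.
future: false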
Blog Post number 4
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 3
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 2
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Blog Post number 1
Published:
This is a sample blog post. Lorem ipsum I can’t remember the rest of lorem ipsum and don’t have an internet connection right now. Testing testing testing this blog post. Blog posts are cool.
Portfolio
Portfolio item number 1
Short description of portfolio item number 1
Portfolio item number 2
Short description of portfolio item number 2
Publications
On the Linear Convergence of Policy Gradient under Hadamard Parametrization
Published in Information and Inference: A Journal of the IMA, 2023. https://arxiv.org/abs/2305.19575.
Recommended citation: Jiacai Liu, Jinchi Chen, and Ke Wei. On the Linear Convergence of Policy Gradient under Hadamard Parametrization. arXiv:2305.19575, 2023.
On the Convergence of Projected Policy Gradient for Any Constant Step Sizes
Published in arXiv, 2023. https://arxiv.org/abs/2311.01104.
Recommended citation: Jiacai Liu, Wenye Li, and Ke Wei. On the Convergence of Projected Policy Gradient for Any Constant Step Sizes. arXiv:2311.01104, 2023.
Elementary Analysis of Policy Gradient Methods
Published in arXiv, 2024. https://arxiv.org/abs/2404.03372.
Recommended citation: Jiacai Liu, Wenye Li, and Ke Wei. Elementary Analysis of Policy Gradient Methods. arXiv:2404.03372, 2024.
$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee
Published in ICLR, 2025. https://openreview.net/pdf?id=fh7GYa7cjO.
Recommended citation: Wenye Li, Jiacai Liu, and Ke Wei. $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore.
Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization
Published in arXiv, 2024. https://arxiv.org/pdf/2412.18279.
Recommended citation: Jiacai Liu, Chaojie Wang, Chris Yuhao Liu, Liang Zeng, Rui Yan, Yiwen Sun, Yang Liu, and Yahui Zhou. Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization. arXiv:2412.18279, 2024.
Talks
Some Progress on the Convergence of Policy Gradient Methods
Published:
Abstract: In this talk, I will present some novel convergence results for the policy gradient method and its variants from my papers. There are three main results: 1. The projected policy gradient converges in a finite number of iterations for any given constant step size. 2. The exact softmax policy gradient converges to the optimum for any given constant step size. 3. $\phi$-update, a class of policy update methods with a policy convergence guarantee, goes beyond the policy gradient descent (or mirror descent) framework.
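For context, result 1 concerns the projected policy gradient iteration. A standard form of the update, with notation assumed here rather than quoted from the talk ($\eta$ is the constant step size, $\Delta(\mathcal{A})$ the probability simplex over actions, and $V^{\pi}(\rho)$ the value function under initial distribution $\rho$):

$$\pi_{t+1} = \operatorname{Proj}_{\Delta(\mathcal{A})^{|\mathcal{S}|}}\left( \pi_t + \eta \, \nabla_{\pi} V^{\pi_t}(\rho) \right)$$

The claim is that this iteration reaches an optimal policy after finitely many steps for any constant $\eta > 0$.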
Teaching
Teaching experience 1
Undergraduate course, University 1, Department, 2014
This is a description of a teaching experience. You can use markdown like any other post.
Teaching experience 2
Workshop, University 1, Department, 2015
This is a description of a teaching experience. You can use markdown like any other post.