Publications

You can also find my articles on my Google Scholar profile.

Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization

Published in arXiv, 2024. https://arxiv.org/pdf/2412.18279.

Recommended citation: Jiacai Liu, Chaojie Wang, Chris Yuhao Liu, Liang Zeng, Rui Yan, Yiwen Sun, Yang Liu, and Yahui Zhou. Improving Multi-Step Reasoning Abilities of Large Language Models with Direct Advantage Policy Optimization. arXiv:2412.18279, 2024.

$\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee

Published in ICLR, 2025. https://openreview.net/pdf?id=fh7GYa7cjO.

Recommended citation: Wenye Li, Jiacai Liu, and Ke Wei. $\phi$-Update: A Class of Policy Update Methods with Policy Convergence Guarantee. In The Thirteenth International Conference on Learning Representations, ICLR 2025, Singapore.

Elementary Analysis of Policy Gradient Methods

Published in arXiv, 2024. https://arxiv.org/abs/2404.03372.

Recommended citation: Jiacai Liu, Wenye Li, and Ke Wei. Elementary Analysis of Policy Gradient Methods. arXiv:2404.03372, 2024.

On the Convergence of Projected Policy Gradient for Any Constant Step Sizes

Published in arXiv, 2023. https://arxiv.org/abs/2311.01104.

Recommended citation: Jiacai Liu, Wenye Li, and Ke Wei. On the Convergence of Projected Policy Gradient for Any Constant Step Sizes. arXiv:2311.01104, 2023.

On the Linear Convergence of Policy Gradient under Hadamard Parameterization

Published in Information and Inference: A Journal of the IMA, 2023. https://arxiv.org/abs/2305.19575.

Recommended citation: Jiacai Liu, Jinchi Chen, and Ke Wei. On the Linear Convergence of Policy Gradient under Hadamard Parameterization. arXiv:2305.19575, 2023.