Lihong Li (李力鸿)

Cited by

	All	Since 2019
Citations	25268	17225
h-index	65	56
i10-index	101	86

Year	Citations
2008	93
2009	189
2010	217
2011	342
2012	437
2013	544
2014	592
2015	831
2016	986
2017	1225
2018	1765
2019	2333
2020	3044
2021	3531
2022	3390
2023	3779
2024	1148

Public access

14 articles available
0 articles not available

Based on funding mandates

Co-authors

John Langford	Microsoft Research New York	Verified email at hunch.net
Michael Littman	Brown University	Verified email at brown.edu
Jianfeng Gao	Microsoft Research, Redmond	Verified email at microsoft.com
Wei Chu（褚崴）	Inf	Verified email at gatsby.ucl.ac.uk
Li Deng	Chief AI Officer, Citadel (former)	Verified email at ieee.org
Robert Schapire	Microsoft Research	Verified email at microsoft.com
Bo Dai	Google Brain & Georgia Tech	Verified email at google.com
Denny Zhou	Research Scientist, Google DeepMind	Verified email at google.com
Jianshu Chen	Principal Scientist, Amazon	Verified email at ucla.edu
Asli Celikyilmaz	Researcher @ FAIR at Meta AI	Verified email at ieee.org
Dale Schuurmans	University of Alberta, Google DeepMind	Verified email at cs.ualberta.ca
Zachary C. Lipton	Raj Reddy Associate Professor of Machine Learning @ Carnegie Mellon University; CTO + CSO @ Abridge	Verified email at cmu.edu
Yun-Nung (Vivian) Chen	National Taiwan University	Verified email at ieee.org
Emma Brunskill	Associate Professor of Computer Science, Stanford University	Verified email at cs.stanford.edu
Faisal Ahmed, PhD	Microsoft	Verified email at microsoft.com
Thomas J. Walsh	Sony AI	Verified email at sony.com
Xiujun Li	University of Washington / Apple	Verified email at cs.washington.edu
Chong Wang	Apple	Verified email at cs.princeton.edu
Csaba Szepesvari	DeepMind & University of Alberta	Verified email at cs.ualberta.ca
Ofir Nachum	OpenAI	Verified email at openai.com

Lihong Li (李力鸿)

Amazon

Verified email at amazon.com

Reinforcement Learning, Machine Learning, Artificial Intelligence


Title	Authors	Venue	Cited by	Year
A contextual-bandit approach to personalized news article recommendation	L Li, W Chu, J Langford, RE Schapire	Proceedings of the 19th international conference on World Wide Web, 661-670, 2010	3255	2010
An empirical evaluation of Thompson sampling	O Chapelle, L Li	Advances in neural information processing systems 24, 2011	1729	2011
Parallelized stochastic gradient descent	M Zinkevich, M Weimer, L Li, A Smola	Advances in neural information processing systems 23, 2010	1709	2010
Contextual bandits with linear payoff functions	W Chu, L Li, L Reyzin, R Schapire	Proceedings of the Fourteenth International Conference on Artificial …, 2011	1176	2011
Doubly robust policy evaluation and learning	M Dudík, J Langford, L Li	arXiv preprint arXiv:1103.4601, 2011	885	2011
Doubly Robust Policy Evaluation and Learning	M Dudík, J Langford, L Li		885*	
Neural approaches to conversational AI	J Gao, M Galley, L Li	The 41st international ACM SIGIR conference on research & development in …, 2018	873	2018
Doubly robust off-policy value evaluation for reinforcement learning	N Jiang, L Li	International conference on machine learning, 652-661, 2016	818	2016
Unbiased offline evaluation of contextual-bandit-based news article recommendation algorithms	L Li, W Chu, J Langford, X Wang	Proceedings of the fourth ACM international conference on Web search and …, 2011	651	2011
PAC model-free reinforcement learning	AL Strehl, L Li, E Wiewiora, J Langford, ML Littman	Proceedings of the 23rd international conference on Machine learning, 881-888, 2006	622	2006
Sparse Online Learning via Truncated Gradient	J Langford, L Li, T Zhang	Journal of Machine Learning Research 10 (3), 2009	591	2009
Towards a unified theory of state abstraction for MDPs	L Li, TJ Walsh, ML Littman	AI&M 1 (2), 3, 2006	582	2006
Taming the monster: A fast and simple algorithm for contextual bandits	A Agarwal, D Hsu, S Kale, J Langford, L Li, R Schapire	International Conference on Machine Learning, 1638-1646, 2014	546	2014
Towards end-to-end reinforcement learning of dialogue agents for information access	B Dhingra, L Li, X Li, J Gao, YN Chen, F Ahmed, L Deng	arXiv preprint arXiv:1609.00777, 2016	515*	2016
Doubly robust policy evaluation and optimization	M Dudík, D Erhan, J Langford, L Li		495	2014
End-to-end task-completion neural dialogue systems	X Li, YN Chen, L Li, J Gao, A Celikyilmaz	arXiv preprint arXiv:1703.01008, 2017	446	2017
Neuro-symbolic program synthesis	E Parisotto, A Mohamed, R Singh, L Li, D Zhou, P Kohli	arXiv preprint arXiv:1611.01855, 2016	396	2016
Reinforcement Learning in Finite MDPs: PAC Analysis	AL Strehl, L Li, ML Littman	Journal of Machine Learning Research 10 (11), 2009	365	2009
Breaking the curse of horizon: Infinite-horizon off-policy estimation	Q Liu, L Li, Z Tang, D Zhou	Advances in neural information processing systems 31, 2018	361	2018
Contextual bandit algorithms with supervised learning guarantees	A Beygelzimer, J Langford, L Li, L Reyzin, RE Schapire	arXiv preprint arXiv:1002.4058, 2010	341	2010
