Micah Carroll

Cited by

	All	Since 2019
Citations	885	884
h-index	8	8
i10-index	8	8

Citations per year
	2020	2021	2022	2023	2024
Citations	21	45	94	314	401

Public access

6 articles available
0 articles not available
Based on funding mandates

Co-authors

Anca D Dragan - Assistant Professor at UC Berkeley // Director, AI Safety and Alignment, Google DeepMind - Verified email at berkeley.edu
Rohin Shah - Research Scientist, Google DeepMind - Verified email at deepmind.com
Stuart Russell - Professor of Computer Science, University of California, Berkeley - Verified email at cs.berkeley.edu
Sam Devlin - Microsoft Research Cambridge - Verified email at microsoft.com
Katja Hofmann - Microsoft Research - Verified email at microsoft.com
David Scott Krueger - University Assistant Professor, University of Cambridge - Verified email at cam.ac.uk
Alan Chan - Centre for the Governance of AI; Mila, Université de Montréal - Verified email at mila.quebec
Dylan Hadfield-Menell - Massachusetts Institute of Technology - Verified email at csail.mit.edu
Smitha Milli - Cornell Tech - Verified email at berkeley.edu

Micah Carroll

PhD student, UC Berkeley

Verified email at berkeley.edu

AI Safety, AI Alignment, AI Influence, Recommender systems, Human-AI Collaboration


Title	Cited by	Year
On the Utility of Learning About Humans for Human-AI Coordination - M Carroll, R Shah, MK Ho, T Griffiths, S Seshia, P Abbeel, A Dragan - Advances in Neural Information Processing Systems, 2019, 5174-5185, 2019	360	2019
Open problems and fundamental limitations of reinforcement learning from human feedback - S Casper, X Davies, C Shi, TK Gilbert, J Scheurer, J Rando, R Freedman, ... - arXiv preprint arXiv:2307.15217, 2023	266	2023
Harms from Increasingly Agentic Algorithmic Systems - A Chan, R Salganik, A Markelius, C Pang, N Rajkumar, D Krasheninnikov, ... - Proceedings of the 2023 ACM Conference on Fairness, Accountability, and …, 2023	64*	2023
Estimating and Penalizing Induced Preference Shifts in Recommender Systems - M Carroll, A Dragan, S Russell, D Hadfield-Menell - International Conference on Machine Learning, 2022 (Spotlight), 2686-2708, 2022	58*	2022
Characterizing Manipulation from AI Systems - M Carroll, A Chan, H Ashton, D Krueger - EEAMO 2023, 2023	33	2023
Uni[MASK]: Unified inference in sequential decision problems - M Carroll, O Paradise, J Lin, R Georgescu, M Sun, D Bignell, S Milani, ... - NeurIPS 2022 (Oral), 2022	29*	2022
Engagement, user satisfaction, and the amplification of divisive content on social media - S Milli, M Carroll, Y Wang, S Pandey, S Zhao, AD Dragan - arXiv preprint arXiv:2305.16941, 2023	27*	2023
Evaluating the Robustness of Collaborative Agents - P Knott, M Carroll, S Devlin, K Ciosek, K Hofmann, AD Dragan, R Shah - AAMAS 2021 (Extended Abstract), 2021	27	2021
Optimal Behavior Prior: Data-Efficient Human Models for Improved Human-AI Collaboration - M Yang, M Carroll, A Dragan - NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022	6	2022
Time-Efficient Reward Learning via Visually Assisted Cluster Ranking - D Zhang, M Carroll, A Bobu, A Dragan - NeurIPS 2022 Human in the Loop Learning (HiLL) Workshop, 2022	5	2022
Who Needs to Know? Minimal Knowledge for Optimal Coordination - N Lauffer, A Shah, M Carroll, MD Dennis, S Russell - International Conference on Machine Learning 2023, 18599-18613, 2023	4	2023
Overview of current AI alignment approaches - M Carroll	4	2018
AI Alignment with Changing and Influenceable Reward Functions - M Carroll, D Foote, A Siththaranjan, S Russell, A Dragan - arXiv preprint arXiv:2405.17713, 2024	2	2024
