🔬 Research Summary by Andy Zou, a second-year PhD student at CMU, advised by Zico Kolter and Matt Fredrikson. He is also a cofounder of the Center for AI Safety (safe.ai). [Original paper by Andy Zou, Zifan … [Read more...] about Universal and Transferable Adversarial Attacks on Aligned Language Models
Education
Value-based Fast and Slow AI Nudging
🔬 Research summary by Dr. Marianna Ganapini, our Faculty Director. [Original paper by Marianna B. Ganapini, Francesco Fabiano, Lior Horesh, Andrea Loreggia, Nicholas Mattei, Keerthiram Murugesan, Vishal …
Tell me, what are you most afraid of? Exploring the Effects of Agent Representation on Information Disclosure in Human-Chatbot Interaction
🔬 Research Summary by Stephan Schlögl, a professor of Human-Centered Computing at MCI - The Entrepreneurial School in Innsbruck (Austria), where his research and teaching focus particularly on humans’ interactions with …
People are not coins: Morally distinct types of predictions necessitate different fairness constraints
🔬 Research Summary by Corinna Hertweck, a fourth-year PhD student at the University of Zurich and the Zurich University of Applied Sciences, where she is working on algorithmic fairness. [Original paper by …
Are we ready for a multispecies Westworld?
✍️ Column by Jeff Sebo and Leonie N. Bossert. Jeff Sebo is Clinical Associate Professor of Environmental Studies, Affiliated Professor of Bioethics, Medical Ethics, Philosophy, and Law, Director of the Animal Studies …




