Safety and Security

Siren’s Song in the AI Ocean: A Survey on Hallucination in Large Language Models

January 14, 2024

🔬 Research Summary by Leyang Cui, a senior researcher at Tencent AI Lab. [Original paper by Yue Zhang, Yafu Li, Leyang Cui, Deng Cai, Lemao Liu, Tingchen Fu, Xinting Huang, Enbo Zhao, Yu Zhang, Yulong Chen, …

Down the Toxicity Rabbit Hole: Investigating PaLM 2 Guardrails

January 3, 2024

🔬 Research Summary by Ashique KhudaBukhsh, an assistant professor at the Rochester Institute of Technology specializing in natural language processing, computational social science, and responsible AI. [Original …

Unpacking Human-AI interaction (HAII) in safety-critical industries

December 20, 2023

🔬 Research Summary by Tita Alissa Bach, Ph.D., a Principal Researcher on the Digital Transformation research team at DNV, Norway, focusing on Human Factors in AI in safety-critical industries. [Original paper …

A Machine Learning Challenge or a Computer Security Problem?

December 20, 2023

🔬 Research Summary by Ilia Shumailov, who holds a Ph.D. in Computer Science from the University of Cambridge and specializes in Machine Learning and Computer Security. During the Ph.D. under the supervision of Prof. Ross Anderson, Ilia …

LLM Platform Security: Applying a Systematic Evaluation Framework to OpenAI’s ChatGPT Plugins

December 7, 2023

🔬 Research Summary by Umar Iqbal, an assistant professor at Washington University in St. Louis, researching computer security and privacy. [Original paper by Umar Iqbal (Washington University in St. Louis), …
