

Modern AI has advanced rapidly with increasing compute and model capacity, enabling ever larger and more powerful intelligent systems. However, high-quality data is not scaling at the same rate, creating a mismatch between compute growth and data availability. Synthetic data offers a promising way to address this gap by enabling scalable and controllable data generation. In this talk, I will present our recent work that uses synthetic data to improve learning across three distinct domains: natural language processing, computer vision, and reinforcement learning. The first two projects show how synthetic data can be used to construct higher-quality training datasets, improving the quality and diversity of generative commonsense reasoning and enabling more accurate and robust deepfake detection in social media images. The third project moves beyond static datasets by incorporating diffusion models into reinforcement learning from human feedback, improving both sample and feedback efficiency in challenging sequential decision-making tasks.
Speaker Bio
Bei Peng is a Lecturer (Assistant Professor) in the School of Computer Science at the University of Sheffield. Her research focuses on deep reinforcement learning, multi-agent systems, and human-in-the-loop machine learning. Before joining Sheffield, Bei was a Lecturer in the Department of Computer Science at the University of Liverpool. Prior to that, she was a Postdoctoral Researcher in reinforcement learning in the Whiteson Research Lab at the University of Oxford, advised by Professor Shimon Whiteson. She received her Ph.D. in Computer Science from Washington State University, supervised by Professor Matthew E. Taylor.

