AI Seminar | Aligning Multimodal AI with Human Preference ...

Seminar Information

🎤 Speaker: Dr. Yu WANG, Principal Researcher at TikTok

📰 Title: Aligning Multimodal AI with Human Preference: From Language Understanding to Visual Generation

⏰ Time: 9:30-10:30, Beijing Time

📆 Date: 16 April, 2026 (Thursday)

📍 Online Zoom link:

https://hkust-gz-edu-cn.zoom.us/j/94595919503?pwd=tRTIRt2xNithvVwsa5OiyOu0Bli9q4.1

Zoom ID: 945 9591 9503

Passcode: ait

Seminar Abstract

How do we build AI systems that perceive the world across modalities, generate content that reflects human intent, and operate reliably at the scale of hundreds of millions of users? This talk presents a research program addressing this challenge through three interconnected pillars: cross-modal representation learning, preference-aligned generative models, and personalized creator-centric AI. I will begin by discussing how unified language model pre-training and multi-model neural architectures laid the foundation for bridging understanding and generation, with systems deployed in Microsoft Bing serving 70M+ users. I will then present our most recent contribution, Diffusion-LPO (ICLR 2026), which extends Direct Preference Optimization from pairwise to listwise rankings via the Plackett-Luce model, consistently outperforming pairwise baselines across text-to-image generation, image editing, and personalized alignment. Finally, I will outline a research plan spanning fine-grained video-language alignment, multi-objective preference optimization, efficient diffusion inference, and diversity-aware generation for ecosystem health. Throughout, I will discuss the failure modes encountered when deploying these models at internet scale (distribution shift, metric mismatch, and alignment brittleness) and how real-world deployment constraints motivate fundamental research questions.

Speaker Bio

Dr. Yu WANG

Principal Researcher at TikTok

Yu Wang is a Principal Researcher at TikTok, leading research on multimodal large models for content ecology and generative AI. He received his Ph.D. in Computer Science from Yale University and his B.Eng. from the National University of Singapore. Previously, he was a Senior Researcher at Microsoft Research and a Research Scientist at Samsung Research America. His research spans multimodal large language models, preference-aligned generation, and personalized content intelligence, with systems deployed at production scale in Microsoft Bing (70M+ users) and TikTok (hundreds of millions of users). He has published 35+ papers (26 as first author) at venues including ICLR, NeurIPS, ICML, AAAI, and ACL, with 4,220+ Google Scholar citations, an h-index of 17, and 4 granted U.S. patents. He is a recipient of the 2024 NSFC Excellent Young Scientists Fund (Overseas).
