👋 About Me

I am a master’s student at the School of Artificial Intelligence and Data Science, University of Science and Technology of China (USTC), expected to graduate in 2026. I received my bachelor’s degree from the School of the Gifted Young, USTC. My research interests include multi-task learning, representation learning, multimodal generation and understanding, and agentic reinforcement learning (RL).

I previously interned at the Beijing Academy of Artificial Intelligence (BAAI), ByteDance, and Xiaohongshu, working on multimodal generation and understanding, representation foundation models, generative search and recommendation, and AI search. My projects combine academic innovation with engineering impact.

If you are interested in my research, internship experience, or projects, or if you would like to discuss potential collaboration, feel free to contact me:

  • Phone: +86 17362950656
  • Email: yanruiran@mail.ustc.edu.cn
  • GitHub: RuiranYan
  • Google Scholar: Ruiran Yan

🎓 Education

  • M.Eng., School of Artificial Intelligence and Data Science, University of Science and Technology of China (ongoing)
  • B.S., School of the Gifted Young, University of Science and Technology of China

🚀 Selected Projects & Publications

  • OmniGen / OmniGen2 - General Image Generation and Understanding Models
    • As one of the core contributors to the OmniGen series, I worked on unified image generation and understanding models that support text-to-image generation, image editing, and a range of other visual tasks within a single framework, which enables knowledge transfer across tasks. I built a subject-driven multimodal data pipeline from scratch (GroundingDINO + SAM / SAM2 + Outpaint), used it to construct the large-scale subject-driven datasets X2I and X2I2, and used those datasets to train OmniGen and OmniGen2, significantly improving subject consistency and the naturalness of generated images.
    • Links: OmniGen Paper, OmniGen GitHub, OmniGen2 Paper, OmniGen2 GitHub
    • Tags: Multimodal, Image Generation, BAAI
  • Agent-R1 - Reinforcement Learning Framework for Agent Systems
    • As one of the core contributors to the open-source Agent-R1 project, I helped design an end-to-end reinforcement learning training framework for agentic systems. The framework supports complex scenarios such as multi-turn conversations and multi-tool use, and adapts mainstream algorithms such as PPO and GRPO to these settings so that large models can be trained with reinforcement learning on real-world tasks.
    • Links: Paper, GitHub
    • Tags: Agent, RL, Open Source
  • O1Embedder - Let Retrieval Models Think Before Retrieval
    • Traditional representation models often cannot take advantage of the reasoning ability of LLMs on complex multi-step reasoning tasks. O1Embedder constructs long chain-of-thought data to turn a large model's reasoning process into supervision signals, and unifies reasoning and representation learning through multi-task training, so that retrieval models understand and generalize better in complex-reasoning and zero-shot scenarios.
    • Links: Paper, GitHub
    • Tags: Representation Learning, Reasoning

💼 Engineering & Internship Experience

  • [2026.03 ~ 2026.05] - Xiaohongshu · Internationalization - AI Search / Agentic Search
  • [2025.05 ~ 2026.03] - ByteDance · Douyin Search - Personalized Pre-trained Models / Generative Search and Recommendation / LLM4Rec
  • [2024.07 ~ 2025.05] - Beijing Academy of Artificial Intelligence · Research Intern - Representation Models / Multimodal Generation and Understanding
  • [2022.10 ~ 2023.07] - OPPO · App Store - Multi-task Recommendation

🏆 Competitions & Awards

  • Meta KDD Cup 2024 · CRAG: Comprehensive RAG Benchmark - Global Top 4 (3,000+ participants) / Special Award for Complex Questions ($500 prize).