TurningPoint AI
AIGC Research Group

About us

We are a compact, hardcore research team focused on harnessing the power of multimodal reasoning. Our current focus includes:
  1. Multimodal
  2. O1-style reasoning
  3. Text-to-image generation

PROJECT HIGHLIGHT

VisualThinker-R1-Zero

DeepSeek R1 has demonstrated how reinforcement learning (RL) with well-designed rule-based rewards can enable a large language model to build unique reasoning capabilities autonomously. Since then, many researchers have attempted to extend this success to multimodal reasoning. However, recent efforts have largely struggled to reproduce the increasing response length and thinking patterns exhibited by DeepSeek R1.
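To make "rule-based rewards" concrete, here is a minimal Python sketch of one common shape such a reward can take: a format check plus an exact-match accuracy check. The tag layout (<think>...</think><answer>...</answer>), the function names, and the equal weighting are illustrative assumptions, not a description of the reward used by DeepSeek R1 or by this project.

```python
import re

# Assumed (hypothetical) response layout: reasoning in <think>, final answer in <answer>.
_RESPONSE = re.compile(r"<think>(.+?)</think>\s*<answer>(.+?)</answer>", re.DOTALL)


def format_reward(response: str) -> float:
    """1.0 if the whole response follows the assumed tag layout, else 0.0."""
    return 1.0 if _RESPONSE.fullmatch(response.strip()) else 0.0


def accuracy_reward(response: str, ground_truth: str) -> float:
    """1.0 if the extracted answer matches the reference (case-insensitive), else 0.0."""
    match = _RESPONSE.fullmatch(response.strip())
    if match is None:
        return 0.0
    answer = match.group(2).strip().lower()
    return 1.0 if answer == ground_truth.strip().lower() else 0.0


def rule_based_reward(response: str, ground_truth: str) -> float:
    """Sum of the two rule checks; the equal weighting is an arbitrary choice."""
    return format_reward(response) + accuracy_reward(response, ground_truth)
```

Because both checks are deterministic rules rather than a learned reward model, the score only reflects whether the output is well formed and whether the final answer is right; it says nothing about how the model should reason.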

VisualThinker-R1-Zero is a replication of DeepSeek-R1-Zero training on small multimodal models. We are the first to successfully observe the emergent "aha moment" and increased response length on multimodal tasks. By applying GRPO to an unaligned 2B base model, we observe that the model develops self-verification autonomously and exhibits an emergent ability to "take another look" at the image and correct its mistakes.
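For readers unfamiliar with GRPO, its core step is to score several sampled responses to the same prompt and normalize each reward against the group's mean and standard deviation, so no separate value network is needed. The sketch below shows only that normalization, as generally described for GRPO; the function name, the use of the population standard deviation, and the epsilon value are assumptions for illustration, not this project's training code.

```python
from statistics import mean, pstdev


def group_relative_advantages(rewards, eps=1e-6):
    """Normalize rewards within one group of responses sampled for the same prompt.

    Each advantage is (reward - group mean) / (group std + eps).
    eps is an assumed small constant that avoids division by zero
    when every response in the group receives the same reward.
    """
    mu = mean(rewards)
    sigma = pstdev(rewards)  # population std; some implementations use the sample std
    return [(r - mu) / (sigma + eps) for r in rewards]


# Example: four sampled responses to one image-question pair, scored by a rule-based reward.
advantages = group_relative_advantages([2.0, 1.0, 0.0, 1.0])
```

Responses scoring above the group average get a positive advantage and are reinforced, those below get a negative one, and a group with identical rewards contributes no learning signal.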

MOSSBench

More Projects

Join Us

At TurningPoint AI, there are no ranks or limits - only opportunities. We prioritize transparency, fairness, and integrity. Active contribution and team spirit are fundamental to our team culture. For research assistants and collaboration opportunities:

© 2024 by Turning Point.AI
