About me

Ciao! I am a final-year PhD student in Deep Learning and Computer Vision at the University of Trento, under the joint supervision of Prof. Elisa RICCI and Prof. Zhun ZHONG.

Recently, I completed my research visit at the University of California, Los Angeles (UCLA), where I worked on automatic urban simulation scene creation from city-tour videos under the guidance of Prof. Bolei ZHOU.

From June 2023 to December 2024, I also spent 18 wonderful months as a Visiting Researcher at NAVER LABS Europe, exploring open-vocabulary object detection under the supervision of Gabriela CSURKA, Riccardo VOLPI, and Tyler L. HAYES, in the team led by Diane Larlus.

Before starting my PhD, I earned two Master’s degrees—Summa Cum Laude—from KTH Royal Institute of Technology (Sweden) in Intelligent Autonomous Systems and from the University of Trento (Italy) in Mechatronics Engineering. I also spent three years as an Innovation Engineer at SIEMENS Smart Infrastructure Division, designing IoT-based automation solutions.


🚨On the Job Market:🚨 I’ll be graduating in Spring 2026 — if you’re working on the future of practical open-world machines and multimodal foundation models, let’s chat!


The Research I Like & Do: Training open-world machines to see, understand, and reason about our chaotic visual, semantic, and physical world — Going Better and Wilder!

  • Knowledge Discovery

    Uncovering structured and interpretable semantic and geometric knowledge from real-world visual data.

  • Open-vocabulary Recognition

    Empowering users to freely define their vocabulary to localize, segment, and recognize objects of interest within image streams in a zero-shot workflow.

  • Vision and Language

    Leveraging language as a medium to enhance vision tasks, delivering improved interpretability and interactivity in more real-world scenarios.

  • Urban Embodied AI Simulation

    Parsing our chaotic 3D semantic world from city-tour videos and turning it into high-fidelity urban simulation worlds for embodied AI learning.

Things I'd like to share

  • [10/2025]: After nine wonderful and inspiring months at UCLA, our new work UrbanVerse — scaling urban simulation scenes for embodied AI — is out! Check out the project here!
  • [01/2025]: I'm excited to share that I joined the Zhou Lab at UCLA as a visiting researcher, where I'll have the privilege of being advised by Prof. Bolei ZHOU.
  • [10/2025]: Honored to be selected as Outstanding Reviewer for CVPR 2025!
  • [07/2024]: I received $5,000 in funding (as credits) from OpenAI to support my research!
  • [05/2024]: One paper on incremental novel class discovery with large-scale pre-trained models was accepted as an Oral paper at ICPR 2024.
  • [05/2024]: Filed my first US Patent: "A Method for Using Semantic Hierarchy Trees to Increase the Robustness of Open-vocabulary Object Detection Models"!
  • [02/2024]: One paper on open-vocabulary object detection with semantic hierarchy (work done with NAVER LABS Europe) was accepted as a Highlight paper (2.8% acceptance rate) at CVPR 2024! Thanks to the team!
  • [01/2024]: One paper on discovering fine-grained semantic concepts with LLMs was accepted to ICLR 2024! See you in Vienna this May!

Selected Publications

UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos

Mingxuan Liu*, Honglin He*, Elisa Ricci, Wayne Wu, Bolei Zhou

Technical report, arXiv:2510.15018, 2025

Organizing Unstructured Image Collections using Natural Language

Mingxuan Liu, Zhun Zhong, Jun Li, Gianni Franchi, Subhankar Roy, Elisa Ricci

Technical report, arXiv:2410.05217, 2025

SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

Mingxuan Liu, Tyler L. Hayes, Gabriela Csurka, Elisa Ricci, Riccardo Volpi

The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, Highlight paper, 2.8% acceptance rate), 2024

Democratizing Fine-grained Visual Recognition with Large Language Models

Mingxuan Liu, Subhankar Roy, Wenjing Li, Zhun Zhong, Nicu Sebe, Elisa Ricci

International Conference on Learning Representations (ICLR), 2024

Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci

International Conference on Pattern Recognition (ICPR, Oral paper), 2024

Class-incremental Novel Class Discovery

Subhankar Roy*, Mingxuan Liu*, Zhun Zhong, Nicu Sebe, Elisa Ricci

* denotes co-first authorship

European Conference on Computer Vision (ECCV), 2022

Siemens RWG Control Platform Advanced Course and Practice

Jiaxin Han, Huixia Zhao, Kaixuan Zhang, Mingxuan Liu (Deputy Editor-in-Chief & Author)

China Electric Power Press (CEPP), 2022

ISBN: 9787519859947

Tutorial book about an IoT-based building automation control platform, covering Programmable Logic Controllers (PLC), Internet-of-Things (IoT), and cloud-based Software-as-a-Service (SaaS). Work carried out at SIEMENS.

Siemens Desigo CC Building Management System Software and Practice

Huixia Zhao, Jiaxin Han, Kaixuan Zhang, Chao Wang, Lin Feng, Jianqiao Feng, Jian Li, Mingxuan Liu (Author)

China Electric Power Press (CEPP), 2022

ISBN: 9787519853341

Tutorial book about building automation management software (Siemens Desigo CC). Work carried out at SIEMENS.

The V-SLAM Hurdler: A Faster V-SLAM System using Online Semantic Dynamic-and-Hardness-aware Approximation

Mingxuan Liu

Digitala Vetenskapliga Arkivet (DiVA, Master Thesis), 2022

Work done at Ericsson Lund, Sweden.

Project

ORB-SLAM3 Deployment on the Autonomous Underwater Vehicle (AUV) SAM in Simulation and the Real World

Mingxuan Liu

Work carried out at KTH Royal Institute of Technology, supervised by Prof. John Folkesson, 2021

Hot Steel Plate Tracking and Rotation Angle Detection

Mingxuan Liu

Work carried out at the Technical University of Munich summer school for SMS Group, 2020

Mini Cheetah Robotic Leg Design, and Kinematic and Dynamic Simulation

Mingxuan Liu

Work carried out at the University of Trento, supervised by Prof. Francesco Biral, 2020

Virtual RGB-D Camera Unity Implementation for Point Cloud Generation

Mingxuan Liu

Work carried out at the University of Trento, 2020

Community Service

  • ICLR: Reviewer'2025, 2026
  • NeurIPS: Reviewer'2024
  • ICML: Reviewer'2025
  • CVPR: Reviewer'2024, 2025
  • ICCV: Reviewer'2025
  • ECCV: Reviewer'2024
  • IJCV: Reviewer'2024

Resume

Education

  1. University of Trento

    Nov. 2022 — Present Trento, Trentino-Alto Adige, Italy

    PhD student in Deep Learning and Computer Vision

    Advisors: Elisa RICCI and Zhun ZHONG

  2. University of California, Los Angeles

    Jan. 2025 — Oct. 2025 Los Angeles, California, USA

    Visiting Researcher on automatic urban simulation scene creation from city-tour videos

    Advisor: Bolei ZHOU

  3. KTH Royal Institute of Technology

    Aug. 2021 — Jul. 2022 Stockholm, Sweden

    Master's degree in Intelligent Autonomous Systems

    Grade: A

  4. University of Trento

    Sep. 2020 — Aug. 2021 Trento, Trentino-Alto Adige, Italy

    Master's degree in Mechatronics Engineering

    Grade: 110L/110, Summa Cum Laude

Working Experience

  1. NAVER LABS Europe

    Jun. 2023 — Dec. 2024 Grenoble, Auvergne-Rhône-Alpes, France

    Position: Visiting Researcher in the Visual Representation Learning Team

    Advisors: Gabriela CSURKA, Riccardo VOLPI, and Tyler L. HAYES

    Responsibilities:

    (1) Improving how open-vocabulary and vocabulary-free object detectors handle novel classes;

    (2) Benchmarking Layout2Image diffusion models.

  2. Ericsson

    Jan. 2022 — Jun. 2022 Lund, Sweden

    Position: Master Thesis Intern at the Department of Device Software Research

    Topic: The V-SLAM Hurdler: A Faster V-SLAM System using Online Semantic Dynamic-and-Hardness-aware Approximation

    (1) Approximate spatial computing for Simultaneous Localization and Mapping (SLAM) algorithm on AR/MR devices

    (2) Investigate quantization methods for CNN-based Object Detection (YOLOv4) and Instance Segmentation algorithms (Mask R-CNN)

    (3) Investigate the cloud-device cooperation mechanism of distributed Semantic SLAM based on the approximate computing techniques

  3. Siemens Co., Ltd.

    Jul. 2017 – Jul. 2020 Beijing, China

    Position: Innovation Engineer at Department of Innovation, Smart Infrastructure Division

    Responsibilities:

    (1) Conducted innovative application research, market analysis, and competitive product analysis in the building automation industry using Internet-of-Things (IoT) technology

    (2) Designed software and hardware solutions for IoT-based applications and demonstrated them through zero-to-one product definition and development

    (3) Defined features and functions for innovative products based on new use cases

    (4) Cooperated with internal R&D and production departments and third-party partners (OEM manufacturers, value-added partners, system integrators) to develop and deploy innovative products

    (5) Managed and deployed pilot projects for the innovative IoT-based products

Patent

  1. A Method for Using Semantic Hierarchy Trees to Increase the Robustness of OvOD Models

    Mar. 2024 US Patent App. Status: Filed; under processing

    Mingxuan Liu, Tyler L. Hayes, Gabriela Csurka, Elisa Ricci, Riccardo Volpi

Publication

  1. UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos

    Technical report, arXiv:2510.15018, 2025.

    Mingxuan Liu, Honglin He, Elisa Ricci, Wayne Wu, Bolei Zhou

  2. Knowledge to Sight: Reasoning over Visual Attributes via Knowledge Decomposition for Abnormality Grounding

    WACV, Early acceptance paper, Top 6.4%, 2026.

    Jun Li, Che Liu, Wenjia Bai, Mingxuan Liu, Rossella Arcucci, Cosmin I. Bercea, Julia A. Schnabel, Elisa Ricci

  3. Superpowering Open-Vocabulary Object Detectors for X-ray Vision using Large Language Models

    ICCV, 2025.

    Pablo Garcia-Fernandez, Lorenzo Vaquero, Mingxuan Liu, Feng Xue, Daniel Cores, Nicu Sebe, Manuel Mucientes, Elisa Ricci

  4. Organizing Unstructured Image Collections using Natural Language

    Technical report, arXiv:2410.05217, 2025.

    Mingxuan Liu, Zhun Zhong, Jun Li, Gianni Franchi, Subhankar Roy, Elisa Ricci

  5. Test-time Vocabulary Adaptation for Language-driven Object Detection

    ICIP, 2025.

    Mingxuan Liu, Tyler L. Hayes, Massimiliano Mancini, Gabriela Csurka, Elisa Ricci, Riccardo Volpi

  6. SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection

    CVPR, Highlight paper, 2.8% acceptance rate, 2024.

    Mingxuan Liu, Tyler L. Hayes, Gabriela Csurka, Elisa Ricci, Riccardo Volpi

  7. Democratizing Fine-grained Visual Recognition with Large Language Models

    ICLR, 2024.

    Mingxuan Liu, Subhankar Roy, Wenjing Li, Zhun Zhong, Nicu Sebe, Elisa Ricci

  8. Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery

    ICPR, Oral paper, 2024.

    Mingxuan Liu, Subhankar Roy, Zhun Zhong, Nicu Sebe, Elisa Ricci

  9. Class-incremental Novel Class Discovery

    ECCV, 2022.

    Subhankar Roy*, Mingxuan Liu*, Zhun Zhong, Nicu Sebe, Elisa Ricci

    * denotes co-first authorship

  10. The V-SLAM Hurdler: A Faster V-SLAM System using Online Semantic Dynamic-and-Hardness-aware Approximation

    DiVA, Master Thesis, 2022.

    Mingxuan Liu

  11. Siemens RWG Control Platform Advanced Course and Practice

    China Electric Power Press (CEPP), 2022. ISBN: 9787519859947

    Jiaxin Han, Huixia Zhao, Kaixuan Zhang, Mingxuan Liu (Deputy Editor-in-Chief & Author)

  12. Siemens Desigo CC Building Management System Software and Practice

    China Electric Power Press (CEPP), 2022. ISBN: 9787519853341

    Huixia Zhao, Jiaxin Han, Kaixuan Zhang, Chao Wang, Lin Feng, Jianqiao Feng, Jian Li, Mingxuan Liu (Author)

Grant

  1. OpenAI Researcher Access Program

    Feb. 2024 Awarded $5,000 API credits

    Role: Principal Investigator

  2. Italian SuperComputing Resource Allocation – ISCRA

    Jan. 2022 Awarded $30,000 (8,000 Nvidia DGX GPU Hours). Code: HP10C58YK9

    Role: Principal Investigator

Technical Skill

  • Computer Vision
    90%
  • Deep Learning
    80%
  • Robotics
    70%
  • Natural Language Processing
    50%
  • Simulation (Kinematic, Dynamic, Sensor, Mobile Agents)
    60%
  • Automatic Control
    65%

Language Skill

  • Beijing Dialect
    99%
  • Chinese
    90%
  • English
    80%
  • Italian
    5%

Hobby

  • Bodybuilding
    80%
  • Reading
    85%
  • Playing Piano
    40%
  • Hiking
    90%
  • Cooking
    95%

Fun Cards

  • Where am I from?

    I grew up in Beijing. In fact, I have lived in Beijing for 26 years. Funny thing, I've never been to the Great Wall : P

  • Why Trento?

    Come for the mountain, stay for the people.

  • What are my sports?

    Bodybuilding and basketball! I love bodybuilding and the feeling of precise control over my muscles. Bodybuilding is one of the few things where you get what you work for through scientific discipline. I used to be a semi-professional bodybuilder during my bachelor's studies. In fact, I earned most of my snacking (aka dining-out and clubbing) money back then by coaching online and selling fitness plans.

  • What championships have I won?

    When I was in elementary school, I won first place in the Beijing Fangshan District Toy 4WD Competition. As a result, I represented Fangshan District in the national Toy 4WD Competition.

  • What is my favorite cartoon?

    Crayon Shin-chan, period.

  • Who is my favorite rapper?

    2Pac is my spirit, Eminem is my life.
