About me
Ciao! I am a final-year PhD student in Deep Learning and Computer Vision at the University of Trento, under the joint supervision of Prof. Elisa RICCI and Prof. Zhun ZHONG.
I recently completed a research visit at the University of California, Los Angeles (UCLA), where I worked on automatically creating urban simulation scenes from city-tour videos under the guidance of Prof. Bolei ZHOU.
From June 2023 to December 2024, I also spent 18 wonderful months as a Visiting Researcher at NAVER LABS Europe, exploring open-vocabulary object detection under the supervision of Gabriela CSURKA, Riccardo VOLPI, and Tyler L. HAYES, in the team led by Diane Larlus.
Before starting my PhD, I earned two Master’s degrees, both Summa Cum Laude: one from KTH Royal Institute of Technology (Sweden) in Intelligent Autonomous Systems and one from the University of Trento (Italy) in Mechatronics Engineering. I also spent three years as an Innovation Engineer in the SIEMENS Smart Infrastructure Division, designing IoT-based automation solutions.
🚨On the Job Market:🚨 I’ll be graduating in Spring 2026 — if you’re working on the future of practical open-world machines and multimodal foundation models, let’s chat!
The Research I Like & Do: Training open-world machines to see, understand, and reason about our chaotic visual, semantic, and physical world — Going Better and Wilder!
- Knowledge Discovery: Uncovering structured and interpretable semantic and geometric knowledge from real-world visual data.
- Open-vocabulary Recognition: Empowering users to freely define their vocabulary to localize, segment, and recognize objects of interest within image streams in a zero-shot workflow.
- Vision and Language: Leveraging language as a medium to enhance vision tasks, delivering improved interpretability and interactivity in more real-world scenarios.
- Urban Embodied AI Simulation: Parsing our chaotic 3D semantic world from city-tour videos and turning it into high-fidelity urban simulation worlds for embodied AI learning.
Things I'd like to share
Selected Publications
UrbanVerse: Scaling Urban Simulation by Watching City-Tour Videos
Technical report, arXiv:2510.15018, 2025
Organizing Unstructured Image Collections using Natural Language
Technical report, arXiv:2410.05217, 2025
SHiNe: Semantic Hierarchy Nexus for Open-vocabulary Object Detection
The IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR, Highlight paper, 2.8% acceptance rate), 2024
Democratizing Fine-grained Visual Recognition with Large Language Models
International Conference on Learning Representations (ICLR), 2024
Large-scale Pre-trained Models are Surprisingly Strong in Incremental Novel Class Discovery
International Conference on Pattern Recognition (ICPR, Oral paper), 2024
Class-incremental Novel Class Discovery
European Conference on Computer Vision (ECCV), 2022
Siemens RWG Control Platform Advanced Course and Practice
China Electric Power Press (CEPP), 2022
ISBN: 9787519859947
Tutorial book on an IoT-based building automation control platform, covering Programmable Logic Controllers (PLCs), the Internet of Things (IoT), and cloud-based Software-as-a-Service (SaaS). Work carried out at SIEMENS.
Siemens Desigo CC Building Management System Software and Practice
China Electric Power Press (CEPP), 2022
ISBN: 9787519853341
Tutorial book about building automation management software (Siemens Desigo CC). Work carried out at SIEMENS.
The V-SLAM Hurdler: A Faster V-SLAM System using Online Semantic Dynamic-and-Hardness-aware Approximation
Digitala Vetenskapliga Arkivet (DiVA, Master's Thesis), 2022
Work done at Ericsson Lund, Sweden.
Projects
ORB-SLAM3 Deployment on the Autonomous Underwater Vehicle (AUV) SAM in Simulation and the Real World
Work carried out at KTH Royal Institute of Technology, supervised by Prof. John Folkesson, 2021
Hot Steel Plate Tracking and Rotation Angle Detection
Work carried out at the Technical University of Munich summer school for SMS Group, 2020
Mini Cheetah Robotic Leg Design, and Kinematic and Dynamic Simulation
Work carried out at the University of Trento, supervised by Prof. Francesco Biral, 2020
Virtual RGB-D Camera Unity Implementation for Point Cloud Generation
Work carried out at the University of Trento, 2020
Community Service
- ICLR: Reviewer (2025, 2026)
- NeurIPS: Reviewer (2024)
- ICML: Reviewer (2025)
- CVPR: Reviewer (2024, 2025)
- ICCV: Reviewer (2025)
- ECCV: Reviewer (2024)
- IJCV: Reviewer (2024)