UAV Claw

Xiangyu Wang, Donglin Yang, Wenhao Zheng, Zimu Tang, Zhibo Zhang, Yantong Zhong, Xiangyi Zheng, Qinan Liao, Si Liu

An embodied aerial agent for real-world autonomy.

UAV Claw brings language-driven agents from the screen into the physical world through an autonomous aerial platform. It integrates OpenClaw for task reasoning, an MCP-based tool orchestration layer, and a vision-language-action control loop, enabling a drone to understand natural-language instructions, perceive its environment, and execute multi-step tasks in open 3D space.

The system is designed to move beyond manual piloting and fixed waypoint scripts. Given an open-ended command, UAV Claw decomposes the goal into actionable steps and coordinates perception, localization, flight control, and camera operation within a unified framework. This makes it suitable for real-world scenarios such as inspection, search, reconnaissance, and agile flight, and it points to a practical path for embodied aerial intelligence from concept to deployed system.
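As a rough illustration of the decompose-then-coordinate idea described above, the sketch below routes a hard-coded plan through a small tool registry. It is not UAV Claw's code and does not use the OpenClaw or MCP APIs: the names `Step`, `ToolRegistry`, `plan`, and `run`, the four tool names, and their return strings are all hypothetical, and the planner is a fixed placeholder where a language model would normally produce the steps.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class Step:
    """One actionable step: a tool name plus keyword arguments (hypothetical)."""
    tool: str
    args: Dict[str, object] = field(default_factory=dict)


class ToolRegistry:
    """Maps tool names to callables; a stand-in for a tool orchestration layer."""

    def __init__(self) -> None:
        self._tools: Dict[str, Callable[..., str]] = {}

    def register(self, name: str, fn: Callable[..., str]) -> None:
        self._tools[name] = fn

    def call(self, step: Step) -> str:
        if step.tool not in self._tools:
            raise KeyError(f"unknown tool: {step.tool}")
        return self._tools[step.tool](**step.args)


def plan(instruction: str) -> List[Step]:
    """Placeholder planner: returns a fixed plan regardless of the instruction.

    In a real system this is where a reasoning model would decompose the goal.
    """
    return [
        Step("localize"),
        Step("fly_to", {"x": 10.0, "y": 0.0, "z": 5.0}),
        Step("detect", {"target": "antenna"}),
        Step("capture_photo"),
    ]


def run(instruction: str, registry: ToolRegistry) -> List[str]:
    """Execute each planned step in order and collect the tool results."""
    return [registry.call(step) for step in plan(instruction)]


# Stub tools standing in for localization, flight control, perception, and camera.
registry = ToolRegistry()
registry.register("localize", lambda: "pose ok")
registry.register("fly_to", lambda x, y, z: f"at ({x}, {y}, {z})")
registry.register("detect", lambda target: f"{target} found")
registry.register("capture_photo", lambda: "photo saved")

log = run("Inspect the antenna on the roof", registry)
print(log)
```

Keeping every capability behind one registry interface is one way to let a planner mix perception, flight, and camera steps in a single plan; a closed-loop system would also feed each result back into replanning, which this sketch omits.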

Project demonstration