Vision-Guided Robotic Arm with Natural Language Control
Overview
The system lets you control a robotic arm using plain English. For example, type "pick up the blue cube," and the arm finds it, reaches for it, and picks it up. It can also place objects down when prompted.
How It Works
It starts with the Orbbec Gemini 2 RGB-D camera, which captures a color image and depth map of the scene. That data gets processed into a 3D point cloud, where objects on the table are detected and assigned object IDs.
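To make the depth-to-point-cloud step concrete, here is a minimal sketch of the standard pinhole back-projection. The `Intrinsics` class, the function names, and the intrinsic values in the usage note are all placeholders for illustration, not the camera's actual calibration or the project's real code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Intrinsics:
    """Pinhole camera intrinsics in pixels (hypothetical values, not real calibration)."""
    fx: float
    fy: float
    cx: float
    cy: float


def deproject(u, v, depth_m, k):
    """Back-project pixel (u, v) with depth in metres to a camera-frame 3D point."""
    x = (u - k.cx) * depth_m / k.fx
    y = (v - k.cy) * depth_m / k.fy
    return (x, y, depth_m)


def depth_to_points(depth_rows, k):
    """Convert a depth map (rows of metres, 0 = no reading) to a list of 3D points."""
    points = []
    for v, row in enumerate(depth_rows):
        for u, z in enumerate(row):
            if z > 0:  # skip pixels where the sensor returned no depth
                points.append(deproject(u, v, z, k))
    return points
```

For example, with `k = Intrinsics(fx=500, fy=500, cx=320, cy=240)`, a pixel at the image centre with a depth of 1 m maps to a point straight ahead of the camera on the optical axis. Detecting and labeling objects would happen on top of the resulting points.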
When you send a command, GPT-5.2 first figures out whether you want to pick something up, place something back, or whether it needs to ask for clarification. If you're placing something back, it reuses the position and grip width from the last successful pick to put the object back where it came from.
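The "place it back" behavior boils down to remembering the last successful pick. A rough sketch of that bookkeeping is below; `PickRecord`, `PickMemory`, and the dictionary keys are hypothetical names chosen for illustration, and the real system may store this differently.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class PickRecord:
    """What we remember about the last successful pick (hypothetical structure)."""
    position: Tuple[float, float, float]
    grip_width_mm: float


class PickMemory:
    def __init__(self):
        self._last: Optional[PickRecord] = None

    def record_pick(self, position, grip_width_mm):
        """Call this only after a pick has succeeded."""
        self._last = PickRecord(tuple(position), grip_width_mm)

    def plan_place(self):
        """Reuse the stored pick to put the object back where it came from."""
        if self._last is None:
            # Nothing to put back yet, so fall back to asking the user.
            return {"action": "clarify", "reason": "nothing has been picked yet"}
        return {
            "action": "place",
            "position": self._last.position,
            "grip_width_mm": self._last.grip_width_mm,
        }
```

One consequence of this design is that a place command issued before any pick has nothing to reuse, so the sketch routes that case to clarification instead of guessing a location.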
If you're picking something up, a camera image and a scene summary get sent to GPT-5.2 along with your command. Instead of relying on hardcoded logic like "blue cube = obj_2," the model figures out which object you're referring to and returns a structured response with the target ID and grasp mode.
The GPT model only picks a target object; it never drives the servos directly.
After the target is locked in, the system uses its 3D position to plan a grasp approach and solve the arm's inverse kinematics, which means figuring out what angles the joints need to be at to reach the object. Those joint angle commands are then sent to the Arduino, which generates the PWM signals that move the servos.
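Because the model's only job is to name a target, its reply can be checked before anything moves. The sketch below validates a JSON reply against the object IDs currently in the scene. The field names `target_id` and `grasp_mode` and the allowed grasp modes are assumptions for illustration; the actual response schema isn't shown here.

```python
import json

# Hypothetical set of grasp modes; the real system's options may differ.
ALLOWED_GRASP_MODES = {"top", "side"}


def parse_target(raw_reply, scene_ids):
    """Validate the model's structured reply and return (target_id, grasp_mode).

    Raises ValueError if the reply is malformed or names an unknown object.
    """
    try:
        data = json.loads(raw_reply)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply must be a JSON object")

    target_id = data.get("target_id")
    grasp_mode = data.get("grasp_mode")
    if not isinstance(target_id, str) or target_id not in scene_ids:
        raise ValueError(f"unknown target_id: {target_id!r}")
    if not isinstance(grasp_mode, str) or grasp_mode not in ALLOWED_GRASP_MODES:
        raise ValueError(f"unsupported grasp_mode: {grasp_mode!r}")
    return target_id, grasp_mode
```

With scene IDs `{"obj_1", "obj_2"}`, a reply of `{"target_id": "obj_2", "grasp_mode": "top"}` passes, while a reply naming `obj_9` is rejected before any motion planning happens.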
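For a base/shoulder/elbow/wrist arm, the inverse kinematics can be solved in closed form. Here is one common way to do it, as a sketch: the link lengths are made-up values, the shoulder joint is assumed to sit at the origin with z pointing up, and the elbow-up solution is chosen arbitrarily. The solver actually used on this arm may work differently.

```python
import math

# Hypothetical link lengths in metres: upper arm, forearm, wrist-to-gripper-tip.
L1, L2, L3 = 0.12, 0.12, 0.08


def solve_ik(x, y, z, pitch=-math.pi / 2):
    """Return (base, shoulder, elbow, wrist) in radians, or None if unreachable.

    pitch is the desired gripper angle from horizontal (-pi/2 = pointing down).
    """
    base = math.atan2(y, x)            # rotate the base to face the target
    r = math.hypot(x, y)               # horizontal distance in the arm's plane
    # Step back from the gripper tip to the wrist joint.
    rw = r - L3 * math.cos(pitch)
    zw = z - L3 * math.sin(pitch)
    # Law of cosines for the two-link shoulder/elbow chain.
    c2 = (rw * rw + zw * zw - L1 * L1 - L2 * L2) / (2 * L1 * L2)
    if not -1.0 <= c2 <= 1.0:
        return None                    # target is outside the reachable workspace
    elbow = -math.acos(c2)             # negative angle = elbow-up solution
    shoulder = math.atan2(zw, rw) - math.atan2(
        L2 * math.sin(elbow), L1 + L2 * math.cos(elbow)
    )
    wrist = pitch - shoulder - elbow   # wrist makes up the rest of the pitch
    return base, shoulder, elbow, wrist


def forward(base, shoulder, elbow, wrist):
    """Forward kinematics, used here only to sanity-check the IK solution."""
    a1 = shoulder
    a2 = a1 + elbow
    a3 = a2 + wrist
    r = L1 * math.cos(a1) + L2 * math.cos(a2) + L3 * math.cos(a3)
    z = L1 * math.sin(a1) + L2 * math.sin(a2) + L3 * math.sin(a3)
    return r * math.cos(base), r * math.sin(base), z
```

A quick check is to feed the angles from `solve_ik` back into `forward` and confirm the gripper tip lands on the requested point. Targets that are too far away return `None` rather than producing nonsense angles.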
The Arm
The arm has 4 degrees of freedom (base, shoulder, elbow, and wrist), driven by four joint servos connected to a PCA9685 PWM driver. The gripper runs on a fifth smart servo connected to a Serial Bus Servo Driver Board. An Arduino Mega handles the low-level servo control, receiving motion commands from my Mac.
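To give a feel for what a joint angle turns into at the PCA9685, here is the arithmetic as a short sketch. The PCA9685 divides each PWM period into 4096 steps; the 50 Hz frequency and the 500 to 2500 µs pulse range over 0 to 180 degrees are typical hobby-servo assumptions, not measured values for these servos. It's written in Python just to show the math; on the real arm this conversion would run wherever the servo commands are generated.

```python
PWM_FREQ_HZ = 50            # assumed servo update rate
TICKS_PER_PERIOD = 4096     # PCA9685 has 12-bit resolution
MIN_PULSE_US = 500          # assumed pulse width at 0 degrees
MAX_PULSE_US = 2500         # assumed pulse width at 180 degrees
MAX_ANGLE_DEG = 180.0


def angle_to_ticks(angle_deg):
    """Map a joint angle in degrees to a PCA9685 'off' tick count."""
    # Clamp so an out-of-range command can't drive the servo past its limits.
    angle = min(max(angle_deg, 0.0), MAX_ANGLE_DEG)
    pulse_us = MIN_PULSE_US + (MAX_PULSE_US - MIN_PULSE_US) * angle / MAX_ANGLE_DEG
    period_us = 1_000_000 / PWM_FREQ_HZ
    return round(pulse_us * TICKS_PER_PERIOD / period_us)
```

Under these assumptions, 90 degrees corresponds to a 1500 µs pulse, which is 307 ticks out of 4096. Each servo usually needs its own pulse range calibrated, so the constants above are a starting point at best.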
Media
*Note: Sorry about the background buzzing noise in the videos; it’s the sound of the joint servos trying to stay in position.*
The gripper is a modified version of the one used in this SO-ARM100 design. I just had to make a couple of minor design changes and adjust the sizing before 3D printing.
I used servo mounts and U-brackets to connect the joint servos and build out the arm. The arm’s base is just a whiteboard that I drilled into and bolted the base servo onto.