The world of robotics has a dizzying number of subjects; it’s quite overwhelming at first glance to figure out which topics someone “really needs to get” and which require only a cursory understanding. Accordingly, this will be the first in a number of posts (“number” being linearly proportional to my motivation) on some of the more fundamental topics within the realm of robotics. We’ll begin our travels with “coordinate frames.” What are coordinate frames, you ask? Well, slow down there, fella…let’s first take a step back to figure out the motivation for asking such a question in the first place.

A “robot” is typically defined (more or less) as an autonomous system which interacts with its environment. Interaction may include actual manipulation of the robot’s environment; this requires some sort of manipulator. In (Siciliano, 2009), a manipulator is described as “a kinematic chain of rigid bodies connected by means of revolute or prismatic joints. One end of the chain is constrained to a base, while an end-effector is mounted to the other end.” This academic explanation can be more easily understood by looking at an AX-12 Smart Robot Arm: the aluminum links are the “rigid bodies,” the AX-12 servos are the “revolute joints” (they rotate rather than slide; a sliding joint would be “prismatic”), and the gripper is the “end effector.” So what does this have to do with coordinate frames? Well, a challenge in having a robot manipulate its environment is being able to determine and describe the position and orientation (together describing the pose) of the end effector in relation to what needs to be manipulated (and yes, this is very challenging). More specifically, one needs to describe the pose of both the end effector and the target object in relation to a reference frame.

When considering an object’s pose within a reference frame, one first needs to know what a reference frame is to begin with. In short, a reference frame is “how the world is oriented”; i.e., which way is North, South, up, down, etc. To describe the reference frame, and – more importantly – to provide a bearing for where an object sits within that frame, the frame is described with three axes: x, y, and (you guessed it) z. But if y points to the right and z points up, which way does x point? To determine this, use a handy trick (pun intended) known as the “right-hand rule” (sorry, southpaws). To demonstrate, hold your hand out in front of your face as if you were about to karate chop a board, with your thumb sticking toward your face. If you point your index finger toward y and curl your other fingers toward z, then your thumb will point in the positive direction of x.
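The right-hand rule is really the vector cross product in disguise: in a right-handed frame, y × z must come out to x. Here’s a minimal sketch in plain Python (no robotics library assumed) confirming the karate-chop result:

```python
def cross(a, b):
    """Cross product of two 3-vectors -- the right-hand rule in code form."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

# Unit axes of the frame described above: y points "right," z points "up."
y = [0.0, 1.0, 0.0]
z = [0.0, 0.0, 1.0]

# y cross z gives the direction x must point for the frame to be right-handed.
x = cross(y, z)
print(x)  # [1.0, 0.0, 0.0] -- x points toward your face, just as your thumb did
```

The same function confirms the cyclic pattern of a right-handed frame: x × y gives z, and z × x gives y.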

The origin of the frame – the [0, 0, 0] point of the x, y, and z axes – is located at an arbitrary, but known, point within the environment or on an object. There can be a frame, each with its own origin, for each reference perspective in a given context; each is known as a coordinate frame of that context.

For example, suppose one is developing a robot to pick up toys and put them into a toy bin (can you tell I have kids?). In this case, there would likely be three coordinate frames of interest. The first would be a reference frame against which the poses of the toy and the manipulator can be described. For instance, the reference frame could have its origin in the corner of the room and be tied to the orientation of the room itself; a “reference frame” is simply a coordinate frame which does not change pose as other objects move through it. A second coordinate frame would be from the perspective of the end effector. By applying a separate coordinate frame to the end effector, it becomes tractable to determine not only where the end effector is in relation to the reference frame, but also how it is oriented in relation to the reference frame and how its pose needs to be modified to reach another pose (with fun techniques like matrix transformations). As the end effector moves, its coordinate frame moves with it, figuratively fixed to a point on the end effector. Finally, a third coordinate frame would be that of the toy being picked up; this frame, in relation to the reference frame, would facilitate determining how the end effector’s pose needs to change to be in proper alignment for picking up the toy, taking into account the toy’s pose as well (read: lots more matrix transformations).

When applying a coordinate frame to an object or environment, two decisions must be made:

  • What fixed point on the object or environment should the coordinate frame be applied to? If using RobotIQ’s way cool, underactuated Adaptive Gripper, the coordinate frame might have its origin at the base of the “solo” finger. If talking about an airport, the coordinate frame (in this case the “reference frame”) might have its origin at the base of the control tower.
  • How will the coordinate frame be oriented with respect to the environment or object it is applied to; i.e., which way should x, y, and z point out of the origin? By convention, z points “up” out of the origin. With the airport example, z would point toward the sky through the control tower. For movable objects, it’s not so straightforward to pick which way is “up.” So when applying a coordinate frame to a movable object (e.g., an end effector), what’s most important is to pick a point and an orientation of the frame, with respect to the object it’s applied to, and keep that pose relationship between the object and the frame fixed as the object changes pose. For example, if a coordinate frame were applied to a mobile robot base, and the robot turned 90 degrees clockwise, then its coordinate frame would turn 90 degrees clockwise with it while the reference frame remained static.
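The mobile-robot example in the second bullet can be sketched numerically. Below is a plain-Python illustration (the variable names are my own) showing the robot’s body-frame axes, expressed in the static reference frame, before and after a 90-degree clockwise turn; the reference frame’s axes never move, only the body frame’s do:

```python
import math

def rotate_z(v, theta):
    """Rotate a 3-vector about the reference frame's z-axis by theta radians."""
    c, s = math.cos(theta), math.sin(theta)
    return [c * v[0] - s * v[1],
            s * v[0] + c * v[1],
            v[2]]

# The robot's body-frame axes, expressed in the reference frame, before the
# turn (the body frame starts out aligned with the reference frame):
body_x = [1.0, 0.0, 0.0]
body_y = [0.0, 1.0, 0.0]

# A 90-degree clockwise turn, viewed from above, is -pi/2 about the z-axis.
body_x_after = rotate_z(body_x, -math.pi / 2)  # now points along reference -y
body_y_after = rotate_z(body_y, -math.pi / 2)  # now points along reference +x
```

After the turn, the robot’s “forward” (its body x-axis) points along the reference frame’s negative y-axis: the body frame turned with the robot, while the reference frame stayed put.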

There is certainly a ton more to coordinate frames – manipulating them, comparing them to each other, and transforming between them – than the light introduction provided here, but this should at least help you avoid a deer-in-headlights look if you’re unfamiliar with the term and someone brings it up in conversation at a cocktail party…which always happens.

Billy McCafferty


Siciliano, B. (2009). Robotics: Modelling, Planning and Control. Springer.