Computer Vision and ROS

Basheer Subei edited this page Apr 20, 2015 · 12 revisions

ROS/Images/OpenCV

Useful links

  • image_pipeline is the stack for dealing with ROS Image messages. The packages below are part of image_pipeline:

    1. image_view
      • Package used to view ROS Image messages.
    2. image_proc
      • Package used to process images (often taking input from a camera driver or rosbag) and undistort them. See stereo_image_proc for the stereo equivalent.
    3. camera_calibration
      • Package used to calibrate the camera (to get intrinsic parameters).
      • After calibrating the camera, the camera node (i.e. usb_cam) will take care of publishing the calibration information for that camera on the /camera_info topic along with the raw image topic.
    • Intrinsic camera parameters are those that are independent of the camera's position and orientation: essentially the focal length, principal point, and distortion coefficients.
    • Extrinsic camera parameters are those that describe the "coordinate system transformations from 3D world coordinates to 3D camera coordinates." In other words, where the camera sits in the world and how it is oriented.
  • image_transport is the hidden layer underneath nodes that use ROS Image messages. Essentially, it takes care of publishing all the extra topics for the different compression formats (e.g. a compressed topic alongside the raw one), transparently for the user.

    • ROS Image messages can be in RAW format (not encoded or compressed), or they can be compressed (mostly in JPEG format), in which case they are CompressedImage ROS message types. image_transport takes care of all this compression magic transparently for nodes that use ROS Image messages (such as image_view or image_proc).
  • Notes on Images in ROS

  • Besides the image_resizer we have, you can also compress/decompress images in rosbags by playing them back and recording them at the same time using a different transport (compressed or raw). Also, check out this image_compressor node (this one uses PNG compression).
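As a sketch of the play-and-record trick above (all topic and bag names here are hypothetical; substitute your own):

```shell
# Play the original bag in one terminal:
rosbag play input.bag

# In another terminal, republish the raw topic as compressed
# (image_transport's republish node does the conversion):
rosrun image_transport republish raw in:=/camera/image_raw compressed out:=/camera/image_raw

# Record only the compressed topic into a new, smaller bag:
rosbag record -O compressed.bag /camera/image_raw/compressed
```

The same idea in reverse (republish `compressed` back to `raw`) decompresses a bag.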

Going from Image coordinates (pixels) to World coordinates (x,y,z relative to camera)

There are two approaches I can think of:

  1. This answer here. The main idea is that, given an intrinsic camera matrix (obtained from calibration), we can multiply the inverse intrinsic matrix by the homogeneous pixel coordinates to get a normalized (x, y, z) direction (think of this as the 3D ray coming out of the camera center). We then intersect that ray with the ground plane (at a fixed, known height) and solve for the actual x and y coordinates.

  2. Using the image_geometry library function projectPixelTo3dRay(uv), which gives us that 3D ray directly. We then intersect it with our ground plane.
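A minimal sketch of the first approach, assuming a made-up intrinsic matrix and a camera mounted 1 m above the ground with its optical y axis pointing down (the usual optical-frame convention):

```python
import numpy as np

# Hypothetical intrinsic matrix (fx, fy, cx, cy would come from camera_calibration).
K = np.array([[500.0,   0.0, 320.0],
              [  0.0, 500.0, 240.0],
              [  0.0,   0.0,   1.0]])

def pixel_to_ground(u, v, camera_height):
    """Back-project pixel (u, v) into a 3D ray and intersect it with a
    ground plane a fixed, known distance below the camera."""
    # Multiply the inverse intrinsic matrix by the homogeneous pixel coords
    # to get the ray direction out of the camera center:
    ray = np.linalg.inv(K) @ np.array([u, v, 1.0])
    # Scale the ray so its y component (pointing down) reaches the ground:
    t = camera_height / ray[1]
    return t * ray  # (x, y, z) in camera coordinates, in metres

point = pixel_to_ground(400.0, 300.0, camera_height=1.0)
```

The second approach is the same math, except projectPixelTo3dRay(uv) computes the ray for us from the calibrated camera model.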

Cameras and Lenses

General Overview

ignore this section for now

First Approach

  • Stereo camera driver node publishes raw image topics.
  • stereo_image_proc subscribes to these images and publishes a pointcloud of everything.
  • Some node listens to pointcloud and separates ground points and above-ground points.
  • The ground points are thrown into line_detection, which has to figure out which pixels these points came from (reconstructing an image of the ground) and then find the lines in that ground image. Once the lines are found in the image, we pick the points that correspond to these line pixels and publish those points into the costmap.
  • The above-ground points are thrown into the costmap as barrels and obstacles.

(diagram of first method)
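The ground/above-ground split above can be sketched as a simple height threshold on the pointcloud (the array, camera height, and tolerance below are made up for illustration; a real node would subscribe to the PointCloud2 topic):

```python
import numpy as np

# Hypothetical (N, 3) pointcloud of (x, y, z) in the camera frame,
# with y pointing down (so ground points sit near y = camera height).
points = np.array([[1.0, 0.00, 2.0],    # above-ground point
                   [0.5, 0.95, 3.0],    # near the ground plane
                   [2.0, 1.02, 4.0]])   # near the ground plane
GROUND_Y = 1.0    # assumed camera height above the ground, metres
TOLERANCE = 0.1   # assumed band around the ground plane

ground_mask = np.abs(points[:, 1] - GROUND_Y) < TOLERANCE
ground_points = points[ground_mask]      # would go to line_detection
obstacle_points = points[~ground_mask]   # would go to the costmap
```

A real separator would also account for the camera's tilt (e.g. via a tf transform) rather than assuming a perfectly level ground plane.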

Second Approach

  • Stereo camera driver node publishes raw image topics.
  • line_detection subscribes to the raw image and publishes an image containing only the lines. stereo_image_proc1 then subscribes to the line images and publishes the line pointcloud to the costmap (line layer).
  • In parallel, stereo_image_proc2 also subscribes to the raw images and publishes a pointcloud of everything. A ground_chopper node subscribes to this pointcloud and chops off the ground (publishing the above-ground pointcloud to the costmap obstacle layer).

(diagram of second method)
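A toy sketch of what line_detection's output looks like in the second approach: an image where everything except the line pixels is blacked out, so the downstream stereo node only triangulates line points. (A real node would use something like OpenCV's Hough transform; here we just threshold a made-up image.)

```python
import numpy as np

# Made-up grayscale image: one bright horizontal "line" on a dark ground.
image = np.zeros((4, 4), dtype=np.uint8)
image[1, :] = 255

# line_detection sketch: keep only bright pixels, zero out the rest,
# and republish this as the "image with only lines".
line_image = np.where(image > 200, image, 0).astype(np.uint8)
```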