single-view ambiguity

  • Given a camera and an image, we only know the ray corresponding to each pixel.
    • We don’t know how far away the object the ray was reflected from
    • We don’t have enough constraints to solve for X (depth)

Stereo: given 2 calibrated cameras in different views and correspondences, can solve for X

closers things move more than farther things

depth estimation

  • pictorial cues like familiar objects
    • shading
    • perspective effects
  • everything is easier in projective space
    • the camera maps 3D world to 2D image - coordinate transformation