20110220 computer vision_eruhimov_lecture01

Computer vision for robotics
Victor Eruhimov
CTO, itseez
http://www.itseez.com

Why do we need computer vision?
Smart video surveillance
Biometrics
Automatic Driver Assistance Systems
Machine vision (visual inspection)
Image retrieval (e.g. Google Goggles)
Movie production
Robotics

Vision is hard! Even for humans…

Agenda
Camera model
Stereo vision
Stereo vision on GPU
Object detection methods
  Sliding window
  Local descriptors
Applications
  Textured object detection
  Outlet detection
  Visual odometry

Perspective-n-Points problem
P4P
RANSAC (RANdom SAmple Consensus)

Stereo: epipolar geometry
Fundamental matrix constraint
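The slide names the constraint without writing it out. In the usual notation (the symbols here are an assumed convention, not taken from the slide), x and x' are the homogeneous pixel coordinates of the same world point in the left and right images, F is the 3x3 fundamental matrix, and l' is the epipolar line in the right image on which x' must lie:

```latex
\mathbf{x}'^{\top} F \,\mathbf{x} = 0, \qquad \mathbf{l}' = F\,\mathbf{x}
```

This is what reduces the correspondence search from the whole image to a single line.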

Stereo Rectification
Algorithm steps are shown at right:
Goal: Each row of the image contains the same world points (“Epipolar constraint”)
Result: Epipolar alignment of features
All: Gary Bradski and Adrian Kaehler: Learning OpenCV

Stereo correspondence
Block matching
Dynamic programming
Inter-scanline dependencies
Segmentation
Belief propagation

Stereo correspondence block matching
For each block in the left image:
Search for the corresponding block in the right image such that the SSD or SAD between pixel intensities is minimal
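A minimal sketch of the search described above, in plain Python on a rectified pair stored as lists of rows. The function names, the block half-size and the synthetic test images are illustrative assumptions; a practical implementation would be vectorized and would handle borders more carefully.

```python
def sad(left, right, row, col_l, col_r, half):
    """Sum of absolute differences between two (2*half+1) x (2*half+1) blocks."""
    total = 0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            total += abs(left[row + dy][col_l + dx] - right[row + dy][col_r + dx])
    return total


def block_match(left, right, max_disp, half=1):
    """Return a disparity map as a list of rows; border pixels stay at 0."""
    height, width = len(left), len(left[0])
    disp = [[0] * width for _ in range(height)]
    for row in range(half, height - half):
        for col in range(half, width - half):
            best_cost, best_d = None, 0
            # A point at column col in the left image is searched at col - d in the right one.
            for d in range(max_disp + 1):
                if col - d - half < 0:
                    break  # the candidate block would leave the right image
                cost = sad(left, right, row, col, col - d, half)
                if best_cost is None or cost < best_cost:
                    best_cost, best_d = cost, d
            disp[row][col] = best_d
    return disp


# Synthetic rectified pair: the left image is the right image shifted by 2 pixels.
right_img = [[(7 * x + 13 * y + x * y) % 31 for x in range(12)] for y in range(6)]
left_img = [[row[x - 2] if x >= 2 else 0 for x in range(12)] for row in right_img]
disparity = block_match(left_img, right_img, max_disp=4)
```

Replacing `abs(...)` with a squared difference turns the SAD cost into SSD without changing the search.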

Pre- and post-processing
Low texture filtering
SSD/SAD minimum ambiguity removal
Using gradients instead of intensities
Speckle filtering
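Two of the steps above can be sketched in a few lines each. The scoring function, the 0.8 ratio and the rule of ignoring the winner's direct neighbours are assumptions chosen for illustration, not values from the lecture.

```python
def texture_score(block):
    """Sum of absolute horizontal intensity differences inside a block.

    A block whose score falls below some threshold carries too little texture
    for a reliable match and can be filtered out before matching.
    """
    return sum(abs(row[i + 1] - row[i]) for row in block for i in range(len(row) - 1))


def unique_minimum(costs, ratio=0.8):
    """Return the best disparity, or None when the cost minimum is ambiguous.

    costs[d] is the SSD/SAD cost at disparity d. The winner is kept only if its
    cost is below ratio * (lowest cost among disparities that are not direct
    neighbours of the winner).
    """
    best_d = min(range(len(costs)), key=costs.__getitem__)
    rivals = [c for d, c in enumerate(costs) if abs(d - best_d) > 1]
    if rivals and costs[best_d] >= ratio * min(rivals):
        return None
    return best_d
```

For example, `unique_minimum([50, 10, 40, 60, 55])` keeps disparity 1, while a second minimum of almost the same depth elsewhere in the list makes the function return `None`.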

Parallel implementation of block matching
The outer cycle iterates through disparity values
We compute SSD and compare it with the current minimum for each pixel in a tile
Different tiles reuse the results of each other
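The loop structure above can be illustrated on the CPU. This is a sequential Python sketch, not the GPU kernel from the lecture: the disparity loop is outermost, each pixel keeps a running minimum, and window sums share partial results through a summed-area table, which is a technique swapped in here to stand for the reuse between tiles. All names are hypothetical.

```python
def integral(img):
    """Summed-area table with one extra row and column of zeros."""
    h, w = len(img), len(img[0])
    sat = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            sat[y + 1][x + 1] = sat[y][x + 1] + row_sum
    return sat


def disparity_outer_loop(left, right, max_disp, half=1):
    h, w = len(left), len(left[0])
    best_cost = [[float("inf")] * w for _ in range(h)]
    best_disp = [[0] * w for _ in range(h)]
    for d in range(max_disp + 1):  # outer loop over disparity values
        # Per-pixel squared difference at this disparity (columns < d have no partner).
        sq = [[(left[y][x] - right[y][x - d]) ** 2 if x >= d else 0
               for x in range(w)] for y in range(h)]
        sat = integral(sq)
        for y in range(half, h - half):
            for x in range(max(half, d + half), w - half):
                y0, y1, x0, x1 = y - half, y + half + 1, x - half, x + half + 1
                ssd = sat[y1][x1] - sat[y0][x1] - sat[y1][x0] + sat[y0][x0]
                if ssd < best_cost[y][x]:  # compare with the current minimum
                    best_cost[y][x] = ssd
                    best_disp[y][x] = d
    return best_disp


# Synthetic rectified pair: the left image is the right image shifted by 2 pixels.
right_img = [[(7 * x + 13 * y + x * y) % 31 for x in range(12)] for y in range(6)]
left_img = [[row[x - 2] if x >= 2 else 0 for x in range(12)] for row in right_img]
best = disparity_outer_loop(left_img, right_img, max_disp=4)
```

With the disparity loop outside, every pixel does the same work at each step, which is the property that makes the scheme map well onto parallel hardware.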

Optimization concepts
Not using texture – saving registers
1 thread per 8 pixels processing – using cache
Reducing the amount of arithmetic operations
Non-parallelizable functions (speckle filtering) are done on CPU

Performance summary
CPU (i5 750 2.66GHz), GPU (Fermi card 448 cores)
Block matching on CPU+2xGPU is 10 times faster than CPU implementation with SSE optimization, enabling real-time processing of HD images!

Full-HD stereo in realtime
http://www.youtube.com/watch?v=ThE7sRAtaWU

Applications of stereo vision
Machine vision
Automatic Driver Assistance
Movie production
Robotics
  Object recognition
  Visual odometry / SLAM

Cascade classifier
[Diagram: image -> Stage 1 -> Stage 2 -> Stage 3 -> face; each stage can reject the window as "Not face"]
Real-time in year 2000!
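The early-rejection idea behind the diagram can be sketched as follows. The three stage functions and their thresholds are hypothetical toy tests on a flat list of pixel intensities; a trained cascade would use learned features at each stage.

```python
def cascade_classify(window, stages):
    """Return True only if every stage accepts the window.

    Evaluation stops at the first rejecting stage, so most non-face windows
    cost only one or two cheap tests.
    """
    for score, threshold in stages:
        if score(window) < threshold:
            return False  # "not face": later stages are never evaluated
    return True


# Hypothetical stages, ordered from cheapest to most selective.
stages = [
    (lambda w: max(w) - min(w), 30),        # stage 1: enough contrast
    (lambda w: sum(w) / len(w), 60),        # stage 2: bright enough on average
    (lambda w: w[len(w) // 2] - w[0], 10),  # stage 3: centre brighter than first pixel
]

accepted = cascade_classify([50, 90, 120, 80], stages)
rejected = cascade_classify([100, 100, 100, 100], stages)
```

The flat window in the second call fails the contrast test, so stages 2 and 3 are never run for it.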

Object detection with local descriptors
Detect keypoints
Calculate local descriptors for each point
Match descriptors for different images
Validate matches with a geometry model

SIFT descriptor
David Lowe, 2004

SURF descriptor
4x4 square regions inside a square window 20*s
4 values per square region
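The layout above gives a 4 * 4 * 4 = 64-element vector. The sketch below assumes the window has already been sampled on a 20x20 grid of horizontal and vertical responses (`dx`, `dy`), and that the four values per region are the sums of dx, dy, |dx| and |dy|. Gaussian weighting and orientation alignment are left out, and the function name is hypothetical.

```python
import math


def surf_like_descriptor(dx, dy):
    """Build a 64-value descriptor from 20x20 grids of dx and dy responses."""
    desc = []
    for by in range(4):          # 4 x 4 square regions
        for bx in range(4):
            sx = sy = ax = ay = 0.0
            for y in range(5 * by, 5 * by + 5):      # 5 x 5 samples per region
                for x in range(5 * bx, 5 * bx + 5):
                    sx += dx[y][x]
                    sy += dy[y][x]
                    ax += abs(dx[y][x])
                    ay += abs(dy[y][x])
            desc.extend([sx, sy, ax, ay])            # 4 values per region
    norm = math.sqrt(sum(v * v for v in desc))
    return [v / norm for v in desc] if norm > 0 else desc


# Toy responses: constant dx, alternating-sign dy.
dx = [[1.0] * 20 for _ in range(20)]
dy = [[-1.0 if x % 2 else 1.0 for x in range(20)] for _ in range(20)]
descriptor = surf_like_descriptor(dx, dy)
```

Keeping both the signed and the absolute sums lets the descriptor tell a uniform gradient apart from an alternating texture with the same energy.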

More descriptors
One way descriptor
C-descriptor, FERNS, BRIEF
HoG
Daisy

Ways to improve matching
Increase the inliers to outliers ratio
  Distance threshold
  Distance ratio threshold (second to first NN distance)
  Backward-forward matching
  Windowed matching
Increase the amount of inliers
  One to many matching
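Three of the filters listed above (distance threshold, distance ratio, backward-forward matching) can be combined in one brute-force matcher. This is a sketch with hypothetical names and default values; descriptors are plain tuples of floats and both sets need at least two entries.

```python
import math


def nearest_two(query, candidates):
    """Return (distance, index) for the nearest and second-nearest candidates."""
    ranked = sorted((math.dist(query, c), i) for i, c in enumerate(candidates))
    return ranked[0], ranked[1]


def match(desc_a, desc_b, max_dist=None, max_ratio=0.8, cross_check=True):
    """Return a list of (index_in_a, index_in_b) pairs that pass all filters."""
    matches = []
    for i, query in enumerate(desc_a):
        (d1, j), (d2, _) = nearest_two(query, desc_b)
        if max_dist is not None and d1 > max_dist:
            continue  # distance threshold
        if d1 > max_ratio * d2:
            continue  # ratio test: the best match must clearly beat the runner-up
        if cross_check:
            (_, i_back), _ = nearest_two(desc_b[j], desc_a)
            if i_back != i:
                continue  # backward-forward check failed
        matches.append((i, j))
    return matches


desc_a = [(0.0, 0.0), (10.0, 10.0), (5.0, 5.1)]
desc_b = [(0.1, 0.0), (10.0, 10.2), (5.0, 5.0), (5.0, 5.2)]
good = match(desc_a, desc_b)
```

The third descriptor in `desc_a` has two almost equally close candidates, so the ratio test drops it.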

Random Sample Consensus
Do n iterations until #inliers > inlierThreshold
  Draw k matches randomly
  Find the transformation
  Calculate inliers count
  Remember the best solution
The number of iterations required ~
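The loop above can be sketched for the simplest possible model, a pure 2D translation, where a single match (k = 1) defines the transformation. The formula for the iteration count did not survive in this transcript; `ransac_iterations` uses the commonly quoted estimate log(1 - p) / log(1 - w^k) for inlier ratio w and confidence p, which may or may not be the expression shown on the slide. All names, tolerances and data are illustrative.

```python
import math
import random


def ransac_iterations(inlier_ratio, k, confidence=0.99):
    """Iterations needed to draw at least one all-inlier sample (0 < inlier_ratio < 1)."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - inlier_ratio ** k))


def ransac_translation(matches, tol=1.0, max_iters=100, inlier_threshold=None, seed=0):
    """matches: list of ((x, y), (x2, y2)). Returns (best (tx, ty), its inliers)."""
    rng = random.Random(seed)
    best_model, best_inliers = None, []
    for _ in range(max_iters):
        (x, y), (x2, y2) = rng.choice(matches)  # draw k = 1 match randomly
        tx, ty = x2 - x, y2 - y                 # find the transformation
        inliers = [m for m in matches           # calculate inliers count
                   if math.hypot(m[0][0] + tx - m[1][0], m[0][1] + ty - m[1][1]) <= tol]
        if len(inliers) > len(best_inliers):    # remember the best solution
            best_model, best_inliers = (tx, ty), inliers
        if inlier_threshold is not None and len(best_inliers) > inlier_threshold:
            break
    return best_model, best_inliers


points = [(0, 0), (1, 5), (2, 3), (7, 1), (4, 4), (9, 9), (6, 2), (3, 8)]
inlier_matches = [((x, y), (x + 3, y - 2)) for x, y in points]
outlier_matches = [((0, 1), (10, 10)), ((5, 5), (0, 0)), ((8, 2), (1, 9))]
model, inliers = ransac_translation(inlier_matches + outlier_matches)
```

For richer models such as a homography, only the sampling size k and the two model-specific steps (fitting and residual computation) change.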

Scaling up
FLANN (Fast Library for Approximate Nearest Neighbors)
  In OpenCV thanks to Marius Muja
Bag of Words
  In OpenCV thanks to Ken Chatfield
Vocabulary trees
  Is going to be in OpenCV thanks to Patrick Mihelich

Projects
Textured object detection
PR2 robot automatic plugin
Visual odometry / SLAM

Object detection example
Iryna Gordon and David G. Lowe, "What and where: 3D object recognition with accurate pose," in Toward Category-Level Object Recognition, eds. J. Ponce, M. Hebert, C. Schmid, and A. Zisserman (Springer-Verlag, 2006), pp. 67-82.
Manuel Martinez Torres, Alvaro Collet Romea, and Siddhartha Srinivasa, "MOPED: A Scalable and Low Latency Object Recognition and Pose Estimation System," Proceedings of ICRA 2010, May 2010.

Keypoint detection
We are looking for small dark regions
This operation takes only ~10 ms on a 640x480 image
The rest of the algorithm works only with keypoint regions
Itseez Ltd. http://itseez.com

Classification with one way descriptor
Introduced by Hinterstoisser et al (Technical U of Munich, Ecole Polytechnique) at CVPR 2009
A test patch is compared to samples of affine-transformed training patches with Euclidean distance
The closest patch together with a pose guess are reconstructed
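The comparison step above amounts to a nearest-neighbour search over stored training patches, each tagged with a class and the pose that generated it. The sketch below shows only that step on toy 4-pixel patches. It is not the full method of Hinterstoisser et al., and the function name, labels and pose strings are hypothetical.

```python
import math


def classify_patch(test_patch, training_set):
    """training_set: list of (patch, label, pose) tuples, patches as flat lists.

    Returns (label, pose, distance) of the closest training patch under
    Euclidean distance.
    """
    patch, label, pose = min(training_set, key=lambda s: math.dist(test_patch, s[0]))
    return label, pose, math.dist(test_patch, patch)


# Each training patch would come from warping a reference patch with a known pose.
training = [
    ([0, 0, 0, 0], "ground hole", "pose 0"),
    ([10, 10, 0, 0], "power hole", "pose 1"),
    ([5, 5, 5, 5], "background", None),
]
label, pose, distance = classify_patch([9, 10, 1, 0], training)
```

Because the pose travels with the matched patch, classification and a first pose guess come out of the same lookup.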

Keypoint classification examples
One way descriptor does most of the outlet detection job for us. Few holes are misclassified
Ground hole
Power hole
Non-hole keypoint from outlet image
Background keypoint

Object detection
Object pose is reconstructed by geometry validation (using geometric hashing)

Outlet detection: challenging cases
Shadows
Partial occlusions

PR2 plugin (outlet and plug detection)
http://www.youtube.com/watch?v=GWcepdggXsU
