Tuesday, 3 July 2012

Important techniques in Vision for Robotics

Some of the key computer vision skills required for humanoid robotic vision, as I understand it, are:

1.) Hough transforms to detect lines, planes, contours etc.

2.) Perspective vision (Especially if we use two cameras for (eye-like) vision, and for depth mapping).

3.) Block matching for correcting the differences between the left and right views. (Just close one eye and you'll find the image significantly different from what you see with both eyes open. Simulating this kind of correction on a computer is a pretty difficult task and is a subject of ongoing research.)

4.) Calculating odometry information by calculating the kinematic transforms for each step the robot takes. This is necessary for keeping balance without a special sensor for balance.

5.) Calculating the floor planes from the visual information using standard 3-D reconstruction techniques and/or depth maps.

6.) Creating an occupancy grid once we have found the floor plane and clustered it into object-blocked clusters and free clusters.

7.) Making an efficient algorithm to plan paths based on the occupancy grid. (Maybe A*, Dijkstra's algorithm or something similar.)
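
As a rough illustration of point 7, here is a minimal sketch of A* on an occupancy grid. The grid encoding (0 = free, 1 = occupied), the 4-connected moves with unit cost and the function name `astar` are all assumptions made for this example, not something a real robot stack prescribes.

```python
import heapq

def astar(grid, start, goal):
    """Shortest 4-connected path on an occupancy grid (0 = free, 1 = occupied).

    Returns a list of (row, col) cells from start to goal, or None if unreachable.
    """
    rows, cols = len(grid), len(grid[0])

    def heuristic(cell):
        # Manhattan distance never overestimates for unit-cost 4-connected moves
        return abs(cell[0] - goal[0]) + abs(cell[1] - goal[1])

    open_heap = [(heuristic(start), 0, start)]
    came_from = {start: None}
    cost_so_far = {start: 0}

    while open_heap:
        _, cost, current = heapq.heappop(open_heap)
        if current == goal:
            # Walk the parent links back to the start
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        if cost > cost_so_far[current]:
            continue  # stale heap entry, a cheaper route was already found
        r, c = current
        for nxt in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            nr, nc = nxt
            if 0 <= nr < rows and 0 <= nc < cols and grid[nr][nc] == 0:
                new_cost = cost + 1
                if new_cost < cost_so_far.get(nxt, float("inf")):
                    cost_so_far[nxt] = new_cost
                    came_from[nxt] = current
                    heapq.heappush(open_heap, (new_cost + heuristic(nxt), new_cost, nxt))
    return None

# Toy occupancy grid: 1 marks cells blocked by objects
grid = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 0, 0, 0],
    [0, 1, 1, 0],
]
path = astar(grid, (0, 0), (3, 3))
print(path)
```

Setting the heuristic to zero turns the same loop into Dijkstra's algorithm, which is why the two are usually mentioned together.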

Monday, 2 July 2012

No Kinect for External Environments: Why?

Here's why the skeletal tracking algorithms used by Kinect don't work so well in external environments:

The Kinect sensor isn't suited for outside/external environments. It is based on skeletal tracking. Now here's how skeletal tracking works:

For the depth image we:
1) Threshold the depth image to extract the foreground.
2) Remove noise using the morphological operations of erosion and dilation.
3) Remove the remaining small blobs to get a clean foreground-segmented image.
4) This lets us focus on only the subject in the image and calculate the Extended Distance Transform.
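
The first three depth steps can be sketched roughly as below. This is a toy, pure-Python version: the depth range, the 3x3 structuring element, the minimum blob size and all function names are assumptions for illustration, not the values or code Kinect actually uses.

```python
from collections import deque

def threshold_depth(depth, near, far):
    """Mark pixels whose depth lies in [near, far] as foreground (1)."""
    return [[1 if near <= d <= far else 0 for d in row] for row in depth]

def _morph(mask, op):
    """Apply min (erosion) or max (dilation) over a 3x3 window.

    Pixels outside the image are treated as background.
    """
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            window = [
                mask[rr][cc] if 0 <= rr < rows and 0 <= cc < cols else 0
                for rr in (r - 1, r, r + 1)
                for cc in (c - 1, c, c + 1)
            ]
            out[r][c] = op(window)
    return out

def erode(mask):
    return _morph(mask, min)

def dilate(mask):
    return _morph(mask, max)

def remove_small_blobs(mask, min_size):
    """Keep only 4-connected foreground blobs with at least min_size pixels."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if mask[r][c] == 0 or seen[r][c]:
                continue
            blob, queue = [], deque([(r, c)])
            seen[r][c] = True
            while queue:
                cr, cc = queue.popleft()
                blob.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and mask[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            if len(blob) >= min_size:
                for br, bc in blob:
                    out[br][bc] = 1
    return out

# Toy 8x8 depth map in millimetres: far background, a 5x5 subject, one speckle
depth = [[3000] * 8 for _ in range(8)]
for r in range(1, 6):
    for c in range(1, 6):
        depth[r][c] = 1200
depth[7][7] = 1200  # isolated noisy pixel

mask = threshold_depth(depth, near=500, far=2000)   # step 1
opened = dilate(erode(mask))                        # step 2: erosion then dilation
clean = remove_small_blobs(opened, min_size=4)      # step 3
```

In this toy case the erosion already removes the single-pixel speckle and the dilation restores the subject to its original 5x5 size; the blob filter is there for larger noise patches that survive the opening.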

For the RGB image we:
1) Face and upper body detection.
2) Skin Segmentation.
3) Arm Fitting

For the face and upper body detection, we use Haar features in the Viola-Jones face detection algorithm. The Haar features used are edge features, line features and centre-surround features.
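
To make the idea of a Haar feature concrete, here is a minimal sketch of a two-rectangle edge feature evaluated with an integral image, which is how Viola-Jones keeps each feature cheap. The function names and the sign convention (left half minus right half) are assumptions for this example.

```python
def integral_image(img):
    """ii[r][c] holds the sum of img over rows < r and columns < c."""
    rows, cols = len(img), len(img[0])
    ii = [[0] * (cols + 1) for _ in range(rows + 1)]
    for r in range(rows):
        row_sum = 0
        for c in range(cols):
            row_sum += img[r][c]
            ii[r + 1][c + 1] = ii[r][c + 1] + row_sum
    return ii

def rect_sum(ii, top, left, height, width):
    """Sum of the pixels in a rectangle using four lookups."""
    bottom, right = top + height, left + width
    return ii[bottom][right] - ii[top][right] - ii[bottom][left] + ii[top][left]

def haar_edge_feature(ii, top, left, height, width):
    """Two-rectangle edge feature: left half minus right half (width must be even)."""
    half = width // 2
    left_part = rect_sum(ii, top, left, height, half)
    right_part = rect_sum(ii, top, left + half, height, half)
    return left_part - right_part

# A 4x4 patch with a vertical edge: bright on the left, dark on the right
patch = [
    [10, 10, 0, 0],
    [10, 10, 0, 0],
    [10, 10, 0, 0],
    [10, 10, 0, 0],
]
flat = [[5] * 4 for _ in range(4)]

edge_response = haar_edge_feature(integral_image(patch), 0, 0, 4, 4)
flat_response = haar_edge_feature(integral_image(flat), 0, 0, 4, 4)
print(edge_response, flat_response)  # 80 0
```

Line and centre-surround features follow the same pattern with three or more rectangles.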

Problems in the external environment are:
1) Skeletal tracking doesn't work for non-humans.
2) Gestures and human or environment movements cannot be tracked as easily using skeletal tracking when the perspective is different, say sideways. This is a limitation of the Extended Distance Transform used.
3) It's far more difficult to morphologically clean up an image in the wide variety of external environments we can think of.

Ref: http://home.iitk.ac.in/~akar/cs397/Skeletal%20Tracking%20Using%20Microsoft%20Kinect.pdf

Saturday, 2 June 2012

Histogram Equalization and Entropy

Some interesting insights towards the Histogram Equalization Question in Image Processing Exam:


The question was: Does histogram equalization lead to an increase or decrease in entropy of the image?

1) If first order entropy (according to Thiruvikraman Kandhadai sir) or zero order entropy (according to Wikipedia) is taken into consideration, two cases arise:
1)a) If you consider round-off errors, as in the practical case (the most relevant case out of the three), then the entropy of the image decreases after histogram equalization.
1)b) If you do not consider round-off errors, the entropy of the image remains constant before and after histogram equalization.

2) If you consider higher orders of entropy in addition to zero/first order entropy then the entropy of an image increases after histogram equalization.

Why does this happen?

Case 1)b) When we plot a histogram of pixel values, we get columns holding the number of pixels in the image with that particular gray-scale value. Let us call these columns bins. Consider an 8-bit coded image, i.e. 256 levels of gray (this is the bit depth, not the amount of memory in the image). Suppose we have N partially filled bins; then we have 256-N empty bins. Since histogram equalization is ideally a one-to-one mapping of gray levels, the number of partially filled bins cannot change, so it remains at N. Now the formula for first/zero order entropy is

H = -sum_j( P(aj) log2(P(aj)) )

P(aj) for that particular gray value does change, but the same P(aj) value is taken up by another gray value. Since it's a one-to-one mapping, and we obviously cannot split the partially filled bins, the list of probabilities of the grayscale values remains the same.

Since the set of P(aj) values effectively does not change, the entropy does not change.
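
A small numeric check of this argument, as a sketch: the pixel values and the particular one-to-one remapping below are made up for illustration, and entropy is measured in bits (log base 2).

```python
import math
from collections import Counter

def entropy(pixels):
    """First/zero order entropy in bits of a flat list of gray values."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Three occupied bins with probabilities 0.5, 0.25 and 0.25
pixels = [10] * 4 + [20] * 2 + [30] * 2

# A hypothetical one-to-one remapping that stretches the gray levels apart
remap = {10: 0, 20: 128, 30: 255}
remapped = [remap[p] for p in pixels]

print(entropy(pixels), entropy(remapped))  # 1.5 1.5
```

The gray values move, but the list of bin probabilities is the same, so the sum in the entropy formula is unchanged.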

Case 1)a) If you consider round-off errors, there might be a case where the rounding causes the histogram equalization function to stop being a one-to-one mapping: two different gray levels can be rounded to the same output level, so their partially filled bins merge into one. Merging two bins with probabilities p and q replaces the terms -p log(p) - q log(q) with -(p+q) log(p+q), which is always smaller, so in this case the entropy effectively decreases.
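
The rounding effect can be reproduced with a small sketch. It assumes the common textbook form of histogram equalization, where each gray level is mapped to round((L-1) * CDF); the test image is contrived so that two sparse bins collapse into one.

```python
import math
from collections import Counter

def entropy(pixels):
    """First/zero order entropy in bits of a flat list of gray values."""
    n = len(pixels)
    counts = Counter(pixels)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def equalize(pixels, levels=256):
    """Map each gray level g to round((levels - 1) * CDF(g))."""
    n = len(pixels)
    counts = Counter(pixels)
    mapping, cumulative = {}, 0
    for gray in sorted(counts):
        cumulative += counts[gray]
        mapping[gray] = round((levels - 1) * cumulative / n)
    return [mapping[p] for p in pixels], mapping

# Two nearly empty bins next to one dominant bin (10000 pixels in total)
pixels = [100] + [101] + [102] * 9998
equalized, mapping = equalize(pixels)

print(mapping)                               # levels 100 and 101 both map to 0
print(entropy(pixels), entropy(equalized))   # the second value is smaller
```

Here the mapping is many-to-one only because of the rounding step; without it the three bins would stay distinct and the entropy would be unchanged.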

Case 2) Higher order entropy terms will cause the overall entropy to increase. However, you have to consider inter-pixel redundancy, and this makes the calculations a bit more complicated.

After some clarifications with Sir, I've found that he's going to give marks according to the assumptions taken and relevant explanation.

Have fun!!

Saturday, 26 May 2012

Image Segmentation Basics

There are two types of Image Segmentation:
a) Discontinuity Based Approach
b) Similarity Based approach

In the discontinuity based approach we try to segment the image with respect to isolated points, lines and edges. Edge detection operators such as Sobel, Prewitt, Canny, Laplacian and LoG all fall under this category. For line detection, Hough transforms are used. For isolated points there are a variety of problem-specific methods.
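
As a rough sketch of how a Hough transform finds a line, the snippet below lets every edge point vote for all (rho, theta) pairs it could lie on, using the normal form rho = x cos(theta) + y sin(theta). The 1-degree angle step, the rounding of rho to whole pixels and the function name are assumptions for this example.

```python
import math
from collections import Counter

def hough_lines(points, theta_step_deg=1):
    """Accumulate votes for lines in (rho, theta) space.

    Each point (x, y) votes once per angle for the bin
    (round(x*cos(theta) + y*sin(theta)), theta in degrees).
    """
    votes = Counter()
    for theta_deg in range(0, 180, theta_step_deg):
        theta = math.radians(theta_deg)
        cos_t, sin_t = math.cos(theta), math.sin(theta)
        for x, y in points:
            rho = round(x * cos_t + y * sin_t)
            votes[(rho, theta_deg)] += 1
    return votes

# Edge points lying on the vertical line x = 3
points = [(3, y) for y in range(100)]

(rho, theta_deg), count = hough_lines(points).most_common(1)[0]
print(rho, theta_deg, count)  # 3 0 100
```

The strongest accumulator bin corresponds to the line that the most edge points agree on; in a real image the points would come from an edge detector such as the ones listed above.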

In the similarity based approach there are several possible methods, all of which give differing results:
1) Thresholding methods
2) Region Growing methods
3) Region Splitting and Merging methods

Thresholding simply classifies pixels by a single property (say color frequency or color intensity) and segments the image into parts based on that property. (Degrees of freedom = 1, since you can only depend on one property for the similarity check.)

Region Growing methods involve taking seed points, looking at the adjacent pixels and classifying each of those pixels as part of the segment or not by comparing some property. K-NN is sometimes loosely cited as a related idea, although strictly speaking it is a classifier rather than a region growing algorithm. (Degrees of freedom = 2, with the second degree depending on the spatial layout of the image.)
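
A minimal sketch of region growing from a single seed is shown below. The similarity test (intensity within a tolerance of the seed value), the 4-connectivity and the function name are assumptions chosen to keep the example small.

```python
from collections import deque

def region_grow(img, seed, tol):
    """Grow a region from seed over 4-connected pixels within tol of the seed value."""
    rows, cols = len(img), len(img[0])
    seed_value = img[seed[0]][seed[1]]
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
            if (0 <= nr < rows and 0 <= nc < cols
                    and (nr, nc) not in region
                    and abs(img[nr][nc] - seed_value) <= tol):
                region.add((nr, nc))
                queue.append((nr, nc))
    return region

# Dark object (values near 10) on a bright background (values near 90)
img = [
    [10, 12, 90, 91],
    [11, 13, 92, 90],
    [10, 11, 12, 89],
    [95, 94, 11, 10],
]
region = region_grow(img, seed=(0, 0), tol=5)
print(len(region))  # 9
```

The result depends on both the property being compared and on where the pixels sit relative to the seed, which is the spatial dependence mentioned above.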

Region Splitting and Merging methods take a tree based approach to the segmentation problem. First divide the image into k regions, and then divide each of the k regions into l regions. Finally, merge the regions with similar properties. (Degrees of freedom = 3, because you can split the first time based on property a, the second time based on property b, and then merge according to property c.)
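
One common way to realise this is a quadtree (k = 4), sketched below. The sketch assumes a square image whose side is a power of two, splits a block whenever its intensity range exceeds a tolerance, and then merges touching leaves with similar mean intensity; all names and thresholds are illustrative assumptions.

```python
def split(img, top, left, size, tol, leaves):
    """Recursively split a square block into quadrants until it is homogeneous."""
    block = [img[r][c] for r in range(top, top + size) for c in range(left, left + size)]
    if size == 1 or max(block) - min(block) <= tol:
        leaves.append((top, left, size, sum(block) / len(block)))
        return
    half = size // 2
    for dt, dl in ((0, 0), (0, half), (half, 0), (half, half)):
        split(img, top + dt, left + dl, half, tol, leaves)

def touching(a, b):
    """True if two square leaves share part of an edge."""
    at, al, asz, _ = a
    bt, bl, bsz, _ = b
    rows_overlap = at < bt + bsz and bt < at + asz
    cols_overlap = al < bl + bsz and bl < al + asz
    side_by_side = (al + asz == bl or bl + bsz == al) and rows_overlap
    stacked = (at + asz == bt or bt + bsz == at) and cols_overlap
    return side_by_side or stacked

def merge(leaves, tol):
    """Union touching leaves whose means differ by at most tol; return a label per leaf."""
    parent = list(range(len(leaves)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    for i in range(len(leaves)):
        for j in range(i + 1, len(leaves)):
            if touching(leaves[i], leaves[j]) and abs(leaves[i][3] - leaves[j][3]) <= tol:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(leaves))]

img = [
    [10, 10, 10, 10],
    [10, 10, 10, 10],
    [10, 10, 50, 50],
    [10, 10, 50, 50],
]
leaves = []
split(img, 0, 0, 4, tol=5, leaves=leaves)
labels = merge(leaves, tol=5)
print(len(leaves), len(set(labels)))  # 4 2
```

Here the split and merge steps use the same property (intensity), but nothing stops each step from using a different one. Note that merging through chains of similar neighbours can join regions whose means differ by more than the tolerance end to end.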