Local Minimum Problem In Motion Estimation


Motion Estimation: In Video Codecs, ME is defined as searching the best motion vector (MV), which is the displacement of the coordinate of the best similar block within a given search range in previously-encoded video frames for a block in the current video frame.

Matching Error: Is the sum of absolute differences (SAD) between the reference and the candidate block.

The search range for H.264 video codec for High Definition is resolution very high and is -128 to +127 resulting huge computations. So, Motion Estimation consumes 70% of the cycles of the total video encoder. Much research work was done in coming up with fast motion estimation and many papers were published. I am going to put a post on my study of those papers soon. Before that I wanted to highlight the problem called local minimum problem that arises with the common assumption made by the authors of the papers to reduce the number of search points within the given range.

Assumption [1]: The matching error surface inside the search window is unimodal so that the matching error decreases monotonically as the searched point moves closer to the global optimum.

This means if we compute the matching error at all the points in the search range and plot the error, the error surface should have single minimum (which gives lowest error- the optimum/best match) at one point and the error should reduce monotonically when we move close to this optimum point.

But if we look at the error surface for integer pixel motion estimation (IME) given in the figure-a [1], the surface not has single minimum point. There multiple local minimum points within the search range. So there is large chance that if the full search was not conducted, the search will be trapped into one of these local minimum points which may not be the optimum minimum point. As the search range increases (for HD), the probability of this trapping into wrong local minimum point increases. But for the fractional pixel motion estimation (1/8-pixel case was shown in figure), the assumption hold good.

[1] Fast Integer Pel and Fractional Pel Motion Estimation for JVT: JVT-F017

Comments