1 Edges and Thresholding

1.1 Please state for each of the statements below whether they are true (+) or false (-).

Symmetric filters are associative even when correlation instead of convolution is used.
Any edge detection filter needs to have zero-crossings as a requirement, i.e. positive as well as negative values in the filter mask.
A normalized histogram is a probability mass function.
To increase the quality of Otsu's thresholding algorithm, block processing can be used. In the limit (blocks of size 1x1) this is equal to a locally adaptive thresholding with a larger mask.
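For reference, a minimal NumPy sketch of the histogram and Otsu statements above, using a hypothetical random image rather than exam data: the normalized histogram sums to 1 (a probability mass function), and Otsu's threshold can be computed per block instead of globally.

import numpy as np

# Hypothetical 8-bit image; any grayscale image works the same way.
img = np.random.randint(0, 256, size=(64, 64))

# A normalized histogram sums to 1 over all gray levels, i.e. it can be
# read as a probability mass function over the intensity values.
hist = np.bincount(img.ravel(), minlength=256).astype(float)
pmf = hist / hist.sum()
assert np.isclose(pmf.sum(), 1.0)

def otsu_threshold(pmf):
    """Return the threshold maximizing the between-class variance."""
    levels = np.arange(256)
    best_t, best_var = 0, -1.0
    for t in range(1, 256):
        w0, w1 = pmf[:t].sum(), pmf[t:].sum()
        if w0 == 0 or w1 == 0:
            continue
        mu0 = (levels[:t] * pmf[:t]).sum() / w0
        mu1 = (levels[t:] * pmf[t:]).sum() / w1
        var_between = w0 * w1 * (mu0 - mu1) ** 2
        if var_between > best_var:
            best_var, best_t = var_between, t
    return best_t

# Block processing: one Otsu threshold per block instead of one global one.
block = 16
binary = np.zeros_like(img, dtype=bool)
for y in range(0, img.shape[0], block):
    for x in range(0, img.shape[1], block):
        tile = img[y:y + block, x:x + block]
        p = np.bincount(tile.ravel(), minlength=256).astype(float)
        p /= p.sum()
        binary[y:y + block, x:x + block] = tile >= otsu_threshold(p)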

2 Features

2.1 Given the two image patches A and B shown below.
Please state for each of the statements below whether they are true (+) or false (-).

The correlation between the two patches is 27.
Patch B could be directly used as a 3x3 box filter.
As the normalized cross-correlation between two image patches is the angle between the vectorized versions of these patches, the normalized cross-correlation cannot be smaller than 0.
If the zero-mean normalized cross-correlation between two patches is -1, then one patch is an exact copy of the other patch up to a shift and scaling factor.
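The correlation statements can be checked numerically. The sketch below uses hypothetical patches, since the exam figure with A and B is not reproduced here; it computes the plain correlation (sum of element-wise products), the normalized cross-correlation (the cosine of the angle between the vectorized patches), and the zero-mean normalized cross-correlation, which reaches -1 for a negatively scaled and shifted copy.

import numpy as np

# Hypothetical 3x3 patches (not the patches A and B from the exam figure).
A = np.array([[1., 2., 1.],
              [0., 3., 0.],
              [1., 2., 1.]])
B = np.array([[2., 1., 2.],
              [1., 0., 1.],
              [2., 1., 2.]])

a, b = A.ravel(), B.ravel()

# Plain (un-normalized) correlation: sum of element-wise products.
corr = np.dot(a, b)

# Normalized cross-correlation: cosine of the angle between the vectorized
# patches (not the angle itself); it can be negative if patch values can be.
ncc = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))

# Zero-mean normalized cross-correlation.
def zncc(p, q):
    p = p - p.mean()
    q = q - q.mean()
    return np.dot(p, q) / (np.linalg.norm(p) * np.linalg.norm(q))

# ZNCC is -1 exactly when one patch is a negatively scaled and shifted copy
# of the other, e.g. C = -2 * A + 10.
C = (-2.0 * A + 10.0).ravel()
print(corr, ncc, zncc(a, C))   # zncc(a, C) is -1 up to rounding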

2.2 Please state for each of the statements below whether they are true (+) or false (-).

To get a good estimate of the dominant gradient direction of a feature one simply computes the gradient directions around the feature location and takes the average of those.
A feature detector detects features based on a feature descriptor.
Let m1 be the nearest neighbor feature (i.e. best match) for a SIFT feature in an image I. The second nearest neighbor feature minimizes the distance in pixel coordinates from m1 in image I compared to all other found features in I.
If it is more important to not miss any feature correspondences than having fewer but correct feature matches, you should prefer an algorithm with a high precision instead of a high recall.
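For the SIFT matching statements, a small sketch with hypothetical descriptors: the nearest and second nearest neighbors are ranked by distance in descriptor space, and Lowe's ratio test compares those two distances.

import numpy as np

# Hypothetical SIFT-like descriptors: one query descriptor and a set of
# candidate descriptors from image I.
rng = np.random.default_rng(0)
query = rng.normal(size=128)
candidates = rng.normal(size=(50, 128))

# Nearest and second nearest neighbors are ranked by descriptor distance,
# not by pixel distance to the best match m1.
dists = np.linalg.norm(candidates - query, axis=1)
order = np.argsort(dists)
m1, m2 = order[0], order[1]

# Lowe's ratio test: accept the match only if the best match is clearly
# better than the second best.
ratio = dists[m1] / dists[m2]
is_good_match = ratio < 0.8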

3 Filtering

3.1 Please state for each of the statements below whether they are true (+) or false (-).

If we only increase the image resolution in our camera, the relative noise per pixel increases.
The phase in the Fourier transform of an image represents the shift in the intensity direction.
The pixel index in the discrete Fourier transform describes the number of oscillations of the represented trigonometric function in x- and y-direction.
Any discrete linear filter is separable.
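A quick way to check separability of a discrete linear filter is the rank of its kernel matrix: a filter is separable exactly when the rank is 1, i.e. the kernel is an outer product of a column and a row filter. The sketch below contrasts a separable Gaussian-like kernel with a generic random kernel.

import numpy as np

# Separable by construction: outer product of two 1D filters has rank 1.
gauss_1d = np.array([1., 2., 1.])
gauss_2d = np.outer(gauss_1d, gauss_1d)
print(np.linalg.matrix_rank(gauss_2d))      # 1 -> separable

# A generic kernel is usually not separable.
random_kernel = np.random.default_rng(1).normal(size=(3, 3))
print(np.linalg.matrix_rank(random_kernel)) # typically 3 -> not separable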

4 Image Acquisition

4.1 Please state for each of the statements below whether they are true (+) or false (-).

The highest recordable frequency for an image depends only on the image resolution. The higher the resolution, the more details are captured.
You record the same scene twice, once with an exposure time of 1/5th of a second (image A) and once with an exposure time of 1/10th of a second (image B). As image intensity scales linearly with exposure time, the mean value of image A will always be twice as high as that of image B.
The signal-to-noise-ratio (amount of noise in comparison to the image intensity) increases with increased exposure time.
If you encode an RGB HDR image as an 8-bit per channel RGBA image from which you can recover the HDR values as (R*A, G*A, B*A), this gives you the same precision as a 16-bit per channel RGB image as one can span the full range from 0 (choosing A=0) to 2^16 - 1 (choosing R,G,B=255 and A=255).
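The RGBA encoding statement can be probed by counting which values the product R*A can actually take; assuming 8-bit channels, only integers that factor into two values in 0..255 are reachable.

# Count the distinct values reachable as R*A with 8-bit R and A.
reachable = {r * a for r in range(256) for a in range(256)}
print(len(reachable), 2 ** 16)  # far fewer distinct values than 65536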

5Machine Learning

5.1 Please state for each of the statements below whether they are true (+) or false (-).

Given are two networks A and B with the same number of input and output nodes and one hidden layer. The hidden layer of A is a convolution layer with a 3x3 filter, the hidden layer of B is a fully connected layer. Network A has 9 times more connections between the nodes than network B because of the 3x3 filter.
A neural net can get stuck in a local minimum during training.
If you could prove that your net found a global optimum during training, does that mean that your net found the best possible solution for the problem you are trying to solve?
A rectifier (ReLU or Rectified Linear Unit) maps the value -5 to 5.
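Two of the statements above can be sanity-checked directly: the rectifier clamps negative inputs to zero, and a 3x3 convolution layer connects each output only to a 3x3 window of inputs rather than to all of them. The connection counts below use hypothetical layer sizes, ignoring padding and border effects.

import numpy as np

def relu(x):
    # ReLU clamps negative inputs to zero, so relu(-5) is 0, not 5.
    return np.maximum(0, x)

print(relu(-5))   # 0

# Hypothetical layer with n inputs and n outputs.
n = 100
fully_connected_links = n * n   # every input connects to every output
conv_links = n * 9              # each output sees only a 3x3 window
print(conv_links, fully_connected_links)  # the conv layer has far fewer links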

6 Optical Flow

6.1 Please state for each of the statements below whether they are true (+) or false (-).

Assume two cameras C1 and C2 took images of the same scene from different positions at distinct time steps t1,1 and t1,2 for camera C1 and t2,1 and t2,2 for camera C2, with t1,1 < t1,2 and t2,1 < t1,2 < t2,2. Assuming the camera parameters and the apparent motion between the two images for each camera are given, it is then possible to reconstruct the 3D position of each point in the scene visible in both cameras and all images.
If the flow field F between two images I1 and I2 is optimized according to the brightness constancy assumption I1(x,y) = I2(x+u, y+v) with (x,y) being the pixel position and (u,v) the flow field value at pixel (x,y), then it is not possible from this information alone to know if the flow field encodes a forward flow from I1 to I2 or a backward flow from image I2 to I1.
Given two successive images I1 and I2 in a video. Forward warping computes the flow field from image I1 to I2 and backward warping from I2 to I1.
Given points p1 = (1,2,3) and p2 = (4,2,6). Let p1 be the position of the linear interpolation at \alpha = 0 and p2 at \alpha = 1. The interpolated position at \alpha = 1/3 is (2,2,4). (\alpha stands for the Greek letter alpha.)
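The interpolation statement is simple arithmetic; a worked check with NumPy:

import numpy as np

# Linear interpolation between p1 and p2 at parameter alpha.
p1 = np.array([1., 2., 3.])
p2 = np.array([4., 2., 6.])
alpha = 1.0 / 3.0
p = (1 - alpha) * p1 + alpha * p2
print(p)   # [2. 2. 4.]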

7 Parametric Interpolation

7.1 Given two line segments L1 and L2 in R^2. L1 is defined by the two points l11 = (1,1) and l12 = (2,2). L2 is defined by the two points l21 = (1,1) and l22 = (4,3). L1 and L2 are related by a similarity transform M = T*S, where T is a translation with the translation components tx and ty in x and y direction respectively, and S a scaling with the scaling components sx and sy respectively.
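A minimal sketch of how the parameters of M could be recovered, assuming l11 corresponds to l21 and l12 to l22, and that M applies the scaling first and the translation second (M*p = S*p + t):

import numpy as np

# Each axis gives a 2x2 linear system in (s, t): s * p + t = q,
# set up from the two assumed point correspondences.
def solve_axis(p1, p2, q1, q2):
    A = np.array([[p1, 1.0], [p2, 1.0]])
    b = np.array([q1, q2])
    return np.linalg.solve(A, b)   # returns (s, t)

sx, tx = solve_axis(1, 2, 1, 4)    # x components of l11, l12 -> l21, l22
sy, ty = solve_axis(1, 2, 1, 3)    # y components
print(sx, tx, sy, ty)              # 3.0 -2.0 2.0 -1.0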