Scale Invariant Feature Detection and Image Filtering
Computer Vision
Announced: 2022/03/04 (Fri.) (ROC year 111)
Outline
Part 1: Scale Invariant Feature Detection
• Implement Difference of Gaussian
Part 2: Image Filtering
• Implement bilateral filter
• Advanced color-to-gray conversion
Part 1:
Difference of Gaussian
Gaussian Blur
Difference of Gaussian Filter
Gaussian Pyramid
Find local extremum
Implementation
In DoG.py
• Apply Gaussian blur with the corresponding sigma value. For the second octave, downsample the fourth blurred image (the fifth image) of the first octave and use it as the base image.
• Subtract the less blurred image from the more blurred one in each adjacent pair to get the DoG images
• Threshold the pixel value and find the local extremum
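The blur and subtraction steps above can be sketched in plain NumPy as follows. This is only a minimal sketch, not the reference solution: the helper names (`gaussian_kernel_1d`, `gaussian_blur`, `build_dog`), the base sigma of 2**0.25, the 3-sigma kernel truncation, the reflected borders, and downsampling by keeping every second pixel are all assumptions made for illustration. The sketch produces four DoG images per octave over two octaves, matching the 8 DoG images asked for in the report.

```python
import numpy as np

def gaussian_kernel_1d(sigma):
    # Truncate the kernel at 3 sigma (an assumption for this sketch).
    radius = max(1, int(np.ceil(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2 * sigma ** 2))
    return k / k.sum()

def gaussian_blur(img, sigma):
    # Separable blur: filter along rows, then along columns, with reflected borders.
    k = gaussian_kernel_1d(sigma)
    r = len(k) // 2
    padded = np.pad(img, r, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def build_dog(img, sigma=2 ** 0.25, num_octaves=2, num_dog=4):
    base = img.astype(np.float64)
    dog_octaves = []
    for _ in range(num_octaves):
        # The base image plus num_dog progressively blurred versions of it.
        gaussians = [base] + [gaussian_blur(base, sigma ** i) for i in range(1, num_dog + 1)]
        # Assumed sign: more blurred minus less blurred.
        dogs = [gaussians[i + 1] - gaussians[i] for i in range(num_dog)]
        dog_octaves.append(np.stack(dogs))
        # The next octave starts from the most blurred image, downsampled by 2.
        base = gaussians[-1][::2, ::2]
    return dog_octaves
```

For a 32x32 input, `build_dog` returns two arrays of shape (4, 32, 32) and (4, 16, 16).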
Assignment Description
• part1/eval.py
• DO NOT Modify!
• part1/main.py
• Read image, execute DoG, visualize results for report, … etc.
• part1/DoG.py
• Follow the instructions and implement Difference of Gaussian.
• The output should be an np.array of shape (x, 2), with one row per detected keypoint
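A possible way to go from one octave of DoG images to an (x, 2) array is sketched below. The function name `find_keypoints`, the 3x3x3 neighborhood across adjacent DoG images, the strict comparison of the absolute value against the threshold, and the (row, column) coordinate order are assumptions; check them against the provided ground truth.

```python
import numpy as np

def find_keypoints(dogs, threshold):
    # dogs: array of shape (num_dog, H, W) holding the DoG images of one octave.
    n, h, w = dogs.shape
    center = dogs[1:-1, 1:-1, 1:-1]
    is_max = np.ones(center.shape, dtype=bool)
    is_min = np.ones(center.shape, dtype=bool)
    # Compare every interior voxel with its 3x3x3 neighborhood.
    for dz in (-1, 0, 1):
        for dy in (-1, 0, 1):
            for dx in (-1, 0, 1):
                neighbor = dogs[1 + dz:n - 1 + dz, 1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
                is_max &= center >= neighbor
                is_min &= center <= neighbor
    mask = (is_max | is_min) & (np.abs(center) > threshold)
    # Drop the scale index and shift back to full-image coordinates.
    _, ys, xs = np.nonzero(mask)
    points = np.stack([ys + 1, xs + 1], axis=1)
    if points.size == 0:
        return points
    # Remove duplicates found at several scales.
    return np.unique(points, axis=0)
```

Keypoints found in the second octave would still need their coordinates scaled back to the original image size before being merged with those of the first octave.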
• Recommended steps
• Implement Difference of Gaussian in DoG.py
• Use eval.py to evaluate your DoG.py
• By
• Your result needs to match the ground truth
• Finish remaining code in main.py if needed
Supplementary:
Advanced Color-to-Gray Conversion
Color Conversion
• RGB2YUV
• Read https://en.wikipedia.org/wiki/YUV for more details
• Many vision systems only take the Y channel (luminance) as input to reduce computations
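As a small illustration of taking only the Y channel, the sketch below applies the BT.601 luma weights (0.299, 0.587, 0.114) listed on the linked YUV page. The function name `rgb_to_y` is hypothetical, and the input is assumed to be in RGB channel order (images loaded with cv2.imread are in BGR order and would need their channels reversed first).

```python
import numpy as np

def rgb_to_y(rgb):
    # rgb: array of shape (H, W, 3) in RGB order; returns a float luminance map (H, W).
    weights = np.array([0.299, 0.587, 0.114])  # BT.601 luma weights
    return rgb.astype(np.float64) @ weights
```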
RGB to Gray
Problems
What happened?
• Dimensionality reduction
• Another view:
• The conversion is actually a plane equation! All colors on the same plane are converted to the same grayscale value.
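The plane view can be checked numerically. In the sketch below, the weights are the common (0.299, 0.587, 0.114) luma weights, and the two colors are arbitrary values picked for illustration: the second color is the first one moved along a direction orthogonal to the weight vector, so both land on the same plane and receive the same gray value.

```python
import numpy as np

w = np.array([0.299, 0.587, 0.114])   # conversion weights, i.e. the plane normal
d = np.array([0.587, -0.299, 0.0])    # a direction inside the plane, since w @ d == 0

color_a = np.array([100.0, 100.0, 50.0])
color_b = color_a + 100.0 * d         # (158.7, 70.1, 50.0), a visibly different color

gray_a = float(w @ color_a)
gray_b = float(w @ color_b)           # same gray value as gray_a
```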
Finding a better conversion
• The general form of linear conversion:
• Let’s consider the quantized weight space
• For example:
• Given a color image, each weight combination corresponds to one candidate grayscale image.
• We are going to identify which candidate is better!
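The candidate set can be enumerated as sketched below. The quantization step of 0.1 and the constraint that the three non-negative weights sum to 1 are assumptions taken from the cited decolorization paper, not from the text above; the helper names `weight_candidates` and `to_gray` are hypothetical.

```python
from itertools import product
import numpy as np

def weight_candidates(steps=10):
    # All (wr, wg, wb) with each weight in {0, 1/steps, ..., 1} and wr + wg + wb = 1.
    out = []
    for r, g in product(range(steps + 1), repeat=2):
        b = steps - r - g
        if b >= 0:
            out.append((r / steps, g / steps, b / steps))
    return out

def to_gray(rgb, weights):
    # Linear conversion: weighted sum of the R, G and B channels.
    return rgb.astype(np.float64) @ np.array(weights)
```

With a step of 0.1 this yields 66 weight combinations, hence 66 candidate grayscale images per color image.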
Yibing Song, Linchao Bao, Xiaobin Xu, and Qingxiong Yang. Decolorization: Is rgb2gray() Out? In SIGGRAPH Asia 2013 Technical Briefs.
• Joint bilateral filter (JBF) as the similarity measurement
• Find local minimum
• The actual weight space looks like this:
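One way to search for local minima is sketched below, under several assumptions: weights are stored as integer triples (r, g, b) in units of the quantization step, two candidates are neighbors when one unit of weight moves from one channel to another (up to six neighbors on the triangular grid), and a local minimum must have a strictly lower cost than every existing neighbor. The function name `local_minima` and the `cost` dictionary are hypothetical.

```python
def local_minima(cost):
    # cost: dict mapping integer weight triples (r, g, b) to the cost of that candidate.
    moves = [(1, -1, 0), (-1, 1, 0), (1, 0, -1), (-1, 0, 1), (0, 1, -1), (0, -1, 1)]
    minima = []
    for (r, g, b), c in cost.items():
        neighbors = [(r + dr, g + dg, b + db) for dr, dg, db in moves]
        # Neighbors outside the weight space are simply skipped.
        if all(c < cost[n] for n in neighbors if n in cost):
            minima.append((r, g, b))
    return sorted(minima)
```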
Part 2: Image Filtering
Bilateral Filter
• Given input image 𝐼 and guidance 𝑇, the bilateral filter is written as:
$$I'_p = \frac{\sum_{q \in \Omega_p} G_s(p, q) \cdot G_r(T_p, T_q) \cdot I_q}{\sum_{q \in \Omega_p} G_s(p, q) \cdot G_r(T_p, T_q)}$$
• 𝐼𝑝: Intensity of pixel 𝑝 of original image 𝐼
• 𝐼𝑝′: Intensity of pixel 𝑝 of filtered image 𝐼′
• 𝑇𝑝: Intensity of pixel 𝑝 of guidance image 𝑇
• Ω𝑝: Window centered in pixel 𝑝
• 𝐺𝑠: Spatial kernel
• 𝐺𝑟: Range kernel
• For the spatial kernel 𝐺𝑠:
• For the range kernel 𝐺𝑟:
• If 𝑇 is a single-channel image:
• If 𝑇 is a color image:
• Pixel values should be normalized to [0, 1] (divided by 255) to construct range kernel.
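The definition can be evaluated directly for a single pixel, as in the sketch below. The kernel formulas are not reproduced in the text above, so the standard Gaussian form is assumed for both: the spatial kernel uses the squared pixel distance with sigma_s, and the range kernel uses the squared guidance difference (summed over channels for a color guidance) with sigma_r, after dividing by 255. The function name `jbf_pixel`, the `radius` parameter, and clipping the window at the image border are also assumptions.

```python
import numpy as np

def jbf_pixel(img, guide, y, x, radius, sigma_s, sigma_r):
    # img: float array (H, W) or (H, W, 3); guide: float array (H, W) or (H, W, 3) in [0, 255].
    h, w = img.shape[:2]
    num, den = 0.0, 0.0
    for qy in range(max(0, y - radius), min(h, y + radius + 1)):
        for qx in range(max(0, x - radius), min(w, x + radius + 1)):
            # Spatial kernel: depends only on the distance between p and q.
            gs = np.exp(-((qy - y) ** 2 + (qx - x) ** 2) / (2 * sigma_s ** 2))
            # Range kernel: depends on the guidance difference, normalized to [0, 1].
            diff = (guide[qy, qx] - guide[y, x]) / 255.0
            gr = np.exp(-np.sum(diff ** 2) / (2 * sigma_r ** 2))
            num = num + gs * gr * img[qy, qx]
            den = den + gs * gr
    return num / den
```

When the guidance is the input image itself, this reduces to the ordinary bilateral filter; with a small sigma_r, pixels across a strong edge get almost no weight, so the edge is preserved.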
• part2/main.py
• Read image, execute joint bilateral filter, read setting file, select the best grayscale conversion… etc.
• part2/JBF.py
• Implement joint bilateral filter
• part2/eval.py (DO NOT Modify!)
• Evaluate the correctness of the output of joint bilateral filter
• TAs will run this file to score the uploaded code.
• When testing your code, we will use different arguments, such as 𝜎𝑠 and 𝜎𝑟, together with the corresponding ground-truth file.
• part2/testdata/
• One example image with bf and jbf ground truth
• Two images with respective setting files
• Setting file gives 𝜎𝑠, 𝜎𝑟 and five kinds of gray conversion
• Use those five conversions plus the original cv2 gray conversion (six in total) as guidance to run the joint bilateral filter and compute the perceptual similarity.
• Refer to p. 24 and p. 25 for details (we use the L1-norm as our cost function).
• Note: cast the images to np.int32 before subtracting to avoid overflow.
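The overflow mentioned in the note is easy to reproduce. The sketch below uses two tiny made-up uint8 arrays and a hypothetical helper `l1_cost`; it shows that subtracting uint8 arrays wraps around, while casting to np.int32 first gives the intended L1 cost.

```python
import numpy as np

def l1_cost(a, b):
    # Cast to a signed type first so that negative differences are kept.
    return int(np.sum(np.abs(a.astype(np.int32) - b.astype(np.int32))))

a = np.array([[10, 200]], dtype=np.uint8)
b = np.array([[20, 190]], dtype=np.uint8)

wrapped = a - b        # uint8 arithmetic wraps around: [[246, 10]]
cost = l1_cost(a, b)   # |10 - 20| + |200 - 190| = 20
```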
• Recommended steps
• Implement joint bilateral filter in JBF.py
• Use eval.py to evaluate your JBF.py
• By
• The errors of both the bilateral filter and the joint bilateral filter should be 0
• Finish remaining code in main.py if needed
• Improve the inference speed of joint bilateral filter
• About the speed test of JBF…
• For a fair comparison, you CAN ONLY use basic functions in JBF.py (e.g. cv2.filter2D and cv2.GaussianBlur are not allowed)
• Cython, multi-threading and GPU acceleration are forbidden.
• Intel Core i7-6800K CPU + 128 GB RAM ⇒ ~1.28 sec
• Some useful tips
• Build look-up-table for both spatial and range gaussian kernels
• Reduce the number of for-loops so that whole arrays are processed at once
• We use only one for-loop (in range(1, window_size**2)) in the entire bilateral filter
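The tips above can be combined as in the following sketch, which is one possible arrangement and not the TAs' implementation. It assumes Gaussian kernels, a single-channel uint8 guidance, symmetric border padding, and a window given by a `radius` parameter; the function name `jbf_fast` is hypothetical. The range kernel is a 256-entry look-up table indexed by the absolute guidance difference, and the only Python loop runs over the window offsets while each iteration updates all pixels at once.

```python
import numpy as np

def jbf_fast(img, guide, radius, sigma_s, sigma_r):
    # img: uint8 (H, W, 3); guide: uint8 (H, W). Returns a float64 array (H, W, 3).
    h, w = guide.shape
    r = radius
    pad_img = np.pad(img.astype(np.float64), ((r, r), (r, r), (0, 0)), mode="symmetric")
    pad_gd = np.pad(guide.astype(np.int32), r, mode="symmetric")

    # Look-up tables: one entry per 1-D offset and per possible absolute difference.
    offsets = np.arange(-r, r + 1)
    lut_s = np.exp(-(offsets ** 2) / (2 * sigma_s ** 2))
    lut_r = np.exp(-((np.arange(256) / 255.0) ** 2) / (2 * sigma_r ** 2))

    num = np.zeros((h, w, 3))
    den = np.zeros((h, w))
    center = pad_gd[r:r + h, r:r + w]
    # Single loop over the window offsets; every iteration handles the whole image.
    for k in range((2 * r + 1) ** 2):
        dy, dx = divmod(k, 2 * r + 1)
        shifted_gd = pad_gd[dy:dy + h, dx:dx + w]
        weight = lut_s[dy] * lut_s[dx] * lut_r[np.abs(shifted_gd - center)]
        num += weight[..., None] * pad_img[dy:dy + h, dx:dx + w]
        den += weight
    return num / den[..., None]
```

A color guidance would need the squared differences of the three channels summed before the exponential, for example by multiplying three per-channel look-ups.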
Package
• Python 3.6+
• Python standard library (https://docs.python.org/3.7/library/)
• NumPy 1.21.1
• opencv-python 4.5.1
Submission
• Directory architecture:
+ R07654321/
– DoG.py
– JBF.py
– report.pdf
• Put all of the above files in a directory (named StudentID) and compress the directory into a zip file (named StudentID.zip)
• e.g. after TAs run “unzip R07654321.zip”, it should create one directory named “R07654321”
• Submit to NTU COOL
• Late policy:
http://media.ee.ntu.edu.tw/courses/cv/22S/hw/delay_policy.pdf
• Do NOT copy homework (code or report) from others
Report
• Your student ID, name
• Part1: Difference of Gaussian
• Plot the 8 DoG images described on page 6 with threshold 5 (4%)
• Use three thresholds (2, 5, 7) on 2.png, plot the results, and describe the differences (5%)
• Part2: Joint bilateral filter
• For 1.png and 2.png:
• Report the cost for each filtered image (by using 6 grayscale images as guidance) (1%+1%)
• Show original RGB image / two filtered RGB images and two grayscale images with highest and lowest cost (five images in total for each input image) (2%+2%)
• Describe the difference between those two grayscale images. (5%+5%)
• Describe how you sped up the implementation of the bilateral filter. (5%)
Grading (Total 15%)
• Part 1 Code: 30%
• 0%, others
• Part 2 Code: 30%
• 30%, runs within 5 mins and no error (both bf and jbf error = 0)
• 0%, others
• Report: 30%
• Part 2 Inference time: 10%
• 10%, Top ~ 20%
• 6%, 20% ~ 50%
• 3%, 50% ~ 80%
• 0%, 80% ~
• Kai-Siang Yang (楊凱翔)
E-mail: siangyang@media.ee.ntu.edu.tw
Location: Room 421, Barry Lam Hall (博理館)
• Chih-Ting Liu (劉致廷)
E-mail: jackieliu@media.ee.ntu.edu.tw
Location: Room 421, Barry Lam Hall (博理館)



