DLCV
Problem 1:
Semantic Segmentation
Outline
• Homework Introduction
• Homework Policy
• Others
Homework 2 Problem 1
• In this problem, you will need to implement two semantic segmentation models and answer some questions in the report.
  o A baseline model
    § Implement the specified model to perform semantic segmentation.
    § The performance of the baseline model should pass the simple baseline.
  o An improved model
    § Based on the baseline model, you can design your own model or implement an existing one (FCN, SegNet, U-Net, ...).
    § The performance of the improved model must be better than that of the baseline model on the validation set.
Semantic Segmentation
• Semantic segmentation aims at classifying each pixel in an image to a pre-defined class.

Input: RGB image. Output: segmentation map (a grayscale image in which each pixel value is the class that pixel belongs to).
[ref1] Image from Pascal VOC dataset. http://host.robots.ox.ac.uk/pascal/VOC/
Evaluation
• Evaluation metric: mean Intersection over Union (mIoU) score
  o For each class, IoU is defined as: IoU = True Positive / (True Positive + False Positive + False Negative)
  o mIoU is calculated by averaging the IoU over all classes.
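The metric above can be sketched in a few lines of NumPy. This is a hedged illustration, not the provided mean_iou_evaluate.py script; the function name and signature are my own, and classes absent from both prediction and ground truth are skipped here (the official script may handle them differently).

```python
import numpy as np

def mean_iou(pred, gt, num_classes=9):
    """pred, gt: integer label maps of the same shape."""
    ious = []
    for c in range(num_classes):
        tp = np.sum((pred == c) & (gt == c))   # true positives for class c
        fp = np.sum((pred == c) & (gt != c))   # false positives
        fn = np.sum((pred != c) & (gt == c))   # false negatives
        denom = tp + fp + fn
        if denom == 0:                         # class absent from both maps
            continue
        ious.append(tp / denom)                # IoU = TP / (TP + FP + FN)
    return np.mean(ious)                       # average over (present) classes
```

A perfect prediction gives an mIoU of 1.0; any misclassified pixel lowers both the IoU of its true class (as a false negative) and of its predicted class (as a false positive).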

Dataset
• Classes (9 in total):
  o Class 0: background
  o Class 1: person
  o Class 2: aeroplane
  o Class 3: bus
  o Class 4: tv/monitor
  o Class 5: horse
  o Class 6: dog
  o Class 7: cat
  o Class 8: car
• Data split:
  o Training set: 5460 image-segmentation pairs
  o Validation set: 500 image-segmentation pairs
  o Test set: 500 pairs (only the TAs have the test data)
• Image size:
  o Input RGB image: 352×448×3
  o Ground-truth segmentation: 352×448
• Note: You should not use the validation set to train your model. Anyone who uses the validation set for training will get zero points on this homework.

Download dataset: https://drive.google.com/file/d/1wFDqKvaV7kmdlJMuhM2VOq4kQzVV5AIy/view?usp=sharing
Baseline Model – Overview
• Use resnet18 as the backbone (also called the feature extractor) and add some transpose convolution layers and convolution layers on top to build the model.
• This model must pass the simple baseline.
[Architecture diagram: feature maps of shape (batch_size, #channels, height, width) go from N×3×352×448 through the backbone to N×512×11×14, then are upsampled through N×256×22×28, N×128×44×56, N×64×88×112, N×32×176×224, and N×16×352×448, followed by a 1×1 convolution to N×9×352×448 and a per-pixel argmax. Legend: transpose convolution layer (kernel_size=4, stride=2, padding=1, bias=False) followed by ReLU; final convolution layer (kernel_size=1, stride=1, padding=0, bias=True).]
Baseline Model
Input size        Output size       Layer type       Kernel size  Stride  Padding  Bias
(Nx3x352x448)     (Nx512x11x14)     resnet18         –            –       –        –
(Nx512x11x14)     (Nx256x22x28)     Transpose conv   4            2       1        False
(Nx256x22x28)     (Nx256x22x28)     ReLU             –            –       –        –
(Nx256x22x28)     (Nx128x44x56)     Transpose conv   4            2       1        False
(Nx128x44x56)     (Nx128x44x56)     ReLU             –            –       –        –
(Nx128x44x56)     (Nx64x88x112)     Transpose conv   4            2       1        False
(Nx64x88x112)     (Nx64x88x112)     ReLU             –            –       –        –
(Nx64x88x112)     (Nx32x176x224)    Transpose conv   4            2       1        False
(Nx32x176x224)    (Nx32x176x224)    ReLU             –            –       –        –
(Nx32x176x224)    (Nx16x352x448)    Transpose conv   4            2       1        False
(Nx16x352x448)    (Nx16x352x448)    ReLU             –            –       –        –
(Nx16x352x448)    (Nx9x352x448)     Conv             1            1       0        True
Baseline Model – Resnet18
• You do not have to implement resnet18 yourself; PyTorch provides an implementation:
https://pytorch.org/docs/stable/torchvision/models.html
• You can also load the ImageNet pre-trained weights and biases.
Report
1. Baseline model
1. Describe how you pre-process the data. (5%) (Do you use any data augmentation techniques? Do you normalize the data?)
2. Show the following two figures:
1. Training loss versus number of training iterations (Y coordinate: training loss. X coordinate: number of iterations.) (5%)
2. mIoU score on the validation set versus number of epochs (Y coordinate: mIoU score on the validation set. X coordinate: number of epochs.) (5%)
3. Visualize at least one semantic segmentation result for each class. (5%)
4. Report mIoU score and per-class IoU score of the baseline model. Which class has the highest IoU score? Which class has the lowest IoU score? Please also hypothesize the reason why. (10%)
2. Improved model (If the mIoU of your improved model is worse than that of the baseline model, you will get at most 20 points in this part.)
1. Draw the model architecture of your improved model. (5%)
2. To show that your improved model is better than the baseline one, report the mIoU score of your improved model. Please also show some semantic segmentation results from both your improved model and the baseline model. (10%)
Model Performance (40%)
• On the validation set:
  o Simple baseline (15%): 0.627
  o Strong baseline (5%): 0.701
• On the test set:
  o Simple baseline (15%): 0.659
  o Strong baseline (5%): 0.718
• TAs will execute your code to check if you pass the baseline.
Tools
• mIoU:
  o We provide the code to calculate the mIoU score.
  o Usage: python3 mean_iou_evaluate.py <-g ground_truth_directory> <-p prediction_directory>
• Visualization:
  o We provide the code to draw the semantic segmentation map on the RGB image.
  o Usage: python3 viz_mask.py <--img_path path_to_the_rgb_image> <--seg_path path_to_the_segmentation>
Outline
• Homework Introduction
• Homework Policy
• Other
Homework Policy
• Late policy: up to 3 free late days in a semester. Once you run out of free late days, late hand-ins incur a 30% penalty per day.
• Do not use additional data in this homework. (Using an ImageNet pre-trained model is allowed.)
Submission
• DLCV2019FALL/hw2 in your GitHub repository should include the following files:
  o hw2_YourStudentID.pdf
  o hw2.sh (for the baseline model)
  o hw2_best.sh (for the improved model)
  o your Python files (e.g., training code and testing code)
  o your model files (loadable by your Python files)
• Do not upload your dataset.
• If any of the file formats is wrong, you will get zero points.
Submission cont’d
• TAs will execute hw2.sh and hw2_best.sh to reproduce the mIoU score in your report on the validation set.
Trained Model
• If your model is larger than GitHub's maximum file size (100MB), you can upload your model to another cloud service (e.g., Dropbox). However, your script file should be able to download the model automatically.
• https://drive.google.com/file/d/1XOz69Mgxo67IZNQWnRSjT2eZAZtpUAgZ/view
• Do not delete your trained model until the TAs disclose your homework score and you have confirmed that your score is correct.
• Use the wget command in your script to download your model files. Do not use the curl command.
• Note that you should NOT hard-code any path in your files or scripts except for the path of your trained model.
Bash Script
• Usage:
  o CUDA_VISIBLE_DEVICES=#GPU bash hw2.sh $1 $2
  o CUDA_VISIBLE_DEVICES=#GPU bash hw2_best.sh $1 $2
• $1: testing images directory (images are named 'xxxx.png')
• $2: output images directory (you must not create this directory in your code)
• If the input RGB image is xxxx.png, your output semantic segmentation map should also be named xxxx.png.
• You should save the predicted semantic segmentation maps to the output directory ($2).
Bash Script cont’d
• Your testing code has to finish within 10 minutes.
• You must not use commands such as rm, sudo, CUDA_VISIBLE_DEVICES, cp, mv, mkdir, cd, pip, or other commands that change the Linux environment.
• In your submitted script, please use the python3 command to execute your testing Python files.
  o For example: python3 test.py --img_dir $1 --save_dir $2
• We will execute your code on a Linux system, so make sure your code runs on Linux before submitting your homework.
Packages
• python3.6
• pytorch==0.4.1
• scipy==1.2.0
• tensorboardX==1.8
• torchvision==0.2.1
• Other Python 3.6 standard libraries
• For more details, please refer to the requirements.txt in your homework package.
Packages cont’d
• If you use matplotlib in your code, please add matplotlib.use("Agg") before importing pyplot, or we will not be able to execute your code.
• Do not use imshow() or show() in your code, or your code will crash (there is no display on the grading machine).
• Use os.path.join to handle paths whenever possible.
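For example, a headless plot of the training-loss curve required in the report could look like this (the loss values and the output file name are placeholders):

```python
import os
import tempfile

import matplotlib
matplotlib.use("Agg")            # select the non-interactive backend BEFORE importing pyplot
import matplotlib.pyplot as plt

losses = [1.0, 0.7, 0.5, 0.4]    # dummy training-loss values
plt.plot(range(len(losses)), losses)
plt.xlabel("iteration")
plt.ylabel("training loss")

out_path = os.path.join(tempfile.gettempdir(), "loss_curve.png")
plt.savefig(out_path)            # savefig instead of show(); works without a display
```

The Agg backend renders straight to a file, which is why the figure can be produced on a machine with no display attached.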
Penalty
• If we cannot reproduce your mIoU score on the validation set, you will get 0 points in model performance (40%) and receive a 30% penalty on your report score.
• If we cannot execute your code, we will give you a chance to make minor modifications to it. After you modify your code:
  o If we can execute your code and reproduce your results on the validation set, you will still receive a 30% penalty on your homework score.
  o If we can run your code but cannot reproduce your mIoU score on the validation set, you will get 0 points in model performance (40%) and receive a 30% penalty on your report score.
  o If we still cannot execute your code, you will get 0 on this problem.
Outline
• Homework Introduction
  o Semantic Segmentation
  o Dataset
  o Evaluation
  o Grading
    § Report (60%)
    § Model performance (40%)
  o Tools
• Homework Policy
  o Submission
  o Trained Model
  o Packages
  o Tools
  o Penalty
• Others
Reminder
• Please follow the rules.
• TAs will NOT debug for you; this includes coding, environment, and library dependency problems.
• TAs do NOT answer questions not related to the course.
How to find help
• Google !
• TAs' mailbox: ntudlcvta2019@gmail.com
