Instructions: In this project, you are given a dataset collected by an actual IoT system (see description below) and asked to use the dataset to build a forecasting model. You have to answer a set of questions, as well as propose your own interesting questions. The project is worth 40 marks.
1. Form teams of 4 students each and register your team in LumiNUS under Class Groups.
2. Do the following for each question. Use a Jupyter notebook (ipynb file) to do the analysis and answer all parts of the question. Use both code cells (for code) and markdown cells (for comments). Submit (i) a PDF file/print preview of your Jupyter notebook, (ii) the Jupyter notebook (ipynb file), and (iii) any additional data you used. Do not include the original data I gave you. Zip all files into one zip file, name it appropriately, and upload it to the correct LumiNUS folder.
3. Complete all of Question 1. Please name your file GroupName Q1.zip and upload to LumiNUS. (10 marks)
4. Complete all of Question 2. Please name your file GroupName Q2.zip and upload to LumiNUS. (10 marks)
5. Complete Question 3. For Question 3, please also include a detailed description of your proposed work. Be sure to justify all assumptions, give the necessary details, and make a persuasive argument for your proposal. Please name your file GroupName Q3.zip and upload to LumiNUS. (10 marks)
6. Your presentation will be in the form of a video. The video should be 8-10 minutes long and focus on Question 3. Be sure to tell an interesting story and make a convincing argument. You can present results from Q1 and Q2 if they support what you want to do in Q3. It would be good if all members could play a role in the video. (10 marks)
Data Description: The data is available in the project directory in LumiNUS Files. In
<Timestamp (localtime)> <MeterID (dataid)> <meter reading (meter_value)>
Questions:
1. Exploring the Data (10 marks)
EE4211 Data Science for IoT, Project Description
1.2 Generate hourly readings from the raw data. Select one month from the 6-month study interval and plot the hourly readings (time-series) for that month. Hint: You will have to decide what to do if there are no readings for a certain hour.
1.3 Intuitively, we expect gas consumption from different homes to be correlated. For example, many homes would experience higher consumption levels in the evening when meals are cooked. For each home, find the five other homes with which it shows the highest correlation.
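One possible way to approach the hourly aggregation and the correlation ranking is sketched below. It is only an illustration under stated assumptions: the data is synthetic (the real file, its spacing, and its meter IDs will differ), the column names follow the data description (localtime, dataid, meter_value), the helper names hourly_readings and top_correlated are hypothetical, and empty hours are filled by carrying the last cumulative reading forward, which is just one of several defensible choices.

```python
import numpy as np
import pandas as pd

# Synthetic stand-in for the raw data: cumulative readings for 7 meters
# over 3 days at 15-minute spacing. The real dataset will differ.
rng = np.random.default_rng(0)
times = pd.date_range("2016-01-01", periods=3 * 24 * 4, freq="15min")
frames = []
for meter_id in range(1, 8):
    increments = rng.integers(0, 3, size=len(times))
    frames.append(pd.DataFrame({
        "localtime": times,
        "dataid": meter_id,
        "meter_value": 1000 * meter_id + np.cumsum(increments),
    }))
raw = pd.concat(frames, ignore_index=True)

# Remove one full hour for meter 1 to mimic a gap in the readings.
gap_hour = pd.Timestamp("2016-01-02 05:00")
in_gap = (raw["dataid"] == 1) & (raw["localtime"].dt.floor("h") == gap_hour)
raw = raw[~in_gap]


def hourly_readings(df):
    """Last cumulative reading in each hour, one column per meter."""
    wide = df.pivot_table(index="localtime", columns="dataid",
                          values="meter_value", aggfunc="last")
    hourly = wide.resample("h").last()
    # Assumption: an hour without readings keeps the previous cumulative value.
    return hourly.ffill()


def top_correlated(hourly, k=5):
    """For each meter, the k other meters whose hourly consumption
    has the highest Pearson correlation with it."""
    consumption = hourly.diff().dropna()        # cumulative -> per-hour usage
    corr = consumption.corr()
    corr = corr.mask(np.eye(len(corr), dtype=bool))  # hide self-correlation
    return {
        meter: corr[meter].dropna().sort_values(ascending=False).head(k).index.tolist()
        for meter in corr.columns
    }


hourly = hourly_readings(raw)
top5 = top_correlated(hourly)
print(hourly.shape)
print(top5[1])
```

Correlating per-hour consumption rather than the cumulative readings is a deliberate choice here: cumulative values always increase, so they would appear strongly correlated across all homes regardless of usage patterns. Selecting one month for plotting could then be a matter of slicing hourly by date before plotting.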
2. Forecasting (10 marks)
2.2 Build a linear regression model to forecast the hourly reading for the next hour. Generate two plots: (i) a time series plot of the actual and predicted hourly meter readings and (ii) a scatter plot of actual vs predicted meter readings (along with a line showing how good the fit is).
2.3 Do the same as Question 2.2 above but use support vector regression (SVR).
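The two forecasting questions can be framed as the same supervised problem: predict the next hour from a window of previous hours. The sketch below shows that framing with scikit-learn. It is an assumption-laden illustration, not a prescribed solution: the series is synthetic, the 24-hour lag window, the 80/20 chronological split, and the SVR hyperparameters are arbitrary starting points, and make_lagged is a hypothetical helper.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

# Synthetic hourly series with a daily cycle, standing in for one meter.
rng = np.random.default_rng(0)
hours = np.arange(24 * 30)
series = 10 + 3 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 0.3, size=hours.size)


def make_lagged(values, n_lags=24):
    """Build (X, y) where each row of X holds the n_lags values
    immediately before the corresponding target in y."""
    n_rows = len(values) - n_lags
    X = np.column_stack([values[i:i + n_rows] for i in range(n_lags)])
    y = values[n_lags:]
    return X, y


X, y = make_lagged(series, n_lags=24)

# Chronological split: never shuffle a time series before splitting.
split = int(0.8 * len(y))
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Question 2.2 style model: ordinary least squares on the lagged values.
linear = LinearRegression().fit(X_train, y_train)
pred_linear = linear.predict(X_test)

# Question 2.3 style model: SVR is scale-sensitive, so standardize first.
# Kernel, C, and epsilon are illustrative values, not tuned.
svr = make_pipeline(StandardScaler(), SVR(kernel="rbf", C=10.0, epsilon=0.1))
svr.fit(X_train, y_train)
pred_svr = svr.predict(X_test)

mae_linear = np.mean(np.abs(y_test - pred_linear))
mae_svr = np.mean(np.abs(y_test - pred_svr))
print(f"linear MAE: {mae_linear:.3f}, SVR MAE: {mae_svr:.3f}")
```

The arrays y_test, pred_linear, and pred_svr are what the two requested plots would draw from: both against time for the time series plot, and predicted against actual for the scatter plot, where a y = x reference line is one common way to show quality of fit.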
3. Student Proposal (10 marks)
3.1 At this point, you understand the data quite well. Propose and carry out additional analysis using the dataset given. Please be sure to justify why this additional analysis is useful and interesting.
Additional Information about Data Collection:
1. Gas flow meters have a sensor that is used to measure the volume of gas that passes through a pipe. Different meters use different sensors (e.g. ultrasonic sensors, synthetic diaphragm with rotating valve, etc.). The meters check on the sensors periodically to get a reading of the current consumption value. This is what is meant in the sentence above: "The gas meters measure the cumulative gas consumption at a frequency of 15 seconds."
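Because the meters report cumulative consumption, the gas used in any interval is the difference between consecutive readings of the same meter. A minimal sketch of that conversion follows, assuming the column names from the data description. The sample values and the helper name interval_consumption are made up for illustration.

```python
import pandas as pd

# Made-up cumulative readings for two meters, 15 seconds apart.
readings = pd.DataFrame({
    "localtime": pd.to_datetime([
        "2016-01-01 00:00:00", "2016-01-01 00:00:15",
        "2016-01-01 00:00:30", "2016-01-01 00:00:45",
        "2016-01-01 00:00:00", "2016-01-01 00:00:15",
    ]),
    "dataid": [35, 35, 35, 35, 44, 44],
    "meter_value": [100, 100, 102, 105, 50, 52],
})


def interval_consumption(df):
    """Add a 'consumption' column: the increase in meter_value since the
    previous reading of the same meter (NaN for each meter's first row)."""
    out = df.sort_values(["dataid", "localtime"]).copy()
    out["consumption"] = out.groupby("dataid")["meter_value"].diff()
    return out


result = interval_consumption(readings)
print(result)
```

In real data, a negative difference would suggest a meter reset or a faulty reading and is worth checking before any aggregation.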



