Bayesian Statistics and Data Analysis Mini-Project Instructions
Contents
1 Project Instructions
1.1 Suggested Reading/Video material
1.2 Project Group and Expected Workload
1.3 Data Sets and Methods Recommendations
1.4 Project Proposal
1.5 Project Report
1.6 Project Presentations
1.7 Project Grading
1 Project Instructions
The last two weeks of the course focus on a project in which groups of 2-3 students choose a real-world dataset and carry out a Bayesian analysis of it.
Requirements for the projects are:
Your project should be a Bayesian data analysis using Stan. You may use brms or rstanarm if you like, but since this is easier it will be taken into account when grading, i.e. it will be harder to get a VG in the mini-project.
Real data should be used (see below for details).
At least two different models should be estimated and compared.
Tip! Write the report so that you can show your final report to potential future employers as an example of work you have done.
For PhD students: You can choose to do a small project related to your research interests instead. However, the output should still be a 4-page paper.
1.1 Suggested Reading/Video material
The project will be a small practical exercise in Bayesian data analysis. To get some inspiration, see the Stan YouTube channel [here] or the Stan User's Guide [here].
1.2 Project Group and Expected Workload
The project is expected to take 40h per student in the group. Hence, a project by a group of three should be the equivalent of a 120h project.
1.3 Data Sets and Methods Recommendations
We recommend that you find a dataset you are interested in using yourself, ideally in a field you find interesting. Feel free to discuss potential projects with the teacher.
If you have a hard time finding a dataset to use, there are a lot of available datasets (and problems) at:
The UCI Machine Learning repository: [here]
The machine learning competition site Kaggle: [here]
Hint! Use a data set with a natural grouping (e.g. patients in hospitals, car emissions by different brands, data with a time dimension, etc.), since that will help you use ideas from hierarchical modeling covered later in the course.
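To illustrate why a natural grouping is useful, here is a minimal sketch of partial pooling in a normal hierarchical model. It assumes known variances purely so that the posterior mean has a closed form; the function name, the hospital labels, and all numbers are hypothetical and are not part of the course material. In an actual project the model would be written and estimated in Stan.

```python
# Sketch of partial pooling in a normal hierarchical model with KNOWN
# variances (an assumption made only to get a closed-form answer).
# All names and numbers below are hypothetical illustrations.

def partial_pool(group_mean, n_obs, sigma, mu, tau):
    """Posterior mean of a group-level mean theta_j.

    Assumed model: y_ij ~ Normal(theta_j, sigma), theta_j ~ Normal(mu, tau).
    The result is a precision-weighted average of the group's sample mean
    and the population mean mu.
    """
    data_precision = n_obs / sigma**2
    prior_precision = 1.0 / tau**2
    weight = data_precision / (data_precision + prior_precision)
    return weight * group_mean + (1.0 - weight) * mu


# Hypothetical hospitals: (sample mean, number of patients)
hospitals = {"A": (12.0, 4), "B": (8.0, 100)}
pooled = {
    name: partial_pool(ybar, n, sigma=5.0, mu=10.0, tau=2.0)
    for name, (ybar, n) in hospitals.items()
}
for name, estimate in pooled.items():
    print(name, round(estimate, 2))
```

The small group (A) is pulled strongly toward the population mean, while the large group (B) stays close to its own sample mean. This borrowing of strength across groups is what a grouped dataset lets you demonstrate.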
The following data sets should not be used in the project:
Titanic (R data set)
mtcars (R data set)
1.4 Project Proposal
Students need to turn in a one-page project proposal and data description by the proposal deadline and get approval for the proposed project. The project proposal should follow the ICML paper format that can be found [here]. The ICML format is also available in Overleaf: [here]. We recommend using Overleaf for writing the proposal.
The project proposal should include all the group members' names!
Project proposal part 1 should include the following parts. See Section 1.5 (Project Report) for details on each section.
1. Title
2. Abstract (1 paragraph)
3. Introduction (roughly 0.5 page)
4. Data (roughly 0.5 page)
5. Models (roughly 0.25 page) Note! For proposal part 1, you only need the simplest model for the actual data.
For project proposal part 2, you also need to add:
1. The Stan code for the first model, in the appendix.
2. The first model, estimated and added to the Results section of the project proposal.
1.5 Project Report
The project outcome is a report in the ICML paper format that can be found [here]. The ICML format is also available in Overleaf: [here]. We recommend using Overleaf for writing the report.
The paper should be between three and a half (3.5) and four (4) pages long, excluding references and any appendices. Write the report as you would in a real situation, i.e. do not refer to the paper as a “mini-project” or similar.
The project report should include all the group members' names!
The paper should include the following parts/sections:
1. Title
The title should describe the problem and be like a real article title, i.e. don’t write “Mini-project:” or similar in the title.
2. Abstract (1 paragraph)
3. Introduction (roughly 0.5 page) Note! The introduction should read like the introduction to an article, i.e. describe the problem and include at least one reference.
Description of the problem.
4. Data (roughly 0.5 page)
Description of the data, e.g. the number of observations, the number of groups (if you intend to use a hierarchical model), what the dependent variable is, etc.
5. Models and methods (roughly 0.5 page)
Description of the models. Note! Stan code for all models should be included as an Appendix.
Description of how the models were compared (LOO/WAIC).
6. Results (roughly 1.5-2 pages)
Results of the different models.
Estimated parameters of interest together with their posterior uncertainty.
Which model seems to work best, and why?
7. Conclusions (roughly 0.5-1 pages)
Conclusions from the results.
Discussion of problems, potential improvements, and other models.
8. Acknowledgements (optional)
If you are willing to let me use your project report as an example in the next course, please state that in the Acknowledgements. Just add the following sentence: “We hereby grant our consent for the utilization of this project report as a reference material within the context of future editions of the course.”
Other people you might want to thank.
9. Appendix
Stan code for all models used in the report should be included in the appendix.
Other results, figures, etc. that you think might be of interest to some readers.
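As background for the model comparison step (LOO/WAIC) in the Models and methods section, the following minimal sketch shows how WAIC is computed from a matrix of pointwise log-likelihood values, one row per posterior draw and one column per observation. The function name and the toy numbers are hypothetical. In practice a dedicated tool such as the loo R package would normally be used, since it also provides standard errors and diagnostics.

```python
import math


def elpd_waic(log_lik):
    """Estimate the expected log predictive density with WAIC.

    log_lik[s][i] is the log-likelihood of observation i evaluated at
    posterior draw s (hypothetical input, e.g. extracted from a fitted model).
    """
    n_draws = len(log_lik)
    n_obs = len(log_lik[0])
    lppd = 0.0
    p_waic = 0.0
    for i in range(n_obs):
        column = [log_lik[s][i] for s in range(n_draws)]
        # Log of the posterior-mean likelihood, using log-sum-exp for stability.
        largest = max(column)
        lppd += largest + math.log(
            sum(math.exp(v - largest) for v in column) / n_draws
        )
        # Penalty term: sample variance of the log-likelihood over draws.
        mean = sum(column) / n_draws
        p_waic += sum((v - mean) ** 2 for v in column) / (n_draws - 1)
    return lppd - p_waic


# Toy matrices (3 draws x 2 observations) for two hypothetical models.
model_a = [[-1.0, -2.0], [-1.1, -2.1], [-0.9, -1.9]]
model_b = [[-1.5, -2.5], [-2.5, -1.5], [-0.5, -3.5]]
print("model A:", elpd_waic(model_a))
print("model B:", elpd_waic(model_b))
```

A higher value indicates better estimated out-of-sample predictive performance. WAIC on the deviance scale is -2 times this quantity.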
Additional requirements for the report
1. All figures using color should use a color-blind-friendly color palette. See here and here.
2. The final report should look like a research paper/article, i.e. try to avoid bullet lists and aim for a good flow in the text. Also, do not refer to the report as a mini-project. See it as a real report (or short paper).
3. All models included in the report should have their Stan code included in the appendix.
4. You should use a correct reference system. A tip is to use \citet, \citep, and BibTeX. This will also simplify your future thesis work.
5. You should include some (or all) estimated parameters together with their uncertainty and R̂, and interpret them in the report. You are free to include them in the appendix.
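To make the R̂ requirement concrete, the sketch below computes the classic (non-split) potential scale reduction factor from several chains of draws for one parameter. This is a simplified illustration only: the function names and toy chains are hypothetical, and Stan's own summaries report more refined split or rank-normalized variants, which are the values you should report.

```python
import math


def variance(xs):
    """Sample variance with the n - 1 denominator."""
    mean = sum(xs) / len(xs)
    return sum((x - mean) ** 2 for x in xs) / (len(xs) - 1)


def basic_rhat(chains):
    """Classic Gelman-Rubin R-hat for equally long chains of one parameter."""
    n = len(chains[0])
    chain_means = [sum(chain) / n for chain in chains]
    within = sum(variance(chain) for chain in chains) / len(chains)
    between = n * variance(chain_means)
    # Pooled estimate of the posterior variance.
    var_plus = (n - 1) / n * within + between / n
    return math.sqrt(var_plus / within)


# Hypothetical draws: two chains that overlap, and two that disagree.
mixed = [[0.1, -0.3, 0.2, 0.0], [0.0, 0.2, -0.1, -0.2]]
stuck = [[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]
print("overlapping chains:", basic_rhat(mixed))
print("disagreeing chains:", basic_rhat(stuck))
```

Values far above 1 signal that the chains have not converged to the same distribution, so parameter estimates from such a fit should not be interpreted.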
Additional hints for the report
1. Before you turn in the project, do a language check with a tool such as Grammarly. Poor English (errors that would have been spotted by a tool such as Grammarly) will lower your grade.
1.6 Project Presentations
Presentation details:
Each project needs to be presented in addition to submitting the mini-project report.
The presentation should be high level, but sufficiently detailed information should be readily available to facilitate answering questions from the audience.
For 1-2 person groups, the presentation should be 10 minutes
For three-person groups, the presentation should be 15 minutes
Afterwards, questions will be asked first by other students and then by attending teachers.
Specific presentation recommendations:
The first slide needs to include the project title and names of the group members.
The chosen method(s) should be explained and justified (you are not holding this presentation for a hypothetical customer who doesn’t care about the details of your methods).
The font size for text and figure labels should be big enough to make the slides easy for the audience to read.
A good rule of thumb is to expect one slide to take 2 minutes to present.
The last/final slide needs to include your conclusion and names of the group members.
Missing the project presentation
To be eligible for a VG on the project, the presentation needs to be turned in no later than 23:59 on the day before the presentation.
1.7 Project Grading
Below are the criteria used when grading the mini-projects. Some general comments on grading are:
1. The more students in the group, the higher the expected quality of the project, i.e. a better report is expected from a three-student group than from a two-student group.
To pass the report (G), the following criteria should be fulfilled:
1. The report should be turned in and follow the general outline of Section 1.5.
2. The report should follow the additional requirements in Section 1.5.
3. show basic knowledge and understanding of the core concepts of the course by using concepts correctly
4. show an understanding of when certain methods should or should not be used, and how
5. use at least two (2) different models and compare them in a correct way
6. state what has been done in the report with clarity, good English and rigour so it is easy for a reader to understand and follow the paper.
7. correctly use references in the report following the guidelines of the template in Section 1.5
To pass the mini-project with distinction (VG), the following criteria apply in addition to the criteria for passing the report above:
1. show deep knowledge and understanding of the core concepts and how to adapt them in a good way to a new situation
2. connect the analysis in the report with other areas in statistics or previous courses taken in the master's program, i.e. not just repeat what has been done in previous labs.
3. use models that have not been part of the BSDA course