Final Project

Due date: Friday, December 6th, 2024, by 11:59 pm Pacific Time.

Learning Objective: To apply the skills learned in PM 566 by analyzing and interpreting a dataset of your choice.

Narrative: Through this project you will launch a portfolio of data science projects that will serve as a foundation for your job hunt.

Using the dataset from your midterm, make sure you have formulated a clear and concise question to answer. You will apply the skills learned throughout the semester to answer this question.

Deliverables

  • Written report (single-spaced, 6-8 pages, of which 3-4 pages are tables and figures)
  • Website (summarize your analysis, include interactive figures, provide a link to download the full written report)

Written report

The actual analysis should be included in the PDF report. We expect a cleaner report than the midterm, with more sophisticated analysis. Regardless of how well you did on the midterm, we want to see improvements, whether by fixing your mistakes or by adding more data or questions to your analysis.

The PDF report can make references to interactive visualizations included in the website, but otherwise all figures and tables should be included in the PDF.

The report should have the following sections:

  • Introduction: provide background on your dataset and your formulated question
  • Methods: include how and where the data were acquired, how you cleaned and wrangled the data, and what tools you used for data exploration
  • Results: provide final, publication-ready tables and figures from your analysis, with separate subsections as needed
  • Conclusion and Summary: a brief recap in which you describe your findings

In your report, please do not include any code (so make sure echo = FALSE), unformatted output, or partial datasets (e.g., output from head() or str()).
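One way to keep code out of the rendered report is to set the chunk options globally rather than chunk by chunk. The sketch below assumes the report is written in R Markdown (or Quarto with the knitr engine) and that the line is placed in a setup chunk at the top of the document; the message and warning settings are an additional suggestion, not a stated requirement.

```r
# Place in the first (setup) chunk of the report source.
# Assumes the document is rendered with knitr.
knitr::opts_chunk$set(
  echo    = FALSE,  # hide source code in the rendered report
  message = FALSE,  # assumption: also hide package start-up messages
  warning = FALSE   # assumption: also hide warnings in the output
)
```

Individual chunks can still override these defaults if a specific chunk needs different behavior.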

Website Checklist

See the checklist below for additional details about the website requirements:

  1. Create a website (HTML document and all the required files, including figures). It should feature:

    1. A brief description of the project,

    2. Interactive visualizations, also with a description so that people can understand what they are looking at, and

    3. A link to the PDF version of the actual report (i.e. a link to “Download the report.”).

    4. Your homepage should feature no more than five interactive tables and figures. If you want to include more to showcase your skills, we request that you do so by adding extra pages to the website. We will only evaluate the homepage of the website.

  2. Upload everything (source code, website files, and PDF report) to the GitHub repository.

  3. Make sure that the website, which is to be hosted on GitHub Pages, actually works, i.e., figures and interactive visualizations render properly when visiting the website (Hint: double-check the URL! Make sure you aren’t looking at your local version).

  4. Have a data folder with either the dataset or instructions about how to acquire it. If necessary, you should provide the instructions for acquiring the data in a README file within that folder.
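If the dataset is not stored in the repository, the acquisition instructions for the data folder can also be expressed as a short script. The following is a minimal sketch, not a required approach: the file name my_dataset.csv and the URL are hypothetical placeholders to be replaced with the actual source of your data.

```r
# Hypothetical locations; replace with the real file name and source URL.
data_dir  <- "data"
data_file <- file.path(data_dir, "my_dataset.csv")
data_url  <- "https://example.com/my_dataset.csv"  # placeholder, not a real source

# Create the data folder if it does not exist yet.
if (!dir.exists(data_dir)) {
  dir.create(data_dir, recursive = TRUE)
}

# Download the file only once, so re-rendering the report does not re-fetch it.
if (!file.exists(data_file)) {
  download.file(data_url, destfile = data_file, mode = "wb")
}

dat <- read.csv(data_file)
```

A README in the data folder can then simply point readers to this script and note where the data originally came from.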