References

R

  • R. A platform for statistical computing.
  • RStudio. An IDE for R. The most straightforward way to get into using R and Quarto.
  • R Graphics Cookbook. Complete guide to plotting data with ggplot2.
  • R Style Guide. Write readable code.
  • RStudio Cheatsheets. Quick reference guides covering the RStudio IDE and some of the main tools in R.

Quarto

  • Quarto. An integrated, open-source publishing system. Generalizes and expands upon much of the functionality of R Markdown.
  • Quarto Guide. Comprehensive guide to creating a wide range of documents and presentations with Quarto.

Git / GitHub

  • Git. Version control system. Installs with Apple’s Developer Tools, or get the latest version via Homebrew.
  • GitHub. Hosts public and private Git repositories for free, with paid plans for additional features. Also a source for publicly available code (e.g. R packages and utilities) written by other people.
  • GitHub Docs. Tutorials for performing various actions using Git and GitHub, from beginner to advanced.

Markdown / R Markdown

Data Science

Tools

  • Apple’s Developer Tools. Unix toolchain. Install directly with xcode-select --install, or just try to run a command such as git from the terminal and macOS will prompt you to install the tools.
  • Homebrew package manager. A convenient way to install several of the tools here, including Emacs and Pandoc.
  • R. A platform for statistical computing.
  • Python and SciPy. Python is a general-purpose programming language increasingly used in data manipulation and analysis.
  • RStudio. An IDE for R. The most straightforward way to get into using R and R Markdown.
  • TeX and LaTeX. A typesetting and document preparation system. You can write files in .tex format directly, but it is more useful to just have it available in the background for other tools to use. The MacTeX Distribution is the one to install for macOS.
  • Pandoc. Converts plain-text documents to and from a wide variety of formats. Can be installed with Homebrew. Pandoc 2.11 and later process citations and bibliographies natively through the --citeproc option; older versions need the separate pandoc-citeproc filter. Also install pandoc-crossref for producing cross-references and labels.
  • Git. Version control system. Installs with Apple’s Developer Tools, or get the latest version via Homebrew.
  • GitHub. Hosts public and private Git repositories for free, with paid plans for additional features. Also a source for publicly available code (e.g. R packages and utilities) written by other people.
  • GNU Make. You tell Make what the steps are to create the pieces of a document or program. As you edit and change the various pieces, it automatically figures out which pieces need to be updated and recompiled, and issues the commands to do that. See Karl Broman’s Minimal Make for a short introduction. Make is installed automatically with Apple’s Developer Tools.
  • lintr and flycheck. Tools that nudge you to write neater code.
  • Zotero. A citation manager that incorporates PDF storage, annotation, and other features. Zotero is free to use and can export to BibTeX/BibLaTeX files.
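
The dependency check described in the GNU Make entry can be sketched in Python, one of the tools listed above. This is a simplified illustration rather than how Make is actually implemented: it assumes a target is stale when it is missing or older than any file it depends on. The names needs_rebuild, demo, paper.md, and paper.pdf are hypothetical, chosen only for this example.

```python
import os
import tempfile


def needs_rebuild(target, dependencies):
    """Return True if target is missing or older than any dependency.

    A simplified version of the timestamp comparison Make relies on.
    """
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(dep) > target_mtime for dep in dependencies)


def demo():
    """Walk through three states of a hypothetical paper.md -> paper.pdf build."""
    with tempfile.TemporaryDirectory() as workdir:
        source = os.path.join(workdir, "paper.md")   # hypothetical source file
        output = os.path.join(workdir, "paper.pdf")  # hypothetical build target
        results = []

        # 1. The source exists but the target has never been built.
        with open(source, "w") as f:
            f.write("# Draft\n")
        os.utime(source, (1000, 1000))  # fix the timestamp for a repeatable demo
        results.append(needs_rebuild(output, [source]))

        # 2. The target is built after the source was last edited.
        with open(output, "w") as f:
            f.write("built")
        os.utime(output, (2000, 2000))
        results.append(needs_rebuild(output, [source]))

        # 3. The source is edited again, so the target is out of date.
        os.utime(source, (3000, 3000))
        results.append(needs_rebuild(output, [source]))

        return results


if __name__ == "__main__":
    print(demo())  # prints [True, False, True]
```

In a real project the same relationship would be declared in a Makefile rule, and Make would run the listed commands only for the targets that are out of date.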

Data

Many of these websites offer publicly available datasets that can be used for research or class projects.

Health and Biological data

Government data

Other data

Social Networks