Scraping, APIs, and Regular Expressions

Required reading

  • Chapter 17, Regular Expressions, from “R Programming for Data Science” (link)

  • The vignette from the R package rvest: “Harvesting the web with rvest” (link)

  • The post “HTTP: The Protocol Every Web Developer Must Know - Part 1”, excluding the section “Tools to View HTTP Traffic” (link)

Optional reading

  • A full course on APIs by Zapier (link)
  • The R for Data Science chapter on Strings, which includes a discussion of the stringr package that we will be using (link)