# Assignments

• Project

## Final Project

In lieu of a written final exam, you and your classmates will construct an annotated list of R functions, and each of you will individually submit a comparative discussion of two simulations.

• Homework

## Homework 5 (extra credit)

For this extra credit homework assignment, you will be guided through the process of building a regression model that predicts the market value of condominiums in New York City using a dataset published by the New York City Department of Finance.

• Homework

## Homework 4

For this homework assignment, you will use statistical inference to answer a question about the National Survey of Family Growth, Cycle 6 dataset published by the National Center for Health Statistics.

• Reading

## Reading 15

R for Data Science

Read the following:

No questions need to be submitted for this reading assignment.

• Reading

## Reading 14

Introductory Statistics with Randomization and Simulation

Read the following:

• From chapter 5: from the beginning through to the end of section 5.1.4, and section 5.4.1

R for Data Science

Read the following:

No questions need to be submitted for this reading assignment.

• Homework

## Homework 3

For this homework assignment, you will use the SelectorGadget Chrome extension and the rvest package to scrape data from the official Mason Patriots sports website.

• Reading

## Reading 13

Nature News Feature article

Read the following article about p-values:

Reading discussion

Instead of posting a question as we’ve done for the other readings, please respond to the following prompts:

1. Had you ever heard of this situation concerning p-values before this class?

• If this is the first time you’ve heard this, did you find this surprising, and does it affect how you feel about science? Explain.

• If you have heard about this situation before, did the article change your perspective in any way? Explain.

2. Based on the article, what practical things can we do to make sure our claims are accurate and transparent? Mention any quantities that we should compute and what kinds of details we should try to include in our RMarkdown notebooks.

A full response for this reading consists of a minimum of two paragraphs, one for the first prompt and one for the second prompt. Each paragraph must have a minimum of three full sentences, and the content must be substantive.

Discussion hashtag
#reading13

• Project

## Midterm Project

For the midterm, you will work in teams to conduct an exploratory data analysis of the U.S. Department of Education’s College Scorecard dataset.

• Reading

## Reading 12

Introductory Statistics with Randomization and Simulation

Read the following:

• From chapter 2: section 2.4 through to the end of section 2.5

• From chapter 4: section 4.5 (skip 4.5.3)

Reading discussion

Discussion hashtag
#reading12

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 11

Introductory Statistics with Randomization and Simulation

Read the following:

• From chapter 2: from the beginning through to the end of section 2.3

Reading discussion

Discussion hashtag
#reading11

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 10

Tutorials

Read the following tutorials on the rvest package and SelectorGadget Chrome extension.

Beginner’s Guide on Web Scraping in R (using rvest) with hands-on example

SelectorGadget vignette

Reading discussion

Discussion hashtag
#reading10

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Homework

## Homework 2

For your second homework assignment, you will explore a dataset about the passengers on the Titanic, the British passenger liner that struck an iceberg during its maiden voyage and sank early in the morning on April 15, 1912.

• Reading

## Reading 9

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading9

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 8

R for Data Science

Read the following:

• From chapter 12: from the beginning through to the end of section 12.5

Reading discussion

Discussion hashtag
#reading8

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 7

Read the following writeups on the probability mass function and cumulative distribution function:

Reading discussion

Discussion hashtag
#reading7

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Homework

## Homework 1

Your first major assignment is a set of exercises built around a single dataset called rail_trail. The exercises will give you practice creating visualizations using R and ggplot2.

• Reading

## Reading 6

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading6

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 5

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading5

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Mini-Assignment

## Visualization mini-assignment

Mini-assignment to practice using RStudio to run code blocks in RMarkdown files and to create visualizations using ggplot2.

• Reading

## Reading 4

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading4

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Mini-Assignment

## Can Twitter predict election results?

A mini-assignment about a data science study that used Twitter data to predict election outcomes.

• Reading

## Reading 3

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading3

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 2

Introductory Statistics with Randomization and Simulation

Read the following:

• All of chapter 1, except section 1.3.5, all of sections 1.4 and 1.5, and section 1.6.8

Reading discussion

Discussion hashtag
#reading2

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.

• Reading

## Reading 1

R for Data Science

Read the following:

Reading discussion

Discussion hashtag
#reading1

Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.