In lieu of a written final exam, you and your classmates will construct an annotated list of R functions, and each of you will individually submit a comparative discussion of two simulations.
For this extra credit homework assignment, you will be guided through the process of building a regression model that predicts the market value of condominiums in New York City using a dataset published by the New York City Department of Finance.
For this homework assignment, you will use statistical inference to answer a question about the National Survey of Family Growth, Cycle 6 dataset published by the National Center for Health Statistics.
Read the following:
No questions need to be submitted for this reading assignment.
Introductory Statistics with Randomization and Simulation
Read the following:
Read the following:
All of chapter 22 (short)
From chapter 23: section 23.1 through to the end of section 23.3
No questions need to be submitted for this reading assignment.
For this homework assignment, you will use the SelectorGadget Chrome extension and the rvest package to scrape data from the official Mason Patriots sports website.
Nature News Feature article
Read the following article about p-values:
Reading discussion
Instead of posting a question as we’ve done for the other readings, please respond to the following prompts:
Had you ever heard of this situation concerning p-values before this class?
If this is the first time you’ve heard this, did you find this surprising, and does it affect how you feel about science? Explain.
If you have heard about this situation before, did the article change your perspective in any way? Explain.
Based on the article, what practical things can we do to make sure our claims are accurate and transparent? Mention any quantities that we should compute and what kinds of details we should try to include in our RMarkdown notebooks.
A full response for this reading consists of a minimum of two paragraphs, one for the first prompt and one for the second prompt. Each paragraph must have a minimum of three full sentences, and the content must be substantive.
Discussion hashtag
#reading13
For the midterm, you will conduct an exploratory data analysis of the U.S. Department of Education’s
Introductory Statistics with Randomization and Simulation
Read the following:
From chapter 2: section 2.4 through to the end of section 2.5
From chapter 4: section 4.5 (skip 4.5.3)
Reading discussion
Discussion hashtag
#reading12
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Introductory Statistics with Randomization and Simulation
Read the following:
Reading discussion
Discussion hashtag
#reading11
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Tutorials
Read the following tutorials on the rvest package and the SelectorGadget Chrome extension.
Beginner’s Guide on Web Scraping in R (using rvest) with hands-on example
SelectorGadget
Vignette
Reading discussion
Discussion hashtag
#reading10
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
For your second homework assignment, you will explore a dataset about the passengers on the Titanic, the British passenger liner that struck an iceberg during its maiden voyage and sank early in the morning on April 15, 1912.
Read the following:
Reading discussion
Discussion hashtag
#reading8
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Read the following writeups on the probability mass function and cumulative distribution function:
Reading discussion
Discussion hashtag
#reading7
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Your first major assignment is a set of exercises based around a single dataset called rail_trail, which will give you practice in creating visualizations using R and ggplot2.
Read the following:
Reading discussion
Discussion hashtag
#reading6
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Read the following:
Reading discussion
Discussion hashtag
#reading5
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
A mini-assignment to practice using RStudio to run code blocks in RMarkdown files and to create visualizations using ggplot2.
Read the following:
Reading discussion
Discussion hashtag
#reading4
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
A mini-assignment about a data science study that used Twitter data to predict election outcomes.
Read the following:
Reading discussion
Discussion hashtag
#reading3
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Introductory Statistics with Randomization and Simulation
Read the following:
Reading discussion
Discussion hashtag
#reading2
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.
Read the following:
Reading discussion
Discussion hashtag
#reading1
Remember to post your question about it to the su18-a01-readings channel in Slack by the due date.