Objectives
The primary objective of this assignment is to give you more practice with filter, arrange, select, summarize, mutate, and group_by. You should also learn to use slice(), distinct(), facet_wrap(), and facet_grid().
This assignment is due Thursday, September 30th at noon. Please turn in your .html AND .Rmd files on Canvas. Your .Rmd file should knit without error before you turn in the assignment.
This assignment concerns a dataset from an experiment that tested whether 2- to 4-year-old children could learn new words by exclusion (Lewis, Cristiano, Lake, Kwan & Frank, 2020).
There were two conditions. In the critical condition, children saw two objects: one whose label the child knew (e.g., a ball) and one whose label the child did not know (e.g., tongs). The experimenter then asked the child to point to the novel object by saying, e.g., “Can you find the tongs?” If the child assumes that each object has only one name, they should infer that the new label refers to the tongs, and not the ball. This phenomenon is called “Mutual Exclusivity” in the literature (Markman & Wachtel, 1988), because children are thought to assume that a new label is mutually exclusive with an old one. Let’s call this condition the “Novel-Familiar” condition, or NF.
In the control condition, children again saw two objects, but this time the child knew a label for both of them (e.g., a ball and a cup). The experimenter then asked the child to point to one of the objects by saying, e.g., “Can you find the ball?” Let’s call this condition the “Familiar-Familiar” condition, or FF.
Each child completed 7 trials: 4 in the NF condition and 3 in the FF condition. On each trial we recorded which object was the correct choice, and whether or not the child pointed to the correct object. We also measured two variables for each child: the age of the child and their performance on a separate vocabulary test.
Each variable in the dataset is described below:
Here is the path to a lightly cleaned version of the dataset:
DATA_PATH <- "https://raw.githubusercontent.com/mllewis/cumulative-science/master/static/data/tidy_me_data.csv"
me_data. Use the glimpse() function to determine: sub_id, and target_object.
Use slice() to print rows 1 and 3 from me_data.
Use arrange and slice() to print 7 rows of the first trial (where trial_num is 1).
Use group_by to answer this question.
Use count to answer this question.
subject_means.
Use the subject_means data frame to calculate the mean proportion correct by condition. Plot the result as a bar plot. Include the following things:
ylim).
geom_hline(); geom_hline takes one parameter, yintercept).
Which condition are children better at?
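As a rough illustration of the verbs and plot elements named above, here is a minimal sketch on a small made-up data frame rather than on me_data. The column names condition and correct are assumptions (only sub_id, trial_num, and target_object are named in this assignment), the values are invented, and the reference line at 0.5 assumes the line is meant to mark chance performance with two objects.

```r
library(dplyr)
library(ggplot2)

# Toy stand-in for me_data: 2 children x 7 trials (4 NF, 3 FF).
# `condition` and `correct` are assumed column names; all values are made up.
toy_data <- tibble(
  sub_id    = rep(c("s1", "s2"), each = 7),
  trial_num = rep(1:7, times = 2),
  condition = rep(c("NF", "FF", "NF", "FF", "NF", "FF", "NF"), times = 2),
  correct   = c(1, 1, 0, 1, 1, 1, 1,
                0, 1, 1, 1, 0, 0, 1)
)

glimpse(toy_data)            # number of rows/columns and each column's type
slice(toy_data, c(1, 3))     # rows 1 and 3, selected by position

# Sort by trial number, then keep the leading rows (one per toy child here)
first_trial <- toy_data %>%
  arrange(trial_num) %>%
  slice(1:2)

# Two ways to count trials per child
trials_grouped <- toy_data %>%
  group_by(sub_id) %>%
  summarize(n_trials = n())
trials_counted <- toy_data %>%
  count(sub_id)

# Proportion correct per child and condition, then averaged by condition
toy_subject_means <- toy_data %>%
  group_by(sub_id, condition) %>%
  summarize(prop_correct = mean(correct), .groups = "drop")

toy_condition_means <- toy_subject_means %>%
  group_by(condition) %>%
  summarize(mean_prop_correct = mean(prop_correct))

# Bar plot with a fixed y-axis range and a dashed horizontal reference line
p <- ggplot(toy_condition_means, aes(x = condition, y = mean_prop_correct)) +
  geom_col() +
  ylim(0, 1) +
  geom_hline(yintercept = 0.5, linetype = "dashed")
```

Averaging per child first and then across children gives every child equal weight, which is one reasonable choice when children contribute different numbers of trials per condition.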
facet_wrap(). You’ll need to create a new data frame like subject_means_with_years but one that also includes the variable target_object. Call the new data frame subject_means_with_years_obj.
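To show how facet_wrap() splits one plot into panels, here is a small sketch on invented data. The columns age_years and prop_correct are hypothetical names chosen for illustration, and the data frame below is not subject_means_with_years_obj itself; only target_object comes from the assignment.

```r
library(dplyr)
library(ggplot2)

# Made-up per-child means; `age_years` and `prop_correct` are assumed names.
toy_means_obj <- tibble(
  sub_id        = rep(c("s1", "s2", "s3", "s4"), each = 2),
  age_years     = rep(c(2, 2, 3, 3), each = 2),
  target_object = rep(c("ball", "cup"), times = 4),
  prop_correct  = c(0.5, 1, 1, 0.5, 1, 1, 0.5, 1)
)

# Keep target_object as a grouping variable so it survives the summary
toy_by_age_obj <- toy_means_obj %>%
  group_by(age_years, target_object) %>%
  summarize(mean_prop_correct = mean(prop_correct), .groups = "drop")

# One panel per target object
p_facets <- ggplot(toy_by_age_obj,
                   aes(x = factor(age_years), y = mean_prop_correct)) +
  geom_col() +
  facet_wrap(~ target_object)
```

The faceting variable has to be a column of the data frame passed to ggplot(), which is why the summary keeps target_object in group_by().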
Using me_data, make a new variable called scaled_vocabulary_score that ranges from 0 to 1, rather than 0 to 100.
Use me_data to plot the distribution of children’s scaled_vocabulary_score. To do this, you’ll need a data frame with only one row per child. Use geom_histogram().
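The rescaling and the one-row-per-child step can be sketched as follows on made-up data. The column name vocabulary_score is an assumption standing in for the 0 to 100 vocabulary measure, and distinct() is only one possible way to reduce the data to one row per child.

```r
library(dplyr)
library(ggplot2)

# Toy trial-level data: each child's score repeats on every trial.
# `vocabulary_score` is an assumed column name; the values are invented.
toy_vocab <- tibble(
  sub_id           = rep(c("s1", "s2", "s3"), each = 2),
  trial_num        = rep(1:2, times = 3),
  vocabulary_score = rep(c(20, 55, 90), each = 2)
)

# Rescale from the 0-100 range to the 0-1 range
toy_scaled <- toy_vocab %>%
  mutate(scaled_vocabulary_score = vocabulary_score / 100)

# Collapse to one row per child before plotting the distribution
one_row_per_child <- toy_scaled %>%
  distinct(sub_id, scaled_vocabulary_score)

p_hist <- ggplot(one_row_per_child, aes(x = scaled_vocabulary_score)) +
  geom_histogram(bins = 10)
```

Without the distinct() step, each child would be counted once per trial, which inflates the histogram counts without changing its shape only if every child has the same number of trials.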
geom_bar, geom_violin, geom_boxplot, geom_histogram). For inspiration, check out the R ggplot gallery.