The RSA
tibbles
factors
R has a special data class, called factor, to deal with categorical data. Factors store the levels (values) of a categorical variable, such as days of the week or responses to a question in a survey.
⏰ 5 mins
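As a warm-up, here is a minimal sketch of how a factor keeps track of its levels. The vector `responses` and its values are made up for illustration and are not taken from the workshop data.

```r
# Toy survey responses (made-up values, not the workshop's census_data)
responses <- c("yes", "no", "unknown", "yes", "no")

# factor() records the distinct values as levels, sorted alphabetically by default
heating <- factor(responses)
levels(heating)       # the set of categories
as.integer(heating)   # each observation is stored as an index into the levels

# Passing `levels` explicitly fixes the order of the categories
heating_ordered <- factor(responses, levels = c("yes", "no", "unknown"))
levels(heating_ordered)
table(heating_ordered)  # counts per level, in level order
```

Setting the level order explicitly is useful when the alphabetical default is not the order you want in tables or plots.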
In the central_heating variable, rename “no”, “yes”, and “unknown” to “No”, “Yes” and “Unknown” respectively.
Subset a data frame with select or filter, and create new columns with mutate.
Link the output of one function to the input of the next with the pipe, %>%.
Use summarise, group_by, and count to split a data frame into groups of observations, apply summary statistics for each group, and then combine the results.
⏰ 5 mins
Using pipes, subset census_data to include responses from participants based in London and retain only the columns household_size, dwelling_type, and cars.
Note that if you select before you filter, your code won’t run. That’s because select has already dropped the variable that filter needs. When piping, order matters!
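The effect of ordering can be sketched on a toy data frame. This is an illustration only: `toy_census` is a made-up stand-in for census_data, and the column name `region` is an assumption, since the source does not say which column records the participant's location.

```r
library(dplyr)

# Toy stand-in for census_data; the `region` column name is an assumption
toy_census <- tibble(
  region         = c("London", "London", "Wales", "Scotland"),
  household_size = c(2, 4, 3, 1),
  dwelling_type  = c("flat", "terraced", "detached", "flat"),
  cars           = c(0, 1, 2, 1)
)

# Filter rows first, while `region` is still available, then keep three columns
london <- toy_census %>%
  filter(region == "London") %>%
  select(household_size, dwelling_type, cars)

london

# Reversing the order drops `region` before filter() can use it, so it errors
wrong_order <- try(
  toy_census %>%
    select(household_size, dwelling_type, cars) %>%
    filter(region == "London"),
  silent = TRUE
)
inherits(wrong_order, "try-error")
```

Wrapping the second pipeline in try() lets the script continue so that the failure can be inspected rather than stopping the session.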
⏰ 10 mins
dwelling_type?
Use group_by() and summarise() to find the median, min, and max number of bedrooms for each dwelling_type. Also add the number of observations (hint: see ?n).
Subset a data frame with select or filter, and create new columns with mutate.
Link the output of one function to the input of the next with the pipe, %>%.
Use summarise, group_by, and count to split a data frame into groups of observations, apply summary statistics for each group, and then combine the results.
.csv and .tsv file.
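The split-apply-combine pattern mentioned above can be sketched as follows. The data frame `toy_dwellings`, its values, and the column name `bedrooms` are assumptions made for illustration; only dwelling_type is a name taken from the slides.

```r
library(dplyr)

# Toy data; `bedrooms` is an assumed column name and the values are made up
toy_dwellings <- tibble(
  dwelling_type = c("flat", "flat", "terraced", "terraced", "terraced", "detached"),
  bedrooms      = c(1, 2, 2, 3, 4, 5)
)

bedroom_summary <- toy_dwellings %>%
  group_by(dwelling_type) %>%   # split the rows into one group per dwelling type
  summarise(                    # apply the statistics, then combine into one row per group
    median_bedrooms = median(bedrooms),
    min_bedrooms    = min(bedrooms),
    max_bedrooms    = max(bedrooms),
    n_obs           = n()       # number of observations in each group
  )

bedroom_summary

# count() gives the per-group number of rows without an explicit summarise()
dwelling_counts <- toy_dwellings %>% count(dwelling_type)
dwelling_counts
```

The summary has one row per dwelling type, and the `n` column from count() matches `n_obs` from the grouped summarise().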