The RSA
R has a special data class, called factor, to deal with categorical data. Factors store the levels (values) of a categorical variable, such as days of the week or responses to a question in a survey.
⏰ 5 mins
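As a small illustration of how factors behave, here is a minimal sketch in base R. The `responses` vector is made-up data, not part of the workshop dataset.

```r
# Hypothetical survey responses (made-up data for illustration)
responses <- c("yes", "no", "unknown", "yes", "yes", "no")

# Convert the character vector to a factor;
# by default the levels are the sorted unique values
responses_fct <- factor(responses)
levels(responses_fct)      # "no" "unknown" "yes"

# Each value is stored as an integer code that points at a level
as.integer(responses_fct)  # 3 1 2 3 3 1

# Set the level order explicitly when the default order is not what you want
responses_ordered <- factor(responses, levels = c("yes", "no", "unknown"))
table(responses_ordered)
```

Setting `levels` explicitly matters mostly for plotting and tabulating, where the level order decides the display order.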
In the central_heating variable, rename “no”, “yes”, and “unknown” to “No”, “Yes”, and “Unknown” respectively.
Subset a data frame with select or filter, and create new columns with mutate.
Link operations together with the pipe, %>%.
Use summarise, group_by, and count to split a data frame into groups of observations, apply summary statistics for each group, and then combine the results.
⏰ 5 mins
Using pipes, subset census_data to include responses from participants based in London and retain only the columns household_size, dwelling_type, and cars
Note that if you select before you filter, your code won’t run. That’s because you’re not retaining the variable that you use in your filtering. When piping, order matters!
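The ordering point can be seen in a minimal sketch on toy data. The `census_toy` tibble, the `region` column name, and all values are assumptions made for illustration; the real census_data may name its location column differently.

```r
library(dplyr)

# Toy stand-in for census_data; the `region` column name and all values are assumptions
census_toy <- tibble(
  region         = c("London", "London", "Wales", "Scotland"),
  household_size = c(2, 4, 1, 3),
  dwelling_type  = c("Flat", "Terraced", "Detached", "Flat"),
  cars           = c(0, 1, 2, 1)
)

# filter first (region is still available), then select the columns to keep
london <- census_toy %>%
  filter(region == "London") %>%
  select(household_size, dwelling_type, cars)

# Reversing the order fails: after select(), `region` no longer exists,
# so filter() cannot find it
# census_toy %>%
#   select(household_size, dwelling_type, cars) %>%
#   filter(region == "London")
```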
⏰ 10 mins
dwelling_type?
Use group_by() and summarise() to find the median, min, and max number of bedrooms for each dwelling_type. Also add the number of observations (hint: see ?n).
Subset a data frame with select or filter, and create new columns with mutate.
Link operations together with the pipe, %>%.
Use summarise, group_by, and count to split a data frame into groups of observations, apply summary statistics for each group, and then combine the results.
.csv and .tsv file.
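The split-apply-combine pattern with group_by() and summarise() might look like the sketch below. It runs on a toy tibble rather than the real census_data; the `bedrooms` column name and all values are assumptions.

```r
library(dplyr)

# Toy data; the `bedrooms` column name and the values are assumptions
census_toy <- tibble(
  dwelling_type = c("Flat", "Flat", "Detached", "Detached", "Detached"),
  bedrooms      = c(1, 2, 3, 4, 5)
)

bedroom_summary <- census_toy %>%
  group_by(dwelling_type) %>%           # split into one group per dwelling type
  summarise(
    median_bedrooms = median(bedrooms), # apply summary statistics per group
    min_bedrooms    = min(bedrooms),
    max_bedrooms    = max(bedrooms),
    n_obs           = n()               # number of observations in each group
  )

bedroom_summary
```

When only the group sizes are needed, `count(census_toy, dwelling_type)` gives the same counts in a single step.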