COURSE OUTLINE

The Barcelona Summer School of Demography (BSSD), based at the Centre for Demographic Studies (CED), Universitat Autònoma de Barcelona, offers a six-week course in R. The course is divided into six modules, one per week, covering three major strengths of R: statistical and demographic analysis, data visualization, and spatial analysis. Each module consists of 20 hours of teaching, combining theoretical lectures and practical exercises.
Participants are welcome to apply for the entire course or any of the individual modules. However, three itineraries are suggested: (1) The full course, Modules 1 to 6; (2) Statistics and demography, Modules 1 to 3; and (3) Data visualization and spatial analysis, Modules 1B and 4 to 6.
Modules 1 and 1B offer an introduction to R for which no previous knowledge is required. Module 1B is meant for participants with no previous knowledge of R who prefer to join the course in week 3. For the other modules, basic knowledge of R is required. Module 2 focuses on basic statistical analysis. Module 3 shows how to implement common demographic methods in R. Module 4 provides a comprehensive view of data visualization using base R. Module 5 introduces the ‘tidyverse’ approach to R programming, including the ‘ggplot2’ package. Module 6 is devoted to spatial analysis and web-based mapping. For the detailed contents of each module, please see Schedule and Organization.
Participation will be limited to 15 students per module. Participants will be selected competitively on the basis of motivation and research interests. Priority will be given to early-career researchers (Master's and PhD students), but applicants at more advanced stages are also welcome. Participants are expected to bring and use their own laptops with R and RStudio installed, and to cover their own transportation and living costs while staying in Barcelona. Lectures will be taught in English. Deadline for application: 25 April 2018. Applicants will be informed about the results of the selection process by mid-April 2018.
For further information, please contact bssd@ced.uab.es.
SCHEDULE AND ORGANIZATION
The BSSD will be held at the Centre for Demographic Studies (CED), located on the campus of the Autonomous University of Barcelona, Bellaterra, Spain. Sessions run from 10 a.m. to 2 p.m. and combine theoretical lectures with practical exercises.
MODULE 1/1B Introduction to R (June 18-22 / July 2-6)
Instructors: Francisco Villavicencio / Tim Riffe
Session 1 (Monday)
1) Introduction to R and RStudio
2) Using the editor: main characteristics of RStudio, packages
3) Data handling: import/export data to/from R
4) Basic operations: assigning
5) Using functions
Session 2 (Tuesday)
1) Common data types
2) Data structures overview
3) Vectors and matrices
4) Data frames
5) Reshaping, sorting and grouping
Session 3 (Wednesday)
1) Descriptive statistics in R
2) Contingency tables
3) Introduction to R plotting
Session 4 (Thursday)
1) Conditional execution: the ‘if’ command
2) Introduction to for-loops
3) Writing your own functions in R
Session 5 (Friday)
1) The apply() family of functions
2) Using loops and custom functions in base plotting
3) Saving plots
4) Review of module
MODULE 2 Basic Statistics in R (June 25-29)
Instructor: Francisco Villavicencio
Session 1 (Monday)
1) Review of descriptive statistics
2) The normal distribution and QQ-plots
3) The t-distribution
4) Other distributions
Session 2 (Tuesday)
1) Linear models
2) Least squares estimation
3) Residuals
4) Standard errors and confidence intervals
5) Diagnostic plots
Session 3 (Wednesday)
1) Hypothesis testing
2) The t-test and p-values
3) Comparison of groups: analysis of variance (ANOVA)
4) The F-test
Session 4 (Thursday)
1) Analysis of count data: the chi-square test
2) The Poisson distribution
3) The binomial distribution
4) Logistic regression
Session 5 (Friday)
1) Maximum likelihood estimation
2) Manual optimization
3) Non-linear regression
4) Review of the module
MODULE 3 Demography with R (July 2-6)
Instructor: Marie-Pier Bergeron-Boucher
Session 1 (Monday)
1) Basic demographic measures
2) The Lexis diagram
3) Rates, probabilities and proportions
Session 2 (Tuesday)
1) Life expectancy
2) Life table calculations
3) Building a life table in R
4) The Human Mortality Database (HMD)
Session 3 (Wednesday)
1) Standardization of demographic measures
2) Rate decomposition (Kitagawa method)
3) Life expectancy decomposition (Arriaga method)
Session 4 (Thursday)
1) Review of matrix algebra
2) Matrix population models
3) The Leslie matrix
Session 5 (Friday)
1) Population forecast principles
2) The Lee-Carter model
3) Review of the module
MODULE 4 Data visualization with base R (July 9-13)
Instructor: Tim Riffe
Session 1 (Monday)
1) Review of intro
2) Base plotting approach
3) Base plot types
4) Plot device control
5) Theory I | visual vocabulary
Session 2 (Tuesday)
1) Color specification
2) Color palettes
3) Figure layering & composition
4) Theory II | design
Session 3 (Wednesday)
1) R figures in documents & presentations
2) Panel graphics
3) Text and symbols in base plots
4) Theory III | visualization in social sciences
Session 4 (Thursday)
1) Making custom functions for plot elements
2) Geometric transformations
3) Coordinate spaces
4) Post processing figures in Inkscape
5) Participant choice topic
Session 5 (Friday)
1) Animation
2) Review of module
3) Participant project presentations
MODULE 5 The `tidyverse` approach to R (July 16-20)
Instructor: Jonas Schöley
Session 1 (Monday): Introduction to the tidyverse
1) R programming paradigms
2) The tidy approach to data analysis
3) The `tidyverse`
4) The tidy workflow
5) Getting started with `ggplot2`, `dplyr` and `rmarkdown`
6) Basic exploratory data analysis with `dplyr` and `ggplot2`
Session 2 (Tuesday): Data wrangling
1) Tidy data
2) Data pipelines
3) Long versus wide format data
4) Data reshaping
5) Data tidying
6) Making sense of messy data (`tidyr`, `dplyr`, `ggplot2`)
Session 3 (Wednesday): Tidy iteration
1) The split-apply-combine paradigm
2) Transforming/summarising data group by group
3) Fitting and summarising many models
4) Visualizing data group by group
Session 4 (Thursday): Data Visualization
1) Visualization as a design process
2) Marks, channels and perception
3) Best practices of data viz
4) `ggplot2`: working with color
5) `ggplot2`: making your plot ready for publication
Session 5 (Friday): The grand finale
MODULE 6 Spatial analysis (July 23-27)
Instructor: Juan Galeano
Session 1 (Monday)
1) Basic data manipulation using dplyr
2) The pipe operator %>%
3) Group your data and summarise
4) Tidy your data
5) Plot your data: ggplot2
Session 2 (Tuesday)
1) Read shapefiles into R
2) General manipulation of spatial objects
3) Univariate Class Intervals
4) Color palettes
5) Thematic maps (I)
Session 3 (Wednesday)
1) Conversion between projection systems
2) The ggmap package
3) Thematic maps (II)
Session 4 (Thursday)
1) Spatial Statistics
2) Measures of spatial segregation and population diversity: The OasisR package
3) Neighborhood Matrix
4) Spatial autocorrelation: Global and Local Indicators of Spatial Autocorrelation (LISA)
Session 5 (Friday)
1) Plot Raster Data
2) Web-mapping: Leaflet and ggiraph
3) Review of module