R for Data Analysis: A Short Tutorial

class: top, left, inverse, title-slide

.title[
# R for Data Analysis: A Short Tutorial
]
.subtitle[
## Session 1: Introduction to R
]
.author[
### Dimiter Toshkov
]
.institute[
### Institute of Public Administration, Leiden University
]
.date[
### last updated: 2025-03-31
]

---


# Welcome! First things first: introductions!

---
class: inverse, top

background-image: url("figs/mip_env_country.png")
background-size: contain

# Here is a picture that I like (done in R)

---
class: inverse, top

background-image: url("http://dimiter.eu/thumb/DSC_3484.jpg")
background-size: contain

# Not done in R (yes, there's [more](http://dimiter.eu/Photos.html)  where that came from <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm0 448c-110.3 0-200-89.7-200-200S137.7 56 248 56s200 89.7 200 200-89.7 200-200 200zm117.8-146.4c-10.2-8.5-25.3-7.1-33.8 3.1-20.8 25-51.5 39.4-84 39.4s-63.2-14.3-84-39.4c-8.5-10.2-23.7-11.5-33.8-3.1-10.2 8.5-11.5 23.6-3.1 33.8 30 36 74.1 56.6 120.9 56.6s90.9-20.6 120.9-56.6c8.5-10.2 7.1-25.3-3.1-33.8zM168 240c17.7 0 32-14.3 32-32s-14.3-32-32-32-32 14.3-32 32 14.3 32 32 32zm160-60c-25.7 0-55.9 16.9-59.9 42.1-1.7 11.2 11.5 18.2 19.8 10.8l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c8.5 7.4 21.6.3 19.8-10.8-3.8-25.2-34-42.1-59.7-42.1z"></path></svg>)

---
class: left

# What is R?

--
## R is the best thing since sliced bread,

--
class: left

## only much better, because...

--
.left[unlike bread, it combines *functional*  with *object-oriented* programming,]

--
.left[and it is not sliced, but modular, so that it can be easily extended with new packages.]

---
# Downsides of R

It is called R.

It comes with no customer support.

It regularly leads people into arguments about how good it is.

Error messages are rarely informative.

It is not made for data entry (though if you insist, look [here](https://www.sebastianvanbaalen.se/uploads/tutorial-data-collection-app)).

---
# What can you do with R? (1)

## Fun things, such as:
- Run (m)any statistical models, including Bayesian models (e.g. with [Stan](https://mc-stan.org/users/interfaces/rstan))
- Do automated text analysis (e.g. with [quanteda](https://quanteda.io/)) and machine learning (e.g. with [tensorflow](https://cran.r-project.org/web/packages/tensorflow/index.html))
- Interact with Large Language Models (LLMs) (e.g. with [chattr](https://mlverse.github.io/chattr/) and [tidychatmodels](https://albert-rapp.de/posts/20_tidychatmodels/20_tidychatmodels))

---
# What can you do with R? (2)

## Access data: 
- From any type of file (Stata, SPSS, Excel, text, etc.)
- Directly from the web via APIs (e.g. [World Bank](https://cran.r-project.org/web/packages/wbstats/vignettes/wbstats.html))
- Scrape complex internet sites and databases (e.g. [EUR-Lex](https://eur-lex.europa.eu/))

---
# What can you do with R? (3)

## Do other important things, such as:
- Well-formatted conference programs from Excel sheets <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm0 448c-110.3 0-200-89.7-200-200S137.7 56 248 56s200 89.7 200 200-89.7 200-200 200zm0-176c-35.3 0-64 28.7-64 64s28.7 64 64 64 64-28.7 64-64-28.7-64-64-64zm-48-72c0-17.7-14.3-32-32-32s-32 14.3-32 32 14.3 32 32 32 32-14.3 32-32zm128-32c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32z"></path></svg>
- Presentations like this one (with `RMarkdown` and [xaringan](https://cran.r-project.org/web/packages/xaringan/index.html)) <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M278.9 511.5l-61-17.7c-6.4-1.8-10-8.5-8.2-14.9L346.2 8.7c1.8-6.4 8.5-10 14.9-8.2l61 17.7c6.4 1.8 10 8.5 8.2 14.9L293.8 503.3c-1.9 6.4-8.5 10.1-14.9 8.2zm-114-112.2l43.5-46.4c4.6-4.9 4.3-12.7-.8-17.2L117 256l90.6-79.7c5.1-4.5 5.5-12.3.8-17.2l-43.5-46.4c-4.5-4.8-12.1-5.1-17-.5L3.8 247.2c-5.1 4.7-5.1 12.8 0 17.5l144.1 135.1c4.9 4.6 12.5 4.4 17-.5zm327.2.6l144.1-135.1c5.1-4.7 5.1-12.8 0-17.5L492.1 112.1c-4.8-4.5-12.4-4.3-17 .5L431.6 159c-4.6 4.9-4.3 12.7.8 17.2L523 256l-90.6 79.7c-5.1 4.5-5.5 12.3-.8 17.2l43.5 46.4c4.5 4.9 12.1 5.1 17 .6z"></path></svg>

- Practice open reproducible science <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <g groupmode="layer" id="layer6" label="icon">    <path id="path90" style="fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:0.0352777" d="M 255.99349,8.0446377 C 223.80996,6.8710755 198.88846,29.106812 171.90108,43.078975 135.19601,64.83668 97.398702,84.909644 61.424592,107.79531 c -23.529448,19.12486 -34.340913,49.89944 -31.869027,79.61823 0.136467,52.69231 -0.273926,105.39637 0.206898,158.08127 3.046554,31.90013 25.166618,58.97257 53.630746,72.31691 45.693451,26.22554 91.119371,52.93874 136.981251,78.85735 29.15402,13.29373 63.66608,7.7032 89.45702,-10.28262 45.55933,-26.46403 91.40188,-52.46098 136.78286,-79.21917 26.10444,-18.58601 38.49713,-51.27731 35.81722,-82.60095 -0.13644,-52.69193 0.27402,-105.39561 -0.20674,-158.08014 C 479.17554,134.58854 457.05921,107.51038 428.59249,94.170602 382.89744,67.950466 337.46312,41.255709 291.60404,15.334368 280.50454,10.177688 268.1747,8.0789301 255.99349,8.0446377 Z M 227.29819,140.73479 c 19.13338,0 38.26677,0 57.40015,0 0,76.83994 0,153.67988 0,230.51981 -19.13338,0 -38.26677,0 -57.40015,0 0,-76.83993 0,-153.67987 0,-230.51981 z m 72.23096,50.17436 c 19.12396,0 38.2479,0 57.37188,0 0,60.11515 0,120.2303 0,180.34545 -19.12398,0 -38.24792,0 -57.37188,0 0,-60.11515 0,-120.2303 0,-180.34545 z m -144.45251,94.07924 c 19.13025,0 38.26049,0 57.39074,0 0,28.7554 0,57.51081 0,86.26621 -19.13025,0 -38.26049,0 -57.39074,0 0,-28.7554 0,-57.51081 0,-86.26621 z"></path>  </g></svg> <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <g groupmode="layer" id="layer6" label="icon">    <path id="path86" style="fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:0.0352777" d="m 255.99051,7.9999992 c -14.45957,0 -28.91558,3.1667158 -39.88958,9.5105468 L 69.397238,102.20596 C 
47.455739,114.88072 29.50769,145.96034 29.50769,171.29036 v 169.41925 c 0,25.33653 17.948049,56.42568 39.889548,69.0939 l 146.703692,84.71437 c 10.974,6.32444 25.4365,9.48212 39.88958,9.48212 14.46596,0 28.92849,-3.15768 39.88953,-9.48212 l 146.72261,-84.71437 c 21.9352,-12.66822 39.88966,-43.75737 39.88966,-69.0939 V 171.29036 c 0,-25.33002 -17.95446,-56.40964 -39.88966,-69.0844 L 295.88952,17.510546 c -10.974,-6.343831 -25.43305,-9.5105468 -39.89901,-9.5105468 z m -58.05822,145.7374608 58.06766,22.86711 58.09612,-22.86711 100.07901,40.79892 -58.06755,22.85762 V 317.49208 L 255.99995,358.27204 155.90196,317.49208 V 217.394 L 97.815363,194.53638 Z M 155.90196,217.394 255.99995,258.16451 356.10753,217.394 255.99995,176.60457 Z"></path>  </g></svg> <svg viewBox="0 0 320 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <g groupmode="layer" id="layer6" label="icon">    <path style="stroke-width:0.07717" d="m 234.55878,151.33351 c -0.0147,0 -0.0257,0.002 -0.0414,0.002 L 224.16976,96.812862 c 15.38213,-7.702378 25.99025,-23.574443 25.99025,-41.918293 0,-25.857997 -21.03653,-46.8945659 -46.89284,-46.8945659 -25.85803,0 -46.89911,21.0365689 -46.89911,46.8968339 0,25.858034 21.04053,46.896833 46.89911,46.896833 0.64873,0 1.28333,-0.0722 1.9281,-0.0996 l 9.97184,52.5411 c -27.25948,8.32453 -47.14885,33.69486 -47.14885,63.6431 0,22.04302 10.81663,41.55831 27.37532,53.67526 l -21.77855,36.87756 c -14.95433,-8.29345 -32.13858,-13.03243 -50.4163,-13.03243 -57.513449,0 -104.302352,46.78948 -104.302352,104.30066 0,57.51118 46.78947,104.30067 104.302352,104.30067 57.51286,0 104.30234,-46.78949 104.30234,-104.30067 0,-32.21545 -14.68591,-61.05947 -37.71052,-80.20745 l 22.92967,-38.82661 c 6.853,2.39098 14.18125,3.75738 21.84068,3.75738 36.69162,0 66.54272,-29.85103 66.54272,-66.54265 0,-36.69561 -29.85278,-66.54664 -66.5444,-66.54664 z m -26.61074,248.3654 c 0,46.72732 -38.01848,84.74581 -84.74975,84.74581 -46.731246,0 
-84.749743,-38.01849 -84.749743,-84.74581 0,-46.72957 38.018497,-84.7475 84.749743,-84.7475 46.73127,0 84.74975,38.0185 84.74975,84.7475 z M 203.26732,85.497762 c -16.8746,0 -30.6032,-13.726911 -30.6032,-30.600925 0,-16.874011 13.7286,-30.600923 30.6032,-30.600923 16.87455,0 30.59923,13.726912 30.59923,30.600923 0,16.874014 -13.72695,30.600925 -30.59923,30.600925 z m 31.29146,179.370208 c -25.90998,0 -46.9895,-21.0801 -46.9895,-46.98954 0,-25.91171 21.08007,-46.98951 46.9895,-46.98951 25.90944,0 46.98955,21.08007 46.98955,46.98951 0.003,25.91002 -21.08011,46.98954 -46.98955,46.98954 z" id="path2"></path>  </g></svg> with [`Quarto`](https://quarto.org/).

---
# Art with R, by Katharina Brunner

.center[![description of the image](figs/generativeart.png)]

---
# How can you learn to use R?

--
## 1. Get a good foundation <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M464 32H48C21.49 32 0 53.49 0 80v352c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V80c0-26.51-21.49-48-48-48zM224 416H64V160h160v256zm224 0H288V160h160v256z"></path></svg>

--
## 2. Learn by doing
2.1 with lots of support from the R community on blogs and Stack Overflow <svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <g label="icon" id="layer6" groupmode="layer">    <path id="path4" d="M 252.08252,64.000002 224.59395,84.395607 326.58056,221.8558 354.06915,201.4602 Z m -63.40386,54.985808 -21.72932,26.15488 131.69229,109.52992 21.72931,-26.16352 z m -50.55161,70.9472 -14.18598,31.4811 155.197,72.27227 14.18598,-31.0394 z m -29.26399,77.59851 -7.093,33.69819 167.60759,35.0319 7.09299,-33.70684 z M 29.930845,310.98148 V 448 H 338.10776 V 310.98148 H 303.96787 V 413.86011 H 64.079385 V 310.98148 Z m 68.28841,34.14854 V 379.2699 H 269.37765 v -34.13988 z" style="stroke:none;stroke-width:1"></path>  </g></svg>

2.2 adapting other people's code <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg>

2.3 and asking LLMs for help <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M144 208c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zm112 0c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zm112 0c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zM256 32C114.6 32 0 125.1 0 240c0 47.6 19.9 91.2 52.9 126.3C38 405.7 7 439.1 6.5 439.5c-6.6 7-8.4 17.2-4.6 26S14.4 480 24 480c61.5 0 110-25.7 139.1-46.3C192 442.8 223.2 448 256 448c141.4 0 256-93.1 256-208S397.4 32 256 32zm0 368c-26.7 0-53.1-4.1-78.4-12.1l-22.7-7.2-19.5 13.8c-14.3 10.1-33.9 21.4-57.5 29 7.3-12.1 14.4-25.7 19.9-40.2l10.6-28.1-20.6-21.8C69.7 314.1 48 282.2 48 240c0-88.2 93.3-160 208-160s208 71.8 208 160-93.3 160-208 160z"></path></svg>

--
## 3. You can also get good old-fashioned books <svg viewBox="0 0 448 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <g label="icon" id="layer6" groupmode="layer">    <path style="stroke-width:1" d="m 341.26842,32 c 0,15.556013 0,28.718795 0,43.277631 14.55884,-6.381952 27.32275,-13.76109 42.54638,-19.01291 0,17.018549 0,33.039921 0,49.526639 14.49235,-1.99436 27.58866,-4.9859 41.61568,-5.58421 0,98.78735 0,196.9764 0,295.69728 C 358.08755,424.02493 290.94406,452.01246 223.80057,480 156.65707,452.01246 89.84597,424.02493 22.56952,395.90443 c 0,-98.45496 0,-196.71049 0,-295.963191 12.96335,1.728441 25.52783,3.390401 39.42187,5.251811 0,-16.420241 0,-32.175692 0,-48.928329 15.42306,5.25182 28.31993,12.498002 43.14469,19.544744 0,-14.42588 0,-27.921057 0,-43.809465 48.46296,35.964976 90.14511,74.5891 118.19913,126.97432 C 251.18978,106.5891 293.07137,68.164415 341.26842,32 Z M 117.23519,55.732896 c -1.06366,91.740604 -2.06084,181.486864 -3.05802,270.435374 40.55201,32.30865 76.98234,66.87758 104.43804,110.55409 0,-2.79211 0,-5.58422 0,-8.4428 0,-78.44487 -0.13295,-156.88974 0.19944,-235.33461 0.0665,-10.76956 -2.32676,-20.40897 -7.11322,-29.78247 C 189.76346,120.01781 155.52692,87.642676 117.23519,55.732896 Z M 34.53568,388.05994 c 57.83647,24.19825 115.2076,48.19707 174.5066,72.99362 -8.90814,-12.03264 -17.61685,-20.94078 -27.18979,-29.05119 -33.63822,-28.51936 -72.32883,-48.3965 -112.88085,-64.81674 -3.85575,-1.59549 -11.23489,-4.45407 -11.23489,-4.45407 1.2631,-80.37275 2.52619,-163.87001 3.78928,-245.04051 l -26.99035,-4.25464 c 0,91.40822 0,182.88293 0,274.62353 z M 74.09051,74.280455 C 72.5615,168.14839 71.09897,261.28505 69.63643,354.55468 114.3766,372.50393 156.32467,394.4419 193.6857,424.55675 168.88915,387.52812 136.18162,358.5434 102.34396,331.81896 103.20818,250.18311 104.07241,169.74387 104.93663,89.437599 94.76538,83.853384 85.25893,78.668046 74.09051,74.280455 Z M 227.85575,436.72236 c 
26.32558,-44.80665 64.88322,-78.91022 104.57102,-110.62057 -0.99719,-89.8792 -1.99436,-179.09363 -2.99154,-270.634806 -9.97181,8.642226 -18.61404,15.755451 -26.72444,23.334019 -27.58866,25.594297 -52.98353,53.050007 -69.27082,87.618937 -3.05802,6.44844 -5.78365,12.89687 -5.78365,20.47544 z m 113.34621,-18.48108 c 23.99881,-10.03829 47.99763,-20.07657 72.32883,-30.31429 0,-91.60765 0,-182.88292 0,-274.82297 -9.77238,1.66197 -19.14587,3.19098 -29.0512,4.85294 1.2631,81.43642 2.45972,164.93367 3.72281,245.23995 l -9.30702,3.25745 c -17.61686,6.84732 -34.76837,14.62532 -51.38805,23.59994 -31.31147,16.81912 -61.02744,35.69907 -84.29499,63.22126 l -5.51774,7.778 z m -91.34175,10.17124 c 0.73126,-0.73126 1.26309,-1.19661 1.72844,-1.59549 0.86423,-0.73127 1.72845,-1.46253 2.5262,-2.26028 36.36386,-30.5802 77.97951,-52.25226 122.18786,-70.00207 -1.39605,-93.40258 -2.79211,-186.53925 -4.25463,-280.739576 -10.90252,5.384776 -21.14023,9.439972 -30.3143,15.821934 0.86423,80.838102 1.79494,161.343812 2.65916,241.849522 -11.36786,9.57294 -22.8022,18.68052 -33.63823,28.4529 -21.33966,19.21234 -41.81511,39.35539 -57.83647,63.48716 z" id="path2"></path>  </g></svg>

---
# Recommended books

<img src="https://d33wubrfki0l68.cloudfront.net/b88ef926a004b0fce72b2526b0b5c4413666a4cb/24a30/cover.png" width="60"> [R for Data Science](https://amzn.to/38NYJUr),  ([link to a free version](https://r4ds.had.co.nz/))

<img src="https://images.tandf.co.uk/common/jackets/amazon/978081538/9780815384571.jpg" width="60"> [Advanced R](https://amzn.to/2RDFBCM),  ([link to a free version](http://adv-r.had.co.nz/))

<img src="https://images-na.ssl-images-amazon.com/images/I/41No%2BZCNEZL._SX379_BO1,204,203,200_.jpg" width="60"> [R Cookbook](https://amzn.to/2RE5hPT), ([link to a free version](http://www.cookbook-r.com/))

<img src="https://m.media-amazon.com/images/I/71iHxb-EMeL._SL1500_.jpg" width="60"> [The R Book](https://amzn.to/2GxmDrh), ([link to a free version](https://www.cs.upc.edu/~robert/teaching/estadistica/TheRBook.pdf))

---
# Additional resources
[An introduction to R by the R Core Team (pdf)](https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf)

[RStudio Primers](https://rstudio.cloud/learn/primers)

[A very short intro to some of R's programming features](https://www.johndcook.com/blog/r_language_for_programmers/)

[Learn R interactively in R, with Swirl](https://swirlstats.com/)

[A free introductory online course at DataCamp](https://www.datacamp.com/courses/free-introduction-to-r)

[An intermediate course at DataCamp](https://www.datacamp.com/courses/intermediate-r)

---
# Even more resources
[Companion to the book 'Data analysis for social science'](https://ellaudet.github.io/dss_instructor_resources/)

[A short introductory tutorial by Chris Hanretty](https://chrishanretty.co.uk/conveRt/#1)

[Another great introduction for political scientists](https://m-freitag.github.io/intro-r-polsci/)

[YaRrr! The Pirate’s Guide to R](https://bookdown.org/ndphillips/YaRrr/)

---
# What can you expect from this tutorial?

--
<svg viewBox="0 0 352 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M176 80c-52.94 0-96 43.06-96 96 0 8.84 7.16 16 16 16s16-7.16 16-16c0-35.3 28.72-64 64-64 8.84 0 16-7.16 16-16s-7.16-16-16-16zM96.06 459.17c0 3.15.93 6.22 2.68 8.84l24.51 36.84c2.97 4.46 7.97 7.14 13.32 7.14h78.85c5.36 0 10.36-2.68 13.32-7.14l24.51-36.84c1.74-2.62 2.67-5.7 2.68-8.84l.05-43.18H96.02l.04 43.18zM176 0C73.72 0 0 82.97 0 176c0 44.37 16.45 84.85 43.56 115.78 16.64 18.99 42.74 58.8 52.42 92.16v.06h48v-.12c-.01-4.77-.72-9.51-2.15-14.07-5.59-17.81-22.82-64.77-62.17-109.67-20.54-23.43-31.52-53.15-31.61-84.14-.2-73.64 59.67-128 127.95-128 70.58 0 128 57.42 128 128 0 30.97-11.24 60.85-31.65 84.14-39.11 44.61-56.42 91.47-62.1 109.46a47.507 47.507 0 0 0-2.22 14.3v.1h48v-.05c9.68-33.37 35.78-73.18 52.42-92.16C335.55 260.85 352 220.37 352 176 352 78.8 273.2 0 176 0z"></path></svg> Get started and get inspired.

--
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M464 32H48C21.49 32 0 53.49 0 80v352c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V80c0-26.51-21.49-48-48-48zM224 416H64V160h160v256zm224 0H288V160h160v256z"></path></svg> Get a good foundation, hopefully.

--
<svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M208 352c-2.39 0-4.78.35-7.06 1.09C187.98 357.3 174.35 360 160 360c-14.35 0-27.98-2.7-40.95-6.91-2.28-.74-4.66-1.09-7.05-1.09C49.94 352-.33 402.48 0 464.62.14 490.88 21.73 512 48 512h224c26.27 0 47.86-21.12 48-47.38.33-62.14-49.94-112.62-112-112.62zm-48-32c53.02 0 96-42.98 96-96s-42.98-96-96-96-96 42.98-96 96 42.98 96 96 96zM592 0H208c-26.47 0-48 22.25-48 49.59V96c23.42 0 45.1 6.78 64 17.8V64h352v288h-64v-64H384v64h-76.24c19.1 16.69 33.12 38.73 39.69 64H592c26.47 0 48-22.25 48-49.59V49.59C640 22.25 618.47 0 592 0z"></path></svg> Learn enough so you can continue learning on your own.

---
# Organization of the meetings

.pull-left[
## Session 1: Introduction to R
- Workflow
- Fundamentals, objects and functions
- Conditional evaluation and loops

## Session 2: Data wrangling
- Importing data
- Restructuring datasets
- Recoding variables
- Merging and exporting data
]

.pull-right[
## Session 3: Data analysis
- Data summary and simple linear models
- Generalized linear models (logistic regression)
- Multilevel models
- Generating tables

## Session 4: Data visualization
- with `plot`
- with `ggplot2`
]

---
# By the end of the tutorial...

## You should not think about working with any other software for your data work<sup>1</sup>.

.footnote[
  [1] Unless you have to work with the uninitiated.
    ]    
  
---
# Let's get started (1)

--
<svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M257.981 272.971L63.638 467.314c-9.373 9.373-24.569 9.373-33.941 0L7.029 444.647c-9.357-9.357-9.375-24.522-.04-33.901L161.011 256 6.99 101.255c-9.335-9.379-9.317-24.544.04-33.901l22.667-22.667c9.373-9.373 24.569-9.373 33.941 0L257.981 239.03c9.373 9.372 9.373 24.568 0 33.941zM640 456v-32c0-13.255-10.745-24-24-24H312c-13.255 0-24 10.745-24 24v32c0 13.255 10.745 24 24 24h304c13.255 0 24-10.745 24-24z"></path></svg> We can work with R directly (from the console/terminal), but it would be nice if we could save our work somehow.

--
<svg viewBox="0 0 448 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M433.941 65.941l-51.882-51.882A48 48 0 0 0 348.118 0H176c-26.51 0-48 21.49-48 48v48H48c-26.51 0-48 21.49-48 48v320c0 26.51 21.49 48 48 48h224c26.51 0 48-21.49 48-48v-48h80c26.51 0 48-21.49 48-48V99.882a48 48 0 0 0-14.059-33.941zM266 464H54a6 6 0 0 1-6-6V150a6 6 0 0 1 6-6h74v224c0 26.51 21.49 48 48 48h96v42a6 6 0 0 1-6 6zm128-96H182a6 6 0 0 1-6-6V54a6 6 0 0 1 6-6h106v88c0 13.255 10.745 24 24 24h88v202a6 6 0 0 1-6 6zm6-256h-64V48h9.632c1.591 0 3.117.632 4.243 1.757l48.368 48.368a6 6 0 0 1 1.757 4.243V112z"></path></svg> We can use any text editor and *copy and paste* the code, but this gets boring pretty quickly.

---
# Let's get started (2)

You can customize the appearance of code in *RStudio*.

---
# General grammatical features of R as a language
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Functions have arguments in parentheses, separated by commas (unlike in Excel).
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Capitalization of object and function names matters.
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Spaces and indentation in your code do not matter (unlike in Python), but spaces inside object and function names do.
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> You can use single or double quotation marks, also nested within each other. But be careful to close like with like.
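
A minimal sketch of these rules in action. The object names (`my_value`, `My_Value`, `quote_a`, `quote_b`) are made up for illustration.

``` r
### Grammar in action (illustrative names)
my_value <- 10
My_Value <- 20            # a different object: capitalization matters
my_value == My_Value
## [1] FALSE

sum(1, 2, 3)              # arguments in parentheses, separated by commas
## [1] 6
sum (1,2,
        3)                # extra spaces and indents change nothing
## [1] 6

quote_a <- "single 'inside' double"   # nesting works...
quote_b <- 'double "inside" single'   # ...as long as you close like with like
```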

---
# Files and projects (1)

--
<svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M369.9 97.9L286 14C277 5 264.8-.1 252.1-.1H48C21.5 0 0 21.5 0 48v416c0 26.5 21.5 48 48 48h288c26.5 0 48-21.5 48-48V131.9c0-12.7-5.1-25-14.1-34zM332.1 128H256V51.9l76.1 76.1zM48 464V48h160v104c0 13.3 10.7 24 24 24h104v288H48z"></path></svg> We can start a file, write code, execute the code, and - when we are happy - we can save the file (with an `.r` extension, but it remains a text file that can be edited in any text editor).

This might be all we need for very small, simple and individual projects. (Btw, where is our work?)

``` r
### Where are we?
getwd()  # oh, here
setwd('C:/my projects/here') # better here
```

---
# Files and projects (2)
<svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M128 352H32c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32zm-24-80h192v48h48v-48h192v48h48v-57.59c0-21.17-17.23-38.41-38.41-38.41H344v-64h40c17.67 0 32-14.33 32-32V32c0-17.67-14.33-32-32-32H256c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h40v64H94.41C73.23 224 56 241.23 56 262.41V320h48v-48zm264 80h-96c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32zm240 0h-96c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32z"></path></svg> For more complex projects, you would want to start a *Project*. A project sets up the *working environment* and organizes things in a nice way. Within the project, you can (should) create separate folders for your code, input data, output data, plots, model results, and tables. The code itself can (should) be separated in smaller files (e.g. one for the libraries and functions that you use, one for data import and manipulation, one for statistical analyses, etc.). To start or continue working on a project, you click on the relevant `.Rproj` file, which loads the working environment.
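
The folder layout described above could be set up by hand or with a few lines of code. This is only a sketch: the folder names are one possible choice, and the project is created in a temporary directory so that the example is harmless to run.

``` r
### One possible project skeleton (folder names are illustrative)
project_root <- file.path(tempdir(), "my_project")  # a throwaway location for this demo
folders <- c("code", "data_input", "data_output", "plots", "models", "tables")

for (folder in folders) {
  dir.create(file.path(project_root, folder), recursive = TRUE, showWarnings = FALSE)
}

list.files(project_root)  # lists the folders we just created
```

In a real project you would replace `project_root` with the folder that holds your `.Rproj` file.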

---
# Files and projects (3)
<svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M384 144c0-44.2-35.8-80-80-80s-80 35.8-80 80c0 36.4 24.3 67.1 57.5 76.8-.6 16.1-4.2 28.5-11 36.9-15.4 19.2-49.3 22.4-85.2 25.7-28.2 2.6-57.4 5.4-81.3 16.9v-144c32.5-10.2 56-40.5 56-76.3 0-44.2-35.8-80-80-80S0 35.8 0 80c0 35.8 23.5 66.1 56 76.3v199.3C23.5 365.9 0 396.2 0 432c0 44.2 35.8 80 80 80s80-35.8 80-80c0-34-21.2-63.1-51.2-74.6 3.1-5.2 7.8-9.8 14.9-13.4 16.2-8.2 40.4-10.4 66.1-12.8 42.2-3.9 90-8.4 118.2-43.4 14-17.4 21.1-39.8 21.6-67.9 31.6-10.8 54.4-40.7 54.4-75.9zM80 64c8.8 0 16 7.2 16 16s-7.2 16-16 16-16-7.2-16-16 7.2-16 16-16zm0 384c-8.8 0-16-7.2-16-16s7.2-16 16-16 16 7.2 16 16-7.2 16-16 16zm224-320c8.8 0 16 7.2 16 16s-7.2 16-16 16-16-7.2-16-16 7.2-16 16-16z"></path></svg> Use relative, not absolute paths in your scripts to make collaborative work easier.

``` r
### There are good and bad paths
'./data/nl/zh/dh/students.csv' # this is a good path
'C:/data/nl/zh/dh/students.csv' # this is a bad path
```

--
Some additional advice on setting up projects is available [here](https://martinctc.github.io/blog/rstudio-projects-and-working-directories-a-beginner's-guide/).
We will say more about workflow (with `GitHub` and `RMarkdown`) later in this tutorial.
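
One way to build relative paths is `file.path()`, which glues the pieces together with `/`. A small sketch, assuming a hypothetical file `students.csv` that sits in nested folders below the working directory:

``` r
### Building a relative path (the file name is hypothetical)
students_path <- file.path("data", "nl", "zh", "dh", "students.csv")
students_path
## [1] "data/nl/zh/dh/students.csv"

file.exists(students_path)  # a cheap check before trying to read the file
```

The path is resolved against the working directory (see `getwd()`), so it keeps working when a collaborator opens the same project on another computer.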

---
# Some good practices

--
<svg viewBox="0 0 576 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M402.3 344.9l32-32c5-5 13.7-1.5 13.7 5.7V464c0 26.5-21.5 48-48 48H48c-26.5 0-48-21.5-48-48V112c0-26.5 21.5-48 48-48h273.5c7.1 0 10.7 8.6 5.7 13.7l-32 32c-1.5 1.5-3.5 2.3-5.7 2.3H48v352h352V350.5c0-2.1.8-4.1 2.3-5.6zm156.6-201.8L296.3 405.7l-90.4 10c-26.2 2.9-48.5-19.2-45.6-45.6l10-90.4L432.9 17.1c22.9-22.9 59.9-22.9 82.7 0l43.2 43.2c22.9 22.9 22.9 60 .1 82.8zM460.1 174L402 115.9 216.2 301.8l-7.3 65.3 65.3-7.3L460.1 174zm64.8-79.7l-43.2-43.2c-4.1-4.1-10.8-4.1-14.8 0L436 82l58.1 58.1 30.9-30.9c4-4.2 4-10.8-.1-14.9z"></path></svg> Take the time to annotate your code (using `#` to start a segment of a line that is not executed as code).

Think about the names of files and variables that you use. Have a system and be consistent. You can use `.`, or `_`, or capital letters, but stick to one.

``` r
### How (not) to name your variables
data.nl.denhaag.bezuidenhout # this is fine
data_nl_denhaag_bezuidenhout # this is also fine
DataNlDenhaagBezuidenhout # this is not so fine
data.Nl_DenHaag_.bezuidenhout # this is definitely not fine
```

---
# Some more good practices

--
Think about how you name your scripts and other files as well.

---
# Modularity

R works with packages.

The default installation comes with basic functionality.

For everything else, you install a package.

There are multiple packages that can achieve the same task.

There is a special universe of packages called [`tidyverse`](https://www.tidyverse.org/), developed by Hadley Wickham and company, which provides a convenient way to load, wrangle, analyze and visualize data. We will use these a lot.

---
# Working with packages (1)

Working with packages is easy:
- First, you have to install the package, from a **CRAN** repository, or from zip files, or via `devtools`. You can install with a command or from the **RStudio** menu. You install a package once on a computer (you might need to update it every now and then).

- Once the package is installed, you will want to load it with the `library()` function to use its functions. You have to load the package every session (if you need it, of course).

- You can also call a function from a specific package directly, e.g. `dplyr::recode()`. This is sometimes necessary because functions in different packages can have the same name, which leads to confusion, both for `R` and for us.
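
To see the name clash and the `::` remedy without installing anything, here is a sketch that deliberately masks a base function. The silly replacement `sd()` is our own invention for the demo.

``` r
### Same name, different functions
sd <- function(x) "my own sd"   # our (silly) function now masks stats::sd
sd(c(1, 2, 3))
## [1] "my own sd"
stats::sd(c(1, 2, 3))           # the package version is still reachable with ::
## [1] 1
rm(sd)                          # remove our version; sd() is stats::sd() again
```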

---
# Working with packages (2)

``` r
### How to install and load a package
install.packages('dplyr')
library(dplyr)
```

--
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** If you work with people who would not know how to install a package but would want to run your code, you can start your code with a function that will install and load packages automatically (see [here](https://stackoverflow.com/questions/4090169/elegant-way-to-check-for-missing-packages-and-install-them))
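
One common pattern for this, sketched with a hypothetical helper called `use_packages()` (the linked thread lists several alternatives). The example is called with two packages that ship with R, so nothing is actually installed when you run it.

``` r
### Install (if needed) and load a set of packages
use_packages <- function(pkgs) {
  for (pkg in pkgs) {
    if (!requireNamespace(pkg, quietly = TRUE)) {
      install.packages(pkg)               # runs only when the package is missing
    }
    library(pkg, character.only = TRUE)   # character.only lets us pass the name as a string
  }
  invisible(pkgs)
}

use_packages(c("stats", "utils"))  # replace with the packages your script needs
```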

---
# Assignment operators (1)

Perhaps the most fundamental operation in R is to assign a value to a named object: `object_name <- value`. Be careful, R is sensitive; *case sensitive*, that is.

You can be old school<sup>1</sup> and assign values to names with `<-`. Or you can just use `=`. And if you are that cool, you can also use `->`.

.footnote[
  [1] "There is a general preference among the R community for using `<-` for assignment (other than in function signatures) for compatibility with (very) old versions of S-Plus."]
  
---
# Assignment operators (2)

``` r
### There are different ways to assign
best.month <- 'August'
best.date = 18
1978 -> best.year

best.date
## [1] 18
best.month
## [1] "August"
best.year
## [1] 1978
```

---
# Assignment operators (3)

There are some subtle differences among the different assignment operators; if you are interested, read [here](https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-assignment-operators-in-r).
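
One of these differences shows up inside function calls. In the sketch below, `half()` and `amount` are hypothetical names invented for the example, and the output assumes a session in which no object called `amount` exists yet.

``` r
half <- function(amount) amount / 2

half(amount = 10)    # = only matches the argument; nothing is created outside the call
## [1] 5
exists("amount")
## [1] FALSE

half(amount <- 10)   # <- first creates the object `amount`, then passes its value on
## [1] 5
amount
## [1] 10
```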

You also have the assignment operator `<<-`. This is most useful *'in conjunction with closures to maintain state'*. Exactly. If you want to know more, read [here](https://stackoverflow.com/questions/2628621/how-do-you-use-scoping-assignment-in-r).
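
To make the quote a little less cryptic, here is a minimal sketch of a closure that keeps state with `<<-`. The names `make_counter` and `counter` are invented for the example.

``` r
make_counter <- function() {
  count <- 0
  function() {
    count <<- count + 1  # updates count in the enclosing function, not a local copy
    count
  }
}

counter <- make_counter()
counter()
## [1] 1
counter()
## [1] 2
```

With `<-` instead of `<<-`, every call would return 1, because `count` would be re-created locally each time.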

---
# Vectors (1)
Vectors are one-dimensional collections of objects.

``` r
### How to make and check a vector
v1 <- seq (1, 50, by=5)
v1
##  [1]  1  6 11 16 21 26 31 36 41 46
v2 <- c('R', 'pie', 5, NA)
v2
## [1] "R"   "pie" "5"   NA
is.vector(v1)
## [1] TRUE
is.vector (v2)
## [1] TRUE
is.vector(c(is.vector(v1), is.vector(v2)))
## [1] TRUE
```

---
# Vectors (2)
There are several different types of vectors: *logical*, *character*, *numeric* (which can be *double* or *integer*), *complex* and *raw*. *Factors* and *dates* are augmented vectors that have a special attribute, their *'class'*.

``` r
### What vectors?
typeof(v1)
## [1] "double"
typeof(v2)
## [1] "character"
typeof(is.vector(c(is.vector(v1), is.vector(v2))))
## [1] "logical"
typeof(c("1", "2", "4"))
## [1] "character"
```
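
A short illustration of augmented vectors: `typeof()` reports the underlying storage type, while `class()` reports the attribute that makes R treat the vector as a factor or a date. The object names are made up for the example.

``` r
f <- factor(c("low", "high", "low"))
typeof(f)   # stored as integer codes
## [1] "integer"
class(f)
## [1] "factor"

d <- as.Date("2025-03-31")
typeof(d)   # stored as a number of days
## [1] "double"
class(d)
## [1] "Date"
```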

---
# More on vectors

``` r
0.3/3 == 0.1 # floating point bizarro
## [1] FALSE
round(0.3/3,1) == 0.1
## [1] TRUE
unique(c(.3, .4 - .1, .5 - .2, .6 - .3, .7 - .4))
## [1] 0.3 0.3 0.3
```

Integers have one special value: `NA`, while doubles have four: `NA`, `NaN`, `Inf` and `-Inf`.

``` r
c(-1, 0, 1) / 0     
## [1] -Inf  NaN  Inf
```
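
Rounding is one way around the floating point problem; a tolerance-based comparison with `all.equal()` is another. The sketch below also shows how the special values behave in the usual checks.

``` r
isTRUE(all.equal(0.3/3, 0.1))   # compare with a small numerical tolerance
## [1] TRUE
is.na(c(NA, NaN, Inf))          # NaN counts as missing, Inf does not
## [1]  TRUE  TRUE FALSE
is.finite(c(1, NA, NaN, Inf, -Inf))
## [1]  TRUE FALSE FALSE FALSE FALSE
typeof(NA_integer_)             # the integer flavour of NA
## [1] "integer"
```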

---
# Coercion (1)
You can coerce one type of vector to another. But be gentle and beware the consequences.

``` r
v1 <- c(1,2,4)
typeof(v1)
## [1] "double"
f1 <- as.factor(v1)
f1
## [1] 1 2 4
## Levels: 1 2 4
n1 <- as.numeric(f1)
is.numeric(n1)
## [1] TRUE
n1 # OMG!!!
## [1] 1 2 3
n2 <- as.numeric(as.character(f1))
n2 # that's better!
## [1] 1 2 4
```

---
# Coercion (2)
Coercion also happens without your help (and perhaps without your noticing), every time you mix vector elements of different types together. The most complex type prevails.

``` r
v1 <- 1:999
is.numeric(v1)
## [1] TRUE
length(v1) # vectors have length
## [1] 999
v2 <- c(v1, '1000')
is.numeric(v2)
## [1] FALSE
typeof(v2) # it only takes one
## [1] "character"
```
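
A quick sketch of what "most complex" means in practice: logical gives way to integer, integer to double, and everything to character.

``` r
typeof(c(TRUE, 2L))    # logical + integer
## [1] "integer"
typeof(c(2L, 2.5))     # integer + double
## [1] "double"
typeof(c(2.5, "a"))    # double + character
## [1] "character"
sum(c(TRUE, FALSE, TRUE))  # handy side effect: TRUE counts as 1
## [1] 2
```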

---
# Character vectors
Character vectors are the most complex type of atomic vector, because each element of a character vector is a string, and a string can contain an arbitrary amount of data.

Working with strings and character vectors is very common in data analysis. There are a couple of very useful operations with strings that we should learn right away:

``` r
v.char <- c("alpha", "beta", "gamma")
substr(v.char, 1, 2) # get the first two letters of every element 
## [1] "al" "be" "ga"
nchar(v.char) # count the number of characters in each string
## [1] 5 4 5
toupper(v.char) # capitalize each element
## [1] "ALPHA" "BETA"  "GAMMA"
```
The package [`stringr`](https://stringr.tidyverse.org/) has handy functions for more advanced operations.
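
Before reaching for an extra package, a few more base R functions may already be enough. A small sketch, with a made-up vector `greek`:

``` r
greek <- c("alpha", "beta")
paste0(greek, "_1")      # glue strings together: "alpha_1" "beta_1"
grepl("al", greek)       # does each element contain "al"?
## [1]  TRUE FALSE
sub("a", "A", greek)     # replace the first "a" in each element: "Alpha" "betA"
```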
  
---
# Lists 
Lists, also called recursive vectors, can contain all kinds of things, including other lists.

`y <- list("a", 1L, 1.5, TRUE)`

Data frames are lists of a special class:

`typeof(data.frame(NA))`

`class(data.frame(NA))`
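
Run as a chunk, these checks look as follows. The names `things` and `toy_df` are made up for the example.

``` r
things <- list("a", 1L, 1.5, TRUE)
sapply(things, typeof)   # each element keeps its own type
## [1] "character" "integer"   "double"    "logical"

toy_df <- data.frame(a = 1:3, b = c("x", "y", "z"))
typeof(toy_df)           # under the hood, a list of columns
## [1] "list"
class(toy_df)
## [1] "data.frame"
length(toy_df)           # so its length is the number of columns
## [1] 2
```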

---
# Navigating our objects
There are different ways in which we can navigate to and access elements of our objects.

--
We can do that by position or name:

``` r
x <- rnorm(100 ,0, 1) # let's generate some randomness 
y <- rnorm(100 ,0, 1) # let's generate more randomness 
m <- cbind(x,y) # let's bind randomness together in a ....
class(m)
## [1] "matrix" "array"
dim(m)
## [1] 100   2
length(x)
## [1] 100
df<-data.frame(m)
```

---
# Navigation examples

``` r
x[1]
## [1] 1.114262
m[1,1]
##        x 
## 1.114262
m[1:5, -2]
## [1]  1.1142618 -1.3311627  0.3256704 -0.3040167 -1.7666893
df[seq(1, 100, 10), "y"]
##  [1]  0.92370516 -0.05881295 -0.20430290 -0.73554277 -0.69460705  0.38450683
##  [7]  0.37186241  0.31955549  0.34642647  0.46685266
```

--
Navigating lists is more complicated. We have to use `x[[ n ]]` to get the n-th element of list `x`. That element itself could be anything (e.g. a data frame).
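
A minimal sketch of list navigation, using a made-up list called `bag`:

``` r
bag <- list(name = "R", numbers = 1:3, table = data.frame(a = 1:2))
bag[[2]]            # the element itself: an integer vector
## [1] 1 2 3
class(bag[2])       # single brackets return a (shorter) list
## [1] "list"
bag$numbers[3]      # named elements can also be reached with $
## [1] 3
bag[["table"]]$a    # steps can be chained
## [1] 1 2
```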

---
# Some basic functions for summarizing data
First steps are easy:

``` r
mean(x)
## [1] 0.05794814
sd(y)
## [1] 1.00284
quantile(m)
##          0%         25%         50%         75%        100% 
## -3.68630633 -0.71101087 -0.02760762  0.57849092  2.58010017
range(df)
## [1] -3.686306  2.580100
summary(df)
##        x                  y            
##  Min.   :-2.01945   Min.   :-3.686306  
##  1st Qu.:-0.75151   1st Qu.:-0.672038  
##  Median :-0.07559   Median : 0.005507  
##  Mean   : 0.05795   Mean   :-0.079793  
##  3rd Qu.: 0.76700   3rd Qu.: 0.449224  
##  Max.   : 2.58010   Max.   : 2.385434
```

---
# But it can get more tricky
Note that we can use the dollar sign `$` to access columns (variables) of a data frame.

``` r
df <- rbind(df, c(NA,NA)) # bind rows together
tail(df)
##               x            y
## 96   2.40618546  0.594138446
## 97   0.06715917  0.276341899
## 98  -0.39137443  1.235099081
## 99  -0.34462524  0.425646241
## 100  0.56221336 -0.003453708
## 101          NA           NA
sum(df$x) # oops
## [1] NA
sum(df$x, na.rm=TRUE) # ok, R is very careful with missing data 
## [1] 5.794814
```

---
# And more tricky

``` r
sum(df)
## [1] NA
df$z <- rowSums(df) 
tail(df, 2)
##             x            y         z
## 100 0.5622134 -0.003453708 0.5587597
## 101        NA           NA        NA
df$z <- rowSums(df, na.rm=TRUE)
tail(df, 2)
##             x            y        z
## 100 0.5622134 -0.003453708 1.117519
## 101        NA           NA 0.000000
```
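
One detail worth noticing in the output above: `z` for row 100 has doubled, because the second `rowSums()` call sums over all columns, including the `z` created by the first call. A sketch with a tiny made-up data frame `toy` reproduces the effect and shows one way to restrict the sum to the intended columns:

``` r
toy <- data.frame(x = c(1, NA), y = c(2, NA))
toy$z <- rowSums(toy, na.rm = TRUE)   # sums x and y
toy$z <- rowSums(toy, na.rm = TRUE)   # now sums x, y AND the old z
toy$z
## [1] 6 0
toy$z <- rowSums(toy[, c("x", "y")], na.rm = TRUE)   # name the columns explicitly
toy$z
## [1] 3 0
```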

---
# LOOPS (1)
Loops are a fundamental programming technique, in which we iterate over a predefined sequence and apply a function to each element.

``` r
for (i in 1:5){
   print(round(df[i,], 2))
}
##      x    y    z
## 1 1.11 0.92 4.08
##       x     y     z
## 2 -1.33 -2.71 -8.08
##      x   y    z
## 3 0.33 0.5 1.65
##      x    y    z
## 4 -0.3 0.37 0.14
##       x     y     z
## 5 -1.77 -1.51 -6.55
```
Most R functions are vectorized, which means that we do not have to loop over the elements of a vector to apply the function to each element separately. Yet, in some cases loops can be handy.
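
To illustrate vectorization, the sketch below rounds a made-up vector twice, once with a loop and once with a single call, and checks that the results agree.

``` r
vals <- c(1.234, 5.678, 9.1011)

# the loop way: fill a pre-allocated vector element by element
rounded_loop <- numeric(length(vals))
for (i in seq_along(vals)) {
  rounded_loop[i] <- round(vals[i], 2)
}

# the vectorized way: one call handles all elements
rounded_vec <- round(vals, 2)

identical(rounded_loop, rounded_vec)
## [1] TRUE
```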

---
# LOOPS (2)

We can also create new objects in loops:

``` r
for (i in 1:dim(df)[1]){
    df$our.sum[i] <- sum(df[i,1:2], na.rm=TRUE)
}
df[c(1,100:101),]
```

```
##             x            y        z   our.sum
## 1   1.1142618  0.923705162 4.075934 2.0379670
## 100 0.5622134 -0.003453708 1.117519 0.5587597
## 101        NA           NA 0.000000 0.0000000
```

You can read more about loops [here](https://www.datacamp.com/community/tutorials/tutorial-on-loops-in-r).

---
# Comparisons (evaluation)
Sooner or later, we all become judgmental:

These are the main comparison operators: `>`, `>=`, `<`, `<=`, `!=` (not equal), and `==` (equal).

With logical operators, we can mix things up a bit: `&` is “and”, `|` is “or”, and `!` is “not”.

Be careful with missing values: almost any operation involving an unknown value will also be unknown.
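
A few one-liners show what "unknown" means in practice, including the two cases where the answer is known despite the missing value:

``` r
NA > 1
## [1] NA
NA == NA     # even two unknowns are not known to be equal
## [1] NA
NA & FALSE   # FALSE whatever the unknown value is
## [1] FALSE
NA | TRUE    # TRUE whatever the unknown value is
## [1] TRUE
c(3, NA, 0.5) > 1
## [1]  TRUE    NA FALSE
```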

---
# Comparisons (2)

We can check for missing data: `is.na(x)` or even better `which(is.na(df$x))`.

``` r
which(df$x > 1)
##  [1]  1  6  8 11 13 15 28 30 31 37 39 42 47 48 69 72 75 76 92 96
w1 <- which(df$x > 1)
length(w1)
## [1] 20
w2 <- which(df$y>1)
length(w2)
## [1] 11
```

Check whether the last row of `df` has elements greater than 1.

---
# Conditionals (1)
Conditional evaluation is another fundamental programming technique.

``` r
  if (this) {
     # do that
   } else if (that) {
     # do something else
   } else {
     # 
   }
```
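
A concrete version of this skeleton, wrapped in a hypothetical function `describe()` so that it can be tried on different values. Note that `if` expects a single `TRUE` or `FALSE`, so the missing-value check comes first.

``` r
describe <- function(value) {
  if (is.na(value)) {
    "missing"
  } else if (value > 1) {
    "greater than 1"
  } else {
    "1 or less"
  }
}

describe(2.5)
## [1] "greater than 1"
describe(NA)
## [1] "missing"
```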

---
# Conditionals (2)
For very short evaluations we can also use the `ifelse` one-liner: `ifelse(evaluate, do.this.if.true, do.this.if.false)`. These simple statements can be nested, but for anything more complex it is better to use the full `if`/`else` form shown above.

``` r
for (i in seq_along(df$x)){
   # sum x and y only when both are present; otherwise record a missing value
   if (!is.na(df$x[i]) && !is.na(df$y[i])) {
      df$out.sum2[i] <- sum(df[i,1:2])
   } else {
       df$out.sum2[i] <- NA
   }
}
```
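
Because `ifelse()` is vectorized, the same logic can also be written without a loop. A sketch on a small made-up data frame `toy` (the column name `sum2` is likewise invented here):

``` r
toy <- data.frame(x = c(1, NA, 3), y = c(2, 5, NA))

# NA if either value is missing, otherwise the sum of the two
toy$sum2 <- ifelse(is.na(toy$x) | is.na(toy$y), NA, toy$x + toy$y)
toy$sum2
## [1]  3 NA NA
```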

---
# Functions
Objects are stuff with names and values. Functions do things to objects.

In R you can easily write your own functions. Just give them a name and tell them what to do:

``` r
sum.na <- function (x) {sum(x, na.rm=TRUE)} # sum that ignores NAs 
sum.na(c(3,5,NA))
## [1] 8

sum.allna <- function (x) {
  # sum that ignores NAs but returns NA if all values are NA
  if (all(is.na(x))) NA else sum(x, na.rm=TRUE)
}
sum.allna(c(NA,NA))
## [1] NA
```

You can read more about functions [here](https://www.datacamp.com/community/tutorials/functions-in-r-a-tutorial).
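
Functions can take several arguments, and arguments can have default values. A minimal sketch, with a hypothetical function `center()` that subtracts the mean from each element:

``` r
center <- function(x, na.rm = TRUE) {
  x - mean(x, na.rm = na.rm)   # the last evaluated expression is returned
}

center(c(1, 2, 3, NA))                  # uses the default na.rm = TRUE
## [1] -1  0  1 NA
center(c(1, 2, 3, NA), na.rm = FALSE)   # the default can be overridden
## [1] NA NA NA NA
```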

---
# Strings and factors

## Strings
You can create strings with either single quotes or double quotes.
Multiple strings are often stored in a character vector, which you can create with `c()`.

## Factors
In R, factors are used to work with categorical variables: variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order. You can read more about factors [here](https://www.datacamp.com/tutorial/factors-in-r).

If you ever need to access the set of valid factor levels directly, you can do so with `levels()`. You can also reassign the levels of a factor with `levels()`.
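
A short sketch of both uses of `levels()`, with a made-up factor `sizes` whose levels are given in a meaningful, non-alphabetical order:

``` r
sizes <- factor(c("small", "large", "medium", "small"),
                levels = c("small", "medium", "large"))
levels(sizes)
## [1] "small"  "medium" "large"
as.character(sort(sizes))       # sorting follows the level order, not the alphabet
## [1] "small"  "small"  "medium" "large"

levels(sizes) <- c("S", "M", "L")   # reassign the labels, in level order
as.character(sizes)
## [1] "S" "L" "M" "S"
```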
   
---
# When things don't woRk as expected <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg">  <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm144 386.4V280c0-13.2-10.8-24-24-24s-24 10.8-24 24v151.4C315.5 447 282.8 456 248 456s-67.5-9-96-24.6V280c0-13.2-10.8-24-24-24s-24 10.8-24 24v114.4c-34.6-36-56-84.7-56-138.4 0-110.3 89.7-200 200-200s200 89.7 200 200c0 53.7-21.4 102.5-56 138.4zM205.8 234.5c4.4-2.4 6.9-7.4 6.1-12.4-4-25.2-34.2-42.1-59.8-42.1s-55.9 16.9-59.8 42.1c-.8 5 1.7 10 6.1 12.4 4.4 2.4 9.9 1.8 13.7-1.6l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c2.5 2.3 7.9 4.8 13.7 1.6zM344 180c-25.7 0-55.9 16.9-59.8 42.1-.8 5 1.7 10 6.1 12.4 4.5 2.4 9.9 1.8 13.7-1.6l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c2.5 2.2 8 4.7 13.7 1.6 4.4-2.4 6.9-7.4 6.1-12.4-3.9-25.2-34.1-42.1-59.8-42.1zm-96 92c-30.9 0-56 28.7-56 64s25.1 64 56 64 56-28.7 56-64-25.1-64-56-64z"></path></svg>

---
# Some solution strategies

---
# How to get in touch?