class: top, left, inverse, title-slide .title[ # R for Data Analysis: A Short Tutorial ] .subtitle[ ## Session 1: Introduction to R ] .author[ ### Dimiter Toshkov ] .institute[ ### Institute of Public Administration, Leiden University ] .date[ ### last updated: 2025-03-31 ] --- <style type="text/css"> .title-slide { background-image: url(https://cran.r-project.org/Rlogo.svg); background-position: 50% 0%; ## just start changing this background-size: 150px; background-color: #fff; padding-left: 100px; /* delete this for 4:3 aspect ratio */ } .remark-slide-content { font-size: 28px; padding: 1em 1em 1em 1em; } .remark-slide-content > h1 { font-size: 32px; margin-top: -85px; } </style> # Welcome! First things first: introductions! -- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M434.7 64h-85.9c-8 0-15.7 3-21.6 8.4l-98.3 90c-.1.1-.2.3-.3.4-16.6 15.6-16.3 40.5-2.1 56 12.7 13.9 39.4 17.6 56.1 2.7.1-.1.3-.1.4-.2l79.9-73.2c6.5-5.9 16.7-5.5 22.6 1 6 6.5 5.5 16.6-1 22.6l-26.1 23.9L504 313.8c2.9 2.4 5.5 5 7.9 7.7V128l-54.6-54.6c-5.9-6-14.1-9.4-22.6-9.4zM544 128.2v223.9c0 17.7 14.3 32 32 32h64V128.2h-96zm48 223.9c-8.8 0-16-7.2-16-16s7.2-16 16-16 16 7.2 16 16-7.2 16-16 16zM0 384h64c17.7 0 32-14.3 32-32V128.2H0V384zm48-63.9c8.8 0 16 7.2 16 16s-7.2 16-16 16-16-7.2-16-16c0-8.9 7.2-16 16-16zm435.9 18.6L334.6 217.5l-30 27.5c-29.7 27.1-75.2 24.5-101.7-4.4-26.9-29.4-24.8-74.9 4.4-101.7L289.1 64h-83.8c-8.5 0-16.6 3.4-22.6 9.4L128 128v223.9h18.3l90.5 81.9c27.4 22.3 67.7 18.1 90-9.3l.2-.2 17.9 15.5c15.9 13 39.4 10.5 52.3-5.4l31.4-38.6 5.4 4.4c13.7 11.1 33.9 9.1 45-4.7l9.5-11.7c11.2-13.8 9.1-33.9-4.6-45.1z"></path></svg> My name is written *Димитър* and is pronounced *[diˈ mitər]*. 
-- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M255.03 261.65c6.25 6.25 16.38 6.25 22.63 0l11.31-11.31c6.25-6.25 6.25-16.38 0-22.63L253.25 192l35.71-35.72c6.25-6.25 6.25-16.38 0-22.63l-11.31-11.31c-6.25-6.25-16.38-6.25-22.63 0l-58.34 58.34c-6.25 6.25-6.25 16.38 0 22.63l58.35 58.34zm96.01-11.3l11.31 11.31c6.25 6.25 16.38 6.25 22.63 0l58.34-58.34c6.25-6.25 6.25-16.38 0-22.63l-58.34-58.34c-6.25-6.25-16.38-6.25-22.63 0l-11.31 11.31c-6.25 6.25-6.25 16.38 0 22.63L386.75 192l-35.71 35.72c-6.25 6.25-6.25 16.38 0 22.63zM624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"></path></svg> I am *not* a programmer. -- <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M332.8 320h38.4c6.4 0 12.8-6.4 12.8-12.8V172.8c0-6.4-6.4-12.8-12.8-12.8h-38.4c-6.4 0-12.8 6.4-12.8 12.8v134.4c0 6.4 6.4 12.8 12.8 12.8zm96 0h38.4c6.4 0 12.8-6.4 12.8-12.8V76.8c0-6.4-6.4-12.8-12.8-12.8h-38.4c-6.4 0-12.8 6.4-12.8 12.8v230.4c0 6.4 6.4 12.8 12.8 12.8zm-288 0h38.4c6.4 0 12.8-6.4 12.8-12.8v-70.4c0-6.4-6.4-12.8-12.8-12.8h-38.4c-6.4 0-12.8 6.4-12.8 12.8v70.4c0 6.4 6.4 12.8 12.8 12.8zm96 0h38.4c6.4 0 12.8-6.4 12.8-12.8V108.8c0-6.4-6.4-12.8-12.8-12.8h-38.4c-6.4 0-12.8 6.4-12.8 12.8v198.4c0 6.4 6.4 12.8 12.8 12.8zM496 384H64V80c0-8.84-7.16-16-16-16H16C7.16 64 0 71.16 0 80v336c0 17.67 14.33 32 32 32h464c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16z"></path></svg> I have done a fair bit of data analysis, visualizations and some programming. 
-- <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 448H48c-26.51 0-48-21.49-48-48V112c0-26.51 21.49-48 48-48h416c26.51 0 48 21.49 48 48v288c0 26.51-21.49 48-48 48zM112 120c-30.928 0-56 25.072-56 56s25.072 56 56 56 56-25.072 56-56-25.072-56-56-56zM64 384h384V272l-87.515-87.515c-4.686-4.686-12.284-4.686-16.971 0L208 320l-55.515-55.515c-4.686-4.686-12.284-4.686-16.971 0L64 336v48z"></path></svg> I like pictures. --- class: inverse, top background-image: url("data:image/png;base64,#figs/mip_env_country.png") background-size: contain # Here is a picture that I like (done in R) --- class: inverse, top background-image: url("data:image/png;base64,#http://dimiter.eu/thumb/DSC_3484.jpg") background-size: contain # Not done in R (yes, there's [more](http://dimiter.eu/Photos.html) where that came from <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm0 448c-110.3 0-200-89.7-200-200S137.7 56 248 56s200 89.7 200 200-89.7 200-200 200zm117.8-146.4c-10.2-8.5-25.3-7.1-33.8 3.1-20.8 25-51.5 39.4-84 39.4s-63.2-14.3-84-39.4c-8.5-10.2-23.7-11.5-33.8-3.1-10.2 8.5-11.5 23.6-3.1 33.8 30 36 74.1 56.6 120.9 56.6s90.9-20.6 120.9-56.6c8.5-10.2 7.1-25.3-3.1-33.8zM168 240c17.7 0 32-14.3 32-32s-14.3-32-32-32-32 14.3-32 32 14.3 32 32 32zm160-60c-25.7 0-55.9 16.9-59.9 42.1-1.7 11.2 11.5 18.2 19.8 10.8l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c8.5 7.4 21.6.3 19.8-10.8-3.8-25.2-34-42.1-59.7-42.1z"></path></svg>) --- class: left # What is R? -- ## R is the best thing since sliced bread, -- <img src="https://cran.r-project.org/Rlogo.svg" width="300"> <img src="https://www.abelandcole.co.uk/media/3029_15898_z.jpg" width="300"> -- class: left ## only much better, because... 
-- .left[unlike bread, it combines *functional* with *object-oriented* programming,] -- .left[and it is not sliced, but modular, so that it can be easily extended with new packages.] --- # Downsides of R -- It is called R. -- It comes with no customer support. -- It regularly leads people into arguments about how good it is. -- Error messages are rarely informative. -- It is not made for data entry (though if you insist, look [here](https://www.sebastianvanbaalen.se/uploads/tutorial-data-collection-app)). --- # What can you do with R? (1) ## Fun things, such as: - Run (m)any statistical models, including Bayesian models (e.g. with [STAN](https://mc-stan.org/users/interfaces/rstan)) - Do automated text analysis (e.g. with [quanteda](https://quanteda.io/)) and machine learning (e.g. with [tensorflow](https://cran.r-project.org/web/packages/tensorflow/index.html)) - Interact with Large Language Models (LLMs) (e.g. with [chattr](https://mlverse.github.io/chattr/) and [tidychatmodels](https://albert-rapp.de/posts/20_tidychatmodels/20_tidychatmodels)) --- # What can you do with R? (2) ## Access data: - From any type of file (Stata, SPSS, Excel, text, etc.) - Directly from the web via APIs (e.g. [World Bank](https://cran.r-project.org/web/packages/wbstats/vignettes/wbstats.html)) - Scrape complex internet sites and databases (e.g. [EUR-Lex](https://eur-lex.europa.eu/)) --- # What can you do with R? 
(3) ## Do other important things, such as: - Well-formatted conference programs from excel sheets <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm0 448c-110.3 0-200-89.7-200-200S137.7 56 248 56s200 89.7 200 200-89.7 200-200 200zm0-176c-35.3 0-64 28.7-64 64s28.7 64 64 64 64-28.7 64-64-28.7-64-64-64zm-48-72c0-17.7-14.3-32-32-32s-32 14.3-32 32 14.3 32 32 32 32-14.3 32-32zm128-32c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32z"></path></svg> - Presentations like this one (with `RMarkdown` and [xaringan](https://cran.r-project.org/web/packages/xaringan/index.html)) <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M278.9 511.5l-61-17.7c-6.4-1.8-10-8.5-8.2-14.9L346.2 8.7c1.8-6.4 8.5-10 14.9-8.2l61 17.7c6.4 1.8 10 8.5 8.2 14.9L293.8 503.3c-1.9 6.4-8.5 10.1-14.9 8.2zm-114-112.2l43.5-46.4c4.6-4.9 4.3-12.7-.8-17.2L117 256l90.6-79.7c5.1-4.5 5.5-12.3.8-17.2l-43.5-46.4c-4.5-4.8-12.1-5.1-17-.5L3.8 247.2c-5.1 4.7-5.1 12.8 0 17.5l144.1 135.1c4.9 4.6 12.5 4.4 17-.5zm327.2.6l144.1-135.1c5.1-4.7 5.1-12.8 0-17.5L492.1 112.1c-4.8-4.5-12.4-4.3-17 .5L431.6 159c-4.6 4.9-4.3 12.7.8 17.2L523 256l-90.6 79.7c-5.1 4.5-5.5 12.3-.8 17.2l43.5 46.4c4.5 4.9 12.1 5.1 17 .6z"></path></svg> - Practice open reproducible science <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g groupmode="layer" id="layer6" label="icon"> <path id="path90" style="fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:0.0352777" d="M 255.99349,8.0446377 C 223.80996,6.8710755 198.88846,29.106812 171.90108,43.078975 135.19601,64.83668 97.398702,84.909644 61.424592,107.79531 c -23.529448,19.12486 -34.340913,49.89944 -31.869027,79.61823 0.136467,52.69231 -0.273926,105.39637 
0.206898,158.08127 3.046554,31.90013 25.166618,58.97257 53.630746,72.31691 45.693451,26.22554 91.119371,52.93874 136.981251,78.85735 29.15402,13.29373 63.66608,7.7032 89.45702,-10.28262 45.55933,-26.46403 91.40188,-52.46098 136.78286,-79.21917 26.10444,-18.58601 38.49713,-51.27731 35.81722,-82.60095 -0.13644,-52.69193 0.27402,-105.39561 -0.20674,-158.08014 C 479.17554,134.58854 457.05921,107.51038 428.59249,94.170602 382.89744,67.950466 337.46312,41.255709 291.60404,15.334368 280.50454,10.177688 268.1747,8.0789301 255.99349,8.0446377 Z M 227.29819,140.73479 c 19.13338,0 38.26677,0 57.40015,0 0,76.83994 0,153.67988 0,230.51981 -19.13338,0 -38.26677,0 -57.40015,0 0,-76.83993 0,-153.67987 0,-230.51981 z m 72.23096,50.17436 c 19.12396,0 38.2479,0 57.37188,0 0,60.11515 0,120.2303 0,180.34545 -19.12398,0 -38.24792,0 -57.37188,0 0,-60.11515 0,-120.2303 0,-180.34545 z m -144.45251,94.07924 c 19.13025,0 38.26049,0 57.39074,0 0,28.7554 0,57.51081 0,86.26621 -19.13025,0 -38.26049,0 -57.39074,0 0,-28.7554 0,-57.51081 0,-86.26621 z"></path> </g></svg> <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g groupmode="layer" id="layer6" label="icon"> <path id="path86" style="fill-opacity:1;fill-rule:nonzero;stroke:none;stroke-width:0.0352777" d="m 255.99051,7.9999992 c -14.45957,0 -28.91558,3.1667158 -39.88958,9.5105468 L 69.397238,102.20596 C 47.455739,114.88072 29.50769,145.96034 29.50769,171.29036 v 169.41925 c 0,25.33653 17.948049,56.42568 39.889548,69.0939 l 146.703692,84.71437 c 10.974,6.32444 25.4365,9.48212 39.88958,9.48212 14.46596,0 28.92849,-3.15768 39.88953,-9.48212 l 146.72261,-84.71437 c 21.9352,-12.66822 39.88966,-43.75737 39.88966,-69.0939 V 171.29036 c 0,-25.33002 -17.95446,-56.40964 -39.88966,-69.0844 L 295.88952,17.510546 c -10.974,-6.343831 -25.43305,-9.5105468 -39.89901,-9.5105468 z m -58.05822,145.7374608 58.06766,22.86711 58.09612,-22.86711 100.07901,40.79892 -58.06755,22.85762 V 
317.49208 L 255.99995,358.27204 155.90196,317.49208 V 217.394 L 97.815363,194.53638 Z M 155.90196,217.394 255.99995,258.16451 356.10753,217.394 255.99995,176.60457 Z"></path> </g></svg> <svg viewBox="0 0 320 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g groupmode="layer" id="layer6" label="icon"> <path style="stroke-width:0.07717" d="m 234.55878,151.33351 c -0.0147,0 -0.0257,0.002 -0.0414,0.002 L 224.16976,96.812862 c 15.38213,-7.702378 25.99025,-23.574443 25.99025,-41.918293 0,-25.857997 -21.03653,-46.8945659 -46.89284,-46.8945659 -25.85803,0 -46.89911,21.0365689 -46.89911,46.8968339 0,25.858034 21.04053,46.896833 46.89911,46.896833 0.64873,0 1.28333,-0.0722 1.9281,-0.0996 l 9.97184,52.5411 c -27.25948,8.32453 -47.14885,33.69486 -47.14885,63.6431 0,22.04302 10.81663,41.55831 27.37532,53.67526 l -21.77855,36.87756 c -14.95433,-8.29345 -32.13858,-13.03243 -50.4163,-13.03243 -57.513449,0 -104.302352,46.78948 -104.302352,104.30066 0,57.51118 46.78947,104.30067 104.302352,104.30067 57.51286,0 104.30234,-46.78949 104.30234,-104.30067 0,-32.21545 -14.68591,-61.05947 -37.71052,-80.20745 l 22.92967,-38.82661 c 6.853,2.39098 14.18125,3.75738 21.84068,3.75738 36.69162,0 66.54272,-29.85103 66.54272,-66.54265 0,-36.69561 -29.85278,-66.54664 -66.5444,-66.54664 z m -26.61074,248.3654 c 0,46.72732 -38.01848,84.74581 -84.74975,84.74581 -46.731246,0 -84.749743,-38.01849 -84.749743,-84.74581 0,-46.72957 38.018497,-84.7475 84.749743,-84.7475 46.73127,0 84.74975,38.0185 84.74975,84.7475 z M 203.26732,85.497762 c -16.8746,0 -30.6032,-13.726911 -30.6032,-30.600925 0,-16.874011 13.7286,-30.600923 30.6032,-30.600923 16.87455,0 30.59923,13.726912 30.59923,30.600923 0,16.874014 -13.72695,30.600925 -30.59923,30.600925 z m 31.29146,179.370208 c -25.90998,0 -46.9895,-21.0801 -46.9895,-46.98954 0,-25.91171 21.08007,-46.98951 46.9895,-46.98951 25.90944,0 46.98955,21.08007 46.98955,46.98951 0.003,25.91002 -21.08011,46.98954 
-46.98955,46.98954 z" id="path2"></path> </g></svg> with [`Quarto`](https://quarto.org/). --- # Art with R, by Katharina Brunner .center[] --- # How can you learn to use R? -- ## 1. Get a good foundation <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 32H48C21.49 32 0 53.49 0 80v352c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V80c0-26.51-21.49-48-48-48zM224 416H64V160h160v256zm224 0H288V160h160v256z"></path></svg> -- ## 2. Learn by doing 2.1 with lots of support from the R community on blogs and StackOverflow <svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g label="icon" id="layer6" groupmode="layer"> <path id="path4" d="M 252.08252,64.000002 224.59395,84.395607 326.58056,221.8558 354.06915,201.4602 Z m -63.40386,54.985808 -21.72932,26.15488 131.69229,109.52992 21.72931,-26.16352 z m -50.55161,70.9472 -14.18598,31.4811 155.197,72.27227 14.18598,-31.0394 z m -29.26399,77.59851 -7.093,33.69819 167.60759,35.0319 7.09299,-33.70684 z M 29.930845,310.98148 V 448 H 338.10776 V 310.98148 H 303.96787 V 413.86011 H 64.079385 V 310.98148 Z m 68.28841,34.14854 V 379.2699 H 269.37765 v -34.13988 z" style="stroke:none;stroke-width:1"></path> </g></svg> 2.2 adapting other people's code <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 
21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> 2.3 and asking LLMs for help <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M144 208c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zm112 0c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zm112 0c-17.7 0-32 14.3-32 32s14.3 32 32 32 32-14.3 32-32-14.3-32-32-32zM256 32C114.6 32 0 125.1 0 240c0 47.6 19.9 91.2 52.9 126.3C38 405.7 7 439.1 6.5 439.5c-6.6 7-8.4 17.2-4.6 26S14.4 480 24 480c61.5 0 110-25.7 139.1-46.3C192 442.8 223.2 448 256 448c141.4 0 256-93.1 256-208S397.4 32 256 32zm0 368c-26.7 0-53.1-4.1-78.4-12.1l-22.7-7.2-19.5 13.8c-14.3 10.1-33.9 21.4-57.5 29 7.3-12.1 14.4-25.7 19.9-40.2l10.6-28.1-20.6-21.8C69.7 314.1 48 282.2 48 240c0-88.2 93.3-160 208-160s208 71.8 208 160-93.3 160-208 160z"></path></svg> -- ## 3. 
You can also get good old-fashioned books <svg viewBox="0 0 448 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g label="icon" id="layer6" groupmode="layer"> <path style="stroke-width:1" d="m 341.26842,32 c 0,15.556013 0,28.718795 0,43.277631 14.55884,-6.381952 27.32275,-13.76109 42.54638,-19.01291 0,17.018549 0,33.039921 0,49.526639 14.49235,-1.99436 27.58866,-4.9859 41.61568,-5.58421 0,98.78735 0,196.9764 0,295.69728 C 358.08755,424.02493 290.94406,452.01246 223.80057,480 156.65707,452.01246 89.84597,424.02493 22.56952,395.90443 c 0,-98.45496 0,-196.71049 0,-295.963191 12.96335,1.728441 25.52783,3.390401 39.42187,5.251811 0,-16.420241 0,-32.175692 0,-48.928329 15.42306,5.25182 28.31993,12.498002 43.14469,19.544744 0,-14.42588 0,-27.921057 0,-43.809465 48.46296,35.964976 90.14511,74.5891 118.19913,126.97432 C 251.18978,106.5891 293.07137,68.164415 341.26842,32 Z M 117.23519,55.732896 c -1.06366,91.740604 -2.06084,181.486864 -3.05802,270.435374 40.55201,32.30865 76.98234,66.87758 104.43804,110.55409 0,-2.79211 0,-5.58422 0,-8.4428 0,-78.44487 -0.13295,-156.88974 0.19944,-235.33461 0.0665,-10.76956 -2.32676,-20.40897 -7.11322,-29.78247 C 189.76346,120.01781 155.52692,87.642676 117.23519,55.732896 Z M 34.53568,388.05994 c 57.83647,24.19825 115.2076,48.19707 174.5066,72.99362 -8.90814,-12.03264 -17.61685,-20.94078 -27.18979,-29.05119 -33.63822,-28.51936 -72.32883,-48.3965 -112.88085,-64.81674 -3.85575,-1.59549 -11.23489,-4.45407 -11.23489,-4.45407 1.2631,-80.37275 2.52619,-163.87001 3.78928,-245.04051 l -26.99035,-4.25464 c 0,91.40822 0,182.88293 0,274.62353 z M 74.09051,74.280455 C 72.5615,168.14839 71.09897,261.28505 69.63643,354.55468 114.3766,372.50393 156.32467,394.4419 193.6857,424.55675 168.88915,387.52812 136.18162,358.5434 102.34396,331.81896 103.20818,250.18311 104.07241,169.74387 104.93663,89.437599 94.76538,83.853384 85.25893,78.668046 74.09051,74.280455 Z M 227.85575,436.72236 c 
26.32558,-44.80665 64.88322,-78.91022 104.57102,-110.62057 -0.99719,-89.8792 -1.99436,-179.09363 -2.99154,-270.634806 -9.97181,8.642226 -18.61404,15.755451 -26.72444,23.334019 -27.58866,25.594297 -52.98353,53.050007 -69.27082,87.618937 -3.05802,6.44844 -5.78365,12.89687 -5.78365,20.47544 z m 113.34621,-18.48108 c 23.99881,-10.03829 47.99763,-20.07657 72.32883,-30.31429 0,-91.60765 0,-182.88292 0,-274.82297 -9.77238,1.66197 -19.14587,3.19098 -29.0512,4.85294 1.2631,81.43642 2.45972,164.93367 3.72281,245.23995 l -9.30702,3.25745 c -17.61686,6.84732 -34.76837,14.62532 -51.38805,23.59994 -31.31147,16.81912 -61.02744,35.69907 -84.29499,63.22126 l -5.51774,7.778 z m -91.34175,10.17124 c 0.73126,-0.73126 1.26309,-1.19661 1.72844,-1.59549 0.86423,-0.73127 1.72845,-1.46253 2.5262,-2.26028 36.36386,-30.5802 77.97951,-52.25226 122.18786,-70.00207 -1.39605,-93.40258 -2.79211,-186.53925 -4.25463,-280.739576 -10.90252,5.384776 -21.14023,9.439972 -30.3143,15.821934 0.86423,80.838102 1.79494,161.343812 2.65916,241.849522 -11.36786,9.57294 -22.8022,18.68052 -33.63823,28.4529 -21.33966,19.21234 -41.81511,39.35539 -57.83647,63.48716 z" id="path2"></path> </g></svg> --- # Recommended books <img src="https://d33wubrfki0l68.cloudfront.net/b88ef926a004b0fce72b2526b0b5c4413666a4cb/24a30/cover.png" width="60"> [R for Data Science](https://amzn.to/38NYJUr), ([link to a free version](https://r4ds.had.co.nz/)) <img src="https://images.tandf.co.uk/common/jackets/amazon/978081538/9780815384571.jpg" width="60"> [Advanced R](https://amzn.to/2RDFBCM), ([link to a free version](http://adv-r.had.co.nz/)) <img src="https://images-na.ssl-images-amazon.com/images/I/41No%2BZCNEZL._SX379_BO1,204,203,200_.jpg" width="60"> [R Cookbook](https://amzn.to/2RE5hPT), ([link to a free version](http://www.cookbook-r.com/)) <img src="https://m.media-amazon.com/images/I/71iHxb-EMeL._SL1500_.jpg" width="60"> [The R Book](https://amzn.to/2GxmDrh), ([link to a free 
version](https://www.cs.upc.edu/~robert/teaching/estadistica/TheRBook.pdf)) --- # Additional resources [An introduction to R by the R Core Team (pdf)](https://cran.r-project.org/doc/manuals/r-release/R-intro.pdf) [R Studio Primers](https://rstudio.cloud/learn/primers) [A very short intro to some of R's programming features](https://www.johndcook.com/blog/r_language_for_programmers/) [Learn R interactively in R, with Swirl](https://swirlstats.com/) [A free introductory online course at DataCamp](https://www.datacamp.com/courses/free-introduction-to-r) [An intermediate course at DataCamp](https://www.datacamp.com/courses/intermediate-r) --- # Even more resources [Companion to the book 'Data analysis for social science'](https://ellaudet.github.io/dss_instructor_resources/) [A short introductory tutorial by Chris Hanretty](https://chrishanretty.co.uk/conveRt/#1) [Another great introduction for political scientists](https://m-freitag.github.io/intro-r-polsci/) [YaRrr! The Pirate’s Guide to R](https://bookdown.org/ndphillips/YaRrr/) --- # What can you expect from this tutorial? 
-- <svg viewBox="0 0 352 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M176 80c-52.94 0-96 43.06-96 96 0 8.84 7.16 16 16 16s16-7.16 16-16c0-35.3 28.72-64 64-64 8.84 0 16-7.16 16-16s-7.16-16-16-16zM96.06 459.17c0 3.15.93 6.22 2.68 8.84l24.51 36.84c2.97 4.46 7.97 7.14 13.32 7.14h78.85c5.36 0 10.36-2.68 13.32-7.14l24.51-36.84c1.74-2.62 2.67-5.7 2.68-8.84l.05-43.18H96.02l.04 43.18zM176 0C73.72 0 0 82.97 0 176c0 44.37 16.45 84.85 43.56 115.78 16.64 18.99 42.74 58.8 52.42 92.16v.06h48v-.12c-.01-4.77-.72-9.51-2.15-14.07-5.59-17.81-22.82-64.77-62.17-109.67-20.54-23.43-31.52-53.15-31.61-84.14-.2-73.64 59.67-128 127.95-128 70.58 0 128 57.42 128 128 0 30.97-11.24 60.85-31.65 84.14-39.11 44.61-56.42 91.47-62.1 109.46a47.507 47.507 0 0 0-2.22 14.3v.1h48v-.05c9.68-33.37 35.78-73.18 52.42-92.16C335.55 260.85 352 220.37 352 176 352 78.8 273.2 0 176 0z"></path></svg> Get started and get inspired. -- <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 32H48C21.49 32 0 53.49 0 80v352c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V80c0-26.51-21.49-48-48-48zM224 416H64V160h160v256zm224 0H288V160h160v256z"></path></svg> Get a good foundation, hopefully. 
-- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M208 352c-2.39 0-4.78.35-7.06 1.09C187.98 357.3 174.35 360 160 360c-14.35 0-27.98-2.7-40.95-6.91-2.28-.74-4.66-1.09-7.05-1.09C49.94 352-.33 402.48 0 464.62.14 490.88 21.73 512 48 512h224c26.27 0 47.86-21.12 48-47.38.33-62.14-49.94-112.62-112-112.62zm-48-32c53.02 0 96-42.98 96-96s-42.98-96-96-96-96 42.98-96 96 42.98 96 96 96zM592 0H208c-26.47 0-48 22.25-48 49.59V96c23.42 0 45.1 6.78 64 17.8V64h352v288h-64v-64H384v64h-76.24c19.1 16.69 33.12 38.73 39.69 64H592c26.47 0 48-22.25 48-49.59V49.59C640 22.25 618.47 0 592 0z"></path></svg> Learn enough so you can continue learning on your own. --- # Organization of the meetings .pull-left[ ## Session 1: Introduction to R - Workflow - Fundamentals, objects and functions - Conditional evaluation and loops ## Session 2: Data wrangling - Importing data - Restructuring datasets - Recoding variables - Merging and exporting data ] .pull-right[ ## Session 3: Data analysis - Data summary and simple linear models - Generalized linear models (logistic regression) - Multilevel models - Generating tables ## Session 4: Data visualization - with `plot` - with `ggplot2` ] --- # By the end of the tutorial... ## You should not think about working with any other software for your data work<sup>1</sup>. .footnote[ [1] Unless you have to work with the uninitiated. 
] --- # Let's get started (1) -- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M257.981 272.971L63.638 467.314c-9.373 9.373-24.569 9.373-33.941 0L7.029 444.647c-9.357-9.357-9.375-24.522-.04-33.901L161.011 256 6.99 101.255c-9.335-9.379-9.317-24.544.04-33.901l22.667-22.667c9.373-9.373 24.569-9.373 33.941 0L257.981 239.03c9.373 9.372 9.373 24.568 0 33.941zM640 456v-32c0-13.255-10.745-24-24-24H312c-13.255 0-24 10.745-24 24v32c0 13.255 10.745 24 24 24h304c13.255 0 24-10.745 24-24z"></path></svg> We can work with R directly (from the console/terminal), but it would be nice if we could save our work somehow. -- <svg viewBox="0 0 448 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M433.941 65.941l-51.882-51.882A48 48 0 0 0 348.118 0H176c-26.51 0-48 21.49-48 48v48H48c-26.51 0-48 21.49-48 48v320c0 26.51 21.49 48 48 48h224c26.51 0 48-21.49 48-48v-48h80c26.51 0 48-21.49 48-48V99.882a48 48 0 0 0-14.059-33.941zM266 464H54a6 6 0 0 1-6-6V150a6 6 0 0 1 6-6h74v224c0 26.51 21.49 48 48 48h96v42a6 6 0 0 1-6 6zm128-96H182a6 6 0 0 1-6-6V54a6 6 0 0 1 6-6h106v88c0 13.255 10.745 24 24 24h88v202a6 6 0 0 1-6 6zm6-256h-64V48h9.632c1.591 0 3.117.632 4.243 1.757l48.368 48.368a6 6 0 0 1 1.757 4.243V112z"></path></svg> We can use any text editor and *copy and paste* the code, but this gets boring pretty quickly. 
-- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M255.03 261.65c6.25 6.25 16.38 6.25 22.63 0l11.31-11.31c6.25-6.25 6.25-16.38 0-22.63L253.25 192l35.71-35.72c6.25-6.25 6.25-16.38 0-22.63l-11.31-11.31c-6.25-6.25-16.38-6.25-22.63 0l-58.34 58.34c-6.25 6.25-6.25 16.38 0 22.63l58.35 58.34zm96.01-11.3l11.31 11.31c6.25 6.25 16.38 6.25 22.63 0l58.34-58.34c6.25-6.25 6.25-16.38 0-22.63l-58.34-58.34c-6.25-6.25-16.38-6.25-22.63 0l-11.31 11.31c-6.25 6.25-6.25 16.38 0 22.63L386.75 192l-35.71 35.72c-6.25 6.25-6.25 16.38 0 22.63zM624 416H381.54c-.74 19.81-14.71 32-32.74 32H288c-18.69 0-33.02-17.47-32.77-32H16c-8.8 0-16 7.2-16 16v16c0 35.2 28.8 64 64 64h512c35.2 0 64-28.8 64-64v-16c0-8.8-7.2-16-16-16zM576 48c0-26.4-21.6-48-48-48H112C85.6 0 64 21.6 64 48v336h512V48zm-64 272H128V64h384v256z"></path></svg> So we use programs such as [**RStudio**](https://posit.co/download/rstudio-desktop/) that integrate a text editor linked to **R** and some other nice features. -- <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M278.9 511.5l-61-17.7c-6.4-1.8-10-8.5-8.2-14.9L346.2 8.7c1.8-6.4 8.5-10 14.9-8.2l61 17.7c6.4 1.8 10 8.5 8.2 14.9L293.8 503.3c-1.9 6.4-8.5 10.1-14.9 8.2zm-114-112.2l43.5-46.4c4.6-4.9 4.3-12.7-.8-17.2L117 256l90.6-79.7c5.1-4.5 5.5-12.3.8-17.2l-43.5-46.4c-4.5-4.8-12.1-5.1-17-.5L3.8 247.2c-5.1 4.7-5.1 12.8 0 17.5l144.1 135.1c4.9 4.6 12.5 4.4 17-.5zm327.2.6l144.1-135.1c5.1-4.7 5.1-12.8 0-17.5L492.1 112.1c-4.8-4.5-12.4-4.3-17 .5L431.6 159c-4.6 4.9-4.3 12.7.8 17.2L523 256l-90.6 79.7c-5.1 4.5-5.5 12.3-.8 17.2l43.5 46.4c4.5 4.9 12.1 5.1 17 .6z"></path></svg> We write in a text file (script) and send commands using `Ctrl+Enter` to be executed in the console. --- # Let's get started (2) You can customize the appearance of code in *RStudio*. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** Use an *RStudio* theme that highlights code (I use *Pastel on Dark*). <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** Turn on *rainbow brackets*. <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** *RStudio* has useful shortcuts (see them all with `Alt+Shift+K`). Learn and use some of them (e.g. `Ctrl+1`/`Ctrl+2`). 
--- # General grammatical features of R as a language <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Functions have arguments in parentheses, separated by commas (unlike in Excel). <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Capitalization of object and function names matters. <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> Extra spaces and indentation in your code do not matter (unlike in Python). But spaces do matter inside object and function names, where they are not allowed. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M173.898 439.404l-166.4-166.4c-9.997-9.997-9.997-26.206 0-36.204l36.203-36.204c9.997-9.998 26.207-9.998 36.204 0L192 312.69 432.095 72.596c9.997-9.997 26.207-9.997 36.204 0l36.203 36.204c9.997 9.997 9.997 26.206 0 36.204l-294.4 294.401c-9.998 9.997-26.207 9.997-36.204-.001z"></path></svg> You can use single or double quotation marks, also nested within each other. But be careful to close like with like. --- # Files and projects (1) -- <svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M369.9 97.9L286 14C277 5 264.8-.1 252.1-.1H48C21.5 0 0 21.5 0 48v416c0 26.5 21.5 48 48 48h288c26.5 0 48-21.5 48-48V131.9c0-12.7-5.1-25-14.1-34zM332.1 128H256V51.9l76.1 76.1zM48 464V48h160v104c0 13.3 10.7 24 24 24h104v288H48z"></path></svg> We can start a file, write code, execute the code, and - when we are happy - we can save the file (with an `.r` extension, but it remains a text file that can be edited in any text editor). -- This might be all we need for very small, simple and individual projects. (Btw, where is our work?) ``` r ### Where are we? 
getwd() # oh, here setwd('C:/my projects/here') # better here ``` --- # Files and projects (2) <svg viewBox="0 0 640 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M128 352H32c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32zm-24-80h192v48h48v-48h192v48h48v-57.59c0-21.17-17.23-38.41-38.41-38.41H344v-64h40c17.67 0 32-14.33 32-32V32c0-17.67-14.33-32-32-32H256c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h40v64H94.41C73.23 224 56 241.23 56 262.41V320h48v-48zm264 80h-96c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32zm240 0h-96c-17.67 0-32 14.33-32 32v96c0 17.67 14.33 32 32 32h96c17.67 0 32-14.33 32-32v-96c0-17.67-14.33-32-32-32z"></path></svg> For more complex projects, you would want to start a *Project*. A project sets up the *working environment* and organizes things in a nice way. Within the project, you can (should) create separate folders for your code, input data, output data, plots, model results, and tables. The code itself can (should) be separated in smaller files (e.g. one for the libraries and functions that you use, one for data import and manipulation, one for statistical analyses, etc.). To start or continue working on a project, you click on the relevant `.Rproj` file, which loads the working environment. 
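A minimal sketch of such a folder layout, built from within R. The folder names and the `my_project` root are only illustrative assumptions (not a required convention), and everything is created inside a temporary directory so that nothing on your disk is touched:

```r
# Hypothetical project root inside a temporary directory (illustration only)
project_root <- file.path(tempdir(), "my_project")

# Hypothetical folder names: one per kind of file mentioned above
folders <- c("code", "data_input", "data_output", "plots", "models", "tables")

# Create each folder; recursive = TRUE also creates the project root itself
for (f in folders) {
  dir.create(file.path(project_root, f), recursive = TRUE, showWarnings = FALSE)
}

list.files(project_root)  # lists the folders we have just created
```

In a real project you would create these folders next to the `.Rproj` file rather than in `tempdir()`.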
--- # Files and projects (3) <svg viewBox="0 0 384 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M384 144c0-44.2-35.8-80-80-80s-80 35.8-80 80c0 36.4 24.3 67.1 57.5 76.8-.6 16.1-4.2 28.5-11 36.9-15.4 19.2-49.3 22.4-85.2 25.7-28.2 2.6-57.4 5.4-81.3 16.9v-144c32.5-10.2 56-40.5 56-76.3 0-44.2-35.8-80-80-80S0 35.8 0 80c0 35.8 23.5 66.1 56 76.3v199.3C23.5 365.9 0 396.2 0 432c0 44.2 35.8 80 80 80s80-35.8 80-80c0-34-21.2-63.1-51.2-74.6 3.1-5.2 7.8-9.8 14.9-13.4 16.2-8.2 40.4-10.4 66.1-12.8 42.2-3.9 90-8.4 118.2-43.4 14-17.4 21.1-39.8 21.6-67.9 31.6-10.8 54.4-40.7 54.4-75.9zM80 64c8.8 0 16 7.2 16 16s-7.2 16-16 16-16-7.2-16-16 7.2-16 16-16zm0 384c-8.8 0-16-7.2-16-16s7.2-16 16-16 16 7.2 16 16-7.2 16-16 16zm224-320c8.8 0 16 7.2 16 16s-7.2 16-16 16-16-7.2-16-16 7.2-16 16-16z"></path></svg> Use relative, not absolute paths in your scripts to make collaborative work easier. ``` r ### There are good and bad paths './data/nl/zh/dh/students.csv' # this is a good path 'C:/data/nl/zh/dh/students.csv' # this is a bad path ``` -- Some additional advice on setting up projects is available [here](https://martinctc.github.io/blog/rstudio-projects-and-working-directories-a-beginner's-guide/). We will say more about workflow (with `GitHub` and `RMarkdown`) later in this tutorial. 
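One possible way to write relative paths that work on any operating system is to build them with `file.path()`. This is only a sketch: the folder names are taken from the example above, and the file itself is hypothetical.

```r
# Build a relative path from its parts (no drive letter, no user name)
students_path <- file.path("data", "nl", "zh", "dh", "students.csv")
students_path

# The relative path is resolved against the current working directory
full_path <- file.path(getwd(), students_path)
full_path

# A quick check before importing: is the file where we think it is?
file.exists(students_path)
```

Because the path starts from the project folder, it keeps working when a collaborator opens the same project on a different computer.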
--- # Some good practices -- <svg viewBox="0 0 576 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M402.3 344.9l32-32c5-5 13.7-1.5 13.7 5.7V464c0 26.5-21.5 48-48 48H48c-26.5 0-48-21.5-48-48V112c0-26.5 21.5-48 48-48h273.5c7.1 0 10.7 8.6 5.7 13.7l-32 32c-1.5 1.5-3.5 2.3-5.7 2.3H48v352h352V350.5c0-2.1.8-4.1 2.3-5.6zm156.6-201.8L296.3 405.7l-90.4 10c-26.2 2.9-48.5-19.2-45.6-45.6l10-90.4L432.9 17.1c22.9-22.9 59.9-22.9 82.7 0l43.2 43.2c22.9 22.9 22.9 60 .1 82.8zM460.1 174L402 115.9 216.2 301.8l-7.3 65.3 65.3-7.3L460.1 174zm64.8-79.7l-43.2-43.2c-4.1-4.1-10.8-4.1-14.8 0L436 82l58.1 58.1 30.9-30.9c4-4.2 4-10.8-.1-14.9z"></path></svg> Take the time to annotate your code (using `#` to start a segment of a line that is not executed as code). -- Think about the names of files and variables that you use. Have a system and be consistent. You can use `.`, or `_`, or capital letters, but stick to one. ``` r ### How (not) to name your variables data.nl.denhaag.bezuidenhout # this is fine data_nl_denhaag_bezuidenhout # this is also fine DataNlDenhaagBezuidenhout # this is not so fine data.Nl_DenHaag_.bezuidenhout # this is definitely not fine ``` --- # Some more good practices -- Think about how you name your scripts and other files as well. <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** Use `00_libraries.R`, `01_firstanalysis.R`, etc. 
to name your scripts in the order that they should be executed, so you can quickly sort them within the folder alphabetically. --- # Modularity R works with packages. The default installation comes with basic functionality. For everything else, you install a package. Often, several packages can achieve the same task. There is a special universe of packages called [`tidyverse`](https://www.tidyverse.org/), developed by Hadley Wickham and company, which provides a convenient way to load, wrangle, analyze and visualize data. We will use these a lot. --- # Working with packages (1) Working with packages is easy: - First, you have to install the package, from a **CRAN** repository, or from zip files, or via `devtools`. You can install with a command or from the **RStudio** menu. You install a package once on a computer (you might need to update it every now and then). - Once the package is installed, you will want to load it with the `library()` function to use its functions. You have to load the package every session (if you need it, of course). - You can also directly specify functions from packages for use, e.g. `dplyr::recode()`. This is necessary because different functions in different packages can have the same name. This leads to confusion, both for `R` and for us. 
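The `package::function()` notation from the last point can be sketched as follows. The example first uses `stats`, which ships with every R installation; the `dplyr` part is guarded by a check, on the assumption that `dplyr` may or may not be installed on your computer.

```r
# Call a function through its package, without library()
med <- stats::median(c(1, 5, 9))
med

# Use dplyr::recode() only if dplyr is installed (nothing is installed here)
if (requireNamespace("dplyr", quietly = TRUE)) {
  recoded <- dplyr::recode(c("a", "b"), a = "alpha", b = "beta")
  print(recoded)
} else {
  message("dplyr is not installed, skipping the recode() example")
}
```

Writing the package name explicitly makes it clear, to R and to the reader, which `recode()` is meant.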
--- # Working with packages (2) ``` r ### How to install and load a package install.packages('dplyr') library (dplyr) ``` -- <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** If you work with people who would not know how to install a package but would want to run your code, you can start your code with a function that will install and load packages automatically (see [here](https://stackoverflow.com/questions/4090169/elegant-way-to-check-for-missing-packages-and-install-them)) <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** Don't do that with people who know their way around R. They don't like your script installing things without their authorization. --- # Assignment operators (1) Perhaps the most fundamental operation in R is to assign a value to a named object: `object_name <- value`. Be careful, R is sensitive; *case sensitive*, that is. You can be old school<sup>1</sup> and assign values to names with `<-`. Or you can just use `=`. And if you are that cool, you can also use `->`. 
.footnote[ [1] "There is a general preference among the R community for using `<-` for assignment (other than in function signatures) for compatibility with (very) old versions of S-Plus."] --- # Assignment operators (2) ``` r ### There are different ways to assign best.month <- 'August' best.date = 18 1978 -> best.year best.date ## [1] 18 best.month ## [1] "August" best.year ## [1] 1978 ``` --- # Assignment operators (3) There are some subtle differences among the different assignment operators; if you are interested, read [here](https://stackoverflow.com/questions/1741820/what-are-the-differences-between-and-assignment-operators-in-r). You also have the assignment operator `<<-`. This is most useful *'in conjunction with closures to maintain state'*. Exactly. If you want to know more, read [here](https://stackoverflow.com/questions/2628621/how-do-you-use-scoping-assignment-in-r). --- # Vectors (1) Vectors are one-dimensional collections of objects. ``` r ### How to make and check a vector v1 <- seq (1, 50, by=5) v1 ## [1] 1 6 11 16 21 26 31 36 41 46 v2 <- c('R', 'pie', 5, NA) v2 ## [1] "R" "pie" "5" NA is.vector(v1) ## [1] TRUE is.vector (v2) ## [1] TRUE is.vector(c(is.vector(v1), is.vector(v2))) ## [1] TRUE ``` --- # Vectors (2) There are several different types of vectors: *logical*, *character*, *numeric* (which can be *double* or *integer*), *complex* and *raw*. *Factors* and *dates* are augmented vectors that have a special attribute, their *'class'*. ``` r ### What vectors? 
typeof(v1) ## [1] "double" typeof(v2) ## [1] "character" typeof(is.vector(c(is.vector(v1), is.vector(v2)))) ## [1] "logical" typeof(c("1", "2", "4")) ## [1] "character" ``` --- # More on vectors <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M496 448H16c-8.84 0-16 7.16-16 16v32c0 8.84 7.16 16 16 16h480c8.84 0 16-7.16 16-16v-32c0-8.84-7.16-16-16-16zm-304-64l-64-32 64-32 32-64 32 64 64 32-64 32-16 32h208l-86.41-201.63a63.955 63.955 0 0 1-1.89-45.45L416 0 228.42 107.19a127.989 127.989 0 0 0-53.46 59.15L64 416h144l-16-32zm64-224l16-32 16 32 32 16-32 16-16 32-16-32-32-16 32-16z"></path></svg> **Protip:** In R, numbers are 'doubles' by default. To make an 'integer', place an `L` after the number (e.g. `2L`). This can save some trouble down the road. Alternatively, use `round()` when evaluating. ``` r 0.3/3 == 0.1 # floating point bizarro ## [1] FALSE round(0.3/3,1) == 0.1 ## [1] TRUE unique(c(.3, .4 - .1, .5 - .2, .6 - .3, .7 - .4)) ## [1] 0.3 0.3 0.3 ``` Integers have one special value: `NA`, while doubles have four: `NA`, `NaN`, `Inf` and `-Inf`. ``` r c(-1, 0, 1) / 0 ## [1] -Inf NaN Inf ``` --- # Coercion (1) You can coerce one type of vector to another. But be gentle and beware the consequences. ``` r v1 <- c(1,2,4) typeof(v1) ## [1] "double" f1 <- as.factor(v1) f1 ## [1] 1 2 4 ## Levels: 1 2 4 n1 <- as.numeric(f1) is.numeric(n1) ## [1] TRUE n1 # OMG!!! ## [1] 1 2 3 n2 <- as.numeric(as.character(f1)) n2 # that's better! ## [1] 1 2 4 ``` --- # Coercion (2) Coercion happens without your help (and perhaps realization) as well, every time you mix vector elements of different types together. The most complex type prevails. 
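The ordering behind this rule can be sketched with `typeof()`: logical gives way to integer, integer to double, and double to character. A small illustration with made-up values:

```r
t1 <- typeof(c(TRUE, 1L))    # logical mixed with integer
t2 <- typeof(c(1L, 2.5))     # integer mixed with double
t3 <- typeof(c(2.5, "a"))    # double mixed with character
c(t1, t2, t3)

# The same mechanism lets us count TRUEs: they are coerced to 1
n_true <- sum(c(TRUE, FALSE, TRUE))
n_true
```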
``` r v1 <- 1:999 is.numeric(v1) ## [1] TRUE length(v1) # vectors have length ## [1] 999 v2 <- c(v1, '1000') is.numeric(v2) ## [1] FALSE typeof(v2) # it only takes one ## [1] "character" ``` --- # Character vectors Character vectors are the most complex type of atomic vector, because each element of a character vector is a string, and a string can contain an arbitrary amount of data. Working with strings and character vectors is very common in data analysis. There are a couple of very useful operations with strings that we should learn right away: ``` r v.char <- c("alpha", "beta", "gamma") substr(v.char, 1, 2) # get the first two letters of every element ## [1] "al" "be" "ga" nchar(v.char) # count the number of characters in each string ## [1] 5 4 5 toupper(v.char) # capitalize each element ## [1] "ALPHA" "BETA" "GAMMA" ``` The package [`stringr`](https://stringr.tidyverse.org/) has handy functions for more advanced operations. --- # Lists Lists, also called recursive vectors, can contain all kinds of things, including other lists. `y <- list("a", 1L, 1.5, TRUE)` Data frames are lists of a special class: `typeof(data.frame(NA))` `class(data.frame(NA))` --- # Navigating our objects There are different ways in which we can navigate to and access elements of our objects. -- We can do that by position or name: ``` r x <- rnorm(100, 0, 1) # let's generate some randomness y <- rnorm(100, 0, 1) # let's generate more randomness m <- cbind(x,y) # let's bind randomness together in a .... class(m) ## [1] "matrix" "array" dim(m) ## [1] 100 2 length(x) ## [1] 100 df<-data.frame(m) ``` --- # Navigation examples ``` r x[1] ## [1] 1.114262 m[1,1] ## x ## 1.114262 m[1:5, -2] ## [1] 1.1142618 -1.3311627 0.3256704 -0.3040167 -1.7666893 df[seq(1, 100, 10), "y"] ## [1] 0.92370516 -0.05881295 -0.20430290 -0.73554277 -0.69460705 0.38450683 ## [7] 0.37186241 0.31955549 0.34642647 0.46685266 ``` -- Navigating lists is more complicated. 
We have to use `x[[ n ]]` to get the n-th element of list `x`. That element itself could be anything (e.g. a data frame). --- # Some basic functions for summarizing data The first steps are easy: ``` r mean(x) ## [1] 0.05794814 sd(y) ## [1] 1.00284 quantile(m) ## 0% 25% 50% 75% 100% ## -3.68630633 -0.71101087 -0.02760762 0.57849092 2.58010017 range(df) ## [1] -3.686306 2.580100 summary(df) ## x y ## Min. :-2.01945 Min. :-3.686306 ## 1st Qu.:-0.75151 1st Qu.:-0.672038 ## Median :-0.07559 Median : 0.005507 ## Mean : 0.05795 Mean :-0.079793 ## 3rd Qu.: 0.76700 3rd Qu.: 0.449224 ## Max. : 2.58010 Max. : 2.385434 ``` --- # But it can get more tricky. Note that we can use the dollar sign `$` to access columns (variables) of a data frame. ``` r df <- rbind(df, c(NA,NA)) # bind rows together tail(df) ## x y ## 96 2.40618546 0.594138446 ## 97 0.06715917 0.276341899 ## 98 -0.39137443 1.235099081 ## 99 -0.34462524 0.425646241 ## 100 0.56221336 -0.003453708 ## 101 NA NA sum(df$x) # oops ## [1] NA sum(df$x, na.rm=TRUE) # ok, R is very careful with missing data ## [1] 5.794814 ``` --- # And more tricky ``` r sum(df) ## [1] NA df$z <- rowSums(df) tail(df, 2) ## x y z ## 100 0.5622134 -0.003453708 0.5587597 ## 101 NA NA NA df$z <- rowSums(df, na.rm=TRUE) tail(df, 2) ## x y z ## 100 0.5622134 -0.003453708 1.117519 ## 101 NA NA 0.000000 ``` --- # LOOPS (1) Loops are a fundamental programming technique, in which we iterate over a predefined sequence and apply a function to each element. ``` r for (i in 1:5){ print(round(df[i,], 2)) } ## x y z ## 1 1.11 0.92 4.08 ## x y z ## 2 -1.33 -2.71 -8.08 ## x y z ## 3 0.33 0.5 1.65 ## x y z ## 4 -0.3 0.37 0.14 ## x y z ## 5 -1.77 -1.51 -6.55 ``` Most R functions are vectorized, which means that we do not have to loop over the elements of a vector to apply the function to each element separately. Yet, in some cases loops can be handy. 
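To illustrate what 'vectorized' means, here is a small sketch that squares each element of a vector twice: once with a loop and once with the vectorized operator. The data and the seed are made up for the example.

```r
set.seed(42)                 # arbitrary seed, only for reproducibility
values <- rnorm(10)

# Loop version: square the elements one by one
squares_loop <- numeric(length(values))
for (i in seq_along(values)) {
  squares_loop[i] <- values[i]^2
}

# Vectorized version: the operator is applied to all elements at once
squares_vec <- values^2

all.equal(squares_loop, squares_vec)  # compare the two results
```

The vectorized version is shorter and usually faster, which is why it is preferred whenever a function supports it.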
--- # LOOPS (2) We can also create new objects in loops: ``` r for (i in seq_len(nrow(df))){ df$our.sum[i] <- sum(df[i,1:2], na.rm=TRUE) } df[c(1,100:101),] ``` ``` ## x y z our.sum ## 1 1.1142618 0.923705162 4.075934 2.0379670 ## 100 0.5622134 -0.003453708 1.117519 0.5587597 ## 101 NA NA 0.000000 0.0000000 ``` You can read more about loops [here](https://www.datacamp.com/community/tutorials/tutorial-on-loops-in-r). --- # Comparisons (evaluation) Sooner or later, we all become judgmental: These are the main comparison operators: `>`, `>=`, `<`, `<=`, `!=` (not equal), and `==` (equal). With logical operators, we can mix things up a bit: `&` is “and”, `|` is “or”, and `!` is “not”. Be careful with missing values: almost any operation involving an unknown value will also be unknown. --- # Comparisons (2) We can check for missing data: `is.na(x)` or even better `which(is.na(df$x))`. ``` r which(df$x > 1) ## [1] 1 6 8 11 13 15 28 30 31 37 39 42 47 48 69 72 75 76 92 96 w1 <- which(df$x > 1) length(w1) ## [1] 20 w2 <- which(df$y>1) length(w2) ## [1] 11 ``` Check whether the last row of `df` has elements greater than 1. --- # Conditionals (1) Conditional evaluation is another fundamental programming technique. ``` r if (this) { # do that } else if (that) { # do something else } else { # } ``` --- # Conditionals (2) For very short evaluations we can also use the `ifelse` one-liner: `ifelse(evaluate, do.this.if.true, do.this.if.false)`. These simple statements can be nested, but for anything complex it is better to use the extended form shown above. ``` r for (i in seq_along(df$x)){ if (!is.na(df$x[i]) && !is.na(df$y[i])) { df$out.sum2[i] <- sum(df[i,1:2]) } else { df$out.sum2[i] <- NA } } ``` --- # Functions Objects are stuff with names and values. Functions do things to objects. In R you can easily write your own functions. 
Just give them a name and tell them what to do: ``` r sum.na <- function (x) {sum(x, na.rm=TRUE)} # sum that avoids NAs sum.na(c(3,5,NA)) ## [1] 8 sum.allna <- function (x) {if (all(is.na(x))) NA else sum(x, na.rm=TRUE)} # sum that avoids NAs but returns NA if all NAs sum.allna(c(NA,NA)) ## [1] NA ``` You can read more about functions [here](https://www.datacamp.com/community/tutorials/functions-in-r-a-tutorial). --- # Strings and factors ## Strings You can create strings with either single quotes or double quotes. Multiple strings are often stored in a character vector, which you can create with `c()`. ## Factors In R, factors are used to work with categorical variables: variables that have a fixed and known set of possible values. They are also useful when you want to display character vectors in a non-alphabetical order. You can read more about factors [here](https://www.datacamp.com/tutorial/factors-in-r). If you ever need to access the set of valid factor levels directly, you can do so with `levels()`. You can also re-assign the levels of a factor with `levels()`. 
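A short sketch of these points about factors, using a made-up vector of sizes: the levels set the order, and `levels()` both reads and re-assigns them.

```r
sizes <- c("small", "large", "medium", "small")   # made-up example data

# Without explicit levels, the order is alphabetical
default_levels <- levels(factor(sizes))
default_levels

# With explicit levels, we control the order ourselves
f <- factor(sizes, levels = c("small", "medium", "large"))
sorted_sizes <- as.character(sort(f))   # sorted by level order, not alphabetically
sorted_sizes

# Re-assigning the levels renames them, position by position
levels(f) <- c("S", "M", "L")
relabelled <- as.character(f)
relabelled
```

Note that re-assigning levels changes the labels of all matching elements at once, so the new names must be listed in the same order as the old ones.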
--- # When things don't woRk as expected <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M248 8C111 8 0 119 0 256s111 248 248 248 248-111 248-248S385 8 248 8zm144 386.4V280c0-13.2-10.8-24-24-24s-24 10.8-24 24v151.4C315.5 447 282.8 456 248 456s-67.5-9-96-24.6V280c0-13.2-10.8-24-24-24s-24 10.8-24 24v114.4c-34.6-36-56-84.7-56-138.4 0-110.3 89.7-200 200-200s200 89.7 200 200c0 53.7-21.4 102.5-56 138.4zM205.8 234.5c4.4-2.4 6.9-7.4 6.1-12.4-4-25.2-34.2-42.1-59.8-42.1s-55.9 16.9-59.8 42.1c-.8 5 1.7 10 6.1 12.4 4.4 2.4 9.9 1.8 13.7-1.6l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c2.5 2.3 7.9 4.8 13.7 1.6zM344 180c-25.7 0-55.9 16.9-59.8 42.1-.8 5 1.7 10 6.1 12.4 4.5 2.4 9.9 1.8 13.7-1.6l9.5-8.5c14.8-13.2 46.2-13.2 61 0l9.5 8.5c2.5 2.2 8 4.7 13.7 1.6 4.4-2.4 6.9-7.4 6.1-12.4-3.9-25.2-34.1-42.1-59.8-42.1zm-96 92c-30.9 0-56 28.7-56 64s25.1 64 56 64 56-28.7 56-64-25.1-64-56-64z"></path></svg> <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M511.988 288.9c-.478 17.43-15.217 31.1-32.653 31.1H424v16c0 21.864-4.882 42.584-13.6 61.145l60.228 60.228c12.496 12.497 12.496 32.758 0 45.255-12.498 12.497-32.759 12.496-45.256 0l-54.736-54.736C345.886 467.965 314.351 480 280 480V236c0-6.627-5.373-12-12-12h-24c-6.627 0-12 5.373-12 12v244c-34.351 0-65.886-12.035-90.636-32.108l-54.736 54.736c-12.498 12.497-32.759 12.496-45.256 0-12.496-12.497-12.496-32.758 0-45.255l60.228-60.228C92.882 378.584 88 357.864 88 336v-16H32.666C15.23 320 .491 306.33.013 288.9-.484 270.816 14.028 256 32 256h56v-58.745l-46.628-46.628c-12.496-12.497-12.496-32.758 0-45.255 12.498-12.497 32.758-12.497 45.256 0L141.255 160h229.489l54.627-54.627c12.498-12.497 32.758-12.497 45.256 0 12.496 12.497 12.496 32.758 0 45.255L424 197.255V256h56c17.972 0 32.484 14.816 31.988 32.9zM257 0c-61.856 0-112 50.144-112 112h224C369 50.144 318.856 0 257 
0z"></path></svg> Most often, code breaks because of spelling and punctuation errors (misspelled function and object names; parentheses and quotation marks that are not closed or are closed in the wrong place; capitalization errors; stray spaces in function and object names, etc.). <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M511.988 288.9c-.478 17.43-15.217 31.1-32.653 31.1H424v16c0 21.864-4.882 42.584-13.6 61.145l60.228 60.228c12.496 12.497 12.496 32.758 0 45.255-12.498 12.497-32.759 12.496-45.256 0l-54.736-54.736C345.886 467.965 314.351 480 280 480V236c0-6.627-5.373-12-12-12h-24c-6.627 0-12 5.373-12 12v244c-34.351 0-65.886-12.035-90.636-32.108l-54.736 54.736c-12.498 12.497-32.759 12.496-45.256 0-12.496-12.497-12.496-32.758 0-45.255l60.228-60.228C92.882 378.584 88 357.864 88 336v-16H32.666C15.23 320 .491 306.33.013 288.9-.484 270.816 14.028 256 32 256h56v-58.745l-46.628-46.628c-12.496-12.497-12.496-32.758 0-45.255 12.498-12.497 32.758-12.497 45.256 0L141.255 160h229.489l54.627-54.627c12.498-12.497 32.758-12.497 45.256 0 12.496 12.497 12.496 32.758 0 45.255L424 197.255V256h56c17.972 0 32.484 14.816 31.988 32.9zM257 0c-61.856 0-112 50.144-112 112h224C369 50.144 318.856 0 257 0z"></path></svg> Trying to apply a function to an object of the wrong type is a major source of errors. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M511.988 288.9c-.478 17.43-15.217 31.1-32.653 31.1H424v16c0 21.864-4.882 42.584-13.6 61.145l60.228 60.228c12.496 12.497 12.496 32.758 0 45.255-12.498 12.497-32.759 12.496-45.256 0l-54.736-54.736C345.886 467.965 314.351 480 280 480V236c0-6.627-5.373-12-12-12h-24c-6.627 0-12 5.373-12 12v244c-34.351 0-65.886-12.035-90.636-32.108l-54.736 54.736c-12.498 12.497-32.759 12.496-45.256 0-12.496-12.497-12.496-32.758 0-45.255l60.228-60.228C92.882 378.584 88 357.864 88 336v-16H32.666C15.23 320 .491 306.33.013 288.9-.484 270.816 14.028 256 32 256h56v-58.745l-46.628-46.628c-12.496-12.497-12.496-32.758 0-45.255 12.498-12.497 32.758-12.497 45.256 0L141.255 160h229.489l54.627-54.627c12.498-12.497 32.758-12.497 45.256 0 12.496 12.497 12.496 32.758 0 45.255L424 197.255V256h56c17.972 0 32.484 14.816 31.988 32.9zM257 0c-61.856 0-112 50.144-112 112h224C369 50.144 318.856 0 257 0z"></path></svg> Functions with the same name residing in different packages can cause confusion (e.g. `recode()` in `car` and `recode()` in `dplyr`). 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M511.988 288.9c-.478 17.43-15.217 31.1-32.653 31.1H424v16c0 21.864-4.882 42.584-13.6 61.145l60.228 60.228c12.496 12.497 12.496 32.758 0 45.255-12.498 12.497-32.759 12.496-45.256 0l-54.736-54.736C345.886 467.965 314.351 480 280 480V236c0-6.627-5.373-12-12-12h-24c-6.627 0-12 5.373-12 12v244c-34.351 0-65.886-12.035-90.636-32.108l-54.736 54.736c-12.498 12.497-32.759 12.496-45.256 0-12.496-12.497-12.496-32.758 0-45.255l60.228-60.228C92.882 378.584 88 357.864 88 336v-16H32.666C15.23 320 .491 306.33.013 288.9-.484 270.816 14.028 256 32 256h56v-58.745l-46.628-46.628c-12.496-12.497-12.496-32.758 0-45.255 12.498-12.497 32.758-12.497 45.256 0L141.255 160h229.489l54.627-54.627c12.498-12.497 32.758-12.497 45.256 0 12.496 12.497 12.496 32.758 0 45.255L424 197.255V256h56c17.972 0 32.484 14.816 31.988 32.9zM257 0c-61.856 0-112 50.144-112 112h224C369 50.144 318.856 0 257 0z"></path></svg> Having the correct arguments, but in the wrong place in function calls. 
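Two of these error sources, the wrong type and the wrong argument position, can be sketched with base R functions only. The values are made up, and `tryCatch()` is used here just to capture the error message as text instead of stopping the script.

```r
# Wrong type: sum() does not accept character vectors
msg <- tryCatch(sum(c("1", "2")), error = function(e) conditionMessage(e))
msg                                      # the error message, captured as text
fixed <- sum(as.numeric(c("1", "2")))    # convert first, then sum

# Wrong place: unnamed arguments are matched by position
by_position <- seq(1, 10, 2)                     # from = 1, to = 10, by = 2
swapped     <- seq(2, 10, 1)                     # same numbers, different places
by_name     <- seq(by = 2, to = 10, from = 1)    # named arguments, any order

identical(by_position, by_name)   # naming the arguments gives the same sequence
length(swapped)                   # the swapped call produces a different sequence
```

Naming the arguments costs a few extra keystrokes, but it removes any doubt about which value goes where.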
--- # Some solution strategies <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M466.27 286.69C475.04 271.84 480 256 480 236.85c0-44.015-37.218-85.58-85.82-85.58H357.7c4.92-12.81 8.85-28.13 8.85-46.54C366.55 31.936 328.86 0 271.28 0c-61.607 0-58.093 94.933-71.76 108.6-22.747 22.747-49.615 66.447-68.76 83.4H32c-17.673 0-32 14.327-32 32v240c0 17.673 14.327 32 32 32h64c14.893 0 27.408-10.174 30.978-23.95 44.509 1.001 75.06 39.94 177.802 39.94 7.22 0 15.22.01 22.22.01 77.117 0 111.986-39.423 112.94-95.33 13.319-18.425 20.299-43.122 17.34-66.99 9.854-18.452 13.664-40.343 8.99-62.99zm-61.75 53.83c12.56 21.13 1.26 49.41-13.94 57.57 7.7 48.78-17.608 65.9-53.12 65.9h-37.82c-71.639 0-118.029-37.82-171.64-37.82V240h10.92c28.36 0 67.98-70.89 94.54-97.46 28.36-28.36 18.91-75.63 37.82-94.54 47.27 0 47.27 32.98 47.27 56.73 0 39.17-28.36 56.72-28.36 94.54h103.99c21.11 0 37.73 18.91 37.82 37.82.09 18.9-12.82 37.81-22.27 37.81 13.489 14.555 16.371 45.236-5.21 65.62zM88 432c0 13.255-10.745 24-24 24s-24-10.745-24-24 10.745-24 24-24 24 10.745 24 24z"></path></svg> Inspect your code for grammatical errors. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M466.27 286.69C475.04 271.84 480 256 480 236.85c0-44.015-37.218-85.58-85.82-85.58H357.7c4.92-12.81 8.85-28.13 8.85-46.54C366.55 31.936 328.86 0 271.28 0c-61.607 0-58.093 94.933-71.76 108.6-22.747 22.747-49.615 66.447-68.76 83.4H32c-17.673 0-32 14.327-32 32v240c0 17.673 14.327 32 32 32h64c14.893 0 27.408-10.174 30.978-23.95 44.509 1.001 75.06 39.94 177.802 39.94 7.22 0 15.22.01 22.22.01 77.117 0 111.986-39.423 112.94-95.33 13.319-18.425 20.299-43.122 17.34-66.99 9.854-18.452 13.664-40.343 8.99-62.99zm-61.75 53.83c12.56 21.13 1.26 49.41-13.94 57.57 7.7 48.78-17.608 65.9-53.12 65.9h-37.82c-71.639 0-118.029-37.82-171.64-37.82V240h10.92c28.36 0 67.98-70.89 94.54-97.46 28.36-28.36 18.91-75.63 37.82-94.54 47.27 0 47.27 32.98 47.27 56.73 0 39.17-28.36 56.72-28.36 94.54h103.99c21.11 0 37.73 18.91 37.82 37.82.09 18.9-12.82 37.81-22.27 37.81 13.489 14.555 16.371 45.236-5.21 65.62zM88 432c0 13.255-10.745 24-24 24s-24-10.745-24-24 10.745-24 24-24 24 10.745 24 24z"></path></svg> Read the documentation of the function that breaks the code. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M466.27 286.69C475.04 271.84 480 256 480 236.85c0-44.015-37.218-85.58-85.82-85.58H357.7c4.92-12.81 8.85-28.13 8.85-46.54C366.55 31.936 328.86 0 271.28 0c-61.607 0-58.093 94.933-71.76 108.6-22.747 22.747-49.615 66.447-68.76 83.4H32c-17.673 0-32 14.327-32 32v240c0 17.673 14.327 32 32 32h64c14.893 0 27.408-10.174 30.978-23.95 44.509 1.001 75.06 39.94 177.802 39.94 7.22 0 15.22.01 22.22.01 77.117 0 111.986-39.423 112.94-95.33 13.319-18.425 20.299-43.122 17.34-66.99 9.854-18.452 13.664-40.343 8.99-62.99zm-61.75 53.83c12.56 21.13 1.26 49.41-13.94 57.57 7.7 48.78-17.608 65.9-53.12 65.9h-37.82c-71.639 0-118.029-37.82-171.64-37.82V240h10.92c28.36 0 67.98-70.89 94.54-97.46 28.36-28.36 18.91-75.63 37.82-94.54 47.27 0 47.27 32.98 47.27 56.73 0 39.17-28.36 56.72-28.36 94.54h103.99c21.11 0 37.73 18.91 37.82 37.82.09 18.9-12.82 37.81-22.27 37.81 13.489 14.555 16.371 45.236-5.21 65.62zM88 432c0 13.255-10.745 24-24 24s-24-10.745-24-24 10.745-24 24-24 24 10.745 24 24z"></path></svg> Check that objects exist and have the expected type. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M466.27 286.69C475.04 271.84 480 256 480 236.85c0-44.015-37.218-85.58-85.82-85.58H357.7c4.92-12.81 8.85-28.13 8.85-46.54C366.55 31.936 328.86 0 271.28 0c-61.607 0-58.093 94.933-71.76 108.6-22.747 22.747-49.615 66.447-68.76 83.4H32c-17.673 0-32 14.327-32 32v240c0 17.673 14.327 32 32 32h64c14.893 0 27.408-10.174 30.978-23.95 44.509 1.001 75.06 39.94 177.802 39.94 7.22 0 15.22.01 22.22.01 77.117 0 111.986-39.423 112.94-95.33 13.319-18.425 20.299-43.122 17.34-66.99 9.854-18.452 13.664-40.343 8.99-62.99zm-61.75 53.83c12.56 21.13 1.26 49.41-13.94 57.57 7.7 48.78-17.608 65.9-53.12 65.9h-37.82c-71.639 0-118.029-37.82-171.64-37.82V240h10.92c28.36 0 67.98-70.89 94.54-97.46 28.36-28.36 18.91-75.63 37.82-94.54 47.27 0 47.27 32.98 47.27 56.73 0 39.17-28.36 56.72-28.36 94.54h103.99c21.11 0 37.73 18.91 37.82 37.82.09 18.9-12.82 37.81-22.27 37.81 13.489 14.555 16.371 45.236-5.21 65.62zM88 432c0 13.255-10.745 24-24 24s-24-10.745-24-24 10.745-24 24-24 24 10.745 24 24z"></path></svg> Isolate the problem by working step-by-step. Replicate the problem on a small subset of your data. 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M466.27 286.69C475.04 271.84 480 256 480 236.85c0-44.015-37.218-85.58-85.82-85.58H357.7c4.92-12.81 8.85-28.13 8.85-46.54C366.55 31.936 328.86 0 271.28 0c-61.607 0-58.093 94.933-71.76 108.6-22.747 22.747-49.615 66.447-68.76 83.4H32c-17.673 0-32 14.327-32 32v240c0 17.673 14.327 32 32 32h64c14.893 0 27.408-10.174 30.978-23.95 44.509 1.001 75.06 39.94 177.802 39.94 7.22 0 15.22.01 22.22.01 77.117 0 111.986-39.423 112.94-95.33 13.319-18.425 20.299-43.122 17.34-66.99 9.854-18.452 13.664-40.343 8.99-62.99zm-61.75 53.83c12.56 21.13 1.26 49.41-13.94 57.57 7.7 48.78-17.608 65.9-53.12 65.9h-37.82c-71.639 0-118.029-37.82-171.64-37.82V240h10.92c28.36 0 67.98-70.89 94.54-97.46 28.36-28.36 18.91-75.63 37.82-94.54 47.27 0 47.27 32.98 47.27 56.73 0 39.17-28.36 56.72-28.36 94.54h103.99c21.11 0 37.73 18.91 37.82 37.82.09 18.9-12.82 37.81-22.27 37.81 13.489 14.555 16.371 45.236-5.21 65.62zM88 432c0 13.255-10.745 24-24 24s-24-10.745-24-24 10.745-24 24-24 24 10.745 24 24z"></path></svg> Google the text of the error message. Ask LLMs for help. --- # How to get in touch? 
<svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M464 64H48C21.49 64 0 85.49 0 112v288c0 26.51 21.49 48 48 48h416c26.51 0 48-21.49 48-48V112c0-26.51-21.49-48-48-48zm0 48v40.805c-22.422 18.259-58.168 46.651-134.587 106.49-16.841 13.247-50.201 45.072-73.413 44.701-23.208.375-56.579-31.459-73.413-44.701C106.18 199.465 70.425 171.067 48 152.805V112h416zM48 400V214.398c22.914 18.251 55.409 43.862 104.938 82.646 21.857 17.205 60.134 55.186 103.062 54.955 42.717.231 80.509-37.199 103.053-54.947 49.528-38.783 82.032-64.401 104.947-82.653V400H48z"></path></svg> demetriodor@gmail.com <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M131.5 217.5L55.1 100.1c47.6-59.2 119-91.8 192-92.1 42.3-.3 85.5 10.5 124.8 33.2 43.4 25.2 76.4 61.4 97.4 103L264 133.4c-58.1-3.4-113.4 29.3-132.5 84.1zm32.9 38.5c0 46.2 37.4 83.6 83.6 83.6s83.6-37.4 83.6-83.6-37.4-83.6-83.6-83.6-83.6 37.3-83.6 83.6zm314.9-89.2L339.6 174c37.9 44.3 38.5 108.2 6.6 157.2L234.1 503.6c46.5 2.5 94.4-7.7 137.8-32.9 107.4-62 150.9-192 107.4-303.9zM133.7 303.6L40.4 120.1C14.9 159.1 0 205.9 0 256c0 124 90.8 226.7 209.5 244.9l63.7-124.8c-57.6 10.8-113.2-20.8-139.5-72.5z"></path></svg> [http://dimiter.eu](http://dimiter.eu) <svg viewBox="0 0 484 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <g groupmode="layer" id="layer6" label="icon"> <path id="Shape_1_" class="st1" d="M 324.19873,96 H 215.27696 C 130.71036,96 61.027479,165.68288 61.027479,250.24948 v 6.08879 c 0,30.44398 9.47146,58.18182 25.70825,83.21353 L 5.5518005,416 123.94503,378.7907 c 25.70825,19.61945 58.18182,30.44397 92.685,30.44397 h 107.5687 c 85.91965,0 154.24947,-69.68287 154.24947,-152.8964 v -6.08879 C 478.4482,165.68288 408.76534,96 324.19873,96 Z M 406,276 c 0,46.68076 -35.23395,75.66979 -81.23818,75.66979 H 
213.13392 C 166.45316,351.66979 132,322.68076 132,276 v -40 c 0,-46.68077 34.45321,-81.20125 81.13397,-81.20125 h 111.6279 C 371.44264,154.79875 406,189.31924 406,236 Z" style="stroke-width:1" nodetypes="sssscccssssscsssssscc"></path> </g></svg> @dtoshkov.bsky.social <svg viewBox="0 0 512 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M459.37 151.716c.325 4.548.325 9.097.325 13.645 0 138.72-105.583 298.558-298.558 298.558-59.452 0-114.68-17.219-161.137-47.106 8.447.974 16.568 1.299 25.34 1.299 49.055 0 94.213-16.568 130.274-44.832-46.132-.975-84.792-31.188-98.112-72.772 6.498.974 12.995 1.624 19.818 1.624 9.421 0 18.843-1.3 27.614-3.573-48.081-9.747-84.143-51.98-84.143-102.985v-1.299c13.969 7.797 30.214 12.67 47.431 13.319-28.264-18.843-46.781-51.005-46.781-87.391 0-19.492 5.197-37.36 14.294-52.954 51.655 63.675 129.3 105.258 216.365 109.807-1.624-7.797-2.599-15.918-2.599-24.04 0-57.828 46.782-104.934 104.934-104.934 30.213 0 57.502 12.67 76.67 33.137 23.715-4.548 46.456-13.32 66.599-25.34-7.798 24.366-24.366 44.833-46.132 57.827 21.117-2.273 41.584-8.122 60.426-16.243-14.292 20.791-32.161 39.308-52.628 54.253z"></path></svg> @DToshkov <svg viewBox="0 0 496 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M165.9 397.4c0 2-2.3 3.6-5.2 3.6-3.3.3-5.6-1.3-5.6-3.6 0-2 2.3-3.6 5.2-3.6 3-.3 5.6 1.3 5.6 3.6zm-31.1-4.5c-.7 2 1.3 4.3 4.3 4.9 2.6 1 5.6 0 6.2-2s-1.3-4.3-4.3-5.2c-2.6-.7-5.5.3-6.2 2.3zm44.2-1.7c-2.9.7-4.9 2.6-4.6 4.9.3 2 2.9 3.3 5.9 2.6 2.9-.7 4.9-2.6 4.6-4.6-.3-1.9-3-3.2-5.9-2.9zM244.8 8C106.1 8 0 113.3 0 252c0 110.9 69.8 205.8 169.5 239.2 12.8 2.3 17.3-5.6 17.3-12.1 0-6.2-.3-40.4-.3-61.4 0 0-70 15-84.7-29.8 0 0-11.4-29.1-27.8-36.6 0 0-22.9-15.7 1.6-15.4 0 0 24.9 2 38.6 25.8 21.9 38.6 58.6 27.5 72.9 20.9 2.3-16 8.8-27.1 16-33.7-55.9-6.2-112.3-14.3-112.3-110.5 0-27.5 7.6-41.3 23.6-58.9-2.6-6.5-11.1-33.3 2.6-67.9 20.9-6.5 69 
27 69 27 20-5.6 41.5-8.5 62.8-8.5s42.8 2.9 62.8 8.5c0 0 48.1-33.6 69-27 13.7 34.7 5.2 61.4 2.6 67.9 16 17.7 25.8 31.5 25.8 58.9 0 96.5-58.9 104.2-114.8 110.5 9.2 7.9 17 22.9 17 46.4 0 33.7-.3 75.4-.3 83.6 0 6.5 4.6 14.4 17.3 12.1C428.2 457.8 496 362.9 496 252 496 113.3 383.5 8 244.8 8zM97.2 352.9c-1.3 1-1 3.3.7 5.2 1.6 1.6 3.9 2.3 5.2 1 1.3-1 1-3.3-.7-5.2-1.6-1.6-3.9-2.3-5.2-1zm-10.8-8.1c-.7 1.3.3 2.9 2.3 3.9 1.6 1 3.6.7 4.3-.7.7-1.3-.3-2.9-2.3-3.9-2-.6-3.6-.3-4.3.7zm32.4 35.6c-1.6 1.3-1 4.3 1.3 6.2 2.3 2.3 5.2 2.6 6.5 1 1.3-1.3.7-4.3-1.3-6.2-2.2-2.3-5.2-2.6-6.5-1zm-11.4-14.7c-1.6 1-1.6 3.6 0 5.9 1.6 2.3 4.3 3.3 5.6 2.3 1.6-1.3 1.6-3.9 0-6.2-1.4-2.3-4-3.3-5.6-2z"></path></svg> [github.com/demetriodor](https://github.com/demetriodor/) <svg viewBox="0 0 448 512" style="height:1em;position:relative;display:inline-block;top:.1em;" xmlns="http://www.w3.org/2000/svg"> <path d="M416 32H31.9C14.3 32 0 46.5 0 64.3v383.4C0 465.5 14.3 480 31.9 480H416c17.6 0 32-14.5 32-32.3V64.3c0-17.8-14.4-32.3-32-32.3zM135.4 416H69V202.2h66.5V416zm-33.2-243c-21.3 0-38.5-17.3-38.5-38.5S80.9 96 102.2 96c21.2 0 38.5 17.3 38.5 38.5 0 21.3-17.2 38.5-38.5 38.5zm282.1 243h-66.4V312c0-24.8-.5-56.7-34.5-56.7-34.6 0-39.9 27-39.9 54.9V416h-66.4V202.2h63.7v29.2h.9c8.9-16.8 30.6-34.5 62.9-34.5 67.2 0 79.7 44.3 79.7 101.9V416z"></path></svg> Dimiter Toshkov