<!DOCTYPE html>
<html lang="" xml:lang="">
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge" />
<title>5 Descriptive Statistics and Visualisations | Introduction to R for Crime Analysts</title>
<meta name="description" content="This course is designed to help you transition from SPSS to R, demonstrating that all of the functionality you’re accustomed to in SPSS can be replicated—and often enhanced—in R. By the end of this course, you will be equipped with the knowledge and skills to conduct your analyses in R, whether you’re dealing with basic descriptive statistics, survey data, or more advanced statistical models." />
<meta name="generator" content="bookdown 0.40 and GitBook 2.6.7" />
<meta property="og:title" content="5 Descriptive Statistics and Visualisations | Introduction to R for Crime Analysts" />
<meta property="og:type" content="book" />
<meta property="og:description" content="This course is designed to help you transition from SPSS to R, demonstrating that all of the functionality you’re accustomed to in SPSS can be replicated—and often enhanced—in R. By the end of this course, you will be equipped with the knowledge and skills to conduct your analyses in R, whether you’re dealing with basic descriptive statistics, survey data, or more advanced statistical models." />
<meta name="twitter:card" content="summary" />
<meta name="twitter:title" content="5 Descriptive Statistics and Visualisations | Introduction to R for Crime Analysts" />
<meta name="twitter:description" content="This course is designed to help you transition from SPSS to R, demonstrating that all of the functionality you’re accustomed to in SPSS can be replicated—and often enhanced—in R. By the end of this course, you will be equipped with the knowledge and skills to conduct your analyses in R, whether you’re dealing with basic descriptive statistics, survey data, or more advanced statistical models." />
<meta name="author" content="Daniel Hammocks, Senior Data Scientist at Mayor’s Office for Policing and Crime" />
<meta name="date" content="2024-08-28" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<meta name="apple-mobile-web-app-capable" content="yes" />
<meta name="apple-mobile-web-app-status-bar-style" content="black" />
<link rel="prev" href="connecting-to-and-accessing-a-postgresql-database.html"/>
<link rel="next" href="survey-analysis-in-r.html"/>
<script src="libs/jquery-3.6.0/jquery-3.6.0.min.js"></script>
<script src="https://cdn.jsdelivr.net/npm/fuse.js@6.4.6/dist/fuse.min.js"></script>
<link href="libs/gitbook-2.6.7/css/style.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-table.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-bookdown.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-highlight.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-search.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-fontsettings.css" rel="stylesheet" />
<link href="libs/gitbook-2.6.7/css/plugin-clipboard.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.1.0/anchor-sections.css" rel="stylesheet" />
<link href="libs/anchor-sections-1.1.0/anchor-sections-hash.css" rel="stylesheet" />
<script src="libs/anchor-sections-1.1.0/anchor-sections.js"></script>
<style type="text/css">
pre > code.sourceCode { white-space: pre; position: relative; }
pre > code.sourceCode > span { line-height: 1.25; }
pre > code.sourceCode > span:empty { height: 1.2em; }
.sourceCode { overflow: visible; }
code.sourceCode > span { color: inherit; text-decoration: inherit; }
pre.sourceCode { margin: 0; }
@media screen {
div.sourceCode { overflow: auto; }
}
@media print {
pre > code.sourceCode { white-space: pre-wrap; }
pre > code.sourceCode > span { text-indent: -5em; padding-left: 5em; }
}
pre.numberSource code
{ counter-reset: source-line 0; }
pre.numberSource code > span
{ position: relative; left: -4em; counter-increment: source-line; }
pre.numberSource code > span > a:first-child::before
{ content: counter(source-line);
position: relative; left: -1em; text-align: right; vertical-align: baseline;
border: none; display: inline-block;
-webkit-touch-callout: none; -webkit-user-select: none;
-khtml-user-select: none; -moz-user-select: none;
-ms-user-select: none; user-select: none;
padding: 0 4px; width: 4em;
color: #aaaaaa;
}
pre.numberSource { margin-left: 3em; border-left: 1px solid #aaaaaa; padding-left: 4px; }
div.sourceCode
{ }
@media screen {
pre > code.sourceCode > span > a:first-child::before { text-decoration: underline; }
}
code span.al { color: #ff0000; font-weight: bold; } /* Alert */
code span.an { color: #60a0b0; font-weight: bold; font-style: italic; } /* Annotation */
code span.at { color: #7d9029; } /* Attribute */
code span.bn { color: #40a070; } /* BaseN */
code span.bu { color: #008000; } /* BuiltIn */
code span.cf { color: #007020; font-weight: bold; } /* ControlFlow */
code span.ch { color: #4070a0; } /* Char */
code span.cn { color: #880000; } /* Constant */
code span.co { color: #60a0b0; font-style: italic; } /* Comment */
code span.cv { color: #60a0b0; font-weight: bold; font-style: italic; } /* CommentVar */
code span.do { color: #ba2121; font-style: italic; } /* Documentation */
code span.dt { color: #902000; } /* DataType */
code span.dv { color: #40a070; } /* DecVal */
code span.er { color: #ff0000; font-weight: bold; } /* Error */
code span.ex { } /* Extension */
code span.fl { color: #40a070; } /* Float */
code span.fu { color: #06287e; } /* Function */
code span.im { color: #008000; font-weight: bold; } /* Import */
code span.in { color: #60a0b0; font-weight: bold; font-style: italic; } /* Information */
code span.kw { color: #007020; font-weight: bold; } /* Keyword */
code span.op { color: #666666; } /* Operator */
code span.ot { color: #007020; } /* Other */
code span.pp { color: #bc7a00; } /* Preprocessor */
code span.sc { color: #4070a0; } /* SpecialChar */
code span.ss { color: #bb6688; } /* SpecialString */
code span.st { color: #4070a0; } /* String */
code span.va { color: #19177c; } /* Variable */
code span.vs { color: #4070a0; } /* VerbatimString */
code span.wa { color: #60a0b0; font-weight: bold; font-style: italic; } /* Warning */
</style>
<style type="text/css">
div.hanging-indent{margin-left: 1.5em; text-indent: -1.5em;}
</style>
<link rel="stylesheet" href="style.css" type="text/css" />
</head>
<body>
<div class="book without-animation with-summary font-size-2 font-family-1" data-basepath=".">
<div class="book-summary">
<nav role="navigation">
<ul class="summary">
<li class="toc-logo"><a href="./"><img src="images/MOPAC-DS-Square.png"></a></li>
<li class="divider"></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html"><i class="fa fa-check"></i>Preface</a>
<ul>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#purpose-of-this-book"><i class="fa fa-check"></i>Purpose of this Book</a></li>
<li class="chapter" data-level="" data-path="index.html"><a href="index.html#about-the-author"><i class="fa fa-check"></i>About the Author</a></li>
</ul></li>
<li class="chapter" data-level="1" data-path="introduction.html"><a href="introduction.html"><i class="fa fa-check"></i><b>1</b> Introduction</a>
<ul>
<li class="chapter" data-level="1.1" data-path="introduction.html"><a href="introduction.html#overview"><i class="fa fa-check"></i><b>1.1</b> Overview</a></li>
<li class="chapter" data-level="1.2" data-path="introduction.html"><a href="introduction.html#why-learn-r"><i class="fa fa-check"></i><b>1.2</b> Why Learn R?</a>
<ul>
<li class="chapter" data-level="1.2.1" data-path="introduction.html"><a href="introduction.html#flexibility-and-power"><i class="fa fa-check"></i><b>1.2.1</b> Flexibility and Power</a></li>
<li class="chapter" data-level="1.2.2" data-path="introduction.html"><a href="introduction.html#reproducibility"><i class="fa fa-check"></i><b>1.2.2</b> Reproducibility</a></li>
<li class="chapter" data-level="1.2.3" data-path="introduction.html"><a href="introduction.html#extensive-community-and-package-ecosystem"><i class="fa fa-check"></i><b>1.2.3</b> Extensive Community and Package Ecosystem</a></li>
<li class="chapter" data-level="1.2.4" data-path="introduction.html"><a href="introduction.html#cost"><i class="fa fa-check"></i><b>1.2.4</b> Cost</a></li>
</ul></li>
<li class="chapter" data-level="1.3" data-path="introduction.html"><a href="introduction.html#replicating-spss-functionality-in-r"><i class="fa fa-check"></i><b>1.3</b> Replicating SPSS Functionality in R</a>
<ul>
<li class="chapter" data-level="1.3.1" data-path="introduction.html"><a href="introduction.html#data-management"><i class="fa fa-check"></i><b>1.3.1</b> Data Management</a></li>
<li class="chapter" data-level="1.3.2" data-path="introduction.html"><a href="introduction.html#descriptive-statistics"><i class="fa fa-check"></i><b>1.3.2</b> Descriptive Statistics</a></li>
<li class="chapter" data-level="1.3.3" data-path="introduction.html"><a href="introduction.html#statistical-tests"><i class="fa fa-check"></i><b>1.3.3</b> Statistical Tests</a></li>
<li class="chapter" data-level="1.3.4" data-path="introduction.html"><a href="introduction.html#regression-analysis"><i class="fa fa-check"></i><b>1.3.4</b> Regression Analysis</a></li>
<li class="chapter" data-level="1.3.5" data-path="introduction.html"><a href="introduction.html#data-visualisation"><i class="fa fa-check"></i><b>1.3.5</b> Data Visualisation</a></li>
</ul></li>
<li class="chapter" data-level="1.4" data-path="introduction.html"><a href="introduction.html#transitioning-from-spss-to-r"><i class="fa fa-check"></i><b>1.4</b> Transitioning from SPSS to R</a>
<ul>
<li class="chapter" data-level="1.4.1" data-path="introduction.html"><a href="introduction.html#building-confidence-in-r"><i class="fa fa-check"></i><b>1.4.1</b> Building Confidence in R</a></li>
<li class="chapter" data-level="1.4.2" data-path="introduction.html"><a href="introduction.html#leveraging-rs-ecosystem"><i class="fa fa-check"></i><b>1.4.2</b> Leveraging R’s Ecosystem</a></li>
</ul></li>
<li class="chapter" data-level="1.5" data-path="introduction.html"><a href="introduction.html#conclusion"><i class="fa fa-check"></i><b>1.5</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html"><i class="fa fa-check"></i><b>2</b> Getting Started with R</a>
<ul>
<li class="chapter" data-level="2.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#the-r-environment"><i class="fa fa-check"></i><b>2.1</b> The R Environment</a>
<ul>
<li class="chapter" data-level="2.1.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#overview-of-the-rstudio-interface"><i class="fa fa-check"></i><b>2.1.1</b> Overview of the RStudio Interface</a></li>
<li class="chapter" data-level="2.1.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#console-vs.-scripts-vs.-notebooks"><i class="fa fa-check"></i><b>2.1.2</b> Console vs. Scripts vs. Notebooks</a></li>
</ul></li>
<li class="chapter" data-level="2.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#introduction-to-r-packages-and-installing-key-packages"><i class="fa fa-check"></i><b>2.2</b> Introduction to R Packages and Installing Key Packages</a>
<ul>
<li class="chapter" data-level="2.2.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#introduction-to-r-packages"><i class="fa fa-check"></i><b>2.2.1</b> Introduction to R Packages</a></li>
<li class="chapter" data-level="2.2.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#installing-and-loading-packages"><i class="fa fa-check"></i><b>2.2.2</b> Installing and Loading Packages</a></li>
<li class="chapter" data-level="2.2.3" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#key-packages-for-data-analysis"><i class="fa fa-check"></i><b>2.2.3</b> Key Packages for Data Analysis</a></li>
<li class="chapter" data-level="2.2.4" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#managing-package-dependencies"><i class="fa fa-check"></i><b>2.2.4</b> Managing Package Dependencies</a></li>
<li class="chapter" data-level="2.2.5" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#summary"><i class="fa fa-check"></i><b>2.2.5</b> Summary</a></li>
</ul></li>
<li class="chapter" data-level="2.3" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#coding-conventions-and-best-practices"><i class="fa fa-check"></i><b>2.3</b> Coding Conventions and Best Practices</a>
<ul>
<li class="chapter" data-level="2.3.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#writing-clean-and-readable-code"><i class="fa fa-check"></i><b>2.3.1</b> Writing Clean and Readable Code</a></li>
<li class="chapter" data-level="2.3.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#commenting-and-structuring-scripts"><i class="fa fa-check"></i><b>2.3.2</b> Commenting and Structuring Scripts</a></li>
</ul></li>
<li class="chapter" data-level="2.4" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#data-types-and-structures"><i class="fa fa-check"></i><b>2.4</b> Data Types and Structures</a>
<ul>
<li class="chapter" data-level="2.4.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#introduction-to-vectors-data-frames-lists-and-factors"><i class="fa fa-check"></i><b>2.4.1</b> Introduction to Vectors, Data Frames, Lists, and Factors</a></li>
<li class="chapter" data-level="2.4.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#comparing-r-data-types-to-spss-data-types"><i class="fa fa-check"></i><b>2.4.2</b> Comparing R Data Types to SPSS Data Types</a></li>
</ul></li>
<li class="chapter" data-level="2.5" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#basic-operations-and-functions-in-r"><i class="fa fa-check"></i><b>2.5</b> Basic Operations and Functions in R</a>
<ul>
<li class="chapter" data-level="2.5.1" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#arithmetic-operations"><i class="fa fa-check"></i><b>2.5.1</b> Arithmetic Operations</a></li>
<li class="chapter" data-level="2.5.2" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#logical-operations"><i class="fa fa-check"></i><b>2.5.2</b> Logical Operations</a></li>
<li class="chapter" data-level="2.5.3" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#basic-functions"><i class="fa fa-check"></i><b>2.5.3</b> Basic Functions</a></li>
</ul></li>
<li class="chapter" data-level="2.6" data-path="getting-started-with-r.html"><a href="getting-started-with-r.html#conclusion-1"><i class="fa fa-check"></i><b>2.6</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="3" data-path="data-management-in-r.html"><a href="data-management-in-r.html"><i class="fa fa-check"></i><b>3</b> Data Management in R</a>
<ul>
<li class="chapter" data-level="3.1" data-path="data-management-in-r.html"><a href="data-management-in-r.html#data-import-and-export"><i class="fa fa-check"></i><b>3.1</b> Data Import and Export</a>
<ul>
<li class="chapter" data-level="3.1.1" data-path="data-management-in-r.html"><a href="data-management-in-r.html#importing-data"><i class="fa fa-check"></i><b>3.1.1</b> Importing Data</a></li>
<li class="chapter" data-level="3.1.2" data-path="data-management-in-r.html"><a href="data-management-in-r.html#exporting-data"><i class="fa fa-check"></i><b>3.1.2</b> Exporting Data</a></li>
</ul></li>
<li class="chapter" data-level="3.2" data-path="data-management-in-r.html"><a href="data-management-in-r.html#data-cleaning-and-preparation"><i class="fa fa-check"></i><b>3.2</b> Data Cleaning and Preparation</a>
<ul>
<li class="chapter" data-level="3.2.1" data-path="data-management-in-r.html"><a href="data-management-in-r.html#handling-missing-data"><i class="fa fa-check"></i><b>3.2.1</b> Handling Missing Data</a></li>
<li class="chapter" data-level="3.2.2" data-path="data-management-in-r.html"><a href="data-management-in-r.html#filtering-and-subsetting-data"><i class="fa fa-check"></i><b>3.2.2</b> Filtering and Subsetting Data</a></li>
<li class="chapter" data-level="3.2.3" data-path="data-management-in-r.html"><a href="data-management-in-r.html#data-transformations"><i class="fa fa-check"></i><b>3.2.3</b> Data Transformations</a></li>
<li class="chapter" data-level="3.2.4" data-path="data-management-in-r.html"><a href="data-management-in-r.html#the-dplyr-pipeline"><i class="fa fa-check"></i><b>3.2.4</b> The dplyr Pipeline</a></li>
</ul></li>
<li class="chapter" data-level="3.3" data-path="data-management-in-r.html"><a href="data-management-in-r.html#working-with-categorical-data"><i class="fa fa-check"></i><b>3.3</b> Working with Categorical Data</a>
<ul>
<li class="chapter" data-level="3.3.1" data-path="data-management-in-r.html"><a href="data-management-in-r.html#creating-and-manipulating-factors"><i class="fa fa-check"></i><b>3.3.1</b> Creating and Manipulating Factors</a></li>
<li class="chapter" data-level="3.3.2" data-path="data-management-in-r.html"><a href="data-management-in-r.html#recoding-variables"><i class="fa fa-check"></i><b>3.3.2</b> Recoding Variables</a></li>
<li class="chapter" data-level="3.3.3" data-path="data-management-in-r.html"><a href="data-management-in-r.html#frequency-tables-and-cross-tabulations"><i class="fa fa-check"></i><b>3.3.3</b> Frequency Tables and Cross-Tabulations</a></li>
</ul></li>
<li class="chapter" data-level="3.4" data-path="data-management-in-r.html"><a href="data-management-in-r.html#conclusion-2"><i class="fa fa-check"></i><b>3.4</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="4" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html"><i class="fa fa-check"></i><b>4</b> Connecting to and Accessing a PostgreSQL Database</a>
<ul>
<li class="chapter" data-level="4.1" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#introduction-1"><i class="fa fa-check"></i><b>4.1</b> Introduction</a></li>
<li class="chapter" data-level="4.2" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#setting-up-the-environment"><i class="fa fa-check"></i><b>4.2</b> Setting Up the Environment</a>
<ul>
<li class="chapter" data-level="4.2.1" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#installing-necessary-packages"><i class="fa fa-check"></i><b>4.2.1</b> Installing Necessary Packages</a></li>
</ul></li>
<li class="chapter" data-level="4.3" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#connecting-to-postgresql-using-rpostgres"><i class="fa fa-check"></i><b>4.3</b> Connecting to PostgreSQL (Using RPostgres)</a></li>
<li class="chapter" data-level="4.4" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#querying-data-from-postgresql"><i class="fa fa-check"></i><b>4.4</b> Querying Data from PostgreSQL</a>
<ul>
<li class="chapter" data-level="4.4.1" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#executing-a-query"><i class="fa fa-check"></i><b>4.4.1</b> Executing a Query</a></li>
</ul></li>
<li class="chapter" data-level="4.5" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#handling-errors-and-troubleshooting"><i class="fa fa-check"></i><b>4.5</b> Handling Errors and Troubleshooting</a></li>
<li class="chapter" data-level="4.6" data-path="connecting-to-and-accessing-a-postgresql-database.html"><a href="connecting-to-and-accessing-a-postgresql-database.html#conclusion-3"><i class="fa fa-check"></i><b>4.6</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="5" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html"><i class="fa fa-check"></i><b>5</b> Descriptive Statistics and Visualisations</a>
<ul>
<li class="chapter" data-level="5.1" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#introduction-to-descriptive-statistics"><i class="fa fa-check"></i><b>5.1</b> Introduction to Descriptive Statistics</a>
<ul>
<li class="chapter" data-level="5.1.1" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#understanding-descriptive-statistics"><i class="fa fa-check"></i><b>5.1.1</b> Understanding Descriptive Statistics</a></li>
<li class="chapter" data-level="5.1.2" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#basic-descriptive-statistics-in-r"><i class="fa fa-check"></i><b>5.1.2</b> Basic Descriptive Statistics in R</a></li>
</ul></li>
<li class="chapter" data-level="5.2" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#creating-visualisations-with-ggplot2"><i class="fa fa-check"></i><b>5.2</b> Creating Visualisations with ggplot2</a>
<ul>
<li class="chapter" data-level="5.2.1" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#introduction-to-ggplot2"><i class="fa fa-check"></i><b>5.2.1</b> Introduction to ggplot2</a></li>
<li class="chapter" data-level="5.2.2" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#creating-basic-plots"><i class="fa fa-check"></i><b>5.2.2</b> Creating Basic Plots</a></li>
<li class="chapter" data-level="5.2.3" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#customising-your-plots"><i class="fa fa-check"></i><b>5.2.3</b> Customising Your Plots</a></li>
</ul></li>
<li class="chapter" data-level="5.3" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#descriptive-statistics-with-dplyr"><i class="fa fa-check"></i><b>5.3</b> Descriptive Statistics with dplyr</a>
<ul>
<li class="chapter" data-level="5.3.1" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#using-dplyr-to-summarise-data"><i class="fa fa-check"></i><b>5.3.1</b> Using dplyr to Summarise Data</a></li>
<li class="chapter" data-level="5.3.2" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#combining-dplyr-with-ggplot2"><i class="fa fa-check"></i><b>5.3.2</b> Combining dplyr with ggplot2</a></li>
</ul></li>
<li class="chapter" data-level="5.4" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#advanced-visualisation-techniques"><i class="fa fa-check"></i><b>5.4</b> Advanced Visualisation Techniques</a>
<ul>
<li class="chapter" data-level="5.4.1" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#faceting"><i class="fa fa-check"></i><b>5.4.1</b> Faceting</a></li>
<li class="chapter" data-level="5.4.2" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#combining-multiple-geoms"><i class="fa fa-check"></i><b>5.4.2</b> Combining Multiple Geoms</a></li>
<li class="chapter" data-level="5.4.3" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#saving-your-plots"><i class="fa fa-check"></i><b>5.4.3</b> Saving Your Plots</a></li>
</ul></li>
<li class="chapter" data-level="5.5" data-path="descriptive-statistics-and-visualisations.html"><a href="descriptive-statistics-and-visualisations.html#conclusion-4"><i class="fa fa-check"></i><b>5.5</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="6" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html"><i class="fa fa-check"></i><b>6</b> Survey Analysis in R</a>
<ul>
<li class="chapter" data-level="6.1" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#introduction-to-survey-data"><i class="fa fa-check"></i><b>6.1</b> Introduction to Survey Data</a>
<ul>
<li class="chapter" data-level="6.1.1" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#key-concepts-in-survey-analysis"><i class="fa fa-check"></i><b>6.1.1</b> Key Concepts in Survey Analysis</a></li>
<li class="chapter" data-level="6.1.2" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#understanding-survey-data-structures"><i class="fa fa-check"></i><b>6.1.2</b> Understanding Survey Data Structures</a></li>
</ul></li>
<li class="chapter" data-level="6.2" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#importing-and-preparing-survey-data"><i class="fa fa-check"></i><b>6.2</b> Importing and Preparing Survey Data</a>
<ul>
<li class="chapter" data-level="6.2.1" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#converting-data-for-survey-analysis"><i class="fa fa-check"></i><b>6.2.1</b> Converting Data for Survey Analysis</a></li>
</ul></li>
<li class="chapter" data-level="6.3" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#descriptive-analysis-of-survey-data"><i class="fa fa-check"></i><b>6.3</b> Descriptive Analysis of Survey Data</a>
<ul>
<li class="chapter" data-level="6.3.1" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#calculating-means-and-totals"><i class="fa fa-check"></i><b>6.3.1</b> Calculating Means and Totals</a></li>
<li class="chapter" data-level="6.3.2" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#frequencies-and-cross-tabulations"><i class="fa fa-check"></i><b>6.3.2</b> Frequencies and Cross-tabulations</a></li>
<li class="chapter" data-level="6.3.3" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#comparing-results-with-spss-survey-functions"><i class="fa fa-check"></i><b>6.3.3</b> Comparing Results with SPSS Survey Functions</a></li>
</ul></li>
<li class="chapter" data-level="6.4" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#weighting-survey-data"><i class="fa fa-check"></i><b>6.4</b> Weighting Survey Data</a>
<ul>
<li class="chapter" data-level="6.4.1" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#applying-weights"><i class="fa fa-check"></i><b>6.4.1</b> Applying Weights</a></li>
<li class="chapter" data-level="6.4.2" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#analysing-weighted-survey-data"><i class="fa fa-check"></i><b>6.4.2</b> Analysing Weighted Survey Data</a></li>
</ul></li>
<li class="chapter" data-level="6.5" data-path="survey-analysis-in-r.html"><a href="survey-analysis-in-r.html#conclusion-5"><i class="fa fa-check"></i><b>6.5</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="7" data-path="inferential-statistics.html"><a href="inferential-statistics.html"><i class="fa fa-check"></i><b>7</b> Inferential Statistics</a>
<ul>
<li class="chapter" data-level="7.1" data-path="inferential-statistics.html"><a href="inferential-statistics.html#hypothesis-testing"><i class="fa fa-check"></i><b>7.1</b> Hypothesis Testing</a>
<ul>
<li class="chapter" data-level="7.1.1" data-path="inferential-statistics.html"><a href="inferential-statistics.html#t-tests"><i class="fa fa-check"></i><b>7.1.1</b> T-tests</a></li>
<li class="chapter" data-level="7.1.2" data-path="inferential-statistics.html"><a href="inferential-statistics.html#chi-square-tests"><i class="fa fa-check"></i><b>7.1.2</b> Chi-square Tests</a></li>
<li class="chapter" data-level="7.1.3" data-path="inferential-statistics.html"><a href="inferential-statistics.html#anova-analysis-of-variance"><i class="fa fa-check"></i><b>7.1.3</b> ANOVA (Analysis of Variance)</a></li>
</ul></li>
<li class="chapter" data-level="7.2" data-path="inferential-statistics.html"><a href="inferential-statistics.html#correlation-analysis"><i class="fa fa-check"></i><b>7.2</b> Correlation Analysis</a>
<ul>
<li class="chapter" data-level="7.2.1" data-path="inferential-statistics.html"><a href="inferential-statistics.html#pearson-correlation"><i class="fa fa-check"></i><b>7.2.1</b> Pearson Correlation</a></li>
<li class="chapter" data-level="7.2.2" data-path="inferential-statistics.html"><a href="inferential-statistics.html#spearman-correlation"><i class="fa fa-check"></i><b>7.2.2</b> Spearman Correlation</a></li>
<li class="chapter" data-level="7.2.3" data-path="inferential-statistics.html"><a href="inferential-statistics.html#pearson-vs-spearman"><i class="fa fa-check"></i><b>7.2.3</b> Pearson vs Spearman?</a></li>
<li class="chapter" data-level="7.2.4" data-path="inferential-statistics.html"><a href="inferential-statistics.html#visualising-correlations"><i class="fa fa-check"></i><b>7.2.4</b> Visualising Correlations</a></li>
</ul></li>
<li class="chapter" data-level="7.3" data-path="inferential-statistics.html"><a href="inferential-statistics.html#conclusion-6"><i class="fa fa-check"></i><b>7.3</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="8" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html"><i class="fa fa-check"></i><b>8</b> Regression Analysis</a>
<ul>
<li class="chapter" data-level="8.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#introduction-to-regression-analysis"><i class="fa fa-check"></i><b>8.1</b> Introduction to Regression Analysis</a>
<ul>
<li class="chapter" data-level="8.1.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#what-is-linear-regression"><i class="fa fa-check"></i><b>8.1.1</b> What is Linear Regression?</a></li>
</ul></li>
<li class="chapter" data-level="8.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#simple-linear-regression-in-r"><i class="fa fa-check"></i><b>8.2</b> Simple Linear Regression in R</a>
<ul>
<li class="chapter" data-level="8.2.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#performing-simple-linear-regression"><i class="fa fa-check"></i><b>8.2.1</b> Performing Simple Linear Regression</a></li>
<li class="chapter" data-level="8.2.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#interpreting-the-output"><i class="fa fa-check"></i><b>8.2.2</b> Interpreting the Output</a></li>
</ul></li>
<li class="chapter" data-level="8.3" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#multiple-linear-regression"><i class="fa fa-check"></i><b>8.3</b> Multiple Linear Regression</a>
<ul>
<li class="chapter" data-level="8.3.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#performing-multiple-linear-regression"><i class="fa fa-check"></i><b>8.3.1</b> Performing Multiple Linear Regression</a></li>
<li class="chapter" data-level="8.3.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#detailed-interpretation-of-the-output"><i class="fa fa-check"></i><b>8.3.2</b> Detailed Interpretation of the Output</a></li>
</ul></li>
<li class="chapter" data-level="8.4" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#checking-model-assumptions"><i class="fa fa-check"></i><b>8.4</b> Checking Model Assumptions</a>
<ul>
<li class="chapter" data-level="8.4.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-1-linearity"><i class="fa fa-check"></i><b>8.4.1</b> Assumption 1: Linearity</a></li>
<li class="chapter" data-level="8.4.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-2-normality-of-residuals"><i class="fa fa-check"></i><b>8.4.2</b> Assumption 2: Normality of Residuals</a></li>
<li class="chapter" data-level="8.4.3" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-3-homoscedasticity"><i class="fa fa-check"></i><b>8.4.3</b> Assumption 3: Homoscedasticity</a></li>
<li class="chapter" data-level="8.4.4" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-4-independence-of-errors"><i class="fa fa-check"></i><b>8.4.4</b> Assumption 4: Independence of Errors</a></li>
<li class="chapter" data-level="8.4.5" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-5-multicollinearity"><i class="fa fa-check"></i><b>8.4.5</b> Assumption 5: Multicollinearity</a></li>
</ul></li>
<li class="chapter" data-level="8.5" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#transformations-and-interaction-terms"><i class="fa fa-check"></i><b>8.5</b> Transformations and Interaction Terms</a>
<ul>
<li class="chapter" data-level="8.5.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#when-and-how-to-apply-transformations"><i class="fa fa-check"></i><b>8.5.1</b> When and How to Apply Transformations</a></li>
<li class="chapter" data-level="8.5.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#using-interaction-terms"><i class="fa fa-check"></i><b>8.5.2</b> Using Interaction Terms</a></li>
</ul></li>
<li class="chapter" data-level="8.6" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#logistic-regression"><i class="fa fa-check"></i><b>8.6</b> Logistic Regression</a>
<ul>
<li class="chapter" data-level="8.6.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#performing-logistic-regression-in-r"><i class="fa fa-check"></i><b>8.6.1</b> Performing Logistic Regression in R</a></li>
<li class="chapter" data-level="8.6.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#interpreting-logistic-regression-output"><i class="fa fa-check"></i><b>8.6.2</b> Interpreting Logistic Regression Output</a></li>
</ul></li>
<li class="chapter" data-level="8.7" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#checking-model-assumptions-for-logistic-regression"><i class="fa fa-check"></i><b>8.7</b> Checking Model Assumptions for Logistic Regression</a>
<ul>
<li class="chapter" data-level="8.7.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-1-linearity-of-the-logit"><i class="fa fa-check"></i><b>8.7.1</b> Assumption 1: Linearity of the Logit</a></li>
<li class="chapter" data-level="8.7.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-2-independence-of-observations"><i class="fa fa-check"></i><b>8.7.2</b> Assumption 2: Independence of Observations</a></li>
<li class="chapter" data-level="8.7.3" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-3-absence-of-multicollinearity"><i class="fa fa-check"></i><b>8.7.3</b> Assumption 3: Absence of Multicollinearity</a></li>
<li class="chapter" data-level="8.7.4" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#assumption-4-sufficient-sample-size"><i class="fa fa-check"></i><b>8.7.4</b> Assumption 4: Sufficient Sample Size</a></li>
</ul></li>
<li class="chapter" data-level="8.8" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#model-validation-and-diagnostics"><i class="fa fa-check"></i><b>8.8</b> Model Validation and Diagnostics</a>
<ul>
<li class="chapter" data-level="8.8.1" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#cross-validation"><i class="fa fa-check"></i><b>8.8.1</b> Cross-Validation</a></li>
<li class="chapter" data-level="8.8.2" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#dealing-with-overfitting"><i class="fa fa-check"></i><b>8.8.2</b> Dealing with Overfitting</a></li>
</ul></li>
<li class="chapter" data-level="8.9" data-path="regression-analysis-1.html"><a href="regression-analysis-1.html#conclusion-7"><i class="fa fa-check"></i><b>8.9</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="9" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html"><i class="fa fa-check"></i><b>9</b> Geographic Mapping and Spatial Analysis</a>
<ul>
<li class="chapter" data-level="9.1" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#introduction-to-geographic-data-in-r"><i class="fa fa-check"></i><b>9.1</b> Introduction to Geographic Data in R</a>
<ul>
<li class="chapter" data-level="9.1.1" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#understanding-geographic-data-formats"><i class="fa fa-check"></i><b>9.1.1</b> Understanding Geographic Data Formats</a></li>
<li class="chapter" data-level="9.1.2" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#importing-and-handling-spatial-data-with-the-sf-package"><i class="fa fa-check"></i><b>9.1.2</b> Importing and Handling Spatial Data with the <code>sf</code> Package</a></li>
</ul></li>
<li class="chapter" data-level="9.2" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#geographic-coordinate-systems-and-projections"><i class="fa fa-check"></i><b>9.2</b> Geographic Coordinate Systems and Projections</a>
<ul>
<li class="chapter" data-level="9.2.1" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#introduction-to-coordinate-systems"><i class="fa fa-check"></i><b>9.2.1</b> Introduction to Coordinate Systems</a></li>
<li class="chapter" data-level="9.2.2" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#common-issues-with-coordinate-systems-in-r"><i class="fa fa-check"></i><b>9.2.2</b> Common Issues with Coordinate Systems in R</a></li>
<li class="chapter" data-level="9.2.3" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#handling-coordinate-systems-in-r"><i class="fa fa-check"></i><b>9.2.3</b> Handling Coordinate Systems in R</a></li>
<li class="chapter" data-level="9.2.4" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#practical-considerations"><i class="fa fa-check"></i><b>9.2.4</b> Practical Considerations</a></li>
</ul></li>
<li class="chapter" data-level="9.3" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#creating-basic-maps"><i class="fa fa-check"></i><b>9.3</b> Creating Basic Maps</a>
<ul>
<li class="chapter" data-level="9.3.1" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#plotting-data-on-maps-using-ggplot2-and-sf"><i class="fa fa-check"></i><b>9.3.1</b> Plotting Data on Maps Using ggplot2 and sf</a></li>
<li class="chapter" data-level="9.3.2" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#customising-maps"><i class="fa fa-check"></i><b>9.3.2</b> Customising Maps</a></li>
</ul></li>
<li class="chapter" data-level="9.4" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#spatial-analysis"><i class="fa fa-check"></i><b>9.4</b> Spatial Analysis</a>
<ul>
<li class="chapter" data-level="9.4.1" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#basic-spatial-operations"><i class="fa fa-check"></i><b>9.4.1</b> Basic Spatial Operations</a></li>
<li class="chapter" data-level="9.4.2" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#creating-choropleth-maps"><i class="fa fa-check"></i><b>9.4.2</b> Creating Choropleth Maps</a></li>
</ul></li>
<li class="chapter" data-level="9.5" data-path="geographic-mapping-and-spatial-analysis.html"><a href="geographic-mapping-and-spatial-analysis.html#conclusion-8"><i class="fa fa-check"></i><b>9.5</b> Conclusion</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html"><i class="fa fa-check"></i>Exercise Answers</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-2"><i class="fa fa-check"></i>Chapter 2</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-2.1"><i class="fa fa-check"></i>Exercise 2.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-2.2"><i class="fa fa-check"></i>Exercise 2.2</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-2.3"><i class="fa fa-check"></i>Exercise 2.3</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-3"><i class="fa fa-check"></i>Chapter 3</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.1"><i class="fa fa-check"></i>Exercise 3.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.2"><i class="fa fa-check"></i>Exercise 3.2</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.3"><i class="fa fa-check"></i>Exercise 3.3</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.4"><i class="fa fa-check"></i>Exercise 3.4</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.5"><i class="fa fa-check"></i>Exercise 3.5</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.6"><i class="fa fa-check"></i>Exercise 3.6</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.7"><i class="fa fa-check"></i>Exercise 3.7</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-3.8"><i class="fa fa-check"></i>Exercise 3.8</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-5"><i class="fa fa-check"></i>Chapter 5</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.1"><i class="fa fa-check"></i>Exercise 5.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#excercise-5.2"><i class="fa fa-check"></i>Exercise 5.2</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.3"><i class="fa fa-check"></i>Exercise 5.3</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.4"><i class="fa fa-check"></i>Exercise 5.4</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.6"><i class="fa fa-check"></i>Exercise 5.6</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.7"><i class="fa fa-check"></i>Exercise 5.7</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.8"><i class="fa fa-check"></i>Exercise 5.8</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#excerise-5.9"><i class="fa fa-check"></i>Exercise 5.9</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-5.10"><i class="fa fa-check"></i>Exercise 5.10</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-6"><i class="fa fa-check"></i>Chapter 6</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-7"><i class="fa fa-check"></i>Chapter 7</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-7.1"><i class="fa fa-check"></i>Exercise 7.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-7.2"><i class="fa fa-check"></i>Exercise 7.2</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-7.3"><i class="fa fa-check"></i>Exercise 7.3</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-7.4"><i class="fa fa-check"></i>Exercise 7.4</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-8"><i class="fa fa-check"></i>Chapter 8</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-8.1"><i class="fa fa-check"></i>Exercise 8.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-8.2"><i class="fa fa-check"></i>Exercise 8.2</a></li>
</ul></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#chapter-9"><i class="fa fa-check"></i>Chapter 9</a>
<ul>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-9.1"><i class="fa fa-check"></i>Exercise 9.1</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-9.2"><i class="fa fa-check"></i>Exercise 9.2</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-9.3"><i class="fa fa-check"></i>Exercise 9.3</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-9.4"><i class="fa fa-check"></i>Exercise 9.4</a></li>
<li class="chapter" data-level="" data-path="exercise-answers.html"><a href="exercise-answers.html#exercise-9.5"><i class="fa fa-check"></i>Exercise 9.5</a></li>
</ul></li>
</ul></li>
<li class="divider"></li>
<li><a href="https://github.com/rstudio/bookdown" target="blank">Published with bookdown</a></li>
</ul>
</nav>
</div>
<div class="book-body">
<div class="body-inner">
<div class="book-header" role="navigation">
<h1>
<i class="fa fa-circle-o-notch fa-spin"></i><a href="./">Introduction to R for Crime Analysts</a>
</h1>
</div>
<div class="page-wrapper" tabindex="-1" role="main">
<div class="page-inner">
<section class="normal" id="section-">
<div id="descriptive-statistics-and-visualisations" class="section level1 hasAnchor" number="5">
<h1><span class="header-section-number">5</span> Descriptive Statistics and Visualisations<a href="descriptive-statistics-and-visualisations.html#descriptive-statistics-and-visualisations" class="anchor-section" aria-label="Anchor link to header"></a></h1>
<div id="introduction-to-descriptive-statistics" class="section level2 hasAnchor" number="5.1">
<h2><span class="header-section-number">5.1</span> Introduction to Descriptive Statistics<a href="descriptive-statistics-and-visualisations.html#introduction-to-descriptive-statistics" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Descriptive statistics provide a summary of the central tendency, dispersion, and shape of a dataset’s distribution. In SPSS, you may have used functions like <strong>FREQUENCIES</strong> or <strong>DESCRIPTIVES</strong>. In R, these tasks are just as straightforward, with added flexibility and power.</p>
<p>In this chapter we will use a fictional dataset called <a href="data/police_activity_data.csv">police_activity_data</a>. You can download a copy by clicking on the link.</p>
<div class="sourceCode" id="cb48"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb48-1"><a href="descriptive-statistics-and-visualisations.html#cb48-1" tabindex="-1"></a><span class="co">#Load the dataset from csv file</span></span>
<span id="cb48-2"><a href="descriptive-statistics-and-visualisations.html#cb48-2" tabindex="-1"></a>police_activity_data <span class="ot"><-</span> <span class="fu">read.csv</span>(<span class="st">'data/police_activity_data.csv'</span>)</span>
<span id="cb48-3"><a href="descriptive-statistics-and-visualisations.html#cb48-3" tabindex="-1"></a></span>
<span id="cb48-4"><a href="descriptive-statistics-and-visualisations.html#cb48-4" tabindex="-1"></a><span class="co">#Explore the first 5 entries</span></span>
<span id="cb48-5"><a href="descriptive-statistics-and-visualisations.html#cb48-5" tabindex="-1"></a><span class="fu">head</span>(police_activity_data, <span class="dv">5</span>)</span></code></pre></div>
<pre><code>## IncidentID Date Time IncidentType ResponseTime OfficersInvolved Outcome Borough IncidentSeverity
## 1 INC001 2024-08-31 15:32 Burglary 13 1 No Action North 1
## 2 INC002 2024-08-15 19:15 Public Disturbance 12 4 Arrest East 2
## 3 INC003 2024-08-19 07:27 Public Disturbance 7 2 Arrest West 5
## 4 INC004 2024-08-14 02:48 Traffic Stop 14 2 Warning South 2
## 5 INC005 2024-08-03 03:11 Traffic Stop 13 4 Warning East 5</code></pre>
<div id="understanding-descriptive-statistics" class="section level3 hasAnchor" number="5.1.1">
<h3><span class="header-section-number">5.1.1</span> Understanding Descriptive Statistics<a href="descriptive-statistics-and-visualisations.html#understanding-descriptive-statistics" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<ul>
<li><em>Measures of Central Tendency</em>: These describe the centre of the data (mean, median, mode).</li>
<li><em>Measures of Dispersion</em>: These describe the spread of the data (range, variance, standard deviation, interquartile range).</li>
<li><em>Shape of Distribution</em>: Skewness and kurtosis help describe the shape of the data distribution.</li>
</ul>
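<p>As a minimal sketch of how these measures map onto base R functions, the example below uses a small made-up vector, <code>response_times</code>, rather than the chapter dataset. Base R has no built-in function for the statistical mode or for skewness, so the helper <code>stat_mode()</code> and the moment-based skewness calculation are illustrative assumptions rather than functions from any package.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical response times in minutes (illustrative values only)
response_times <- c(5, 7, 9, 9, 12, 13, 13, 13, 16, 20)

# Central tendency
mean(response_times)
median(response_times)

# Base R's mode() reports the storage type, not the most frequent value,
# so define a small helper (returns the first value if there is a tie)
stat_mode <- function(x) {
  counts <- table(x)
  as.numeric(names(counts)[which.max(counts)])
}
stat_mode(response_times)

# Dispersion
range(response_times)  # minimum and maximum
var(response_times)    # sample variance
sd(response_times)     # sample standard deviation
IQR(response_times)    # interquartile range

# Shape: one simple moment-based estimate of skewness
# (an assumption for illustration; packages may use slightly different formulas)
skewness <- mean((response_times - mean(response_times))^3) / sd(response_times)^3
skewness</code></pre></div>
<p>A skewness value above zero suggests a longer right tail and a value below zero suggests a longer left tail.</p>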
</div>
<div id="basic-descriptive-statistics-in-r" class="section level3 hasAnchor" number="5.1.2">
<h3><span class="header-section-number">5.1.2</span> Basic Descriptive Statistics in R<a href="descriptive-statistics-and-visualisations.html#basic-descriptive-statistics-in-r" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>R provides multiple ways to compute descriptive statistics. Here are some basic functions.</p>
<p>The <code>summary()</code> function provides a summary of each variable in a dataset.</p>
<div class="sourceCode" id="cb50"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb50-1"><a href="descriptive-statistics-and-visualisations.html#cb50-1" tabindex="-1"></a><span class="co"># Basic summary of all variables in a data frame</span></span>
<span id="cb50-2"><a href="descriptive-statistics-and-visualisations.html#cb50-2" tabindex="-1"></a><span class="fu">summary</span>(police_activity_data)</span></code></pre></div>
<pre><code>## IncidentID Date Time IncidentType ResponseTime OfficersInvolved Outcome Borough IncidentSeverity
## Length:200 Length:200 Length:200 Length:200 Min. : 5.00 Min. :1.00 Length:200 Length:200 Min. :1.00
## Class :character Class :character Class :character Class :character 1st Qu.: 9.00 1st Qu.:2.00 Class :character Class :character 1st Qu.:2.00
## Mode :character Mode :character Mode :character Mode :character Median :13.00 Median :3.00 Mode :character Mode :character Median :3.00
## Mean :12.42 Mean :3.03 Mean :2.98
## 3rd Qu.:16.00 3rd Qu.:4.00 3rd Qu.:4.00
## Max. :20.00 Max. :5.00 Max. :5.00</code></pre>
<p>You can also use dedicated functions such as <code>mean()</code>, <code>median()</code> and <code>sd()</code> to calculate individual statistics. Note the <code>na.rm = TRUE</code> argument, which tells R to remove <code>NA</code> (missing) values before calculating the result.</p>
<div class="sourceCode" id="cb52"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb52-1"><a href="descriptive-statistics-and-visualisations.html#cb52-1" tabindex="-1"></a><span class="co"># Mean of a numeric variable</span></span>
<span id="cb52-2"><a href="descriptive-statistics-and-visualisations.html#cb52-2" tabindex="-1"></a><span class="fu">mean</span>(police_activity_data<span class="sc">$</span>ResponseTime, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
<pre><code>## [1] 12.42</code></pre>
<div class="sourceCode" id="cb54"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb54-1"><a href="descriptive-statistics-and-visualisations.html#cb54-1" tabindex="-1"></a><span class="co"># Median of a numeric variable</span></span>
<span id="cb54-2"><a href="descriptive-statistics-and-visualisations.html#cb54-2" tabindex="-1"></a><span class="fu">median</span>(police_activity_data<span class="sc">$</span>ResponseTime, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
<pre><code>## [1] 13</code></pre>
<div class="sourceCode" id="cb56"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb56-1"><a href="descriptive-statistics-and-visualisations.html#cb56-1" tabindex="-1"></a><span class="co"># Standard deviation of a numeric variable</span></span>
<span id="cb56-2"><a href="descriptive-statistics-and-visualisations.html#cb56-2" tabindex="-1"></a><span class="fu">sd</span>(police_activity_data<span class="sc">$</span>ResponseTime, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span></code></pre></div>
<pre><code>## [1] 4.290178</code></pre>
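<p>To see why <code>na.rm = TRUE</code> matters, here is a small sketch using a made-up vector, <code>times_with_na</code>, which is hypothetical and not part of the chapter dataset. Without the argument, a single missing value makes the whole result <code>NA</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r"># Hypothetical vector with one missing value
times_with_na <- c(10, 12, NA, 14)

# Returns NA because one value is missing
mean(times_with_na)

# Returns 12, the mean of 10, 12 and 14
mean(times_with_na, na.rm = TRUE)

# Count the missing values before deciding how to handle them
sum(is.na(times_with_na))</code></pre></div>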
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Download the <code>police_activity_data.csv</code> file and load it into an R data frame. Produce a basic summary of the <code>ResponseTime</code> variable. What does the difference between the mean and the median tell you about the skewness of the data?</p>
</div>
</div>
</div>
<div id="creating-visualisations-with-ggplot2" class="section level2 hasAnchor" number="5.2">
<h2><span class="header-section-number">5.2</span> Creating Visualisations with ggplot2<a href="descriptive-statistics-and-visualisations.html#creating-visualisations-with-ggplot2" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Visualisations are key to understanding and presenting data. The <code>ggplot2</code> package is a powerful tool for creating a wide variety of plots, from simple bar charts to complex multi-layered visualisations.</p>
<div id="introduction-to-ggplot2" class="section level3 hasAnchor" number="5.2.1">
<h3><span class="header-section-number">5.2.1</span> Introduction to ggplot2<a href="descriptive-statistics-and-visualisations.html#introduction-to-ggplot2" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p><code>ggplot2</code> is part of the <code>tidyverse</code> collection of packages and is based on the grammar of graphics. The basic structure of a <code>ggplot2</code> plot involves:</p>
<ul>
<li><strong>Data</strong>: The dataset being used.</li>
<li><strong>Aesthetics (aes)</strong>: Mapping variables to visual properties like x, y, color, size.</li>
<li><strong>Geometries (geom)</strong>: The type of plot (e.g., geom_bar for bar charts, geom_point for scatter plots).</li>
</ul>
<p>Install <code>ggplot2</code> if you haven’t already:</p>
<div class="sourceCode" id="cb58"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb58-1"><a href="descriptive-statistics-and-visualisations.html#cb58-1" tabindex="-1"></a><span class="fu">install.packages</span>(<span class="st">"ggplot2"</span>)</span></code></pre></div>
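<p>As a rough illustration of how the data, aesthetics and geometry fit together, the sketch below builds a scatter plot from a small made-up data frame. The data frame <code>incidents</code> and its values are hypothetical; only the column names are borrowed from the chapter dataset.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical data frame (illustrative values only)
incidents <- data.frame(
  ResponseTime = c(13, 12, 7, 14, 13, 9),
  OfficersInvolved = c(1, 4, 2, 2, 4, 3)
)

# Data: incidents
# Aesthetics: OfficersInvolved on the x-axis, ResponseTime on the y-axis
# Geometry: points, giving a scatter plot
p <- ggplot(incidents, aes(x = OfficersInvolved, y = ResponseTime)) +
  geom_point()

# Printing the object draws the plot
p</code></pre></div>
<p>Each further layer, such as labels or another geometry, is added to the plot with <code>+</code>.</p>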
</div>
<div id="creating-basic-plots" class="section level3 hasAnchor" number="5.2.2">
<h3><span class="header-section-number">5.2.2</span> Creating Basic Plots<a href="descriptive-statistics-and-visualisations.html#creating-basic-plots" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Here’s how you can create some basic plots in R using ggplot2:</p>
<div id="bar-charts" class="section level4 hasAnchor" number="5.2.2.1">
<h4><span class="header-section-number">5.2.2.1</span> Bar Charts<a href="descriptive-statistics-and-visualisations.html#bar-charts" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<p>Bar charts are used to display the frequency of categorical data.</p>
<ul>
<li><code>aes(x = categorical_variable)</code>: Maps the categorical variable to the x-axis.</li>
<li><code>geom_bar()</code>: Creates the bar chart.</li>
</ul>
<div class="sourceCode" id="cb59"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb59-1"><a href="descriptive-statistics-and-visualisations.html#cb59-1" tabindex="-1"></a><span class="co">#Load the ggplot2 library</span></span>
<span id="cb59-2"><a href="descriptive-statistics-and-visualisations.html#cb59-2" tabindex="-1"></a><span class="fu">library</span>(ggplot2)</span>
<span id="cb59-3"><a href="descriptive-statistics-and-visualisations.html#cb59-3" tabindex="-1"></a></span>
<span id="cb59-4"><a href="descriptive-statistics-and-visualisations.html#cb59-4" tabindex="-1"></a><span class="co"># Bar chart for a categorical variable</span></span>
<span id="cb59-5"><a href="descriptive-statistics-and-visualisations.html#cb59-5" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> categorical_variable)) <span class="sc">+</span></span>
<span id="cb59-6"><a href="descriptive-statistics-and-visualisations.html#cb59-6" tabindex="-1"></a> <span class="fu">geom_bar</span>() <span class="sc">+</span></span>
<span id="cb59-7"><a href="descriptive-statistics-and-visualisations.html#cb59-7" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Bar Chart of Categorical Variable"</span>, <span class="at">x =</span> <span class="st">"Category"</span>, <span class="at">y =</span> <span class="st">"Count"</span>)</span></code></pre></div>
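<p>To make the template above concrete, here is a sketch that replaces the placeholders with a small made-up data frame. The name <code>incident_sample</code> and its values are assumptions for illustration; with the chapter dataset loaded you would pass <code>police_activity_data</code> instead.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Hypothetical sample of incident types (illustrative values only)
incident_sample <- data.frame(
  IncidentType = c("Burglary", "Public Disturbance", "Public Disturbance",
                   "Traffic Stop", "Traffic Stop", "Traffic Stop")
)

# geom_bar() counts the rows in each category and draws one bar per category
bar_plot <- ggplot(incident_sample, aes(x = IncidentType)) +
  geom_bar() +
  labs(title = "Incidents by Type", x = "Incident Type", y = "Count")
bar_plot

# The same counts as a table, useful for checking the bar heights
table(incident_sample$IncidentType)</code></pre></div>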
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Create a Bar Chart of the <code>Borough</code> variable in the <code>police_activity_data</code> dataset. Which borough has the greatest number of incidents?</p>
</div>
</div>
<div id="histograms" class="section level4 hasAnchor" number="5.2.2.2">
<h4><span class="header-section-number">5.2.2.2</span> Histograms<a href="descriptive-statistics-and-visualisations.html#histograms" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<p>Histograms show the distribution of a continuous variable.</p>
<ul>
<li><code>geom_histogram(binwidth = 10)</code>: Creates the histogram with the specified bin width.</li>
<li><code>fill</code> and <code>color</code>: Set the fill colour and the outline colour of the bars.</li>
</ul>
<div class="sourceCode" id="cb60"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb60-1"><a href="descriptive-statistics-and-visualisations.html#cb60-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> continuous_variable)) <span class="sc">+</span></span>
<span id="cb60-2"><a href="descriptive-statistics-and-visualisations.html#cb60-2" tabindex="-1"></a> <span class="fu">geom_histogram</span>(<span class="at">binwidth =</span> <span class="dv">10</span>, <span class="at">fill =</span> <span class="st">"blue"</span>, <span class="at">color =</span> <span class="st">"black"</span>) <span class="sc">+</span></span>
<span id="cb60-3"><a href="descriptive-statistics-and-visualisations.html#cb60-3" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Histogram of Continuous Variable"</span>, <span class="at">x =</span> <span class="st">"Value"</span>, <span class="at">y =</span> <span class="st">"Frequency"</span>)</span></code></pre></div>
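<p>The choice of <code>binwidth</code> changes how much detail a histogram shows. The sketch below compares two bin widths on simulated values; the data frame <code>sim_data</code> is randomly generated for illustration and is not the chapter dataset.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Simulated response times in minutes (an assumption for illustration)
set.seed(42)
sim_data <- data.frame(ResponseTime = round(runif(200, min = 5, max = 20)))

# Narrow bins: one bar per minute, showing fine detail
hist_narrow <- ggplot(sim_data, aes(x = ResponseTime)) +
  geom_histogram(binwidth = 1, fill = "blue", color = "black") +
  labs(title = "Bin Width of 1", x = "Response Time (minutes)", y = "Frequency")

# Wide bins: five minutes per bar, showing only the overall shape
hist_wide <- ggplot(sim_data, aes(x = ResponseTime)) +
  geom_histogram(binwidth = 5, fill = "blue", color = "black") +
  labs(title = "Bin Width of 5", x = "Response Time (minutes)", y = "Frequency")

hist_narrow
hist_wide</code></pre></div>
<p>Bins that are too narrow can make the plot look noisy, while bins that are too wide can hide features such as multiple peaks.</p>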
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Create a Histogram of the <code>ResponseTime</code> variable in the <code>police_activity_data</code> dataset, setting the bin width to 4. How is the response time distributed?</p>
</div>
</div>
<div id="boxplots" class="section level4 hasAnchor" number="5.2.2.3">
<h4><span class="header-section-number">5.2.2.3</span> Boxplots<a href="descriptive-statistics-and-visualisations.html#boxplots" class="anchor-section" aria-label="Anchor link to header"></a></h4>
<p>Boxplots display the distribution of a variable and its potential outliers.</p>
<ul>
<li><code>aes(x = factor_variable, y = continuous_variable)</code>: Maps the factor variable to the x-axis and the continuous variable to the y-axis.</li>
<li><code>geom_boxplot()</code>: Creates the boxplot.</li>
</ul>
<div class="sourceCode" id="cb61"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb61-1"><a href="descriptive-statistics-and-visualisations.html#cb61-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> factor_variable, <span class="at">y =</span> continuous_variable)) <span class="sc">+</span></span>
<span id="cb61-2"><a href="descriptive-statistics-and-visualisations.html#cb61-2" tabindex="-1"></a> <span class="fu">geom_boxplot</span>() <span class="sc">+</span></span>
<span id="cb61-3"><a href="descriptive-statistics-and-visualisations.html#cb61-3" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Boxplot of Continuous Variable by Factor"</span>, <span class="at">x =</span> <span class="st">"Factor"</span>, <span class="at">y =</span> <span class="st">"Value"</span>)</span></code></pre></div>
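<p>The following sketch shows what a boxplot summarises, using the built-in <code>mtcars</code> dataset as an assumed stand-in for <code>your_data</code>. It draws the plot and then computes, for one group, the quartiles and the 1.5 &times; IQR limits that <code>geom_boxplot()</code> uses by default to decide which points are drawn as outliers.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# cyl is stored as a number, so convert it to a factor for grouping
p_box <- ggplot(mtcars, aes(x = factor(cyl), y = mpg)) +
  geom_boxplot() +
  labs(title = "Fuel Economy by Number of Cylinders", x = "Cylinders", y = "Miles per gallon")
print(p_box)

# The numbers behind the box for 4-cylinder cars
four_cyl <- mtcars$mpg[mtcars$cyl == 4]
q <- quantile(four_cyl, probs = c(0.25, 0.5, 0.75))  # box edges and median line
iqr <- IQR(four_cyl)                                 # height of the box

# Points outside these limits are plotted individually as potential outliers
lower_limit <- q[["25%"]] - 1.5 * iqr
upper_limit <- q[["75%"]] + 1.5 * iqr
print(q)
print(c(lower = lower_limit, upper = upper_limit))</code></pre></div>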
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Create a Boxplot of the <code>ResponseTime</code> variable for each <code>Borough</code> in the <code>police_activity_data</code> dataset. Which borough has the lowest median response time? Which borough has the smallest range of response times?</p>
</div>
</div>
</div>
<div id="customising-your-plots" class="section level3 hasAnchor" number="5.2.3">
<h3><span class="header-section-number">5.2.3</span> Customising Your Plots<a href="descriptive-statistics-and-visualisations.html#customising-your-plots" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>One of the strengths of ggplot2 is its flexibility in customising plots. You can add further layers and settings to a plot using the <code>+</code> operator.</p>
<p><strong>Adding Titles, Labels, and Themes</strong></p>
<ul>
<li><code>labs()</code>: Adds titles and axis labels.</li>
<li><code>theme_minimal()</code>: Applies a clean, minimalistic theme to the plot.</li>
</ul>
<div class="sourceCode" id="cb62"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb62-1"><a href="descriptive-statistics-and-visualisations.html#cb62-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> categorical_variable)) <span class="sc">+</span></span>
<span id="cb62-2"><a href="descriptive-statistics-and-visualisations.html#cb62-2" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">fill =</span> <span class="st">"lightblue"</span>, <span class="at">color =</span> <span class="st">"black"</span>) <span class="sc">+</span></span>
<span id="cb62-3"><a href="descriptive-statistics-and-visualisations.html#cb62-3" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Bar Chart of Categorical Variable"</span>, <span class="at">x =</span> <span class="st">"Category"</span>, <span class="at">y =</span> <span class="st">"Count"</span>) <span class="sc">+</span></span>
<span id="cb62-4"><a href="descriptive-statistics-and-visualisations.html#cb62-4" tabindex="-1"></a> <span class="fu">theme_minimal</span>()</span></code></pre></div>
<p><strong>Using Colors to Enhance Visualisations</strong></p>
<p>You can differentiate categories or highlight data points using color.</p>
<ul>
<li><code>scale_fill_brewer(palette = "Pastel1")</code>: Applies a color palette to the fill of the boxplots.</li>
</ul>
<div class="sourceCode" id="cb63"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb63-1"><a href="descriptive-statistics-and-visualisations.html#cb63-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> factor_variable, <span class="at">y =</span> continuous_variable, <span class="at">fill =</span> factor_variable)) <span class="sc">+</span></span>
<span id="cb63-2"><a href="descriptive-statistics-and-visualisations.html#cb63-2" tabindex="-1"></a> <span class="fu">geom_boxplot</span>() <span class="sc">+</span></span>
<span id="cb63-3"><a href="descriptive-statistics-and-visualisations.html#cb63-3" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Boxplot of Continuous Variable by Factor"</span>, <span class="at">x =</span> <span class="st">"Factor"</span>, <span class="at">y =</span> <span class="st">"Value"</span>) <span class="sc">+</span></span>
<span id="cb63-4"><a href="descriptive-statistics-and-visualisations.html#cb63-4" tabindex="-1"></a> <span class="fu">scale_fill_brewer</span>(<span class="at">palette =</span> <span class="st">"Pastel1"</span>)</span></code></pre></div>
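<p>When the same variable is mapped to both the x-axis and <code>fill</code>, the legend repeats what the axis already shows. As a minimal sketch, assuming the built-in <code>iris</code> dataset in place of <code>your_data</code> and the <code>"Set2"</code> palette chosen only as an example, the legend can be hidden with <code>theme()</code>:</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

p_colour <- ggplot(iris, aes(x = Species, y = Petal.Length, fill = Species)) +
  geom_boxplot() +
  labs(title = "Petal Length by Species", x = "Species", y = "Petal length (cm)") +
  scale_fill_brewer(palette = "Set2") +
  theme_minimal() +
  # Place theme() after theme_minimal() so this setting is not overwritten
  theme(legend.position = "none")
print(p_colour)</code></pre></div>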
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Take your Boxplot of the <code>ResponseTime</code> variable for each <code>Borough</code> in the <code>police_activity_data</code> dataset and add some colour!</p>
</div>
</div>
</div>
<div id="descriptive-statistics-with-dplyr" class="section level2 hasAnchor" number="5.3">
<h2><span class="header-section-number">5.3</span> Descriptive Statistics with dplyr<a href="descriptive-statistics-and-visualisations.html#descriptive-statistics-with-dplyr" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p><code>dplyr</code> is a powerful tool for data manipulation and is also useful for summarising data. It works well alongside <code>ggplot2</code> for data exploration and visualisation.</p>
<div id="using-dplyr-to-summarise-data" class="section level3 hasAnchor" number="5.3.1">
<h3><span class="header-section-number">5.3.1</span> Using dplyr to Summarise Data<a href="descriptive-statistics-and-visualisations.html#using-dplyr-to-summarise-data" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>You can summarise data by calculating various descriptive statistics for different groups.</p>
<ul>
<li><code>group_by(factor_variable)</code>: Groups the data by the specified factor variable.</li>
<li><code>summarize()</code>: Collapses each group into a single row containing the statistics you specify (here, the mean and standard deviation).</li>
</ul>
<div class="sourceCode" id="cb64"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb64-1"><a href="descriptive-statistics-and-visualisations.html#cb64-1" tabindex="-1"></a><span class="co"># Load the dplyr library</span></span>
<span id="cb64-2"><a href="descriptive-statistics-and-visualisations.html#cb64-2" tabindex="-1"></a><span class="fu">library</span>(dplyr)</span>
<span id="cb64-3"><a href="descriptive-statistics-and-visualisations.html#cb64-3" tabindex="-1"></a></span>
<span id="cb64-4"><a href="descriptive-statistics-and-visualisations.html#cb64-4" tabindex="-1"></a><span class="co"># Summarise data: mean and standard deviation by group</span></span>
<span id="cb64-5"><a href="descriptive-statistics-and-visualisations.html#cb64-5" tabindex="-1"></a>summary_data <span class="ot"><-</span> your_data <span class="sc">%>%</span></span>
<span id="cb64-6"><a href="descriptive-statistics-and-visualisations.html#cb64-6" tabindex="-1"></a> <span class="fu">group_by</span>(factor_variable) <span class="sc">%>%</span></span>
<span id="cb64-7"><a href="descriptive-statistics-and-visualisations.html#cb64-7" tabindex="-1"></a> <span class="fu">summarize</span>(</span>
<span id="cb64-8"><a href="descriptive-statistics-and-visualisations.html#cb64-8" tabindex="-1"></a> <span class="at">mean_value =</span> <span class="fu">mean</span>(continuous_variable, <span class="at">na.rm =</span> <span class="cn">TRUE</span>),</span>
<span id="cb64-9"><a href="descriptive-statistics-and-visualisations.html#cb64-9" tabindex="-1"></a> <span class="at">sd_value =</span> <span class="fu">sd</span>(continuous_variable, <span class="at">na.rm =</span> <span class="cn">TRUE</span>)</span>
<span id="cb64-10"><a href="descriptive-statistics-and-visualisations.html#cb64-10" tabindex="-1"></a> )</span></code></pre></div>
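<p>As a concrete illustration of the same pattern, the sketch below runs on the built-in <code>iris</code> dataset, an assumed stand-in for <code>your_data</code>. It also adds a group size with <code>n()</code> and sorts the result with <code>arrange()</code>, which makes it easier to spot the group with the largest mean.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)

species_summary <- iris %>%
  group_by(Species) %>%
  summarize(
    n_rows        = n(),                                # observations per group
    mean_length   = mean(Sepal.Length, na.rm = TRUE),
    sd_length     = sd(Sepal.Length, na.rm = TRUE),
    median_length = median(Sepal.Length, na.rm = TRUE)
  ) %>%
  arrange(desc(mean_length))                            # largest mean first

print(species_summary)</code></pre></div>
<p>The result has one row per level of the grouping variable, so it can be passed straight to <code>ggplot()</code>.</p>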
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Using the <code>police_activity_data</code> dataset, calculate the mean and standard deviation of <code>ResponseTime</code> for each <code>IncidentType</code>. Which incident type had the greatest mean response time?</p>
</div>
</div>
<div id="combining-dplyr-with-ggplot2" class="section level3 hasAnchor" number="5.3.2">
<h3><span class="header-section-number">5.3.2</span> Combining dplyr with ggplot2<a href="descriptive-statistics-and-visualisations.html#combining-dplyr-with-ggplot2" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>You can easily combine the power of <code>dplyr</code> and <code>ggplot2</code> to create insightful visualisations.</p>
<div class="sourceCode" id="cb65"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb65-1"><a href="descriptive-statistics-and-visualisations.html#cb65-1" tabindex="-1"></a><span class="co"># Example: Create a summary and plot it</span></span>
<span id="cb65-2"><a href="descriptive-statistics-and-visualisations.html#cb65-2" tabindex="-1"></a>summary_data <span class="ot"><-</span> your_data <span class="sc">%>%</span></span>
<span id="cb65-3"><a href="descriptive-statistics-and-visualisations.html#cb65-3" tabindex="-1"></a> <span class="fu">group_by</span>(factor_variable) <span class="sc">%>%</span></span>
<span id="cb65-4"><a href="descriptive-statistics-and-visualisations.html#cb65-4" tabindex="-1"></a> <span class="fu">summarize</span>(<span class="at">sd_value =</span> <span class="fu">sd</span>(continuous_variable, <span class="at">na.rm =</span> <span class="cn">TRUE</span>))</span>
<span id="cb65-5"><a href="descriptive-statistics-and-visualisations.html#cb65-5" tabindex="-1"></a></span>
<span id="cb65-6"><a href="descriptive-statistics-and-visualisations.html#cb65-6" tabindex="-1"></a><span class="fu">ggplot</span>(summary_data, <span class="fu">aes</span>(<span class="at">x =</span> factor_variable, <span class="at">y =</span> sd_value)) <span class="sc">+</span></span>
<span id="cb65-7"><a href="descriptive-statistics-and-visualisations.html#cb65-7" tabindex="-1"></a> <span class="fu">geom_bar</span>(<span class="at">stat =</span> <span class="st">"identity"</span>) <span class="sc">+</span></span>
<span id="cb65-8"><a href="descriptive-statistics-and-visualisations.html#cb65-8" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Standard Deviation of Continuous Variable by Factor"</span>,</span>
<span id="cb65-9"><a href="descriptive-statistics-and-visualisations.html#cb65-9" tabindex="-1"></a> <span class="at">x =</span> <span class="st">"Factor"</span>,</span>
<span id="cb65-10"><a href="descriptive-statistics-and-visualisations.html#cb65-10" tabindex="-1"></a> <span class="at">y =</span> <span class="st">"Standard Deviation"</span>)</span></code></pre></div>
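<p>By default, ggplot2 orders the categories of a character or factor variable alphabetically along the axis. One way to order them by a summary value instead is base R's <code>reorder()</code>. The sketch below assumes the built-in <code>iris</code> dataset rather than <code>your_data</code>, and uses <code>geom_col()</code>, which is equivalent to <code>geom_bar(stat = "identity")</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(dplyr)
library(ggplot2)

mean_by_species <- iris %>%
  group_by(Species) %>%
  summarize(mean_width = mean(Sepal.Width, na.rm = TRUE))

# reorder() sorts the categories by mean_width (ascending) rather than alphabetically
p_sorted <- ggplot(mean_by_species, aes(x = reorder(Species, mean_width), y = mean_width)) +
  geom_col() +
  labs(title = "Mean Sepal Width by Species",
       x = "Species",
       y = "Mean sepal width (cm)")
print(p_sorted)</code></pre></div>
<p>Using <code>reorder(Species, -mean_width)</code> would sort the bars in descending order instead.</p>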
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Building on the previous exercise, create a plot of the mean <code>ResponseTime</code> for each <code>IncidentType</code>. By default, the categories along the x-axis are sorted alphabetically. Can you sort them in ascending order of their mean instead?</p>
</div>
</div>
</div>
<div id="advanced-visualisation-techniques" class="section level2 hasAnchor" number="5.4">
<h2><span class="header-section-number">5.4</span> Advanced Visualisation Techniques<a href="descriptive-statistics-and-visualisations.html#advanced-visualisation-techniques" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>Once you’re comfortable with basic plots, <code>ggplot2</code> offers many advanced features for more complex visualisations.</p>
<div id="faceting" class="section level3 hasAnchor" number="5.4.1">
<h3><span class="header-section-number">5.4.1</span> Faceting<a href="descriptive-statistics-and-visualisations.html#faceting" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Faceting allows you to create multiple plots based on the values of one or more variables.</p>
<ul>
<li><code>facet_wrap(~factor_variable)</code>: Creates separate plots for each level of factor_variable.</li>
</ul>
<div class="sourceCode" id="cb66"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb66-1"><a href="descriptive-statistics-and-visualisations.html#cb66-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> continuous_variable)) <span class="sc">+</span></span>
<span id="cb66-2"><a href="descriptive-statistics-and-visualisations.html#cb66-2" tabindex="-1"></a> <span class="fu">geom_histogram</span>(<span class="at">binwidth =</span> <span class="dv">10</span>) <span class="sc">+</span></span>
<span id="cb66-3"><a href="descriptive-statistics-and-visualisations.html#cb66-3" tabindex="-1"></a> <span class="fu">facet_wrap</span>(<span class="sc">~</span>factor_variable) <span class="sc">+</span></span>
<span id="cb66-4"><a href="descriptive-statistics-and-visualisations.html#cb66-4" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Histogram Faceted by Factor"</span>)</span></code></pre></div>
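<p>The sketch below applies faceting to the built-in <code>iris</code> dataset, which is assumed here only as an example. Setting <code>ncol = 1</code> stacks the panels vertically; because <code>facet_wrap()</code> shares the axes across panels by default, this makes it easier to compare where each distribution sits along the x-axis.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

p_facets <- ggplot(iris, aes(x = Sepal.Length)) +
  geom_histogram(binwidth = 0.25, fill = "lightblue", color = "black") +
  facet_wrap(~Species, ncol = 1) +   # one panel per species, stacked in a single column
  labs(title = "Sepal Length by Species", x = "Sepal length (cm)", y = "Frequency")
print(p_facets)</code></pre></div>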
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Use <code>facet_wrap()</code> to create a series of histograms of <code>IncidentSeverity</code>, one for each of the four boroughs, using a bin width of 3. How does the distribution vary across the boroughs?</p>
</div>
</div>
<div id="combining-multiple-geoms" class="section level3 hasAnchor" number="5.4.2">
<h3><span class="header-section-number">5.4.2</span> Combining Multiple Geoms<a href="descriptive-statistics-and-visualisations.html#combining-multiple-geoms" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>You can layer multiple geometries to create complex plots.</p>
<ul>
<li><code>geom_point()</code>: Adds a scatter plot.</li>
<li><code>geom_smooth(method = "lm", se = FALSE)</code>: Adds a linear regression line; <code>se = FALSE</code> hides the confidence interval band.</li>
</ul>
<div class="sourceCode" id="cb67"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb67-1"><a href="descriptive-statistics-and-visualisations.html#cb67-1" tabindex="-1"></a><span class="fu">ggplot</span>(your_data, <span class="fu">aes</span>(<span class="at">x =</span> continuous_variable, <span class="at">y =</span> another_continuous_variable)) <span class="sc">+</span></span>
<span id="cb67-2"><a href="descriptive-statistics-and-visualisations.html#cb67-2" tabindex="-1"></a> <span class="fu">geom_point</span>() <span class="sc">+</span></span>
<span id="cb67-3"><a href="descriptive-statistics-and-visualisations.html#cb67-3" tabindex="-1"></a> <span class="fu">geom_smooth</span>(<span class="at">method =</span> <span class="st">"lm"</span>, <span class="at">se =</span> <span class="cn">FALSE</span>) <span class="sc">+</span></span>
<span id="cb67-4"><a href="descriptive-statistics-and-visualisations.html#cb67-4" tabindex="-1"></a> <span class="fu">labs</span>(<span class="at">title =</span> <span class="st">"Scatter Plot with Regression Line"</span>, <span class="at">x =</span> <span class="st">"X Variable"</span>, <span class="at">y =</span> <span class="st">"Y Variable"</span>)</span></code></pre></div>
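<p>To show what the <code>se</code> argument changes, the sketch below keeps the confidence band by setting <code>se = TRUE</code> and then fits the equivalent model with <code>lm()</code> to inspect the intercept and slope of the line. The built-in <code>mtcars</code> dataset is an assumption used in place of <code>your_data</code>.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

p_scatter <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point() +
  # se = TRUE draws a shaded confidence band around the fitted line
  geom_smooth(method = "lm", formula = y ~ x, se = TRUE) +
  labs(title = "Fuel Economy against Weight", x = "Weight (1000 lbs)", y = "Miles per gallon")
print(p_scatter)

# The plotted line corresponds to this ordinary linear model
fit <- lm(mpg ~ wt, data = mtcars)
print(coef(fit))</code></pre></div>
<p>A negative slope for <code>wt</code> indicates that the line falls from left to right.</p>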
<div class="infobox caution">
<p><strong>Exercise!</strong></p>
<p>Using <code>dplyr</code> produce a count of the number of crimes that occurred on each day. Use this information to create a scatterplot with a regression line. What do you notice about the trend?</p>
<ul>
<li><em>Hint I: Use the <code>dplyr</code> pipeline to group the data in conjunction with the <code>summarise(count = n())</code> function.</em></li>
<li><em>Hint II: If you can’t see a trendline you may need to review your <code>Date</code> variable data type.</em></li>
</ul>
</div>
</div>
<div id="saving-your-plots" class="section level3 hasAnchor" number="5.4.3">
<h3><span class="header-section-number">5.4.3</span> Saving Your Plots<a href="descriptive-statistics-and-visualisations.html#saving-your-plots" class="anchor-section" aria-label="Anchor link to header"></a></h3>
<p>Once you’ve created a plot, you might want to save it for later use.</p>
<ul>
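<li><code>ggsave()</code>: Saves the most recently displayed plot by default. The <code>width</code> and <code>height</code> arguments are measured in inches unless <code>units</code> is set.</li>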
</ul>
<div class="sourceCode" id="cb68"><pre class="sourceCode r"><code class="sourceCode r"><span id="cb68-1"><a href="descriptive-statistics-and-visualisations.html#cb68-1" tabindex="-1"></a><span class="co"># Save the plot to a file</span></span>
<span id="cb68-2"><a href="descriptive-statistics-and-visualisations.html#cb68-2" tabindex="-1"></a><span class="fu">ggsave</span>(<span class="st">"my_plot.png"</span>, <span class="at">width =</span> <span class="dv">8</span>, <span class="at">height =</span> <span class="dv">6</span>)</span></code></pre></div>
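<p>Relying on the last plot can be fragile in longer scripts, so it is often safer to store the plot in an object and pass it to <code>ggsave()</code> through the <code>plot</code> argument. The sketch below assumes the built-in <code>mtcars</code> dataset and writes to a temporary folder; the object name <code>p</code> and the file path are illustrative only.</p>
<div class="sourceCode"><pre class="sourceCode r"><code class="sourceCode r">library(ggplot2)

# Store the plot in an object so it can be saved explicitly
p <- ggplot(mtcars, aes(x = wt, y = mpg)) +
  geom_point()

# tempdir() keeps the example from writing into your project folder
out_file <- file.path(tempdir(), "my_plot.png")

# width and height are in inches by default; dpi sets the resolution
ggsave(out_file, plot = p, width = 8, height = 6, dpi = 300)

# Confirm that the file was written
print(file.exists(out_file))</code></pre></div>
<p>The file format is chosen from the file extension, so changing the name to end in <code>.pdf</code> would produce a PDF instead.</p>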
</div>
</div>
<div id="conclusion-4" class="section level2 hasAnchor" number="5.5">
<h2><span class="header-section-number">5.5</span> Conclusion<a href="descriptive-statistics-and-visualisations.html#conclusion-4" class="anchor-section" aria-label="Anchor link to header"></a></h2>
<p>In this chapter, we explored the fundamentals of descriptive statistics and data visualisation in R, tools essential for any data analysis, including crime analysis. Starting with basic summary statistics, such as measures of central tendency and dispersion, we demonstrated how to gain quick insights into your data. We then explored the power of visualisations, learning how to create bar charts, histograms, boxplots, and scatterplots using the <code>ggplot2</code> package, one of R’s most versatile and widely used visualisation libraries.</p>
<p>These techniques allow you to uncover patterns, trends, and potential outliers in your data, transforming raw numbers into visual stories that can be more easily interpreted and communicated. As you continue to work with crime data or any other datasets, these descriptive statistics and visualisation skills will form the foundation for more advanced analyses. Whether summarising crime rates across boroughs or visualising the distribution of incident response times, these tools will help you to turn data into actionable insights.</p>
<p>In the next chapters, we will build upon these concepts, diving into inferential statistics and regression analysis, where you’ll learn to make predictions and draw conclusions beyond mere descriptions. The knowledge you’ve gained here will be crucial as we move forward, so make sure to revisit these techniques regularly as you become more familiar with R.</p>
</div>
</div>
</section>
</div>
</div>
</div>
<a href="connecting-to-and-accessing-a-postgresql-database.html" class="navigation navigation-prev " aria-label="Previous page"><i class="fa fa-angle-left"></i></a>
<a href="survey-analysis-in-r.html" class="navigation navigation-next " aria-label="Next page"><i class="fa fa-angle-right"></i></a>
</div>
</div>
<script src="libs/gitbook-2.6.7/js/app.min.js"></script>
<script src="libs/gitbook-2.6.7/js/clipboard.min.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-search.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-sharing.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-fontsettings.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-bookdown.js"></script>
<script src="libs/gitbook-2.6.7/js/jquery.highlight.js"></script>
<script src="libs/gitbook-2.6.7/js/plugin-clipboard.js"></script>
<script>
gitbook.require(["gitbook"], function(gitbook) {
gitbook.start({
"sharing": {
"github": false,
"facebook": true,
"twitter": true,
"linkedin": false,
"weibo": false,
"instapaper": false,
"vk": false,
"whatsapp": false,
"all": ["facebook", "twitter", "linkedin", "weibo", "instapaper"]
},
"fontsettings": {
"theme": "white",
"family": "sans",
"size": 2
},
"edit": {
"link": "https://github.com/USERNAME/REPO/edit/BRANCH/05-descriptive.Rmd",
"text": "Edit"
},
"history": {
"link": null,
"text": null
},
"view": {
"link": null,
"text": null
},
"download": ["_main.pdf", "_main.epub"],
"search": {
"engine": "fuse",
"options": null
},
"toc": {
"collapse": "subsection"
}
});
});
</script>
<!-- dynamically load mathjax for compatibility with self-contained -->
<script>
(function () {
var script = document.createElement("script");
script.type = "text/javascript";
var src = "true";
if (src === "" || src === "true") src = "https://cdnjs.cloudflare.com/ajax/libs/mathjax/2.7.9/latest.js?config=TeX-MML-AM_CHTML";
if (location.protocol !== "file:")
if (/^https?:/.test(src))
src = src.replace(/^https?:/, '');
script.src = src;
document.getElementsByTagName("head")[0].appendChild(script);
})();
</script>
</body>
</html>