-
Notifications
You must be signed in to change notification settings - Fork 1
/
readme.Rmd
88 lines (66 loc) · 2.28 KB
/
readme.Rmd
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
---
output:
md_document:
variant: markdown_github
---
# Gesis
[![CRAN_Status_Badge](http://www.r-pkg.org/badges/version/gesis)](http://cran.r-project.org/package=gesis)
[![](http://cranlogs.r-pkg.org/badges/grand-total/gesis)](http://cran.r-project.org/web/packages/gesis)
## Introduction
The [GESIS Data Catalogue](https://dbk.gesis.org/) offers a repository of
approximately 5,000 datasets.
To install the package from github:
```{r, eval=FALSE}
# install.packages("devtools")
devtools::install_github("expersso/gesis")
```
```{r load}
library(gesis)
```
## A simple example
We start by listing all available groups of studies:
```{r groups}
groups <- get_study_groups()
head(groups, 10)
```
We see that the Eurobarometer has study group number 0008 Let's looks at all
available Eurobarometer waves:
```{r eurobar_waves}
eurobars <- get_datasets("0008")
head(eurobars)
```
We would now like to download the first three studies. We first need to log in
to the Gesis website and then pass the DOIs (unique data set identifiers) to
`download_dataset`:
```{r login}
# username and password stored as environment
# variables "GESIS_USER" and "GESIS_PASS"
gesis_session <- login()
```
```{r}
if(!dir.exists("downloads")) dir.create("downloads")
download_dataset(s = gesis_session, doi = eurobars$doi[1:3],
path = "downloads", filetype = ".dta")
(files <- list.files("downloads", full.names = TRUE))
```
We can also download the codebooks for the same studies:
```{r codebooks, eval=FALSE}
download_codebook(eurobars$doi[1:3], path = "downloads")
```
Using the `haven` package we can now read the data sets:
```{r read_data, warning=FALSE}
library(haven)
df <- read_dta(files[1])
dim(df)
```
```{r remove_downloads, echo=FALSE}
unlink("downloads", recursive = TRUE)
```
Disclaimer: the `gesis` package is neither affiliated with, nor endorsed by, the
Leibniz Institute for the Social Sciences. I have been unable to find any
indication that programmatic access to the website is disallowed under its terms
of use (indeed, its
[guidelines](https://dbk.gesis.org/dbksearch/guidelines.asp) appear to
encourage it). That said, I would discourage users from using the `gesis`
package to put undue pressure on their servers by initiating unnecessary (or
unnecessarily large) batch downloads.