From Raw to Civilized Data

First, we locate the data on the EGM website. We download it and rename the file to egm_example_data.xlsx. The goal is a reproducible example of data analysis, from raw data stored at a remote location to the final analysis.

Download Raw Data

We create a temporary file, referred to as “tmp”, and download the file from the repository into it. We then read the Excel file with the read_excel function from the readxl package and remove the temporary file that we created.

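# Create a temporary file path with the .xlsx extension
tmp <- tempfile(fileext = ".xlsx")
# Download in binary mode ("wb") so the Excel file is not corrupted on Windows
download.file("https://github.com/pjournal/mef03g-polatalemd-r/blob/master/egm_example_data.xlsx?raw=true", destfile = tmp, mode = "wb")
# Read the sheet into a tibble, then delete the temporary file
raw_data <- readxl::read_excel(tmp)
file.remove(tmp)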
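
The download, read, and remove steps can also be wrapped in a small helper so that the temporary file is deleted even when the download or the read fails. The following is only a sketch under stated assumptions: read_via_tempfile and its arguments (url, reader, fileext) are hypothetical names introduced here for illustration and are not part of the original analysis.

```r
# Hypothetical helper (an assumption, not part of the original analysis):
# download a file to a temporary path, read it, and always clean up.
read_via_tempfile <- function(url, reader, fileext = "") {
  tmp <- tempfile(fileext = fileext)
  # on.exit() runs when the function exits, including after an error,
  # so the temporary copy is removed in every case
  on.exit(unlink(tmp), add = TRUE)
  # mode = "wb" keeps binary files such as .xlsx intact on Windows
  status <- download.file(url, destfile = tmp, mode = "wb", quiet = TRUE)
  if (status != 0) {
    stop("download failed with status ", status)
  }
  # reader is any function that takes a file path, e.g. readxl::read_excel
  reader(tmp)
}
```

With the readxl package installed, a call such as read_via_tempfile(url, readxl::read_excel, fileext = ".xlsx"), where url holds the repository address used above, should yield the same tibble as the step-by-step version.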

The first and last rows of the dataset are shown below, using head and tail, respectively.

head(raw_data)
## # A tibble: 6 x 15
##   date  pension_fund_co~ n_of_participan~ fund_size_parti~ gov_contribution
##   <chr> <chr>                       <dbl>            <dbl>            <dbl>
## 1 7.31~ Aegon Emeklilik~            37671       132128516.        15559754.
## 2 7.31~ Allianz Hayat v~            94630      2801049763.       385315268.
## 3 7.31~ Allianz Yasam v~           728934     12158903088.      1435419831.
## 4 7.31~ Anadolu Hayat E~          1091010     16312824287.      2768140863.
## 5 7.31~ Avivasa Emeklil~           787046     16801925214.      2467365796.
## 6 7.31~ Axa Hayat ve Em~            33794       426214299.        86715369.
## # ... with 10 more variables: contribution <dbl>, n_of_pensioners <dbl>,
## #   n_of_ind_contracts <dbl>, n_of_group_ind_contracts <dbl>,
## #   n_of_employer_group_certificates <dbl>, n_total <dbl>,
## #   size_of_ind_contracts <dbl>, size_of_group_ind_contracts <dbl>,
## #   size_of_employer_group_certificates <dbl>, size_total <dbl>
tail(raw_data)
## # A tibble: 6 x 15
##   date  pension_fund_co~ n_of_participan~ fund_size_parti~ gov_contribution
##   <chr> <chr>                       <dbl>            <dbl>            <dbl>
## 1 1.3.~ Groupama Emekli~            77094       688162318.        20423655.
## 2 1.3.~ Halk Hayat ve E~           134928       307863682.        30903648.
## 3 1.3.~ ING Emeklilik              238312      1335272560.        50634563.
## 4 1.3.~ Metlife Emeklil~           121918       342887913.        26232485.
## 5 1.3.~ Vakif Emeklilik            291440      1630111936.        77968556.
## 6 1.3.~ Ziraat Hayat ve~           118882       366475156.        33148249.
## # ... with 10 more variables: contribution <dbl>, n_of_pensioners <dbl>,
## #   n_of_ind_contracts <dbl>, n_of_group_ind_contracts <dbl>,
## #   n_of_employer_group_certificates <dbl>, n_total <dbl>,
## #   size_of_ind_contracts <dbl>, size_of_group_ind_contracts <dbl>,
## #   size_of_employer_group_certificates <dbl>, size_total <dbl>