I am Talha. I am a senior student at Bogazici University and plan to graduate in January 2021. I am currently working as a sales analyst at Nike. Manipulating data and extracting meaningful information from it is very effective and important in the business world. I want to develop my skills in this area and build a career around it.
Benedict Witzenberger explains how his team does journalism using R. First of all, it is a very interesting topic and a creative application. The organization is made up mostly of people with a social science background, with only a few digital specialists. It is not surprising that an organization approaching journalism from a different perspective does this work with people from different disciplines. Witzenberger says they divide their projects into two kinds: short term and long term. He then explains their workflow and describes its four main parts.
He named three libraries they use to access data: {readr}, {rvest} and {jsonlite}. He then said that the library they use most for data wrangling is {tidyverse}. Finally, he said that {ggplot2} is used heavily for data visualization.
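To make this workflow more concrete, here is a minimal sketch of how the three stages could fit together: reading data with {jsonlite}, wrangling it with {dplyr} (part of the tidyverse), and plotting it with {ggplot2}. The JSON string, the party names and the vote counts are made up for illustration and do not come from the talk.

```r
library(jsonlite)
library(dplyr)
library(ggplot2)

# Hypothetical JSON payload standing in for data fetched from an API
raw_json <- '[
  {"party": "A", "votes": 120},
  {"party": "B", "votes": 95},
  {"party": "C", "votes": 40}
]'

# Access: fromJSON() turns an array of records into a data frame
results <- fromJSON(raw_json)

# Wrangle: compute each party's share of the total and sort by it
shares <- results %>%
  mutate(share = votes / sum(votes)) %>%
  arrange(desc(share))

# Visualize: a simple bar chart of the shares
p <- ggplot(shares, aes(x = reorder(party, -share), y = share)) +
  geom_col() +
  labs(x = "Party", y = "Vote share")
print(p)
```

In a real project the JSON would probably come from a file or a URL, and {readr} or {rvest} would take the place of {jsonlite} for CSV files or web pages.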
After watching the video above, I became interested in the use of R in journalism and kept researching, which is how I found this article. The BBC has a Visual and Data Journalism team that uses R a lot in its work, especially the ggplot2 package, because it lets them produce complex and reproducible content. They especially point out that being able to produce reproducible content brings them exponential gains.
In 2018, the data team focused on the ggplot2 package and built new tools on top of it. These tools are based on ggplot2 and, since they are open source, anyone can use them. They released a package called bbplot and an accompanying R cookbook.
From this I learned that users can also create their own packages in R and share them with everyone.
How the BBC Visual and Data Journalism team works with graphics in R
In this article, Kan Nishida highlights how effective the dplyr package is for analyzing and manipulating data. In the first paragraph, he shows a simple use of dplyr with a short piece of code and explains what the pipe ‘%>%’ means. He then explains that, however complex the question we want to answer, dplyr lets us manipulate the data without complicating the code. Drawing on his past experience at Oracle, he notes that the same work is much more challenging in SQL.
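As a small illustration of the pipe, here is a sketch that uses the built-in mtcars dataset. It is not the example from the article, and the column names n_cars and mean_mpg are my own choices. Each step passes its result to the next one, so the code reads from top to bottom like a list of instructions.

```r
library(dplyr)

# Keep cars with more than 100 horsepower, then summarise by cylinder count
summary_by_cyl <- mtcars %>%
  filter(hp > 100) %>%              # keep only the rows we care about
  group_by(cyl) %>%                 # one group per number of cylinders
  summarise(
    n_cars   = n(),                 # how many cars in each group
    mean_mpg = mean(mpg)            # average fuel efficiency per group
  ) %>%
  arrange(desc(mean_mpg))           # most efficient group first

print(summary_by_cyl)
```

Without the pipe, the same steps would have to be nested inside each other or saved in several temporary variables, which is harder to read.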
In this article, Michael Grogan explains how to do basic operations such as data cleaning and data merging. In his explanations, he relates these functions to their Excel counterparts. He says the first function, merge(), joins data in a similar way to the VLOOKUP function in Excel. He then converts numbers to dates with the as.Date() function and shows some calculations, such as the difference between dates in days. Later in the article, he filters his own data using the grepl() function, with simple code examples. He also shows that the aggregate() function can be used like the SUMIF function in Excel, and gives examples of it.
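The four functions above can be tried together in a short base R sketch. The customers and orders tables are hypothetical data that I made up, not the data from the article, and I assume the day numbers count days since 1970-01-01 (dates exported from Excel would need a different origin).

```r
# Hypothetical tables, made up for illustration
customers <- data.frame(
  id   = c(1, 2, 3),
  name = c("Ann Lee", "Bob Ray", "Cy Lee"),
  stringsAsFactors = FALSE
)
orders <- data.frame(
  id     = c(1, 1, 2, 3),
  amount = c(10, 20, 5, 8),
  day    = c(18262, 18263, 18270, 18293)  # assumed: days since 1970-01-01
)

# merge(): look up each order's customer by id, similar to VLOOKUP
merged <- merge(orders, customers, by = "id")

# as.Date(): turn the day numbers into real dates
merged$date <- as.Date(merged$day, origin = "1970-01-01")

# Difference in days between each order and the earliest order
merged$days_since_first <- as.numeric(merged$date - min(merged$date))

# grepl(): keep only the rows whose name contains "Lee"
lee_orders <- merged[grepl("Lee", merged$name), ]

# aggregate(): total amount per customer, similar to SUMIF
totals <- aggregate(amount ~ name, data = merged, FUN = sum)

print(merged)
print(lee_orders)
print(totals)
```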
To summarize, it is possible to clean and manipulate our data with functions that work much like the ones we already know from Excel.
Data Cleaning, Merging, and Wrangling in R