Group Members

  • Barış SİVAS
  • Neriman GÜRSOY
  • Ozan Barış BAYKAN
  • Sena KALEMCİ
  • Tuğba ÜNAL

Overview of Dataset

The dataset covers all recorded service information from the dealers of x company in Turkey for the years 2019 and 2020. It was obtained from x company, where one of our team members works. The dataset has 587,797 rows and 17 columns, each column describing a different parameter. From these parameters, a reader can learn the details of each service process, such as model number, km, warranty start and end dates, prices, vehicle id, etc.

The variables are listed below:

  • Malzeme: Material id
  • Şasi No: Vehicle id
  • Müşteri No: Dealer id
  • İş Emri No: Job order number
  • Malzeme tipi: Type of material used in the service process
  • İşlem tipi: Process type
  • Araç giriş tarihi: Start date of the service process
  • Araç çıkış tarihi: End date of the service process
  • Miktar: Quantity of the part or material used
  • Net fiyat: Price
  • Model: Model of the vehicle
  • Yıl-Model: Production year of the model
  • İş emri kapanış tarihi: Closing date of the job order
  • Araç KM: Mileage of the vehicle in kilometers
  • Firma Şehir: City of the dealer
  • Garanti Başlangıç tarihi: Start date of the warranty
  • Garanti Bitiş tarihi: End date of the warranty
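
Since the column names are in Turkish and contain spaces, it may be convenient to rename them before analysis. The sketch below shows one way to do this with `dplyr::rename()` on a small made-up table; the English names (`vehicle_id`, `dealer_id`, `vehicle_km`) and the sample values are our own assumptions for illustration, not part of the dataset.

```r
library(dplyr)

# Toy stand-in for raw_df with three of the original columns (values are made up).
toy_df <- tibble(
  `Şasi No`    = c("şasi_no_1", "şasi_no_2"),
  `Müşteri No` = c("müşteri_1", "müşteri_2"),
  `Araç KM`    = c(120000, 45000)
)

# The English names are hypothetical; rename() takes new_name = old_name pairs.
renamed_df <- toy_df %>%
  rename(
    vehicle_id = `Şasi No`,
    dealer_id  = `Müşteri No`,
    vehicle_km = `Araç KM`
  )

names(renamed_df)
```

The same pattern would extend to the remaining columns by adding one `new_name = old_name` pair per variable.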

A glimpse of the data is shown below.

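# readxl reads the Excel file; tidyverse provides glimpse() and the data manipulation tools
library(readxl)
library(tidyverse)

# Load the raw service records from the local Excel file
raw_df <- read_xlsx("C:/Users/bsivas/Documents/Barış Innova Şahsi/DataScienceM/MEF-BDA/BDA503- Data Analytics/grup project/x_vehicle_company_service_dataset.xlsx")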

glimpse(raw_df)
## Rows: 587,797
## Columns: 17
## $ Malzeme                    <chr> "malzeme_22459", "malzeme_21191", "malze...
## $ `Şasi No`                  <chr> "şasi_no_981", "şasi_no_981", "şasi_no_9...
## $ `Müşteri No`               <chr> "müşteri_30", "müşteri_30", "müşteri_30"...
## $ `İş Emri No`               <chr> "000000000028839", "000000000028839", "0...
## $ `Malzeme tipi`             <chr> "Diğer", "Yağ", "Diğer", "Yağ", "Diğer",...
## $ `İşlem tipi`               <chr> "P.Bakım", "P.Bakım", "P.Bakım", "P.Bakı...
## $ `Araç giriş tarihi`        <dttm> 2020-01-13, 2020-01-13, 2020-01-13, 202...
## $ `Araç çıkış tarihi`        <chr> "00.00.0000", "00.00.0000", "00.00.0000"...
## $ Miktar                     <dbl> 1, 10, 1, 1, 2, 1, 1, 1, 1, 10, 1, 1, 2,...
## $ `Net fiyat`                <dbl> NA, 13.66200, NA, 12.81555, NA, NA, NA, ...
## $ Model                      <chr> "model_4", "model_4", "model_4", "model_...
## $ `Yıl-Model`                <dbl> 2008, 2008, 2008, 2008, 2008, 2008, 2008...
## $ `İş emri kapanış tarihi`   <chr> "00.00.0000", "00.00.0000", "00.00.0000"...
## $ `Araç KM`                  <dbl> 213076, 213076, 213076, 213076, 213076, ...
## $ `Firma Şehir`              <chr> "Antalya", "Antalya", "Antalya", "Antaly...
## $ `Garanti Başlangıç tarihi` <dttm> 2007-12-31, 2007-12-31, 2007-12-31, 200...
## $ `Garanti Bitiş tarihi`     <chr> "00.00.0000", "00.00.0000", "00.00.0000"...
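
The glimpse shows that three date columns (Araç çıkış tarihi, İş emri kapanış tarihi and Garanti Bitiş tarihi) were read as character, with "00.00.0000" standing in for missing dates. A minimal sketch of how these could be converted is given below. It assumes the non-missing values follow a day.month.year pattern, which the glimpse does not confirm, and it uses a made-up toy table rather than the real data.

```r
library(dplyr)

# Toy stand-in for two of the character date columns (values are made up).
toy_df <- tibble(
  `Araç çıkış tarihi`    = c("00.00.0000", "15.01.2020"),
  `Garanti Bitiş tarihi` = c("31.12.2021", "00.00.0000")
)

date_cols <- c("Araç çıkış tarihi", "Garanti Bitiş tarihi")

# Turn the "00.00.0000" placeholder into NA, then parse the rest.
# Assumption: real values are formatted as dd.mm.yyyy.
clean_df <- toy_df %>%
  mutate(across(
    all_of(date_cols),
    ~ as.Date(na_if(.x, "00.00.0000"), format = "%d.%m.%Y")
  ))

glimpse(clean_df)
```

If the real values turn out to use a different pattern, only the `format` string would need to change.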

A summary of the data is shown below.

summary(raw_df)
##    Malzeme            Şasi No           Müşteri No         İş Emri No       
##  Length:587797      Length:587797      Length:587797      Length:587797     
##  Class :character   Class :character   Class :character   Class :character  
##  Mode  :character   Mode  :character   Mode  :character   Mode  :character  
##                                                                             
##                                                                             
##                                                                             
##                                                                             
##  Malzeme tipi        İşlem tipi        Araç giriş tarihi            
##  Length:587797      Length:587797      Min.   :2019-01-01 00:00:00  
##  Class :character   Class :character   1st Qu.:2019-06-29 00:00:00  
##  Mode  :character   Mode  :character   Median :2020-01-06 00:00:00  
##                                        Mean   :2019-12-21 13:09:52  
##                                        3rd Qu.:2020-06-25 00:00:00  
##                                        Max.   :2020-11-21 00:00:00  
##                                                                     
##  Araç çıkış tarihi      Miktar          Net fiyat           Model          
##  Length:587797      Min.   :  0.010   Min.   :     0.0   Length:587797     
##  Class :character   1st Qu.:  1.000   1st Qu.:    35.6   Class :character  
##  Mode  :character   Median :  1.000   Median :    83.1   Mode  :character  
##                     Mean   :  3.111   Mean   :   516.5                     
##                     3rd Qu.:  2.000   3rd Qu.:   310.3                     
##                     Max.   :500.000   Max.   :384403.1                     
##                     NA's   :381       NA's   :4545                         
##    Yıl-Model    İş emri kapanış tarihi    Araç KM         Firma Şehir       
##  Min.   :1983   Length:587797          Min.   :       0   Length:587797     
##  1st Qu.:2014   Class :character       1st Qu.:  167742   Class :character  
##  Median :2016   Mode  :character       Median :  325554   Mode  :character  
##  Mean   :2015                          Mean   :  456189                     
##  3rd Qu.:2017                          3rd Qu.:  641200                     
##  Max.   :2020                          Max.   :24315000                     
##  NA's   :50                            NA's   :33                           
##  Garanti Başlangıç tarihi      Garanti Bitiş tarihi
##  Min.   :1983-01-01 00:00:00   Length:587797       
##  1st Qu.:2014-01-28 00:00:00   Class :character    
##  Median :2016-04-21 00:00:00   Mode  :character    
##  Mean   :2015-02-09 05:22:03                       
##  3rd Qu.:2017-05-05 00:00:00                       
##  Max.   :2020-11-05 00:00:00                       
##  NA's   :973
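
The summary reports missing values in several columns and some extreme values, such as a maximum Araç KM of 24,315,000, which may or may not be data entry errors. The sketch below shows one possible way to count missing values per column and to flag unusually high mileage. The toy table and the `km_limit` threshold are assumptions chosen only for illustration.

```r
library(dplyr)

# Toy stand-in for three numeric columns of raw_df (values are made up).
toy_df <- tibble(
  Miktar      = c(1, NA, 2),
  `Net fiyat` = c(NA, 13.66, NA),
  `Araç KM`   = c(213076, 24315000, NA)
)

# Count missing values in every column.
na_counts <- toy_df %>%
  summarise(across(everything(), ~ sum(is.na(.x))))

# Hypothetical threshold: rows above it are flagged for manual review.
km_limit <- 2e6
suspicious_km <- toy_df %>%
  filter(`Araç KM` > km_limit)

na_counts
suspicious_km
```

Note that `filter()` drops rows where the condition is `NA`, so records with a missing mileage are not flagged by this check.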

Plan

Our tentative plan is as follows:

  • Cleaning, tidying and manipulating the data
  • Using data visualizations to understand the data better
  • Analyzing the relationships between variables
  • Drawing conclusions from the analysis