2  inclass1

Published

October 19, 2022

2.0.1 Let’s look at the description of the dataset.

This package contains information about all flights that departed from NYC (e.g. EWR, JFK and LGA) to destinations in the United States, Puerto Rico, and the American Virgin Islands in 2013: 336,776 flights in total.

2.0.2 AIRBUS and AIRBUS INDUSTRIE

The data contains both AIRBUS and AIRBUS INDUSTRIE. A quick Google search suggests there is no real difference between them: the company was AIRBUS INDUSTRIE until 2001 and has been just AIRBUS since then. So let’s treat them as the same manufacturer.

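library(dplyr)         # %>%, group_by(), summarise(), ...
library(nycflights13)  # provides the planes table

planes %>%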
  group_by(manufacturer) %>%
  summarise(plane_count=n()) %>%
  arrange(desc(plane_count)) %>%
  print(n = Inf)
# A tibble: 35 × 2
   manufacturer                  plane_count
   <chr>                               <int>
 1 BOEING                               1630
 2 AIRBUS INDUSTRIE                      400
 3 BOMBARDIER INC                        368
 4 AIRBUS                                336
 5 EMBRAER                               299
 6 MCDONNELL DOUGLAS                     120
 7 MCDONNELL DOUGLAS AIRCRAFT CO         103
 8 MCDONNELL DOUGLAS CORPORATION          14
 9 CANADAIR                                9
10 CESSNA                                  9
11 PIPER                                   5
12 AMERICAN AIRCRAFT INC                   2
13 BEECH                                   2
14 BELL                                    2
15 GULFSTREAM AEROSPACE                    2
16 STEWART MACO                            2
17 AGUSTA SPA                              1
18 AVIAT AIRCRAFT INC                      1
19 AVIONS MARCEL DASSAULT                  1
20 BARKER JACK L                           1
21 CANADAIR LTD                            1
22 CIRRUS DESIGN CORP                      1
23 DEHAVILLAND                             1
24 DOUGLAS                                 1
25 FRIEDEMANN JON                          1
26 HURLEY JAMES LARRY                      1
27 JOHN G HESS                             1
28 KILDALL GARY                            1
29 LAMBERT RICHARD                         1
30 LEARJET INC                             1
31 LEBLANC GLENN T                         1
32 MARZ BARRY                              1
33 PAIR MIKE E                             1
34 ROBINSON HELICOPTER CO                  1
35 SIKORSKY                                1

planes_new <- planes %>%
  mutate(manufacturer = replace(manufacturer,
                                manufacturer == "AIRBUS INDUSTRIE",
                                "AIRBUS"))
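
To see what the `replace()` call does, here is a minimal sketch on a hand-made tibble. The `toy` data is made up for illustration and is not taken from `planes`. `replace(x, cond, value)` is base R: it returns a copy of `x` in which the elements where `cond` is `TRUE` are overwritten by `value`.

```r
library(dplyr)

# Hypothetical toy data: one row per plane, not the real `planes` table
toy <- tibble(
  manufacturer = c("AIRBUS INDUSTRIE", "AIRBUS", "BOEING", "AIRBUS INDUSTRIE")
)

# Overwrite the old company name with the current one
toy_new <- toy %>%
  mutate(manufacturer = replace(manufacturer,
                                manufacturer == "AIRBUS INDUSTRIE",
                                "AIRBUS"))

# Count planes per manufacturer after the merge
toy_counts <- toy_new %>% count(manufacturer, sort = TRUE)
print(toy_counts)
```

On the real data the same step folds the 400 AIRBUS INDUSTRIE planes into the 336 AIRBUS planes, giving 736 AIRBUS planes in total.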

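2.0.3 80/20 Rule

After the merge there are 34 different manufacturers. The first 4 dominate the market: roughly 91% of planes belong to them. The cumsum column below reads 0.914 because it adds up frequencies that were already rounded to three decimals, which is also why it ends at 0.998 instead of 1; the unrounded share is 3033/3322, or about 0.913. This is not exactly the 80/20 rule, but the same pattern holds: about 12% of the manufacturers (4 of 34) hold about 91% of the market.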

planes_new %>%
  group_by(manufacturer) %>%
  summarise(avg_engine=mean(engines),median_engine=median(engines),plane_count=n()) %>%
  arrange(desc(plane_count)) %>%
  mutate(frequency = round(plane_count/sum(plane_count),3), cumsum = cumsum(frequency)) %>%
  print(n = Inf)
# A tibble: 34 × 6
   manufacturer                  avg_engine median_engine plane…¹ frequ…² cumsum
   <chr>                              <dbl>         <dbl>   <int>   <dbl>  <dbl>
 1 BOEING                              2.00           2      1630   0.491  0.491
 2 AIRBUS                              2.01           2       736   0.222  0.713
 3 BOMBARDIER INC                      2              2       368   0.111  0.824
 4 EMBRAER                             2              2       299   0.09   0.914
 5 MCDONNELL DOUGLAS                   2              2       120   0.036  0.95 
 6 MCDONNELL DOUGLAS AIRCRAFT CO       2              2       103   0.031  0.981
 7 MCDONNELL DOUGLAS CORPORATION       2              2        14   0.004  0.985
 8 CANADAIR                            2              2         9   0.003  0.988
 9 CESSNA                              1.33           1         9   0.003  0.991
10 PIPER                               1.4            1         5   0.002  0.993
11 AMERICAN AIRCRAFT INC               1              1         2   0.001  0.994
12 BEECH                               2              2         2   0.001  0.995
13 BELL                                1.5            1.5       2   0.001  0.996
14 GULFSTREAM AEROSPACE                2              2         2   0.001  0.997
15 STEWART MACO                        1              1         2   0.001  0.998
16 AGUSTA SPA                          2              2         1   0      0.998
17 AVIAT AIRCRAFT INC                  1              1         1   0      0.998
18 AVIONS MARCEL DASSAULT              3              3         1   0      0.998
19 BARKER JACK L                       1              1         1   0      0.998
20 CANADAIR LTD                        4              4         1   0      0.998
21 CIRRUS DESIGN CORP                  1              1         1   0      0.998
22 DEHAVILLAND                         1              1         1   0      0.998
23 DOUGLAS                             4              4         1   0      0.998
24 FRIEDEMANN JON                      1              1         1   0      0.998
25 HURLEY JAMES LARRY                  1              1         1   0      0.998
26 JOHN G HESS                         1              1         1   0      0.998
27 KILDALL GARY                        1              1         1   0      0.998
28 LAMBERT RICHARD                     1              1         1   0      0.998
29 LEARJET INC                         2              2         1   0      0.998
30 LEBLANC GLENN T                     1              1         1   0      0.998
31 MARZ BARRY                          1              1         1   0      0.998
32 PAIR MIKE E                         1              1         1   0      0.998
33 ROBINSON HELICOPTER CO              1              1         1   0      0.998
34 SIKORSKY                            2              2         1   0      0.998
# … with abbreviated variable names ¹​plane_count, ²​frequency
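
The cumulative share can be double-checked without the rounding drift by accumulating the exact shares and rounding only for display. This is a small sketch, assuming the four counts are copied from the table above and that `total_planes` is the sum of its `plane_count` column; the variable names are just for illustration.

```r
# Plane counts of the four largest manufacturers, copied from the table above
top4 <- c(BOEING = 1630, AIRBUS = 736, `BOMBARDIER INC` = 368, EMBRAER = 299)

# Total number of planes: the sum of the plane_count column above
total_planes <- 3322

# Accumulate the exact shares first, round only when printing
share <- top4 / total_planes
cum_share <- cumsum(share)
print(round(cum_share, 3))

# Share of manufacturers that the top 4 represent (4 out of 34)
manufacturer_share <- length(top4) / 34
print(round(manufacturer_share, 3))
```

Rounding after `cumsum()` rather than before keeps the last cumulative value at exactly 1 when all manufacturers are included.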

2.0.4 Manufacturer by engine counts

Most planes have 2 engines; the rest have 1, 3, or 4. Among 2-engine planes, the average seat count varies a lot by manufacturer: BOEING averages about 175 seats, while BOMBARDIER INC averages 74. The reason could be pricing and positioning: BOMBARDIER may be offering a smaller, more private experience, while BOEING offers more seats. Planes with 3 or 4 engines might be meant for long flights, but they also differ widely in available seats: the 4-engine CANADAIR LTD plane has only 2 seats, while the 4-engine BOEING has 450.

planes_new %>%
  group_by(engines,manufacturer) %>%
  summarise(plane_count=n(),avg_seats=mean(seats),median_seats=median(seats)) %>%
  arrange(engines,desc(plane_count)) %>%
  print(n = Inf)
`summarise()` has grouped output by 'engines'. You can override using the
`.groups` argument.
# A tibble: 40 × 5
# Groups:   engines [4]
   engines manufacturer                  plane_count avg_seats median_seats
     <int> <chr>                               <int>     <dbl>        <dbl>
 1       1 CESSNA                                  6      4.33          4  
 2       1 PIPER                                   3      6             7  
 3       1 AMERICAN AIRCRAFT INC                   2      2             2  
 4       1 STEWART MACO                            2      2             2  
 5       1 AVIAT AIRCRAFT INC                      1      2             2  
 6       1 BARKER JACK L                           1      2             2  
 7       1 BELL                                    1      5             5  
 8       1 CIRRUS DESIGN CORP                      1      4             4  
 9       1 DEHAVILLAND                             1     16            16  
10       1 FRIEDEMANN JON                          1      2             2  
11       1 HURLEY JAMES LARRY                      1      2             2  
12       1 JOHN G HESS                             1      2             2  
13       1 KILDALL GARY                            1      2             2  
14       1 LAMBERT RICHARD                         1      2             2  
15       1 LEBLANC GLENN T                         1      2             2  
16       1 MARZ BARRY                              1      2             2  
17       1 PAIR MIKE E                             1      2             2  
18       1 ROBINSON HELICOPTER CO                  1      5             5  
19       2 BOEING                               1629    175.          149  
20       2 AIRBUS                                733    202.          182  
21       2 BOMBARDIER INC                        368     74.0          80  
22       2 EMBRAER                               299     45.6          55  
23       2 MCDONNELL DOUGLAS                     120    162.          172  
24       2 MCDONNELL DOUGLAS AIRCRAFT CO         103    142           142  
25       2 MCDONNELL DOUGLAS CORPORATION          14    142           142  
26       2 CANADAIR                                9     55            55  
27       2 CESSNA                                  3      7.33          8  
28       2 BEECH                                   2      9.5           9.5
29       2 GULFSTREAM AEROSPACE                    2     22            22  
30       2 PIPER                                   2      8             8  
31       2 AGUSTA SPA                              1      8             8  
32       2 BELL                                    1     11            11  
33       2 LEARJET INC                             1     11            11  
34       2 SIKORSKY                                1     14            14  
35       3 AIRBUS                                  2    379           379  
36       3 AVIONS MARCEL DASSAULT                  1     12            12  
37       4 AIRBUS                                  1    375           375  
38       4 BOEING                                  1    450           450  
39       4 CANADAIR LTD                            1      2             2  
40       4 DOUGLAS                                 1    102           102  
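
To put a number on "most planes have 2 engines", the plane counts in the table above can be tallied per engine count. This is a rough sketch: the four totals were summed by hand from the `plane_count` column, and `planes_by_engines` is a name introduced here only for illustration.

```r
# Planes per engine count, tallied by summing plane_count in the table above
planes_by_engines <- c(`1` = 27, `2` = 3288, `3` = 3, `4` = 4)

# Share of all planes that falls in each engine group
engine_share <- planes_by_engines / sum(planes_by_engines)
print(round(engine_share, 3))
```

By this tally, 2-engine planes make up about 99% of the table, so the 1-, 3-, and 4-engine groups are each based on only a handful of planes.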

2.0.5 Market Leaders

We know that most planes have 2 engines and that most planes belong to 4 manufacturers. Let’s look at the seat metrics for only the 2-engine planes from those 4 manufacturers.

manufacturers_names <- planes_new %>%
  count(manufacturer, sort = TRUE) %>%  # planes per manufacturer, largest first
  slice_head(n = 4) %>%                 # keep the 4 largest
  select(manufacturer)

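AIRBUS and BOEING look like two companies with the same strategy: larger planes with more seats (about 202 and 175 seats on average, with nearly identical standard deviations of about 59). BOEING has more than twice as many planes as AIRBUS (1629 vs 733). BOMBARDIER and EMBRAER also share a strategy, which is smaller planes with fewer seats.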

planes_new %>%
  filter(manufacturer %in% manufacturers_names$manufacturer,engines == 2) %>%
  group_by(manufacturer) %>%
  summarise(mean=mean(seats),std_dev=sd(seats),count=n()) %>%
  print(n = Inf)
# A tibble: 4 × 4
  manufacturer    mean std_dev count
  <chr>          <dbl>   <dbl> <int>
1 AIRBUS         202.     59.2   733
2 BOEING         175.     59.1  1629
3 BOMBARDIER INC  74.0    17.8   368
4 EMBRAER         45.6    15.5   299