1 How to Use This Document

This project was produced as part of the University of Central Florida Master of Economics Econometrics Spring 2020 practicum. This document includes a series of appendices that discuss data sources, data wrangling, and data visualization. Navigate through the document by using the panel at the left.

2 Introduction:

Consumer spending in the sport-fishing tackle market grew 12% in 2018, nearing $6 billion. The growth was led primarily by increased spending on rods, reels, and combos, which posted significant gains in both retail dollars and units sold. The fishing reel is one of the top five categories of fishing tackle. It is a cylindrical device attached to a fishing rod and used for winding and stowing line. A fly reel is the simplest type of fishing reel; it is normally operated by stripping line off the reel with one hand while casting the rod with the other. The central aim of this project is to estimate the country-of-manufacture effect, specifically for American-made fly-fishing reels in the US fishing equipment market. American-made products typically sell at a premium relative to products made in other countries. In the case of fly fishing, many reels are manufactured in Asia because labor costs are lower there than in the United States. This project aims to answer how much more, in proportionate terms (markup), consumers are willing to pay for an American-made fly-fishing reel with given features than for a comparable foreign-made one. To investigate this question, we apply hedonic price theory.

3 Data

The data frame has 248 observations, collected manually from various web pages. The data can be downloaded from this GitHub repository. Each row of the data set is a fly-fishing reel, and the columns correspond to the variables whose names and definitions are given below:

Description of the variables in the data frame

| Variable | Description |
|---|---|
| Name | Name of the fly reel |
| Brand | Brand name |
| Weight | Weight of the fly reel |
| Width | Width of the fly reel |
| Price | Price in US dollars |
| Sealed | Whether the fly reel is sealed or not |
| Country | Country of manufacture |
| Machined | Whether the fly reel is machined or not |

4 Exploratory Data Analysis

Let’s look at the price distribution in the data frame.

The histogram above shows that the price data are right-skewed. The following empirical cumulative distribution plot also shows that 80% of fly reel prices are below $600.
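As a minimal sketch of how such a share can be read off an empirical CDF, the snippet below uses base R's ecdf() on simulated right-skewed prices. The simulated vector sim_price is an assumption standing in for the real Price column, so the numbers it produces are illustrative only.

```r
# Simulated, right-skewed prices (hypothetical stand-in for fly_reels$Price)
set.seed(42)
sim_price <- rlnorm(248, meanlog = 5.5, sdlog = 0.8)

# ecdf() returns a step function F; F(x) is the share of observations <= x
price_cdf <- ecdf(sim_price)
share_below_600 <- price_cdf(600)

# The inverse view: the price below which about 80% of observations fall
p80 <- quantile(sim_price, probs = 0.80, names = FALSE)

share_below_600
p80
```

Reading the curve at a given price gives the share of reels at or below that price; reading it at a given share gives the corresponding price quantile.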

The scatter plot above shows that a straight line would not describe these data well. A logarithmic transformation of price is a natural remedy: it compresses the long right tail and can bring the relationship closer to linear, so that a linear model can be fitted.

We can see that the data are now confined to a much narrower range on the y-axis, and a straight line is a more plausible fit for this scatter plot. Let’s look at how the distribution changed by plotting a histogram of the log-transformed data.

After the log transformation of price, the preceding plot shows a distribution that is not perfectly normal but is closer to normal than the previous histogram.

Now let’s try a Box-Cox transformation of our data. We start by building a simple linear model of how price changes with the weight of the fly reel.

The above plot shows the optimal value of the lambda parameter, which for our data is around 0.2.

The preceding plot shows that the distribution is very similar to the one obtained with the log transformation. This is expected: the lambda value we used was approximately 0.2, and lambda = 0 corresponds to the log transformation.
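The relationship between the two transformations can be sketched in a few lines of base R. The helper box_cox() and the price vector below are hypothetical and written only for illustration; they are not the code that produced the plots above.

```r
# Box-Cox transform of positive values y for a given lambda
box_cox <- function(y, lambda) {
  if (lambda == 0) log(y) else (y^lambda - 1) / lambda
}

prices <- c(50, 150, 400, 900)   # hypothetical prices in dollars

bc_02  <- box_cox(prices, 0.2)   # lambda near the value suggested by the plot
bc_log <- box_cox(prices, 0)     # identical to log(prices)

# As lambda shrinks toward 0, the transform approaches log(y)
max(abs(box_cox(prices, 1e-6) - log(prices)))
```

Because lambda = 0.2 is close to 0, the Box-Cox histogram is expected to resemble the log histogram, which is what the comparison of plots shows.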

Now, let’s compare all the plots.

The figure above suggests that, of the three, the log transformation yields the distribution closest to normal.

The preceding plot shows distributions, bar plots, and box plots of log_price, weight, and the machined and sealed features by country. All USA-made fly reels are machined, and the exploratory analysis makes clear that machined fly reels are priced higher.


From the pair-panel scatterplot, we notice that weight and diameter are highly correlated with each other. Weight is conceptually the more meaningful of the two, in the sense that heavier fly reels may be suited to catching large fish and hence require more expensive, more robust manufacturing, so weight is the variable carried forward into the model.

4.1 Checking Assumptions of Linear Regression:

  • Checking the linearity assumption

The assumptions of linearity and homoscedasticity can be checked with a residuals vs. fitted plot. If the model violates the linearity assumption, the residuals take on a defined shape or distinctive pattern. R automatically flagged three data points with large residuals (observations 127, 128, and 129). Apart from those, the residuals show no systematic pattern, as evidenced by the nearly flat red smoothing line drawn through them. The residuals also appear homoscedastic, since they are spread evenly around the y = 0 line.

  • Checking the Normality assumption

The residuals are used to evaluate the normality assumption, which can be done with a QQ-plot by comparing the residuals to “ideal” normal observations along the 45-degree line. In our analysis, R automatically flagged the same three data points with high residuals (observations 127, 128, and 129). However, aside from those three data points, observations in the QQ-plot lie well along the 45-degree line. Therefore, we can assume that normality holds.

The histogram of the residuals and the Shapiro-Wilk normality test also support the normality assumption.

  • Checking Homoscedasticity assumption

The scale-location plot is useful for checking the homoscedasticity assumption. It shows whether the spread of the residuals changes with the fitted values. Apart from the R-flagged observations 127, 128, and 129, we see a roughly horizontal line with data points scattered randomly around it, suggesting that the homoscedasticity assumption is satisfied here.

  • Checking for Outliers

The above graph does not show any data points beyond the Cook’s distance contours, which indicates that there are no influential data points in the data set.
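The four checks above map onto the standard diagnostic plots that base R produces for a fitted linear model. The sketch below runs them on simulated data; the variables weight and log_price and the model fit are stand-ins invented for illustration, not the project's fitted model.

```r
# Simulated data (not the real FlyReels data)
set.seed(1)
n <- 200
weight <- runif(n, 3, 10)
log_price <- 4 + 0.1 * weight + rnorm(n, sd = 0.3)
fit <- lm(log_price ~ weight)

# Standard lm diagnostics: 1 = residuals vs fitted, 2 = normal QQ,
# 3 = scale-location, 5 = residuals vs leverage with Cook's distance contours
op <- par(mfrow = c(2, 2))
plot(fit, which = c(1, 2, 3, 5))
par(op)

# Formal normality test on the residuals
sw <- shapiro.test(residuals(fit))

# Largest Cook's distance; plot.lm draws its contours at 0.5 and 1 by default
max_cook <- max(cooks.distance(fit))
```

A large Shapiro-Wilk p-value and a maximum Cook's distance far below the 0.5 contour would point in the same direction as the visual checks.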

5 Economic and Statistical Modeling

Hedonic price theory estimates the relationship between the price of an asset (the dependent variable) and its various characteristics (the independent variables). It breaks the item under investigation down into its constituent features and estimates the contributory value of each. According to the theory, a product’s price reflects the total value of its individual attributes, such as size, location, quality, and so on. Hedonic models are most commonly estimated by regression analysis, which is the approach used here.

5.1 Feature Selection

Before applying the feature selection methods, we analyzed the variables conceptually to decide whether each one belongs in the model.

  • Name: The name of a fly-fishing reel belongs to a specific brand, so it depends on the brand name. It has therefore been excluded from the model.

  • Brand: There are 18 different brands of fly-fishing reel in total, and only 3 of them produce in both US and non-US locations. For these brands, US-produced reels are priced significantly higher than non-US ones. Brand has therefore been discarded as a feature, since it depends on the country feature.

  • Country: Since the aim of the analysis is to measure consumers’ willingness to pay for USA-made fly reels, a new variable, USA_made, has been added. It indicates whether a particular reel is made in the USA or outside of it. The original Country variable is therefore excluded from the model.

Backward, forward, and stepwise selection all indicate that the model with the weight, width, sealed, and machined variables yields the lowest AIC. We therefore use these variables as the product characteristics in the model.

5.1.1 Empirical Model

The objective of the empirical investigation of the FlyReels data is to provide an empirical estimate of the hedonic price function \(p=p(z)\). To do this, it is necessary to define a regression equation that, in its general form, can be written,

\[ g(p_i) = h(z_i) + \epsilon_i \]

where \(p_i\) is the price of observation \(i\), \(z_i\) is a row vector of the associated product’s characteristics, and \(\epsilon_i\) is an observation-specific error term.

Here, we define \(g(\cdot)\) as the log transformation; that is, our empirical model regresses the natural logarithm of the fly reel’s price on some function of the explanatory variables. Using the log of the price has at least two advantages:

First, the distribution of fly reel prices in a market tends to be right-skewed (as we observed in the histogram). Such distributions are often associated with heteroskedasticity and/or non-normality of the errors, both of which complicate estimation.

Second, the log transformation yields readily interpretable coefficient estimates. For example, the coefficient on a regressor entered in simple linear form indicates the approximately constant proportional response of the fly reel’s price to a unit increase in that regressor. The regression model can be rewritten as

\[ \ln(p_i) = h(z_i) + \epsilon_i \]

The empirical specification that will allow us to investigate the hedonic price theory is:

\[ \ln(Price) = \beta_0 + \beta_1 \times Weight + \beta_2 \times Sealed + \beta_3 \times Machined + \beta_4 \times USA\_made + \epsilon \]

Here, the parameters \(\beta_1\) to \(\beta_4\) are semi-elasticities: each measures the proportional change in price associated with a one-unit change in the corresponding characteristic (for the binary characteristics, a change from 0 to 1). The hedonic price of a particular characteristic is therefore the slope of \(\ln(Price)\) with respect to that characteristic. Here, \(\beta_4\) represents the country-of-manufacture effect.

Since the dependent variable is the \(\log\) of Price, each characteristic contributes to the price multiplicatively: its contribution is the exponential of the product of the coefficient and the corresponding characteristic, which gives the proportionate markup for that characteristic. Therefore,

\[ Price = \exp(\beta_0) \cdot \exp(\beta_1 \times Weight) \cdot \exp(\beta_2 \times Sealed) \cdot \exp(\beta_3 \times Machined) \cdot \exp(\beta_4 \times USA\_made) \cdot \exp(\epsilon) \]
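To make the markup interpretation concrete, the following short derivation compares two reels that are identical in every respect except USA\_made. It uses only the multiplicative form of the price equation; the label "markup" is introduced here for illustration.

\[ \frac{Price \mid USA\_made = 1}{Price \mid USA\_made = 0} = \exp(\beta_4) \quad \Rightarrow \quad \text{markup} = \exp(\beta_4) - 1 \]

For small \(\beta_4\), \(\exp(\beta_4) - 1 \approx \beta_4\), which is why a coefficient in a log-price model is often read directly as a percentage. For larger coefficients the exact expression \(\exp(\beta_4) - 1\) should be used.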

5.2 Result Analysis

After running the regression according to the model specified before, the following results are found for the coefficients.

| Item | Value |
|---|---|
| Observations | 248 |
| Dependent variable | log(Price) |
| Type | Linear regression |
| χ²(4) | 74.87 |
| Pseudo-R² (Cragg-Uhler) | 0.87 |
| Pseudo-R² (McFadden) | 0.70 |
| AIC | 157.29 |
| BIC | 178.37 |

| Term | Est. | S.E. | t val. | p |
|---|---|---|---|---|
| (Intercept) | 3.94 | 0.08 | 48.82 | 0.00 |
| Weight | 0.10 | 0.01 | 10.75 | 0.00 |
| SealedYes | 0.42 | 0.05 | 8.70 | 0.00 |
| MachinedYes | 0.76 | 0.07 | 10.36 | 0.00 |
| USA_madeyes | 0.52 | 0.05 | 10.55 | 0.00 |

Standard errors: MLE

  • The intercept gives the estimated log price of a fly reel produced outside the USA with none of the features included in the model. The corresponding price is exp(3.94) = 51.4186013, or about $51.

  • Weight has a small p-value and is a statistically significant feature of a fly reel. The model predicts that each additional unit of weight is associated with an exp(0.10) - 1 = 0.1051709, or roughly 10.5%, increase in the price of a fly reel on average. People might prefer a heavier fly reel because it makes the rod easier to handle or because it is suited to catching large fish.

  • The sealed and machined coefficients also contribute significantly to the price. The model predicts that sealed fly reels are priced exp(0.42) - 1 = 0.5219616, or about 52%, higher on average than non-sealed fly reels. Likewise, machined fly reels are predicted to be priced exp(0.76) - 1 = 1.1382762 higher with the rounded coefficient (114.6% with the full-precision estimate) than non-machined fly reels on average.

  • USA-made fly reels are priced exp(0.52) - 1 = 0.6820276 higher with the rounded coefficient (68.5% with the full-precision estimate) than non-USA-made fly reels on average. The country-of-manufacture effect is a proportional markup, and it can be interpreted as the premium consumers are willing to pay for USA-made fly reels.
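The percentage effects quoted above follow from the rule that a coefficient b in a log-price model implies a proportional change of exp(b) - 1. A minimal sketch in base R, using the rounded coefficients from the regression table (the vector names beta, markup_pct, and baseline are chosen here for illustration):

```r
# Coefficients as reported (rounded) in the regression table
beta <- c(Weight = 0.10, SealedYes = 0.42, MachinedYes = 0.76, USA_madeyes = 0.52)

# Proportional change in price per unit of Weight, or per 0 -> 1 switch of a dummy
markup_pct <- 100 * (exp(beta) - 1)
round(markup_pct, 1)

# Price implied by the intercept alone (dummies at their reference level)
baseline <- exp(3.94)
```

Because the inputs are rounded to two decimals, the results can differ slightly from percentages computed with the full-precision estimates.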

5.3 Country of Manufacture Effect:

5.3.1 Likelihood Ratio Test:

To test the country-of-manufacture effect, a likelihood ratio test has been applied. The likelihood ratio test is a statistical test used to compare the fit of two nested models, one of which is a restricted version of the other. It is based on the likelihood function, which measures how well a statistical model fits the observed data. The test compares the likelihood of the data under the full (unrestricted) model with the likelihood under the restricted model and determines whether the difference is statistically significant.

The following likelihood ratio test result shows a very low p-value, indicating evidence against the reduced model and in favor of the full model, which means that country of manufacture matters.

Likelihood Ratio Test Results

| Model | #Df | LogLik | Df | Statistic | p-value |
|---|---|---|---|---|---|
| log(Price) ~ Weight + Sealed + Machined + USA_made | 6 | -72.64471 | NA | NA | NA |
| log(Price) ~ Weight + Sealed + Machined | 5 | -119.37585 | -1 | 93.46227 | < 0.001 |
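The test statistic can be reproduced by hand from the two log-likelihoods in the table: it is twice their difference, compared against a chi-squared distribution with one degree of freedom (the single restricted parameter). A minimal sketch in base R, with variable names chosen here for illustration:

```r
ll_full    <- -72.64471    # log-likelihood of the model with USA_made
ll_reduced <- -119.37585   # log-likelihood of the model without USA_made

# Likelihood ratio statistic and its chi-squared p-value (1 restriction)
lr_stat <- 2 * (ll_full - ll_reduced)
p_value <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

lr_stat
p_value
```

The statistic matches the tabulated value up to rounding of the log-likelihoods, and the p-value is far below any conventional significance level.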

5.3.2 Oaxaca Decomposition

Oaxaca decomposition has been applied to estimate the importance of the country of manufacture in explaining the observed variation in the price data. The method assesses how observed differences between two groups (fly reels made in the USA and outside of the USA) affect the price of fly reels. It decomposes the observed price difference into explained and unexplained components. The part of the difference that is attributable to differences in the predictor variables (such as product weight, whether the reel is machined, and whether it is sealed) is known as the explained component; the part that differences in the predictor variables cannot account for is known as the unexplained component.

One of the main benefits of using Oaxaca decomposition to assess the impact of country of manufacture on fly reel prices is its ability to account for variations in the reels’ observable characteristics. For instance, fly reels produced in one nation may differ from those produced in another in terms of weight, machining, and sealing. By accounting for these variations, we can obtain a more precise estimate of the impact of the country of manufacture on fly reel prices. Also, the analysis would enable the manufacturer to determine how much of the price difference is caused by unobservable factors like brand perception, marketing tactics, or supply chain logistics. Moreover, the analysis would provide insights into addressing those issues in a competitive market by comprehending the scope and makeup of these “unexplained” differences.

We are interested in decomposing the price gap between the USA-made and non-USA-made groups. The price gap could be due to group differences in the levels of price determinants such as weight and sealing. Alternatively, the gap could arise from differential effects of these determinants on USA-made and non-USA-made fly reels. The oaxaca() function compares the proportions of the two levels (“yes”, “no”) of a categorical variable between the two groups; because all fly reels made in the USA are machined, the Machined variable has no variation within that group, and oaxaca() did not include it in the analysis. Overall, the oaxaca() function estimates the relative magnitude of the influence of the Weight and Sealed characteristics:

There are 248 observations in the data set: 113 observations (Group A) are fly reels manufactured outside of the USA, and the other 135 (Group B) are USA-made. The following table shows the difference in average price between Group A (non-USA-made) and Group B (USA-made).

Difference between average price of USA-made and non-USA-made fly reels

| Variable | Value |
|---|---|
| Avg Price of non-USA-made fly reels (Group A) | 292.4887 |
| Avg Price of USA-made fly reels (Group B) | 484.9167 |
| Difference between the mean price of Group A and B | -192.4281 |

The $192.4 price gap is decomposed by the Blinder-Oaxaca method as follows:

##   coef(endowments)     se(endowments) coef(coefficients)   se(coefficients) 
##          44.124235          21.098471        -244.083809          16.339401 
##  coef(interaction)    se(interaction) 
##           7.531506           9.208312

The results of the threefold decomposition suggest that, of the -$192.4 difference, approximately $44.1 can be attributed to group differences in endowments (weight and sealing), -$244.1 to differences in coefficients, and the remaining $7.5 to the interaction of the two.

The “endowments” term reflects the difference in average characteristics (weight and sealing) between the two groups being compared. The “coefficients” term reflects the difference in the two groups’ regression coefficients, that is, how differently the same characteristics are priced in each group. The “interaction” term accounts for the fact that differences in characteristics and differences in coefficients occur at the same time. The output also shows a standard error for each component, which gives an idea of how precisely that component is estimated.
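The three terms can be written out explicitly. The formulation below is the usual threefold decomposition taking Group B as the reference group; that choice of reference is an assumption here, since the text does not state it. \(\bar{X}_A\) and \(\bar{X}_B\) denote the vectors of group means of the regressors (including the constant), and \(\hat{\beta}_A\) and \(\hat{\beta}_B\) the group-specific regression coefficients.

\[ \bar{Y}_A - \bar{Y}_B = \underbrace{(\bar{X}_A - \bar{X}_B)'\hat{\beta}_B}_{\text{endowments}} + \underbrace{\bar{X}_B'(\hat{\beta}_A - \hat{\beta}_B)}_{\text{coefficients}} + \underbrace{(\bar{X}_A - \bar{X}_B)'(\hat{\beta}_A - \hat{\beta}_B)}_{\text{interaction}} \]

As a check on the reported output, the three components add up to the raw gap in mean prices:

\[ 44.12 + (-244.08) + 7.53 = -192.43 \]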

  • Endowments: The endowments component (44.124235) captures the portion of the price gap that is explained by differences in the average values of the endowment variables (Weight and Sealed_binary) between the two groups.

  • Coefficients: The coef(coefficients) component is calculated by comparing the two groups’ regression coefficients. Its value of -244.08 means that, holding the characteristics fixed, the coefficients estimated for non-USA-made fly reels imply a price about $244 lower than the coefficients estimated for USA-made fly reels. In other words, if non-USA-made fly reels had the same values of the independent variables as USA-made fly reels, we would still expect them to be priced lower. Part of this reflects Weight: a unit increase in Weight has a smaller effect on the price of non-USA-made fly reels than on the price of USA-made fly reels. Overall, factors beyond Weight and Sealed_binary account for most of the price gap.

  • Interaction: The interaction component is approximately $7.5 (7.531506). It captures the part of the price gap that arises because differences in endowments and differences in coefficients occur together. It is small relative to the other two components, and its standard error (9.21) is larger than the estimate itself.
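The mechanics of the threefold decomposition can be sketched by hand with two group-wise regressions in base R. Everything below is simulated and hypothetical (the data frame sim, its coefficients, and the component names); it is not the oaxaca() call used in this analysis, and it assumes Group B as the reference group.

```r
# Simulated two-group data (hypothetical; not the FlyReels data)
set.seed(1)
n_a <- 120; n_b <- 130
sim <- data.frame(
  group  = rep(c("A", "B"), c(n_a, n_b)),
  weight = c(rnorm(n_a, 5, 1), rnorm(n_b, 6, 1)),
  sealed = c(rbinom(n_a, 1, 0.7), rbinom(n_b, 1, 0.5))
)
sim$price <- with(sim, ifelse(group == "A",
                              100 + 20 * weight + 80 * sealed,
                              250 + 40 * weight + 120 * sealed)) +
  rnorm(nrow(sim), sd = 30)

# Separate regressions for each group
fit_a <- lm(price ~ weight + sealed, data = sim, subset = group == "A")
fit_b <- lm(price ~ weight + sealed, data = sim, subset = group == "B")

# Group means of the regressors (including the constant) and coefficients
x_a <- colMeans(model.matrix(fit_a)); b_a <- coef(fit_a)
x_b <- colMeans(model.matrix(fit_b)); b_b <- coef(fit_b)

# Threefold decomposition with Group B as reference
endow_part <- sum((x_a - x_b) * b_b)          # differences in characteristics
coef_part  <- sum(x_b * (b_a - b_b))          # differences in coefficients
inter_part <- sum((x_a - x_b) * (b_a - b_b))  # both differing at once

gap <- mean(sim$price[sim$group == "A"]) - mean(sim$price[sim$group == "B"])
all.equal(endow_part + coef_part + inter_part, gap)
```

Because each regression includes an intercept, the fitted group means equal the observed group means, so the three parts sum exactly to the raw gap. The sketch does not compute standard errors for the components.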

Now, let’s examine the endowments and coefficients components of the threefold decomposition variable by variable. The following plot shows the estimation results for each variable, along with error bars that indicate 95% confidence intervals.

In the endowments component, a significant portion of the price gap appears to be driven by group differences in the proportion of sealed fly reels. Fly reels without the sealed feature tend to be priced lower, as can be seen from the pooled regression coefficient on Sealed_binary0 reported below. Furthermore, the negative value of x.mean.diff shows that the proportion of unsealed reels is lower in the non-USA-made group than in the USA-made group: on average, non-USA-made fly reels are about 22.5 percentage points more likely to be sealed than USA-made fly reels.

summary(results$reg$reg.pooled.2)$coefficients["Sealed_binary0",]
##      Estimate    Std. Error       t value      Pr(>|t|) 
## -1.598419e+02  1.841780e+01 -8.678666e+00  5.781138e-16
results$x$x.mean.diff["Sealed_binary0"]
## Sealed_binary0 
##     -0.2248443

Moreover, in the coefficients component, the Weight variable achieves statistical significance. The results$beta$beta.diff["Weight"] element of the Oaxaca decomposition measures the difference between the two groups in the regression coefficient on Weight. Each group’s coefficient represents the change in the predicted price of a fly reel for each unit increase in weight, after controlling for the other variables in the model.

 results$beta$beta.diff["Weight"]
##    Weight 
## -19.60932

The difference in the Weight coefficients between non-USA-made and USA-made fly reels (-19.61) indicates that the relationship between Weight and price is weaker for non-USA-made fly reels. Specifically, the regression coefficient on Weight is about 19.6 lower for non-USA-made fly reels than for USA-made fly reels, meaning that an extra unit of weight adds less to the price in the former group. Finally, the interaction part is negligible in the analysis.

6 Conclusion

Overall, the findings of the Blinder-Oaxaca decomposition suggest that the country of manufacture significantly affects the price of fly reels after controlling for the other variables in the model. Customers appear to value heavy, machined fly reels more when they are manufactured in the USA. On average, USA-made fly reels are more expensive than non-USA-made fly reels. However, this price gap is only partially explained by differences in the endowment variables, and other factors not included in the model (e.g., marketing effort, product packaging, customer service) may also contribute to the price difference.

In conclusion, although the scope of the analysis is limited because only a few variables are included in the model, it still offers useful insight into what influences consumers’ willingness to pay more for products made in the United States. These results can be helpful for consumers who want to know what drives fly-fishing reel prices and for manufacturers deciding which features and qualities to include in their products.

7 Data Wrangling Appendix

#rm(list = ls())

# load libraries


read_library <- function(...) {
     invisible(lapply(substitute(list(...))[-1], function(x) 
                       library(deparse(x), character.only = TRUE)))
}

read_library(
        readxl,
        skimr,
        # tidyverse,   # deduplication, grouping, and slicing functions
        janitor,     # function for reviewing duplicates
        stringr,     # for string searches, can be used in "rolling-up" values
        ggplot2,
        lubridate,
        naniar,
        forcats,
        rmarkdown,
        here,
        tinytex,
        knitr,
        MASS,
        cowplot,
        gt,
        GGally,
        jtools,
        kableExtra)
library(readxl)
library(here)

fly_reels <-read_excel(here("Data","Clean_Data", "FlyReels.xlsx"))

str(fly_reels)

#rm(fly_reels)
library(dplyr)

fly_reels <- fly_reels %>%
  mutate(
    # weight per unit volume, treating the reel as a cylinder
    Density = Weight / (pi * (Diameter / 2)^2 * Width),
    # assumed definition: the chunk that creates log_price is not shown
    log_price = log(Price),
    # flag USA-made reels
    USA_made = case_when(Country == "USA" ~ "yes",
                         TRUE ~ "no")
  ) %>%
  glimpse()
library(dplyr)


# Model including all variables
full_variable <- subset(fly_reels, select = - c(Name, Brand, Country, Price, bc_price)) 

#rm(full_variable)


library(MASS)

#intercept-only model
fit_start <- glm(log_price ~ 1, data = full_variable)


# Fit the full model 
fit_all <- glm(log_price ~ ., data = full_variable)


# Apply Forward selection method
forward_model <- step(fit_start, direction = "forward", scope = formula(fit_all)) 
summ(forward_model)


# Apply Backward selection method
backward_model <- step(fit_all, direction = "backward")
summ(backward_model)


# Apply stepwise selection method 
stepwise <- step(fit_start, direction =  "both", scope = formula(fit_all))
summ(stepwise)
# Likelihood ratio test for testing the hypothesis that country of manufacture does not matter

library(lmtest)
library(zoo)
library(broom)


#Fit the full and reduced models
final_model <- glm(log(Price) ~ Weight + Sealed + Machined + USA_made, data= fly_reels)

model_reduced <- glm(log(Price) ~ Weight+Sealed+Machined , data= fly_reels)


# perform the likelihood ratio test
lrt_result <- lrtest(final_model,model_reduced)



# Use broom::tidy() to extract the results from the lrt_results object
tidy_results <- tidy(lrt_result)




# Print the result table using the kable function
tidy_results %>%
  kbl(caption = "Likelihood Ratio Test Results") %>%
  kable_classic(full_width = F, html_font = "Cambria")
# Create dummy variables for the machined, sealed, and USA-made variables

fly_reels$Sealed_binary <- factor(fly_reels$Sealed,levels = c('Yes', 'No'),labels = c(1, 0))

fly_reels$Machined_binary <- factor(fly_reels$Machined,levels = c('Yes', 'No'),labels = c(1, 0))

fly_reels$USA_made_binary <- factor(fly_reels$USA_made,levels = c('yes', 'no'),labels = c(1, 0))
# results$n

# check which groups belong to USA_made and which not 
#fly_reels %>% 
 #       group_by(USA_made) %>% 
  #      count() %>% 
   #     glimpse()

# Analyze the difference in average between two groups
#results$y

#results$threefold$overall

# Create a data frame to store the values
result_table <- data.frame(Variable = c(" Avg Price of non-USA-made fly reels (Group-A)", "Avg Price of USA-made fly reels", "Difference between the mean price of Group A and B"),
                           Value = c(292.4887, 484.9167, -192.4281))
#rm(result_table)

# Print the result table using the kable function
result_table %>%
  kbl(caption = "Difference between average price of USA-made and non-USA-made fly reels") %>%
  kable_classic(full_width = F, html_font = "Cambria")
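# show results of the threefold decomposition
# (`results` is the object returned by the oaxaca() call described in Section 5.3.2;
# that call is not shown in this appendix)
results$threefold$overall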

8 Data Visualization Appendix

# Define a custom plot theme

plot_theme_detail <- function(...){
  theme_bw() +
  theme(
    # adjust axes
    axis.line = element_blank(),
    axis.text = element_text(size = 14,
                             color = "black"),
    axis.text.x = element_text(margin = margin(5, b = 10)),
    axis.title = element_text(size = 14,
                              color = 'black'),
    axis.ticks = element_blank(),

    # add a subtle grid
    panel.grid.minor = element_blank(),
    panel.grid.major = element_line(color = "#dbdbd9", size = 0.2),

    # adjust background colors
    plot.background = element_rect(fill = "white",
                                   color = NA),
    panel.background = element_rect(fill = "white",
                                    color = NA),
    legend.background = element_rect(fill = NA,
                                     color = NA),
    # adjust titles
    legend.title = element_text(size = 14),
    legend.text = element_text(size = 14, hjust = 0,
                               color = "black"),
    plot.title = element_text(size = 20,
                              color = 'black',
                              margin = margin(10, 10, 10, 10),
                              hjust = 0.5),

    plot.subtitle = element_text(size = 10, hjust = 0.5,
                                 color = "black",
                                 margin = margin(0, 0, 30, 0))
    )

}


plot_theme <- function(...){
  theme_bw() +
  theme(panel.grid = element_blank(),
          panel.background = element_blank(),
          axis.text = element_text(size = 12),
          axis.ticks = element_blank(),
          axis.title = element_text(size = 12, color = "black"), 
          plot.title = element_text(size = 14, hjust = 0.5, face = "bold"))
}
library("scales")
library(ggplot2)
library(patchwork)
library(dplyr)

(price_hist <- ggplot(data = fly_reels) +
    geom_histogram(aes(x = Price),
                   alpha = 0.9,
                   fill = '#45ADA8') +
    labs(x = 'Price',
         y = 'Frequency') +
    plot_theme())
# Plot a scatter plot of the data
(price_scatter <- ggplot(data = fly_reels) +
    geom_point(aes(x = Weight, y = Price),  # one point per fly reel
                   alpha = 0.9,
                   color = '#547980') +
    labs(x = 'Weight',
         y = 'Price',
         title = "Scatterplot of Flyreels' weight and price") +
    plot_theme())  # apply the custom theme
summary(fly_reels $ log_price)

# Plot the histogram of log transformed data

 
(log_price_hist <- ggplot(data = fly_reels) +
    geom_histogram(aes(x = log_price),
                   alpha = 0.9,
                   fill = '#547980') +
    labs(x = 'log(Price)',
         y = 'Frequency') +
    plot_theme())

# kernel-smoothed density of log_price
(log_price_density <- ggplot(data = fly_reels) + 
  geom_density(aes(x = log_price),
               # constants go outside aes() so they are not mapped to a scale
               color = 4,
               fill = 4,
               alpha = 0.5) +
  labs(x = "log(Price)",
       y = "Density") +
  #ggtitle ("Kernel smoothed density function of logarithm of  price") + 
  plot_theme() +
  theme(legend.position = "none")
)
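library(MASS)  # boxcox()
library(ggplot2)
# Assumed reconstruction (chunk not shown): lambda from a Box-Cox fit of Price on Weight
bc <- boxcox(lm(Price ~ Weight, data = fly_reels), plotit = FALSE)
lambda <- bc$x[which.max(bc$y)]  # about 0.2 according to Section 4

# Transform the data using this lambda value
fly_reels <- fly_reels %>%
  mutate(bc_price = (Price^lambda - 1) / lambda)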

# Plot a histogram of the Box-Cox transformed data
(price_hist_bc <- ggplot(data = fly_reels) +
    geom_histogram(aes(x = bc_price),
                   alpha = 0.9,
                   fill = '#594F4F') +
    labs(x = 'Price',
         y = 'Frequency') +
    plot_theme())
library(cowplot)

# Panel of histograms for different transformations
(price_dist_panel <- plot_grid(price_hist + labs(title = 'Original data'),  # original data  
                        log_price_hist + labs(title = 'Log transformation'),  # logarithmic transformation
                        price_hist_bc + labs(title = 'Box-Cox transformation'),  # Box-Cox transformation
                        nrow = 2,  # number of row in the panel
                        ncol = 2))  # number of columns in the panel
library("GGally")

# Pairs plot of log_price, Weight, Machined, Sealed, and Country, colored by country (USA, Korea, and China)
#rm(country_color)

country_color <- c(China = '#45ADA8', Korea ='#547980', USA = '#9DE0AD')
        
(variables_plot <- fly_reels %>% ggpairs (columns= c("log_price", "Weight", "Machined", "Sealed","Country"),
                      aes(color = Country),
                      upper = list(continuous= wrap('cor', size = 5)),
                      lower= list(combo = wrap("facethist",bins=30)),
                      diag= list(continuous = wrap("densityDiag", alpha = 0.5))) +
                       scale_fill_manual(values = country_color)
)
library(MASS)
library(psych)

# numeric_variable: data frame holding the numeric columns of fly_reels
# (its construction is not shown in this appendix)
# pairs.panels() draws the plot as a side effect and returns NULL, so it is not assigned
pairs.panels(numeric_variable,
             smooth = TRUE, # if TRUE, draws loess smooths
             scale = TRUE, # if TRUE, scales the correlation font
             density = TRUE,
             method = "pearson",
             lm = FALSE, # if TRUE, plots linear fit rather than the LOESS (smoothed) fit
             cor = TRUE,
             hist.col = '#9DE0AD', # histogram color
             stars = TRUE, # if TRUE, adds significance stars
             main = "Correlation Pair Panels among Numeric Variables")
# Create a data frame to store the values
result_table2 <- data.frame(Variable = c(" Coefficient of endowments", " Standard error of endowments", "Coefficient estimates", "Standard error of coefficients", "Coefficient of interaction terms", "Standard error of interaction"),
                           Value = c(44.124235, 21.300074, -244.083809, 16.838085,  7.531506, 9.316190))
#rm(result_table)

# Print the result table using the kable function
result_table2 %>%
  kbl(caption = "Analysis of threefold decomposition") %>%
  kable_classic(full_width = F, html_font = "Cambria")