# [Fashion] Related Articles

### 1. How to be a model agent by DAZED magazine

Carole White, a founder of the Premier model agency, talked about running an agency and the lives of fashion models. The following passage from the article seems worth keeping as a reference.

“Now we’re in a social media era. The whole business has changed so much in the last five years. It’s changing how advertising is done; it’s changing how we evaluate how much a job is worth. Before it used to be how many posters and billboards are there, but that’s not the crucial element anymore. Followers have become a currency and agents around the world have been slow to click onto that.”

# Poisson Distribution and Regression (+ Zero Inflation)

## Poisson Distribution (from Wikipedia)

• “… is a discrete probability distribution that expresses the probability of a given number of events occurring in a fixed interval of time and/or space if these events occur with a known average rate and independently of the time since the last event.”
• Wow, this fits the dependent variable in my research exactly:
• a fixed interval of time and space
• the time intervals between events have an average, but are independent of each other
• If a discrete random variable X has a Poisson distribution with parameter lambda > 0, then P(X = k) = (lambda^k * e^(-lambda)) / k! for k = 0, 1, 2, …
• lambda = E(X) = mean = variance
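The mean-equals-variance property is easy to check empirically. A minimal sketch with NumPy (the rate lam = 3.5 and the sample size are arbitrary choices for illustration):

```python
import numpy as np

# Draw a large Poisson sample and check that the sample mean and the
# sample variance both estimate the same parameter lambda.
rng = np.random.default_rng(0)
lam = 3.5                        # arbitrary rate parameter
x = rng.poisson(lam, size=200_000)

print(x.mean())  # ≈ 3.5
print(x.var())   # ≈ 3.5
```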

## Poisson Regression (from Wikipedia)

• Poisson regression assumes,
• the response variable Y has a Poisson distribution
• the logarithm of its expected value can be modeled by a linear combination of unknown parameters.
• Also called log-linear model
• Zero inflation?
• “Another common problem with Poisson regression is excess zeros: if there are two processes at work, one determining whether there are zero events or any events, and a Poisson process determining how many events there are, there will be more zeros than a Poisson regression would predict. An example would be the distribution of cigarettes smoked in an hour by members of a group where some individuals are non-smokers.”
• Zero-inflated model
• “… a zero-inflated model is a statistical model based on a zero-inflated probability distribution, i.e. a distribution that allows for frequent zero-valued observations.”
• “The zero-inflated Poisson model concerns a random event containing excess zero-count data in unit time.” : fit to my dataset!
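A minimal sketch of the log-linear model above, fitted by Newton's method on synthetic data (the data-generating coefficients and all variable names are illustrative; in practice a library routine such as a Poisson GLM in StatsModels, which also offers a zero-inflated Poisson model, would do the fitting):

```python
import numpy as np

# Poisson regression sketch: log E[Y] = X @ beta, fitted by
# Newton-Raphson on the Poisson log-likelihood. Data are synthetic.
rng = np.random.default_rng(1)
n = 5000
x = rng.uniform(0, 2, size=n)
X = np.column_stack([np.ones(n), x])        # intercept + one predictor
beta_true = np.array([0.5, 0.8])            # illustrative coefficients
y = rng.poisson(np.exp(X @ beta_true))      # counts with log-linear mean

beta = np.array([np.log(y.mean()), 0.0])    # standard GLM starting point
for _ in range(25):
    mu = np.exp(X @ beta)                   # current fitted means
    grad = X.T @ (y - mu)                   # score vector
    hess = X.T @ (X * mu[:, None])          # Fisher information
    beta = beta + np.linalg.solve(hess, grad)

print(beta)  # close to beta_true = [0.5, 0.8]
```

With the canonical log link the Poisson log-likelihood is concave, so Newton's method coincides with Fisher scoring and converges reliably from this starting point.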

# Multicollinearity (mainly from Wikipedia)

## Meaning and Effect

• Multicollinearity (also collinearity) is a statistical phenomenon in which two or more predictor variables in a multiple regression model are highly correlated, meaning that one can be linearly predicted from the others with a non-trivial degree of accuracy.
• Multicollinearity does not reduce the predictive power or reliability of the model as a whole, at least within the sample data themselves; it only affects calculations regarding individual predictors.

## Detection

• Large changes in the estimated regression coefficients when a predictor variable is added or deleted.
• One variable is statistically significant in a simple linear regression model, but not in a multiple regression model.
• If the Condition Number is above 30, the regression is said to have significant multicollinearity. The Condition Number appears in the regression result summary of Python's StatsModels.
• Use correlation matrix. Correlation values (off-diagonal elements) of at least .4 are sometimes interpreted as indicating a multicollinearity problem.
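The last two checks can be sketched on synthetic data (the weights 0.98 and 0.02 are chosen only to make x2 nearly collinear with x1; StatsModels may compute its reported condition number slightly differently):

```python
import numpy as np

# Two of the detection heuristics above: the off-diagonal correlation,
# and the condition number of the design matrix, i.e. the ratio of its
# largest to smallest singular value. Data are synthetic.
rng = np.random.default_rng(2)
n = 1000
x1 = rng.normal(size=n)
x2 = 0.98 * x1 + 0.02 * rng.normal(size=n)   # nearly collinear with x1

corr = np.corrcoef(x1, x2)[0, 1]
print(corr > 0.4)                            # True: flags a problem

X = np.column_stack([np.ones(n), x1, x2])
s = np.linalg.svd(X, compute_uv=False)
print(s[0] / s[-1] > 30)                     # True: condition number above 30
```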

## Remedies

• Dummy variables for every category together with the regression constant produce perfect multicollinearity (the dummy variable trap); drop one category as the baseline to avoid it.
• Drop one of the variables. However, you lose information (because you’ve dropped a variable). Omission of a relevant variable results in biased coefficient estimates for the remaining explanatory variables that are correlated with the dropped variable.
• Standardize your independent variables. This may help reduce the false flagging of a condition index above 30. The scikit-learn module in Python has a function for standardizing variables, “preprocessing.scale”. Of course, the disadvantages of standardization should also be considered when interpreting the regression results.
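A sketch of the standardization remedy in plain NumPy (the two predictors and their scales are made up; scikit-learn's preprocessing.scale performs the same zero-mean, unit-variance transformation):

```python
import numpy as np

# Standardizing predictors to zero mean and unit variance shrinks the
# condition number when the raw columns live on very different scales.
rng = np.random.default_rng(3)
n = 1000
height = rng.normal(170, 10, size=n)         # hypothetical predictor (cm)
income = rng.normal(50_000, 15_000, size=n)  # hypothetical predictor (USD)

def standardize(v):
    return (v - v.mean()) / v.std()

X_raw = np.column_stack([np.ones(n), height, income])
X_std = np.column_stack([np.ones(n), standardize(height), standardize(income)])

print(np.linalg.cond(X_raw) > np.linalg.cond(X_std))  # True
```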

# Heteroscedasticity (mainly from Wikipedia)

## Meaning and Effect

• In statistics, a collection of random variables is heteroscedastic if there are sub-populations that have different variabilities from others. That is, if the variance of dependent values in a dataset is (significantly) different depending on the related independent values, we can say that the dataset is heteroscedastic.
• Example 1: A classic example of heteroscedasticity is that of income versus expenditure on meals. As one’s income increases, the variability of food consumption will increase. A poorer person will spend a rather constant amount by always eating inexpensive food; a wealthier person may occasionally buy inexpensive food and at other times eat expensive meals. Those with higher incomes display a greater variability of food consumption. (Higher income, the independent variable, causes higher variance in expenditure on meals, the dependent variable.)
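A toy version of the income-versus-meals example (all numbers are made up): the noise scale grows with the independent variable, so subgroups at different income levels show different spreads in the dependent variable.

```python
import numpy as np

# Heteroscedastic data: the standard deviation of spending is
# proportional to income, so higher-income observations scatter more.
rng = np.random.default_rng(4)
income = rng.uniform(1, 10, size=20_000)               # arbitrary units
spending = 0.3 * income + income * rng.normal(0, 0.2, size=20_000)

low = spending[income < 3]     # low-income subgroup
high = spending[income > 8]    # high-income subgroup
print(low.std() < high.std())  # True: variability grows with income
```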