Integrated size and price optimization - Integrated size and price optimization for a fashion r

We follow the approach of the former DISPO team and prefer a so-called two-stage model to a multi-stage model for integrated size and price optimization. In the first stage we decide on the supply; in the second stage, price optimization is performed according to the scenario in effect.

An important aspect, also for the models already presented, is the estimation of demand. We use an empirical method which was developed by the former DISPO team. It is the topic of the next chapter.

To benefit from all current information, we apply price optimization in the same way as described above after we have determined the optimal supply with our integrated model.

Thus, our decision support system for integrated size and price optimization, DISPO, consists of two parts. The first is to solve ISPO once and supply the branches according to the solution. The second is to perform price optimization with receding horizon (POP-RH) anew in every sales period. Therefore, we have to develop faster approaches for price optimization. In Chapter 4 we treat price optimization, as it takes part in DISPO, in detail and present a dynamic programming approach for fixed supply, before we present the Integrated Price and Size Optimization Problem (ISPO) in Chapter 6.

The remainder of the thesis is devoted to the exact and heuristic solution of ISPO and real-world experiments in terms of DISPO.

Chapter 3 Demand estimation

An important aspect of DISPO is the estimation of the dependent demand. Some special features of the situation at our industrial partner have to be taken into account. The fashion products considered are sold only once and are never offered again. Thus, historic sales data can only be used at a higher aggregation level, e.g., the average historic demand at a given price in a given sales week at the commodity group level.

The number of sales also depends on the popularity of the observed products. The demand estimation method should account for this aspect.

Since the supplies per branch and size of a single product are zero, one, or two in most cases, we can expect that historic sales data will only give us very coarse information. Only sales can be observed, not the real demand. If a size is sold out in a branch, we do not know whether the actual demand for this size in this branch was higher than the observed sales. This kind of data is called right-censored. The additional sales that would have occurred if the corresponding supply had been higher are known in the literature as lost sales. References can be found in Section 3.1, where we also state some estimation approaches from the literature and outline their applicability to our situation. We present an empirical estimation approach for DISPO, developed by the former DISPO team, in Section 3.2. A parametric approach, logistic regression together with maximum likelihood estimation, is outlined in Section 3.3. We present a logistic regression model for demand estimation for DISPO in Section 3.4 and compare it to the empirical estimation method in Section 3.5.
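To make the censoring effect concrete, the following minimal sketch computes what can and cannot be observed for one size in one branch. The function name and the toy numbers are illustrative assumptions and not data from our industrial partner; in particular, the true demand passed as input is never available in practice.

```python
def observe_sales(demand, supply):
    """Return (sales, lost_sales, censored) for one size in one branch.

    `demand` is the (in practice unobservable) number of items customers
    wanted to buy, `supply` the number of items delivered to the branch.
    """
    sales = min(demand, supply)      # we can never sell more than was supplied
    lost_sales = demand - sales      # demand that could not be served
    censored = sales == supply       # sold out: the true demand may be higher
    return sales, lost_sales, censored
```

With a supply of one item and a hypothetical demand of three, only one sale is recorded and the observation is flagged as right-censored. Note that a sold-out size is flagged even if the demand happened to equal the supply exactly, because the two cases cannot be told apart from sales data.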

3.1 Literature review

The most common approach for estimating the dependence of a dependent variable on one or more independent variables is linear regression, see for example [Har10].

According to Breen [Bre96] there are two different procedures for handling censored data. One is to omit all data which might be censored and then to apply linear regression. The other is to estimate the censored data. A common approach to do this for metric data is the so-called tobit model, invented by James Tobin [Tob58], where it is assumed that the censoring point is the same for every observation. Tobit regression is a special case of the so-called censored regression, in which each observation might be censored at a different point. The models are commonly estimated via maximum likelihood estimation.
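For reference, the likelihood that such models maximize can be written down explicitly. The following is a sketch in our own notation, which appears neither in [Bre96] nor in [Tob58]. We assume right-censoring, a latent response $y_i^* = x_i^\top\beta + \varepsilon_i$ with errors $\varepsilon_i \sim N(0,\sigma^2)$, and an observed response $y_i = \min(y_i^*, c_i)$ with known censoring point $c_i$.

```latex
\[
\log L(\beta,\sigma)
  = \sum_{i:\, y_i < c_i} \log\!\left[\frac{1}{\sigma}\,
      \varphi\!\left(\frac{y_i - x_i^\top\beta}{\sigma}\right)\right]
  + \sum_{i:\, y_i = c_i} \log\!\left[1 -
      \Phi\!\left(\frac{c_i - x_i^\top\beta}{\sigma}\right)\right]
\]
```

Here $\varphi$ and $\Phi$ denote the standard normal density and distribution function. Uncensored observations contribute their density, censored observations the probability that the latent response reaches the censoring point. Under these assumptions the tobit model corresponds to the case in which all $c_i$ coincide.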

In [VvRR12] Vulcano et al. use the so-called expectation-maximization method (EM method) to estimate the demand for a product with right-censored sales data. The EM method in general was first proposed by Dempster et al. [DLR77] to estimate dependencies on the basis of incomplete observations. The basic idea of the EM algorithm is the following: starting with an initial model, one alternately estimates the missing observations according to the given data and the current model parameters (expectation) and adapts the parameters via maximum likelihood estimation (maximization). In the expectation step, Vulcano et al. estimate primary (or first-choice) demand. This is the actual demand, while the sales are only incomplete observations of primary demand. The primary demand is estimated by assuming that the arrivals of the customers are Poisson-distributed over time. If a customer wants to buy a product there are two possibilities: either the product is available, in which case the customer buys it, or it is not. If the product is not available, the customer might buy a similar product either from the same provider or from a competitor. This happens according to a so-called market share, which is given as input to the algorithm.

In the maximization step the estimated primary demand is used to estimate the parameters, in terms of preference values for the observed products, via maximum likelihood estimation.
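The alternation of the two steps can be illustrated with a deliberately simplified sketch. It is not the method of [VvRR12]: there is no choice model, no market share, and only a single product. We merely assume that the demand per observation is Poisson-distributed with an unknown rate and that an observation is right-censored whenever sales equal supply. All function names are hypothetical.

```python
import math


def poisson_tail(rate, c):
    """P(D >= c) for D ~ Poisson(rate)."""
    if c <= 0:
        return 1.0
    cdf = sum(math.exp(-rate) * rate**k / math.factorial(k) for k in range(c))
    return 1.0 - cdf


def em_poisson_rate(observations, iterations=200):
    """Estimate a Poisson demand rate from (sales, supply) pairs.

    An observation with sales == supply is treated as right-censored.
    """
    # Initial model: pretend that the observed sales were the demand.
    rate = max(sum(s for s, _ in observations) / len(observations), 1e-6)
    for _ in range(iterations):
        completed = []
        for sales, supply in observations:
            if sales < supply:
                # Not sold out: the demand was observed exactly.
                completed.append(float(sales))
            else:
                # Expectation step: E[D | D >= supply] under the current rate,
                # using the identity E[D; D >= c] = rate * P(D >= c - 1).
                completed.append(
                    rate * poisson_tail(rate, supply - 1) / poisson_tail(rate, supply)
                )
        # Maximization step: the Poisson maximum likelihood estimate is the mean.
        rate = sum(completed) / len(completed)
    return rate
```

If no observation is censored, the estimate equals the average sales. With censored observations it lies above that average, since every sold-out observation is replaced by its conditional expected demand. If all observations are censored the rate is not identifiable and the iteration grows without bound, which is one reason why such a sketch cannot be applied to our data as it stands.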

A similar approach also using the EM-algorithm with Poisson-distributed arrivals is outlined in [ADG98].

Huh et al. [HLRO11] applied the so-called Kaplan-Meier estimator, see for example [ABG10], to stochastic inventory control problems with censored demand in order to decide when items have to be reordered. The Kaplan-Meier estimator, also called the product-limit estimator, is an approach from survival analysis. This non-parametric approach examines how long the observed individuals stay in a state.

It is often applied in medical research, where the question is: “Does the patient survive or will he die?” The probability $S(t)$ that a member has a lifetime exceeding $t$ is considered. For a sample of size $N$ the observed times $t_i$ until death are ordered increasingly. The probability of surviving $t_i$ days can be seen as the probability of surviving day $t_i$ after living $t_{i-1}$ days, multiplied by the chance of surviving $t_{i-1}$ days.

Thus, $S(t)$ is given by

$$S(t) := \prod_{t_i < t} \frac{n_i - d_i}{n_i},$$

where $d_i$ is the number of deaths at time $t_i$ and $n_i$ the number of survivors at the end of $t_{i-1}$.
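The product formula can be evaluated directly. The following minimal sketch uses hypothetical function names and made-up toy data; it takes $n_i$ as the number of individuals still under observation just before $t_i$ and follows the convention $t_i < t$ used above, whereas many textbooks take the product over $t_i \le t$.

```python
def kaplan_meier_steps(times, events):
    """Return a sorted list of (t_i, n_i, d_i) for every observed death time.

    times:  observed durations, one per individual
    events: True if the death was observed at that time, False if the
            individual was censored (left the study alive) at that time
    """
    steps = []
    death_times = sorted({t for t, e in zip(times, events) if e})
    for t_i in death_times:
        n_i = sum(1 for u in times if u >= t_i)  # still at risk just before t_i
        d_i = sum(1 for u, e in zip(times, events) if e and u == t_i)
        steps.append((t_i, n_i, d_i))
    return steps


def survival(steps, t):
    """Evaluate S(t) as the product of (n_i - d_i) / n_i over all t_i < t."""
    s = 1.0
    for t_i, n_i, d_i in steps:
        if t_i < t:
            s *= (n_i - d_i) / n_i
    return s
```

For five individuals with durations 2, 3, 3, 5, 8, of which the second individual with duration 3 and the one with duration 8 are censored, the factors are 4/5, 3/4 and 1/2, so that `survival(steps, 4)` evaluates to 0.6 and `survival(steps, 6)` to 0.3 up to rounding.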

A special estimation method for binary responses is logistic regression. While linear regression might lead to outcomes that are negative or higher than one, logistic regression models the probability of the outcome one or zero. The dependent variable is transformed to the so-called odds. The odds describe the ratio of the probability that the event happens to the probability that it does not; they can only take values greater than zero. By taking the logarithm, the odds are transformed to the so-called logit. As a linear function of the independent variables with a range between −∞ and +∞, it is commonly estimated by maximum likelihood estimation.
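The chain of transformations from probability to odds to logit, and back, can be sketched in a few lines. The function names, and the intercept and coefficient arguments, are illustrative assumptions; no estimated values from this thesis are used.

```python
import math


def odds(p):
    """Odds of an event with probability p, for 0 <= p < 1."""
    return p / (1.0 - p)


def logit(p):
    """Logarithm of the odds; maps (0, 1) onto the whole real line."""
    return math.log(odds(p))


def inverse_logit(z):
    """Logistic function; maps a real number back to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)  # this branch avoids overflow for very negative z
    return e / (1.0 + e)


def predicted_probability(intercept, coefficients, x):
    """P(outcome = 1 | x) when the logit is linear in the variables x."""
    z = intercept + sum(b * v for b, v in zip(coefficients, x))
    return inverse_logit(z)
```

Whatever value the linear function takes, the predicted probability stays strictly between zero and one, which is the property that ordinary linear regression lacks.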

A similar model is the so-called probit model, see for example [Rya08]. It differs from logistic regression in the assumption that the error is normally distributed. For the logistic model we assume a distribution according to the so-called logistic function. Both models belong to the class of the so-called generalized linear models. In contrast to ordinary linear regression models, generalized linear models do not assume that the residuals are normally distributed. Both can be extended to ordinal data. For logistic regression we will outline this later on. For further information about generalized linear models and maximum likelihood estimation see for example [Har10] or [Rya08].

We did not find any general comparison of logistic and probit regression in the literature.

Applicability

Omitting right-censored data is not possible in our case. Because most sizes per branch are supplied with at most one or two items and nearly all observations are right-censored, this would lead to a tiny sample which could not serve as a basis for demand estimation at all. Thus, we have to deal with censored observations.

For our right-censored and ordinal data, linear regression as mentioned above is not suitable. Ordinary linear regression may lead to negative values for the dependent variable, i.e., in our case the demand. Because the supply, and consequently the censoring point, differs for each size and branch, the tobit model is not an option; a general censored regression model would be required instead. However, this model is not designed for integer-valued outcomes either.

The EM approach by Vulcano et al. described above appears promising in principle. However, the approach needs a set of comparable products as input. One could consider all items of the same commodity group or sub commodity group as comparable, but they are not: the articles differ in color, fashion, etc., and most importantly in price. Even if our industrial partner could provide us with lists of comparable products, there would be another difficulty: because there is no reordering of products and the products have different sales starts, we would have to take into account that the list of comparable articles may change over time. Moreover, incomparability caused by mark-downs would have to be taken into account.

The Kaplan-Meier approach as applied in [HLRO11] cannot be adopted directly. In our situation the items are not reordered. So far, we do not see how the method could be adapted to our situation, where we also have to include the possibility of mark-downs during the selling time.

Of all the methods from the literature mentioned above, the most promising approach for us is the ordinal logistic regression model. It can deal explicitly with small integer outcomes. Moreover, we can include price dependencies and dependencies in terms of the popularity of the observed product. By including the current stock as an independent variable, we estimate sales rather than demand and do not have to deal with right-censored observations.
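To indicate how an ordinal logistic model yields probabilities for small integer outcomes, the following sketch implements a generic cumulative-logit (proportional-odds) formulation, $P(Y \le k \mid x) = \sigma(\theta_k - \beta^\top x)$ with increasing thresholds $\theta_k$. This is an assumption made for illustration and not necessarily the model developed later in this chapter; the variables (price, stock, popularity) and all numerical values in the usage note are hypothetical.

```python
import math


def sigmoid(z):
    """Logistic function, written to avoid overflow for very negative z."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def ordinal_probabilities(thresholds, coefficients, x):
    """Return [P(Y = 0), ..., P(Y = K)] for K = len(thresholds).

    thresholds must be increasing; P(Y <= k | x) = sigmoid(thresholds[k] - eta)
    with the linear predictor eta = sum of coefficients[j] * x[j].
    """
    eta = sum(b * v for b, v in zip(coefficients, x))
    cumulative = [sigmoid(theta - eta) for theta in thresholds] + [1.0]
    probabilities = [cumulative[0]]
    for k in range(1, len(cumulative)):
        # Each category probability is a difference of cumulative probabilities.
        probabilities.append(cumulative[k] - cumulative[k - 1])
    return probabilities


def expected_sales(probabilities):
    """Expected number of sales, sum over k of k * P(Y = k)."""
    return sum(k * p for k, p in enumerate(probabilities))
```

For example, `ordinal_probabilities([0.0, 1.5], [-0.3, 0.8], [2.0, 1.0])` returns three probabilities for zero, one, and two sales that sum to one. With a negative price coefficient, raising the hypothetical price from 2.0 to 4.0 increases the probability of zero sales and lowers the expected sales.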
