ICON-INSTITUTE Consulting Gruppe
Von-Groote-Str. 28
50968 Köln
Germany
Supply of Statistical Services for
Methodological Support to Price Statistics
Dedicated paper/study: The use of unit values from scanner data and traditional price collection
Lead expert: Jörgen Dalén
Supported by: David Baran, Daniel Santos
Content
• Conceptual perspective
• Statistical and index theory perspective
• HICP perspective
• What is a homogeneous product?
• Country practices
• Switzerland
• The Netherlands
• Sweden
• Norway
• Preliminary conclusions and recommendations
• Note: The paper is very much “work in progress”
Statistical and index theory perspective
• CPI Manual
– Unit values are the appropriate average prices to enter into an elementary price index
– Unit values should not be calculated for sets of heterogeneous products
– The unit value over the whole month should be used
• Scanner data allow for collecting the universe of prices, instead of just taking a sample
• Scanner data are based on actual transactions rather than on prices as advertised on a shelf or similar
• The service element needs to be kept constant also
– Aggregation over stores in a homogeneous chain is ok! (Ivancic and Fox, 2011)
• Traditional price collection cannot always capture transaction prices
– Does not cover the whole month
– Subjective sampling
– Difficult to harmonise
– Less transparent (not so easy to know what is going on)
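The unit value concept on this slide can be illustrated with a minimal sketch. The figures are hypothetical and not taken from any country's data; the function name `unit_value` is an assumption made for illustration. It shows how a unit value over the whole month (total turnover divided by total quantity) differs from a simple average of weekly shelf prices when a promotion week attracts most of the sales.

```python
def unit_value(transactions):
    """Quantity-weighted average price: total turnover / total quantity."""
    turnover = sum(price * qty for price, qty in transactions)
    quantity = sum(qty for _, qty in transactions)
    if quantity == 0:
        raise ValueError("no sales in the period")
    return turnover / quantity

# Hypothetical month of one product: (price, quantity sold) per week.
# Week 3 is a promotion week with a lower price and much higher sales.
weeks = [(2.00, 100), (2.00, 100), (1.50, 400), (2.00, 100)]

uv = unit_value(weeks)                                # 1200 / 700
simple_mean = sum(p for p, _ in weeks) / len(weeks)   # ignores quantities

print(f"unit value over the month: {uv:.4f}")
print(f"simple mean of weekly prices: {simple_mean:.4f}")
```

In this example the unit value lies below the simple mean because the promotion price carries most of the quantity, which a price collector visiting in another week would not observe.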
HICP perspective
• HICP regulations were developed to unify existing practices and to distinguish good practices from bad ones
• “prices of goods and services available for purchase”
• zero sales give zero weight – ok with scanner data
• Purchase prices to be used
• This is always so with scanner data – not always with traditional data
• Over the whole month – especially where sharp price changes occur
• Scanner data fix this directly – not traditional data
• Problems with scanner data
• GTIN codes are not one-to-one with identical products
• Inferior goods with same GTIN code?
• Service element needs to be kept constant
What is a homogeneous product?
• Economic theory: All product offers within its specification must be equivalent to the consumer
• In practice: What is “sufficiently homogeneous”?
• Two aspects: Physical product and service level:
• Unit value within outlet only
• Unit value over all outlets in a geographical area
• Unit value over all outlets belonging to the same chain (Dutch method)
• Unit value over all outlets belonging to the same chain but in a limited region
• Time dimension
• a period of up to a month is reasonable for a unit value
• Target period – midweek, 3 first weeks, 3 midweeks, whole month?
• Trade-off between unit value bias and problems with missing data?
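The aggregation options listed on this slide (outlet, geographical area, chain) differ only in the grouping key over which turnover and quantity are summed. The following is a rough sketch under assumed field names (`gtin`, `outlet`, `chain`, `region`, `turnover`, `quantity`), with made-up records; none of it reflects a specific country's data layout.

```python
from collections import defaultdict

def unit_values(records, key):
    """Unit value (turnover / quantity) per group; `key` chooses the grouping."""
    turnover = defaultdict(float)
    quantity = defaultdict(float)
    for r in records:
        k = key(r)
        turnover[k] += r["turnover"]
        quantity[k] += r["quantity"]
    # Groups with zero sales have no defined unit value and are skipped.
    return {k: turnover[k] / quantity[k] for k in turnover if quantity[k] > 0}

# Hypothetical scanner records for one GTIN in one period.
records = [
    {"gtin": "A", "outlet": "o1", "chain": "X", "region": "north",
     "turnover": 200.0, "quantity": 100},
    {"gtin": "A", "outlet": "o2", "chain": "X", "region": "south",
     "turnover": 300.0, "quantity": 200},
    {"gtin": "A", "outlet": "o3", "chain": "Y", "region": "north",
     "turnover": 250.0, "quantity": 100},
]

by_outlet = unit_values(records, key=lambda r: (r["gtin"], r["outlet"]))
by_region = unit_values(records, key=lambda r: (r["gtin"], r["region"]))
by_chain = unit_values(records, key=lambda r: (r["gtin"], r["chain"]))

print(by_outlet)
print(by_region)
print(by_chain)
```

Passing a key such as `(r["gtin"], r["chain"], r["region"])` would give the "same chain but in a limited region" variant. The wider the group, the fewer missing observations, but the stronger the homogeneity assumption about service level.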
Swiss method
• Scanner data received from two dominating retail chains
• Unit value throughout Switzerland is used (not stated in the paper: over both chains simultaneously?)
• First fourteen days of the month
• Else uses traditional survey and index calculation methods
• I suppose above the level of the single product?
Dutch method
• Chain-specific product grouping below COICOP 5
• No need to classify items into categories - previously very labour intensive
• Unit value over chain for product for the first 3 full weeks
• Unweighted geometric mean + cutoff of smallest items
• Monthly chaining at the lowest level
• Truncation - price changes by a factor above 4 or below 0.25 are excluded
• Dumping filter: items with a large price decrease combined with a large quantity decrease are excluded
• Imputation for missing prices
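Some of the steps on this slide can be sketched as follows: an unweighted geometric mean of month-on-month price ratios, truncation of ratios above 4 or below 0.25, and monthly chaining. This is a simplified illustration, not the actual Dutch implementation. The function names and data are hypothetical, the cutoff of the smallest items and the dumping filter are left out because the slide gives no thresholds for them, and items with a missing price are simply dropped here rather than imputed.

```python
import math

def monthly_link(prev, curr, upper=4.0, lower=0.25):
    """Unweighted geometric mean of price ratios for items priced in both months.

    Ratios above `upper` or below `lower` are excluded (truncation).
    Simplification: items missing in either month are dropped, not imputed.
    """
    ratios = []
    for item, p0 in prev.items():
        p1 = curr.get(item)
        if p1 is None or p0 <= 0:
            continue
        ratio = p1 / p0
        if lower <= ratio <= upper:
            ratios.append(ratio)
    if not ratios:
        raise ValueError("no matched items between the two months")
    return math.exp(sum(math.log(r) for r in ratios) / len(ratios))

def chained_index(months, base=100.0):
    """Multiply successive monthly links, starting from `base`."""
    index = [base]
    for prev, curr in zip(months, months[1:]):
        index.append(index[-1] * monthly_link(prev, curr))
    return index

# Hypothetical chain-level unit values per item and month.
months = [
    {"A": 2.00, "B": 1.00, "C": 4.00},
    {"A": 2.20, "B": 1.00, "C": 0.50},  # C falls by a factor of 8: truncated
    {"A": 2.20, "B": 1.10, "D": 3.00},  # C has left; D is new, enters next month
]

index = chained_index(months)
print([round(v, 2) for v in index])
```

Because the matched set is re-formed every month, new items such as D enter the index one month after they first appear, without any manual replacement.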
Swedish method
• Three PPS samples of 800 two-year-old GTIN codes.
• Undercoverage of 20% of new products that entered the market after the sampling frame year.
• The index is calculated in one step from the base month of December to the current month, so there is no monthly chaining.
• Unit values are computed for each outlet
• Frequent replacements when a GTIN code is missing.
• When missing in all or almost all outlets, implying that it is leaving the market.
• This involves one day per month of manual interventions with some subjective aspects. However, only one third of lapsed products are replaced.
• Replacement is deemed important when small changes are made by the producer accompanied by a new EAN code and possibly a “hidden” price change.
Norway
• Dutch-like method for COICOP 1 incl. monthly chaining
• Swedish-like method for COICOP 0611 and 0612 (pharmaceutical products)
• But unit value over outlet not chain
• Imputations allow items to re-enter in a 14-month period.
Comparisons between methods
• Attempting to cover the full universe of prices or not
• Swiss method does not sample the full universe whereas Swedish and Dutch methods do so.
• Base sampling or monthly chaining
• Sweden uses a 2-year-old product universe but NL samples the current universe through monthly chaining
• Unit values over chain in NL but over store in Sweden and Norway
• Chain-level unit values have theoretical support if stores in the chain have the same service level
• A number of detailed issues need further analysis
• Need for replacements, when and how
• Cut-off methods
• Truncation
• Dumping filters
• … and more
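The contrast between base sampling and monthly chaining can be made concrete with a small sketch. It assumes, purely for comparability, an unweighted geometric mean for both approaches; the slides do not state the Swedish index formula, and replacements and imputations are ignored. All names and prices are hypothetical.

```python
import math

def jevons(prev, curr, items=None):
    """Geometric mean of price ratios over items priced in both periods,
    optionally restricted to the fixed sample `items`."""
    common = [i for i in prev
              if i in curr and (items is None or i in items)]
    if not common:
        raise ValueError("no matched items")
    return math.exp(
        sum(math.log(curr[i] / prev[i]) for i in common) / len(common)
    )

# Hypothetical unit values: B leaves the market in February, C enters in January.
december = {"A": 10.0, "B": 20.0}
january = {"A": 11.0, "B": 20.0, "C": 5.0}
february = {"A": 11.0, "C": 4.0}

# Direct index: base-month sample only, compared in one step with February.
direct_feb = 100 * jevons(december, february, items=set(december))

# Chained index: the matched set is re-formed each month, so C is picked up.
chained_feb = 100 * jevons(december, january) * jevons(january, february)

print(f"direct:  {direct_feb:.2f}")
print(f"chained: {chained_feb:.2f}")
```

With a fixed set of items the two calculations would coincide for this formula; they diverge here only because the item universe changes, which is the point at which replacements (base sampling) or entry through chaining matter.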
Selected conclusions and recommendations
• Collected scanner data should
• contain turnover and quantity data at GTIN level
• be collected per week
• with outlet identity (geography, at chain level)
• with proper legal assurances
• If scanner data are to be used we should take full advantage of these data.
• The Swiss method does not use scanner data to its full potential
• Dutch or Swedish method?
• Dutch method is preferable because (i) it uses current GTIN codes and (ii) it needs fewer manual interventions, which is an advantage both from a resource perspective and for reducing subjectivity.
• But the Swedish concern about the risk of missing hidden price changes needs to be analysed further
• Norwegian method is a hybrid between Dutch and Swedish method
• Can the Dutch method always avoid downward bias from a code leaving the sample with a low price?
Continued
• Midweek or 2-3 weeks centred around the middle of the month should be used. In the long term why not go for the whole month?
• The logical concept should override timeliness considerations!
• Unit values over whole chains with the same service level
• Large countries may also separate by region?
• Big advantages in minimising replacement and manual intervention
• Details (truncation, data cleaning, dumping filter etc.) need more analysis before final recommendations.
• A focused research program is needed! For example:
• Replicate Australian research on chain level unit values
• How should the appropriate chain level for unit value aggregation be defined?
• Monthly chaining vs replacements. How to avoid downward bias from low outgoing prices or cosmetic product changes?
• Research supporting recommendations on detailed issues such as truncation, dumping filters, data cleaning etc.