How would we calculate an elementary price index if we had all the data we could ever want?

May 15, 2019

1. Introduction

This note is concerned with the question: How would we calculate an elementary price index if we had all the data we could ever want? The question is addressed in the context of the Consumer Price Index.

Why consider a question that is so patently unrealistic? Because it helps us understand and assess the methodologies actually used in the real world where data are costly and scarce. With a good understanding of what we would do to calculate the CPI in a full information environment, we are better positioned to understand why various measurement compromises and approximations are necessary in practice. We thereby enhance our abilities to improve the existing methodologies.

The CPI aims to measure the average rate of change for a very wide range of prices of consumer goods and services. It does so by aggregating a large number of elementary price indexes, using the Lowe index number formula. The present note is not much concerned with the aggregation process for the elementary indexes, although the topic is tackled briefly toward the end of the note. Rather, the focus is primarily on the estimation of the individual elementary price indexes themselves.

There are three basic dimensions to the CPI: products, time and space. The product dimension will be discussed shortly. The time dimension is measured in months and years – the index is a monthly time series with weights drawn from a calendar year. The space dimension consists of a selection of Canadian cities plus all the Canadian provinces. Sub-indexes, with less product detail than at the Canada level, are produced for the regions. The Canada-level indexes are calculated by weighting the regional indexes appropriately. The focus in this note will not be on the regional aggregation process. Rather, it will be on the calculation of a full-information elementary price index for a specific product elementary class in a specific region (city or province).

From the product perspective, the elementary indexes underlying the CPI can be divided into two groups. One might be called the indexes following the “standard model”, while the second group is referred to as the “special cases”. The present note is about the first of these two groups of elementary price indexes. The second group is very heterogeneous, a lot more complex and really must be considered on a case-by-case basis. Notable product classes in this group include financial services, tuition fees and owner-occupied shelter. The “special cases” group will be touched on only briefly here.

There is actually a third group of elementary indexes that might be called “imputed cases”. These are elementary indexes for which no sample price quotes are collected. Instead, the indexes are estimated by assuming they run parallel to one or more of the other elementary indexes that are calculated from sample price quotes. There are quite a few of these imputed cases in the CPI. Perhaps the best examples are the “not elsewhere specified” (NES) elementary classes comprising various, often quite heterogeneous, residual categories. If sample price quotes were collected for these indexes, as they might be one day, most if not all of them could be estimated using the standard model. For this reason, they are considered not as “special cases”, but as potential standard model cases.

2. The standard model

By the term “standard model” we mean the following.

1. An elementary class of products is defined for which we want to calculate an elementary price index, for example “liquid and automatic dishwasher detergent”. The elementary class is heterogeneous in the sense that it contains many distinct products and brands, but it is also relatively homogeneous in the sense that all of the different products within the elementary class are similar and tend to be highly substitutable. Some elementary classes are more heterogeneous than others, having a wider variety of products and perhaps a lesser degree of substitutability among their products than some other elementary classes. Ceteris paribus, the more heterogeneous the elementary class, the more problematic is the calculation of its elementary price index.

2. A small number of product groups within the elementary class are defined as “representative products” (RPs), for example “Bottle of liquid detergent, 300 to 350 ml size”. The RPs are effectively sub-elementary classes that are defined within the overall elementary class. The set of RPs within a given product elementary class need not be exhaustive, although ideally the set would be exhaustive and the selection of RPs for sampling purposes would be probabilistic. In defining the RPs the goal is to span the full variety of products within the elementary class, with greater emphasis on products that sell higher volumes. In other words, the aim is to define them so they are “representative”.

3. A sample of outlets, or sellers, of the RPs is selected. It is preferable the selection be done probabilistically, but typically it is done judgmentally. The selection of representative outlets aims to span the full variety of sellers of products within the elementary class, with greater emphasis on sellers with high volumes. In other words, the aim is to make the selection of outlets “representative”. (The term ‘outlets’ refers not just to bricks-and-mortar stores, but also door-to-door sales, Internet sales, catalogue sales, vending machine sales, etc.)

4. Within each of the selected outlets, one or more product varieties (sometimes referred to as “target product offers” or TPOs) are chosen for each of the RPs. Thus, for example, in some specific outlet the variety “Tide, 320 ml size” might be selected for the RP “Bottle of liquid detergent, 300 to 350 ml size”.

5. Each month, a CPI interviewer visits the selected outlets, finds the chosen product varieties and records the corresponding list prices.

6. Back at headquarters, the recorded list price for each selected variety in each selected outlet is divided by the corresponding list price for the same selected variety in the same outlet the previous month. These ratios are referred to as the price relatives.

7. Within the product elementary class, there will be many observed price relatives from different outlets for different product varieties. At headquarters, all of these price relatives are treated as if they come from the same underlying population. They are combined using the geometric average or Jevons formula. The average price relative resulting from this calculation is then used to calculate the next link in a monthly chain price index, which is the desired elementary index.
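Steps 6 and 7 can be illustrated with a minimal Python sketch. The outlets, varieties and list prices below are hypothetical, and the function name `jevons` is chosen here for illustration only.

```python
import math

def jevons(price_relatives):
    """Unweighted geometric mean of price relatives (Jevons formula)."""
    if not price_relatives:
        raise ValueError("need at least one price relative")
    log_sum = sum(math.log(r) for r in price_relatives)
    return math.exp(log_sum / len(price_relatives))

# Hypothetical matched list prices:
# (variety, outlet) -> (previous month price, current month price)
quotes = {
    ("Tide, 320 ml size", "store A"): (4.00, 4.20),
    ("Tide, 320 ml size", "store B"): (4.10, 4.10),
    ("Cheer, 330 ml size", "store A"): (3.80, 3.61),
}

# Step 6: one price relative per matched variety-outlet pair.
relatives = [current / previous for previous, current in quotes.values()]

# Step 7: average the relatives, then extend the chain index by one link.
monthly_link = jevons(relatives)
previous_index_level = 100.0  # hypothetical index level last month
current_index_level = previous_index_level * monthly_link
```

Every price relative receives the same weight here, regardless of how much of each variety is sold in each outlet.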

This is a brief and highly simplified summary of the “standard model”. The model is followed quite closely for the majority of the elementary indexes entering the CPI calculation. Later in this note, we will consider how the model would be applied if, hypothetically, the price index statisticians had all of the data they could ever want to have. However, before doing that we will consider how, in the real world, price index statisticians cope with a number of practical problems in applying this standard model.

3. Practical problems with the standard model

3.1 Quality change

Perhaps the most important practical problem with the standard model is quality change. Suppose in month ‘t’ the product variety “Tide, 320 ml size” was replaced by its manufacturer with a different product called “New and improved Tide, 300 ml size”. When the CPI interviewer returns to the store and looks for the price of “Tide, 320 ml size” she cannot find the product. She does, however, see a similar product called “New and improved Tide, 300 ml size” so she records its price instead, marked in her submission to headquarters as a substitute.

In Ottawa, the price index statistician has essentially two options. First, she could regard “New and improved Tide” as a new product, introduce it into the sample in month ‘t’ and wait until month ‘t+1’ when two months’ worth of price data will have accumulated and price relatives can be calculated. Or second, she could attempt to estimate the amount of quality change between the old and new types of Tide, adjust the month ‘t‑1’ price of old Tide for the quality change so the adjusted price is comparable to the month ‘t’ price of new Tide, and calculate the price relative accordingly. If the first approach is used, the potential price decrease, or increase, accompanying the introduction of “New and improved Tide” will never be captured and the sample size will be temporarily reduced. If the second approach is adopted, the statistician must face the typically rather difficult question of how to make the quality adjustment. This choice – discontinuity and temporary sample reduction with no attempt at quality adjustment versus continuity accompanied by the challenge of quality adjustment – is a potential dilemma. The second option is unambiguously superior to the first if a reasonably good quality adjustment is ‘doable’. The first option, however, is much simpler operationally.
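The second option can be sketched in Python as follows. The prices and the quality factor of 1.10 are hypothetical; in practice, estimating that factor is the difficult part.

```python
def quality_adjusted_relative(old_price_prev, new_price_curr, quality_factor):
    """Price relative after adjusting the old variety's price for quality change.

    quality_factor is the assumed value of the new variety relative to the
    old one (1.10 means the new variety is judged to be worth 10% more).
    """
    adjusted_old_price = old_price_prev * quality_factor
    return new_price_curr / adjusted_old_price

old_tide_price = 4.00  # hypothetical month t-1 price of the old variety
new_tide_price = 4.40  # hypothetical month t price of the new variety

# Without adjustment, the whole 10% difference is counted as price change.
unadjusted_relative = new_tide_price / old_tide_price

# With an assumed 10% quality improvement, no pure price change remains.
adjusted_relative = quality_adjusted_relative(old_tide_price, new_tide_price, 1.10)
```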

The enduring problem with the first option is that it runs and hides from what may be a key turning point in the price index. The introduction by a manufacturer of a different version of a product provides an ideal opportunity to adjust the corresponding price. To the extent the product is improved, that fact might provide justification, if any is needed, for a price hike. To the extent the product is now cheaper to manufacture, that might provide reason to reduce the price. If such occasions where there are opportunities for price change are always avoided by the price index statistician, the CPI may not reflect reality very well.

3.2 Quantity change

Another situation that often arises in practice is where a product variety does not change in quality, but does change in quantity. For example, the CPI interviewer may have been reporting the price for “Tide, 320 ml size” over a period of several months. Then one day that variety is no longer available, but there is a larger bottle of the otherwise identical product available, “Tide, 650 ml size”. What should be done in this situation?

One simple approach would be to substitute “Tide, 650 ml size” for the old variety, “Tide, 320 ml size”, and make a direct adjustment for quantity in the transition month. The last observed price of the 320 ml bottle is divided by 320 to calculate a price per ml. Similarly, the price of the 650 ml bottle is divided by 650 to calculate another price per ml. These two “prices per ml” are then compared and a price relative is calculated.
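As a small worked example of this direct quantity adjustment, using hypothetical shelf prices:

```python
def price_per_ml(price, size_ml):
    """Normalize a shelf price to a price per millilitre."""
    return price / size_ml

old_per_ml = price_per_ml(4.00, 320)  # hypothetical price of the 320 ml bottle
new_per_ml = price_per_ml(7.80, 650)  # hypothetical price of the 650 ml bottle

# The shelf price nearly doubles, but the per-ml price falls by about 4%.
price_relative = new_per_ml / old_per_ml
```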

But is this a fair comparison? It may not always be so, since the consumer is really buying a bundle consisting of some goods plus some associated retail services. A 650 ml bottle of Tide contains more than twice as much detergent as a 320 ml bottle, but the associated retail services the purchaser is receiving are unlikely to double as well. Indeed the services may be more or less fixed regardless of the quantity of goods purchased. This is why it is normal practice for outlets to charge lower prices per unit when consumers buy in bulk quantities. So it may not be fair to normalize the price for the different physical quantities being sold. The quantity adjustment problem can be more difficult than it first appears.

3.3 Product disappearance

Another problem is that of product disappearance. The CPI interviewer, who has been collecting prices every month for a particular product variety at a particular outlet, might discover the variety is no longer available at the outlet in month ‘t’. What should the interviewer do?

If the product variety is believed to be temporarily out of stock, the best approach might be simply to wait until next month when hopefully it will be back in stock. In the meantime, the unobserved price relative for that product variety in that outlet could be imputed, perhaps as the geometric mean of the other price relatives for that same product variety in the same neighbourhood. Alternatively it could be temporarily dropped from the sample with a consequent temporary reduction in sample size.
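The imputation approach can be sketched as follows. The observed price relatives are hypothetical, and the helper name is illustrative only.

```python
import math

def impute_missing_relative(observed_relatives):
    """Impute a missing price relative as the geometric mean of observed ones."""
    if not observed_relatives:
        raise ValueError("no observed relatives to impute from")
    log_sum = sum(math.log(r) for r in observed_relatives)
    return math.exp(log_sum / len(observed_relatives))

# Hypothetical relatives for the same variety in nearby outlets this month.
neighbourhood_relatives = [1.02, 0.98, 1.00]
imputed_relative = impute_missing_relative(neighbourhood_relatives)
```

Because the imputed value is itself a geometric mean of the observed relatives, including it in a Jevons average of those same relatives leaves the average unchanged.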

If, however, the product variety is believed to have been permanently discontinued in that outlet, perhaps because the outlet’s manager does not wish to carry the item any longer, a different course of action is necessary. If the product variety is simply dropped from the sample, the sample size will be permanently lower. Since this is not a sustainable strategy (eventually there would be no sample remaining), the normal course of action is to substitute a new product variety for the old one that has been discontinued.

Thus, suppose the product variety “Tide, 320 ml size” has been permanently discontinued in a particular outlet. The CPI interviewer chooses another variety, say “Cheer, 330 ml size”, as a substitute. In month ‘t’ we have price data for Cheer but none for Tide, whereas in month ‘t-1’ we have price data for Tide but none for Cheer (since we only brought Cheer into the sample in month ‘t’). Our choices now are similar to those just discussed under the heading of quality change. We could wait until month ‘t+1’ when we will have two months of data for Cheer, suffering a temporary reduction of sample size in month ‘t’. Or alternatively we could analyze the quality and quantity difference between the Tide and Cheer varieties, adjusting the Tide price in ‘t-1’ so it becomes comparable with the Cheer price in ‘t’.

3.4 Outlet disappearance

The outlet disappearance case is in some ways similar to the product variety disappearance situation. After collecting price quotes for product varieties in a particular outlet for a number of months, the outlet may go out of business. The interviewer must then select another outlet in which to collect prices for those product varieties (otherwise the sample size will shrink permanently). When prices are collected in the new outlet in month ‘t’ can they be properly compared with the prices collected in the now-closed outlet in month ‘t-1’?

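The problem here is that, as noted earlier, any retail purchase is essentially a bundle of goods and retail services. In addition to the physical goods being purchased, the consumer is also buying a retail service package consisting of smiling (or not) salespeople, pleasing (or not) store ambience, delivery service (possibly), guarantees (possibly), store location, etc. Because this service package typically differs from one outlet to another, part of any price difference between the old and new outlets may reflect a difference in the quality of the retail services rather than pure price change. The statistician therefore faces the same dilemma as in the quality change case: either wait a month until matched price relatives can be calculated within the new outlet, or attempt to adjust for the difference in retail services.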

3.5 Other practical problems

The four potential problems just described are very common within any CPI price sample collection program. And there are a good number of other such problems that must be dealt with. Entire “representative product” groups may essentially disappear. New products may appear and achieve rapid, marked success in the marketplace, while residing in a “not elsewhere specified” elementary class where no price quotes are being collected. Sales mechanisms evolve over time and what was formerly a representative outlet may no longer be so (for example, online sales may displace bricks-and-mortar sales). The point here is just that the standard model described earlier is a lot more difficult to apply in practice than may appear at first glance.

4. Calculating elementary price indexes with full information

So we turn now to the central question: How would we calculate an elementary price index, for one of the cases presently dealt with using the standard model, if we had no practical problems whatsoever and had access to any information needed?

As in the previous discussion, we would begin by selecting a particular elementary class, such as “liquid and automatic dishwasher detergent”. Our goal would be to calculate the best possible monthly elementary price index for this elementary class, making use of all the information we could ever want to obtain.

Because we have no information constraints, we would not need to define representative product groups within the elementary class. Instead, we could use price data for each and every product variety within the elementary class. The concept of “representative products” would no longer be useful or necessary.

Similarly, there would be no point in selecting representative outlets. Since we would have information about all transactions in all outlets, we could use all outlets for purposes of calculating the elementary index.

In the standard model, the CPI interviewers collect price quotes just once or twice a month. However, if we had no data constraints we could collect information about the prices involved in all transactions, 24 hours a day for all days of the month.

In the standard model, it is assumed we know nothing about the quantities sold in each transaction. We do not even know how many transactions occur for the product-variety-outlet pairs. So we use the unweighted Jevons formula to average the observed sample of price relatives calculated mostly from matched price quotes (same product variety, same outlet, but different months). However, if we had all the information we could possibly want, we would know the quantities purchased in each and every transaction as well as the prices associated with each transaction. Thus, we could calculate the average price over the universe of all transactions during each month, and the average could be weighted using the quantities purchased/sold in each transaction.

Notice there is a change here from the use of paired list prices, in the practical everyday application of the standard model, to the use of unpaired transaction prices in the idealized application of the model with full information. Transaction prices are much preferred to list (or “offer”) prices, though they are not so easily observed in the real world. The pairing of list prices in the two months is a device used in the real world to ensure “pure price change” is being measured, but in the idealized full-information application the pairing of prices between the two months is no longer necessary.

Finally, while our focus is on the need to calculate a single elementary price index for a particular elementary class – “liquid and automatic dishwasher detergent” in our example – we can begin by breaking this elementary class down into a new set of sub-elementary classes representing all of the various product-variety-outlet pairs. We could then focus our attention on calculating elementary price indexes for each of these sub-elementary classes and when that was done, we could aggregate them, using the Lowe formula, to calculate the price index for the elementary class “liquid and automatic dishwasher detergent”. In other words, in a world with full information available we could define our elementary indexes at the lowest possible level in the classification hierarchy. Instead of defining about 600 elementary indexes that together span the full range of goods and services purchased by consumers (as is presently done for the CPI) we would define billions of lower-level elementary indexes, each one representing price movements for a specific product-variety-outlet pair. By doing so, the problems caused by heterogeneity within the elementary class would be avoided, thereby simplifying matters greatly.

4.1 Simple case: Without new products/outlets and quality/quantity change

To proceed, we must eventually deal with the troublesome real-world problems associated with product and outlet appearance and disappearance, and quality and quantity change. However, we will begin with the simpler case where these problems are assumed away.

So suppose for a moment: (i) there were no “liquid and automatic dishwasher detergent” product varieties that were available in month ‘0’ but were gone from the market in month ‘1’, and there were no new product varieties available in month ‘1’ that were unavailable in month ‘0’ and (ii) there were no outlets selling “liquid and automatic dishwasher detergent” product varieties in month ‘0’ that stopped selling those products in month ‘1’, and there were no new outlets selling “liquid and automatic dishwasher detergent” product varieties in month ‘1’ that were not selling those products in month ‘0’. Note that if an outlet existing in month ‘0’ changed its conditions of sale – for example, by improving its after-sales service guarantees – then the old version of itself could be deemed to be an outlet that had disappeared and the outlet with better service guarantees could be viewed as a new outlet. In other words, we are assuming, for the moment, that no outlet modified, in either of the two months, the bundle of retail services it provides when it sells goods and services.

In this simple case, there might be many varieties of the product “liquid and automatic dishwasher detergent”. Each of these product varieties would be paired with each of the various outlets selling them. The task would now be redefined as one of calculating distinct elementary price indexes for each of these product-variety-outlet pairs.

Let us consider one of these product-variety-outlet pairs, for example “Tide, 320 ml size” sold in store A. The store might have charged one price for this product variety in month ‘0’ and another different price in month ‘1’. Or it might have varied its price through the course of both months, charging different prices on different days. We would have full information about all transactions involving this product-variety-outlet pair during both of months ‘0’ and ‘1’. So we could draw two histograms showing the quantity-weighted frequency of occurrence of each of the different prices that were charged for this product-variety in transactions in that store during each month. We could then calculate a measure of central tendency for each of the two histograms, for example the arithmetic mean, the geometric mean or the median. Finally, we could divide the measure of central tendency for the prices charged in month ‘1’ by the measure of central tendency for the prices charged in month ‘0’ and use this ratio to construct a monthly chain price index for the product-variety-outlet pair. That would be the elementary price index for “Tide, 320 ml size” sold in store A.
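The calculation just described can be sketched in Python. The transactions are hypothetical (price, quantity) records for one product-variety-outlet pair, and the weighted median uses one of several possible conventions (the lowest price at which cumulative quantity reaches half of the total).

```python
import math

def weighted_arithmetic_mean(transactions):
    total_qty = sum(q for _, q in transactions)
    return sum(p * q for p, q in transactions) / total_qty

def weighted_geometric_mean(transactions):
    total_qty = sum(q for _, q in transactions)
    return math.exp(sum(q * math.log(p) for p, q in transactions) / total_qty)

def weighted_median(transactions):
    """Lowest price at which cumulative quantity reaches half of the total."""
    total_qty = sum(q for _, q in transactions)
    cumulative = 0
    for price, qty in sorted(transactions):
        cumulative += qty
        if cumulative >= total_qty / 2:
            return price

# Hypothetical (price, quantity) transactions for "Tide, 320 ml size" in store A.
month0 = [(4.00, 60), (3.50, 40)]
month1 = [(4.20, 70), (3.80, 30)]

# One monthly link per measure of central tendency.
links = {
    measure.__name__: measure(month1) / measure(month0)
    for measure in (weighted_arithmetic_mean, weighted_geometric_mean, weighted_median)
}
```

With these numbers the three measures give somewhat different monthly links, which is why the choice among them needs to be discussed.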

If we used the weighted arithmetic mean as our measure of central tendency we would, in effect, be using a unit value to measure price change. The unit value would be the total value of sales of the product variety during the month divided by the number of units sold during the month. The ratio of the month ‘1’ unit value to the month ‘0’ unit value would be used to construct the monthly chained elementary index.
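In symbols, and with notation introduced here only for illustration (p_i^t and q_i^t are the price and quantity of transaction i in month t), the unit value and the resulting monthly link can be written as:

```latex
UV^{t} \;=\; \frac{\sum_{i} p_{i}^{t}\, q_{i}^{t}}{\sum_{i} q_{i}^{t}}
       \;=\; \sum_{i} w_{i}^{t}\, p_{i}^{t},
\qquad
w_{i}^{t} \;=\; \frac{q_{i}^{t}}{\sum_{j} q_{j}^{t}},
\qquad
\text{link}_{0 \to 1} \;=\; \frac{UV^{1}}{UV^{0}} .
```

The middle expression shows that the unit value is exactly the quantity-weighted arithmetic mean of the transaction prices.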

Unit values are difficult to interpret properly when the items being averaged are heterogeneous, but in this case they are purely homogeneous so the unit value is a meaningful and clean measure of average price. Alternatively one could use a weighted geometric mean to calculate the monthly chained elementary index. Because of the way the full-information problem is set up, both measures – the weighted arithmetic mean and the weighted geometric mean – pass the commensurability test and the time reversal test. There seems to be no compelling reason to choose one over the other.
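Both properties can be checked numerically. The sketch below uses hypothetical transactions and treats the commensurability test as invariance to a change in the unit of measurement, for example from bottles to ml, which divides every price by a constant and multiplies every quantity by the same constant.

```python
import math

def weighted_arithmetic_mean(transactions):
    total_qty = sum(q for _, q in transactions)
    return sum(p * q for p, q in transactions) / total_qty

def weighted_geometric_mean(transactions):
    total_qty = sum(q for _, q in transactions)
    return math.exp(sum(q * math.log(p) for p, q in transactions) / total_qty)

def link(measure, base, current):
    """Monthly link: average price in 'current' over average price in 'base'."""
    return measure(current) / measure(base)

def change_units(transactions, units_per_item):
    """Re-express (price, quantity) per smaller unit, e.g. per ml instead of per bottle."""
    return [(p / units_per_item, q * units_per_item) for p, q in transactions]

# Hypothetical (price, quantity) transactions for one product-variety-outlet pair.
month0 = [(4.00, 60), (3.50, 40)]
month1 = [(4.20, 70), (3.80, 30)]

measures = (weighted_arithmetic_mean, weighted_geometric_mean)

# Time reversal: the forward link times the backward link should equal 1.
time_reversal = [link(m, month0, month1) * link(m, month1, month0) for m in measures]

# Commensurability: the link should not change when units change (320 ml per bottle).
per_bottle = [link(m, month0, month1) for m in measures]
per_ml = [link(m, change_units(month0, 320), change_units(month1, 320)) for m in measures]
```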

4.2 More complex case: With new products/outlets and quality/quantity change

We now turn to the more complicated, but also more realistic case where new product varieties can appear and old ones can disappear, existing product varieties can change either in terms of their inherent quality or in terms of the physical quantity offered in the package, new outlets can appear and old ones can disappear, and existing outlets can change in terms of their physical structure and service characteristics. Quantity change can be considered a particular form of quality change and the two need not be considered separately.

We are focused on the market for a particular product-variety-outlet pair. If the product variety, or the outlet, or both change in either month, we need an adjustment factor to make the prices associated with the new product-variety-outlet pair comparable to those associated with the old one. Because we are assuming full information, these adjustment factors are taken to be readily available.

Given these adjustment factors, it is straightforward to deal with the complications. We simply apply the factors and move forward as we did in the simple case.
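One way this could look in practice is a quality-adjusted unit value, sketched below. This is an assumption about how the factors might be applied, not a method prescribed in this note: each transaction's quantity is converted into units of the reference pair, so the adjusted price is the original price divided by the factor. The prices and quantities are hypothetical, and the factor 650/320 is a purely proportional quantity adjustment used for illustration, whereas a full-information factor would also reflect the retail service component.

```python
def quality_adjusted_unit_value(transactions):
    """Unit value in units of the reference product-variety-outlet pair.

    transactions: iterable of (price, quantity, adjustment_factor), where
    adjustment_factor converts one unit sold into reference units
    (1.0 for the reference pair itself).
    """
    total_value = sum(p * q for p, q, _ in transactions)
    total_adjusted_qty = sum(q * f for _, q, f in transactions)
    return total_value / total_adjusted_qty

# Month 0: 320 ml bottles (the reference variety), hypothetical sales.
month0 = [(4.00, 100, 1.0)]
# Month 1: the variety is now sold only as 650 ml bottles.
month1 = [(7.80, 50, 650 / 320)]

monthly_link = quality_adjusted_unit_value(month1) / quality_adjusted_unit_value(month0)
```

Once the factors are applied, the calculation is the same ratio of unit values as in the simple case.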

5. Choosing an aggregation formula

As indicated, in a world with full information we could work with far more elementary indexes than we do in the real world. There could be billions of them, each focused on a unique, homogeneous, quality-adjusted, product-variety-outlet pair. How would we aggregate these elementary indexes to create price indexes for higher-level classes, major product groups and an all-items CPI? Assuming full information, could we and should we calculate a superlative index, such as the Fisher or the Walsh or the Törnqvist price index?

We could, yes, but arguably we should not, unless we are willing to tolerate chain drift in the monthly index. It would certainly be possible to calculate a month-to-month chained superlative index, but there would surely be significant ‘bouncing’ and chain drift because of seasonality. The drift could be a very serious problem.
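Chain drift can be illustrated with a small numerical sketch. The data are hypothetical: two items over four months, with one item going on sale in the second month, shoppers stocking up, and then buying less in the third month. Prices and quantities return to their starting values in the last month, so a drift-free index should return to 1.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def fisher(p0, q0, p1, q1):
    """Fisher index: geometric mean of the Laspeyres and Paasche indexes."""
    laspeyres = dot(p1, q0) / dot(p0, q0)
    paasche = dot(p1, q1) / dot(p0, q1)
    return math.sqrt(laspeyres * paasche)

# Hypothetical monthly prices and quantities for (item 1, item 2).
prices = [(1.0, 1.0), (0.5, 1.0), (1.0, 1.0), (1.0, 1.0)]
quantities = [(10, 10), (30, 10), (5, 10), (10, 10)]

# Month-to-month chained Fisher index over the four months.
chained = 1.0
for t in range(1, len(prices)):
    chained *= fisher(prices[t - 1], quantities[t - 1], prices[t], quantities[t])

# Direct comparison of the last month with the first month.
direct = fisher(prices[0], quantities[0], prices[-1], quantities[-1])
```

In this example the direct index equals 1, while the chained index ends at about 0.949, a downward drift of roughly 5% even though nothing has changed between the first and last months.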

If we decided to use annual quantity weights with annual chaining, then with the assumption of full information we would know the annual weights from year ‘T’ when we were calculating the indexes for January of year ‘T+1’. We could calculate the indexes for all twelve months of year ‘T+1’ using weights from year ‘T’ and then update to weights from year ‘T+1’ when we calculate the index for January of year ‘T+2’. So in the full-information world, we would continue to use Lowe indexes but they could have much more timely basket updates.
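A Lowe calculation of this kind can be sketched as follows. The two product-variety-outlet pairs, their prices and their year ‘T’ quantities are hypothetical, and the choice of price reference period is an assumption made for the example.

```python
def lowe_index(ref_prices, cur_prices, basket_quantities):
    """Cost of a fixed basket at current prices over its cost at reference prices."""
    numerator = sum(cur_prices[k] * basket_quantities[k] for k in basket_quantities)
    denominator = sum(ref_prices[k] * basket_quantities[k] for k in basket_quantities)
    return numerator / denominator

def lowe_from_elementary(elementary_indexes, ref_prices, basket_quantities):
    """The same index as a weighted average of elementary indexes.

    The weights are hybrid expenditure shares: reference-period prices
    times basket-period quantities.
    """
    total = sum(ref_prices[k] * basket_quantities[k] for k in basket_quantities)
    return sum(
        ref_prices[k] * basket_quantities[k] / total * elementary_indexes[k]
        for k in basket_quantities
    )

# Hypothetical annual quantities from year T for two variety-outlet pairs.
basket_year_T = {"pair A": 100, "pair B": 300}
# Hypothetical prices in the price reference period and in a month of year T+1.
ref_prices = {"pair A": 4.00, "pair B": 2.00}
cur_prices = {"pair A": 4.40, "pair B": 1.90}

direct = lowe_index(ref_prices, cur_prices, basket_year_T)

elementary = {k: cur_prices[k] / ref_prices[k] for k in ref_prices}
aggregated = lowe_from_elementary(elementary, ref_prices, basket_year_T)
```

The two functions give the same result, which is what allows elementary indexes to be aggregated without returning to the underlying prices.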

If we were prepared to tolerate CPI revisions, then we might want to recalculate the twelve monthly indexes for year ‘T’ with a Lowe formula using annual expenditure weights from year ‘T’. Presuming that “full information” does not include foretelling the future, this could only be done after all the data were in for year ‘T’, presumably in January of year ‘T+1’.

Of course, we could also choose to calculate a number of different indexes with different aggregation formulas (Fisher, Walsh, Törnqvist, Laspeyres, Paasche, Lowe), different weighting approaches and different chaining frequencies for comparison purposes.

6. Conclusions

This note has looked at the question: How would we estimate an elementary price index if we had all the data we could ever want? It has concluded, essentially, that with full information we would define the elementary indexes at a much lower level of detail – indeed, at the lowest possible level of detail, the individual product-variety-outlet pair – and then use unit values to measure prices in each month.

We have also looked at the related issue of quality adjustment and have found that, in the full-information world, the ready availability of quality adjustment factors solves many problems.

In fact, the quality change issue is even more complicated than it seems. One might well ask the following additional question: If we can, using a quality adjustment factor, make one product-variety-outlet pair comparable to another similar one, when the product variety has been modified by its manufacturer or the outlet characteristics have been changed by the retailer, then could we not play the same trick at a higher level in the classification system? Could we, in a full-information world, make cars comparable to bananas or haircuts comparable to insurance by means of appropriate quality adjustment factors?

Most would agree this takes the concept of quality adjustment well beyond its everyday meaning in the practical world of price index calculation. It seems like a reductio ad absurdum. So we are left asking ourselves just how far we can push the idea of quality adjustment. With all the data actually available to us today, and with sophisticated quality adjustment methods now being put into practice such as hedonic modelling, the scope for estimating dependable quality adjustment factors has never been greater. And as was argued earlier, quality adjustment can be extremely important for the accurate measurement of price change. But how far dare we go?

All of which serves to remind us what index numbers are fundamentally for: aggregating incommensurable quantities. The CPI covers a mix of products that are not commensurable. The index number formula underlying the CPI allows us to aggregate the prices for diverse commodities – apples, shelter, airplane rides, women’s blouses, etc. – that could otherwise never be aggregated. If we could somehow come up with “quality adjustment factors” making apples commensurable with shelter and all the other products consumers buy, then we would no longer need index number formulas at all.