Source: The Importance of the Long Form Census to Canada, David A. Green and Kevin Milligan, Canadian Public Policy – Analyse de politiques, vol. xxxvi, no. 3 2010
This Thursday, Statistics Canada will be updating the Consumer Price Index (CPI) basket weights with the latest results from its 2013 Survey of Household Spending (SHS). While this CPI basket update will be the second undertaken since a major SHS redesign in 2010, little information about the household spending survey has been made publicly available since then. Statscan stopped producing public use microdata as well as data quality reports for the SHS immediately following the redesign. The official reason: “There will be no public use microdata file (PUMF) for SHS 2010 due to resource constraints.”
Statscan’s sudden lack of transparency following years of declining data quality and a significant overhaul of its key household spending survey is cause for concern.
That Statscan regularly cites ‘resource constraints’ even as it contends they haven’t affected the agency is one thing. That it regularly does so to justify withholding data from the public is worrisome. Other national statistical agencies likewise facing resource constraints, including those in the US and the UK, seem loath to target public data dissemination. Not only does doing so inconvenience data users, it undermines public confidence.
Since the release deals with the specific use of the SHS to weight the CPI basket, the main question is how well the survey results represent the expenditure patterns of different households across Canada. The two common sources of potential problems are sampling error (whether the selected sample households accurately represent the target population) and non-sampling error (such as under-coverage and non-response).
The 2013 SHS was based on the Labour Force Survey (LFS) sampling frame in place at the time, which was based on the 2001 census. In addition to accounting for geography (its ‘primary sampling unit’), the LFS sampling frame includes ‘special strata’ to better account for aboriginal, immigrant and high-income households (and yes, that data is, or rather, was, derived from the long-form census – more on that in an upcoming post).
Assuming the LFS sampling method provides a sufficiently representative cross-section of non-institutionalised Canadian households (it doesn’t, but more on that another time), that leaves potential non-sampling errors as a possible source of bias in the SHS.
For a selected household to be part of the survey, it has to be residing at the address selected in the sample at the time the survey is conducted, with at least one household member able to respond on its behalf available to do so (coverage). That member would also have to agree to take the survey (response) as the SHS is voluntary. Statscan also has to find the survey response ‘acceptable’.
The (2010 redesigned) SHS is conducted in two parts: The interview, intended for all surveyed households, collects information on broader expenditure as well as income components. To make it more like the US Consumer Expenditure Survey (CEX), in 2010 Statscan introduced a diary, distributed to half of surveyed households, which it uses to collect more detailed expenditures (and to check against the interview data for verification).
The response rates for the 2013 SHS interview and diary components were 67 percent and 46 percent, respectively. For comparison, the response rates for the corresponding 2013 CEX components were 67 percent and 61 percent.
Figure 1 illustrates the decline in SHS response rates between 1997 and 2008. The rate for 2009 was 67 percent. The ‘all-time low’ for the much more detailed Survey of Family Expenditure (FAMEX) that preceded the SHS was 74 percent. Despite the 2010 redesign’s focus on reducing respondent burden, which entailed “a number of components regarding household equipment and dwelling characteristics and most of the questions regarding changes in household assets and liabilities (being) dropped”, the interview response rate remained near its historic low. Response rates for the SHS diary component have remained well below 50 percent since it was introduced in 2010 (just 43 percent in both 2011 and 2012).
Overall response rates are less informative than response rates accounting for different household characteristics, since different households living in different parts of the country have different consumption patterns, which can have a significant impact on CPI basket item weights.
Statscan provides some information on geographic disparities. For example, only five hundred usable diaries were received from Ontario, the most populous and diverse province in the country, with nearly 5 million households.
Widening wealth and income inequality in Canada has seen a disproportionately greater share of income, along with an even more disproportionate share of discretionary income, concentrated among fewer households. As such, not having a representative share of high-income households in the SHS could significantly impact the survey results, and ultimately the CPI basket weights.
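To put the Ontario figure in perspective, here is a back-of-the-envelope calculation. It assumes exactly 500 usable diaries and rounds the household count up to 5 million, so both inputs are approximations of the figures quoted above rather than official counts.

```python
# Rough sampling fraction for usable SHS diaries in Ontario.
# Both inputs are rounded approximations of the figures quoted in the text.
usable_diaries = 500
ontario_households = 5_000_000  # "nearly 5 million", rounded up

households_per_diary = ontario_households // usable_diaries
sampling_fraction = usable_diaries / ontario_households

print(f"One usable diary per {households_per_diary:,} households")
print(f"Sampling fraction: {sampling_fraction:.4%}")
```

On these rounded numbers, each usable diary stands in for roughly 10,000 Ontario households.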
Statscan makes a point of emphasising the priority given to sampling high-income households: “The high-income household strata are allocated a larger share of the sample than the other strata, where an allocation proportional to stratum size is used.” However, that’s the first and last mention of high-income households in the User Guide for the Survey of Household Spending, 2013. It’s worth noting the LFS sampling frame used for the 2013 SHS defined ‘high-income’ as total household income over $125,000 based on the 2001 Census.
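The difference between proportional allocation and the over-sampling Statscan describes can be sketched as follows. The stratum sizes, the sample size and the over-sampled allocation are all hypothetical, chosen only for illustration; they are not Statscan's actual figures, and the function names are mine.

```python
# Hypothetical strata: household counts are invented for illustration only.
population = {"high_income": 100_000, "other": 900_000}
sample_size = 1_000

def proportional_allocation(population, sample_size):
    """Give each stratum a sample share equal to its population share."""
    total = sum(population.values())
    return {s: sample_size * n // total for s, n in population.items()}

def design_weights(population, allocation):
    """Each sampled household represents N_h / n_h households in its stratum."""
    return {s: population[s] / allocation[s] for s in population}

proportional = proportional_allocation(population, sample_size)
# Hypothetical over-sampling: a quarter of the sample goes to high-income households.
oversampled = {"high_income": 250, "other": 750}

proportional_weights = design_weights(population, proportional)
oversampled_weights = design_weights(population, oversampled)

print("proportional:", proportional, proportional_weights)
print("oversampled: ", oversampled, oversampled_weights)
```

Over-sampling yields more high-income observations, each carrying a smaller weight. These weights correct only for the sample design. They do nothing about the selected households that decline to respond.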
Statscan used to publish somewhat more detailed data quality indicators. From its 2009 SHS data quality report: “This report… covers the usual quality indicators that generally help users interpret data, such as coefficients of variation, response and non-response rates, slippage rates and imputation rates.” The 2009 report included non-response and under-coverage rates for high-income households.
Under-representation of high-income households in the national household spending survey is not a trivial matter. In fact, economists from across US federal statistical agencies recently published a report seeking to address this very issue with the US household spending survey. Is the Consumer Expenditure Survey Representative by Income? (NBER, 2013) notes: “Given the plutocratic nature of the CPI, the relationship of income and spending on different types of categories suggests that under-representation of high income families in the CE could be biasing the CPI.”
There are far more questions than there are answers regarding the SHS at this point. As it stands, it’s not too far a stretch to say that using the recent SHS results to weight the CPI basket could be biasing the Canadian index. Simply assuming the spending habits of households that don’t respond are similar to those of households that do is questionable at best. It’s completely inappropriate when it comes to high-income households with significantly greater relative discretionary income.
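A toy calculation shows how this assumption can skew plutocratic (expenditure-weighted) basket weights. Every number below is hypothetical: the two household groups, their spending levels, food shares and response rates are invented to illustrate the mechanism and are not estimates for Canada.

```python
# All figures are hypothetical, chosen only to illustrate the mechanism.
groups = [
    {"name": "high_income", "households": 100, "spending": 120_000,
     "food_share": 0.10, "response_rate": 0.3},
    {"name": "other", "households": 900, "spending": 40_000,
     "food_share": 0.20, "response_rate": 0.6},
]

def plutocratic_food_share(groups, count_key):
    """Food's share of total spending, so each dollar (not each household) counts equally."""
    total = sum(g[count_key] * g["spending"] for g in groups)
    food = sum(g[count_key] * g["spending"] * g["food_share"] for g in groups)
    return food / total

# Respondents only: high-income households respond at half the rate of the rest.
for g in groups:
    g["respondents"] = g["households"] * g["response_rate"]

true_share = plutocratic_food_share(groups, "households")
# Treating respondents as representative ignores the difference in response rates.
observed_share = plutocratic_food_share(groups, "respondents")

print(f"Food weight, full population:  {true_share:.1%}")
print(f"Food weight, respondents only: {observed_share:.1%}")
```

In this made-up example the food weight rises from 17.5 percent to about 18.6 percent simply because high-income households are under-represented among respondents. The direction and size of any real bias would depend on actual spending patterns and response rates, which is exactly the information that is no longer published.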
At this point, estimating how much income isn’t represented in the SHS and squaring that with the rough estimate of high-income households that didn’t respond, as the US CEX study did, is the best Canadians can hope for. And even that may be asking too much going forward, as the LFS sampling frame was just recently (January 2015) updated with the equally unreliable 2011 National Household Survey.
Either way, how much non-responding high-income households spent and to what degree their spending habits may have biased the CPI will remain unknown.