2011 NHS: A few notes on the Income release

Inevitably, media sources will cite the 2011 NHS income data, set for release tomorrow, as a realistic portrait of Canadian household and individual incomes. It will not be one, and could not be, owing to the nature of the source survey. The following provides a brief overview of why the 2011 NHS income data will be unreliable. The role of income data in the federal government’s decision to cancel the long-form Census is also discussed.


Possibly the greatest impact of the change from a mandatory long-form Census to a voluntary National Household Survey will be on the income data. The quality of income data relies less on the number of responses than on the distribution of those responses. The best illustration is ‘the one percent’, who in 2006 accounted for nearly 13% of all Canadian household income.

As noted in the preceding post, the 2011 long-form Census test conducted in 2008 had an overall response rate below 50%, as well as a response rate below 50% for the income question(s). Combine that with a questionnaire distribution of 30% and you have a sample of about 8% of the Canadian population (roughly 0.30 × 0.50 × 0.50). Some statisticians will say it doesn’t make much difference whether you have a 25% or a 10% sample, so long as it’s a representative sample. That depends on what’s being measured. In any case, the 2011 NHS had neither a true 10% sample nor a representative one.
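The arithmetic behind that 8% figure can be sketched as follows. This is only an illustration: it assumes the three rates simply multiply, it treats the ‘less than 50%’ rates from the 2008 test as exactly 50% (so the result is an upper bound), and the function name `effective_sample_fraction` is made up for this example.

```python
def effective_sample_fraction(distribution, overall_response, item_response):
    """Share of the population that receives the questionnaire,
    returns it, and answers the question of interest."""
    return distribution * overall_response * item_response

# Assumed inputs: 30% questionnaire distribution, and the 2008 test's
# "less than 50%" rates treated as exactly 50% (an upper bound).
nhs_income = effective_sample_fraction(0.30, 0.50, 0.50)
print(f"Effective income sample: {nhs_income:.1%}")
```

Under those assumptions the result is 7.5%, which rounds to the 8% quoted above. Any response rate below 50% pushes it lower still.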

While it tests income data quality internally by single percentages and vingtiles (5% shares), Statscan only disseminates the data by quintiles (20% shares). The most the public can do is compare the bottom 20% with the top 20% to get a picture of income disparity. With only an 8% sample of whoever-felt-like responding, at least in theory, both top and bottom quintiles could be completely unaccounted for by virtue of non-response, with plenty of room to spare.

While it’s unlikely the two income quintiles were entirely absent from the sample, income, more than most questions, illustrates the problem with having a voluntary non-Census: a disproportionate non-response from a given income group skews the distribution and renders the data meaningless. Certain groups are less interested in responding, and less likely to do so. And those groups happen to sit at the two extremes of the income distribution. Even if the entire top 20% didn’t bail on the 2011 NHS, it doesn’t take many doing so to skew the distribution. There’s no chance the 2011 NHS sample accurately captured ‘the one percent’, for example.
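A small simulation gives a feel for how uneven non-response distorts the picture. Everything in it is an assumption made for illustration: the incomes are synthetic (lognormal, with arbitrary parameters), and the response rates (20% for the bottom and top quintiles, 50% for everyone else) are invented, not measured NHS rates.

```python
import random

def top_share(incomes, fraction=0.20):
    """Share of total income received by the top `fraction` of earners."""
    ranked = sorted(incomes, reverse=True)
    k = max(1, int(len(ranked) * fraction))
    return sum(ranked[:k]) / sum(ranked)

rng = random.Random(2011)  # fixed seed so the run is repeatable

# Synthetic incomes; the lognormal parameters are illustrative assumptions.
population = [rng.lognormvariate(10.5, 0.9) for _ in range(100_000)]

ordered = sorted(population)
low_cut = ordered[len(ordered) // 5]        # 20th percentile
high_cut = ordered[4 * len(ordered) // 5]   # 80th percentile

def responds(income):
    """Assumed response behaviour: both extremes answer less often."""
    p = 0.2 if (income < low_cut or income >= high_cut) else 0.5
    return rng.random() < p

respondents = [x for x in population if responds(x)]

true_top20 = top_share(population)
observed_top20 = top_share(respondents)
print(f"Respondents: {len(respondents)} of {len(population)}")
print(f"Top-quintile income share, population:  {true_top20:.1%}")
print(f"Top-quintile income share, respondents: {observed_top20:.1%}")
```

The respondent pool still runs to tens of thousands of records, yet the top quintile’s share of income comes out noticeably lower than in the full population. More responses of the same lopsided kind would not close that gap.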

Re-weighting, other sources

One of the popular responses to concerns with the 2011 NHS is that ‘other sources’ are available to evaluate and, if need be, adjust the population distributions. Statscan itself has acknowledged it’s been increasingly relying on the short-form to evaluate the population distribution problems brought about by the change from the mandatory to voluntary long-form survey.

As many economists have pointed out, using the short-form to ‘fix’ the 2011 NHS distribution doesn’t work, because the point of the long-form survey is to capture an accurate distribution of population characteristics not found in the short-form. Many of the sub-groups whose characteristics the long-form was designed to capture, like ethno-cultural minorities, immigrants, aboriginal groups, etc., have demographic portraits that differ from that of the general population. For example, aboriginal and immigrant populations tend to be younger, slightly more male (as a result of the age distribution) and have different linguistic characteristics from the general population. Using the age, gender and language distributions from the short-form to ‘fix’ the distributions in the long-form would magnify incorrect distributions of the referenced subgroups, as demonstrated by the questionable immigrant and aboriginal population figures from the 2011 NHS.
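A toy example shows why matching a short-form variable need not repair a subgroup’s numbers. All counts and response rates below are hypothetical, and the reweighting is a plain post-stratification on age, not Statscan’s actual procedure. It illustrates only the narrower point that the age totals can be made to match exactly while the subgroup’s share stays far from its true value.

```python
# Hypothetical population counts by (age group, subgroup).
population = {
    ("young", "subgroup"): 100, ("young", "other"): 300,
    ("older", "subgroup"): 50,  ("older", "other"): 550,
}
# Assumed response rates: the subgroup responds less often at every age.
response_rate = {"subgroup": 0.2, "other": 0.5}

respondents = {cell: n * response_rate[cell[1]] for cell, n in population.items()}

def age_total(counts, age):
    """Total count for one age group, summed over subgroups."""
    return sum(n for (a, _), n in counts.items() if a == age)

# Post-stratification weights: make weighted age totals match the population.
weights = {age: age_total(population, age) / age_total(respondents, age)
           for age in ("young", "older")}
weighted = {cell: n * weights[cell[0]] for cell, n in respondents.items()}

def subgroup_share(counts):
    """Share of the total that belongs to the subgroup."""
    total = sum(counts.values())
    return sum(n for (_, g), n in counts.items() if g == "subgroup") / total

true_share = subgroup_share(population)
raw_share = subgroup_share(respondents)
weighted_share = subgroup_share(weighted)
print(f"True: {true_share:.1%}  raw: {raw_share:.1%}  reweighted: {weighted_share:.1%}")
```

In this made-up case the reweighted age totals are exactly right, but the subgroup’s share only moves from about 6.6% to about 6.8%, against a true 15%. The age variable carries no information about who, within each age group, declined to respond.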

In previous years, other income sources, such as the T1 Family File (T1FF), Longitudinal Administrative Database (LAD) and Survey of Labour and Income Dynamics (SLID) were used in the quality evaluation and data certification process for the Census Income release. They were simply used as a basis for comparison to the long-form Census data, taking into account the differences between those surveys’ methodological and population characteristics relative to the Census’. The distributions from those surveys weren’t, as far as we’re aware, used to modify the long-form data. Whether that has changed this time around given the circumstances is unclear.

Example: SLID vs long-form Census

As noted, the Survey of Labour and Income Dynamics is a survey similar to the long-form Census, but with a much smaller sample, that is often compared/contrasted to the Census.

While it is a voluntary survey, its small sample, just 17K households, reflects an effort to draw a representative sample from those who opted in. Despite this, only 87% population coverage was achieved. That potentially leaves plenty of room for gaps at either end of the income distribution.

SLID also happens to be the source of the annual low-income data. Its most recent (and last; SLID’s been cancelled) release showed low-income incidence at a near record-low in Canada, despite all the other data and anecdotal evidence indicating otherwise. While this in no small part has to do with the design of the low-income measure, it is worth noting that SLID has consistently underestimated the low-income incidence relative to the long-form Census.

Using the before-tax low-income cut-offs (LICO-BT), SLID put low-income incidence in 2006 at 14.3%, while the 2006 long-form estimated it at 15.3%. (The gap is the same one percentage point using the after-tax low-income cut-offs: 10.4% vs 11.4%, respectively.) Sample distribution, along with sample size, apparently makes a difference.
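For readers who want the gap spelled out, here is a short calculation using only the four figures quoted above. The dictionary layout and the ‘relative gap’ measure (the gap as a share of the long-form estimate) are choices made for this sketch, not anything Statscan publishes.

```python
# Low-income incidence (%), as quoted in the text.
estimates = {
    "before-tax": {"SLID": 14.3, "Census": 15.3},
    "after-tax": {"SLID": 10.4, "Census": 11.4},
}

gaps = {}
for measure, e in estimates.items():
    gap_pp = round(e["Census"] - e["SLID"], 1)   # gap in percentage points
    relative = gap_pp / e["Census"]              # gap as a share of the Census figure
    gaps[measure] = (gap_pp, relative)
    print(f"{measure}: {gap_pp} points, {relative:.1%} of the Census estimate")
```

The absolute gap is one point on both measures, but relative to the long-form estimate it is about 6.5% before tax and about 8.8% after tax.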

While the SLID sample is from the LFS, the LFS sample is updated using the Census. With only the population figures from the short-form Census to rely on moving forward, the LFS, SLID and other voluntary Statscan surveys can all be expected to increasingly miscount groups that have higher incidence of low-income, such as the unemployed, immigrants and aboriginal groups.

Income possible reason for long-form Census cancellation

Before demonstrating he lacked a basic grasp of statistics, former Industry Minister Tony Clement claimed the reason he had to replace the long-form Census with a voluntary survey was the overwhelming number of privacy complaints he received from concerned Canadians. His government specifically pointed to an extremely invasive question about the number of bathrooms in people’s homes, which, unfortunately, didn’t exist. When pressed to produce the overwhelming number of complaints that he and former Industry Minister Maxime Bernier received, he couldn’t. Canadians, rightfully, felt they were being lied to.

And they probably were. There may have been a few complaints to Ministers Clement and Bernier’s offices. Most of those likely had to do with the government’s decision to award a contract for part of the Census processing to Lockheed Martin, which started a somewhat popular protest movement prior to the 2006 Census.

In terms of issues with the 2011 long-form Census, one suspects that the complaints weren’t from average Canadians, and they were likely voiced privately to Minister Clement, hence his inability to produce documented complaints. What likely irked the privileged few who had Minister Clement’s ear was not the imaginary bathroom question, but a new question added for the first time to what would have been the 2011 long-form Census: Capital gains.

For many years, the primary grievance social advocacy groups had with the reported growing inequality gap was that it relied solely on income with little accounting for wealth. The old adage about how the rich get richer refers to the ability of the owners of capital to generate greater wealth simply by virtue of their ownership of said capital. The greater the capital bias, the greater the inequality. Unfortunately, the income measures of most household surveys account for very little of this capital bias, with perhaps the exception of dividends and rents.

The 2006 long-form Census for the first time captured a more accurate picture of income disparity than previous cycles largely due to the linkage to respondents’ Canada Revenue Agency file. It should be noted that the linkage was made only for those respondents who explicitly gave Statscan permission, which 82% did. So much for privacy concerns.

The 2011 long-form Census would have taken it one step further by including a question on capital gains. For readers unfamiliar with the term, it refers to an increase in the value of a capital asset (investment or real estate, for example) that gives it a higher worth than the purchase price, when the increase in value is realised with the sale of that asset. The broad term is a bit confusing as it also encompasses losses when the price realised on asset sale is below the initial purchase price.

Economic downturns tend to be magnified by profit-taking in the capital markets. After an extended period of growth, investors often cash in on capital market gains all at once, lowering market caps, corporate revenues and profits, and ultimately contributing to cut-backs and unemployment. This often leads to knee-jerk reactions by governments alternately termed ‘asset relief’, ‘quantitative easing’, etc. Basically corporate welfare. A related hypothesis, dubbed the ‘Minsky Moment’ after economist Hyman Minsky, became a somewhat popular topic following the onset of The Great Recession in 2008. But that’s a separate issue from the one at hand.

During sharp, extended economic downturns the (over-simplified) phenomenon just described tends to magnify income inequality. With unemployment, many workers lose their main source of income, young people face even greater difficulty entering the workforce, while the decline in labour market demand leads to wage stagnation for those who remain in the workforce. In an age of technological innovation that’s gone from labour-saving to labour-replacing, most of the jobs lost in traditional industries such as manufacturing are likely never to return. Indeed, there appears to have been a structural shift to precarious, low-wage service sector employment coinciding with the loss of employment in traditional goods-producing sectors owing to The Great Recession.

On the flip-side, capital owners cashing out after extended capital market run-ups are better off. Those unfamiliar with the phenomenon tend to be surprised when they read that High Net Worth Individuals (HNWI) actually did well during The Great Recession.

This leads to uncomfortable demands for government intervention and income redistribution. One way to get around that problem is to hide the data that could be used to justify those demands.
