2011 NHS: Reminder that the data is still as unreliable as ever

Consider this the FED edition of "2011 NHS: How much less we now know, illustrated".

How much we knew about Canadians following the 2006 long-form Census:

… how much less we know about Canadians following the 2011 National Household Survey:

… and how much less reliably we know it, owing to Statistics Canada’s remarkably lowered data quality standards.

If you’ve recently visited the Statscan web site, you’ve likely noticed the ‘Features’ widget on the front page. Atop the list of featured content is a link to the 2011 National Household Survey (NHS). With the federal election in full swing, it’s tempting to compare the 2011 NHS data by Federal Electoral District (FED) in the hope of gleaning some insight into whether and how socio-demographic and economic characteristics play a role in election results.

At the FED level, 2011 NHS data may seem more reliable than at a lower level of geography, such as the Census Subdivision (CSD). It isn’t; the impression is a misperception stemming from the conceptual difference between a CSD and a FED.

A CSD represents the equivalent of a city, town or reserve, usually defined by incorporation. CSD populations vary widely, from barely a hundred to several million, so a map illustrating non-response rates can be deceptive: a large, sparsely populated area has a far smaller effect on overall data quality than a small, densely populated area with a similar non-response rate.

A FED, on the other hand, is largely based on democratic representation, which generally makes FED populations more homogeneous. With a few exceptions (on PEI and in the northern territories), FED populations range from about 60,000 to 130,000. Given that the 338 federal electoral districts are supposed to represent about 35 million Canadians, it should come as no surprise that the average FED population is about 100,000.
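The "about 100,000" figure is simple division. As a quick check, here is the arithmetic using the round numbers quoted above (35 million people, 338 districts) rather than exact census counts:

```python
# Round figures from the paragraph above, not exact census counts.
population = 35_000_000
districts = 338

average_fed_size = population / districts
print(f"Average FED population: {average_fed_size:,.0f}")  # prints "Average FED population: 103,550"
```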

That means in certain cases a number of smaller CSDs end up being combined (sometimes with a larger CSD) to form a FED, and vice versa – large CSDs end up disaggregated into multiple FEDs.

That said, either way you look at it, about half of all Canadians lived in a defined geographic area for which 2011 NHS data would not have been fit to publish before Statscan dramatically lowered its data quality standards. Fifty percent of Canadians resided in a CSD where at least one in four households did not respond to the survey; 45 percent resided in a FED with the same level of survey non-response.
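For anyone wanting to reproduce this kind of figure, the calculation is a population-weighted share: add up the population of every area at or above the non-response threshold and divide by the total population. The sketch below is only an illustration. The function name `share_above_threshold` and the four example areas are made up, not NHS figures, and the 25 percent threshold is taken from the "one in four households" wording above.

```python
def share_above_threshold(areas, threshold=0.25):
    """Share of total population living in areas whose
    non-response rate is at least `threshold`."""
    total = sum(pop for pop, _ in areas)
    affected = sum(pop for pop, rate in areas if rate >= threshold)
    return affected / total

# Hypothetical (population, non-response rate) pairs; not real NHS data.
example_areas = [
    (500, 0.40),      # tiny area, high non-response
    (120_000, 0.30),  # large area, above the threshold
    (95_000, 0.20),   # large area, below the threshold
    (2_000, 0.10),    # small area, low non-response
]

# Weighted by population: how many people live in affected areas.
share_of_people = share_above_threshold(example_areas)

# Unweighted: how many areas are affected, regardless of size.
share_of_areas = sum(rate >= 0.25 for _, rate in example_areas) / len(example_areas)

print(f"Share of population: {share_of_people:.1%}")
print(f"Share of areas:      {share_of_areas:.1%}")
```

The two shares differ because the unweighted count treats a 500-person area the same as a 120,000-person one, which is the same reason a map of non-response rates can mislead.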

As such, the same warning about the 2011 NHS data quality applies today as it did shortly after its 2013 release: “The income (among other) data in the National Household Survey is not valid. It should not be used or cited. It should be withdrawn.” This fellow’s opinion notwithstanding (more on that shortly).

To clarify: No, this is not a call to simply mandate response to the long-form Census again (more on that shortly as well).
