Five Checks Before Doing Cross-Country Comparisons

6 December 2021

Background

I have always been fascinated by national indicator frameworks and cross-country comparisons. When I was doing my doctoral research in the early 1990s, for example, the United Nations Human Development Index (HDI) was under development. At the time, there was a vibrant academic debate about whether amalgamated indices like these were more than the sum of their parts.

[Map: Human Development Index scores by country, 2019. Source: https://bit.ly/hdi2019map]

National indicator frameworks, such as the European Innovation Scoreboard (EIS) or the V-Dem Liberal Democracy Index, are frequently used by policy-makers and decision-makers to benchmark the situation in a country and to determine priorities for action. If we want our governments to implement evidence-based policies, these types of indicator frameworks are therefore important. National policies still matter and greatly influence outcomes in terms of people's welfare and well-being.

Reliable Indices and Data Quality

This summer, I gave a research methods course, and students asked for general guidelines for working with cross-country comparisons. Above all, we need to make sure that our indicators are based on valid, reliable data that are free from systematic bias. In statistics, validity is the extent to which a concept, conclusion, or measurement is well-founded and likely corresponds accurately to the real world. In practice, validity means that indicators measure what they claim to measure.

A cursory inspection is therefore often not enough to determine whether the indicators are valid. Validity depends on a number of factors, such as the overall research design, the way questions are framed and formulated, and how the answers were elicited. If research has been published in peer-reviewed journals and is widely cited, we can usually assume there are few, if any, validity issues.

Secondly, reliability is the overall consistency of a measure. An indicator is said to have high reliability if it produces similar results under the same conditions. In other words, if you went back and measured again, you would get the same results, and the remaining errors would be random.
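To make this concrete, here is a minimal Python sketch with entirely synthetic data: it simulates two measurement rounds of the same underlying quantity, differing only by random error, and checks how strongly the rounds agree.

    import random
    import statistics

    random.seed(42)

    # True underlying values for 100 hypothetical countries
    true_values = [random.uniform(0, 100) for _ in range(100)]

    def measure(values, noise_sd):
        """One measurement round: the true value plus purely random error."""
        return [v + random.gauss(0, noise_sd) for v in values]

    round_1 = measure(true_values, noise_sd=5)
    round_2 = measure(true_values, noise_sd=5)

    # A reliable indicator yields nearly the same results on repeat measurement
    r = statistics.correlation(round_1, round_2)  # requires Python 3.10+
    print(f"Test-retest correlation: {r:.2f}")    # close to 1.0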

Thirdly, systematic bias occurs when indicator values differ substantially from the real values of the quantity being estimated. There are many possible sources of bias. The most common in the context of cross-country comparisons are selection bias, recall bias, observer bias, and funding or political bias.

Selection bias occurs when samples are not properly drawn from the population of interest; this can be verified by inspecting the sample selection procedures. Recall bias arises when survey questions refer to the past and respondents cannot accurately remember. Observer bias occurs, for example, when surveyors let their own prejudices influence how they select respondents or interpret their responses. This is more likely when the respondents are from a different ethnic or language group.
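The effect of selection bias is easy to demonstrate. The sketch below (synthetic numbers throughout) compares a properly drawn random sample with a sample drawn only from the richer half of a hypothetical population: the biased sample substantially overestimates the mean.

    import random
    import statistics

    random.seed(1)

    # Synthetic population: 10,000 hypothetical household incomes
    population = [random.lognormvariate(10, 0.5) for _ in range(10_000)]

    # A properly drawn random sample tracks the population mean
    random_sample = random.sample(population, 500)

    # A biased sample: surveyors only reached the richer half of households
    richer_half = sorted(population)[len(population) // 2:]
    biased_sample = random.sample(richer_half, 500)

    print(f"Population mean:    {statistics.mean(population):>10,.0f}")
    print(f"Random sample mean: {statistics.mean(random_sample):>10,.0f}")  # close
    print(f"Biased sample mean: {statistics.mean(biased_sample):>10,.0f}")  # too high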

Funding or political bias is a research ethics issue that arises when results are reported to please a funder or political ally. This type of bias is unfortunately quite common in cross-country comparisons, with some countries keen on hiding the facts. There are therefore many sources of bias; some are unconscious, and not all can be controlled for by detailed examination of survey questions and procedures.

Five Checks

Before using survey results, you should ask yourself the following questions:

1- Source check. When we deal with "official" or national statistics, we need to ask ourselves whether the producer of the data has an incentive to lie or to omit data. Even a cursory inspection of international databases, such as those of UNESCO, the World Bank, or the UNDP, shows a lot of omissions. Often it is not that governments do not have the data; they simply do not wish them to be publicly known. When examining data from a specific research project, we need to inspect the research design and assess whether the data are valid, unbiased, and reliable.
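On the omissions point, a quick audit of missing values is often revealing before any comparison is attempted. A minimal sketch, with made-up country names and values (None standing for an entry that was never reported):

    # Hypothetical indicator table: country -> yearly values (None = not reported)
    data = {
        "Country A": {2018: 71.2, 2019: 72.0, 2020: 72.5},
        "Country B": {2018: 55.4, 2019: None, 2020: None},
        "Country C": {2018: None, 2019: None, 2020: None},
    }

    for country, series in data.items():
        missing = [year for year, value in series.items() if value is None]
        if missing:
            print(f"{country}: no data for {missing} -- ask why before comparing")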

Lately, several "indices of indices" have been produced, which combine existing indices into a new composite index. This makes checking sources somewhat more complicated. The issue was hotly debated in the 1990s when the UN launched the Human Development Index, which combines income, education, and health indicators. The question was whether this added value compared with using separate indices for each of the three dimensions. Although this question can never be answered conclusively, it is a fact that policymakers prefer rankings and league tables to more detailed data. In the end, the debate was decided by practitioners, who have continued to use these indices.
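As a sketch of how such a composite is typically built (simplified here; the actual HDI methodology has additional steps, such as taking the logarithm of income): each dimension is first normalized to a 0-1 index against fixed goalposts, and the dimension indices are then combined, for instance with a geometric mean. The goalpost and country values below are assumptions for illustration only.

    def min_max(value, lo, hi):
        """Normalize a raw value to a 0-1 dimension index against goalposts."""
        return (value - lo) / (hi - lo)

    def composite_index(income, education, health):
        """Geometric mean of three already-normalized dimension indices."""
        return (income * education * health) ** (1 / 3)

    # Hypothetical country: raw values with assumed (lo, hi) goalposts
    income_idx    = min_max(25_000, 100, 75_000)  # GNI per capita, USD
    education_idx = min_max(12.0, 0, 18)          # expected years of schooling
    health_idx    = min_max(78.0, 20, 85)         # life expectancy, years

    print(f"Composite index: "
          f"{composite_index(income_idx, education_idx, health_idx):.3f}")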

2- Dimensions check. It is good practice to check whether the dimensions on both sides of an equation are the same; otherwise, we risk comparing apples and oranges. A variable expressed in meters can only be compared with one expressed as a percentage after applying a correction, for example by converting both to a common, dimensionless scale. This is particularly important when developing models or when examining causal relationships.
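One common way to apply such a correction is to standardize both variables onto a common, dimensionless scale before comparing them. A minimal sketch with invented values:

    import statistics

    def z_scores(values):
        """Standardize to mean 0 and standard deviation 1 (dimensionless)."""
        mu = statistics.mean(values)
        sd = statistics.stdev(values)
        return [(v - mu) / sd for v in values]

    coastline_km = [5, 120, 2500, 37, 410]          # a length, in kilometers
    internet_pct = [45.0, 88.5, 92.1, 60.3, 75.0]   # a share, in percent

    # After standardization, both series are unit-free and comparable
    print([f"{z:+.2f}" for z in z_scores(coastline_km)])
    print([f"{z:+.2f}" for z in z_scores(internet_pct)])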

3- Measurement scale check. When numerical indicators are compared from year to year, we need to check whether the underlying variable uses the same scale. In the United Nations Human Development Index (HDI), for example, the scales for many variables were normalized using the minimum and maximum for each specific year. Similarly, Human Rights Measurement Initiative (HRMI) indicators are expressed as a percentage of the maximum achievable in a given year. As a result, for these types of indices, year-to-year comparisons are strictly speaking not possible. You need to read the methodology section and, just as when you take a medicine, mind the warnings on the label.
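The pitfall is easy to reproduce. In the sketch below (synthetic numbers), country B's raw value is identical in both years, yet its normalized score drops simply because another country moved the yearly maximum:

    def normalize(value, year_values):
        """Min-max normalization against that year's own minimum and maximum."""
        lo, hi = min(year_values), max(year_values)
        return (value - lo) / (hi - lo)

    values_2019 = {"A": 40, "B": 60, "C": 80}
    values_2020 = {"A": 40, "B": 60, "C": 100}  # only C improved

    for year, values in [("2019", values_2019), ("2020", values_2020)]:
        score_b = normalize(values["B"], values.values())
        print(f"{year}: country B raw = 60, normalized score = {score_b:.2f}")
    # B's score falls from 0.50 to 0.33 although nothing changed for B,
    # so comparing the normalized scores across years would mislead.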

4- Scope check. This is related to check five: the reason and purpose for which the data were collected in the first place. In many cases, we need to make sure that we are comparing units that are reasonably similar. Depending on the research question, it may not make much sense, for example, to compare the Principality of Monaco, with its population of under 40,000, with the United States, with a population of over 330 million.

Similarly, when the countries in the study include a large number of small island states with outlying values, this may skew the final results. Finally, for certain questions, such as agricultural and environmental policy or trade issues, it makes more sense to amalgamate all 27 European Union member states rather than considering them separately, because policy- and decision-making on these issues is largely an EU affair, not a national one.
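A sketch of what that amalgamation looks like in practice, using invented figures for three of the 27 member states: population-weighting the national rates yields a single EU-level observation, and shows how an unweighted mean would let a small-state outlier distort the picture.

    # Hypothetical data: population (millions) and an indicator rate (%)
    eu_members = {
        "Germany": {"population": 83.2, "rate": 12.5},
        "France":  {"population": 67.8, "rate": 10.1},
        "Malta":   {"population": 0.5,  "rate": 25.0},  # small-state outlier
    }

    total_pop = sum(c["population"] for c in eu_members.values())
    weighted = sum(c["population"] * c["rate"]
                   for c in eu_members.values()) / total_pop
    unweighted = sum(c["rate"] for c in eu_members.values()) / len(eu_members)

    # Malta barely moves the weighted figure but pulls the unweighted mean up
    print(f"Population-weighted EU rate: {weighted:.1f}%")
    print(f"Unweighted mean of rates:    {unweighted:.1f}%")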

5- Why? Last, but most importantly, you need to be clear about why you want to compare different entities or countries, or the same entities across different years. You must try to become aware of your own potential biases. You may want to consider reversing the research question or including variables you initially left out.

Final Remarks

It is amazing how, in the last 30 years, national indicator frameworks have extended their coverage to more countries and have become more sophisticated. Sometimes, however, the sheer complexity of how indices are constructed can obscure important methodological issues or biases. Performing these five checks can protect researchers from committing major mistakes.
