Estimates of yearly data lead to gaps with census
After the national and provincial data from the seventh national census were released, some people may have been surprised to discover that many indicators, including the total population, population growth, age and gender structures, and the distribution of population between urban and rural areas and among ethnic groups, are inconsistent with the annual population data published previously, in some cases by relatively large margins.
For example, the population aged 0-14 in the seventh national census is 253 million, 14 million more than the 239 million births reported for the previous 15 years, from 2006 to 2020. Logically, the population aged 0-14 should have been less than 239 million, since some children in this age group will have died prematurely. It might also be surprising to see the population aged 15-59 in 2020 "suddenly" decline by 20 million compared with 2019.
But it is normal for the census data and previously published annual population data to be different. The annual population data published between two censuses, such as the birth rate, death rate, population growth rate, total population, and the age and ethnic structures, are all based on comprehensive and scientific analysis, calculation and estimation of vital statistics, migration registration data, and sample survey data.
This is also broadly how population data are produced in non-census years in most other countries. Strictly speaking, the population data published in non-census years, such as the total population, growth rate, and age and ethnic structures, are not the results of a direct count but estimates.
In the United States, too, the birth rate, population growth rate and total population published each year are estimates based on a comprehensive analysis of births and deaths from the vital statistics system, international migration registration data, and population age data from the previous census.
In China, the annual population data in non-census years are often based on sample surveys that cover about one in every 1,000 people. Various population indicators are calculated from the sample and then used to estimate the total population and the overall population indicators, such as the growth rate, age structure and ethnic structure.
Since a sample survey inevitably carries some sampling error, these estimates will be close to the census data but not in perfect agreement with them. For indicators that require a large sample size to be calculated accurately, such as the birth rate and the composition and population indicators of ethnic minorities, the differences could be larger.
For instance, according to the seventh national census, the population of Guangdong province is 126.01 million, 12.55 million more than the 113.46 million announced in 2018, and Northeast China has a population of 98.51 million, 9.85 million less than the 108.36 million announced in 2018.
It is normal for the population figures for Guangdong, Northeast China and the Xinjiang Uygur autonomous region in this census not to tally with the previously announced annual population data. Estimates based on sample surveys are usually close to the census data but not as precise, and they can be either higher or lower than the census figures.
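The point about sample size can be illustrated with a minimal sketch. It assumes simple random sampling, which is a simplification (real sample surveys typically use more complex designs), and every number in it is a hypothetical input rather than an official figure. It shows only that, for the same indicator, a smaller sample gives a larger relative sampling error.

```python
import math


def relative_standard_error(rate: float, sample_size: int) -> float:
    """Relative standard error of a proportion under simple random sampling."""
    standard_error = math.sqrt(rate * (1.0 - rate) / sample_size)
    return standard_error / rate


# Hypothetical inputs, chosen only for illustration.
national_sample = 1_400_000  # roughly 1/1,000 of a population of 1.4 billion
regional_sample = 100_000    # a hypothetical, much smaller regional subsample
birth_rate = 0.010           # a hypothetical rate of 10 births per 1,000 people

national_rse = relative_standard_error(birth_rate, national_sample)
regional_rse = relative_standard_error(birth_rate, regional_sample)

print(f"national estimate: about {national_rse:.2%} relative error")
print(f"regional estimate: about {regional_rse:.2%} relative error")
```

Under these assumptions the regional estimate is several times less precise than the national one, which is consistent with the larger gaps seen for regional and subgroup indicators.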
So what should be done when the census data and the annual population data announced in non-census years do not match, and which should be considered accurate? Given that the annual data in non-census years are estimates while the census data are the direct results of a universal survey, the international community generally takes the census as the standard. "Inter-census population estimation technology" is then used to verify and revise the main indicators of the annual population data between the two censuses, so that the figures for each year are consistent with the actual change in population.
For example, in Canada, the revision of the previous data begins shortly after an agricultural census is conducted. After the 2010 census, the US also revised its population data from 2000 to 2010 to more accurately reflect population growth and development. After the sixth population census, China, too, used the "inter-census population estimation technology" to revise the annual birth rate and population growth rate from 2000 to 2010, and published the revised results. The statistical yearbooks published later used the revised data for the population indicators from 2000 to 2010.
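The article does not spell out how such a revision is calculated. As a rough sketch of the general idea, one simple approach is to measure the gap between the census count and the estimate for the census year (often called the error of closure) and spread it linearly over the years since the previous census. The function name and the figures below are hypothetical, and this is not presented as the method any statistical agency actually uses.

```python
def revise_intercensal(postcensal: list[float], census_count: float) -> list[float]:
    """Spread the error of closure linearly over the years between two censuses.

    postcensal[0] is the previous census year and is left unchanged;
    postcensal[-1] is the estimate for the new census year and ends up
    equal to the new census count.
    """
    span = len(postcensal) - 1
    if span < 1:
        raise ValueError("need at least a base year and a census year")
    error_of_closure = census_count - postcensal[-1]
    return [
        estimate + error_of_closure * year / span
        for year, estimate in enumerate(postcensal)
    ]


# Hypothetical annual estimates, in millions, for a five-point series.
estimates = [100.0, 101.0, 102.0, 103.0, 104.0]
revised = revise_intercensal(estimates, census_count=106.0)
print(revised)  # [100.0, 101.5, 103.0, 104.5, 106.0]
```

The earliest year keeps its census-based value, the final year matches the new census, and the years in between absorb a growing share of the gap.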
The undercount rate of the seventh national census is only 0.05 percent, lower than that of the 1982 census and the lowest of any census so far, making it the most accurate census since the founding of the People's Republic of China. It is this high quality that has highlighted the inconsistencies between the census data and some annual data from non-census years, some of which are relatively large.
After the census, the National Bureau of Statistics, based on the more accurate and authoritative census data, has verified and revised the data on birth rates and growth rates at the national level, giving a truer and more accurate picture of China's population growth and development.
The author is director of the Population Development Studies Center at Renmin University of China.
The views don't necessarily reflect those of China Daily.