
Register-based Statistics
Persons
Anders Wallgren and Britt Wallgren, Statistics Sweden.
Content
Preface
Chapter 1 Register Surveys - An Introduction
1.1 The purpose of the book
1.2 The need for a new theory and new methods
1.3 Four ways of using administrative registers
1.4 Preconditions for register-based statistics
1.4.1 Reliable administrative systems
1.4.2 Legal base and public approval
1.5 Basic concepts and terms
1.5.1 What is a statistical survey?
1.5.2 What is a register?
1.5.3 What is a register survey?
1.5.4 The Income and Taxation Register
1.5.5 The Quarterly and Annual Pay Registers
1.6 Comparing sample surveys and register surveys
1.7 Conclusions
Chapter 2 The Nature of Administrative Data
2.1 Different kinds of administrative data
2.2 How are data recorded?
2.3 Administrative and statistical information systems
2.4 Measurement errors in statistical and administrative data
2.5 Why use administrative data for statistics?
2.6 Comparing sample survey and administrative data
2.6.1 A questionnaire to persons compared with register data
2.6.2 An enterprise questionnaire compared with register data
2.7 Conclusions
Chapter 3 Protection of Privacy and Confidentiality
3.1 Internal security
3.1.1 No text in output databases!
3.1.2 Existence of identity numbers
3.2 Disclosure risks - tables
3.2.1 Rules for tables with counts, totals and mean values
3.2.2 The threshold rule - analyse complete tables!
3.2.3 Frequency tables are often misunderstood
3.2.4 Combining tables can cause disclosure
3.3 Disclosure risks - micro data
3.4 Conclusions
Chapter 4 The Register System
4.1 A register model based on object types and relations
4.1.1 The register system and protection of privacy
4.1.2 The register system and data warehousing
4.2 Organising the work with the system
4.3 The populations in the system
4.3.1 How to produce consistent register-based statistics
4.3.2 Registers and time
4.3.3 Populations, variables and time
4.4 The variables in the system
4.4.1 Standardised variables in the register system
4.4.2 Derived variables
4.4.3 Variables with different origins
4.4.4 Variables with different functions in the system
4.5 Using the system for micro integration
4.6 Three kinds of registers with different roles
4.7 Register systems and register surveys within enterprises
4.8 Conclusions
Chapter 5 The Base Registers in the System
5.1 Characteristics of a base register
5.2 Requirements for base registers
5.2.1 Defining and deriving statistical units
5.2.2 Objects and identities - requirements for a base register
5.2.3 Coverage and spanning variables in base registers
5.3 The Population Register
5.4 The Business Register
5.5 The Real Estate Register
5.6 The Activity Register
5.7 Everyone should support the base registers!
5.8 Conclusions
Chapter 6 How to Create a Register - Matching and Combining Sources
6.1 Preconditions in different countries
6.2 Matching methods and problems
6.2.1 Deterministic record linkage
6.2.2 Probabilistic record linkage
6.2.3 Four causes of matching errors
6.3 Matching sources with different object types
6.4 Conclusions
Chapter 7 How to Create a Register - The Population
7.1 How should register surveys be structured?
7.2 Register survey design
7.2.1 Determining the research objectives
7.2.2 Making an inventory of different sources
7.2.3 Analysing the usability of administrative sources
7.3 Defining a register's object set
7.3.1 Defining a population
7.3.2 Can you alter data from the National Tax Agency?
7.3.3 Defining a population - primary registers
7.3.4 Defining a population - integrated registers
7.3.5 Defining a calendar year population
7.3.6 Defining a population - frame or register population?
7.3.7 Base registers should be used when defining populations
7.4 Defining the statistical units
7.4.1 Units and identities when creating primary registers
7.4.2 Using administrative objects instead of statistical units
7.5 Creating longitudinal registers - the population
7.6 Conclusions
Chapter 8 How to Create a Register - The Variables
8.1 The variables in the register
8.1.1 Variable definitions
8.1.2 Variables in statistical science
8.1.3 Variables in informatics
8.1.4 Creating register variables - check list
8.2 Forming derived variables using models
8.2.1 Exact calculation of values using a rule
8.2.2 Estimating values with a rule
8.2.3 Estimating values with a causal model
8.2.4 Derived variables and imputed variable values
8.2.5 Creating variables by coding
8.3 Activity data
8.3.1 Activity statistics
8.3.2 Activity data aggregated for enterprises and organisations
8.3.3 Activity data aggregated for persons - multi-valued variables
8.4 Creating longitudinal registers - the variables
8.5 Conclusions
Chapter 9 How to Create a Register - Editing
9.1 Editing register data
9.1.1 Editing one administrative register
9.1.2 Consistency editing - is the population correct?
9.1.3 Consistency editing - are the units correct?
9.1.4 Consistency editing - are the variables correct?
9.2 Case studies - editing register data
9.2.1 Editing work within the Income and Taxation Register
9.2.2 Editing work with the Income Statement Register
9.2.3 What more can be learned from these examples?
9.3 Editing, quality assurance and survey design
9.3.1 Survey design in a register-based production system
9.3.2 Quality assessment in a register-based production system
9.3.3 Total survey error in a register-based production system
9.4 Conclusions
Chapter 10 Metadata
10.1 Primary registers - the need for metadata
10.1.1 Documentation of administrative sources
10.1.2 Documentation of sources within the system
10.1.3 Documentation of a new register
10.2 Changes over time - the need for metadata
10.3 Integrated registers - the need for metadata
10.4 Classification and definitions database
10.5 The need for metadata for registers
10.6 Conclusions
Chapter 11 Estimation Methods - Introduction
11.1 Estimation in sample surveys and register surveys
11.2 Estimation methods for register surveys that use weights
11.3 Calibration of weights in register surveys
11.4 Using weights for estimation
11.5 Conclusions
Chapter 12 Estimation Methods - Missing Values
12.1 Make no adjustments, publish 'value unknown'
12.2 Adjustment for missing values using weights
12.3 Adjustment for missing values by imputation
12.4 Missing values in a system of registers
12.5 Conclusions
Chapter 13 Estimation Methods - Coverage Problems
13.1 Reducing overcoverage and undercoverage
13.1.1 Coverage problems in the Population Register
13.1.2 Coverage problems in the Business Register
13.2 Estimation methods to correct for overcoverage
13.3 Undercoverage in the administrative system
13.4 Conclusions
Chapter 14 Estimation Methods - Multi-valued Variables
14.1 Multi-valued variables
14.2 Estimation methods
14.2.1 Occupation in the Activity and Occupation Registers
14.2.2 Industrial classification in the Business Register
14.2.3 Importing many multi-valued variables
14.2.4 Consistency between estimates from different registers
14.2.5 Multi-valued variables - what is done in practice?
14.2.6 Additional estimation methods
14.3 Application of the method
14.4 Linking of time series using combination objects
14.4.1 Linking time series
14.4.2 Changed industrial classification in the Business Register
14.5 Conclusions
Chapter 15 Theory and Quality of Register-based Statistics
15.1 Is there a theory for register surveys?
15.1.1 Statistical inference at a national statistical office
15.1.2 Theory-based methods or ad hoc methods
15.1.3 The survey approach and the systems approach
15.2 Measuring quality - why and how?
15.3 Analysing administrative sources - input data quality
15.4 Output data quality
15.5 The integration process - integration errors
15.5.1 Creating register populations - coverage errors
15.5.2 Creating statistical units - errors in units
15.5.3 Creating statistical variables - errors in variables
15.6 Random variation in register data
15.7 The register system and data warehousing
15.8 Conclusions
Chapter 16 Conclusions
References
Index
CHAPTER 2
The Nature of Administrative Data
Administrative data are data used to administer individual objects. Statistical data are data used to produce estimates for aggregates of units.
A national statistical office sometimes collects data on turnover and other economic variables for a sample survey of enterprises. The enterprises take data from their administrative registers and send them to the statistical office. For the enterprises, these data are administrative data; but for the statistical office, the same data are statistical data.
Administrative registers are used for administrative purposes in an administrative information system. An administrative register should contain all objects to be administered; the objects are identifiable and the variables in the register are used for administrative purposes. The register of all yearly income self-assessments from persons is an example of an administrative register that is maintained by the tax authorities. It is used to decide the income tax that should be paid by each individual person in the register. When this register is delivered to the statistical office it becomes a statistical register as it will now be used to produce estimates.
When we discuss quality issues, we distinguish between statistical data that are based on administrative registers and statistical data that have been collected for sample surveys or censuses by the statistical office.
2.1 Different kinds of administrative data
Data that have been collected or created by administrative authorities can be of different nature. Some data are actually statistical data, if the authority wants to produce its own statistics. For example, the Swedish Public Employment Service produces its own statistics on job seekers and some variables collected from the job seekers are actually statistical data.
Other kinds of variables are legally important – if you provide the wrong information for these, you have done something illegal and can be punished. Income assessments and tax returns are examples of this kind of data.
A third category of variables represents decisions made by an authority. For example, the Tax Board decides on taxable income and the amount of tax that should be paid; a court decides that a person is guilty of violating a certain law and should receive a specific punishment; social authorities decide that a family is entitled to receive some kind of benefit and set the amount of money they will receive.
Among these different kinds of administrative data, statistical data and data of no legal importance as a rule are of the lowest quality, while legally important data and decisions made by an authority are of the highest quality. The quality of administrative data is important for the individual’s rights and obligations in contrast to statistical data that do not have any consequences for the respondent.
We can take the administration in a manufacturing enterprise as an example of administrative data that are purely administrative in nature: A customer phones and asks if enterprise X can deliver a certain quantity of a certain commodity. How much will it cost and when can it be delivered? After negotiations, the following administrative data have been created:
Customer identity: cccc
Item number: aaaa
Quantity: qqqq
Price: pppp
Delivery date: dddd
This kind of administrative data can be used afterwards for a register survey on sales. It should be observed that there is no measurement here and no collection of data – data are generated during the administrative process. A statistical measurement is of a quite different nature – the true values of the variables exist first, and then we measure and collect the data.
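As an illustration only, the order record above and its later use in a register survey on sales might be sketched as follows. The class name, the field names, the sample values and the reading of "price" as a unit price are all assumptions made for this sketch; they are not taken from any actual enterprise system.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class OrderRecord:
    """One administrative record, generated when an order is agreed."""
    customer_id: str
    item_number: str
    quantity: int
    price: float  # assumed here to be the agreed price per unit
    delivery_date: date


def sales_by_item(orders):
    """Statistical use: total sales value per item, ignoring customer identities."""
    totals = defaultdict(float)
    for order in orders:
        totals[order.item_number] += order.quantity * order.price
    return dict(totals)


# Hypothetical records created during the administrative process.
orders = [
    OrderRecord("C001", "A10", 5, 20.0, date(2024, 3, 1)),
    OrderRecord("C002", "A10", 2, 20.0, date(2024, 3, 4)),
    OrderRecord("C001", "B20", 1, 150.0, date(2024, 3, 9)),
]

print(sales_by_item(orders))  # {'A10': 140.0, 'B20': 150.0}
```

Nothing is measured in this sketch: the records exist because orders were agreed, and the statistics are obtained afterwards by aggregating them.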
2.2 How are data recorded?
Three technical issues have consequences for the quality of administrative data:
– How are identity numbers and other identifying variables recorded?
– How are other variables recorded?
– Can data be revised?
Identity numbers can be handwritten by an employee at the administrative authority or by the person in contact with the authority. Quality will be low in both cases. A better alternative is to pre-print the identity number, name and address on the administrative form that is sent by mail to the person or enterprise from which information is requested. The best alternative is that the contact with the authority starts with an online identity check against a register that has identity number, name and address.
Example: Births are registered by midwives or doctors in some Latin American countries. A paper form with a pre-printed unique identity number of the birth is filled in by hand. Many errors are found regarding the mother’s identity number, name and address.
Example: Tax forms for yearly income tax of persons are sent by post to all potential tax payers in Sweden. The identity number, name and address are from the National Tax Board’s Population Register that is updated every day.
Example: When you contact a Swedish hospital (private or public), a nurse checks your identity number online against a copy of the National Tax Board’s Population Register. Diagnosis and treatment are recorded together with this identity number. This identity number and the county where the patient is registered as permanently living are essential for economic transactions.
Variables other than identities can be recorded on a paper form, but the best alternative is to record them in a PC system or internet system that checks the data as they are entered. Errors are then corrected at the point of recording and further editing will be easier.
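A minimal sketch of such a check at the time of recording could look like the following. The register contents, the identity number format, the variable `reported_income` and the plausibility rule are hypothetical and chosen only for illustration; a real system would check against the actual register and apply its own rules.

```python
# Hypothetical copy of a population register: identity number -> (name, county).
POPULATION_REGISTER = {
    "ID-0001": ("Anna Berg", "Stockholm"),
    "ID-0002": ("Lars Holm", "Uppsala"),
}


def check_entry(identity_number, reported_income):
    """Return a list of error messages; an empty list means the entry is accepted."""
    errors = []
    # Identity check: the number must exist in the register.
    if identity_number not in POPULATION_REGISTER:
        errors.append("identity number not found in register")
    # Simple plausibility rule for another variable (an assumed rule).
    if reported_income < 0:
        errors.append("reported income must not be negative")
    return errors


print(check_entry("ID-0001", 25000))  # []
print(check_entry("ID-9999", -5))
```

Because the errors are reported while the person is still in contact with the authority, they can be corrected immediately instead of being discovered later during editing.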
Taxpayers must be allowed to make corrections and send in revised tax forms. All kinds of tax reports contain such corrections that can be sent to the tax authorities over a period that can be quite long. The statistical office should analyse the inflow of these corrections for each source and determine a point in time when statistics production should start. Corrections delivered after that time will be disregarded at the statistical office.
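One possible way to analyse the inflow of corrections is sketched below: given the arrival dates of corrections observed for an earlier reference period, find the earliest date by which a chosen share of them had arrived. The function name, the 75 per cent share and the dates are assumptions for this sketch, not a rule used by any statistical office.

```python
import math
from datetime import date


def cutoff_date(arrival_dates, share):
    """Earliest date by which at least `share` of all corrections had arrived."""
    if not 0 < share <= 1:
        raise ValueError("share must be in the interval (0, 1]")
    ordered = sorted(arrival_dates)
    if not ordered:
        raise ValueError("no corrections observed")
    # Number of corrections that must have arrived by the cut-off.
    needed = math.ceil(share * len(ordered))
    return ordered[needed - 1]


# Hypothetical arrival dates of corrections for one source.
arrivals = [
    date(2024, 5, 2),
    date(2024, 5, 20),
    date(2024, 6, 15),
    date(2024, 11, 30),
]

print(cutoff_date(arrivals, 0.75))  # 2024-06-15
```

In this example, starting production after 15 June would include three of the four corrections; the late correction from November would be disregarded.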
2.3 Administrative and statistical information systems
Using administrative data for statistical purposes is not something specific to national statistical offices. It is also common practice in large enterprises and organisations. Administrative systems are generally used as sources of statistical information and there is no major difference between the following enterprise example and register-based statistics at a national statistical office:
– Statistics on staff and salaries within an enterprise can be produced using the personnel management system.
– Population and income statistics are produced at a statistical office using data from the National Tax Board’s tax collection system for population registration and tax assessment.
Register surveys have become increasingly common within enterprises and organisations. Knowledge about register systems, register-based statistics and register quality is needed not only within a national statistical office but also more generally. This is illustrated by the following extract from a job advertisement:
Market analyst
As an analyst in the marketing department, you will be an important cog in the wheel of our enterprise’s continued growth. You will manage and develop the use of one of the enterprise’s most valuable assets – our client register.
You will work with campaign analyses, drafting reports, segmenting and ensuring the quality of the register. You will maintain contact with external register systems and work closely with the marketing manager.
Certain information systems are built solely for statistical purposes, such as the Labour Force Survey, which is conducted in many countries. These systems can therefore be completely designed according to statistical principles.
Other information systems are used for administrative as well as statistical purposes, which can sometimes lead to conflicts with regard to the structure of the system. In general, these systems are primarily intended for administrative purposes and the statistical information is a by-product.
However, there are several differences between a pure administrative system and a pure statistical system. These two kinds of systems are compared below.
Different purposes
Information in an administrative system is used as a basis when taking administrative measures and decisions that will affect the objects in the system.
Example: A personnel management system is used to carry out salary payments every month. For each employee, a decision is made regarding how much should be paid for the specific month.
Information in a statistical system is used as the basis for analysis and drawing conclusions. These conclusions can serve as the basis for policy-related decisions.
Example: A statistical salary system is used to study salary structure. How has this changed? What are the differences in monthly salaries between different staff categories? This analysis could then involve a change in policy relating to salary issues, for example that women should be better paid.
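The two purposes can be contrasted in a small sketch that uses the same personnel records in both ways. The records, field names and salary figures are invented for illustration and do not come from any real system.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical personnel records from a personnel management system.
PERSONNEL = [
    {"employee_id": "E1", "category": "engineer", "sex": "F", "salary": 4000},
    {"employee_id": "E2", "category": "engineer", "sex": "M", "salary": 4400},
    {"employee_id": "E3", "category": "clerk", "sex": "F", "salary": 3000},
    {"employee_id": "E4", "category": "clerk", "sex": "M", "salary": 3000},
]


def salary_to_pay(employee_id):
    """Administrative use: a decision about one identified employee."""
    for record in PERSONNEL:
        if record["employee_id"] == employee_id:
            return record["salary"]
    raise KeyError(employee_id)


def mean_salary_by(variable):
    """Statistical use: mean salary per group; identities play no role."""
    groups = defaultdict(list)
    for record in PERSONNEL:
        groups[record[variable]].append(record["salary"])
    return {group: mean(salaries) for group, salaries in groups.items()}


print(salary_to_pay("E2"))
print(mean_salary_by("category"))
print(mean_salary_by("sex"))
```

The first function needs the employee's identity and affects that individual; the second only describes groups, and its results could feed a discussion of salary policy.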
Different roles for individual objects
In an administrative system, decisions are made and measures are taken with regard to individual objects. To...