About Weighting

Name of the Cross-sectional Weighting Variables

| | | 1992 | 1993 | 1994 | 1995 | 1996 | 1997 |
| Individual | Children | SULY92 | GYEKE93S | GYEKE94S | GYEKE95S | GYEKE96S | GYEKE97S |
| | Adult | SULY92 | EGYKE93S | EGYKE94S | EGYKE95S | EGYKE96S | EGYKE97S |
| | Combined | SULY92 | KKSULY93 | KKSULY94 | KKSULY95 | KKSULY96 | KKSULY97 |
| Household | | SULY92 | HAZKE93S | HAZKE94S | HAZKE95S | HAZKE96S | HAZKE97S |

Name of the Longitudinal-Panel Weighting Variables

| | | 1992 | 1992-93 | 1992-94 | 1992-95 | 1992-96 | 1992-97 |
| Individual | Children | SULY92 | GYELO93S | GYELO94S | GYELO95S | GYELO96S | GYELO97S |
| | Adult | SULY92 | EGYLO93S | EGYLO94S | EGYLO95S | EGYLO96S | EGYLO97S |
| | Combined | SULY92 | KLSULY93 | KLSULY94 | KLSULY95 | KLSULY96 | KLSULY97 |
| Household | | SULY92 | HAZLO93S | HAZLO94S | HAZLO95S | HAZLO96S | HAZLO97S |

About weighting
Originally the HHP consisted of two samples: a national sample of approximately 2000 households and a Budapest subsample of about 600 households. The two subsamples were treated separately until the 4th wave. Because the sample size had decreased, it became necessary to unify the subsamples.
After the HHP study was finished, a new version of the database was created in which the two subsamples are merged from the first wave onward.
1st wave, 1992 (individual and household weight)
- Because the Hungarian capital is oversampled in the new, unified sample, a correction weight (SULY92) was needed for the first wave. The original national sample of 2050 households for the first wave was intended to be representative of non-institutional Hungarian households and of the individuals living in them.
- Households in Budapest and their members have a weight of 0.4. All other (non-Budapest) households have a weight of 1. The value applies uniformly: each household member has the same weight as his or her household. Since individual and household weights do not differ, only one variable, SULY92, was created. Every individual and household that participated in the first wave has a value of either 1 or 0.4 on this variable.
- Note that this results in a weighted sample size (N=2050) that is lower than the unweighted sample size (N=2668). The weighted sample size equals the original national sample size.
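The first-wave rule can be illustrated with a minimal Python sketch. The household records, the `budapest` flag and the helper name `suly92` are hypothetical and only serve the illustration; only the two weight values (0.4 and 1) come from the description above.

```python
# Toy first-wave households; IDs and the "budapest" flag are invented for illustration.
households = [
    {"hh_id": 1, "budapest": True},
    {"hh_id": 2, "budapest": True},
    {"hh_id": 3, "budapest": False},
    {"hh_id": 4, "budapest": False},
    {"hh_id": 5, "budapest": False},
]


def suly92(in_budapest: bool) -> float:
    """First-wave weight: 0.4 for Budapest households, 1 for all others."""
    return 0.4 if in_budapest else 1.0


# Every household (and hence every member) receives the same value.
for hh in households:
    hh["SULY92"] = suly92(hh["budapest"])

unweighted_n = len(households)
weighted_n = sum(hh["SULY92"] for hh in households)

# The weighted N is smaller than the unweighted N because Budapest is down-weighted.
print(unweighted_n, round(weighted_n, 1))
```

In the toy data the weighted count falls below the unweighted count for the same reason that the weighted first-wave sample (N=2050) is smaller than the unweighted one (N=2668).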
Unweighted and weighted distribution of households by settlement type, HHP 1st wave – 1992

Cross-sectional weights for adults (EGYKEnnS), for children (GYEKEnnS) and the combined individual weight (KKSULYnn), 2nd-6th wave
- In working out the weighting principles, it was assumed that the panel ages "naturally": individuals die, are born, and so on. Besides natural aging there is sample attrition, which should be corrected by introducing weights for certain groups of people. By attrition we mean that some households dropped out of the sample because they refused to cooperate or moved to an unknown address and could not be followed.
- Attrition was then corrected according to the main socio-demographic characteristics of the sample dropouts: gender, age, education and settlement type. Correction here means an adjustment to the sample composition of the previous wave, rather than to an external source such as a census or microcensus. Consequently, the 2nd wave was adjusted to the 1st wave, the 3rd wave to the 2nd wave, etc.
- However, this solution has some shortcomings. Weights were computed only for those who were aged 16 or over AND completed an individual or substitution questionnaire. There are no adult individual weights for children under 16, or for those who did not complete an individual questionnaire and for whom no substitution questionnaire was administered. This is because educational information is available only from those sources, and because young children simply do not have any educational credentials.
- To address this problem, a separate weight variable (GYEKEnnS) was introduced for children. It is simply their household weight. The computation of household weights is discussed below.
- However, there are still some individuals who have no individual cross-sectional weight. Their number ranged between 13 and 50 over the panel period. These persons are invisible to statistical procedures when the data set is weighted.
- The following variables were used in defining the weights:
| Variable | Categories |
| Gender | 1 – male; 2 – female |
| Age | 1 – under 16 years old; 2 – 16-29 years; 3 – 30-39 years; 4 – 40-49 years; 5 – 50-59 years; 6 – aged 60 and over |
| Education | 1 – maximum primary; 2 – vocational; 3 – high school (secondary level); 4 – college & university (tertiary level) |
| Residence / settlement type | 1 – village (including homesteads); 2 – towns; 3 – capital (Budapest) |
- The weights are intended to correct sample attrition for the categories specified above (and for broader ones). If you perform analyses using variables with narrower categories (for example, treating college degrees separately from university-level education), the results may be affected by sample attrition.
- Based on the adults' and children's weights, a combined (cross-sectional) individual weight (KKSULYnn) was created so that the total individual sample of the HHP can be weighted. This variable simply combines the two weighting variables: it takes the value of EGYKEnnS for adults and of GYEKEnnS for children.
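The adjustment of each wave to the composition of the previous wave can be sketched in Python as follows. The exact formula used for the HHP weights is not given here, so the sketch assumes a simple cell-based adjustment: each cell defined by gender, age, education and settlement type receives the ratio of its weighted share in the previous wave to its unweighted share in the current wave. The function names, record layout and toy data are hypothetical.

```python
from collections import defaultdict


def cell_of(person):
    """Weighting cell: gender x age group x education x settlement type."""
    return (person["gender"], person["age"], person["education"], person["settlement"])


def cell_shares(people, weight_key=None):
    """Share of each cell in the sample, weighted if weight_key is given."""
    totals = defaultdict(float)
    for p in people:
        totals[cell_of(p)] += p[weight_key] if weight_key else 1.0
    grand_total = sum(totals.values())
    return {cell: t / grand_total for cell, t in totals.items()}


def attrition_weights(previous_wave, current_wave, prev_weight_key):
    """ASSUMED formula: previous wave's weighted cell share / current wave's raw cell share."""
    target = cell_shares(previous_wave, prev_weight_key)
    observed = cell_shares(current_wave)
    return [target.get(cell_of(p), 0.0) / observed[cell_of(p)] for p in current_wave]


# Toy data using the category codes from the table above.
young_townsman = {"gender": 1, "age": 2, "education": 3, "settlement": 2}
old_villager = {"gender": 2, "age": 6, "education": 1, "settlement": 1}

wave1 = [dict(young_townsman, w=1.0), dict(young_townsman, w=1.0),
         dict(old_villager, w=1.0), dict(old_villager, w=1.0)]
# One young townsman dropped out between the waves.
wave2 = [dict(young_townsman), dict(old_villager), dict(old_villager)]

weights = attrition_weights(wave1, wave2, prev_weight_key="w")
print([round(w, 2) for w in weights])
```

In the toy data the cell that lost a member is weighted up and the intact cell is weighted down, so the weighted composition of the second wave matches that of the first while the weights still sum to the current sample size.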
Cross-sectional household weights, 2nd-6th wave (HAZKEnnS)
- Household weights were computed as the mean of the individual adult weights within the household. This assumes that the socio-economic position of a household depends on its adult members' resources. This is only one of the possible solutions, and not necessarily the best one. Another option would be to use the individual weight of the household head, as was the practice at TARKI before this new weighting system was introduced.
- Again, a problem is that some households have no weight, because of missing individual weights. Specifically, in wave 3 and wave 5 two households have no weight.
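The relationship between the adult, household, children's and combined cross-sectional weights can be illustrated with a short Python sketch. The records and the short keys `EGYKE`, `GYEKE`, `HAZKE` and `KKSULY` (without the wave suffix) are hypothetical stand-ins for the wave-specific variables. The age threshold of 16 follows the text.

```python
from statistics import mean

# Toy person records for one wave; None marks a missing adult weight.
persons = [
    {"hh_id": 1, "age": 45, "EGYKE": 1.2},
    {"hh_id": 1, "age": 43, "EGYKE": 0.8},
    {"hh_id": 1, "age": 10, "EGYKE": None},  # child: no adult weight
    {"hh_id": 2, "age": 70, "EGYKE": 0.9},
]


def household_weights(persons):
    """Household weight = mean of the adult individual weights in the household."""
    by_household = {}
    for p in persons:
        if p["EGYKE"] is not None:
            by_household.setdefault(p["hh_id"], []).append(p["EGYKE"])
    # Households without any adult weight get no entry, i.e. no household weight.
    return {hh: mean(ws) for hh, ws in by_household.items()}


HAZKE = household_weights(persons)

for p in persons:
    is_child = p["age"] < 16
    # Children simply receive their household weight.
    p["GYEKE"] = HAZKE.get(p["hh_id"]) if is_child else None
    # Combined weight: adult weight for adults, children's weight for children.
    p["KKSULY"] = p["GYEKE"] if is_child else p["EGYKE"]

print(HAZKE)
print([p["KKSULY"] for p in persons])
```

A household in which no adult has an individual weight does not appear in `HAZKE`, which mirrors the households without weights mentioned above.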
Longitudinal weights
- Household- and individual-level longitudinal weights were constructed for those who participated in more than one wave and were already members of the starting panel (1st wave) sample. For example, the longitudinal weight for the 3rd wave is available for those who participated in the 1st, 2nd and 3rd waves. Similarly, the longitudinal weight for the last (6th) wave refers to the subsample of individuals who participated throughout the whole panel period (from the 1st to the 6th wave).
- Individual longitudinal weights (EGYLOnnS) for persons aged 16 and older were computed as the product of the cross-sectional individual weights.
- Longitudinal household weights (HAZLOnnS) were computed as the mean of the longitudinal individual weights of persons living in the same household.
- Longitudinal weights for children (GYELOnnS) were created in the same way as their cross-sectional weights: children simply received their household's longitudinal weight.
- Similar to the combined individual cross-sectional weight, a combined individual longitudinal weight variable (KLSULYnn) was defined as the combination of EGYLOnnS and GYELOnnS.
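The construction of the longitudinal weights can be sketched in Python. The sketch assumes that the product runs over the cross-sectional weights of every wave from the 1st up to the wave in question, which the text implies but does not state explicitly. The person IDs, record layout and function names are hypothetical.

```python
from math import prod
from statistics import mean

# Toy adult cross-sectional weights for waves 1..3; None = no weight in that wave.
adults = {
    "p1": {"hh_id": 1, "weights": [1.0, 1.1, 0.9]},
    "p2": {"hh_id": 1, "weights": [1.0, 0.9, 1.0]},
    "p3": {"hh_id": 2, "weights": [0.4, None, 1.2]},  # missed wave 2
}


def longitudinal_weight(weights, wave):
    """ASSUMED: product of the cross-sectional weights of waves 1..wave.

    Returns None unless the person has a weight in every one of those waves.
    """
    span = weights[:wave]
    if len(span) < wave or any(w is None for w in span):
        return None
    return prod(span)


def household_longitudinal_weights(adults, wave):
    """Mean of the longitudinal individual weights of persons in the same household."""
    by_household = {}
    for person in adults.values():
        w = longitudinal_weight(person["weights"], wave)
        if w is not None:
            by_household.setdefault(person["hh_id"], []).append(w)
    return {hh: mean(ws) for hh, ws in by_household.items()}


egylo_wave3 = {pid: longitudinal_weight(p["weights"], 3) for pid, p in adults.items()}
hazlo_wave3 = household_longitudinal_weights(adults, 3)

print(egylo_wave3)
print(hazlo_wave3)
```

In the toy data the person who missed the second wave has no third-wave longitudinal weight, and that person's household has no household longitudinal weight either. A child in household 1 would receive that household's value as its longitudinal weight.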