Sample analysis using the ICCS data
An application of HLM
Daniel Caro
November 25
ICCS 2009 IDB Seminar – Nov 24-26, 2010 – IEA DPC, Hamburg, Germany
Purpose
Illustrate the use of hierarchical linear models (HLM)
with ICCS 2009 data through the evaluation of specific
hypotheses
2
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
3
Data structure
Often participants of studies are nested within specific
contexts
Patients treated in hospitals
Firms operate within countries
Families live in neighborhoods
Students learn in classes within schools
Data stemming from such research designs have a
multilevel or hierarchical structure
4
Implications of research design
Observations are not independent within
classes/schools
Students within schools tend to share similar characteristics
(e.g., socioeconomic background and instructional setting)
Traditional linear regression (OLS) assumes:
$\mathrm{Corr}(e_i, e_j) = 0$ for $i \neq j$, i.e., the differences between observed and
predicted Y (the residuals) are uncorrelated
Ignoring dependence of observations may lead to
wrong conclusions
5
Intra-class correlation coefficient
The intra-class correlation coefficient (ICC) measures
the degree of data dependence
It is the proportion of the total variance that lies between
schools, i.e., $\mathrm{ICC} = s_b^2 / (s_b^2 + s_w^2)$,
where $s_b^2$ is the variance between schools and $s_w^2$ the
variance within schools (between students)
If ICC = 0, responses of students within schools are
uncorrelated
If ICC = 1, responses within schools are identical
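The two limiting cases follow from reading the ICC as a correlation between two students of the same school. A short derivation sketch, assuming a simple random-intercept model with independent school and student components; the symbols $\mu$, $u_j$, and $e_{ij}$ are introduced here for illustration and do not appear on the slides, while $s_b^2$ and $s_w^2$ are the between- and within-school variances:

```latex
\begin{align*}
Y_{ij} &= \mu + u_j + e_{ij}, \qquad
  \operatorname{Var}(u_j) = s_b^2, \quad \operatorname{Var}(e_{ij}) = s_w^2 \\
\operatorname{Cov}(Y_{ij}, Y_{i'j}) &= \operatorname{Var}(u_j) = s_b^2
  \qquad (i \neq i') \\
\operatorname{Corr}(Y_{ij}, Y_{i'j}) &= \frac{s_b^2}{s_b^2 + s_w^2} = \mathrm{ICC}
\end{align*}
```

With $s_b^2 = 0$ the correlation is zero; with $s_w^2 = 0$ all students of a school give the same response.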
6
Effective sample size
A higher ICC value indicates greater dependence among
observations within schools
Effective sample size is smaller than observed sample size
Effective $n = mk / (1 + \mathrm{ICC} \cdot (m - 1))$,
where $m$ is the number of students per school, $k$ the number of
schools, and $mk$ the observed sample size $n$
If ICC=1, effective n is equal to the # of schools (k)
If ICC=0, effective n is equal to the observed n (i.e., mk)
In general, effective n lies between k and mk
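The formula and its two boundary cases can be checked with a few lines of Python. This is a minimal sketch assuming equal class sizes; the design of 150 schools with 20 students each is hypothetical and not taken from the ICCS data.

```python
def effective_sample_size(m: int, k: int, icc: float) -> float:
    """Effective n = m*k / (1 + ICC*(m - 1)) for k schools of m students each."""
    if m < 1 or k < 1:
        raise ValueError("m and k must be positive")
    if not 0.0 <= icc <= 1.0:
        raise ValueError("icc must lie in [0, 1]")
    design_effect = 1.0 + icc * (m - 1)
    return (m * k) / design_effect


# Hypothetical design: 150 schools with 20 students each (observed n = 3000).
for icc in (0.0, 0.1, 0.5, 1.0):
    n_eff = effective_sample_size(m=20, k=150, icc=icc)
    print(f"ICC = {icc:.1f} -> effective n = {n_eff:.1f}")
```

In this example ICC = 0 returns the observed n (3000) and ICC = 1 returns the number of schools (150); intermediate values fall in between.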
7
Limitations of OLS
OLS neglects ICC and considers standard errors based
on observed n
But effective n is smaller than observed n when observations
are correlated
The standard error is inversely proportional to the square root of n
Thus, OLS tends to underestimate the standard error
Underestimated standard errors can lead to incorrect
significance tests and inferences
The jackknife repeated replication (JRR) method produces correct standard errors under
a multilevel research design
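How much the standard error is understated can be illustrated for the simplest case, the standard error of a mean. The sketch below is not the JRR procedure; it only compares the naive standard error based on the observed n with one based on the effective n, assuming equal class sizes. All numbers (SD of 10, 150 schools of 20 students, ICC of 0.1) are hypothetical.

```python
import math


def standard_error_of_mean(sd: float, n: float) -> float:
    """Standard error of a mean for n independent observations."""
    return sd / math.sqrt(n)


def effective_n(m: int, k: int, icc: float) -> float:
    """Effective sample size for k schools of m students each."""
    return (m * k) / (1.0 + icc * (m - 1))


# Hypothetical values, for illustration only.
sd, m, k, icc = 10.0, 20, 150, 0.1

naive = standard_error_of_mean(sd, m * k)                    # ignores clustering
adjusted = standard_error_of_mean(sd, effective_n(m, k, icc))  # uses effective n

print(f"naive SE    = {naive:.3f}")
print(f"adjusted SE = {adjusted:.3f}")
print(f"ratio       = {adjusted / naive:.3f}")
```

The ratio of the two standard errors equals the square root of 1 + ICC*(m-1), so even a modest ICC can inflate the correct standard error noticeably when classes are large.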
8
Hierarchical linear models
Additionally, hierarchical linear models distinguish
effects between and within clusters/schools
For example, they enable evaluating
The effect of SES on student achievement within schools and
between schools
The effect of school location (urban/rural) on the average
achievement between schools
9
Hierarchical linear models
Account explicitly for the multilevel nature of the data
with the introduction of random effects
Take the ICC into account when calculating standard errors, tests, and p-values
Decompose variance within and between schools
Student level variables explain variance within schools or
between students
School level variables explain variance between schools
A single R-squared cannot be reported
Instead, there is one for each level
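One common convention for a level-specific R-squared is the proportional reduction in each variance component relative to the null model (see Raudenbush & Bryk, 2002). A minimal sketch of that calculation follows; the variance components are hypothetical and the function name is chosen here for illustration.

```python
def variance_explained(null_variance: float, model_variance: float) -> float:
    """Proportional reduction in a variance component relative to the null model."""
    if null_variance <= 0:
        raise ValueError("null_variance must be positive")
    return (null_variance - model_variance) / null_variance


# Hypothetical variance components (not taken from the ICCS analysis).
null_model = {"within": 100.0, "between": 12.0}
fitted_model = {"within": 90.0, "between": 9.0}

r2_level1 = variance_explained(null_model["within"], fitted_model["within"])
r2_level2 = variance_explained(null_model["between"], fitted_model["between"])

print(f"Level 1 (students): {r2_level1:.2f}")
print(f"Level 2 (schools):  {r2_level2:.2f}")
```

The two values answer different questions, which is why they are reported separately rather than combined into one figure.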
10
Hierarchical linear models
Estimate regressions within schools
Provide estimates of the intercept and coefficients (e.g.,
gender gap, SES effect) for each school
Level 1 (student) coefficients may depend on level 2
(school) characteristics, i.e., they are treated as dependent
variables at level 2
For example, the gender gap at the student level (i.e., the gender
coefficient) may vary between classes depending on the gender of the
class teacher, a level 2 variable
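The gender gap example can be written as a two-level model. This is a sketch in the usual HLM notation; $W_j$ is a hypothetical indicator for the gender of the class teacher, introduced here only for illustration.

```latex
\begin{align*}
\text{Level 1:}\quad Y_{ij} &= \beta_{0j} + \beta_{1j}\,\mathrm{GIRL}_{ij} + r_{ij} \\
\text{Level 2:}\quad \beta_{0j} &= \gamma_{00} + \gamma_{01}\,W_j + u_{0j} \\
\beta_{1j} &= \gamma_{10} + \gamma_{11}\,W_j + u_{1j}
\end{align*}
```

Here $\beta_{1j}$ is the gender gap in class $j$, and $\gamma_{11}$ captures how that gap differs with the teacher's gender (a cross-level interaction).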
11
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
12
Research goal
Evaluate 10 hypotheses related to the attitudes of
students towards equal rights for immigrants
The literature underscores the importance of:
Family SES, participation in diverse networks, intergroup
discussion about civic issues, gender, social dominance
orientation, civic knowledge, religious beliefs, the school
location (urban/rural), the school climate
References in ‘C:\ICCS2009\HLM training\References.pdf’
For each hypothesis
Theory and independent variables
13
Related data and variables
Selected country
England
The analysis is restricted to international
scales/variables
A description of the dependent and independent
variables, their type, coding scheme, and source is in
C:\ICCS2009\HLM training\List of variables.pdf
The student (england1.sav) and class level
(england2.sav) datasets are in
C:\ICCS2009\HLM training\Data
14
Data structure
Students (level 1 units) are nested in classes (level 2
units)
The ICCS sample design yields an optimal sample of
students within classes, not an optimal sample of
students within schools
Usually one class was selected within each school,
rather than students across different grades
15
NOTE
This is a didactic example only. You will not be able to
readily repeat this analysis during the presentation
16
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
17
HLM software
HLM estimates different types of hierarchical linear
models
The applied example is for two-level models (students nested
in classes)
Several steps are required to estimate a model:
Creating data specifications file (.mdmt)
Importing data to HLM (.mdm)
Deciding on settings (e.g., weights, plausible values)
Specifying model (.hlm)
Estimating model
18
Beginning with HLM
19
Data specifications (.mdmt)
20
Selecting student level data
21
Missing data
HLM accepts multiply imputed datasets
The multiple imputation (MI) procedure is performed in other software
For example, consult NORM, PAN, or MICE in Stata and R
Since data are usually not missing completely at random, it is
recommended to conduct MI before model estimation
But for this example we will use the available data only
HLM offers two options at level 1
Listwise deletion (making mdm): Sample is the same for all models
Pairwise deletion (running analysis): Sample depends on included variables
Missing values at level 2 substantially reduce the sample size
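The difference between the two level 1 options can be illustrated outside HLM. The sketch below uses toy student records (the values are invented; only the variable names come from this example) and shows that deleting incomplete cases once gives the same sample for every model, while deleting per analysis gives a sample that depends on the variables included. It does not reproduce HLM's internal procedure.

```python
# Toy student records; None marks a missing value. Values are invented.
students = [
    {"id": 1, "IMMRGHT": 52.0, "HISCED": 3, "PARTCOM": 48.0},
    {"id": 2, "IMMRGHT": 47.5, "HISCED": None, "PARTCOM": 51.0},
    {"id": 3, "IMMRGHT": None, "HISCED": 2, "PARTCOM": 45.0},
    {"id": 4, "IMMRGHT": 55.0, "HISCED": 4, "PARTCOM": None},
    {"id": 5, "IMMRGHT": 50.0, "HISCED": 1, "PARTCOM": 49.5},
]


def complete_cases(rows, variables):
    """Keep only rows with no missing value on the listed variables."""
    return [row for row in rows if all(row[v] is not None for v in variables)]


def ids(rows):
    return [row["id"] for row in rows]


# Deletion when building the data file: one fixed sample for all models.
listwise = complete_cases(students, ["IMMRGHT", "HISCED", "PARTCOM"])

# Deletion when running the analysis: sample depends on the model's variables.
model_a = complete_cases(students, ["IMMRGHT", "HISCED"])
model_b = complete_cases(students, ["IMMRGHT", "PARTCOM"])

print("same sample for all models:", ids(listwise))
print("model with HISCED:", ids(model_a))
print("model with PARTCOM:", ids(model_b))
```

Per-analysis deletion keeps more cases for each model, but estimates from different models are then based on different students.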
22
Selecting class level data
23
Save data specifications (.mdmt)
24
Create data file (.mdm)
25
Check stats
26
Add dependent variable
27
Declare weights
28
Save null model
29
Run null model
30
View output
31
Interpret and save
Folder:
‘C:\ICCS2009\HLM training\Models\model0.txt’
Class variance=12.14; Student variance=103.99
ICC=12.14/(12.14+103.99)=0.10
About 10% of the variance in the outcome lies between classes
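The calculation from the null model output can be reproduced directly from the two reported variance components. A minimal sketch; the function name is chosen here for illustration.

```python
def intraclass_correlation(between_variance: float, within_variance: float) -> float:
    """Share of the total variance that lies between level 2 units."""
    total = between_variance + within_variance
    if total <= 0:
        raise ValueError("total variance must be positive")
    return between_variance / total


# Variance components reported for the null model (model0.txt).
class_variance = 12.14
student_variance = 103.99

icc = intraclass_correlation(class_variance, student_variance)
print(f"ICC = {icc:.4f}")
print(f"Share within classes = {1 - icc:.4f}")
```

Keeping the unrounded value is useful when the ICC feeds into later calculations such as the effective sample size.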
32
Table of contents
HLM theory
Applied research example
HLM data importing/estimation settings
Hypothesis testing
33
Hypotheses
1. The SES Hypothesis
2. The Contact Hypothesis
3. The Intergroup Discussion Hypothesis
4. The Gender Hypothesis
5. The Social Dominance Orientation Hypothesis
6. The Learning Hypothesis
7. The Religion Belief Hypothesis
8. The National Identity Hypothesis
9. The Urban/Rural Differences Hypothesis
10. The School Climate Hypothesis
34
The SES Hypothesis
The SES hypothesis predicts more positive views of
minorities among students of higher SES families than
among students of lower SES families
Competition among low-SES groups
High-SES individuals travel and encounter culturally diverse realities
Independent variables
Parental education (HISCED)
Parental occupational status (HISEI)
35
The SES Hypothesis
36
Centering of Xs
The intercept is the expected value of Y when Xs are
zero
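$E(Y_{ij} \mid \text{Xs} = 0) = E(\beta_{0j}) + \beta_{1j} \cdot 0 + \beta_{2j} \cdot 0 + \dots + \beta_{kj} \cdot 0 + E(r_{ij})$
Since $\beta_{0j} = \gamma_{00} + u_{0j}$ and $E(r_{ij})$ and $E(u_{0j})$ are zero, $\gamma_{00} = E(Y_{ij} \mid \text{Xs} = 0)$
But sometimes zero is not in the range of the Xs
For example, if X is age or an achievement score
In that case, the intercept is not interpretable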
By centering the Xs, the intercept can be interpreted as
the expected value of Y at the centering value(s) of Xs
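The effect of centering on the intercept can be seen by rewriting the level 1 model with a single predictor. This is a sketch in the same notation; $c$ stands for the chosen centering value and is a symbol introduced here for illustration.

```latex
\begin{align*}
Y_{ij} &= \beta_{0j} + \beta_{1j}\,(X_{ij} - c) + r_{ij} \\
E(Y_{ij} \mid X_{ij} = c) &= E(\beta_{0j}) + \beta_{1j} \cdot 0 + E(r_{ij}) = \gamma_{00}
\end{align*}
```

The centered predictor is zero exactly when $X_{ij} = c$, so the intercept now refers to a value of X that actually occurs in the data.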
37
Centering of Xs
Two options at level 1
Grand and group (class) mean centering
The type of centering depends on the research interest
(Enders & Tofighi, 2007; Raudenbush & Bryk, 2002)
Group mean centering is appropriate for unadjusted or pure
within and between school effects
Grand mean centering yields school effects adjusted for
student characteristics and is preferable for contextual effects
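The two centering options differ only in which mean is subtracted. A minimal sketch with toy data (two hypothetical classes, three students each) shows the consequence: grand mean centering preserves differences between class means, while group mean centering removes them.

```python
from statistics import mean

# Toy data: (class_id, x) pairs. Values are invented for illustration.
rows = [
    ("A", 2.0), ("A", 4.0), ("A", 6.0),
    ("B", 5.0), ("B", 7.0), ("B", 9.0),
]

grand_mean = mean(x for _, x in rows)
classes = sorted({c for c, _ in rows})
class_means = {c: mean(x for cc, x in rows if cc == c) for c in classes}

# Grand mean centering: subtract the overall mean from every student.
grand_centered = [(c, x - grand_mean) for c, x in rows]

# Group (class) mean centering: subtract the student's own class mean.
group_centered = [(c, x - class_means[c]) for c, x in rows]


def mean_by_class(pairs):
    return {c: mean(v for cc, v in pairs if cc == c) for c in classes}


print("class means after grand mean centering:", mean_by_class(grand_centered))
print("class means after group mean centering:", mean_by_class(group_centered))
```

After group mean centering every class has a mean of zero on X, so the predictor carries only within-class variation; after grand mean centering the class means still differ, so between-class variation in X remains in the model.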
38
The SES Hypothesis
39
The SES Hypothesis
The hypothesis is supported by the parental education
data
Effect size? (see stats and model estimates)
For a 1 SD increment in HISCED, IMMRGHT increases by
0.67 points (1.04*0.64), that is, about 6 percent (0.67/10.75)
of a SD in IMMRGHT
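The effect size arithmetic can be wrapped in a small helper. The slide does not say which of 1.04 and 0.64 is the coefficient and which is the SD of HISCED; the assignment below is an assumption, and the product is the same either way. The function name is chosen here for illustration.

```python
def effect_in_sd_units(coefficient: float, sd_x: float, sd_y: float) -> float:
    """Change in Y, in SD units of Y, for a 1 SD increase in X."""
    if sd_y <= 0:
        raise ValueError("sd_y must be positive")
    return coefficient * sd_x / sd_y


# Assumed assignment of the two factors (not stated on the slide).
coefficient_hisced = 0.64
sd_hisced = 1.04
sd_immrght = 10.75  # SD of IMMRGHT used on the slide

change_in_points = coefficient_hisced * sd_hisced
share_of_sd = effect_in_sd_units(coefficient_hisced, sd_hisced, sd_immrght)

print(f"change in IMMRGHT points: {change_in_points:.2f}")
print(f"share of a SD in IMMRGHT: {share_of_sd:.1%}")
```

The same helper applies to the other continuous predictors in this example once their coefficients and SDs are read from the model output and the descriptive statistics.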
40
The Contact Hypothesis
The contact hypothesis anticipates greater tolerance among
students participating in diversified and extended social networks
(Allport, 1954; Côté & Erickson, 2009)
Independent variables
Students' civic participation in the wider community (PARTCOM)
Students' civic participation at school (PARTSCHL)
Control for SES
Higher-SES individuals have more diversified social networks (Erickson, 2004) and are
more active in voluntary associations (Curtis & Grabb, 1992)
41
The Contact Hypothesis
42
The Contact Hypothesis
The hypothesis holds in England
Both students' civic participation in the wider community (PARTCOM) and
students' civic participation at school (PARTSCHL) are positively related to the
attitudes toward immigrants
For a 1 SD increment in the independent variables, the associated positive
change in IMMRGHT amounts to
7 percent of SD in IMMRGHT for PARTCOM
11 percent of SD in IMMRGHT for PARTSCHL
43
The Intergroup Discussion Hypothesis
The intergroup discussion hypothesis posits that more positive
attitudes toward minorities develop from dialogue on social and
civic issues inside and outside the school (Dessel, 2010a)
Independent variables
Students' discussion of political and social issues outside of school
(POLDISC)
Student perceptions of openness in classroom discussions (OPDISC)
Control variables
Parental education (HISCED)
44
The Intergroup Discussion Hypothesis
45
The Intergroup Discussion Hypothesis
The hypothesis is validated by the data
Both students' discussion of political and social issues outside of
school (POLDISC) and student perceptions of openness in
classroom discussions (OPDISC) are positively related to
IMMRGHT
For a 1 SD increment in the independent variables, the associated positive
change in IMMRGHT amounts to
9 percent of SD in IMMRGHT for POLDISC
18 percent of SD in IMMRGHT for OPDISC
46
The Gender Hypothesis
The gender hypothesis predicts greater tolerance
among girls than boys. Women tend to be more liberal,
nurturing and social than men and are also expected to
be more tolerant (Côté & Erickson, 2009; Gidengil, Blais, Nadeau, &
Nevitte, 2003)
Independent variable
The student’s sex (GIRL)
47
The Gender Hypothesis
48
The Gender Hypothesis
The gender hypothesis holds in England
Differences between girls and boys amount to 2.24
score points in the IMMRGHT scale, that is, 21 percent
of a SD in IMMRGHT
49
The Social Dominance Orientation Hypothesis
The social dominance orientation (SDO) hypothesis
states that gender differences are partly explained by
differences in support for social inequality (Mata, Ghavami, &
Wittig, 2010).
Independent variables
Female (GIRL)
Students' support for democratic values (DEMVAL)
Students' attitudes towards gender equality (GENEQL)
Students' attitudes towards equal rights for all ethnic/racial
groups (ETHRGHT)
50
The Social Dominance Orientation Hypothesis
51
The Social Dominance Orientation Hypothesis
The hypothesis is supported by the data
When proxies for social dominance orientation are
included, gender differences are no longer significant
52
The Learning Hypothesis
The learning hypothesis predicts greater tolerance
when individuals know more about minorities and civic
issues in general (Côté & Erickson, 2009)
Independent variables
Civic knowledge (PV1CIV)
Control for participation (Curtis & Grabb, 1992; Erickson, 2004)
Students' civic participation in the wider community
(PARTCOM)
Students' civic participation at school (PARTSCHL)
53
The Learning Hypothesis
54
The Learning Hypothesis
The learning hypothesis holds in England
Students with greater knowledge of civic issues also
have more positive attitudes toward immigrants, even
when civic participation is controlled for
A 1 SD increment in PV1CIV is associated with an
increase in IMMRGHT of about 22 percent of a SD
55
The Religion Belief Hypothesis
The religion belief hypothesis anticipates an association between
holding religious beliefs and tolerance toward minorities (Hall, Matz,
& Wood, 2010; Schwartz & Huismans, 1995). The direction of the
association is not clear
Negative for values of social conformity, tradition, conventionalism, and an
authoritarian belief system
Positive for humanitarianism, values of benevolence toward others, and a
search for spiritual meaning
Independent variables
Students' belonging to a religion (RELIG)
Students' attitudes towards the influence of religion on society (RELINF)
Control variables
Parental education (HISCED)
56
The Religion Belief Hypothesis
57
The Religion Belief Hypothesis
The hypothesis is not supported by the data
The RELIG coefficient is non-significant
The RELINF coefficient is positive and significant, suggesting that
students attaching a greater value to the influence of religion in
society also share more positive attitudes toward immigrants. But
the association with RELINF alone does not evaluate the
hypothesis
58
The National Identity Hypothesis
The National Identity Hypothesis maintains that
individuals are less tolerant of immigrants when they
have a greater sense of national identity
Independent variables
Students' attitudes towards their country (ATTCNT)
Control variables
Parental education (HISCED)
59
The National Identity Hypothesis
60
The National Identity Hypothesis
The hypothesis is not supported by the data
61
The Urban/Rural Differences Hypothesis
The urban/rural hypothesis anticipates more positive
views of minorities in urban areas than in rural areas
(Côté & Erickson, 2009) due to greater opportunities to meet
socially and culturally diverse people in cities (Erickson, 2004)
Independent variable
School location (RURAL)
Control variables
School level SES
School mean parental education (MHISCED)
School mean parental occupational status (MHISEI)
Availability of resources in local community (RESCOM)
62
The Urban/Rural Differences Hypothesis
63
The Urban/Rural Differences Hypothesis
The hypothesis is not supported by the data
The RURAL coefficient is non-significant
64
The School Climate Hypothesis
The school climate hypothesis states that a safe and
positive school climate favors more positive attitudes
toward minorities. Such a climate helps reduce
the anxiety and threat underlying anti-minority
attitudes (Comerford, 2003; Dessel, 2010b; Moradi et al., 2006)
Independent variables
Teachers' perceptions of classroom climate (TCLCLIM)
Teachers' perceptions of social problems at school (TSCPROB)
Controls
School average parental education (HISCED)
Availability of resources in local community (RESCOM)
65
The School Climate Hypothesis
66
The School Climate Hypothesis
The hypothesis cannot be supported by the data
67
Questions?
THANK YOU FOR YOUR ATTENTION
68