Social Science Research 53 (2015) 73–88


Heritability, family, school and academic achievement in adolescence

Artur Pokropek a,⇑, Joanna Sikora b,1

a Institute of Philosophy and Sociology, Polish Academy of Sciences, Nowy Świat 72, 00-330 Warsaw, Poland
b School of Sociology, Research School of Social Sciences, College of Arts and Social Sciences, Australian National University, Canberra, ACT 2601, Australia

Article info

Article history: Received 23 January 2014; Revised 16 April 2015; Accepted 10 May 2015; Available online 16 May 2015

Keywords: Twin studies; Genetically informed designs; Administrative exam data; Latent-class modeling; Heritability in education

Abstract

We demonstrate how genetically informed designs can be applied to administrative exam data to study academic achievement. ACE mixture latent class models have been used with Year 6 and 9 exam data for seven cohorts of Polish students which include 24,285 pairs of twins. Depending on a learning domain and classroom environment history, from 58% to 88% of variance in exam results is attributable to heritability, up to 34% to shared environment and from 8% to 15% depends on unique events in students’ lives. Moreover, between 54% and 66% of variance in students’ learning gains made between Years 6 and 9 is explained by heritability. The unique environment accounts for between 34% and 46% of that variance. However, we find no classroom effects on student progress made between Years 6 and 9. We situate this finding against the view that classroom peer groups and teachers matter for adolescent learning. © 2015 Elsevier Inc. All rights reserved.

1. Introduction

While educational research on twins has a long history, it remains a fairly narrow and specialized niche based mostly on twin surveys or twin register data. The ongoing fascination with twin data methodologies arises from their potential to tease out the role of heritability in the effects of family background on educational outcomes of youth. Heritability is usually off the radar in mainstream social science research and yet it is implausible to assume that it has no role in parental influence on offspring. This is why the last two decades have seen a reinvigorated interest in educational studies of twins which are often described as ‘genetically informed designs’ (Freese, 2008).

In this paper we show how methodologies developed for twin data can be applied to other data, retaining the full benefits of genetically sensitive research. This may have important implications for educational research in countries where twin data are not readily available. As such, our methodology is likely to facilitate an expansion of genetically informed educational research beyond the studies from the USA, the UK and Australia, which currently dominate the literature in this area. This study showcases a unique approach because our empirical evidence comes from entire cohorts of 13- and 16-year-old Poles who sat compulsory placement exams in mathematics and humanities between 2002 and 2011. We show how such longitudinal data can be used to generate reliable quasi-genetic designs. Compulsory exams are already a norm in

⇑ Corresponding author. Fax: +48 228267823. E-mail addresses: [email protected] (A. Pokropek), [email protected] (J. Sikora).
1 Fax: +61 261252222.

http://dx.doi.org/10.1016/j.ssresearch.2015.05.005 0049-089X/© 2015 Elsevier Inc. All rights reserved.


many countries and are likely to be introduced in others, in line with the worldwide standardization and benchmarking advocated by intergovernmental organizations such as UNESCO or the OECD (Kamens and McNeely, 2010). We make a contribution to genetically sensitive educational research not only by applying twin study methodologies to high-stakes exam data but also by considering learning gains made by youth in early adolescence. We begin with a description of the history and logic of genetically informed designs. Then, we review the literature on twins and educational achievement. Next, we describe the Polish education system, our research questions, data and methods. Our results follow and we conclude with a discussion of their compatibility with prior literature.

1.1. Genetically informed designs involving twin data

Worldwide, one to four percent of live births are twins. The proportions of twins vary by country and region. For instance in Poland in 2010, 2.5% of newborns were twins (CEB, 2002–2011) while in the United States the comparable figure stood at 4% (Plomin et al., 2013). Although many twins are born smaller than single babies, various studies show that twins do not differ consistently from other children in physical, psychological or social characteristics (Plomin et al., 2013). If differences exist, they peter out by the time children turn five (Evans and Martin, 2008). Twins are either identical (monozygotic or MZ) or fraternal (dizygotic or DZ). MZ twins come from one egg and one sperm. DZ twins come from two eggs and two sperm. Identical twins are very similar and are always of the same sex. Genetically, the status of fraternal twins resembles that of other siblings; they grow up in the same environment and share on average about 50% of their segregating genes (Plomin et al., 2013).
As DZ twinning is heritable and MZ twinning is random, the proportion of fraternal twins varies by population while the proportion of identical twins is relatively constant in different populations across the world (Bortolus et al., 1999). The information summarized above is used in behavioral genetics to build models which partition variance in a range of outcomes, including education, into components which depict environmental and genetic influences. The next section describes the typical logic of such models and reasons for their popularity.

1.2. Classical twin study designs

Behavioral geneticists routinely employ three types of comparisons (Plomin et al., 2013). First, they compare identical twins who grow up in the same family with fraternal twins who share as much environment but only 50% of their segregating genes. Second, dizygotic twins are compared with adopted children who share environment but no genotype. Third, the most informative type of comparison involves identical twins who were separated at birth and thus differ entirely in the effects of environment but not genotype. Comparisons of correlations between particular traits of such twins enable teasing out all environmental influences, including the part attributable to shared environment.

A typical behavioral genetics study assesses the variance in behavior, or phenotype, and then partitions this variance into three components: heritability (h²), shared (common) environment (c²) and unique environment (e²). To obtain the estimates of these components behavioral genetics uses information about within-pair correlations for MZ and DZ twins (for details see: Falconer, 1981; Jinks and Fulker, 1970). In the simplest decomposition scenario it is assumed that similarities between MZ twins arise due to fully shared genotypes and environments, while dissimilarities are caused by unique environment.
The correlation for DZ twins is half of the correlation for MZ twins, as the former share on average about 50% of segregating genes. Therefore, in a simple model, heritability is equivalent to twice the difference between the MZ and the DZ correlations for the trait of interest. The contribution of shared environment is then found by subtracting the value of heritability from the correlation for identical twins. Finally, the remaining portion of the variance is attributed to unique environment (for details see Plomin et al., 2013). In practice these computations are conducted with advanced techniques which may take into account the interactions between all variance components as well as the measurement error that affects them. Sophisticated models can also utilize information about multiple family members or simultaneously consider a number of traits (Posthuma et al., 2003: 361). Regardless of estimation methods, all such models are conceptually based on within-pair trait correlation comparisons between the MZ and the DZ twins. In the interpretation of these models heritability refers to ‘‘the contribution of genetic differences to observed differences among individuals in a particular population at a particular time’’ (Plomin et al., 2013: 93). Shared environment involves all non-genetic influences that affect children from the same family in the same way (Asbury and Plomin, 2014). Nielsen (2006: 197) describes it as ‘‘background characteristics that stratification researchers presumably have in mind when they conceptualize mechanisms of social reproduction’’. They may include cultural possessions and tastes of parents, parental education, characteristics of within-family relationships, family wealth or income and much more. In short, all influences experienced in the same way by the children make up shared environment. In contrast, unique environment involves all experiences that affect each twin in a unique manner, as well as measurement error (Nielsen, 2006). 
They may involve illness, injuries, peer relationships or anything that influences one and not the other sibling. This includes perceptions and emotional reactions specific to each child.
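The simple decomposition described earlier in this section (heritability as twice the difference between the MZ and DZ correlations, shared environment as the MZ correlation minus heritability, and unique environment as the remainder) can be sketched as follows. This is only an illustration of that textbook arithmetic, not the estimation procedure used in this paper; the function name and the two input correlations are hypothetical.

```python
def simple_ace_decomposition(r_mz, r_dz):
    """Simple variance decomposition from within-pair correlations.

    r_mz, r_dz: within-pair correlations of the trait for MZ and DZ twins.
    Returns the shares of variance attributed to heritability (h2),
    shared environment (c2) and unique environment (e2).
    """
    h2 = 2.0 * (r_mz - r_dz)   # twice the MZ-DZ correlation difference
    c2 = r_mz - h2             # MZ correlation minus heritability
    e2 = 1.0 - h2 - c2         # remaining variance
    return {"h2": h2, "c2": c2, "e2": e2}


# Hypothetical correlations, chosen only to illustrate the arithmetic.
example = simple_ace_decomposition(r_mz=0.80, r_dz=0.50)
print(example)
```

With these illustrative inputs the three shares sum to one by construction; more advanced estimation techniques, as noted above, additionally handle measurement error and constraints that this arithmetic ignores.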


Overall, genetically informed designs seem to be the only method which enables a truly rigorous investigation of environmental factors as one twin is a comprehensive and exhaustive control variable for the other (Caspari, 1968: 48). Yet, most educational studies of achievement fail to acknowledge this research, which is summarized in the next section, even in their literature reviews.

1.2.1. Genetically sensitive studies of student achievement

Seldom considered by sociologists, genetically informed designs are increasingly popular in health sciences, psychology and economics of education (Ashenfelter and Krueger, 1994; Bonjour et al., 2003; Bouchard and McGue, 2003; Loehlin and Nichols, 1976; Plomin et al., 2013). Twin data are occasionally used in research on student achievement but few studies address the respective roles of school and family environments in student outcomes, which is frequently attributed to lack of suitable data (Husén, 1959; Thompson et al., 1991). Table 1 summarizes examples of studies from the last fifty years which utilized twin data and focused on educational achievement, understood as test scores or school grades. Overall, they tend to report that heritability explains much variation in the educational outcomes of students. The median estimate of variation in educational outcomes attributable to heritability in Table 1 is 55% with estimates around it varying between 30% and 90%. In contrast, the median estimate of shared environment is less than half of that, i.e. 22%, with the range of values in various studies between 0% and 60%, which is most likely due to different outcome variables of interest and ages of respondents. Apart from that, the summary in Table 1 suggests that heritability has a stronger influence on science and mathematics than on reading and humanities outcomes. Furthermore, it indicates that the impact of shared environment on educational outcomes of students decreases with age in favor of heritability. Shared environment is most consequential in primary school and later matters less. While this may seem counterintuitive, it is most likely attributable to egalitarian practices of modern schooling (Shakeshaft et al., 2013).
The results of twin studies, which focus on academic achievement, resemble the results of studies devoted to cognitive ability and other psychological traits (Plomin et al., 2013: 186–229). They indicate that all human outcomes of interest to social sciences are largely heritable.

With regard to more specific results, the few genetically informed studies of educational achievement which aim to understand if particular classroom or school environments affect students’ progress find a relatively small impact. Byrne et al. (2010) estimate that the maximum variance attributable to learning with a specific group of students and their teacher is 8% for literacy in kindergarten, Year 1, and Year 2. Similarly, Taylor et al. (2010) conclude that reading achievement of Years 1 and 2 students is influenced by genetics more than by family, school or students’ individual experiences. Likewise, Kovas et al. (2007) find no differences with regard to the impact of shared environment on reading, mathematics, and science achievement between 7, 9 and 10-year-olds who learn in the same or different classrooms. Overall, these studies suggest a need for reframing the common thinking which often conceptualizes classroom environments, i.e. teacher and peer influences, as crucial determinants of student learning differentials at all stages of education (Hattie, 2013). Relative to heritability, shared classroom experience matters less than is usually assumed. It is, therefore, more correct to attribute shared environment effects shown in Table 1 to family influences, at least for children in early stages of schooling.

The review in Table 1 suggests that more research involving a larger number of countries and a wider age range of students is needed to understand how genetically informed designs enhance the knowledge of environmental impact on student outcomes at various stages of their education.
However, few prior studies were able to partition environmental effects into family, classroom or school components. Moreover, such studies were often based on limited samples and were restricted to very young students and low-stakes tests. In contrast, this paper showcases an approach to undertaking genetically sensitive studies of educational achievement, in which learning gains of older students and high-stakes tests are of interest.

1.2.2. Goals of this paper

In this analysis we set out to achieve four goals:

1. We demonstrate how data from a national exam register can be used in research designs which emulate the logic of classical twin studies. Such studies enable partitioning of variance in student achievement into the effects of heritability, shared environment and unique environment.
2. We compare the results of this analysis with conclusions of previous twin research and discuss how they can be reconciled with the conclusion that family background, peer groups and learning environments, including teachers, have a non-trivial influence on student educational success.
3. We quantify the extent to which the progress made by adolescents during a 3-year learning period might be described in terms of heritability and environmental factors.
4. We show how much variance in learning gains of adolescents can be attributed to different learning environments which comprise peer groups and teachers. We refer to them as ‘classroom effects’.


Table 1. Genetically informed studies of school achievement: examples.

| (Author(s), year) country | Dependent variable | Sample | Results |
|---|---|---|---|
| (Husén, 1959) Sweden | End of school year results in school subjects | MZ and DZ twins aged 13 | Heritability 30–60%, depending on school subject |
| (Loehlin and Nichols, 1976) USA | Test scores in school subjects | 850 pairs of MZ and DZ twins aged 16–17 | Heritability 30–40%, depending on school subject |
| (Thompson et al., 1991) USA | Standardized test scores | 850 pairs of MZ and DZ twins aged 6–12 | Heritability 30%; shared environment: 60% |
| (Alarcón et al., 2000) USA | Standardized test scores | 1125 pairs of MZ and DZ twins aged 6–12 | Heritability of mathematics performance: 90% and 80% of general cognitive ability |
| (Bartels et al., 2002) Netherlands | Standardized test scores | 306 pairs of MZ and DZ twins aged 12 | Heritability: 57%; shared environment: 27% |
| (Oliver et al., 2007) UK | Teacher ratings of student achievement in mathematics and English | 3296 pairs of same sex MZ and DZ twins aged 7 | Heritability: 66–70%; shared environment: 6–7%, depending on the sample and subject |
| (Wainwright et al., 2005) Australia | Standardized test scores | 582 pairs of MZ and DZ twins aged 15–18 | Heritability: 76%; shared environment: 14% |
| (Nielsen, 2006) USA | Grade Point Average (GPA) | 460 pairs of MZ and DZ twins, 242 pairs of siblings and 105 pairs of cousins; grades 7 through 12 | Heritability: 67%; shared environment: 0.2% |
| (Kovas et al., 2007) UK | Web-based test scores in mathematics | 2674 pairs of low performing MZ and DZ twins aged 7 | Heritability: 30–60% depending on type of mathematics |
| (Haworth et al., 2008) UK | Teacher ratings of student scientific achievement | 2602 pairs of MZ and DZ twins aged 9 | Heritability over 62%; shared environment: 14% |
| (Hart et al., 2009) USA | Standardized test scores in reading and mathematics | 314 pairs of MZ and DZ twins aged 6 | Heritability: 0–63%; shared environment: 15–52%, depending on type of test |
| (Taylor et al., 2010) USA | Standardized reading test scores (with gains in reading) | 806 pairs of MZ and DZ twins, elementary school ages | Heritability: 47%; shared environment: 37% |
| (Byrne et al., 2010) Australia, USA | Literacy skills and gains in literacy skills | 711 pairs of MZ and DZ twins in kindergarten, Years 1 and 2 | Classroom effects: 8% |
| (Calvin et al., 2012) UK & Netherlands | Standardized test scores in English, mathematics and science (UK); arithmetic and Dutch (Netherlands) | Population based; unknown zygosity; primary school children aged 8–12 | Heritability: 36% (UK), 74% (Netherlands); shared environment: 29% (UK), 35% (Netherlands) |
| (Shakeshaft et al., 2013) UK | UK-wide examination, General Certificate of Secondary Education (GCSE) | 5746 pairs of MZ and DZ twins aged 16 | Heritability of overall GCSE scores 58%; English 52%; mathematics 55%; science 58%; shared environment: 22–36%, depending on type of test |


Our first objective requires a description of the data employed, which we undertake in the next section. We also briefly comment on the key features of the Polish education system and the role of standardized exams in post-primary educational careers of Polish students.

2. Data

In this paper, we use the administrative data from the Polish Government’s Central Examination Board which administers compulsory national exams. Following the educational reform of 2001, all Polish students sit exit exams at the end of primary and lower secondary school (i.e. Year 6 and Year 9 or at ages 13 and 16). Year 9 exams have a deciding impact on students’ placements at the next stage of education which may lead to university entry. Pre-tertiary education in Poland comprises three levels, in line with most European countries. Upon completing 6 years of primary schooling, students undertake 3 years of lower secondary schooling and then enter stratified upper secondary education which comprises academic as well as vocational streams.

Exams in Poland are taken by all students except for those affected by severe mental disabilities. The first exit exam, comprehensive in its coverage, is administered at the end of Year 6. It can influence the chances of entry to out-of-area schools in Year 7. In Year 9 students sit two exams, one covering topics in languages, culture and history and the other focused on mathematics and sciences. The results of these exams are critical for student placement in upper secondary programs which is often fiercely competitive, particularly in highly reputed schools in urban areas.

We assume, following procedures described below in more detail, that students with the same surname, place and date of birth are twins. Before commencing our analyses, we had to exclude some students, which reduced the number of potential twins to 48,870.
First of all, we excluded individuals in classes which had fewer than 10 students, because in Poland a small class size indicates that schooling occurs in hospitals, correctional centers or other special settings. Second, when triplets occurred we randomly deleted one sibling from the set. Finally, we deleted all pairs of twins where each sibling attended a different primary school. Most primary school students in Poland attend schools within their residential area as by law each student is assigned to their local school. It is possible to attend some other school but this requires a formal and legitimate parental request as well as the consent of an out-of-area school. Such arrangements are more common in secondary schools in large cities (Dolata, 2008). Therefore, apparent twins who attend different primary schools may actually not be twins. This is why we decided to drop such cases rather than risk introducing bias. This left us with 24,285 pairs of twins, i.e. 48,570 individuals. It is reassuring that even with these restrictions our data closely resemble the Central Statistical Office of Poland’s estimates of twin births within the corresponding cohorts (Table 2). Even though we have about 15% fewer exam-taking twins than the reported proportions of twin births would suggest, given that 90% of the relevant cohort sit each exam, such discrepancies are only to be expected (Martin and Martin, 1975). They are most likely attributable to post-birth separations of twins, changes of names, or, in small numbers, to deaths, as twins are more likely to die in early stages of life (Evans and Martin, 2008). It is also noteworthy that in that time period, shortly before and just after the accession of Poland to the European Union, many families with children emigrated in search of better employment conditions. The Central Statistical Office of Poland (CSO) reported that in 2010 almost 2 million Polish citizens, registered as permanent residents, were temporarily living abroad.
About 11% of those emigrants were children aged 14 and younger, of whom 75% were born in Poland (CSO, 2012). Apart from this, the key question for us is to know how often we misclassified students as twins, when, in fact, they were not twins at all. To estimate this we conducted a validity check. Following the exams in 2011, we contacted 100 randomly chosen lower-secondary schools from our twin dataset and asked over the phone whether the students we classified as twins

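Table 2. Seven cohorts of twins and students sitting exams and twin birth numbers reported by the Central Statistical Office of Poland. Source: Own data and (CEB, 2002–2011).

The columns "Total live births" and "Twin births" come from the Central Statistical Office of Poland data; the columns "Students taking exam" and "Twins taking exam" come from the exam data.

| Year of birth | Primary school exam year (Year 6 students) | Lower secondary exam year (Year 9 students) | Total live births | Twin births: N | Twin births: % of live births | Students taking exam: N | Students taking exam: % | Twins taking exam: N | Twins taking exam: % |
|---|---|---|---|---|---|---|---|---|---|
| 1989 | 2002 | 2005 | 562,530 | 10,500 | 1.87 | 512,929 | 91 | 7272 | 1.42 |
| 1990 | 2003 | 2006 | 545,817 | 10,532 | 1.93 | 506,649 | 93 | 7528 | 1.49 |
| 1991 | 2004 | 2007 | 545,954 | 11,149 | 2.04 | 490,533 | 90 | 7174 | 1.46 |
| 1992 | 2005 | 2008 | 513,616 | 10,111 | 1.97 | 465,894 | 91 | 7110 | 1.53 |
| 1993 | 2006 | 2009 | 492,925 | 9889 | 2.01 | 447,544 | 91 | 6922 | 1.55 |
| 1994 | 2007 | 2010 | 481,285 | 9816 | 2.04 | 418,602 | 87 | 6346 | 1.52 |
| 1995 | 2008 | 2011 | 433,109 | 8468 | 1.96 | 375,491 | 87 | 6218 | 1.66 |
| Total | | | 3,575,236 | 70,465 | 1.97 | 3,217,642 | 90 | 48,570 | 1.51 |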


Fig. 1. Mixture ACE model with categorical indicators of latent variables.

in each school were really twins. This mini-study revealed that 7.3% of twin pairs had been misclassified. We utilized this information to improve the accuracy of our estimations.

2.1. Are exam data a good source of information for genetically informed designs?

Relative to twin surveys, the key benefit of administrative data is that they are affected less by homogeneity bias (i.e. underestimation of variance caused by the selection of similar respondents) as almost all students sit exams (Table 2). The exam participation rate is around 90% in our data, while response rates in high quality educational achievement surveys approach at best 70% (Wuttke, 2007). In the context of educational studies another considerable advantage of administrative exam data is the elimination of the low-motivation effect. Achievement tests like PISA, PIRLS and TIMSS² are “low-stakes” because students’ academic outcomes are unaffected by their performance. In contrast, exam scores have tangible consequences for students, since they determine, at least in Poland, entry to desired schools and programs at the next educational level. These exams have real implications for students and as such are meaningful to them. While they might not be the best measure of ability, they stratify educational progression of students and, in this sense, generate consequential educational inequalities.

However, administrative data have no information on the zygosity of twins, are sparse with regard to the traditional social science variables and are not always designed to facilitate cohort comparisons. So the task of utilizing such data in educational analyses poses challenges. Missing data are less of a problem than in surveys, but even in administrative collections some information is missing. In our case, while the data are complete for Year 9 exams, we are missing about 5% of student data for the Year 6 exam.
Fortunately, it is plausible to assume in this instance that data are missing completely at random as missingness arises from mismatches of exam identification numbers rather than being associated with student characteristics.

3. Method

Our research questions call for two analytical techniques. First, we utilize a modification of a classical ACE variance components model. Then, to compare the learning gains of twins who have always shared the same classroom environment with those who have not, we utilize residual scores from an ordinary least squares regression.

3.1. Mixture ACE model

Our dataset includes identifiers of twins, but there is no information about their zygosity. Therefore, we extend, as shown in Fig. 1, a classical ACE variance components model (Silventoinen et al., 2009). In contrast to the classical model discussed earlier, in our model zygosity is estimated rather than given. Similar models previously used with simulated and real twin data turned out to be highly reliable (Benyamin et al., 2005, 2006; Calvin et al., 2012). Moreover, in Appendix A we show that such models produce results closely resembling those obtained when zygosity is known. We extend the standard ACE model by utilizing all information available in our data to generate robust estimates of zygosity which are necessary to compute the effects of heritability on student achievement. Our extension of the ACE model allows for predicting posterior probabilities of being an MZ twin which are used to compare the educational outcomes of identical and fraternal twins.

² PISA is the OECD’s Programme for International Student Assessment of 15 year olds. PIRLS is the Progress in International Reading Literacy Study while TIMSS is the Trends in International Mathematics and Science Study involving Year 4 and 8 students from many countries.


Table 3. Joint prior distribution of MZ, DZ and misclassified twin class membership.

| | Monozygotic twins | Dizygotic twins | Misclassified twins | Total |
|---|---|---|---|---|
| Same sex pairs | .191 | .418 | .048 | .657 |
| Opposite sex pairs | .00 | .318 | .025 | .343 |
| Total | .19 | .736 | .073 | 1 |
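The joint prior in Table 3 is converted into class probabilities that are conditional on whether a pair is same-sex (SS) or opposite-sex (OS). A minimal sketch of that conversion is given below; it only divides each row of Table 3 by its row total, and the dictionary layout and function name are illustrative choices rather than part of the estimation syntax.

```python
# Joint prior from Table 3: rows are pair types, columns are latent classes.
JOINT_PRIOR = {
    "SS": {"MZ": 0.191, "DZ": 0.418, "misclassified": 0.048},
    "OS": {"MZ": 0.000, "DZ": 0.318, "misclassified": 0.025},
}


def conditional_prior(joint, pair_type):
    """Class probabilities given the sex composition of a pair."""
    row = joint[pair_type]
    row_total = sum(row.values())  # 0.657 for SS pairs, 0.343 for OS pairs
    return {cls: p / row_total for cls, p in row.items()}


ss_prior = conditional_prior(JOINT_PRIOR, "SS")
os_prior = conditional_prior(JOINT_PRIOR, "OS")
print(round(ss_prior["MZ"], 3))  # conditional MZ prior for same-sex pairs
print(os_prior["MZ"])            # opposite-sex pairs cannot be MZ
```

The same-sex value reproduces the 0.191/0.657 ratio discussed in the text, and the opposite-sex MZ probability is zero by construction.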

Fig. 1 presents our extension of the ACE model, hereafter ‘a mixture ACE model’, in a generalized structural equations framework. The circles represent latent variables while rectangles are the observed variables. Double arrows denote correlations and single arrows represent causation. P1 and P2 stand for the phenotypes (i.e. composites of student observable characteristics or traits) of the two twins in each pair. We assume that a phenotype is fully identified by student achievement data, i.e. exam results. Latent factors are standardized to have a variance of one and represent the additive influences of the genetic (A), shared (C) and unique (E) factors on the observed phenotype (P) for each twin. This part of Fig. 1 may be written as an equation for twin i in pair j as:

$$P_{ij} = aA^{p}_{ij} + cC^{p}_{ij} + eE^{p}_{ij} \qquad (1)$$

In the ACE model, MZ twins are assumed to share all environmental influences and all genetic influences, while DZ twins share only some of these effects. More precisely, the model assumes that the genetic correlation within twin pairs is $\mathrm{Cor}(A^{DZ}_{1j}, A^{DZ}_{2j}) = 0.5$ for dizygotic twins and $\mathrm{Cor}(A^{MZ}_{1j}, A^{MZ}_{2j}) = 1$ for monozygotic twins. Shared environmental factors are assumed to be independent of zygosity and equal within each pair of twins. The unique environmental variation is specific to each twin. These restrictions follow the genetic theory (Plomin et al., 2013) and enable the identification of the model. They are shown in Fig. 1 as double arrows linking latent factors. Apart from the elements which enable the estimation rather than direct inputting of information about zygosity, this model exactly matches the logic described at the opening of this paper.

When zygosity is known, it is sufficient to estimate a two-class model (where MZ twins are Class 1 and DZ twins are Class 2). When zygosity is unknown, the classical ACE model needs to be re-parameterized as a latent class model with unknown membership (shown at the bottom of Fig. 1). In our mixture ACE model the probability of belonging to a particular category of twins ($Pr_{MZ}$ or $Pr_{DZ}$) was estimated along with all parameters defined in the classical ACE model via the maximum likelihood algorithm within the latent class analysis framework (Lubke and Muthén, 2007; Lubke and Neale, 2008; Muthén, 2004). As estimations of latent classes might become unstable and are sensitive to misclassification of twins, we facilitated them with additional information for twin pairs. We entered the information on prior probabilities (Hosmer, 1973; Neale, 2003: 234) of being an MZ twin, a DZ twin or not a twin by specifying three latent classes. In our model we employed prior probabilities for MZ twins from the Danish census data (Skytthe et al., 2002). As we have no Polish data on twinning, we use the information from Denmark.
It is the best available source for our purposes because Danish data come from similar cohorts (births between 1989 and 2000) and they provide the whole population information from another European country. Using these data we established that the MZ twinning occurred at the rate of 0.36% and that 56.8% of DZ twins were same-sex siblings. These two pieces of information were fed into our estimations. All the other information came from the Polish dataset. Our telephone validity check revealed that 7.3% of twin pairs had been misclassified. Using this information together with the information about gender and the distribution of MZ twins in Denmark, we produced the prior joint distribution of class membership shown in Table 3.

Prior probabilities in Table 3 are entered into the estimation as three variables denoting conditional probabilities of belonging to one of the three latent classes. The values of these variables differ between same-sex (SS) and opposite-sex (OS) pairs but are the same for all SS pairs. For instance, the probability of being an MZ twin equals zero for all OS pairs but is defined as 0.291 for all SS twin pairs (note that 0.291 = 0.191/0.657 in Table 3; this is the conditional probability value given that a pair is SS). These probabilities are used to weight the likelihood function (see Neale, 2003 for details), which allows for more precise identification of latent classes. This approach, as noted by Neale (2003: 237), is most effective in samples which include some pairs with accurate zygosity information. In our data the only pairs with accurate zygosity information are OS twins, as by definition, they cannot be MZ.

The main assumption behind mixture models is that the observed phenotypes of twins are bivariate and normally distributed. The simulations presented by Benyamin et al. (2006) show that where normality assumptions are satisfied and sample sizes large (2000 pairs or more), the results furnished by mixture models give unbiased estimates of heritability.
The analyses of real data indicate, however, that biases occur when pair differences have a leptokurtic distribution. We graphically inspected our data and were able to conclude that they were not leptokurtic but close to the normal distribution. This reflects also the design of achievement tests in which the goal was to capture a broad range of distributions, in other words, to map out the learning level of every member of the student population. These tests involve no passing thresholds and they are not designed for screening purposes. Moreover, the scaling procedures were anchored in a three-parameter item response theory (IRT) model (Muraki and Bock, 1997), which smoothes out most


deviations from the normal distribution, provided that the scaled test scores are roughly normally distributed, which was the case in our data. Additionally, we conducted a model reliability test, presented in Appendix A to this paper, using three different publicly available twin datasets with known zygosity. The test shows that a classical ACE model and the mixture model utilized here produce nearly identical results.

Finally, we performed robustness checks to see how sensitive our results are to moderate changes in the rates of twinning, which we derived from the Danish data. We examined three scenarios: first changing the MZ twinning rate from 0.36% to 0.40%, then changing the unequal rates of SS and OS twins to equal rates and, finally, altering both settings simultaneously. We concluded that the results are robust. The largest difference in ACE variance components across scenarios did not exceed 1%. To conserve space, robustness estimations are not reported in detail but are available upon request.

Summing up, there are two main differences between the model employed here and those used in earlier studies which relied on population data without zygosity information (Benyamin et al., 2005; Shakeshaft et al., 2013). Firstly, we estimate the proportions of MZ and DZ twins as latent classes using prior information rather than fixing the proportion of MZ to some constant (e.g. Benyamin et al., 2005: 527). Secondly, we allow for a third class of students erroneously classified as twins. The full Mplus syntax for our model is in Appendix B. Our approach is most suitable for data in which the proportion of MZ twins is not known, there is no precise information on prior distributions and misclassification is likely to occur. However, this strategy has a drawback as it makes estimating gender-specific effects, or obtaining estimates solely from SS twins, problematic.
Some researchers (Benyamin et al., 2005, 2006) showed that excluding opposite-sex twin pairs and relying on data from same-sex twins led to more robust results when gender differences existed. In our case, using information from one gender or using data from just SS twins generates a substantial loss of information needed for latent class identification (Table 3). This, in turn, leads to instability of the estimates. Therefore, we forgo the potential benefit of excluding data from OS twins in the interest of achieving more precise class identification. In doing so we follow others, e.g. Shakeshaft et al. (2013: 4), whose final estimates were based on the whole sample rather than SS twins only. Our additional rationale is that the bias in ACE estimates due to sex-limited effects, which might occur when data from SS and OS twins are used together, has never been detected in the educational studies listed in Table 1.

3.2. Student learning gains

Exams taken by Year 6 and Year 9 students in Poland are not designed to produce a single developmental scale across year levels. We have data from one general test in Year 6 and two domain-specific tests in Year 9. Test results are not expressed on the same scale and are not directly comparable, so the analytical approach we adopt cannot rely on the assumptions of direct correspondence between either measurement scales or test domains. Therefore, we first estimated student scores using a three-parameter item response theory (IRT) model (Muraki and Bock, 1997) and standardized these estimates on a scale with a mean of 100 and a standard deviation of 15. Despite this, student learning gains could not be expressed as a simple difference between measurements at two points in time because such standardization does not account for the variation and nonlinearity in individual growth trajectories. It is preferable in such situations to rely on regression models to generate estimates of relative learning gains.
The detailed rationale for this strategy has been provided by McCaffrey et al. (2004) and applications of such models to Polish examination data are illustrated in Jakubowski (2008) and the OECD (2008). Moreover, extensive validity checks of this approach have been done in Poland by Dolata et al. (2013). In our OLS models first mathematics test scores and then humanities scores were regressed on Year 6 test scores, separately for each cohort listed in Table 2, controlling for gender as well as square and cubic terms for the Year 6 test. These additional predictors were included to ensure that relative gain estimates are not biased because of nonlinearities in test scores or different trajectories of ability growth among boys and girls. Next, residual scores were computed for each model and cohort. Following prior twin studies in education, we interpret these residual scores as relative learning gains (Taylor et al., 2010). A positive value of a residual score denotes a situation where a Year 9 student achieved above the average for all students who had the same Year 6 result. A negative value of a residual score means that a Year 9 student made less progress than the average gain made by students who had the same Year 6 result.

In both methods of estimation discussed above, one important component of unique environment, which may affect any estimates related to educational achievement of students, is the measurement error typical for test data. This might be a substantial problem for short tests or measures based on teachers' observations. However, our data are based on long and extensive tests with an estimated reliability (Cronbach's alpha) exceeding 0.9, as reported by the Polish Central Examination Board (2002–2011). Therefore, our results are unlikely to be biased by the sparsity of test items.

4. Results

Before discussing the influence of heritability on student achievement and learning gains and contrasting these outcomes between pairs of twins who did and did not share their classroom environments in secondary school, we consider the estimated proportions of twins in our sample (Table 4).
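The residual-gain procedure described in Section 3.2 can be sketched in Python for a single cohort. This is a hedged illustration on simulated data, not the authors' estimation code. The function name `relative_gains`, the simulated scores and the use of `numpy.linalg.lstsq` are assumptions made for the example.

```python
import numpy as np

def relative_gains(year9, year6, female):
    """Residuals from an OLS regression of Year 9 scores on Year 6 scores
    (linear, squared and cubic terms) and gender, for one cohort."""
    year9 = np.asarray(year9, dtype=float)
    year6 = np.asarray(year6, dtype=float)
    # Standardizing Year 6 scores keeps the polynomial terms numerically
    # stable and leaves the residuals unchanged.
    z = (year6 - year6.mean()) / year6.std()
    design = np.column_stack(
        [np.ones_like(z), z, z**2, z**3, np.asarray(female, dtype=float)]
    )
    beta, *_ = np.linalg.lstsq(design, year9, rcond=None)
    return year9 - design @ beta

# Simulated cohort (hypothetical numbers, scale with mean 100 and SD 15).
rng = np.random.default_rng(0)
n = 1000
year6 = rng.normal(100.0, 15.0, n)
female = rng.integers(0, 2, n)
year9 = 100.0 + 0.8 * (year6 - 100.0) + 2.0 * female + rng.normal(0.0, 9.0, n)

gains = relative_gains(year9, year6, female)
print(round(gains.mean(), 6), round(np.corrcoef(gains, year6)[0, 1], 6))
```

By construction the residuals average zero and are uncorrelated with the Year 6 score, so a positive value marks a student who did better in Year 9 than peers with the same starting point.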


Table 4
Estimated proportions of identical, fraternal and misclassified twins: latent class posterior probabilities (N = 24,285).

| Test results | Identical twins (%) | Fraternal twins (%) | Misclassified twins (%) |
|---|---|---|---|
| All learning domains (Year 6) | 20.7 | 72.0 | 7.3 |
| Mathematics (Year 9) | 20.5 | 72.2 | 7.3 |
| Humanities (Year 9) | 21.1 | 71.7 | 7.2 |
| Gain in mathematics (Years 6–9) | 20.4 | 72.2 | 7.4 |
| Gain in humanities (Years 6–9) | 20.6 | 72.1 | 7.3 |

The estimated proportions of identical, fraternal or misclassified twins are presented in Table 4 for all tests as well as learning gains in mathematics and humanities. It is important to note that these estimates could potentially vary for each outcome, i.e. test domain, test timing or learning gain, but in our data, all estimates are almost identical for all five outcomes. They are robust and they closely correspond to prior probabilities reported in Table 3, which enhances the credibility of our results.

4.1. How strongly does heritability influence student achievement and learning gains?

We commence our analyses by contrasting the within-pair correlations of Year 6 exam scores for identical and fraternal twins with the corresponding estimates for non-twins (Table 5). Identical twins perform in a strikingly similar manner in this primary school exit exam which combines questions in mathematics, science, reading and humanities. The average within-pair correlation for identical twins is 0.88 (Table 5). The results of Year 9 domain-specific exams of identical twins are even more highly correlated, at 0.91 (Table 5). Bearing in mind that the reliability of these tests, reported by the Central Examination Board, stands at 0.90, our correlations suggest that when two MZ twins take a test, it is as if one and the same person sat the test twice. The learning gains of identical twins are correlated at 0.66 in both mathematics and humanities (Table 5). These coefficients should be interpreted as denoting a strong relationship, given that they have been affected by measurement error from two sources, i.e. tests taken in Years 6 and 9. The significance of these correlations comes into focus when we compare identical twins with fraternal twins in Table 5. The correlations between fraternal twins are substantially lower than those obtained for identical twins. Exam results for misclassified twins have the lowest within-pair correlations.
This pattern holds for all test scores and student learning gains and is, moreover, in accordance with typical results of studies on twins in other countries (Plomin et al., 2013 and see also Table 1). Genetic theory expects the highest correlations for identical twins who share all of their genes, their home environment and school environment including classroom experiences, teachers and peer groups. As fraternal twins share the same environment but only about 50% of genes on average, it follows that the correlations denoting school achievement within pairs of dizygotic twins should be lower, regardless of how achievement is operationalized. The achievement of non-twin students who might share a classroom or a school should be even less related. These are exactly the patterns we find in Table 5 for seven different cohorts of Polish adolescents.

The same results presented in the variance components format in Table 6 reveal a strong influence of heritability on test scores and learning gains, with estimates ranging between 57% and 66%. The shared environment accounts for between 24% and 34% of the variation in exam results. These estimates correspond to the ones reported in Table 1 for studies of younger children in mostly English-speaking countries. However, our results do not perfectly align with the typical patterns found in these studies. In our data the influence of shared environment on the overall achievement of students seems to increase between Years 6 and 9 (from 24% to 34% for mathematics and to 31% for humanities in Table 6). This contrasts with most twin studies related to cognitive abilities (see Asbury and Plomin, 2014 for discussion) and also with the overall picture emerging from Table 1, which suggests that the impact of shared environment usually decreases with age. Two issues are pertinent to the interpretation of this result. First, not all twin studies report a fall in the influence of shared environment as students get older. Haworth et al.
(2008) found that the role of shared environment increased

Table 5
Correlations between test scores of students: identical and fraternal twins versus non twins. Estimates are based on latent ACE mixture model (N = 24,285).

| Test results | Identical twins: Corr. | Identical twins: S.E. | Fraternal twins: Corr. | Fraternal twins: S.E. | Non twins: Corr. | Non twins: S.E. |
|---|---|---|---|---|---|---|
| All learning domains (Year 6) | 0.88 | (0.01) | 0.56 | (0.01) | 0.12 | (0.05) |
| Mathematics (Year 9) | 0.91 | (0.00) | 0.63 | (0.01) | 0.06 | (0.04) |
| Humanities (Year 9) | 0.91 | (0.00) | 0.61 | (0.01) | 0.07 | (0.04) |
| Gain in mathematics (Years 6–9) | 0.66 | (0.01) | 0.33 | (0.01) | 0.00 | (0.00) |
| Gain in humanities (Years 6–9) | 0.66 | (0.01) | 0.33 | (0.01) | 0.00 | (0.00) |

Note: Corr. is correlation, S.E. is standard error.


Table 6
ACE components for variance of students' exam results (N = 24,285).

| Tests | Heritability (h2): Coeff. (%) | Heritability (h2): S.E. (%) | Shared environment (c2): Coeff. (%) | Shared environment (c2): S.E. (%) | Unique environment (e2): Coeff. (%) | Unique environment (e2): S.E. (%) | Total (%) |
|---|---|---|---|---|---|---|---|
| All learning domains (Year 6) | 64.0 | (1.9) | 24.4 | (1.6) | 11.6 | (0.5) | 100 |
| Mathematics (Year 9) | 57.0 | (1.5) | 34.3 | (1.3) | 8.7 | (0.4) | 100 |
| Humanities (Year 9) | 60.6 | (1.5) | 30.5 | (1.3) | 8.9 | (0.3) | 100 |
| Gain in mathematics (Years 6–9) | 66.3 | (1.0) | 0.0 | (0.0) | 33.7 | (1.0) | 100 |
| Gain in humanities (Years 6–9) | 65.9 | (1.0) | 0.0 | (0.0) | 34.1 | (1.0) | 100 |

Note: Coeff. is coefficient, S.E. is standard error.
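The link between the within-pair correlations in Table 5 and the variance components in Table 6 can be checked approximately with Falconer's classical formulas: h2 = 2(rMZ − rDZ), c2 = 2rDZ − rMZ and e2 = 1 − rMZ. The sketch below is only a back-of-the-envelope illustration and not the maximum likelihood mixture estimation used in the paper. The function name `falconer` is hypothetical, and small departures from Table 6 are to be expected because the published correlations are rounded to two decimals.

```python
def falconer(r_mz, r_dz):
    """Approximate (h2, c2, e2) from MZ and DZ within-pair correlations."""
    h2 = 2.0 * (r_mz - r_dz)   # heritability
    c2 = 2.0 * r_dz - r_mz     # shared environment
    e2 = 1.0 - r_mz            # unique environment (includes measurement error)
    return h2, c2, e2

# (r_MZ, r_DZ) as reported in Table 5.
table5 = {
    "All learning domains (Year 6)": (0.88, 0.56),
    "Mathematics (Year 9)": (0.91, 0.63),
    "Humanities (Year 9)": (0.91, 0.61),
    "Gain in mathematics (Years 6-9)": (0.66, 0.33),
    "Gain in humanities (Years 6-9)": (0.66, 0.33),
}

estimates = {name: falconer(*corrs) for name, corrs in table5.items()}
for name, (h2, c2, e2) in estimates.items():
    print(f"{name}: h2={100 * h2:.0f}%  c2={100 * c2:.0f}%  e2={100 * e2:.0f}%")
```

For the Year 6 exam, for example, the formulas give roughly 64%, 24% and 12%, in line with the 64.0%, 24.4% and 11.6% reported in Table 6.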

between the ages of 8 and 12 in science education in the UK and attributed this finding to the differences in science curriculum content used for younger and older children. Similarly, Calvin et al. (2012: 704) found in their analysis of longitudinal data from the Netherlands that shared environment explained no variation in language and arithmetic scores of 8-year-olds, only to increase in importance with age. This was the case even though in the same sample of students the impact of shared environment decreased with age when the variation in IQ scores was considered. This suggests that the impact of shared environment does not necessarily fall with age in all populations and for all outcome variables. Moreover, it is worth noting that most previous research rarely considered the ages analyzed here but focused on the changes affecting younger children, usually between the ages of 8 and 12. Polish students are 13 years of age in Year 6 and turn 16 in Year 9 (Table 2). This life stage is connected with major changes in the biological and psychological constitution of individuals which also affect their relations with the social environment (Magnusson et al., 1985; Moffitt, 1993; Steinberg and Morris, 2001). Finally, a comparative genetically sensitive study found that the pattern of genetic and environmental influence varies between countries (Samuelsson et al., 2008), so it might be undesirable to assume that findings from Scandinavia and English-speaking countries are universally generalizable.

It may seem that if shared environment effects differ by roughly 10 percentage points between exams at two points in time (i.e. 24% and 34%), gain score estimates should also show this difference. Yet, Table 6 reports the contribution of shared environment to the learning gains as not statistically different from zero.
To understand this one must recall that gains capture only the part of shared environment impact that is specific to the change that occurs in the three-year period between these two measurements. This, by definition, excludes any previous educational experiences or anything that is correlated with them. So the null estimate for learning gains made between the ages of 13 and 16 implies that the changes in shared environment which occur during lower secondary schooling do not affect individual progress above and beyond the influence of shared environment already in operation at the beginning of this period. Arguably the estimates of learning gains are of more interest in this analysis than what is indicated by partitioning of variation in long-term effects, as gain estimates have been corrected for prior learning experiences and, therefore, are a more accurate depiction of what matters in lower secondary schooling. Our data suggest that, at this stage of education, individual differences in short-term progress made by Polish students are driven mostly by innate skills (heritability: 66%) and also by events which are either objectively unique (e.g. romantic relationships, friendships, personal interests) or shared by twins but affect each of them differently (34%).

4.2. How much more can we learn about shared environment? Estimates of classroom effects

What more can this approach contribute to the understanding of the role which school or classroom environments play in educational experiences of students in lower-secondary schooling? To consider this issue we isolate the effect of classroom environment, which comprises the influences of peers and teachers, from the overall impact of shared environment.
Our data do not involve twins who attended different schools at different stages of their educational careers but we can compare twin pairs who were in the same class in primary school but were separated in secondary school with pairs who shared their classroom environments at both stages. It must be borne in mind that lower secondary school students in Poland take particular subjects almost always with the same group of peers (i.e. their ‘‘class’’) while different classes are almost always taught by different sets of teachers. The instruction is subject-specific and lower-secondary teachers specialize in particular subjects. These circumstances facilitate undertaking a meaningful quasi-experimental comparison. In our sample most twins learned in the same class as their sibling in both primary and secondary stages (92.69% in Table 7). Ideally we would have liked to compare them to twins who learned in different classrooms in both primary and secondary school but we have only 214 pairs of such twins, which prevents reliable analysis. We can, however, compare 1452 pairs of twins who shared primary classrooms but not secondary classrooms with 22,510 pairs who were not separated. Splitting the sample into groups of twins with different histories of sharing classroom environment with their sibling enables obtaining more precise estimates of genetic and environmental variance components shown in Table 8. The


Table 7
Numbers and percentages of twin pairs learning in same or different classrooms in primary and secondary school.

| Secondary school classroom | Primary school classroom: Different | Primary school classroom: Same | Total |
|---|---|---|---|
| Different | 214 (0.88%) | 1452 (5.98%) | 1666 (6.86%) |
| Same | 109 (0.45%) | 22,510 (92.69%) | 22,619 (93.14%) |
| Total | 323 (1.33%) | 23,962 (98.67%) | 24,285 (100%) |

Table 8
ACE components for twins who were in the same class with their sibling or a different class than their sibling in secondary school (N = 22,510 same class; N = 1452 different class).

| Tests | Classroom | Heritability (h2): Coeff. (%) | Heritability (h2): S.E. (%) | Shared environment (c2): Coeff. (%) | Shared environment (c2): S.E. (%) | Unique environment (e2): Coeff. (%) | Unique environment (e2): S.E. (%) |
|---|---|---|---|---|---|---|---|
| All learning domains (Year 6) | Same | 65.1 | (1.6) | 23.4 | (1.4) | 11.5 | (0.5) |
| All learning domains (Year 6) | Different | 88.3 | (6.2) | 0.0 | (0.0) | 11.7 | (6.2) |
| Mathematics (Year 9) | Same | 58.0 | (1.4) | 33.7 | (1.3) | 8.3 | (0.4) |
| Mathematics (Year 9) | Different | 69.5 | (9.5) | 16.0 | (7.2) | 14.5 | (3.3) |
| Humanities (Year 9) | Same | 62.6 | (1.5) | 28.8 | (1.3) | 8.6 | (0.3) |
| Humanities (Year 9) | Different | 86.4 | (7.1) | 5.5 | (6.3) | 8.1 | (1.9) |
| Gain in mathematics (Years 6–9) | Same | 65.6 | (1.0) | 0.0 | (0.0) | 34.4 | (1.0) |
| Gain in mathematics (Years 6–9) | Different | 53.6 | (6.4) | 0.0 | (0.0) | 46.4 | (6.4) |
| Gain in humanities (Years 6–9) | Same | 65.0 | (1.0) | 0.0 | (0.0) | 35.0 | (1.0) |
| Gain in humanities (Years 6–9) | Different | 57.9 | (4.8) | 0.0 | (0.0) | 42.1 | (4.9) |

Note: Coeff. is coefficient, S.E. is standard error.

technical details are available upon request, but in principle, the estimation on the 24,285 pairs of twins is improved if one utilizes average correlations between randomly sampled pairs of students from the same classroom and from different classrooms. Such correlations from the data on entire student population, obtained after the split, were fed into our ACE mixture models as constraints which facilitated even more robust estimations. The results of this comparison are in Table 8 which resembles Table 6, with the difference that results are presented for two groups of students. The first are twins who shared the same classroom with their sibling throughout their education and the second are twins who did not. We focus here on shared environment whose impact is substantially smaller for twins with learning experiences of different classrooms. This suggests that, in the long run, classroom effects account for a large share of common environment effects. However, when we consider only learning gains made in early adolescence over three years, we note that shared environment does not influence them at all. One caveat potentially affecting these results is that if assignment of students to classrooms is not random we might have some undocumented processes blurring the picture here. The PISA 2006 data indicate that over 80% of Polish students in lower-secondary schools are placed based on residence in a particular area, while up to 20% are placed based either on academic results or a feeder school recommendation (OECD, 2007: 218). Where above-average students are routinely placed with similar students, i.e. selection based on academic achievement is in operation during school or classroom placement, test scores should not be used for assessing classroom effects because our analytical approach requires an assumption that the placement of students occurs randomly. 
Furthermore, it is possible that dizygotic twins who are separated in secondary school may share a lower proportion of their genome which would account for a greater achievement gap between them. This could enhance the chances of being placed in different classrooms. Yet, the estimates of learning gains, which are obtained after controlling for prior achievement, effectively control for academic selection, so our conclusions about gains are valid even if selectivity occurs. In this population of adolescents the effects of shared environment, including classroom environments, on learning gains are equal to zero for the 3-year period of lower-secondary schooling. This actually is a plausible result given the specific history of the Polish education system, which, following the Second World War was nationally standardized and based, until recently, nearly exclusively on government provision of all education in the absence of private schooling. For decades, educational delivery, based on nationally standardized curricula at all levels, was strongly oriented toward the advancement of students from lower socio-economic backgrounds in line with the communist ideology of egalitarianism. The legacy of this is reflected in the 2006 Polish PISA data which show that 97% of adolescents in Year 9 had their own desk, 94% had their own study place, 94% had access to a computer and 97% owned their textbooks (OECD, 2007). While the contribution of shared


environment to earlier educational experiences of Polish students is likely to be more substantial, by the time students reach adolescence, changes in their classroom environments seem to make little difference to the short-term learning gains that may occur at this stage of their lives.

5. Summary and discussion

We have demonstrated in this paper how national examinations data can be used in research designs which emulate the logic of classical twin studies by partitioning variance in student achievement into the effects of heritability, shared environment and unique environment. We have also shown how classroom effects may be isolated in such designs. In line with other studies originating from the UK, Australia, the USA and Scandinavia, we have found that heritability accounts for between 54% and 88% of variance in adolescent exam results and that shared and unique environments make moderate contributions, accounting for between 0% and 34% and between 8% and 15% of variance, respectively. Heritability is also the strongest determinant of learning gains made by Polish students between Years 6 and 9, regardless of student history of sharing a classroom environment with their sibling.

Nevertheless, some of our findings do not align with typical patterns in previous studies which reported that the impact of heritability increased with age as the importance of shared environment diminished. We found, as did two prior studies (Haworth et al., 2008; Calvin et al., 2012), stronger contributions of shared environment in Year 9 tests than in Year 6. The analysis of learning gains, which relied on OLS estimations of above-average progress made by each student relative to peers, suggested that shared environment actually has no impact at all in the specific three-year period between Years 6 and 9. Similarly, we found that learning gains made by students at this stage of education were entirely independent of their classroom environments, which comprise specific sets of peers and teachers. This corresponds to recent Polish research, based on non-genetic methodologies, which concluded that only 5% of variance in achievement in mathematics and 4% in humanities could be attributed to classroom effects (Koniewski, 2014).
Our results suggest that these two estimates, which ignore heritability and respondents' self-selection, are markedly inflated (McCaffrey et al., 2004; Kalogrides et al., 2013).

While our overall results underscore the importance of heritability and match well the findings of prior twin studies, it is important to bear in mind that the discrepancies we observe may be due to different ages of students in our study or to our focus on Poland. Educational outcomes of Polish students have never been researched with a genetically informed approach before and the pattern of genetic and environmental influence varies between countries (Samuelsson et al., 2008). As more such studies emerge from other countries it may turn out that the diminishing impact of shared environment on student outcomes is not necessarily universal, i.e. operating for all academic domains and in all stages of student development. It is unlikely that our findings are a by-product of the method we use, because this method produced, in three different datasets in Appendix A, results which are nearly identical to those obtained with traditional ACE models with known zygosity.

The findings of genetically informed studies in general, including mixture ACE model estimations such as ours, are not automatically in opposition to the findings of educational research that ignores heritability. While this research asserts that the climate of classrooms and peer influences are important (Hattie, 2013: 107), this does not necessarily mean that their importance is the same at all stages of student learning. It does not mean either that these effects trump heritable characteristics of students, which are most likely, in egalitarian systems, the primary causes of student outcome differentiation (Asbury and Plomin, 2014).
Likewise, the conviction that teacher quality matters for learning in early adolescence is not equivalent to the assumption that teacher impact exceeds in importance the influence of heritable factors or student-specific events. Rather these two bodies of knowledge have the potential to mutually complement each other. This, however, requires a broader dialog about the logic and implications of educational research on heritability. Heritability has long been misunderstood by most education researchers working outside of genetics (Caspari, 1968). In the context of behavioral analysis heritability means no more than ‘‘the proportion of variance which is due to difference in genetic constitution’’ (Caspari, 1968: 48). Where heritability of student educational achievement equals 100% it is possible to argue that shared and unique environments do not matter at all. However, it is more helpful, particularly in light of research that finds higher contributions of heritability in developed countries (Asbury and Plomin, 2014) to see heritability more as an index of equality. Arguably, in a perfectly meritocratic society where all children have opportunities to develop their full genetic potential, a high impact of heritability is only to be expected. Research on heritability, including our study, is subject to limitations. It is criticized for neglecting the interactions between genetic and environmental effects, which, although substantially important (Asbury and Plomin, 2014), so far have been empirically shown to account for no more than 6% of variance in most outcomes of interest (Plomin et al., 2013). Another debate considers the naivety of the assumption, which underpins ACE modeling, that MZ and DZ twins are raised in identical environments, i.e. they are not treated differently. 
Evidence shows, however, that even if twins are treated differently, this has little impact on the heterogeneity of their educational outcomes (Felson, 2014; Conley et al., 2013; Loehlin and Nichols, 1976).


Our study is vulnerable to such critiques, as are all twin research studies. However, our approach makes genetically sensitive research possible in situations where twin survey and twin register data are not available, which must be seen as a significant benefit for the future dialog about the value of genetically informed designs. Furthermore, our findings support the argument that twin survey studies are not subject to a downward bias in the estimation of the shared environment effect because their samples are homogeneous with respect to the social and educational environments of students. If such a bias occurred, our study and studies based on twin registers (Lichtenstein et al., 2002; Skytthe et al., 2002) would find higher contributions of shared environment, which we do not.

While ACE models might seem unattractive to social scientists, because they do not offer obvious ways of unpacking shared environment into a set of familiar net effects, it is worth bearing in mind that regression approaches suffer from the omitted variables problem. Most regression studies are cross-sectional and thus subject to bias generated by lack of controls for prior student achievement. We believe that genetically informed analyses of educational outcomes based on various data sources have the potential to open up new ways of conceptualizing and quantifying the relative roles that families, schools, peers and teachers play in specific stages of development and learning of youth. With this study, we hope to have made a step in this new direction.

Acknowledgments

This research was supported by a grant from the CERGE-EI Foundation under the Global Development Network program. All opinions expressed are those of the authors and have not been endorsed by either CERGE-EI or the GDN. One of the analyses in Appendix A uses data supported by a grant from the National Science Foundation (#SES-0721378) in cooperation with the Minnesota Twin Registry at the University of Minnesota.
The data are publicly available from: http://www.unl.edu/polphyslab

Appendix A. Validation of the ACE mixture model

The model used in this study closely resembles the model utilized by Benyamin et al. (2005, 2006). The only difference is the use of prior information about the joint distribution of MZ twins, DZ twins, the misclassified pairs and gender. Given that Benyamin et al. (2006) performed extensive simulations to demonstrate the validity of their model, we present only simulations which validate our ACE mixture model for data with no information on zygosity. Our strategy is to estimate this model twice on three different data sets. In the first estimation we use the existing information on zygosity and the classical ACE model. The second set of estimates is obtained under the assumption that no information on zygosity is available, which mirrors the situation in our Polish data. We contrast the results of both estimations.

The three data sets we employ are the National Survey of Midlife Development in the United States (MIDUS), 1995–1996 (Brim et al., 1995); the Minnesota Twins Political Survey (MTPS) (described in detail by Smith et al., 2012); and an artificial dataset provided by the Mplus portal (Example 5.18) and discussed in the Mplus 7 User's Guide (Muthen and Muthen, 1998–2012: 83). In each dataset we sought out a variable which most resembled the statistical properties of educational achievement scales. This involved constructs with the largest number of items, highest reliability and an approximately normal distribution. Based on these criteria we modeled the Psychological Well-Being scale in the MIDUS study and the Openness to Experience scale in the MTPS data. The Mplus data include only one simulated normally distributed dependent variable. For these three variables we obtained the estimates of heritability, shared and unique environments from a classical ACE model and the mixture model utilized by us in this paper. Table A1 lists the results for both models in rows separately for each data set.
The results of the classical ACE estimations and the mixture ACE estimations are strikingly similar. At most, the point estimates of variance components differ by 2.6 percentage points (i.e. 50.6% versus 53.2% or 49.4% versus 46.8%). In no case do they differ significantly between models, even if we assume that the results from the classical ACE model have been estimated without error.
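The comparison in this paragraph can be retraced from the values reported in Table A1. The Python sketch below divides each mixture-minus-classical difference by the standard error of the mixture estimate, which treats the classical estimate as error-free, as the text proposes. The 1.96 cut-off (a two-sided 5% test under normality) and all names in the code are assumptions made for illustration; the paper does not state which test was used.

```python
# Values copied from Table A1, in percent:
# (classical ACE estimate, mixture ACE estimate, S.E. of the mixture estimate)
TABLE_A1 = {
    ("MIDUS", "H"): (50.6, 53.2, 4.9),
    ("MIDUS", "C"): (0.0, 0.0, 0.0),
    ("MIDUS", "E"): (49.4, 46.8, 4.9),
    ("MTPS", "H"): (50.3, 49.7, 5.3),
    ("MTPS", "C"): (0.0, 0.0, 0.0),
    ("MTPS", "E"): (49.7, 50.3, 5.3),
    ("Mplus", "H"): (63.1, 62.2, 7.6),
    ("Mplus", "C"): (10.9, 12.1, 6.3),
    ("Mplus", "E"): (26.0, 25.7, 2.0),
}


def max_abs_difference(table):
    """Largest absolute gap between mixture and classical point estimates."""
    return max(abs(mixture - classical) for classical, mixture, _ in table.values())


def z_scores(table):
    """z statistic per cell, treating the classical estimate as error-free."""
    scores = {}
    for key, (classical, mixture, se_mixture) in table.items():
        diff = mixture - classical
        if se_mixture == 0.0:
            # Component fixed at zero in both models: nothing to test.
            scores[key] = 0.0 if diff == 0.0 else float("inf")
        else:
            scores[key] = diff / se_mixture
    return scores


if __name__ == "__main__":
    print(round(max_abs_difference(TABLE_A1), 1))
    for key, z in z_scores(TABLE_A1).items():
        print(key, round(z, 2), "significant" if abs(z) > 1.96 else "not significant")
```

Under these assumptions every |z| stays well below 1.96, which agrees with the statement that the two sets of estimates do not differ significantly.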

Table A1
Estimates of classical ACE model (known zygosity) and mixture ACE model (unknown zygosity).

Data (scale)                                  Model type   N of pairs   H Coeff. (%)   H S.E. (%)   C Coeff. (%)   C S.E. (%)   E Coeff. (%)   E S.E. (%)
MIDUS 1995–1996 (psychological well-being)    ACE          822          50.6           3.8          0.0            0.0          49.4           3.8
                                              Mixture                   53.2           4.9          0.0            0.0          46.8           4.9
MTPS (openness to experience)                 ACE          740          50.3           4.3          0.0            0.0          49.7           4.3
                                              Mixture                   49.7           5.3          0.0            0.0          50.3           5.3
Mplus (simulated variable)                    ACE          2000         63.1           5.3          10.9           4.9          26.0           1.3
                                              Mixture                   62.2           7.6          12.1           6.3          25.7           2.0


Appendix B. Mplus 7.2 syntax used in the presented analysis

References

Alarcón, M., Knopik, V.S., DeFries, J.C., 2000. Covariation of mathematics achievement and general cognitive ability in twins. J. Sch. Psychol. 38 (1), 63–77. http://dx.doi.org/10.1016/S0022-4405(99)00037-0.
Asbury, K., Plomin, R., 2014. G is for Genes: The Impact of Genetics on Education and Achievement. Wiley Blackwell, Chennai.
Ashenfelter, O., Krueger, A., 1994. Estimates of the economic return to schooling from a new sample of twins. Am. Econ. Rev. 84 (5), 1157–1173.
Bartels, M., Rietveld, M.J., Van Baal, G.C.M., Boomsma, D.I., 2002. Heritability of educational achievement in 12-year-olds and the overlap with cognitive ability. Twin Res. 5 (06), 544–553. http://dx.doi.org/10.1375/136905202762342017.
Benyamin, B., Wilson, V., Whalley, L.J., Visscher, P.M., Deary, I.J., 2005. Large, consistent estimates of the heritability of cognitive ability in two entire populations of 11-year-old twins from Scottish mental surveys of 1932 and 1947. Behav. Genet. 35 (5), 525–534. http://dx.doi.org/10.1007/s10519-005-3556-x.
Benyamin, B., Deary, I.J., Visscher, P.M., 2006. Precision and bias of a normal finite mixture distribution model to analyze twin data when zygosity is unknown: simulations and application to IQ phenotypes on a large sample of twin pairs. Behav. Genet. 36 (6), 935–946. http://dx.doi.org/10.1007/s10519-006-9086-3.
Bonjour, D., Cherkas, L.F., Haskel, J.E., Hawkes, D.D., Spector, T.D., 2003. Returns to education: evidence from U.K. twins. Am. Econ. Rev. 93 (5), 1799–1812. http://dx.doi.org/10.1257/000282803322655554.
Bortolus, R., Parazzini, F., Chatenoud, L., Benzi, G., Bianchi, M.M., Marini, A., 1999. The epidemiology of multiple births. Hum. Reprod. Update 5 (2), 179–187. http://dx.doi.org/10.1093/humupd/5.2.179.
Bouchard, T.J., McGue, M., 2003. Genetic and environmental influences on human psychological differences. J. Neurobiol. 54 (1), 4–45. http://dx.doi.org/10.1002/neu.10160.
Brim, Orville G., Baltes, Paul B., Bumpass, Larry L., Cleary, Paul D., Featherman, David L., Hazzard, William R., Kessler, Ronald C., Lachman, Margie E., Markus, Hazel Rose, Marmot, Michael G., Rossi, Alice S., Ryff, Carol D., Shweder, Richard A., National Survey of Midlife Development in the United States (MIDUS), 1995–1996. ICPSR02760-v8. Inter-university Consortium for Political and Social Research [distributor], Ann Arbor, MI, 25.10.11. http://dx.doi.org/10.3886/ICPSR02760.v8.


Byrne, B., Coventry, W.L., Olson, R.K., Wadsworth, S.J., Samuelsson, S., 2010. Teacher effects in early literacy development: evidence from a study of twins. J. Educ. Psychol. 102 (1), 32–42. http://dx.doi.org/10.1037/a0017288.
Calvin, C.M., Deary, I.J., Webbink, D., Smith, P., Fernandes, C., Lee, S.H., Luciano, M., Visscher, P.M., 2012. Multivariate genetic analyses of cognition and academic achievement from two population samples of 174,000 and 166,000 school children. Behav. Genet. 42 (5), 699–710. http://dx.doi.org/10.1007/s10519-012-9549-7.
Caspari, E., 1968. Genetic endowment and environment in the determination of human behavior: biological viewpoint. Am. Educ. Res. J., 43–55. http://dx.doi.org/10.3102/00028312005001043.
CEB Central Examination Board of Poland [Centralna Komisja Egzaminacyjna], 2002–2011. Results of Lower-Secondary School Exams. Technical Reports: 2005–2011.
Conley, D., Rauscher, E., Dawes, C., Magnusson, P.K., Siegal, M.L., 2013. Heritability and the equal environments assumption: evidence from multiple samples of misclassified twins. Behav. Genet. 43 (5), 415–426.
CSO Central Statistical Office of Poland [Główny Urząd Statystyczny], 2012. The results of the National Census of Population and Housing: Report.
Dolata, R., 2008. Szkoła, segregacje, nierówności. Wydawnictwa Uniwersytetu Warszawskiego, Warszawa.
Dolata, R., Hawrot, A., Humenny, G., Jasińska, A., Koniewski, M., Majkut, P., Żółtak, T., 2013. Trafność metody edukacyjnej wartości dodanej dla gimnazjów [Value added method for estimating educational gains in lower-secondary schools]. IBE, Warszawa.
Evans, D.M., Martin, N.G., 2008. The validity of twin studies. Gene Screen 1 (2), 77–79. http://dx.doi.org/10.1046/j.1466-9218.2000.00027.x.
Falconer, D.S., 1981. Introduction to Quantitative Genetics. Longman.
Felson, J., 2014. What can we learn from twin studies? A comprehensive evaluation of the equal environments assumption. Soc. Sci. Res. 43, 184–199. http://dx.doi.org/10.1016/j.ssresearch.2013.10.004.
Freese, J., 2008. Genetics and the social science explanation of individual outcomes. Am. J. Sociol. 114, 1–35. http://dx.doi.org/10.1086/592208.
Hart, S.A., Petrill, S.A., Thompson, L.A., Plomin, R., 2009. The ABCs of math: a genetic analysis of mathematics and its links with reading ability and general cognitive ability. J. Educ. Psychol. 101 (2), 388. http://dx.doi.org/10.1037/a0015115.
Hattie, J., 2013. Visible Learning: A Synthesis of over 800 Meta-analyses Relating to Achievement. Routledge.
Haworth, C.M., Dale, P., Plomin, R., 2008. A twin study into the genetic and environmental influences on academic performance in science in nine-year-old boys and girls. Int. J. Sci. Educ. 30 (8), 1003–1025. http://dx.doi.org/10.1080/09500690701324190.
Hosmer Jr, D.W., 1973. A comparison of iterative maximum likelihood estimates of the parameters of a mixture of two normal distributions under three different types of sample. Biometrics, 761–770.
Husén, T., 1959. Psychological Twin Research: A Methodological Study. Almqvist and Wiksell, Stockholm.
Jakubowski, M., 2008. Implementing Value-added Models of School Assessment. European University Institute, Florence.
Jinks, J.L., Fulker, D.W., 1970. Comparison of the biometrical genetical, MAVA, and classical approaches to the analysis of the human behavior. Psychol. Bull. 73 (5), 311. http://dx.doi.org/10.1037/h0029135.
Kalogrides, D., Loeb, S., Beteille, T., 2013. Systematic sorting: teacher characteristics and class assignments. Sociol. Educ. 86 (2), 103–123. http://dx.doi.org/10.1177/0038040712456555.
Kamens, D.H., McNeely, C.L., 2010. Globalization and the growth of international educational testing and national assessment. Comp. Educ. Rev. 54 (1), 5–25. http://dx.doi.org/10.1086/648471.
Koniewski, M., 2014. Estimating teacher effect using hierarchical linear modelling. Edukacja. An interdisciplinary approach 5 (130), 70–91.
Kovas, Y., Haworth, C.M.A., Petrill, S.A., Plomin, R., 2007. Mathematical ability of 10-year-old boys and girls: genetic and environmental etiology of typical and low performance. J. Learn. Disabil. 40 (6), 554–567. http://dx.doi.org/10.1177/00222194070400060601.
Lichtenstein, P., Floderus, B., Svartengren, M., Svedberg, P., Pedersen, N.L., 2002. The Swedish Twin Registry: a unique resource for clinical, epidemiological and genetic studies. J. Intern. Med. 252 (3), 184–205. http://dx.doi.org/10.1046/j.1365-2796.2002.01032.x.
Loehlin, J.C., Nichols, R.C., 1976. Heredity, Environment, & Personality: A Study of 850 Sets of Twins. University of Texas Press, Austin.
Lubke, G., Muthén, B.O., 2007. Performance of factor mixture models as a function of model size, covariate effects, and class-specific parameters. Struct. Equ. Model. 14 (1), 26–47.
Lubke, G., Neale, M., 2008. Distinguishing between latent classes and continuous factors with categorical outcomes: class invariance of parameters of factor mixture models. Multivar. Behav. Res. 43 (4), 592–620.
Magnusson, D., Stattin, H., Allen, V.L., 1985. Biological maturation and social development: a longitudinal study of some adjustment processes from mid-adolescence to adulthood. J. Youth Adolesc. 14 (4), 267–283.
Martin, N.G., Martin, P.G., 1975. The inheritance of scholastic abilities in a sample of twins. I. Ascertainment of the sample and diagnosis of zygosity. Ann. Hum. Genet. 39, 213–218. http://dx.doi.org/10.1111/j.1469-1809.1975.tb00124.x.
McCaffrey, D.F., Lockwood, J., Koretz, D., Louis, T.A., Hamilton, L., 2004. Models for value-added modeling of teacher effects. J. Educ. Behav. Stat. 29 (1), 67–101. http://dx.doi.org/10.3102/10769986029001067.
Moffitt, T.E., 1993. Adolescence-limited and life-course-persistent antisocial behavior: a developmental taxonomy. Psychol. Rev. 100 (4), 674. http://dx.doi.org/10.1037/0033-295X.100.4.674.
Muraki, E., Bock, R.D., 1997. Parscale: IRT Item Analysis and Test Scoring for Rating-scale Data. Scientific Software International, Chicago, IL.
Muthén, B., 2004. Latent Variable Analysis. The Sage Handbook of Quantitative Methodology for the Social Sciences. Sage Publications, Thousand Oaks, CA, pp. 345–368.
Muthén, L.K., Muthén, B.O., 1998–2012. Mplus User’s Guide, 7th ed. Muthén & Muthén, Los Angeles, CA.
Neale, M.C., 2003. A finite mixture distribution model for data collected from twins. Twin Res. 6 (03), 235–239.
Nielsen, F., 2006. Achievement and ascription in educational attainment: genetic and environmental influences on adolescent schooling. Soc. Forces 85 (1), 193–216. http://dx.doi.org/10.1353/sof.2006.0135.
OECD, 2007. PISA 2006 Science Competencies for Tomorrow’s World. vol. 1. OECD, Paris.
OECD, 2008. Measuring Improvements in Learning Outcomes: Best Practices to Assess the Value-added of Schools. OECD, Paris.
Oliver, B.R., Dale, P.S., Plomin, R., 2007. Writing and reading skills as assessed by teachers in 7-year olds: a behavioral genetic approach. Cogn. Dev. 22 (1), 77–95. http://dx.doi.org/10.1016/j.cogdev.2006.07.003.
Plomin, R., DeFries, J.C., Knopik, V.S., Neiderhiser, J.M., 2013. Behavioral Genetics, 6th ed. Worth Publishers, New York.
Posthuma, D., Beem, A.L., De Geus, E.J., Van Baal, G.C.M., von Hjelmborg, J.B., Iachine, I., Boomsma, D.I., 2003. Theory and practice in quantitative genetics. Twin Res. 6 (05), 361–376.
Samuelsson, S., Byrne, B., Olson, R.K., Hulslander, J., Wadsworth, S., Corley, R., Willcutt, E.G., Defries, J.C., 2008. Response to early literacy instruction in the United States, Australia, and Scandinavia: a behavioral-genetic analysis. Learn. Indiv. Differ. 18 (3), 289–295. http://dx.doi.org/10.1016/j.lindif.2008.03.004.
Shakeshaft, N.G., Trzaskowski, M., McMillan, A., Rimfeld, K., Krapohl, E., et al., 2013. Strong genetic influence on a UK nationwide test of educational achievement at the end of compulsory education at age 16. PLoS ONE 8 (12), e80341. http://dx.doi.org/10.1371/journal.pone.0080341.
Silventoinen, K., Sammalisto, S., Perola, M., Boomsma, D.I., Cornes, B.K., Davis, C., de Lange, M., 2009. Heritability of adult body height: a comparative study of twin cohorts in eight countries. Twin Res. 6 (5), 399–408. http://dx.doi.org/10.1375/136905203770326402.


Skytthe, A., Kyvik, K., Holm, N.V., Vaupel, J.W., Christensen, K., 2002. The Danish Twin Registry: 127 birth cohorts of twins. Twin Res. 5 (5), 352–357. http://dx.doi.org/10.1375/136905202320906084.
Smith, K.B., Hatemi, P.K., Eaves, L.J., Alford, J.R., Hibbing, J.R., 2012. Biology, epistemology, and nature of human ideology. Am. J. Polit. Sci. 56, 17–33.
Steinberg, L., Morris, A.S., 2001. Adolescent development. J. Cogn. Educ. Psychol. 2 (1), 55–87. http://dx.doi.org/10.1016/j.dr.2007.08.002.
Taylor, J., Roehrig, A.D., Hensle, B.S., Connor, C.M., Schatschneider, C., 2010. Teacher quality moderates the genetic effects on early reading. Science 328 (5977), 512–514. http://dx.doi.org/10.1126/science.1186149.
Thompson, L.A., Detterman, D.K., Plomin, R., 1991. Associations between cognitive abilities and scholastic achievement: genetic overlap but environmental differences. Psychol. Sci. 2 (3), 158–165. http://dx.doi.org/10.1111/j.1467-9280.1991.tb00124.x.
Wainwright, M.A., Wright, M.J., Geffen, G.M., Luciano, M., Martin, N.G., 2005. The genetic basis of academic achievement on the Queensland Core Skills Test and its shared genetic variance with IQ. Behav. Genet. 35 (2), 133–145. http://dx.doi.org/10.1007/s10519-004-1014-9.
Wuttke, J., 2007. Uncertainties and bias in PISA. In: Hopmann, S.T., Brinek, G., Retzl, M. (Eds.), PISA According to PISA. Does PISA Keep What It Promises? (PISA zufolge PISA. Hält PISA was es verspricht?) Published in German. Lit-Verlag, Vienna, pp. 241–263.
