
British Journal of Developmental Psychology (2015), 33, 57–72 © 2014 The British Psychological Society www.wileyonlinelibrary.com

Testing primary-school children’s understanding of the nature of science

Susanne Koerber1*, Christopher Osterhaus1 and Beate Sodian2

1 Freiburg University of Education, Freiburg, Germany
2 Ludwig-Maximilians-University Munich, Germany

Understanding the nature of science (NOS) is a critical aspect of scientific reasoning, yet few studies have investigated its developmental beginnings and initial structure. One contributing reason is the lack of an adequate instrument. Two studies assessed NOS understanding among third graders using a multiple-select (MS) paper-and-pencil test. Study 1 investigated the validity of the MS test by presenting the items to 68 third graders (9-year-olds) and subsequently interviewing them on their underlying NOS conception of the items. All items were significantly related between formats, indicating that the test was valid. Study 2 applied the same instrument to a larger sample of 243 third graders, and their performance was compared to a multiple-choice (MC) version of the test. Although the MC format inflated the guessing probability, there was a significant relation between the two formats. In summary, the MS format was a valid method revealing third graders’ NOS understanding, thereby representing an economical test instrument. A latent class analysis identified three groups of children with expertise in qualitatively different aspects of NOS, suggesting that there is not a single common starting point for the development of NOS understanding; instead, multiple developmental pathways may exist.

*Correspondence should be addressed to Susanne Koerber, Freiburg University of Education, Kunzenweg 21, 79117 Freiburg, Germany (email: [email protected]). DOI:10.1111/bjdp.12067

Recent research on scientific reasoning in primary school has demonstrated basic competencies in hypothesis testing and data evaluation in children as young as 7 years (Piekny & Maehler, 2013; for a review, see Zimmerman, 2007). However, it is unclear whether children’s choice of a critical test of a hypothesis or their correct interpretation of a data pattern relative to a hypothesis is based on a broader metaconceptual understanding of the nature of science (NOS). Understanding NOS is based on a clear differentiation between ideas (theories, hypotheses) and states of the world (data), implying an understanding of scientific knowledge as a construction. It also includes some understanding of the tentative nature of scientific theories and the cumulative and cyclical process of scientific inquiry (cf. Carey, Evans, Honda, Jay, & Unger, 1989; Lederman, 2007; Smith, Maclin, Houghton, & Hennessey, 2000).

Research in developmental psychology and science education has shown deep-seated misconceptions about NOS, even in secondary-school students and some adults (Lederman, 2007; Thoermer & Sodian, 2002). For example, the interview study of Carey et al. (1989) probed for an explicit understanding of different aspects of NOS, such as the goals of science (e.g., ‘What do you think science is all about?’), central scientific concepts (e.g., scientific questions, experiments, hypotheses, theories) and their interrelations (e.g., ‘What happens when scientists test their idea (hypothesis) and they obtain a different result from the one they expected?’). Three qualitatively different levels of understanding were distinguished: Students at level 1 regard science as a set of mere activities (neglecting the role of ideas) or as the gathering of factual information; those at level 2 can distinguish between ideas and experiments and understand that data not concordant with their hypothesis might lead to its revision; and those at level 3 understand that theories guide the generation of hypotheses and experiments and shape the interpretation of evidence and that the growth of scientific knowledge has a cyclic and cumulative character. Carey et al. (1989) found that, without prior instruction, seventh graders answered primarily at level 1, indicating a lack of understanding of the role of ideas in the inquiry process.

In a longitudinal study, Bullock, Sodian, and Koerber (2009) used a contextualized interview by Sodian, Carey, Grosslight, and Smith (1992), which probes for an understanding of (framework) theories. Bullock et al. found predominantly level-1 answers in children, with moderate developmental progress from the age of 11 years to adulthood in the understanding of theories as systems of beliefs. While most 11-year-olds scored at level 1, indicating a lack of understanding that beliefs that are deeply rooted in theory are not immediately changed by a single instance of counterevidence, there were individual differences even at this early age. These differences were longitudinally predictive of the spontaneous use of the control-of-variables strategy in a scientific reasoning task in adolescence, as well as of argumentation at the age of 21 (Bullock et al., 2009), indicating that an early grasp of the role of theoretical frameworks in inquiry processes is important for the development of scientific reasoning and argumentation. Almost no research has directly addressed NOS in primary-school children.
While the performance of young secondary-school students in NOS interviews is generally poor, research from the Theory-of-Mind (ToM) tradition on children’s understanding of mental construction and interpretation points to a more positive picture. In particular, research on advanced ToM understanding indicates that primary-school children already have a nascent understanding of interpretative frameworks and perspectives (Carpendale & Chandler, 1996). For instance, Pillow (1991) found that even 8-year-olds take into account the individual theories of a person regarding a specific issue when predicting subjective interpretations of events, which can be objectively interpreted from different perspectives. Hence, children exhibit an understanding that knowledge is actively constructed and that it depends upon individual experiences and subjective theories; these are issues that are relevant in NOS understanding. The apparent discrepancy between empirical studies detecting mature NOS understanding rather late in adolescence and the early competencies detected in ToM research and in some scientific reasoning tasks (Koerber, Sodian, Thoermer, & Nett, 2005; Sodian, Zaitchik, & Carey, 1991) suggests that traditional research methods could underestimate nascent NOS understanding. Research on NOS understanding has mostly focused on adults and adolescents, possibly due to the commonly used research instruments (mostly interviews). Interviews often require that participants have sophisticated verbal abilities, including the ability to generate spontaneous productions of definitional statements and to elaborate on aspects that children may never have reflected on. Alternative question formats, such as closed-answer formats, might be more suited for NOS research in primary school. In a multiple-choice (MC) format, participants must choose between several statements of varying quality (Mayer, Sodian, Koerber, & Schwippert, 2014). 
This format lowers verbal demands and might be more appropriate for testing intuitive understanding of epistemological concepts. Research from other areas of scientific reasoning has shown that choice tasks (with verbal justification) are mastered at an earlier age than are production tasks (Bullock et al., 2009). Although the MC format is widely used, one important critique is that MC tasks cannot validly assess performance because of the high probability of correct guessing. Verbal justifications might reduce this problem, but this is only possible in one-to-one interview sessions. Large samples are often necessary for in-depth analyses of children’s evolving NOS understanding, requiring whole-class testing formats. A format that reduces guessing while simultaneously offering the contextual support of MC items is the multiple-select (MS) format, which requires children to separately accept or reject each of multiple statements. If performance is understood as the conjunction of the acceptance of correct (advanced) and the rejection of incorrect (naïve, intermediate) statements, the probability of correct guessing is reduced to 12.5% when these three statements are presented.

The present study investigated the validity of a paper-and-pencil test that assesses primary-school children’s NOS understanding using an MS format. This format was chosen because (1) it can facilitate performance in young children due to lower demands for the production of spontaneous verbal answers, (2) it reduces the probability of correct guessing to 12.5% by coding patterns of answers to different levels of understanding, and (3) it allows whole-class testing as a cost-efficient way of testing large samples and offering the possibility to conduct more sophisticated data analyses. The validity of the MS format was tested by comparing performance in the MS test with that of the same children in two other poles of the test procedure: (1) interview and justifications with high cognitive and verbal demands and (2) MC tasks with low cognitive and verbal demands and high guessing probability. Study 1 compared MS items and interviews, while Study 2 focused on the comparison between the MS and MC formats. In addition, this large-scale study made it possible to analyse the initial structure of children’s developing NOS understanding.
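The 12.5% figure can be checked with a few lines of arithmetic. The following is a minimal sketch, assuming that a guessing child accepts or rejects each of the three statements independently and with equal probability, and that a guessing child in the MC format picks one of three options at random; these guessing models, and the variable names, are illustrative assumptions rather than part of the study.

```python
from fractions import Fraction
from itertools import product

# Each of the three statements (naive, intermediate, advanced) is
# accepted (True) or rejected (False); a guesser is assumed to pick
# each of the resulting patterns with equal probability.
patterns = list(product([True, False], repeat=3))

# Only one pattern counts as fully correct: reject naive, reject
# intermediate, accept advanced.
correct = (False, False, True)

p_ms = Fraction(patterns.count(correct), len(patterns))  # multiple-select
p_mc = Fraction(1, 3)  # multiple-choice: one of three options at random

print(f"MS guessing probability: {float(p_ms):.3f}")
print(f"MC guessing probability: {float(p_mc):.3f}")
```

Under these assumptions the MS format leaves one correct pattern out of eight, whereas the MC format leaves one correct option out of three.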

STUDY 1

Study 1 compared the children’s performance on the MS items and interviews. The MS format requires the children to separately accept scientifically advanced answers and reject naïve and less advanced answer alternatives, and thus to reflect more deeply on the different options.

Method

Participants
Altogether, 68 third graders (M = 9.1 years; SD = 5 months; 32 girls, 36 boys) from four classes in two middle-class schools in Germany participated in Study 1.

Materials
The children were presented with four paper-and-pencil items, all focusing on the understanding of the distinction between ideas and states of the world (i.e., an understanding of the hypothesis–evidence relation). Although all items incorporated this guiding principle, the specific NOS understanding that was addressed in each item differed slightly (see Figures 1, A1–A3):

1. ‘Goals’ tested the understanding of the purpose and the goals of science (i.e., developing ideas about the world).
2. ‘Testing ideas’ tapped the understanding of the necessity to have ideas (i.e., hypotheses) that can be tested (i.e., falsified).
3. ‘Hypothesis’ assessed the understanding of ideas and hypotheses (i.e., what makes an idea testable).
4. ‘Theories’ probed the conception of the role of interpretive frameworks (i.e., ideas are influenced not only by data but also by theoretical and sociocultural frameworks).

The items were representative of the spectrum of understanding NOS as it is structured by the different categories of the NOS Interview by Carey et al. (1989), with the last item specifically being adapted from the aforementioned interview by Sodian et al. (1992, see also Bullock et al., 2009). The ‘Goals’ item refers to understanding of science in general (goals, activities) and it assessed the evaluation of three statements concerning the goals, activities, and questions of science (Figure A1). The ‘Testing ideas’ item refers to the role of experiments in science (i.e., their relevance and adequate application for testing ideas) and tested children’s ability to differentiate a hypothesis (here: ‘certain animals are diurnal’) from evidence and their ability to choose an appropriate test of this hypothesis, instead of producing mere effects (Figure A2). Items 3 and 4 tested the children’s understanding of the concepts ‘hypothesis’ and ‘theories’. In the ‘Hypothesis’ item, children must evaluate different examples of hypotheses that differ in testability and explanatory power (Figure A3). The final item tests the understanding of theories and the role of the broader sociocultural framework in science, which guides the interpretation of evidence and people’s willingness to accept new theories (Figure 1). Here, the children were asked to imagine that a medieval scientist who believes in witchcraft meets a modern scientist. The modern scientist shows the medieval scientist bacteria under a microscope: Would the medieval scientist change his false belief? The first option represents the naïve level, implying that the medieval scientist would readily give up his prior culturally established belief. The third option represents the intermediate level, which conveys an understanding of sociocultural influences on the formation of theories and the reasoning that because such theories were established over a long period of time, they are not abandoned easily. The second option represents the advanced level, which expresses the idea that people try to incorporate new evidence into their own existing theoretical framework.

All items incorporate these distinct levels of understanding, among which there is an advanced conception as well as a naïve level (i.e., ideas are a copy of nature, knowledge is certain) and an intermediate level (i.e., evidence does not impact personal ideas). The answer options were retrieved from children’s spontaneous answers in open-ended interviews (Sodian, Thoermer, Kircher, Grygier, & Günther, 2002). For the MS items in this study, children had to accept or reject each assertion separately. Note that the ‘best answer’ option in Figures 1, A1–A3 was only added for the items in Study 2 to compare MS and MC formats.

Figure 1. ‘Theories’ (Study 2).

Long ago, in the Middle Ages, people believed there were witches who could make people sick.

A modern-day scientist traveled back to the Middle Ages with a time machine.

Scientists in the Middle Ages thought that witches could make people sick. The modern-day scientist believed that bacteria could make people sick. The modern-day scientist showed the scientist from the Middle Ages the bacteria under the microscope and explained: “These bacteria are the reason why people get sick!” What would the scientist from the Middle Ages say to this?

Response columns for each statement: ‘He would say this.’ / ‘He would not say this.’

1. “Of course you’re right. Bacteria make people sick, not witches.” (naïve)
2. “Bacteria could be the witches’ little helpers.” (advanced)
3. “It may be true that there are bacteria here, but witches are still the ones who make people sick.” (intermediate)

Which is the best answer? No.______

Procedure
All items were read out loud to each child individually. After the MS part of each item, the children had to indicate in an interview why they agreed or disagreed with each answer option, thereby justifying their MS choices. The children were often quite taciturn, and so the interviewers were instructed to use follow-up questions to probe for understanding. Probing questions were asked when it was unclear what the children meant, and usually consisted of ‘Why would he do this?’ or ‘How come?’, or of a request to explain or elaborate on a concept that the child had used.

Coding of the MS task
The answers to the MS components were coded from the eight possible patterns resulting from the acceptance or rejection of the three answer options, as follows. The lowest-level answer accepted for each item was identified as the final level. That is, an answer was coded as being naïve whenever the child endorsed a naïve-level option (regardless of the other answer options). Similarly, children who accepted an intermediate- and an advanced-level answer while simultaneously rejecting a naïve one were coded at an intermediate level. This procedure ensured that a strict criterion was used to identify children as advanced (i.e., only when they also rejected the less advanced levels), reducing the probability of guessing correctly to 12.5%.
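The coding rule described above can be sketched as a small function. This is an illustrative reading of the rule, not the authors’ scoring script: the function name and the level labels are hypothetical, and the handling of a child who rejects all three statements (returned here as None) is an assumption, because the text does not say how that pattern was scored.

```python
from itertools import product

def code_ms_item(accept_naive, accept_intermediate, accept_advanced):
    """Return the level credited for one MS item.

    The lowest level that the child accepted determines the code.
    Returning None when nothing is accepted is an assumption of this
    sketch; the text does not say how that pattern was scored.
    """
    if accept_naive:
        return "naive"
    if accept_intermediate:
        return "intermediate"
    if accept_advanced:
        return "advanced"
    return None  # assumption: all three statements rejected

# Tabulate all eight accept/reject patterns.
levels = {p: code_ms_item(*p) for p in product([True, False], repeat=3)}
n_advanced = sum(1 for level in levels.values() if level == "advanced")
print(n_advanced / len(levels))  # share of patterns coded as advanced
```

Only the pattern that rejects the naïve and intermediate statements and accepts the advanced one is coded as advanced, which is what makes the criterion strict.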

Transcription and coding of justifications
Interviews were audiotaped and transcribed verbatim. Justifications given by the children were coded by at least two independent raters as naïve (or lowest; 0), intermediate level (1), or advanced level (2). An example of a justification on an advanced level for ‘Theories’ would be: ‘He [the scientist from the Middle Ages] believed for a long time that witches would make people sick, so he is not going to change his idea very quickly,’ or ‘it is a good compromise answer because it includes both the bacteria and the witches.’ Justifications at an advanced level demonstrated that the children showed an understanding of the strength and persistence of framework theories and their influence on the interpretation of new and conflicting data, whereas those at an intermediate level drew more strongly from the idea that the scientist would not change his beliefs, but that he would rather call into question the new evidence: ‘he [the scientist from the Middle Ages] will not believe him [the modern-day scientist],’ ‘he [the modern-day scientist] is from the future and he [the scientist from the Middle Ages] does not know who he is. . . maybe he is just someone who is just talking. . . so he will not believe him.’ The inter-rater reliabilities (i.e., κ values) for ‘Hypothesis’, ‘Testing ideas’, ‘Theories’, and ‘Goals’ were .82, .86, .87, and .94, respectively.

Results
Pre-analyses revealed that the MS responses were not significantly affected by either gender, F(1, 60) = 0.96, ns, or classroom, F(3, 60) = 1.09, ns. The same held true for the interview with regard to gender, F(1, 60) = 0.01, ns, and classroom, F(3, 60) = 0.91, ns. Tables 1–4 show the relation between the MS format and the interview score for each item. The χ² values were significant for every item, indicating a significant relation between the results of the interview and the MS test and suggesting that both forms of the test tapped the same understanding. Between 56% and 79% of children were credited with the same level in both tests. Most of the children credited with an intermediate or advanced understanding in the interview accompanying a specific item were also credited with an advanced understanding on the respective MS item, but not vice versa. This pattern indicated that the difficulty of the task in the interview was greater than for the MS items. Although the expected frequencies in some of the table cells were below 5, and hence the χ² values need to be interpreted with caution, the high convergence between interview and MS levels supports a meaningful relation between the two formats.
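The two statistics reported for each item, the χ² value and the share of children credited with the same level, can be recomputed from a cross-tabulation. The sketch below uses the counts given in Table 1 (‘Goals’) and assumes that rows are interview levels and columns are MS levels; the helper names are hypothetical, and a plain Pearson χ² without any correction is an assumption about how the reported values were obtained.

```python
# Counts as reported in Table 1 ('Goals'); rows = interview level,
# columns = multiple-select level, both ordered naive, intermediate, advanced.
table1 = [
    [51, 6, 1],
    [4, 1, 1],
    [1, 2, 1],
]

def pearson_chi2(observed):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (obs - expected) ** 2 / expected
    dof = (len(row_totals) - 1) * (len(col_totals) - 1)
    return chi2, dof

def agreement(observed):
    """Share of children on the main diagonal (same level in both formats)."""
    n = sum(sum(row) for row in observed)
    return sum(observed[i][i] for i in range(len(observed))) / n

chi2, dof = pearson_chi2(table1)
print(f"chi2({dof}) = {chi2:.2f}, agreement = {agreement(table1):.1%}")
```

The χ² statistic does not depend on which format is placed in the rows. The agreement computed from raw counts may differ by about a percentage point from the reported figure if the latter was obtained by summing rounded cell percentages.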

Table 1. Item 1: Goals: Cross-tabulation of multiple-select and interview results in Study 1

                     Multiple-select score
Interview level      Naïve level   Intermediate level   Advanced level   Total
Low/naïve level      51 (75%)      6 (9%)               1 (2%)           58
Intermediate level   4 (6%)        1 (2%)               1 (2%)           6
Advanced level       1 (2%)        2 (3%)               1 (2%)           4
Total                56            9                    3                68 (100%)

Note. χ²(4) = 13.34, p < .05. Altogether, 79% of the children were credited with the same level.

Table 2. Item 2: Theories: Cross-tabulation of multiple-select and interview results in Study 1

                     Multiple-select score
Interview level      Naïve level   Intermediate level   Advanced level   Total
Low/naïve level      33 (59%)      7 (13%)              3 (5%)           43
Intermediate level   1 (2%)        5 (9%)               1 (2%)           7
Advanced level       1 (2%)        1 (2%)               4 (7%)           6
Total                35            13                   8                56 (100%)

Note. χ²(4) = 27.16, p < .05. Altogether, 76% of the children were credited with the same level.

Table 3. Item 3: Testing ideas: Cross-tabulation of multiple-select and interview results in Study 1

                     Multiple-select score
Interview level      Naïve level   Intermediate level   Advanced level   Total
Low/naïve level      15 (22%)      6 (9%)               10 (15%)         31
Intermediate level   4 (6%)        9 (13%)              6 (9%)           19
Advanced level       3 (5%)        0 (0%)               14 (21%)         17
Total                22            15                   30               67 (100%)

Note. χ²(4) = 20.69, p < .05. Altogether, 56% of the children were credited with the same level.

Table 4. Item 4: Hypothesis: Cross-tabulation of multiple-select and interview results in Study 1

                     Multiple-select score
Interview level      Naïve level   Intermediate level   Advanced level   Total
Low/naïve level      27 (42%)      15 (23%)             7 (11%)          49
Intermediate level   2 (3%)        8 (12%)              0 (0%)           10
Advanced level       1 (2%)        2 (3%)               3 (5%)           6
Total                30            25                   10               65 (100%)

Note. χ²(4) = 15.03, p < .05. Altogether, 59% of the children were credited with the same level.

Discussion
The significant relation between the children’s scores on the MS items and the interviews confirms the validity of the MS test. Consistent with the literature (Bullock et al., 2009; Pollmeier, Möller, Hardy, & Koerber, 2011), the verbal justification (interview) performance was lower than the MS item score performance. Frede et al. (2011) claim that these differences might not be attributable solely to the production–recognition distinction but also to the fact that the requirements of a task are conveyed more clearly by forced-choice questions (as in this MS format) than by open-ended questions (as in interviews). This finding suggests strongly that low verbal abilities together with uncertainty about the requirements of an open-ended question can negatively influence the interview score. Interviews at this age might therefore underestimate NOS understanding because of young children’s difficulty in providing adequate verbal elaborations. Therefore, tasks to assess the very beginning of an understanding should be chosen carefully and provide some scaffolding for eliciting children’s nascent understanding (e.g., a clear task presenting different answer alternatives) without jeopardizing the test quality by high guessing probability.

STUDY 2

In Study 2, we compared the MS and MC formats, and a more sophisticated analysis, latent class analysis, was implemented. This type of analysis goes beyond a mere comparison of means and frequencies and enables examination of the relations between item formats. It allows the identification of classes of respondents who exhibit similar response patterns. More specifically, it makes it possible to identify a typical group of children who perform correctly by chance on MC items, but who can be identified as incompetent by MS items.

Method

Participants
In total, 243 third graders (M = 9.3 years; SD = 6 months; 113 girls, 130 boys) from 14 classes in German middle-class schools participated in Study 2.

Materials
The same four items that were used in Study 1 were used in Study 2, with the addition of an MC question at the end of each item. Furthermore, based on findings of influencing factors on scientific reasoning in general (Koerber, Mayer, Osterhaus, Schwippert, & Sodian, in press; Mayer et al., 2014), the children were tested on their text comprehension (Lenhard & Schneider, 2006) as well as their intelligence (subtests 1 and 4 of the CFT 20-R; Weiß, 2006). Socio-educational status (SES) was assessed using a questionnaire asking for the parents’ highest educational achievement.

Procedure
The test was conducted as a whole-class test. All children worked on their own test booklet, but the items were presented step by step by the experimenter via a PowerPoint presentation.

Coding
The goal of this study was to show competencies across formats and to identify groups of children with the same pattern (e.g., ‘guessers’). We thus dichotomized children’s answers by collapsing the naïve and intermediate levels and contrasting them with the advanced level.
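The dichotomization can be written as a one-line mapping. The sketch below is illustrative only: the function name and level labels are hypothetical, and the example codes for a single child are invented to show the shape of the data, not taken from the study.

```python
def dichotomize(level):
    """Map a three-level code to 1 (advanced) or 0 (naive or intermediate)."""
    if level not in {"naive", "intermediate", "advanced"}:
        raise ValueError(f"unknown level: {level!r}")
    return 1 if level == "advanced" else 0

# Hypothetical codes for one child, invented purely for illustration.
ms_levels = {"Goals": "naive", "Testing ideas": "advanced",
             "Theories": "intermediate", "Hypothesis": "naive"}
ms_binary = {item: dichotomize(level) for item, level in ms_levels.items()}
print(ms_binary)
```

Applied to both formats, this yields eight binary indicators per child (four MS items and four MC items), which is the kind of input a latent class analysis works with.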

Results
The children’s answer patterns on the MS and MC formats were investigated using latent class analysis, which was conducted in ‘depmixS4’ (Visser & Speekenbrink, 2010). As indicated by the data in Table 5, the best-fitting, most parsimonious model was a model with four latent classes. The answer patterns identified in this model and the posterior probabilities (percentages of children in the respective classes) are provided in Figure 2, which shows four classes of children with parallel patterns in the MS and MC formats. Three of the four classes (C1, C2, and C3) that were identified were characterized by high probabilities of providing the correct answer to one particular item in both the MC and MS parts. These ‘expert groups’ are referred to below as the ‘testing-ideas group’, the ‘theories group’, and the ‘hypothesis group’. Most of the children (59%) fell into a fourth group (the ‘incompetent group’, C4) that was characterized by answer probabilities that were close to chance (around 33%) for the MC format. These children performed worse than chance on the MS format and had (depending on the item) only a 0–7% probability of a correct response in this format, possibly due to the tendency to select more answers than the correct one.

A multinomial logistic regression model with latent class membership as the dependent variable and age in months, intelligence, text comprehension, and SES as predictors revealed that the only difference between the four groups was in the three-level SES score (low, medium, and high), χ²(3) = 8.96, p < .05. The children in the incompetent group had parents with a lower education than the children in both the testing-ideas group (b = 1.77, p < .05) and the hypothesis group (b = 1.75, p < .05). For the children in the theories group, SES was not a significant predictor of class membership with respect to the incompetent group. However, SES also did not differ significantly between the three expert groups.

Table 5. Study 2: Model comparisons for the latent class analysis

Model               AIC       BIC
1-Class model       1890.68   1918.62
2-Class model       1813.49   1872.87
3-Class model       1771.82   1862.64
4-Class model (a)   1722.37   1844.63
5-Class model       1734.18   1887.88

Note. (a) The best-fitting, most parsimonious model. AIC = Akaike information criterion; BIC = Bayesian information criterion.

[Figure 2: two panels (multiple-select, left; multiple-choice, right); x-axis: items Goals, Testing ideas, Theories, Hypothesis; y-axis: Pr(correct), 0.0 to 1.0; one line per class C1 to C4; class sizes shown in the figure: 19%, 9%, 13%, and 59%.]

Figure 2. Study 2: Response probabilities for the correct answer (dichotomized) for four multiple-choice (best answer, right side of the figure) and four multiple-select items (left side of the figure). The percentages of children in the different classes are given next to the respective answer patterns.
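The model comparison in Table 5 amounts to picking the model with the lowest information criteria. A minimal sketch of that selection step follows, using the AIC and BIC values as reported; the dictionary layout is hypothetical. The second part uses the standard definitions AIC = 2k − 2 ln L and BIC = k ln(n) − 2 ln L to back out the implied number of free parameters k, under the assumption that the Study 2 sample size of 243 was the n entering the BIC.

```python
import math

# (AIC, BIC) per number of latent classes, as reported in Table 5.
fit = {
    1: (1890.68, 1918.62),
    2: (1813.49, 1872.87),
    3: (1771.82, 1862.64),
    4: (1722.37, 1844.63),
    5: (1734.18, 1887.88),
}

best_aic = min(fit, key=lambda c: fit[c][0])
best_bic = min(fit, key=lambda c: fit[c][1])
print(f"lowest AIC: {best_aic} classes; lowest BIC: {best_bic} classes")

# Since AIC = 2k - 2 ln L and BIC = k ln(n) - 2 ln L, the difference
# BIC - AIC = k (ln(n) - 2) gives the implied number of free parameters k.
n = 243  # assumption: the Study 2 sample size was the n used for BIC
for classes, (aic, bic) in fit.items():
    k = (bic - aic) / (math.log(n) - 2)
    print(f"{classes} classes: about {k:.1f} free parameters")
```

For a latent class model with eight dichotomous indicators, one would expect 8C + (C − 1) free parameters for C classes (eight response probabilities per class plus the class proportions), so the implied values of k offer a rough consistency check on the reported table.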

Discussion
This study identified four groups of children who performed differently on the NOS items. Application of a strict criterion identified more than 40% of the children as experts in one specific aspect of NOS. The abilities of this group of third graders can be considered their ‘true’ abilities because they displayed simultaneously high probabilities of solving both the MS and MC items; hence, guessing was minimized. For all four groups, the latent class analysis identified a firm relation between MS and MC scores, which was accompanied by better performance scores on the MC tasks. The latent class analysis made it possible to assess the degree to which guessing contributes to good performance and to show that MS tasks are less affected by guessing than MC tasks. Our finding concerning the three expert classes of children who performed well on a certain item but not on others suggests that the early competencies of children in NOS understanding might not yet be very coherent; rather, their understanding entails a specific aspect of the relation between ideas and the world. It is highly plausible that children as young as third graders do not yet have a coherent NOS understanding, but rather that the ‘exhibition’ of their beginning NOS understanding is triggered in a different way by different demands of NOS understanding. The articulation of coherent advanced NOS understanding might even be rare among adults (Thoermer & Sodian, 2002). In addition, most people’s NOS understanding might be rather implicit: Although most people have a stance on how our ideas and our knowledge relate to the external world, very few have explicitly reflected on this, as is required in most interviews. Rather, their view might be triggered by asking them to agree or disagree with certain statements, a procedure that might be very sensitive to different aspects of NOS understanding.
This interpretation that MS tasks might tap a more ‘implicit’ NOS understanding is supported by our finding that intelligence and text comprehension do not influence whether children belong to any of the four groups identified. Therefore, our MS tasks worked well, especially for children who are less articulate.

GENERAL DISCUSSION

The findings of the present study demonstrate a beginning NOS understanding in primary school: A large number of children did exhibit a basic understanding of at least one of the central four NOS elements tested. As almost no research has directly addressed NOS in primary-school children, these findings fill an important research gap and show that primary-school children are able to grasp basic fundamentals and concepts of NOS, characterized by a differentiation between ideas (e.g., hypotheses, theories) and evidence (cf. Kuhn, 2010). All of the tasks used in the present studies tapped the coordination of these two epistemologically distinct categories. In ‘Goals of science’, children must recognize that science’s goals are in the realm of ideas and the elaboration and investigation of theories (in contrast to inventing something or the mere production of evidence without relating it to the underlying ideas). The ‘Hypothesis’ task requires children to understand that a hypothesis is an idea that is testable and therefore falsifiable by evidence. Similarly, in ‘Testing ideas’, children must realize that evidence can either prove the validity of a hypothesis or falsify it. Finally, in ‘Theories’, the children are required to acknowledge that the mutual influences of theories and evidence are mediated by sociocultural frameworks and beliefs.

Although all tasks cover basic fundamentals of NOS, not all children start understanding all aspects of NOS simultaneously. This finding might account for individual approaches to NOS and be caused by children’s individual experiences at home and in school, which might stress different aspects in their daily communication and argumentation. One intriguing finding from this study is that individual children may build their NOS understanding from different starting points. One group of children possessed a better understanding of the notion of hypotheses, whereas another group was especially competent in understanding that ideas must be tested appropriately to evaluate them, while yet a third group had an early grasp of the role of sociocultural frameworks for theory construction and revision. These findings suggest that there may be different starting points from which children develop their understanding of NOS. Importantly, there does not appear to be one specific aspect that is easy to grasp and that serves as a foundation for other NOS aspects, as there was no task in which all groups excelled. It might be speculated whether these differences in children’s starting points might account for later individual differences, as shown by the large variability in children’s performance on different aspects of NOS understanding (Grygier, 2008; Smith et al., 2000). As this is a question that can only be answered developmentally, most preferably in longitudinal studies, the present instrument contributes to this research because it offers a valid measure necessary to investigate this development. Interestingly, the expert groups could not be distinguished according to intelligence, text comprehension, gender, age, or SES.
One distinctive feature was found only for the difference between the incompetent group and two of the expert groups in our latent class analysis: Children in the testing-ideas and hypothesis groups had significantly higher SES scores. This parent characteristic has also been related to scientific reasoning in prior studies (Koerber et al., in press) and hence underlines the validity of the expertise assignment of the latent class analysis. If this assignment had been a product of mere chance, this significant relation would not have been identified. Thus, the general finding that parental education influences competencies suggests that parents indeed play an important part in modelling scientific thinking in everyday life and discourse with their child.

In line with the findings of Bullock et al. (2009) and Sodian et al. (1992), who also used contextualized tasks, we found early abilities when using NOS assessments that were integrated in a story. This finding is in contrast to the results of Carey et al. (1989), who applied an interview that required verbal elaboration of abstract terms. Our findings are supported by primary-school children’s competencies in ToM, demonstrating that children even as young as primary-school age are able to grasp the idea of active mental construction and, as the advanced ToM literature shows, possess a nascent understanding of interpretative frameworks and perspectives (Carpendale & Chandler, 1996; Pillow, 1991).

Furthermore, children performed slightly better on the MS items than in the open-ended interview (Bullock et al., 2009; Frede et al., 2011; Pollmeier et al., 2011). It seems that presenting children with a choice of answers and asking them to select the best one (in MC tests), or asking them to reflect on each proposition (of different levels) and to accept or reject it (in MS tests), helped the children to structure their ideas.
These formats may have elicited answers that children would not have given if they had been required to spontaneously produce answers. We conceive of this format as a scaffold that helps children to express their already existing ideas and to compensate for their still-limited eloquence.

Our studies show that a valid assessment of different aspects of NOS understanding is already possible in primary-school children. The significant relations between children’s MS scores and both their verbal justifications in interviews and their performance on a parallel MC format indicate that interindividual differences can be reliably captured by an MS-format assessment. This item format requires children to not only identify the most advanced answer but also reject the answers that reflect lower levels of understanding. Therefore, in contrast to MC items, MS reduces the probability of false-positive answers due to guessing from 33% to 12.5% when one adequate and two less adequate forms of understanding are offered. The problem of guessing is reduced even further when consistencies between MC and MS items are inspected, as performed in Study 2, where the chance of correct guessing on both formats simultaneously decreases to about 4%. The MS format, on its own or in combination with MC, is thus an economical and flexible way of acquiring larger data sets through whole-class testing (Pollmeier et al., 2011), an advantage that allows more powerful analyses (e.g., latent class analysis), and thus more profound conclusions to be drawn.

In conclusion, there are three main findings from these studies, each offering a new insight into primary-school children’s understanding of NOS. First, our MS format proved to be valid and economical, and especially suited for large-scale studies. Interviews used for testing NOS understanding in primary school usually do not explicitly specify the development from misconceptions to more adequate views; instead, developmental models are needed to provide extensive empirical evidence to back up the validity of the proposed categorizations of different levels of NOS understanding.
The development of a paper-and-pencil test that meets the demands of large-scale assessment is therefore an important prerequisite, as the hierarchy of the different levels as well as their coherence can be tested only in large-scale probabilistic models that enable the researcher to empirically test the item–construct fit. As an important step in the testing of a developmental model, we have demonstrated that MS tests are sensitive to the NOS understanding of third graders and that children in this age group possess basic competencies and structures of NOS understanding that have not been credited to them in previous developmental research.

Second, our results suggest the presence of substantial interindividual differences in the understanding of NOS in primary school. This finding is in line with and extends previous results on NOS development in adolescence obtained in the longitudinal study of Bullock et al. (2009). Although not all 9-year-olds exhibited expertise in NOS understanding, 40% of the participants already had an understanding of at least one NOS aspect, demonstrating that some children who are this young have already overcome a consistent naïve epistemology. This finding points to the relevance of research on NOS for cognitive development and is in line with the ToM literature, showing that a central theme of advanced ToM development (understanding NOS and knowledge construction) can already be attained in primary-school children.

Third, we have shown that the initial structure of NOS understanding in primary school is not consistent among children of the same age. Rather, children seem to begin from different points and it may be speculated that they follow different pathways in their endeavours to master a full understanding of NOS.
Our findings suggest that there might be no standard trajectory with the same starting point for all children in NOS development and that this should be borne in mind when investigating developmental sequences and considering appropriate education strategies.
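The guessing probabilities quoted in this discussion can be reproduced with a few lines of arithmetic. The sketch below is only an illustration, assuming that a guessing child picks one of the three MC options uniformly at random, accepts or rejects each of the three MS propositions independently with probability 0.5, and guesses independently on the two formats. The function names are hypothetical and not part of the test materials.

```python
from fractions import Fraction


def mc_guess_probability(n_options: int) -> Fraction:
    """Chance of picking the single best option by uniform random choice (assumed model)."""
    return Fraction(1, n_options)


def ms_guess_probability(n_propositions: int) -> Fraction:
    """Chance of getting every accept/reject decision right when each one
    is an independent coin flip (assumed model)."""
    return Fraction(1, 2) ** n_propositions


# One adequate and two less adequate forms of understanding are offered.
mc = mc_guess_probability(3)   # 1/3
ms = ms_guess_probability(3)   # 1/8
both = mc * ms                 # 1/24, assuming independence across formats

print(f"MC only: {float(mc):.1%}")    # 33.3%
print(f"MS only: {float(ms):.1%}")    # 12.5%
print(f"MC and MS: {float(both):.1%}")  # 4.2%
```

Under these assumptions, the values match the 33%, 12.5%, and roughly 4% reported above.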

Acknowledgements

The study was conducted as part of the project Science-P ‘Development of science competencies in primary school’ and was partially funded by the German Research Council (DFG) within the Priority Program ‘Models of Competencies for Assessment of Individual Learning Outcomes and the Evaluation of Educational Processes’ (SPP 1293; SO 213/29-2; KO 2276/4-3). We would like to thank Martha Alibali for helpful discussions and all children, parents, and teachers who supported this study.

References

Bullock, M., Sodian, B., & Koerber, S. (2009). Doing experiments and understanding science: Development of scientific reasoning from childhood to adulthood. In W. Schneider & M. Bullock (Eds.), Human development from early childhood to early adulthood. Findings from the Munich longitudinal study (pp. 173–197). Mahwah, NJ: Erlbaum.

Carey, S., Evans, R., Honda, M., Jay, E., & Unger, C. (1989). ‘An experiment is when you try it and see if it works’: A study of grade 7 students’ understanding of the construction of scientific knowledge. International Journal of Science Education, 11, 514–529. doi:10.1080/0950069890110504

Carpendale, J. I., & Chandler, M. J. (1996). On the distinction between false belief understanding and subscribing to an interpretive theory of mind. Child Development, 67, 1686–1706. doi:10.1111/j.1467-8624.1996.tb01821.x

Frede, V., Nobes, G., Frappart, S., Panagiotaki, G., Troadec, B., & Martin, A. (2011). The acquisition of scientific knowledge: The influence of methods of questioning and analysis on the interpretation of children’s conceptions of the earth. Infant and Child Development, 20, 432–448. doi:10.1002/icd.730

Grygier, P. (2008). Wissenschaftsverständnis von Grundschülern im Sachunterricht [Primary-school children’s understanding of the nature of science in science instruction]. Bad Heilbrunn, Germany: Julius Klinkhardt.

Koerber, S., Mayer, D., Osterhaus, C., Schwippert, K., & Sodian, B. (in press). The development of early scientific reasoning. Child Development.

Koerber, S., Sodian, B., Thoermer, C., & Nett, U. (2005). Scientific reasoning in young children: Preschoolers’ ability to evaluate covariation evidence. Swiss Journal of Psychology, 64, 141–152. doi:10.1024/1421-0185.64.3.141

Kuhn, D. (2010). What is scientific thinking and how does it develop? In U. Goswami (Ed.), Handbook of childhood cognitive development (2nd ed., pp. 472–523). Oxford, UK: Blackwell.

Lederman, N. G. (2007). Nature of science: Past, present, and future. In S. K. Abell & N. G. Lederman (Eds.), Handbook of research on science education (pp. 831–880). Mahwah, NJ: Lawrence Erlbaum.

Lenhard, W., & Schneider, W. (2006). ELFE 1-6. Ein Leseverständnistest für Erst- bis Sechstklässler [A reading test for first- to sixth-graders]. Göttingen, Germany: Hogrefe.

Mayer, D., Sodian, B., Koerber, S., & Schwippert, K. (2014). Scientific reasoning in elementary school children: Assessment and relations with cognitive abilities. Learning and Instruction, 29, 43–55. doi:10.1016/j.learninstruc.2013.07.005

Piekny, J., & Maehler, C. (2013). Scientific reasoning in early and middle childhood: The development of domain-general evidence evaluation, experimentation, and hypothesis generation skills. British Journal of Developmental Psychology, 31, 153–179. doi:10.1111/j.2044-835X.2012.02082.x

Pillow, B. H. (1991). Children’s understanding of biased social cognition. Developmental Psychology, 27, 539–551. doi:10.1037/0012-1649.27.4.539

Pollmeier, J., Möller, K., Hardy, I., & Koerber, S. (2011). Naturwissenschaftliche Lernstände im Grundschulalter mit schriftlichen Aufgaben valide erfassen? [Do paper-and-pencil tasks validly assess natural science skills in elementary school?]. Zeitschrift für Pädagogik, 6, 834–853.

Smith, C. L., Maclin, D., Houghton, C., & Hennessey, M. G. (2000). Sixth-grade students’ epistemologies of science: The impact of school science experiences on epistemological development. Cognition and Instruction, 18, 349–422. doi:10.1207/S1532690XCI1803_3

Sodian, B., Carey, S., Grosslight, L., & Smith, C. (1992). Junior high school students’ understanding of the nature of scientific knowledge. The notion of interpretive frameworks. Paper presented at the Annual Meeting of the American Educational Research Association, San Francisco, CA.

Sodian, B., Thoermer, C., Kircher, E., Grygier, P., & Günther, J. (2002). Vermittlung von Wissenschaftsverständnis in der Grundschule [Teaching understanding of NOS in elementary school]. Zeitschrift für Pädagogik, Supplement, 45, 192–206.

Sodian, B., Zaitchik, D., & Carey, S. (1991). Young children’s differentiation of hypothetical beliefs from evidence. Child Development, 62, 753–766. doi:10.1111/j.1467-8624.1991.tb01567.x

Thoermer, C., & Sodian, B. (2002). Science undergraduates’ and graduates’ epistemologies of science: The notion of interpretive frameworks. New Ideas in Psychology, 26, 263–283. doi:10.1016/S0732-118X(02)00009-0

Visser, I., & Speekenbrink, M. (2010). depmixS4: An R-package for hidden Markov models. Journal of Statistical Software, 36(7), 1–21.

Weiß, R. H. (2006). Grundintelligenztest Skala 2 – Revision (CFT 20-R) [Culture fair intelligence test]. Göttingen, Germany: Hogrefe.

Zimmerman, C. (2007). The development of scientific thinking skills in elementary and middle school. Developmental Review, 27, 172–223. doi:10.1016/j.dr.2006.12.001

Received 6 June 2014; revised version received 2 September 2014

Appendix

Lukas, Markus and Andreas want to become scientists. They have several ideas about what a scientist does. Who has a good idea about what a scientist does?

Response options for each statement: a good idea / not a good idea

1. Lukas believes: Many scientists invent new things and help people. (naïve)
2. Markus believes: Many scientists try to find explanations for what they can observe in the world. (advanced)
3. Andreas believes: Many scientists do experiments in order to find out something. (intermediate)

Who has the best idea of what a scientist does? No. ______

Figure A1. ‘Goals’ (Study 2). All items were presented in colour.

Lilans live on the planet “Lila”. For quite a while they observed that small green animals come from a hole in the ground every morning. They want to find out why. The Lilans think that the green animals just come out of their hole during daytime. They have several ideas about how to test their idea. Which idea do you think is good, and which one is not so good, in order to test the guess about the Lilans?

Response options for each idea: a good idea / not a good idea

1. We sit in front of the hole and observe when the green animals come out of their hole. We then write down what we have seen. (intermediate)
2. We observe the hole during the night. If the green animals come out of their hole then we know that our idea was wrong. (advanced)
3. We just put a piece of salami in front of the hole, and observe whether they then come out of their hole. (naïve)

Which is the best idea? No. ______

Figure A2. ‘Testing ideas’ (Study 2).

A hypothesis is a scientific guess that can be tested. Michael thinks about what is a good example of a hypothesis.

What is a good example of a hypothesis?

Response options for each example: a good example / not a good example

1. There are a lot of different kinds of whales. (intermediate)
2. In order to make a pizza dough you need yeast, flour, salt and water. (naïve)
3. Sunflowers that are fertilized grow bigger. (advanced)

Which is the best example of a hypothesis? No. ______

Figure A3. ‘Hypothesis’ (Study 2).
