REL Reference Desk

Using Growth Models and Value-added Models to Inform Instruction

November 2, 2010

Request

How can you use value-added assessments or growth models to inform instruction, as opposed to using them for school-level or teacher-level evaluation?

Response

We found little research about using value-added assessments or growth models to inform instruction. However, we found a substantial body of literature on value-added models related to tracking student achievement, measuring teacher quality and effectiveness, and the statistical difficulties and limitations of the models. We present three articles below that describe how various school districts have used data from value-added assessments or growth models for instructional purposes.

While schools have relied on annual administration of standardized tests to measure student and school performance, educators are finding that these scores do not generate meaningful growth data and, more specifically, data to inform instruction. With increased interest in tracking how much schools contribute to students’ learning over time, school officials are increasingly turning to growth models to measure this progress; the value-added model (VAM) is one example of a growth model (Olson, 2004). VAM can be used to estimate the gains individual students make over time (e.g., moving from a first grade reading level to a third grade reading level) and how much teachers contribute to student scores. Recently, educators have also shown interest in how VAM data can be used to inform classroom instruction.
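To make the general form of such models concrete, the display below sketches one simple gain-score formulation of a value-added model. This is an illustrative simplification only, not the specification used by TVAAS or any other system discussed in this review, and the notation (y for test scores, theta for the teacher effect, x for background controls) is introduced here purely for exposition.

\[
y_{i,t} - y_{i,t-1} \;=\; \theta_{j(i,t)} + \mathbf{x}_{i,t}'\boldsymbol{\beta} + \varepsilon_{i,t}
\]

Here y_{i,t} is student i's test score in year t, \theta_{j(i,t)} is the estimated contribution (the "value added") of the teacher to whom the student is assigned that year, x_{i,t} holds any student background controls with coefficients \beta, and \varepsilon_{i,t} is random error. Operational systems such as TVAAS fit considerably more elaborate multivariate, multi-year models, but the underlying idea of attributing measured score gains to teachers or schools is the same.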

Methodology

In order to answer the submitted question, articles were identified through online library searches and Internet searches. Searches were conducted in academic databases such as ERIC, WilsonWeb, JSTOR, Academic OneFile, and the University of Memphis online library, as well as on the websites of the Tennessee, Ohio, and Pennsylvania Departments of Education. To supplement the findings, we conducted an additional search through Google using the phrases “value added models,” “growth models,” and “value added models to inform instruction.”

We found a large amount of literature about value-added models that discusses:

  • tracking student achievement scores (Martineau, 2006; Lockwood, McCaffrey, & Hamilton, 2007);
  • measuring teacher quality and effectiveness (Braun, 2005; Goe, Bell, & Little, 2008; Kupermintz, 2003); and
  • unique statistical difficulties and limitations (Schochet & Chiang, 2010; Martineau, 2010; Rothstein, 2009; Sanders, Wright, Rivers, & Leandro, 2009).

However, we found little research about using VAM or growth models to inform instruction. The following section summarizes the small amount of research currently available.

Value-added assessment data to inform instruction

Because VAM shows the impact of classroom instruction on student learning, it can provide detailed information at the classroom level. The articles below describe how different districts have used value-added data to inform instruction.

In Rochester, New Hampshire, school leaders began administering the Measures of Academic Progress (MAP) to report both status and growth scores for each of their students (Yeagley, 2007). They found that while low-achieving students were making gains, advanced students were showing a low rate of academic growth. With these data, they developed an instructional plan aimed at ensuring that all students made significant gains. First, teachers identified the needs of individual students using the value-added assessment data. Then the teachers and students together developed individualized learning plans.

For the first time, teachers could use the data to create flexible groupings and differentiate instruction within their classrooms. Students helped develop their individualized plans by meeting with the teacher in the fall to review the results and choose growth targets to accomplish over the following year. Specific outcomes for the district (e.g., higher achievement scores or greater academic growth among advanced students) were not mentioned; the article offers only a detailed plan of how the schools used the data to inform instruction.

The author does suggest “exercising caution” when using growth measures and value-added data. First, these measures should not be used as a complete measure of student progress or school performance. Second, these growth measures are difficult to apply in high schools. Finally, if the assessment and growth data are used for teacher evaluation, they should be paired with other measures to provide an accurate evaluation.

In a second article, Hershberg, Simon, and Lea-Kruger (2004) discuss using value-added data to see where the focus of instruction lies in a classroom, how effective instruction has been, and how to change instructional patterns. They then give three examples of school districts (in Tennessee, Ohio, and Pennsylvania) that have used value-added data to transform instructional practices in their schools and ultimately raise student achievement.

The authors believe value-added assessments alone do not improve student achievement. Instead, they encourage school leaders to first understand the data and then use it to guide classroom instruction and professional development. The article provides some detail about how to accomplish this, but overall the guidance remains quite vague.

Finally, Jerald (2009) gives fairly detailed examples of districts (in Tennessee, Pennsylvania, and Ohio) that use data to improve student achievement and reorganize curriculum and classes. For example, a district in Ohio added a more advanced mathematics course after discovering that students were achieving “below-average” growth in mathematics. The final section describes how other districts (New York City; Maryville, Tennessee; and North Carolina) use the data to compensate teachers for their “effectiveness,” match teacher strengths with student needs, and move ineffective teachers out of low-performing schools.

Jerald (2009) believes enough information exists to use value-added statistical methods to inform instruction, stating that “if data is used wisely, such information can lead to better informed decisions that benefit everyone with a stake in improving teaching and learning” (p. 6). However, the author also notes that most teachers and administrators still lack access to the data needed to take advantage of VAM.

Conclusion

The literature suggests that while schools can use VAM to inform instruction, there is still little research to help educators fully understand how this can be accomplished. Therefore, until educators become more experienced in using value-added assessment data, best practices cannot be accurately determined (McCaffrey & Hamilton, 2007).

Note that this review offers a summary of the research literature related to using value-added assessment data to inform instruction. It is not an exhaustive review of the research literature pertaining to value-added assessments; rather, it focuses on a collection of literature that specifically gives examples of how school leaders have used VAM data to inform instruction.

Resources

The following abstracts and summaries were taken verbatim from the online academic, government, or public databases from which we obtained them.

Braun, H.I. (2005). Using student progress to evaluate teachers: A primer on value-added models. Princeton, NJ: Educational Testing Service. Retrieved October 14, 2010 from http://www.ets.org/Media/Research/pdf/PICVAM.pdf.

The Educational Testing Service (ETS) has issued Using Student Progress to Evaluate Teachers: A Primer on Value-Added Models. The report is intended to aid interested parties who are considering using the value-added model statistical tool to hold teachers accountable for student achievement.

Goe, L., Bell, C., & Little, O. (2008). Approaches to evaluating teacher effectiveness: A research synthesis. Washington, DC: National Comprehensive Center for Teacher Quality.

This research synthesis examines how teacher effectiveness is currently measured and provides practical guidance for how best to evaluate teacher effectiveness. It evaluates the research on teacher effectiveness and the different instruments used to measure it. In addition, it defines the components and indicators that characterize effective teachers, extending this definition beyond teachers' contribution to student achievement gains to include how teachers impact classrooms, schools, and their colleagues as well as how they contribute to other important outcomes for students. The findings are presented along with related policy implications. In addition, the synthesis describes how various measures have been evaluated, explains why certain measures are most suitable for certain purposes (high-stakes evaluation versus formative evaluation, for instance), and suggests how the results of the study might be used to inform the national conversation about teacher effectiveness.

Hershberg, T., Simon, V.A., & Lea-Kruger, B. (2004). The revelations of value-added. School Administrator, 61(11), 10–14. Retrieved from WilsonWeb.

Value-added assessment offers administrators a new way to measure teaching and learning. This assessment model broadens understanding of the contribution instruction makes to student learning by focusing on growth rather than solely on levels of absolute achievement. Details of how value-added assessment has been used in three different school districts to transform instructional practice and raise student achievement are provided.

Jerald, C.D. (2009). The value of value-added data. K-12 policy. Education Trust. Retrieved from ERIC, ED507719.

Researchers demonstrated a quarter century ago that schools could effectively employ value-added statistical methods. At the time, only a few states and districts had accumulated the necessary annual assessment data to take advantage of the breakthrough. Today, every state has the capacity to provide educators with value-added data. Yet most American teachers and administrators still lack access to such information. In its proposed regulations for the Race to the Top program, the U.S. Department of Education has signaled that it wants to change this. Educators should welcome the push. Principals, teachers, and parents will gain valuable information about students' past and predicted performance. School and district administrators will have more information about teachers and the programs intended to hone teachers' skills. Last but certainly not least, teachers will have more information about the effectiveness of their own classroom instruction. If used wisely, such information can lead to better informed decisions that benefit everyone with a stake in improving teaching and learning.

Kupermintz, H. (2003). Teacher effects and teacher effectiveness: A validity investigation of the Tennessee Value Added Assessment System. Educational Evaluation and Policy Analysis, 25(3), 287–298.

This article addresses the validity of teacher evaluation measures produced by the Tennessee Value Added Assessment System (TVAAS). The system analyzes student test score data and estimates the effects of individual teachers on score gains. These effects are used to construct teacher value-added measures of teaching effectiveness. We describe the process of generating teacher effectiveness estimates in TVAAS and discuss policy implications of using these estimates for accountability purposes. Specifically, the article examines the TVAAS definition of teacher effectiveness, the mechanism employed in calculating numerical estimates of teacher effectiveness, and the relationships between these estimates and student ability and socioeconomic background characteristics. Our validity analyses point to several logical and empirical weaknesses of the system, and underscore the need for a strong validation research program on TVAAS.

Lockwood, J.R., McCaffrey, D.F., & Hamilton, L.S. (2007). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures. Journal of Educational Measurement. Retrieved from WilsonWeb, ISSN 0022-0655.

Using longitudinal data from a cohort of middle school students from a large school district, we estimate separate “value-added” teacher effects for two subscales of a mathematics assessment under a variety of statistical models varying in form and degree of control for student background characteristics. We find that the variation in estimated effects resulting from the different mathematics achievement measures is large relative to variation resulting from choices about model specification, and that the variation within teachers across achievement measures is larger than the variation across teachers. These results suggest that conclusions about individual teachers' performance based on value-added models can be sensitive to the ways in which student achievement is measured.

Martineau, J.A. (2010). The validity of value-added models: An allegory. Phi Delta Kappan, 91(7), 64–67. Retrieved from ERIC, EJ882374.

Value added models have become popular fixes for various accountability schemes aimed at measuring teacher effectiveness. Value added models may resolve some of the issues in accountability models, but they bring their own set of challenges to the table. Unfortunately, political and emotional considerations sometimes keep one from examining value added models carefully. To make it easier to scrutinize them objectively, the author presents an allegory that places them in a noneducational context. A story about a foster parent who tries to find out how his effectiveness is measured serves as an allegory for the problems that beset many value-added teacher assessment models.

Martineau, J.A. (n.d.). Value added models in education: Theory and application. College Park, MD: University of Maryland, Department of Measurement, Statistics, and Evaluation. Retrieved October 14, 2010 from http://marces.org/booklink/Value%20Added%20Models%20in%20Education%20Theory%20and%20Application.pdf.

The editor tried to select authors who could represent a variety of experiences with Value Added Modeling (VAM). These included theoretical developments such as modeling the effects of test linking on growth analysis, what it means to do VAM applications in a valid way, what has been learned about VAM from applications in local school systems and from analyses of national and state data sets, and the application of VAM to important questions in education policy.

Martineau, J.A. (2006). Distorting value added: The use of longitudinal, vertically scaled student achievement data for growth-based, value-added accountability. Journal of Educational and Behavioral Statistics, 31(1), 35–62.

Longitudinal, student performance-based, value-added accountability models have become popular of late and continue to enjoy increasing popularity. Such models require student data to be vertically scaled across wide grade and developmental ranges so that the value added to student growth/achievement by teachers, schools, and districts may be modeled in an accurate manner. Many assessment companies provide such vertical scales and claim that those scales are adequate for longitudinal value-added modeling. However, psychometricians tend to agree that scales spanning wide grade/developmental ranges also span wide content ranges, and that scores cannot be considered exchangeable along the various portions of the scale. This shift in the constructs being measured from grade to grade jeopardizes the validity of inferences made from longitudinal value-added models. This study demonstrates mathematically that the use of such "construct-shifting" vertical scales in longitudinal, value-added models introduces remarkable distortions in the value-added estimates of the majority of educators. These distortions include (a) identification of effective teachers/schools as ineffective (and vice versa) simply because their students' achievement is outside the developmental range measured well by "appropriate" grade-level tests, and (b) the attribution of prior teacher/school effects to later teachers/schools. Therefore, theories, models, policies, rewards, and sanctions based upon such value-added estimates are likely to be invalid because of distorted conclusions about educator effectiveness in eliciting student growth. This study identifies highly restrictive scenarios in which current value-added models can be validly applied in high-stakes and low-stakes research uses. This article further identifies one use of student achievement data for growth-based, value-added modeling that is not plagued by the problems of construct shift: the assessment of an upper grade content (e.g., fourth grade) in both the grade below and the appropriate grade to obtain a measure of student gain on a grade-specific mix of constructs. Directions for future research on methods to alleviate the problems of construct shift are identified as well.

McCaffrey, D.F., & Hamilton, L.S. (2007). Value-added assessment in practice: Lessons from the Pennsylvania Value-Added Assessment System pilot project. Santa Monica, CA: RAND Corporation. Retrieved from http://www.rand.org/pubs/technical_reports/TR506/

The No Child Left Behind Act of 2001 places a strong emphasis on the use of student achievement test scores to measure school performance, and, throughout the United States, school and district education reform efforts are increasingly focusing on the use of student achievement data to make decisions about curriculum and instruction. To encourage and facilitate data-driven decision making, many states and districts have begun providing staff with information from value-added assessment (VAA) systems—collections of complex statistical techniques that use multiple years of test-score data to try to estimate the causal effects of individual schools or teachers on student learning. The authors examined Pennsylvania’s value-added assessment system, which was rolled out in four waves, allowing comparison of a subset of school districts participating in the VAA program with matched comparison districts not in the program. The study found no significant differences in student achievement between VAA and comparison districts. The authors surveyed school superintendents, principals, and teachers from these districts about their attitudes toward and use of test and value-added data for decision making, and found that most educators at schools participating in the VAA program do not make significant use of the information it provides. McCaffrey and Hamilton conclude that the utility of VAA cannot be accurately assessed until educators become more engaged in using value-added measures.

Olson, L. (2004, November 24). States weigh value added models. Education Week. Retrieved from www.edweek.org/ew/articles/2004/11/24/13test.h24.html.

No abstract available.

Rothstein, J. (2009). Student sorting and biases in value-added assessment: Selection on observables and unobservables. Education Finance and Policy, 4(4), 537–571. Retrieved from ERIC, EJ863344.

Nonrandom assignment of students to teachers can bias value-added estimates of teachers' causal effects. Rothstein (2008, 2010) shows that typical value-added models indicate large counterfactual effects of fifth-grade teachers on students' fourth-grade learning, indicating that classroom assignments are far from random. This article quantifies the resulting biases in estimates of fifth-grade teachers' causal effects from several value-added models, under varying assumptions about the assignment process. If assignments are assumed to depend only on observables, the most commonly used specifications are subject to important bias, but other feasible specifications are nearly free of bias. I also consider the case in which assignments depend on unobserved variables. I use the across-classroom variance of observables to calibrate several models of the sorting process. Results indicate that even the best feasible value-added models may be substantially biased, with the magnitude of the bias depending on the amount of information available for use in classroom assignments.

Sanders, W.L., Wright, S.P., Rivers, J.C., & Leandro, J.G. (2009). A response to criticisms of SAS EVAAS. Cary, NC: SAS Institute Inc. Retrieved from http://www.portal.state.pa.us/portal/server.pt/community/pa_valueadded_assessment_system_(pvaas)/8751/research/614788.

No abstract available.

Schochet, P.Z., & Chiang, H.S. (2010). Error rates in measuring teacher and school performance based on student test score gains. Washington, DC: U.S. Department of Education.

This paper addresses likely error rates for measuring teacher and school performance in the upper elementary grades using value-added models applied to student test score gain data. Using realistic performance measurement system schemes based on hypothesis testing, we develop error rate formulas based on OLS and Empirical Bayes estimators. Simulation results suggest that value-added estimates are likely to be noisy using the amount of data that are typically used in practice. Type I and II error rates for comparing a teacher's performance to the average are likely to be about 25 percent with three years of data and 35 percent with one year of data. Corresponding error rates for overall false positive and negative errors are 10 and 20 percent, respectively. Lower error rates can be achieved if schools are the performance unit. The results suggest that policymakers must carefully consider likely system error rates when using value-added estimates to make high-stakes decisions regarding educators.

This publication was prepared under a contract with the U.S. Department of Education’s Institute of Education Sciences, Contract ED-06-CO-0021, by Regional Educational Laboratory Appalachia, administered by CNA. The content of the publication does not necessarily reflect the views or policies of IES or the U.S. Department of Education, nor does mention of trade names, commercial products, or organizations imply endorsement by the U.S. government.