"A child's learning is the function more of the characteristics of his classmates than those of the teacher." James Coleman, 1972

Saturday, November 14, 2015

AERA Concludes the Facts Are Factual about VAM

The American Educational Research Association (AERA) can be counted on to remain irrelevant to research discussions of great social significance.  Not surprisingly, AERA's shrinking membership numbers have coincided with the org's steady drift into the arms of education reform schoolers and the corrupt CorpEd foundations that are laser-focused on redirecting education at all levels into corporate revenue streams.

AERA's complacency and complicity have been sickening to watch, with their kowtowing to Bill Gates and his various bad ideas culminating last year when AERA announced a fellowship program for doctoral students interested in channeling, and then relinquishing, their doctoral research to Gates's MET database.

Now, almost 20 years after legitimate researchers started ringing the alarm bell on the value-added muddle thrust upon the education world by the tobacco-chewing ag statistician, Bill Sanders, and six years after the National Academy of Sciences sent its hair-on-fire letter to Arne Duncan (which was ignored), warning him about including not-ready-for-prime-time VAM in Race to the Top requirements, AERA has finally concluded that the truth must be true:  VAM is not a legitimate tool for ANY high-stakes education decisions.

From AERA's announcement:
. . . . In recent years, many states and districts have attempted to use VAM to determine the contributions of educators, or the programs in which they were trained, to student learning outcomes, as captured by standardized student tests. The AERA statement speaks to the formidable statistical and methodological issues involved in isolating either the effects of educators or teacher preparation programs from a complex set of factors that shape student performance.

“This statement draws on the leading testing, statistical, and methodological expertise in the field of education research and related sciences, and on the highest standards that guide education research and its applications in policy and practice,” said AERA Executive Director Felice J. Levine.

The statement addresses the challenges facing the validity of inferences from VAM, as well as specifies eight technical requirements that must be met for the use of VAM to be accurate, reliable, and valid. It cautions that these requirements cannot be met in most evaluative contexts.

The statement notes that, while VAM may be superior to some other models of measuring teacher impacts on student learning outcomes, “it does not mean that they are ready for use in educator or program evaluation. There are potentially serious negative consequences in the context of evaluation that can result from the use of VAM based on incomplete or flawed data, as well as from the misinterpretation or misuse of the VAM results.”

The statement also notes that there are promising alternatives to VAM currently in use in the United States that merit attention, including the use of teacher observation data and peer assistance and review models that provide formative and summative assessments of teaching and honor teachers’ due process rights.

The statement concludes: “The value of high-quality, research-based evidence cannot be over-emphasized. Ultimately, only rigorously supported inferences about the quality and effectiveness of teachers, educational leaders, and preparation programs can contribute to improved student learning.” Thus, the statement also calls for substantial investment in research on VAM and on alternative methods and models of educator and educator preparation program evaluation. 
For a full research review and history of VAM's origin and growth in Tennessee, see The Mismeasure of Education (Horn & Wilburn, 2013).
 

Thursday, October 10, 2013

Evidence Presented in the Case Against Growth Models for High Stakes Purposes


Posted earlier today at Substance News:

Evidence Presented in the Case Against Growth Models for High Stakes Purposes
Denise Wilburn and Jim Horn

The following article quotes liberally from The Mismeasure of Education, and it represents an overview of the research-based critiques of value-added modeling, or growth models. We offer it here to Chicago educators [and educators everywhere] with the hope that it may serve to inspire and inform the restoration of fairness, reliability, and validity to the teacher evaluation process and the assessment of children and schools.

In the fall of 2009, before the final guidance was issued to the cash-strapped states lined up for a share of the $3.4 billion in Race to the Top grants, the Board on Testing and Assessment (BOTA) issued a 17-page letter to Arne Duncan conveying the National Research Council's response to the RTTT draft plan.  BOTA cited reasons to applaud the DOEd's efforts, but the main purpose of the letter was to voice, in unequivocal language, the NRC's concern regarding the use of value-added measures, or growth models, for high-stakes purposes specifically related to the evaluation of teachers:

BOTA has significant concerns that the Department’s proposal places too much emphasis on measures of growth in student achievement (1) that have not yet been adequately studied for the purposes of evaluating teachers and principals and (2) that face substantial practical barriers to being successfully deployed in an operational personnel system that is fair, reliable, and valid (p. 8).
            In 1992 when Dr. William Sanders sold value-added measurement to Tennessee politicians as the most reliable, valid, and fair way to measure student academic growth and the impact that teachers, schools, and districts have on student achievement, the idea seemed reasonable, more fair, and even scientific to some.  But since 1992, leading statisticians and testing experts who have scrutinized value-added models have concluded that these assessment systems for measuring test score growth do not meet the reliability, validity, and fairness standards established by respected national organizations, such as the American Educational Research Association, the American Psychological Association, and the National Council on Measurement in Education (Amrein-Beardsley, 2008).  Nonetheless, value-added modeling for high-stakes decision making now consumes significant portions of state education budgets, even though there is no oversight agency to make sure that students and teachers are protected:

Who protects [students and teachers in America's schools] from assessment models that could do as much harm as good? Who protects their well-being and ensures that assessment models are safe, wholesome, and effective? Who guarantees that assessment models honestly and accurately inform the public about student progress and teacher effectiveness? Who regulates the assessment industry? (Amrein-Beardsley, 2008, p. 72)

If value-added measures do not meet the highest standards established for reliable, valid and fair measurement, then the resulting high stakes decisions made based on these value-added measures are also unreliable, invalid and unfair.  Therefore, legislators, policymakers, and administrators who require high-stakes decisions based on value-added measures are equally culpable and liable for the wrongful termination of educators mismeasured with metrics derived from tests that were never intended to do anything but give a ballpark idea of how students are progressing on academic benchmarks for subject matter concepts at each grade level.

      Is value-added measurement reliable? Is it consistent in its measurement and free of measurement errors from year to year?  As early as 1995, the Tennessee Office of Education Accountability (OEA) reported “unexplained variability” in Tennessee’s value-added scores and called for an outside evaluation of all components of the Tennessee Value-Added Assessment System (TVAAS), including the achievement tests used in calculating the value-added scores.  The outside evaluators, Bock, Wolfe & Fisher (1996), questioned the reliability of the system for high-stakes decisions based on how the achievement tests were constructed.  These experts recognized that test makers are engaged in a very difficult and imprecise science when they attempt to rank order learning concepts by difficulty. 

This difficulty is compounded when they attempt to link those concepts across a grade-level continuum so that students successfully build subject matter knowledge and skill from grade to grade.  An extra layer of content design imprecision is added when test makers create multiple forms of the test at each grade level to represent those rank-ordered test items.  Bock, Wolfe, & Fisher found that variation in test construction was, in part, responsible for the “unexplained variability” in Tennessee’s state test results.

Other highly respected researchers (Ballou, 2002; Lockwood, 2006; McCaffrey & Lockwood, 2008; Briggs, Weeks & Wiley, 2008) have weighed in on the issue of reliability of value-added measures based on questionable achievement test construction.  As an invited speaker to the National Research Council workshop on value-added methodology and accountability in 2010, Ballou pointedly went to the heart of the test quality matter when he acknowledged the “most neglected” question among economists concerned with accountability measures:

The question of what achievement tests measure and how they measure it is probably the [issue] most neglected by economists…. If tests do not cover enough of what teachers actually teach (a common complaint), the most sophisticated statistical analysis in the world still will not yield good estimates of value-added unless it is appropriate to attach zero weight to learning that is not covered by the test. (National Research Council and National Academy of Education, 2010, p. 27).

In addition to these test issues, the reliability of the teacher effect estimates in high-stakes applications is compromised by a number of other recurring problems:

1) the timing of the test administration and summer learning loss (Papay, 2011);
2) missing student data (Bock & Wolfe, 1996; Fisher, 1996; McCaffrey et al., 2003; Braun, 2005; National Research Council, 2010);
3) student data poorly linked to teachers (Dunn, Kadane & Garrow, 2003; Baker et al., 2010);
4) inadequate sample size of students due to classroom arrangements or other school logistical and demographic issues (Ballou, 2005; McCaffrey, Sass, Lockwood, & Mihaly, 2009).

Growth models such as the Sanders VAM use multiple years of data in order to reduce the degree of potential error in gauging teacher effect.  Sanders justifies this practice by claiming that a teacher's effect on her students' learning will persist into the future and, therefore, can be measured with consistency.

However, research conducted by McCaffrey, Lockwood, Koretz, Louis, & Hamilton (2004) and subsequently by Jacob, Lefgren and Sims (2008) shatters this bedrock assumption.  These researchers found that “only about one-fifth of the test score gain from a high value-added teacher remains after a single year…. After two years, about one-eighth of the original gain persists” (p. 33).

Too many uncontrolled factors impact the stability and sensitivity of value-added measurement for making high-stakes personnel decisions for teachers.  In fact, Schochet and Chiang (2010) found that the error rate for distinguishing teachers from average teaching performance using three years of data was about 26 percent.  They concluded:

more than 1 in 4 teachers who are truly average in performance will be erroneously identified for special treatment, and more than 1 in 4 teachers who differ from average performance by 3 months of student learning in math or 4 months in reading will be overlooked (p. 35).
  
Schochet and Chiang also found that to reduce error in the variance in teachers’ effect scores to 12 percent, rather than 26 percent, ten years of data would be required for each teacher (p. 35).  When we consider the stakes for altering school communities and students’ and teachers’ lives, the utter impracticality of reducing error to what may be argued as an acceptable level makes value-added modeling for high-stakes decisions simply unacceptable.
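The square-root arithmetic driving these error rates can be illustrated with a toy Monte Carlo sketch (the noise level, cutoff, and sample size below are hypothetical values chosen for illustration, not parameters from Schochet and Chiang's actual model): a truly average teacher's yearly value-added estimate is pure measurement noise, and averaging over k years shrinks that noise only by a factor of the square root of k, so misclassification rates fall slowly as years of data accumulate.

```python
import random
import statistics

random.seed(42)

def misflag_rate(years, noise_sd=0.3, cutoff=0.2, n_teachers=20000):
    """Fraction of truly AVERAGE teachers (true effect = 0) whose
    multi-year average estimate still crosses the flagging cutoff.
    noise_sd and cutoff are illustrative, not empirical, values."""
    flagged = 0
    for _ in range(n_teachers):
        # Each yearly estimate = true effect (0 here) + sampling noise.
        yearly = [random.gauss(0.0, noise_sd) for _ in range(years)]
        # Averaging k years shrinks the noise only by sqrt(k).
        if abs(statistics.mean(yearly)) > cutoff:
            flagged += 1
    return flagged / n_teachers

for k in (1, 3, 10):
    print(f"{k:2d} year(s) of data: "
          f"{misflag_rate(k):.1%} of average teachers misflagged")
```

Even in this simplified sketch, the misflag rate for truly average teachers falls only in proportion to the square root of the years pooled, which is why Schochet and Chiang's estimate that roughly ten years of data would be needed to approach a tolerable error level is so damning for annual personnel decisions.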

Beyond the reliability problems summarized above, growth models also have serious validity issues caused by the nonrandom placement of students from classroom to classroom, from school to school within districts, or from system to system. Non-random placement of students further erodes Sanders' causal claims for teacher effects on achievement, as well as his claim that the impact of student characteristics on student achievement is irrelevant. For a teacher's value-added scores to be valid, she must have “an equal chance of being assigned any of the students in the district of the appropriate grade and subject”; otherwise, “a teacher might be disadvantaged [her scores might be biased] by placement in a school serving a particular population” year after year (Ballou, 2005, p. 5). In Tennessee, no educational policy or administrative rule requires schools to randomly assign teachers or students to classrooms, so there is no equal chance of random placement of teachers or students within a school, within a district, or across the state.

To underscore the effect of non-random placement of disadvantaged students on teacher effect estimates, Kupermintz (2003) found in reexamining Tennessee value-added data that “schools with more than 90% minority enrollment tend to exhibit lower cumulative average gains” and school systems’ data showed “even stronger relations between average gains and the percentage of students eligible for free or reduced-price lunch” (p. 295).

Value-added models like TVAAS that assume random assignment of students to teachers’ classrooms “yield misleading [teacher effect] estimates, and policies that use these estimates in hiring, firing, and compensation decisions may reward and punish teachers for the students they are assigned as much as for their actual effectiveness in the classroom” (Rothstein, 2010, p. 177). 

In his 2005 value-added primer, Henry Braun stated clearly and boldly that “the fundamental concern is that, if making causal attributions [of teacher effect on student achievement performance] is the goal, then no statistical model, however complex, and no method of analysis, however sophisticated, can fully compensate for the lack of randomization” (p. 8).

Any system of assessment that claims to measure teacher and school effectiveness must be fair in its application to all teachers and to all schools.  Because teaching is a contextually embedded, nonlinear activity that cannot be accurately assessed using a linear, context-independent value-added model, it is unfair to use such a model at this time. The consensus among VAM researchers is to recommend against the use of growth models for high-stakes purposes.  Any assessment system that can misidentify 26 percent or more of teachers as above or below average when they are neither is unfair when used for decisions of dismissal, merit pay, granting or revoking tenure, closing a school, retaining students, or withholding resources for poor performance.

When the almost two-thirds of teachers who do not teach subjects in which standardized tests are administered are rated based on the test score gains of other teachers in their schools, the assessment system has led to unfair and unequal treatment (Gonzalez, 2012).

When the assessment system intensifies teaching to the test, narrowing of curriculum, avoidance of the neediest students, reduction of teacher collaboration, or the widespread demoralization of teachers (Baker, E. et al, 2010), then it has unfair and regressive effects.
 
Any assessment system whose proprietary status limits access by the scholarly community to validate its findings and interpretations is antithetical to the review process upon which knowledge claims are based.  An unfair assessment system is unacceptable for high stakes decision-making. 

In August 2013, the Tennessee State Board of Education adopted a new teacher licensing policy that ties teacher license renewal to value-added scores.  However, implementation of the policy was delayed by a very important presentation made public by the Tennessee Education Association. Presented by TEA attorney Rick Colbert, and based on individual teachers sharing their value-added data for additional analysis, the presentation demonstrated that 43 percent of the teachers who would have lost their licenses due to declining value-added scores in one year had higher scores the following year, with 20 percent of those teachers scoring high enough the following year to retain their licenses. The presentation may be viewed on YouTube: http://www.youtube.com/watch?v=l1BWGiqhHac
  
After 20 years of using value-added assessment in Tennessee, educational achievement does not reflect any added value from the state's expensive investment in the value-added assessment system.  With $326,000,000 spent for assessment, the TVAAS, and other accountability-related costs since 1992, the state's student achievement levels remain in the bottom quarter nationally (Score Report, 2010, p. 7). Tennessee received a D on K–12 achievement when compared to other states based on NAEP achievement levels and gains, poverty gaps, graduation rates, and Advanced Placement test scores (Quality Counts 2011, p. 46). The Public Education Finances reports (U.S. Census Bureau) rank Tennessee's per-pupil spending 47th for both 1992 and 2009. When state legislators and policymakers were led to believe in 1992 that the teacher is the single most important factor in improving student academic performance, they found reason to justify lowering education spending as a priority and increasing accountability.

Finally, the evidence from twenty years of review and analysis by leading national experts in educational measurement and accountability leads to the same conclusion when trying to answer Dr. Sanders’ original question: Can student test data be used to determine teacher effectiveness? The answer: No, not with enough certainty to make high-stakes personnel decisions.  In turn, when we ask the larger social science question (Flyvbjerg, 2001):  Is the use of value-added modeling and high-stakes testing a desirable social policy for improving learning conditions and learning for all students?  The answer must be an unequivocal “no,” and it must remain so until assessments measure various levels of learning at the highest levels of reliability and validity, and with the conscious purpose of equality in educational opportunity for all students.

We have wasted much time, money, and effort to find out what we already knew: effective teachers and schools make a difference in student learning and in students’ lives. What the TVAAS and the EVAAS do not tell us, and what supporters of growth models seem oddly uncurious to know, is what, how, or why teachers make a difference. While test data and value-added analysis may highlight strengths and/or areas of needed intervention in school programs or subgroups of the student population, we can only know the “what,” “how,” and “why” of effective teaching through careful observation by knowledgeable observers in classrooms where effective teachers engage students in varied levels of learning across multiple contexts.  And while this kind of knowing may be too much to ask of any set of algorithms developed so far for deployment in schools, it is not at all alien to great educators, who have been asking these questions and doing this kind of knowledge sharing since Socrates, at least.


References
Amrein-Beardsley, A. (2008). Methodological concerns about the Education Value-Added Assessment System.  Educational Researcher, 37(2), 65-75. doi:  10.3102/0013189X08316420
Baker, A., Xu, D., and Detch, E.  (1995).  The measure of education:  A review of the Tennessee value added assessment system.  Nashville, TN:  Comptroller of the Treasury, Office of Education Accountability Report.
Baker, E. L., Barton, P. E., Darling-Hammond, L., Haertel, E., Ladd, H. F., Linn, R. L., Ravitch, D., Rothstein, R., Shavelson, R. J., & Shepard, L. A. (2010, August 29). Problems with the use of student test scores to evaluate teachers (Briefing Paper #278). Washington, DC: Economic Policy Institute.

Ballou, D. (2002).  Sizing up test scores. Education Next. Retrieved from www.educationnext.org

Ballou, D.  (2005).  Value-added assessment: Lessons from Tennessee.  Retrieved from http://dpi.state.nc.us/docs/superintendents/quarterly/2010-11/20100928/ballou-lessons.pdf

Bock, R. and Wolfe, R.  (1996, Jan. 23).  Audit and review of the Tennessee value-added assessment system (TVAAS):  Preliminary report.  Nashville, TN:  Comptroller of the Treasury, Office of Education Accountability Report.
Braun, H. I. (2005). Using student progress to evaluate teachers (Policy Information Perspective). Retrieved from Educational Testing Service, Policy Information Center website:  http://www.ets.org/Media/Research/pdf/PICVAM.pdf

Briggs, D. C., Weeks, J. P. & Wiley, E.  (2008, April).  The sensitivity of value-added modeling to the creation of a vertical scale score.  Paper presented at the National Conference on Value-Added Modeling, Madison, WI.  Retrieved from http://academiclanguag.wceruw.org/news/events/VAM%20Conference%20Final%20Papers/SensitivityOfVAM_BriggsWeeksWiley.pdf

Dunn, M., Kadane, J., & Garrow, J. (2003). Comparing harm done by mobility and class absence: Missing students and missing data. Journal of Educational and Behavioral Statistics, 28, 269–288.
Fisher, T.  (1996, January).  A review and analysis of the Tennessee value-added assessment system.  Nashville, TN:  Tennessee Comptroller of the Treasury, Office of Education Accountability Report.
Flyvbjerg, B.  (2001).  Making social science matter: Why social inquiry fails and how to make it succeed again.  Cambridge: Cambridge University Press.

Gonzalez, T.  (2012, July 17).  TN education reform hits bump in teacher evaluation.  The Tennessean.  Retrieved from

Jacob, B. A., Lefgren, L. & Sims, D. P. (2008, June). The persistence of teacher-induced learning gains (Working Paper 14065).  Retrieved from the National Bureau of Economic Research website:  http://www.nber.org/papers/w14065

Kupermintz, H. (2003). Teacher effects and teacher effectiveness: A validity investigation of the Tennessee Value Added Assessment System. Educational Evaluation and Policy Analysis, 25(3), 287-298.

Lockwood, J. R., McCaffrey, D. F., Hamilton, L. S., Stecher, B., Le, V. & Martinez, F. (2006). The sensitivity of value-added teacher effect estimates to different mathematics achievement measures.  Retrieved from The Rand Corporation website:  http://www.rand.org/content/dam/rand/pubs/reports/2009/RAND_RP1269.pdf

McCaffrey, D. F. & Lockwood, J. R. (2008, November). Value-added models:  Analytic Issues.  Paper presented at the National Research Council and the National Academy of Education, Board of Testing and Accountability Workshop on Value-Added Modeling, Washington DC.

McCaffrey, D. F., Lockwood, J. R., Koretz, D. M. & Hamilton, L. S. (2003).  Evaluating value-added models for teacher accountability.  Retrieved from The Rand Corporation website:  http://www.rand.org/pubs/monographs/MG158.html

McCaffrey, D. F., Lockwood, J. R., Koretz, D., Louis, T. A., & Hamilton, L. (2004). Models for value-added modeling of teacher effects. Journal of Educational and Behavioral Statistics, 29(1), 67-101.

McCaffrey, D. F., Sass, T. R., Lockwood, J. R. & Mihaly, K. (2009). The intertemporal variability of teacher effect estimates. Education Finance and Policy, 4(4), 572-606.

National Academy of Sciences.  (2009).  Letter report to the U. S. Department of Education on the Race to the Top fund.  Washington, DC: National Academies of Sciences. Retrieved from http://www.nap.edu/catalog.php?record_id=12780
National Research Council and National Academy of Education. (2010). Getting Value Out of Value-Added: Report of a Workshop. Committee on Value-Added Methodology for Instructional Improvement, Program Evaluation, and Educational Accountability, Henry Braun, Naomi Chudowsky, and Judith Koenig, Editors. Center for Education, Division of Behavioral and Social Sciences and Education. Washington, DC: The National Academies Press.

Papay, J. (2011). Different tests, different answers:  The stability of teacher value-added estimates across outcome measures. American Educational Research Journal, 48(1),163-193.

Quality counts, 2011: Uncertain forecast. (2011, January 13). Education Week. Retrieved from http://www.edweek.org/ew/toc/2011/01/13/index.html

Rothstein, J. (2010). Teacher quality in educational production: Tracking, decay, and student achievement.  The Quarterly Journal of Economics, 125(1), 175-214.
Schochet, P. Z. & Chiang, H. S. (2010). Error Rates in Measuring Teacher and School Performance Based on Student Test Score Gains (NCEE 2010-4004). Washington, DC: National Center for Education Evaluation and Regional Assistance, Institute of Education Sciences, U.S. Department of Education.
State Collaborative on Reforming Education. (2010). The state of education in Tennessee (Annual Report). Retrieved from http://www.tnscore.org/wp-content/uploads/2010/06/Score-2010-Annual-Report-Full.pdf

U. S. Census Bureau. (2011). Public Education Finances: 2009 (G09-ASPEF). Washington, DC: U.S. Government Printing Office.

Thursday, September 19, 2013

TMoE Excerpt


Posted some days ago @thechalkface:

Part of the story we tell in TMoE focuses on how value-added modeling (VAM) grew up out of the backwoods of Tennessee to become an essential component of the national Common Core testing delivery engine that now appears to be running on fumes way short of its destination.

Had anyone at the Gates Foundation or ED cared enough about how VAM has affected learning in Tennessee over the past 20 years to do some of the basic research that we did, they would have found little to recommend the scaling up of VAM to its current level.   Sadly, other priorities with deep historical roots prevailed, which is the other part of the story we tell.

The piece below appeared at Truthout last week as "State Failures, National Models and the Perpetuation of Educational Injustice:"
____________________________
The following excerpts have been slightly revised from the original text, The Mismeasure of Education, published July 2013 by Information Age Publishing.

Passed in 1992, Tennessee's Education Improvement Act (EIA) represented the culmination of a decade of education reform experimentation in testing, teacher performance pay and teacher evaluation and credentialing. State legislators and policy elites joined the debate leading up to the EIA with the expressed intent of increasing student test performance and decreasing the disparity in funding among state school districts, and they came out of the process with a brand new statewide system for testing, data collection and monitoring: the Tennessee Value Added Assessment System (TVAAS). The TVAAS has significantly reshaped instructional practices and student, school and teacher assessment statewide, and its value-added modeling (VAM) has become a central element in national education policy talk and implementation. Its statistical parameters, codified in Tennessee state law since 1992, make needed adjustments to the assessment system next to impossible, and they have made teaching to standardized tests a classroom reality. More importantly, perhaps, the value-added focus on test score growth has masked a continuing slippage of proficiency rates when measured against students in other states on the National Assessment of Educational Progress (NAEP) tests.

Tennessee's commitment to the TVAAS has diminished educational diversity in Tennessee, while stunting the educational opportunities of children, particularly in urban areas, in ways that are likely to have lasting negative effects in adulthood. With the continuing need for changing workforce skill sets among ever-changing economic environments at home and abroad, students enmeshed in testing protocols have not been provided with the intellectual and applied skills that they needed most to enable them to survive and thrive, or to prepare them as literate creators and innovators, responsible decision-makers, and collaborative problem-solvers. The capacity to assess high-level cognitive and non-cognitive skills has been severely limited by the use of multiple-choice tests that became increasingly high stakes, for which teachers spent inordinate amounts of time preparing children to take and pass. The focus on the state tests and the results of the value-added manipulations has diminished students' access to learning environments that allow and encourage the development of high-level thinkers and doers. By every psychometric comparison dear to the hearts of testing reform advocates, whether it is the SAT, ACT or NAEP, Tennessee has not improved the education of its citizens in relation to national testing trends, nor has it successfully addressed the funding equity gaps. And yet, 20 years after the passage of the EIA, another generation of reformers supports the national application of VAM with tighter testing accountability and even higher stakes for those with the least power to alter the conditions that are responsible for the continued failure on the next generation of tests.

After decades of highly anticipated reform success that remains unrealized, business elites, corporate foundations, philanthro-capitalists and venture philanthropists (Horn & Libby, 2011) soldier on in their crusade to bring business-inspired social efficiencies, privatization and corporate governance to public education. Fortified by tax breaks for funding corporate charter reform schools and school voucher programs, recently re-labeled as "scholarship tax credit programs" (National Conference of State Legislators, 2012), venture philanthropy and corporate advocacy philanthropy were able to grow into multibillion-dollar enterprises operated by Wall Street hedge funds (Gabriel & Medina, 2010) and tax-exempt foundations. Huge financial investments have yielded unprecedented levels of political and education policy influence, as noted by the founder of the Economic Policy Institute, Jeff Faux (2012):
It is well known, although rarely acknowledged in the press, that the reform movement has been financed and led by the corporate class. For over twenty years, large business oriented foundations, such as Gates (Microsoft), Walton (Wal-Mart) and Broad (Sun Life) have poured billions into charter school start-ups, sympathetic academics and pundits, media campaigns (including Hollywood movies) and sophisticated nurturing of the careers of privatization promoters who now dominate the education policy debate from local school boards to the U.S. Department of Education. (para. 4)
As corporate influence has come to dominate the ways that the education policy agenda is implemented, then, more top-down testing mandates, higher stakes and heavier sanctions have displaced most other school priorities, particularly in communities made up of poor, minority and immigrant populations. Following passage of NCLB, schools in disadvantaged communities found themselves struggling even to stay open to educate the children of the poor, whose parents were encouraged to opt for the only alternative to their neglected and failure-designated public school: corporate reform charter schools. Upon the recommendations of corporate foundations and an increasingly influential education reform industry flush with cash from federal discretionary grants, the U.S. Department of Education grew its Charter School Program from $6 million annually in 1995 to $256 million in 2011 (Lazarin, 2011, p. 11). With the passage of NCLB and the subsequent introduction of RTTT grants, the federal government also provided other generous economic incentives to nonprofit and for-profit corporations to help them open the preferred type of "No Excuses" reform charter schools to replace the most vulnerable urban public schools. Allowed to operate without the services and protections (for both teachers and students) required of public schools, the most significant "public" aspect of the corporate reform charter schools remains the public tax dollars that fund them.

Through the spread of charter schools, contracted management, transportation, commercial curriculum, private tutoring and testing services, the well-worn reform strategies based on more high-stakes testing and nationalized curriculum expanded to claim a significant market share of the $600 billion spent each year on P-20 education. In late 2012, Rob Lytle, a partner in a Boston consulting firm (Simon, 2012), outlined for a group of investors in Manhattan the imminent bonanza taking shape as national standards and national testing were scheduled to replace state and local standards and tests:
Think about the upcoming rollout of new national academic standards for public schools. . . . If they're as rigorous as advertised, a huge number of schools will suddenly look really bad, their students testing way behind in reading and math. They'll want help, quick. And private, for-profit vendors selling lesson plans, educational software and student assessments will be right there to provide it. . . . You start to see entire ecosystems of investment opportunity lining up. It could get really, really big. (para. 2–3)
Educators and other concerned citizens of the nation and the world should not wait and wonder if Mr. Lytle or those he advises will notice the social and natural ecologies screaming for attention just behind the investment ecosystem that holds the attention of policy elites. For as surely as the last half of the 20th century directed schools toward competing in a global economy that has further concentrated power in economic institutions without national boundaries, the first half of the 21st century must be devoted to education aimed toward cooperating in global, national and local ecologies to save life on the planet. Otherwise, economies of the future, whether global, national or local, will not matter for much.

We dismiss educational research at our peril

John Dewey believed that we might proceed toward the future only so far as we are willing to examine our past. For Dewey, experience was made up of a continuum of past, present and future that intersected with our internal and external geographies in an ongoing interactive dance (Dewey, 1938). Experience, thus derived from the components of time (past, present, future) and space (internal and external), became organized by reflective and purposive thought directed toward action. If Dewey was correct, the current steady state of education reform, which is seemingly unable to break out of a fixation on high-stakes testing, makes sense. For in perpetuating a brazen kind of anti-history that discards knowledge that scholars have gleaned from past reform efforts, education reformers have become doomed to repeat the failures that they appear determined to ignore. Take, for example, what was learned about implementation failures following the 1960s "reform initiatives" at the federal level that had to span various levels and layers of government and institutions (McLaughlin, 1987). Writing some 20 years after the initial Title I evaluations, McLaughlin (1987) noted that economists and sociologists, who were the "chief architects" of Great Society programs, were quick to assign blame when their "theories of scientific management" did not produce the results predicted by their "notions of hierarchical authority and bureaucratic control":
Thus while economists interpreted disappointing program outcomes as market failure and sought solutions in incentives, sociologists and organization theorists saw signs of inadequate organizational control, and counseled new penalties and increased oversight. (p. 171)
Unfortunately, lessons learned by second-generation policy analysts were lost on the next generation of reformers, whose laser focus on high-stakes testing from the 1990s forward missed or dismissed much of the scholarly and practical advice emanating from university departments. The empirically based research of those departments, meanwhile, was quickly being displaced by the advocacy research of corporate-sponsored think tanks that posed as research centers while functioning as public relations and marketing annexes for corporate education reform. A deepening tunnel vision resulted among politicians and their corporate patrons who sought to make their reputations as education reformers. Independent scholarship from within universities, then, was often ignored or treated as ad hoc support for the "resistance to reforms" that has been attributed mainly to teachers and their professional organizations since the 1960s. After all, university researchers were teachers, too.

A decade before No Child Left Behind announced the war against the achievement gap, policy analysts who had closely studied and written about earlier reform efforts warned of negative outcomes from the push following the Charlottesville Conference in 1989 for more test-based accountability and harsher sanctions to quell "continued resistance" to Reagan-Bush market-based education initiatives. In a special section in The Phi Delta Kappan, Milbrey McLaughlin (1991) wrote the introduction to five articles by education scholars with extensive knowledge of policy reforms, and under her summary point number four, "Test-based accountability plans often misplace trust and protection," McLaughlin offered these potential negative outcomes for "high-stakes testing schemes" (p. 250):
• Perverting incentives for teachers - encouraging them to avoid difficult students and difficult schools.
• Discouraging classroom innovation, risk-taking and invention.
• Allocating "failure" disproportionately to nontraditional or at-risk students.
• Forcing out of the curriculum the very kinds of learning - higher-order thinking and problem solving - that learning theorists and others say are most important to "increased national competitiveness" and success in the world marketplace (p. 250).
Just as reformers ignored warnings during the 1960s, we know now that these prescient warnings were ignored as well, even though they were offered ten years before NCLB became law in 2001 and almost 20 years before Race to the Top, which has served to buy alliances at the statehouse policy level for the continuing war that would accept "no excuses" for any outcome that did not result in the corporatization of all educational territories.

As long as American education policy continues to rank schools using scores, whether scores are from current state tests or the tougher tests being designed to align with a national curriculum adopted by 46 states in 2012, there will always be a bottom 5% of low-performing schools, which offers plenty of room for the growth of corporate charter schools that will carry the torch for corporate education reform initiatives. The growth of charter schools encourages the further elimination of "resistance" to the intensification of social separation through segregated schools, which are staffed largely by young, minimally prepared recruits who impose total-compliance, test-based curriculums that, if deemed successful, indoctrinate children (Horn, 2011) to behave in ways that defy the effects of socioeconomic inequality. Unfortunately, inequality cannot be cornered into the schools, treated with harsh instructional solvents designed to scrub away the effects of poverty, and then tested to make sure the residue has been alleviated. Nor can we pretend that the creation and nurturing of unequal schools will solve inequality, whether those schools are made unequal by minimally prepared teachers, the absence of basic services like school libraries, the preponderance of atrophied curriculums or the physical and psychic separation of the poor and disenfranchised. That such outcomes may be offered as a viable resolution to inequality in education, which is parroted as the "civil rights issue of our generation" (Change.gov, 2008), represents a minstrel version of social justice, paraded on the public stage for the benefits that may be derived from delusion or deception - or both.

Although forgotten in rhetoric and in deed, the U.S. Supreme Court declared in a unanimous decision almost 60 years ago that "separate educational facilities are inherently unequal" (Brown v. Board of Education, 1954). We must stop pretending they are not - or else risk a further erosion of the moral courage required to complete the forgotten goal and neglected task of building a quality system of public schools that serves the needs of all children.

References

Change.gov. (2008). President-Elect Obama nominates Arne Duncan as Secretary of Education. Change.gov. Retrieved from

Dewey, J. (1938). Experience and education. New York: Touchstone.

Faux, J. (2012). Education profiteering: Wall Street's next big thing? Huffington Post. Retrieved from http://www.huffingtonpost.com/jeff-faux/education-wall-street_b_1919727.html

Gabriel, T., & Medina, J. (2010, May 10). Charter schools' new cheerleaders: Financiers. New York Times. Retrieved from http://www.nytimes.com/2010/05/10/nyregion/10charter.html?pagewanted=all&_r=0

Horn, J. (2011). Corporatism, KIPP, and cultural eugenics. In P. Kovacs (Ed.), The Gates Foundation and the future of U.S. 'public schools' (pp. 80-103). New York: Routledge.

Horn, J., & Libby, K. (2011). The giving business: The New Schools Venture Fund. In P. Kovacs (Ed.), The Gates Foundation and the future of U.S. 'public schools' (pp. 168-185). New York: Routledge.

Lazarin, M. (2011). Federal investment in charter schools: A proposal for reauthorizing the Elementary and Secondary Education Act. Washington, DC: Center for American Progress. Retrieved from http://www.americanprogress.org/wp-content/uploads/issues/2011/10/pdf/charter_investment.pdf

McLaughlin, M. (1987). Learning from experience: Lessons from policy implementation. Educational Evaluation and Policy Analysis, 9(2), 171-178.

McLaughlin, M. (1991). Test-based accountability as a reform strategy. The Phi Delta Kappan, 73(3), 248-251.

National Conference of State Legislatures. (2012). Tuition tax credits. Washington, DC: NCSL. Retrieved from http://www.ncsl.org/issues-research/educ/school-choice-scholarship-tax-credits.aspx

Simon, S. (2012, August 2). Private firms eyeing profits from U.S. public schools. New York: Thomson Reuters. Retrieved from http://in.reuters.com/assets/print?aid=INL2E8J15FR20120802