Teacher Practical Guidance:
Standardized Testing
Category: Assessment & Planning
Rank Order
Effect Size
Achievement Gain %
How-To Strategies
BENEFITS
- Provide an objective measure and performance “viewpoint.”
- Can be used as a metric for improvement and to identify growth over time.
- Can provide meaningful data to show where instruction is strong or weak.
- Testing can improve students' attention and recall.
- Provides a basis for comparison.
- These tests are accurate in identifying which students are good at taking tests.
- Often required by courses, schools, districts, and states.
HOW TO
- Analyze class and subgroup patterns to identify priority standards where students are consistently underperforming, then adjust upcoming units and re-teaching accordingly.
- Combine standardized scores with classroom formative assessments to confirm patterns before making major instructional changes or interventions.
- Use performance bands (e.g., advanced, proficient, basic) to create flexible groups for enrichment, core instruction, and intensive support.
- Pair standardized data with quick diagnostics (exit tickets, short quizzes) to pinpoint specific skill gaps.
- Confer with students about their scores, explaining scale scores, growth, and performance levels in student-friendly language.
- Teach students to track their own progress using both standardized and classroom data.
- Use test data in PLCs to compare trends across classrooms or grade levels, co-plan responses (common interventions, shared tasks), and monitor the impact of changes over time.
- Share with families how standardized data informs instruction (not just placement or accountability), highlighting strengths, growth, and specific next steps.
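The first and third strategies above (finding priority standards and forming flexible groups from performance bands) can be sketched in a few lines of Python. This is a minimal illustration, not a prescribed method. The cut scores, the 60% threshold, the standard codes, and the student records are all hypothetical; real cut scores come from the test publisher or state, and any pattern found this way should still be confirmed against classroom formative data.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical cut scores (minimum scale score, band label), highest first.
# Real cut scores are set by the test publisher or the state.
BAND_CUTS = [(550, "advanced"), (500, "proficient"), (0, "basic")]


def band_for(scale_score):
    """Return the first band whose minimum score the student meets."""
    for minimum, label in BAND_CUTS:
        if scale_score >= minimum:
            return label
    return "basic"  # fallback for scores below every listed minimum


def flexible_groups(students):
    """Group student names by performance band."""
    groups = defaultdict(list)
    for student in students:
        groups[band_for(student["scale_score"])].append(student["name"])
    return dict(groups)


def priority_standards(students, threshold=60.0):
    """Return (standard, class average) pairs below the threshold, lowest first."""
    by_standard = defaultdict(list)
    for student in students:
        for standard, pct in student["pct_correct"].items():
            by_standard[standard].append(pct)
    averages = {std: mean(vals) for std, vals in by_standard.items()}
    low = [(std, avg) for std, avg in averages.items() if avg < threshold]
    return sorted(low, key=lambda pair: pair[1])


# Made-up records: one scale score and percent correct per standard.
students = [
    {"name": "Ana", "scale_score": 560, "pct_correct": {"RL.1": 90, "RL.2": 70}},
    {"name": "Ben", "scale_score": 510, "pct_correct": {"RL.1": 80, "RL.2": 50}},
    {"name": "Cy", "scale_score": 470, "pct_correct": {"RL.1": 70, "RL.2": 30}},
]

print(flexible_groups(students))
print(priority_standards(students))
```

Because the groups are recomputed from the data each time, they stay flexible: rerunning the sketch after the next benchmark or quiz cycle regroups students instead of fixing them in a track.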
CHALLENGES
- Tests often emphasize recall and basic skills in a narrow set of subjects, capturing only part of what students know and can do.
- Important competencies such as creativity, collaboration, problem solving, and real-world application are typically underrepresented or not measured at all.
- Heavy focus on tested areas (often reading and math) can reduce time for science, social studies, arts, physical education, and richer projects.
- Pressure to raise scores can push teachers toward “teaching to the test,” emphasizing test format and rote learning over deeper understanding.
- Tests can increase anxiety for students who may know the content but underperform under timed, high-pressure conditions.
- Teachers often feel their evaluation, job security, or school rating hinges on scores, which can create a climate of fear.
- Standardized tests can disadvantage students from low-income, minority, or culturally diverse backgrounds.
- Tests are imperfect measures of achievement and can be misused for high-stakes decisions like promotion, graduation, tracking, or teacher evaluation.
- Scores are not predictors of future success.
- Tests may not reflect the curriculum that was taught.
- Each test measures performance at only one point in time.
- Testing can create anxiety and stress, and can create incentives to cheat.
- Tests provide limited feedback, as results are often received long after the test date.
WHAT NOT TO DO
- Teachers should avoid any actions that compromise test security, distort scores, or harm students’ well-being during standardized testing. The goal is to preserve ethical standards while maintaining a calm, supportive environment.
- Do not use actual or harvested test items for drill-and-kill practice, which undermines validity and breaches ethical standards.
- Do not let “teaching to the test” replace broad, concept-rich instruction.
- Schools should not make promotion, retention, graduation, or special program decisions based on a single test score.
- Schools should not deny access to advanced courses, electives, or extracurriculars solely because a student falls below an arbitrary cut score.
- Schools should not use test scores for teacher evaluation, pay, or employment decisions, especially for teachers not directly tied to the tested subject.
- Schools should not rank or label schools (failing, low-performing) mainly on test results.
- Schools should not present rising scores as proof of real learning gains without asking whether gains reflect test prep, coaching, or score inflation rather than deeper understanding.
How-To Resources
ARTICLE
Link – GUIDE (Hanover Research Brief) Using Assessments to Support Instruction
Link – ARTICLE (Everett) Is standardized testing improving education?
Link – ARTICLE (Fordham) The case for standardized testing
Link – ARTICLE (OxfordL) Pro’s and con’s of standardized testing
Link – ARTICLE (EducAdvanced) The benefits and impact of standardized testing
Link – ARTICLE (EducWeek) Grades and Standardized Test Scores
Link – ARTICLES (EducWeek) Grading & Assessment
Link – ARTICLE (GPE) Redefining assessment
Link – ARTICLE (Stanford) Transforming assessment
Link – ARTICLE (NWEA) 3 ways to use assessment data
Link – ARTICLE (RE) How do teachers to improve instruction
Link – ARTICLE (ASCD) 15 reasons why standardized tests are problematic
Link – ARTICLE (FairTest) What’s wrong with standardized testing
Link – ARTICLE (Brookings) Standardized tests aren’t the problem, it’s how we use them
REPORT / RESEARCH
Link – REPORT (EPI) Problems with the use of student test scores to evaluate teachers
Link – REPORT (NCES) Test integrity
Link – REPORT (IL) Professional testing practices for educators
Link – RESEARCH (NIH) Using assessment to improve accuracy of teachers’ perceptions
Link – RESEARCH (ERIC) Assessment of data-driven inquiry
Link – RESEARCH (NIH) Standardized ability tests and testing: Major issues
Link – RESEARCH (NIH) Testing improves performance
VIDEO
Link – VIDEO (YouTube) Should we get rid of standardized testing?
Link – VIDEO (YouTube) What standardized tests should be measuring?
Link – VIDEO (Oliver) Standardized testing
Link – VIDEO (YouTube) Prepare kids for life
Link – VIDEO (YouTube) Testing (Simpsons)
TESTING COMPANIES
Harcourt Educational Measurement (later Harcourt Assessment), known for the Stanford Achievement Test (e.g., SAT‑9).
CTB/McGraw‑Hill, publisher of TerraNova and California Achievement Tests.
Riverside Publishing (a Houghton Mifflin company), another major producer of norm‑referenced and state assessments.
NCS Pearson / Pearson Educational Measurement, which expanded from scoring into large‑scale test development.
Renaissance: K–12 assessment and reading/math solutions (e.g., Star Assessments).
Educational Testing Service (ETS): The world’s largest private educational testing organization, responsible for exams such as GRE and TOEFL.
DIGITAL
NWEA provides computer-adaptive assessments (e.g., MAP) with detailed growth reports and learning continuum views.
ExamSoft offers secure offline test delivery, item banks, and rich analytics for K–12.
MasteryConnect from Instructure supports standards-aligned benchmarks and common assessments.
Pear Assessment combines classroom and larger-scale assessments.
Proctortrack provides AI-enabled and live online proctoring.
G2 Online Proctoring Software lists tools like Proctorio and Mercer|Mettl.
MonitorEDU supports live remote proctoring, multi-camera setups, and identity verification.
PureData Assessment Dashboard ingests raw assessment files (e.g., PSAT, state tests) and turns them into interactive filters.
References
Adesope, Trevisan, & Trevisan (2013). The Neglected Benefits of Testing: Implications for Classroom and Self-Directed Learning. WERA Educational Journal
Amrein, A., & Berliner, D. (2002). High-stakes testing & student learning. Education Policy Analysis Archives, 10, 18-19.
Berwick, C. (2019). What does research say about testing. Edutopia.
DuFour, R. (2015). In praise of American educators: And how to become even better. Solution Tree.
Fuchs & Fuchs (1986). Test Procedure Bias: A Meta-Analysis of Examiner Familiarity Effects. Review of Educational Research.
Gatlin-Nash B, Hwang JK, Tani NE, Zargar E, Wood TS, Yang D, Powell KB, Connor CM. (2021). Using Assessment to Improve the Accuracy of Teachers’ Perceptions of Students’ Academic Competence. Elementary School Journal. 121(4):609-634.
Goslin DA. (1968). Standardized ability tests and testing. Major issues and the validity of current criticisms of tests are discussed. Science. 159(3817):851-5.
Lee, J. (2006). Is test-driven external accountability effective? A meta-analysis of the evidence from cross-state causal-comparative and correlational studies. Paper presentation. Annual meeting of American Educational Research Association. San Francisco, CA.
Phelps (2019). Test frequency, stakes, and feedback in student achievement: A meta-analysis. Evaluation Review.
Phelps (2012). The effect of testing on student achievement, 1910–2010. International Journal of Testing.
Polack CW, Miller RR. (2022). Testing improves performance as well as assesses learning: A review of the testing effect with implications for models of learning. J Exp Psychol Anim Learn Cogn. 48(3):222-241.
Rowland (2014). The effect of testing versus restudy on retention: A meta-analytic review of the testing effect. Psychological Bulletin.
Standardized Testing
DEFINITION
Standardized testing: the use of assessments that are given and scored in a uniform way so that results are directly comparable across test takers. Every examinee receives essentially the same questions or a structured sample from a common pool, under the same conditions, and responses are evaluated with the same scoring rules.
Standardized tests are used to measure individual students’ achievement or aptitude, to compare performance across classrooms, schools, districts, or states, and to make decisions such as placement, graduation, or admission. They also provide data for accountability systems, program evaluation, and the identification of learning gaps among groups of students.
Summative assessments: tools used to evaluate what students have learned at the end of a defined period of instruction, such as a unit, course, semester, or program. They focus on judging the level of achievement against a standard or benchmark and typically result in a score or grade.
DATA
- 6 Meta Analysis reviews
- 728 Research studies
- 7 Million students in studies
- 5 Confidence level. Hattie (2023) p. 320
QUOTES
“States with the strongest accountability measures have made more gains over the years than those with weaker accountability measures. However, these gains mapped similar trajectories from the years before these accountability policies were brought into law. It is no guarantee that states adopting strong accountability policies will impact student achievement until substantial improvements in schooling conditions and practices occur.” Lee (2006)
…some advocates of using student test scores for teacher evaluation believe that doing so will make it easier to dismiss ineffective teachers. However, because of the broad agreement by technical experts that student test scores alone are not a sufficiently reliable or valid indicator of teacher effectiveness, any school district that bases a teacher’s dismissal on her students’ test scores is likely to face the prospect of drawn-out and expensive arbitration and/or litigation in which experts will be called to testify, making the district unlikely to prevail. link
