2016 EAGER Grant PI Convening

EAGER K-12 STEM Education Indicators Principal Investigators Gather in Washington, DC

As the DC area thawed out from an ice storm, principal investigators (PIs) for the 15 EAGER grants that NSF awarded in support of a K-12 STEM Education Indicator System, along with other stakeholders, gathered for a two-day convening at SRI’s Arlington, VA office.

SRI’s project manager for the work, Jessica Mislevy, welcomed attendees and provided an update on ongoing efforts by NSF and SRI to build a framework for collecting indicator data as called for in the National Research Council’s 2013 report Monitoring Progress Toward Successful K-12 STEM Education. Jessica explained that the purpose of the meeting would be to provide space for PIs to form connections and collaborations to better measure the indicators and to plan for impact and sustainability of the indicators system.

Dr. Joan Ferrini-Mundy, NSF Assistant Director for Education and Human Resources, kicked the morning off. Stressing the importance of the STEM indicators work as a unifying force in education research, positioned to advance research, policy, and practice, Dr. Ferrini-Mundy compared the K-12 STEM Education Indicators to the recent success of the LIGO Scientific Collaboration, in that both projects bring together disparate work to advance a field. Dr. Ferrini-Mundy described the influence of the STEM indicators research on both the Obama Administration's Federal 5-Year Strategic Plan for STEM Education and a new National Research Council committee on STEM indicators for postsecondary education. Looking to the future, she highlighted the importance of building and maintaining the STEM indicators research community.

EAGER Principal Investigators Enlighten Us in Just Seven Minutes

Next up was the K-12 STEM Education Indicators version of a lightning round: each investigator was challenged to describe his or her project in seven minutes using seven PowerPoint slides.

Melanie LaForce (University of Chicago) kicked off the lightning round by describing how she and project PI Jeanne Century developed taxonomies of STEM schools and programs as a way to identify, describe, and build consensus around what constitutes each type. In February, the team met with policymakers, researchers, and practitioners to receive feedback on the taxonomies.

Rena Dorph and Ardice Hartry (UC Berkeley Lawrence Hall of Science) explained how their project will tackle the issue of measuring the amount and kind of time and resources devoted to elementary science instruction. Dorph and Hartry have completed interviews with educators, curriculum designers, and researchers, convened an expert panel, and are in the process of piloting survey items with educators.

Morgan Polikoff (University of Southern California) presented his project's plan to develop an online system for gathering textbook adoption information from local education agencies and to implement the approach in five states (CA, FL, IL, NY, and TX). He found that textbook adoption data were difficult to collect from states, and he now plans to demonstrate the utility of the data in order to encourage states to begin tracking adoptions in usable formats. Polikoff has begun analyzing the textbook data he has collected.

April Gardner (BSCS) described how her project convened stakeholders at a summit to develop consensus around criteria for evaluating whether instructional materials align with the Framework and the NGSS, and to identify characteristics of measures to assess the quality of instructional materials. Gardner and her project team have synthesized the work of the summit participants and have begun drafting the Guidelines, which will be available in March for public review and comment.

Eric Banilower (Horizon Research) explained how his project has worked with an expert panel to operationalize definitions of the science and engineering practices and then validate survey questions based on those definitions through teacher interviews. Banilower noted that computational thinking was proving especially challenging to operationalize. Banilower’s team has conducted the first round of cognitive interviews with teachers, and the large-scale pilot to determine the psychometric properties of the survey items is scheduled for this spring.

Kun Yuan (RAND) described how she and PI Laura Hamilton are tackling the problem of obtaining a representative sample of classroom coverage of CCSS-Math and NGSS practices. Yuan explained that their project will measure instruction from the student’s perspective, and pointed out the technical and practical limitations of commonly used approaches, such as surveys and classroom observations, especially at a time when personalized learning approaches are becoming increasingly popular. The project will explore innovative approaches identified through expert interviews to address these limitations.

Drew Gitomer (Rutgers University) described how his project will use classroom artifacts as indicators of quality in STEM education. Gitomer explained how artifacts can be used as a way to capture what kinds of instructional activities students are doing in the classroom, and as explicit evidence of teachers’ goals. Gitomer has developed protocols that can be used as tools within districts, and is in the process of collecting and scoring data from teachers.

Bill Schmidt (Michigan State University) explained that he has collected a large database of coded textbooks and teacher log data. From this database he developed draft indicators of quality mathematics textbooks and classroom instruction. He then convened an expert Advisory Board to review these potential indicators.

Ellen Mandinach (WestEd) provided an update on her project's efforts to understand the potential of statewide longitudinal data systems for informing STEM indicator-related work. The team has conducted feasibility tests using Common Education Data Standards (CEDS) and Civil Rights Data Collection (CRDC) data and is working with state data directors to review data repositories. The team is also exploring how local data systems can provide information on data elements that are missing from state-level channels.

Dan Goldhaber (University of Washington) and Roddy Theobald (AIR) reported early findings on the use of teachers' licensure test scores as predictors of students' STEM achievement. Their predictive models found that teachers with higher scores on most licensure tests were more effective, although the strength of this relationship varied by test and grade.

Jamie Mikeska (ETS) described work with colleague Geoffrey Phelps analyzing data from 255 elementary science teachers to examine how measures of science teacher knowledge relate to other aspects of elementary teacher preparation and teaching experience. They are currently conducting cognitive interviews with a subset of 31 teachers to validate a suite of Content Knowledge for Teaching (CKT) assessment tasks.

Heather Howell and co-PI Geoffrey Phelps (ETS), together with project PI Yvonne Lai (University of Nebraska-Lincoln), are conducting research to determine the extent to which elementary school measures of content knowledge for teaching can be applied to secondary school contexts. Early findings reveal that constructs in elementary assessment items generally function as designed in secondary school contexts, though the team cautions that critical components of secondary teaching knowledge could be omitted.

Nicole Kersting (University of Arizona) reviewed recorded video of classroom teaching to determine the suitability of this approach as a scalable method for measuring teachers' math content knowledge for teaching. Analyses of the instruments have found a positive relationship between the newly developed items and teachers' self-reports of their practice.

Susan Kowalski (BSCS) described progress with co-PI Joseph Taylor (Abt) on reviewing published manuscripts and abstracts from federally funded projects on STEM education. The goal of the study is to establish a baseline for federal funding and publication patterns. Early findings show that it is uncommon for studies in science education to examine causal effects. Among the causal studies identified, quasi-experiments are far more common than randomized trials, but these quasi-experiments tend to be small and to rely on unmatched comparison groups. Identifying funded work that can support causal claims will help inform policymakers' decisions.

Rolf Blank (NORC at the University of Chicago) is developing a web portal of information about states' science and mathematics assessment policies. With guidance from the project's advisors, his work thus far has explored what information should be collected about state assessment policies in science and math, such as how results are used and disseminated, and has investigated the extent to which assessments are aligned to state-adopted standards via indicators such as the topics assessed at specific grade levels and cognitive demand. Ten states helped pilot the data collection in AY 2014-15.

During a poster session and working lunch, attendees had the opportunity to continue conversations, ask questions, and explore potential connections while visiting the posters of the DCL (Dear Colleague Letter) PIs.

Measuring the Indicators Using NCES Surveys

Amy Ho and Andy Zukerberg of NCES discussed STEM indicators applications of the National Teacher and Principal Survey (NTPS) and the National Assessment of Educational Progress Technology and Engineering Literacy Assessment (NAEP TEL). The NTPS is testing items relating to three STEM indicators: STEM-focused programs, granular measures of STEM instructional time, and teachers' participation in STEM-related professional development. Data from the 2014 NAEP TEL will be released in May 2016 and will address several of the STEM indicators; relevant data include access to resources for learning and instruction, learning experiences in and out of school, students' confidence in performing TEL activities, and TEL-related practices in teacher professional development. Amy noted that STEM is an important topic right now and that NCES feels it is important to measure it at a national level.

Panel: STEM Indicators White Paper Authors

Next, the discussion pivoted to the recent series of three STEM Indicators white papers, designed to whet the appetite of researchers and policymakers for the potential the indicators system holds for better research and policy. Sarah Gerard moderated a discussion among the three white paper authors:

  • Jim Pellegrino, University of Illinois at Chicago (21st Century Science Assessment: The Future is Now)
  • Michael Lach, University of Chicago (Using Indicator Data to Drive K-12 STEM Improvements in States & Districts: Implications for Leaders & Policymakers)
  • Suzanne Wilson, University of Connecticut (Measuring the Quantity & Quality of the K-12 STEM Teacher Pipeline)

The authors discussed their motivations for signing on to the series, highlights from their papers, and implications for the indicator system. Jim Pellegrino discussed the complexity of science assessments, especially as it relates to the NGSS and the Framework. Pellegrino expressed his desire to use his white paper to encourage states to consider coherent assessment systems that pay adequate attention to how assessments should be used by teachers and schools. He urged the group to communicate to states and the field at large the need to build the science assessment system from the classroom up toward the monitoring level. He also discussed the need for quality formative assessments that provide information teachers can use to guide instruction and advance student learning.

Michael Lach pointed out that the people responsible for math and science in district and state education agencies typically have little interaction with the people responsible for school reform. He would like to use the Indicators as a way to develop stronger connections between these groups, to the benefit of students. Lach explained that many administrators don't fully recognize how different math and science are from one another, which is why it is important to determine what STEM really means in the STEM Indicators.

Suzanne Wilson acknowledged that we are in a much better place as a country in terms of data to inform conversations about teacher quality, teacher content knowledge for teaching, and teachers' opportunities to learn and teach STEM. There has been strong conceptual and large-scale work at multiple levels; however, the literature reveals gaps in many areas, including science and secondary education. Wilson also noted that the field relies heavily on teacher self-report and lacks information on teacher practice. She urged the EAGER researchers to conduct work that builds on key concepts in the field in order to create greater consensus around those concepts.

Sharon Lynch (George Washington University) raised a question about adoption of the indicators at the state level and how the recently passed Every Student Succeeds Act (ESSA) could affect it. Pellegrino suggested that, given the additional control over accountability that ESSA grants states, states could give greater consideration to the indicator system and proactively support its adoption. He also suggested that state data centers could use this opportunity to collect data on the indicators.

Breakout Group Discussions: Indicator Deep Dives

The remainder of the day was devoted to three indicator-themed breakout groups, in which attendees connected on topics relevant to their projects and discussed what is needed to advance the field:

  1. Measuring Quality STEM Education Indicators (Multiple Indicators)

This group discussed issues of definition, scalability, and locus of control in measuring the STEM Indicators. Researchers pointed out that there is still a great deal of variability in the definitions of STEM schools and programs, making them difficult to measure. They also highlighted the importance to the indicators effort of collecting high-quality, longitudinal data that can be used to make policy decisions. The group urged researchers to think about which entity, at the district, state, or national level, is best positioned to provide the data, and how data on the indicators would be used: Would data be used to improve learning, or just for accountability? How will the field leverage them to make progress? Lastly, the group pointed out the importance of a common data source across the states.

  2. Teachers’ Science and Mathematics Content Knowledge for Teaching (Indicator 6)

This session focused on measurement of teachers' science and mathematics content knowledge for teaching. The group discussed a few promising directions for further unpacking content knowledge for teaching and its role in student learning, such as developing measures that detect the high end of content knowledge, not just a minimum level of proficiency. Geoffrey Phelps noted that "five years from now, we want a measure that is capturing the higher end of proficiency." The group agreed that the status quo, in which an individual can earn an undergraduate degree in mathematics without taking a geometry class, is not ideal preparation for classroom mathematics instruction.

  3. Embodying Rigorous, Research-Based Standards (Indicators 4, 5, and 12)

This session covered adoption of standards-aligned instructional materials; classroom coverage of content and practices; and states’ use of standards-aligned assessments. The group agreed that measures developed through the NSF-funded STEM Indicators projects are at varying degrees of readiness to function as national indicators. Therefore, it was suggested that different data collection tools might appeal to users at different levels: simple, precisely defined indicators would be useful at the national level, whereas increasingly detailed measures would be valuable for states and districts. The indicators can also be considered a plan of action for researchers, uniting related research efforts by providing common, validated measures.

After discussing highlights from the breakout sessions with the full group, the meeting broke for the day, leaving attendees with lots to think about for the evening.

K-12 STEM Indicators: Impact and Sustainability

Barbara Means opened the second day of the convening by reminding the group that the long-term goal of the indicators system is to improve students' opportunities to learn STEM. As part of that effort, we want to think not just about improving the indicators, but about actually having a positive impact on STEM education.

John Gawalt, Director of NCSES, provided an overview of the Science and Engineering Indicators (SEI) 2016 report, which he described as first and foremost a "collection of data presented in a policy-neutral but policy-relevant way." Gawalt explained that there is a large effort within NCSES to grow the number of data sources that can be reported in the SEI, and that the indicators system was referenced in the 2016 report. Looking ahead, Gawalt is optimistic that the indicators can help fill an important data gap in Chapter 1, which focuses on K-12 STEM education.

Panel: Policymakers and Practitioners

Next, the meeting transitioned to a lively panel discussion featuring energetic state and district practitioner/policymakers: Tiffany Neill, Science Director, Oklahoma State Department of Education; John Staley, Director, PreK-12 Mathematics, Baltimore County Public Schools and current NCSM president; and Sarah Young, STEM Liaison, Utah State Office of Education. Jessica Mislevy of SRI led the discussion, which ranged from uses of STEM data in daily work to opportunities to create new researcher-practitioner partnerships.

Neill and Young described their roles at the state level and their uses of data: often, they field calls from teachers, state legislative staff, community members, or parents and need to respond to questions about STEM education programs using the best data they have available. Young noted the importance of using data relevant to a state or district's particular context, as such data are often more powerful for policymakers. For Young, it was especially fortuitous to see Melanie LaForce's research included in a newsletter, as she was able to contact LaForce, participate in her research, and use the experience to build the Utah STEM school designation pilot program. Neill chimed in, noting that Oklahoma had started a research partnership with Rolf Blank and his DCL project, and how excited other state science supervisors were about these types of research-practice partnerships. Staley referred to Bill Schmidt's project and the potential he saw for its work with Baltimore schools to provide data on gaps that can arise in math instruction as students move from kindergarten through 12th grade. He noted that in projects such as Schmidt's and others with immediate practical relevance for schools, "it gives me a case to fight if I have data." Finally, the panel ended with a call that if researchers want policymakers and practitioners to read their research, it must be distilled into just a few pages of research highlights and policy recommendations.

Roundtable Discussions: Insights on K-12 STEM Education Challenges

Next, attendees split into five roundtables to discuss a series of K-12 STEM education challenges. Each topic was proposed by an attendee who also led the discussion.

In the Role of Science and Mathematics Standards in STEM Curricula Roundtable, Peter McLaren from Achieve, Inc. opened the discussion by asking how educators can identify the influence of existing educational standards on the ability to support high-quality STEM learning opportunities and instruction. Tiffany Neill from the Oklahoma State Department of Education expressed concern that the language used in standards and to describe the indicators is unfamiliar to teachers and not readily understood. It was agreed that nebulous terms and constructs in the standards need to be operationalized and unpacked to be accessible to a broader audience. The group also discussed the need for coherent curriculum materials that are mature and well aligned to standards, yet flexible enough for teachers to modify and adapt to local needs.

In the Synergies Between Formal and Informal STEM Roundtable, Melissa Moritz of the Department of Education led a discussion of challenges in synergizing informal and formal STEM education, including measurement issues, equity of access and quality, implementation quality, and the fact that formal and informal instructors rarely share professional development experiences. This led the group to call for a comprehensive vision of STEM education that includes both formal and informal education. While seeds of knowledge to develop such a vision already exist, the effort would require strong research-practice partnerships and financial support.

The STEM Education & Opportunity Structures for Minority & Low-SES Student Populations Roundtable discussed the idea that STEM-focused schools need to provide opportunity structures, not just more math/science courses, for under-represented youth. Sharon Lynch described how the inclusive STEM-focused high schools in her research considered opportunity structures, including connections to STEM professionals in the community and connections with higher education. She pointed out that these schools achieved good outcomes on traditional measures, such as graduation and acceptance into 4-year colleges, for their minority and low-income students. There was speculation that the absence of a majority ethnic/racial group at many of these schools may have contributed to the positive social climate for students from groups under-represented in STEM fields.

In the Efficient Measurement of Complex Constructs in STEM Education Roundtable, Chris Wilson of BSCS led a discussion of challenges and opportunities for efficiently measuring complex constructs in STEM education, including processes for scaling up measures, affordances and pitfalls of technology, and building measures from the classroom level up, rather than top-down, in order to gain teacher support for the indicators. The group also addressed differences among the STEM disciplines and how measurement related to engineering can be especially difficult.

Finally, in the PreK-8 STEM Teachers: Identification and Support Roundtable, Skip Fennell from McDaniel College posed the question, "What does it mean to be a STEM teacher at the elementary level?" The group wrestled with this question given the considerable variation in when and how elementary science is taught across states and districts. Participants showed interest in the possibilities offered by an elementary specialist model (as opposed to a generalist model). Math education expert Jere Confrey argued that elementary students get better teaching in music and PE because we put specialists in those classrooms. Some states and school districts have found ways to create expert STEM teachers at the elementary level. Sarah Young from the Utah State Office of Education described a new STEM endorsement in her state that leverages endorsements teachers may have already earned in math and offers an incentive to continue moving up the career ladder.

Breakout Group Discussions: Future of the Indicators

Next, the group broke into three sessions to discuss the future of the indicators system.

The first group focused on Research Recommendations for Access to Quality STEM Education Indicators (1-5). Laura Hamilton provided a summary of the discussion to the full group, noting the importance of ensuring that the indicators are relevant for state and local governments. She did, however, reference some challenges with data collection, such as collecting information on interdisciplinary STEM time and understanding what teachers count as STEM activities. Hamilton cited Morgan Polikoff's success in obtaining textbook data in Texas as a great example, and encouraged the group to think of ways to incentivize state and local data collection so that it is easier to carry out and comparable across localities. Finally, so that lessons learned from these projects do not fall through the cracks, she implored the PIs to maintain connections across projects once the EAGER grants end.

The second group discussed Research Recommendations for Educator Capacity Indicators (6-8), with Nicole Kersting reporting the conversation to the full group. The group focused primarily on Indicator 7, teachers' participation in STEM-specific professional development (PD) activities, and discussed whether teacher PD is an appropriate indicator given the slim evidence of PD effectiveness. Some argued that the indicator should be retained despite this limited evidence, given that true effect sizes may be small and outcome measures of student achievement insensitive. The group agreed that it would be valuable to collect data on characteristics of effective PD, such as duration and relevance to practice.

The third group focused on Supporting the Impact and Sustainability of the STEM Indicators. Skip Fennell provided a discussion summary to the full group, starting by recognizing the challenges of the current climate for federal involvement in education. Fennell noted that to continue to move the STEM indicators work forward, support would be necessary from existing stakeholders and organizations such as CCSSO, NGA, NCTM, and NCSM. In addition, Department of Education and NSF staff should come together to think about the alignment between research and practice. Methods for increasing the likelihood of state use of the indicators were suggested, such as developing state-specific indicator data to increase local interest. Fennell stated that it is also important for researchers to be flexible and sensitive to the audiences to whom they present this work, and to provide solutions rather than simply questions. The group agreed that further investment and development will be necessary to move the STEM Indicators effort forward, and hoped that NSF would continue to prioritize research related to the indicators.

Finally, Karen King, NSF’s program officer for this work, closed the meeting by stressing that the 14 indicators are an evolving, interacting system. King called for researcher consensus around how to measure the indicators precisely and feasibly, ideally at both the state and national level, and urged the group to continue the momentum of this work.

See how the meeting developed in real time with tweets from the event, and follow the conversation going forward with #STEMindicators.