Education Assessment and Accountability Review Subcommittee

Minutes of the 1st Meeting

of the 2005 Interim

April 26, 2005

The first meeting of the Education Assessment and Accountability Review Subcommittee was held on Tuesday, April 26, 2005, at 10:00 AM, in Room 131 of the Capitol Annex. Representative Harry Moberly Jr, Co-Chair, called the meeting to order, and the secretary called the roll.

Present were:

Members: Senator Jack Westwood, Co-Chair; Representative Harry Moberly Jr, Co-Chair; Senators Dan Kelly, Ken Winters, and Ed Worley; Representatives Jon Draud, Mary Lou Marzian, and Frank Rasche.

Guests: Ben Hicks, CTB/McGraw-Hill; Janice Allen, Kentucky Board of Education; Clyde Caudill, Jefferson County Public Schools; Andrea Sinclair and Art Thacker, HumRRO; and Wayne Young, Kentucky Association of School Administrators.

LRC Staff: Sandy Deaton, Audrey Carr, Jonathan Lowe, Janet Stevens, and Lisa Moore.

Representative Moberly welcomed new members Senator Westwood and Senator Winters to the Education Assessment and Accountability Review Subcommittee (EAARS). He entertained a motion for approval of the minutes. Senator Worley made a motion to approve the minutes from the November 3, 2004 meeting, and Senator Winters seconded the motion.

Representative Moberly proceeded with the election of co-chairs for the subcommittee. Senator Kelly moved to elect Senator Westwood to be the Senate co-chair, and the motion was seconded by Senator Worley. Senator Kelly made the motion that nominations cease, and Senator Worley seconded the motion. The motion was approved by voice vote.

Representative Rasche moved to elect Representative Moberly as the House of Representatives co-chair, and the motion was seconded by Representative Draud. Representative Rasche made the motion that nominations cease, and Representative Draud seconded the motion. The motion was approved by voice vote.

Representative Moberly said Senate Joint Resolution 156 from the 2004 Regular Session directed the Office of Education Accountability (OEA) to conduct a study of the Commonwealth Accountability and Testing System (CATS). He said the study is nearly complete, and the final components will be presented at the next EAARS meetings on May 20 and June 1, 2005.

Representative Moberly introduced Ms. Marcia Seiler, Director, OEA, who presented to the members using a PowerPoint presentation. Ms. Seiler said the study was to address seven components regarding: 1) the appropriateness of CATS component scores to measure achievement levels of the core content; 2) the validity and adequacy of CATS results as indicators of student achievement; 3) alignment with the No Child Left Behind (NCLB) Act; 4) the value of CATS assessment and enhancing instructional practices; 5) the validity of the writing portfolio; 6) the effects of CATS on assessment in the school; and 7) the cost of CATS. Ms. Seiler said that in June 2004, OEA presented EAARS a study plan that addressed the seven components of the study. The plan was approved by the subcommittee, and an external contractor was hired to assist in conducting several parts of the study. She said the results of the contractor's survey data and literature review, along with responses from members of the National Technical Advisory Panel on Assessment and Accountability (NTAPAA), and other work conducted by OEA and Legislative Research Commission (LRC) staff, will be used to answer specific questions in the study.

Ms. Seiler began by giving a background of the CATS system. In 1998, House Bill 53 modified the system of assessment and accountability in Kentucky. She said CATS was first administered in 1999 and fully implemented in 2002. The Kentucky Board of Education (KBE) set a goal for each school in the state to reach the standard of Proficient by 2014. CATS was created to ensure school accountability for student achievement set forth in statute.

Ms. Seiler went over the requirements set forth in House Bill 53. The bill's three main requirements are that the assessment: 1) measure grade-appropriate core academic content, basic skills, and higher-order thinking skills and their application; 2) provide valid and reliable scores for schools and, if scores are reported for students individually, valid and reliable scores for those students; and 3) minimize the time spent by teachers and students on assessment.

Ms. Seiler said the Kentucky Educational Standards are based on the six learning goals set out in the Kentucky Education Reform Act (KERA). The Kentucky Department of Education (KDE) then created academic expectations that characterized those student achievement goals. In addition, the program of studies was created and provided local school districts with the basis for establishing their curricula. She said the core content for assessment represents the content that has been identified as being essential for all students to know and will be included on the state assessment. The content is designed to be used with the academic expectations and the program of studies.

Ms. Seiler said the current assessment components of CATS include: 1) the Kentucky Core Content Tests (KCCT); 2) writing, which is assessed through the portfolio; 3) the Comprehensive Test of Basic Skills (CTBS); and 4) the alternate portfolio. Ms. Seiler explained the specific details of each assessment component, as outlined in the members' handouts.

Ms. Seiler said schools and districts are held accountable based on student performance and relevant nonacademic measures. She said the long-term goal for each school and district is to reach Proficient by 2014. Schools and districts are evaluated every two years to determine whether progress is being made toward their accountability goals.

Ms. Seiler said the KCCT measures a student's mastery of the core content for assessment. Performance judgments are given to each student based upon their assessment performance. These are: Distinguished, Proficient, Apprentice (low, medium, high), and Novice (non-performance, medium, and high).

Ms. Seiler said after the academic index is determined for each content area, the accountability index for a school is calculated. This is a complex formula that is explained in detail in the full report. She said the academic index for each subject along with CTBS scores and non-academic data receive a weight in this calculation. The accountability index is then formulated for each school. She said the performance of a school is evaluated every biennium so every two years the accountability index of the school is combined to produce one value.

Ms. Seiler said a school growth chart is created for each school. The growth chart is formulated as if the school would reach the long-term goal of 100 in equal steps with each step taking two years. Every two years the school's accountability index is calculated and plotted on the growth chart, and where the point falls determines the school's designation for that biennium. Ms. Seiler said every school has seven accountability cycles between 2000 and 2014 to reach its goal. If a school's index is above the goal line, then the school is considered to have met its goal. A school with a point between the goal line and the assistance line is considered progressing, and a school with a point below the assistance line is considered needing assistance. If a school's accountability index falls within the margin of error, the school is classified as meeting goals.
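
The growth chart logic described above can be sketched in a few lines of Python. This is an illustration only, not KDE's actual calculation: the function names, the baseline value, the position of the assistance line, and the size of the margin of error are hypothetical inputs, and the treatment of a point falling exactly on a line is an assumption.

```python
def goal_line(baseline, cycle, target=100.0, cycles=7):
    """Goal-line value after `cycle` biennia.

    The line rises from the school's baseline index to `target`
    in `cycles` equal steps (seven two-year cycles, 2000 to 2014).
    """
    step = (target - baseline) / cycles
    return baseline + step * cycle


def classify(index, goal, assistance, margin=0.0):
    """Designation for one biennium.

    `assistance` and `margin` are supplied by the caller because the
    minutes do not say how either value is derived (assumption).
    """
    # A point on or above the goal line, or within the margin of
    # error below it, counts as meeting the goal.
    if index >= goal - margin:
        return "meeting goal"
    # Between the goal line and the assistance line.
    if index >= assistance:
        return "progressing"
    return "needing assistance"


# Hypothetical school starting at 58.0: each step is (100 - 58) / 7 = 6 points.
goal_after_three_cycles = goal_line(58.0, 3)
print(goal_after_three_cycles, classify(73.0, goal_after_three_cycles, 70.0))
```

With a hypothetical baseline of 58, the goal line rises 6 points per biennium, so a school scoring 73 in the third cycle against an assumed assistance line of 70 would be designated as progressing.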

Ms. Seiler said the percentage of schools in assistance has decreased over the past two biennia, and the percentage of schools meeting goals has increased from 49.6 percent to 56.2 percent. She said by statute, each parent is to receive a report card on the performance of their student's school, and the report card at a minimum should include student academic achievement, including assessment results; non-academic achievement, including the school's attendance, retention, and drop-out rates; and student transition to adult life data. The KDE also provides individual student reports to the districts, and districts can distribute these reports to parents.

Ms. Seiler gave another PowerPoint presentation on the alignment of CATS with NCLB. She said NCLB was enacted in 2002 and mandated the establishment of educational standards and assessments in every state that applied for federal Title I funds.

Ms. Seiler said that in order for Kentucky to meet the NCLB mandate, the KBE decided to continue to implement CATS, incorporate changes where necessary, and augment the system to meet the federal requirements. She said this decision set up a single system of assessment, but two measures of accountability. The two systems each use some of the same assessments. The CATS assessments cover all content, yet not all the grades required by NCLB. In addition, the CATS system tests additional subjects and grades not assessed by NCLB.

Ms. Seiler said under NCLB, each state that applies for Title I funding must submit a plan detailing that state's compliance with the NCLB Act. The United States Department of Education (USDOE) has established a process for a review and approval of each state plan. Phase I was completed in 2004, which was a review of accountability plans, and Phase II, the review of the assessment and standards plans, will begin in 2005. She said both phases must be complete in 2005 in order for state systems to be fully compliant and operational. Ms. Seiler explained five key differences between NCLB and CATS, which are listed in the members' handouts.

Ms. Seiler said NCLB and CATS have similar goals as both state and federal goals seek proficient student performance by 2014. NCLB allows each state flexibility to define "proficient" and develop assessments that measure student knowledge of math, reading, and science core content. NCLB judgments of schools and subgroups are based upon reaching Adequate Yearly Progress (AYP) goals set for each school. She said subgroups that are considered under NCLB include: ethnicity/race, economically disadvantaged students, disabled students, and limited English proficient students.

Ms. Seiler said NCLB measures school and subgroup performance in meeting AYP on three objectives: 1) annual measurable objective; 2) student participation rates; and 3) other academic indicators. She said an annual measurable objective is the goal of increasing the percent of students performing at the proficient level, at both the grade and subgroup levels, in both reading and math. The growth toward the goal is measured yearly, and a goal is set for each level of school: elementary, middle, and high school.

Ms. Seiler said NCLB requires a 95 percent participation rate of the student population. Participation rate applies to grade level and to each subgroup. She said under CATS there is a 100 percent participation as all enrolled students are assessed, excluding students who have valid reasons for exclusion set out in regulation.

Ms. Seiler said the final factor in meeting AYP is other academic indicators, which vary according to the school level. In high school, it is the graduation rate. In elementary and middle school, it is the CATS Accountability Index. She said the high school graduation rate was established at 71 percent in 2002, and increases every year to an ultimate goal of a 98 percent graduation rate in 2014.
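
The three AYP factors described above (annual measurable objective, participation rate, and other academic indicator) can be combined in a short Python sketch. This is a simplified illustration under stated assumptions, not the USDOE or KDE procedure: the class and function names are hypothetical, a single target stands in for what are separate reading and math objectives, and the sample percentages are invented. Only the 95 percent participation requirement comes from the presentation.

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    """Results for one grade level or subgroup (hypothetical structure)."""
    name: str
    pct_proficient: float   # percent of students at the proficient level
    participation: float    # percent of enrolled students assessed


def meets_ayp(groups, amo_target, other_indicator_met, min_participation=95.0):
    """Return True only if every factor is satisfied.

    `amo_target` is the annual measurable objective (assumed to be one
    number here); `other_indicator_met` stands for the graduation rate
    (high school) or CATS Accountability Index (elementary and middle).
    """
    for group in groups:
        # 1) Annual measurable objective, checked for each group.
        if group.pct_proficient < amo_target:
            return False
        # 2) Participation rate, checked for each group.
        if group.participation < min_participation:
            return False
    # 3) Other academic indicator for the school level.
    return bool(other_indicator_met)


# Invented example data for illustration.
school = [
    GroupResult("all students", 50.0, 98.0),
    GroupResult("limited English proficient", 45.0, 96.0),
]
print(meets_ayp(school, amo_target=40.0, other_indicator_met=True))
```

The sketch shows why a single subgroup can determine the outcome: one group falling short on either proficiency or participation is enough for the school to miss AYP, whatever its overall results.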

Ms. Seiler said the two accountability systems under CATS and NCLB create two different accountability results. As a result of this, a school can be subject to different interventions and sanctions depending upon how it performs on each of the accountability models. Under NCLB, the interventions grow with each year a school falls short of its AYP goal. Under CATS, the interventions grow the further a school is from its two-year goal. Ms. Seiler further explained the NCLB consequences for failure to meet AYP as outlined in the handout provided to members.

Ms. Seiler said in some instances a school will meet its NCLB goals and not meet the CATS goals. She said NCLB requires AYP decisions to be provided with sufficient time for parents to make decisions about transfer options. In 2004, the delayed return of final CATS scores resulted in incorrect NCLB determinations of AYP. The preliminary AYP determinations were incorrect for 78 schools.

Ms. Seiler said NCLB reporting requirements call for individual student interpretive, descriptive, and diagnostic reports. NCLB and CATS both require annual school report cards to be prepared for each public school. The NCLB report card is the Kentucky Performance Report for each school, which meets the state and federal requirements for reporting.

Ms. Seiler introduced Dr. James S. Catterall, Chair, and Dr. John Poggio, Vice Chair, NTAPAA, to present to the members their responses to questions concerning the validity and reliability of student test scores generated by the CATS and KCCT tests. She said NTAPAA has been in an advisory role to Kentucky's education system since 1999.

Representative Moberly asked the members if there were questions of Ms. Seiler. Senator Kelly asked about the last item on the NCLB report card, data on teacher quality. He wanted to know if this covered teachers teaching within their subject areas and rankings. Ms. Seiler said she believed it did, but did not know the specifics. Representative Moberly said Kentucky ranks pretty well on teacher quality under the NCLB standards, and Ms. Seiler agreed.

Representative Moberly asked Ms. Seiler to talk briefly about the NCLB requirements with respect to the test that Kentucky would give, and whether it would be a norm-referenced test or an off-the-shelf test. What is the latitude that NCLB gives, and what does it require? Ms. Seiler said the requirement is that the assessment chosen must assess the core content and standards. It has to be aligned with the state standards. It is left to the state's discretion to set the standards and choose the assessments. The next phase of review will determine whether Kentucky's standards and assessments meet the NCLB requirements.

Representative Draud asked about the incorrect data distributed to 78 schools regarding preliminary AYP determinations. Ms. Seiler said she understood that the preliminary data was based only upon the math and reading multiple choice scores that came out in August. Final scores were held until after the scoring of the open response questions, which adjusted the scores. Representative Draud asked how this problem could be corrected in the future. Ms. Seiler said the KDE is looking at ways to correct this, including backing up the assessment window a week, which would help get the results out earlier.

Representative Draud asked how many examples there are in Kentucky where the AYP is actually different than the CATS scores, for example, where a school did quite well on the CATS, but failed with the AYP. Ms. Seiler asked Mr. Gene Wilhoit, Commissioner, KDE, to come to the table to answer the question. Mr. Wilhoit said the KDE knew this was going to happen, and that is why Kentucky asked the USDOE for a waiver, but it was denied. The USDOE said that Kentucky had to release any results that it had in the first year by August 1, 2004. He said this problem has been corrected for the second year. Representative Draud said it is a complex system to have two different testing systems within one state. Mr. Wilhoit said it is difficult, but it is changing. He said the new secretary of the USDOE is going to allow more flexibility, but he does not know the specifics yet. He said this was not the mindset initially when NCLB was written.

Representative Moberly asked Mr. Wilhoit about litigation in other states concerning NCLB. Mr. Wilhoit said Connecticut is suing the USDOE over the issue, and the Utah state legislature just passed some legislation that would cause state law to supersede the federal law. He said there are also a number of lawsuits over the federal law not providing the financial support for the changes that need to take place for a state to come into compliance with NCLB.

Representative Moberly asked Mr. Wilhoit if 17 schools met their goals under CATS, but did not meet NCLB goals. Mr. Wilhoit said it is 67 schools. Representative Moberly asked how many schools met their goals under NCLB, but not CATS, and Mr. Wilhoit said the correct number was 301. Representative Moberly said this is a striking difference in that more schools are meeting AYP, but not meeting their goals under CATS. Mr. Wilhoit said this can vary with the school and the student populations within a school.

Representative Moberly asked about the frequency of transfer that has occurred in the state because of NCLB. Mr. Wilhoit said the largest movement was in Fayette County. He said the harsh reality is that in Kentucky, outside the two urban areas and possibly Northern Kentucky, it is a very difficult proposition for a parent to opt for a choice of moving schools in rural areas.

Representative Moberly asked Dr. Catterall and Dr. Poggio to come forward to discuss the validity and reliability of CATS and the KCCT test. Dr. Catterall said the NCLB trajectory for the growth of test scores did not require as much as Kentucky's for 2004. Schools did not have to grow as fast; in fact, the trajectory plateaued for a year. Dr. Catterall said this could explain in part why some schools met federal requirements, but did not meet requirements for CATS.

Dr. Catterall referred to his memorandum to Ms. Seiler dated February 22, 2005, which contains the requested responses to OEA's questions regarding Senate Joint Resolution 156. He said NCLB requires states to establish comprehensive standards reflective of the established curriculum goals in the state, and to have an assessment that is reflective of the depth of those standards in English/language arts and mathematics (and science in the near future). He said this means in Kentucky that the assessments have to map onto the core content for assessment, which derives from the program of studies and the academic expectations. Dr. Catterall said the system has been designed to align itself and to cover adequately the core content, and all of the content areas that are assessed. This particular requirement affects the kinds of scores the schools get at the student level. For instance, in order to cover the mathematics curriculum with sufficient depth and breadth, the Kentucky CATS or KCCT mathematics tests are designed so that multiple forms, each with a different set of questions, together cover the entire math curriculum at fourth grade. He said one individual student's test will address a subset of the mathematics curriculum; it is decently broad, but it does not cover the entire curriculum, and is not independent of the other forms. This model of testing is called matrix sampling or matrix administration of a test.
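
The matrix sampling idea described above can be illustrated with a short Python sketch. It is a toy model under stated assumptions, not the actual KCCT form design: the function name, the number of forms, and the number of shared items are hypothetical. It shows only the two properties mentioned in the discussion, that each form covers a subset of the item pool and that forms overlap on some common items.

```python
def build_forms(items, n_forms, n_common):
    """Split an item pool into overlapping test forms.

    The first `n_common` items appear on every form (the overlap);
    the remaining items are dealt out across the forms in turn, so
    each student sees only part of the pool.
    """
    common = items[:n_common]
    rest = items[n_common:]
    forms = []
    for f in range(n_forms):
        unique = rest[f::n_forms]   # every n_forms-th remaining item
        forms.append(common + unique)
    return forms


# Hypothetical pool of 20 items, 4 forms, 4 items shared by all forms.
pool = list(range(20))
forms = build_forms(pool, n_forms=4, n_common=4)

# Each form holds 8 of the 20 items; together the forms cover the pool.
covered = set()
for form in forms:
    covered.update(form)
print(len(forms[0]), covered == set(pool))
```

In this toy example each student answers 8 of 20 items, so the school's combined results reflect the whole pool while any one student's score reflects only part of it.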

Dr. Catterall said individual student scores have different meanings. The meaning is not uniform from student to student. They are not entirely independent, because some items overlap, but individual student scores are a different matter. Dr. Catterall emphasized that this fact impacts everything outlined in the February 22, 2005 memorandum. He said the scores for individual students on the various CATS tests are some approximation of how kids are doing. The scores are not completely independent of how students are doing in the fourth grade, as to whether a student is assigned a distinguished or a proficient level individually. He said there is room for error if a student is on the edge, or if there is a classification error.

Dr. Catterall said NTAPAA members have worked and developed a system since 1999 without great concern about the individual student scores, but more about school level scores. He realizes that Kentucky has invested a large amount of money into this current system, and would like to get the most bang for its buck. The system could work toward obtaining individual student scores that could be used for various things, but there are obstacles to overcome. Dr. Catterall said the biggest hurdle is cost and time. In order to make it a test for individual students, it would have to contain more multiple choice and open-response items. This would increase testing time and the cost of constructing the test, and the major cost would be scoring an expanded test.

Dr. Catterall said NTAPAA has studied the questions posed by OEA for about a year. These are not new issues, and NTAPAA has discussed them with the Student Curriculum Assessment and Accountability Council (SCAAC), KDE staff, and among the NTAPAA members. He said in general, most NTAPAA members are in agreement with the responses outlined in the February 22, 2005 memorandum to Ms. Seiler, after much lively discussion about parts of it. Dr. Catterall clarified that this piece is a specific request of NTAPAA that is contributing to part of OEA's overall study. He said NTAPAA itself is taking a much broader look at some historical issues of validity of the system, and is assessing the various new directions being proposed for Kentucky that will be discussed at the EAARS meeting on May 20, 2005.

Dr. Catterall said the answer to the first question in the memorandum about whether the component scores of CATS assessments are valid, reliable, and adequate indicators of individual student knowledge of the core content is no. Dr. Poggio said the tests at the school level are quite strong, but the sampling at the student level is so narrow, and often so diverse, that too many students are seeing entirely different tests. He said he would hope that parents would ask for more information from the teachers about their child before using the test score as the individual basis for how their child is performing in school. He said these individual student data reports would be most useful when blended with other information about the child when discussing if student scores are reliable and valid. He said that while Kentucky tests limit test errors as much as humanly possible, the representativeness, the content, and the program as it has existed with the purpose primarily being defined as school accountability, do not offer strong evidence of the absolute validity of the individual student scores.

Representative Moberly asked if it was fair to say that their answer to the first question on the memorandum is no, except when taken in context with other indicators. Dr. Catterall said that was definitely a through-line message of the document. Dr. Poggio said they are moving on the side of being cautious because it is the most justified decision with respect to the interpretation of the individual student scores.

Representative Draud clarified that they said testing could be done for individual student scores for reliability and validity, except that a larger sampling was needed, as well as additional time and cost. Dr. Catterall and Dr. Poggio said yes, but a much larger sampling is needed of the content on the individual test for the student.

Senator Westwood asked them if they were comfortable with the amount of information that the parents receive from the individual student report card. Dr. Poggio said Senator Westwood's question gets right to the root of the issue. Dr. Poggio said all parents should visit the school and meet with the teacher after receiving the individual student report card. A teacher can always add insight into a student's true ability that may not be reflected in a grade on a report card. Dr. Catterall said every score on a report card is accompanied by an error band; because of sampling error or error of measurement, the student's real or true score could be anywhere on this band. This information satisfies Dr. Poggio and himself, but he does not know how satisfying it is to an average parent who is reading the report card and trying to understand what errors of measurement mean.

Dr. Catterall said that on-demand writing and the writing portfolios are components of the exam that do not have the combination of multiple-choice and open response questions, nor do they have differences across each student's exam, and so these items are placed within a different category. He said the main concern has been the use of a single writing score to determine a student's overall and forever after writing ability.

Dr. Poggio said the CATS program that exists today in Kentucky was set out to achieve a particular purpose, and now the state is trying to retrofit it to meet other purposes. He said with regard to the purpose that was originally embraced, beginning with KERA to KIRIS to CATS, there has been an unwavering disposition to address school accountability. Dr. Poggio said Kentucky may need to go back and redefine its purpose. He said with respect to the portfolio as it relates to the alternate assessment, and as it relates to the writing, the purpose there became more defined about changing instruction, and ensuring that students were writing every day. In order to achieve that purpose, standardization to a certain extent was allowed to vary. He said the portfolio system is not a system that is broken, but if its purpose is to shift more towards a student accountability system, it needs to achieve greater standardization than it is offering.

Representative Draud asked if teacher judgment is more reliable than testing. Dr. Poggio said he did not disagree with that, but that a system of multiple measurements works best. He said teacher judgment is as important as the test data, perhaps more so. Representative Draud said Kentucky has taken a multiple choice approach to assessment. He said it is impossible to have a system in place where there is not some judgment of error, and some room for error. Dr. Poggio said he does not really agree with that in his personal, professional opinion. He said Kentucky has done an outstanding job with the current system. It would have been so easy for Kentucky to abandon the performance assessment, but it has not. He thinks this is a rather healthy system in Kentucky. Representative Draud said he did not think Kentucky had gone afield, and that the multiple approach Kentucky has taken in regard to assessment has been a sound one. Dr. Catterall and Dr. Poggio agreed.

Dr. Catterall said testing by any definition is an exercise in sampling so there is going to be error of some description, and one of his goals in his profession is to keep error within tolerable, useful, and acceptable limits.

Representative Draud said this is the comment he was wanting to hear. As a policymaker, he found it very difficult to conceptualize creating a testing system where there is not some room for error, and teacher judgment needs to be an important part of the whole process, even though there is room for error there as well. Dr. Poggio said they have been very congratulatory of the CATS system in regard to sampling. He said it goes into the arts and humanities, vocational living, and life skills areas, and should not revert back to the narrow view of just assessing reading and mathematics. Representative Draud said the current system, however, does not lend itself to individual kinds of scores and accountability. Dr. Poggio said it was not built to do that, but it is close if it includes teacher judgment and information associated with the parent.

Senator Winters said that as a university president he would like to have some data in the overall testing scheme that will allow him to compare students individually. He would like for Kentucky to make some tweaks to the system that would provide this type of data. Can Kentucky have a system in one package that provides longitudinal data about student success, along with other needed information, and come out a better state as a result of it? He does not understand the reasoning of adding more open-response questions to go to a student specific evaluation system, and also asked if it was possible to evaluate higher level thinking with a multiple-choice test.

Dr. Catterall said he heard Senator Winters asking if Kentucky could move toward a system that would allow KCCT test scores on a high school transcript so that college admission officers would have more information to go by in accepting students into college. He said NTAPAA is diverse, and members would be divided upon their responses on whether to include these test scores on the high school transcript. He said one reason not to include this information on the transcript is because the colleges and universities are the only likely entity who would look at this test score. Dr. Catterall said there is no way a CATS score in writing is going to drive a college admission officer one way or the other, even if the score is placed right there on the transcript. He said the CATS score taken into consideration with other information is useful.

Dr. Poggio said if the sense of the legislature was to include test scores on the high school transcript, then NTAPAA would need to hear this message as the tests would need to be redesigned somewhat. He said the test would have to be broadened to be more uniform, and to have a more narrow focus. Dr. Poggio said he would be hesitant to do away with the performance test altogether. He said it is a problem of how you get to the end result and what the structure should look like.

Senator Kelly said that Dr. Poggio had used the term performance to describe open-response questions. He thought there was a requirement when KERA was adopted that the questions on the CATS test be performance-based, and that it test critical thinking skills. He thought this is what drove the emphasis on open-response questions because of the availability and the technology of existing tests. He thought it was determined that open-response questions did not mean performance-based questioning, it could just be a fact recall question. He thought performance applied to the arts area only.

Dr. Poggio said Senator Kelly's memory is absolutely on target, but what he recalls about the early programs in KERA was that the first edition was entirely performance questions. He said performance assessment today means any opportunity extended to a student to pick up a pen and create a response.

Senator Kelly said performance could be incorporated into a multiple-choice question if it was designed right. Dr. Poggio agreed.

Senator Kelly paraphrased a statement made by Dr. Robert Linn, NTAPAA, in a KBE meeting that said it is time to think about norm-referenced tests that are augmented with open-response questions. He said we are also getting close to having to send out the RFP, which could govern what Kentucky does for the next two to four years. He said a norm-referenced test augmented with open-response questions could provide the same form with sufficient questions asked in order to obtain student validity, teachers could receive information quicker, and some of the subjective and logistical problems could also be solved.

Dr. Catterall said he wrote a memorandum that explained that Dr. Linn was talking about a range of possibilities for designing a system, and he has two problems of degree (not absolute problems) with this idea. He said it is a broad alignment question, with a piece of it being the alignment of the norm-referenced test (NRT) with Kentucky's Core Content for Assessment. Dr. Catterall said when the NRT was added to CATS and KCCT as a result of the law in 1999, NTAPAA studied three prominent commercial off-the-shelf assessment tests and discovered some alignment at the basic skills end of the spectrum, and little alignment at the higher-order thinking skills end of the spectrum. The conclusion was that if the law requires an NRT, it can be used in a limited way for the purpose of percentile rankings, but it should not dominate the system. Dr. Catterall said his understanding is that a system that begins with an NRT and is built up with augmentations is being suggested for limited grades now, to meet the requirements of NCLB as a supplement to the system, so this is not unreasonable to think about. The magnitude of the augmentation that would have to take place to convert this to the entire system, to get the coverage that Kentucky is seeking, would be phenomenal.

Dr. Poggio said Senator Kelly's suggestion merits serious attention and evaluation, but based upon the 1999-2000 work, the question was not just that the shelf test did not reflect the core content, but that the catalog test cannot be changed, and 40-50 percent of the content did not relate to Kentucky's core content curriculum. He said this could advance or hold back a student's score as a function of items that no one has taught him or her. Dr. Poggio said with the accountability framework that is currently in place, the off-grade NRT augmented test only weighs in at five percent of the evaluative accountability.

Senator Kelly asked why Kentucky's core content is covered by so few NRT tests. Are we teaching our students content that should not be taught in our schools? Dr. Poggio said the commercial NRT tests are basing their questions and content on New York City, Cleveland, and Dallas. It is not the norm for publishers of catalog tests to customize to states. Senator Kelly asked if they were preparing their content to match the high volume purchasers. Dr. Poggio agreed.

Dr. Poggio said Kentucky has already done the alignment work. He said Kentucky has looked at a catalog test and Kentucky's core content, and said these things only aligned at about fifty or sixty percent. They measure better the basic skills, and do not do as well on the challenging skills.

Senator Kelly said Dr. Poggio has pointed out the tremendous difficulty of using assessment for accountability in education because there is such a vast body of knowledge that can be taught and assessed. He said obtaining consensus on what is important and what should be assessed and taught is extremely difficult. He said the high stakes nature of the testing is that once an agreement is made, that is all that will be taught. Senator Kelly said physics, chemistry, advanced algebra, and foreign languages are not tested, and wondered if they should be. He thinks the test Kentucky has developed has a very high quality for the purpose it was intended, but he is not sure that the purpose does not need to be reevaluated. He said he is not talking about abandonment, just some adjustments. It is very important that Kentucky maintains flexibility and utilizes NTAPAA's experience and knowledge to make sure that the state is developing and requiring an accountability system that is in fact providing the type of information that teachers, higher education officials, parents, and policymakers need.

Dr. Poggio said that when Kentucky passed KERA, it made a statement that it would build tests that are challenging, hold schools accountable, and require teaching according to the core content areas. Dr. Poggio said today, through NCLB, this is what is going on. His reflection is that no one is interested in a national curriculum. CATS, as it exists today, defines a core content that represents itself as the suggested curriculum, and not the mandated curriculum. He said he assumes that local control is an issue at the school district level. If Kentucky were to relegate the decision of the assessment and how it was tested to a shelf-based product, it would be forfeiting the nature of that curriculum. It is a heavy question - where do you go?

Senator Kelly said the technology of shelf-based testing has evolved since Dr. Poggio conducted his study in 1999. Dr. Poggio said the advances in the testing profession have not kept pace with the questions the EAARS members have posed in the meeting. He said advances have been made in working the numbers, and making the score more accurate, while ensuring the dependability of the measurement of an individual or for a group. Dr. Poggio said the central question of validity, which is linking what is tested to Kentucky's curriculum, other than the principle of alignment, is no further along today than it was in 1988. He said people have figured out how to use the computer to squeeze numbers and deliver tests on-line, but the central question of what to teach and how to test it has not seen great advances. Senator Kelly said this is an issue in itself because it has certainly not been due to lack of effort, research, or study. Dr. Poggio said it comes back to local control and what people want to test.

Dr. Catterall said one would think that available technologies would revolutionize these types of testing in administration or in customization. He said a big problem with that is that the large, commercial, nationally used tests have to hold onto their norms over seven years. A test could be customized and sent to Kentucky that would be great as an assessment aligned with the curriculum, but a national percentile could not be obtained from it because no one else is taking that group of assessments.

Senator Kelly said there is a huge demand that we obtain this comparison ability. He said the law, along with the need for longitudinal data and the desire for teachers to receive meaningful feedback early so they can use the data, are demands and concerns that militate towards having a test that is normed.

Dr. Catterall said KERA was passed at a time when there was a tremendous backlash against standardized testing. KERA was the first time in the nation's history that a state had the opportunity to go after all the big ideas in assessment. He said Kentucky wanted true performance assessments, and it was not long after building that into a statewide school accountability system that big difficulties arose in equating forms of tests, and equating tests from year to year. He said the technical difficulties arose because there were not any anchors across forms, and multiple-choice questions slipped in as a way to broaden the types of assessing and include more topics. It also gave technicians some data in assessing the difficulties of tests from year to year.

Representative Rasche asked what is learned from a NRT in terms of measuring progress. He said every time the state needs to re-norm, it is at the mercy of everyone else's progress. Dr. Catterall said this is why it only weighs in at five percent in the accountability formula; it is very weak because populations change. Representative Rasche said ultimately there will be an average at the 50th percentile. He said it is based upon a previous norm, and he said unless teachers are teaching to the content of the test, there is no way a student can max out on the test.

Dr. Poggio said that progress can be monitored on the NRT by looking at the cohort at a given grade in successive years. The issue always goes back to the question of validity and alignment. He said NRT tests take more time for students and one has to consider how the NRT will affect the students whose native language is not English. He said NCLB clearly states the provision that states cannot take catalog tests and use them to meet the mandate. States are to establish their own curriculum standards, which are intended to be challenging, and then states must have an assessment that aligns with those standards.

Representative Rasche gave an example of taking a NRT test in a high school science class. He said he scored a 99 on the pre-test and also on the post-test. He learned a vast amount of material in between that was not reflected on any test score. He said this information essentially did not tell anyone anything, except that he knew more than the rest of the students in the class.

Senator Worley said Dr. Poggio and Dr. Catterall have provided members with the strengths and weaknesses of both the open-response question tests and a nationally normed, standardized test. He said he assumes there will be much more discussion on the pros and cons of each test as Kentucky tries to strengthen the current assessment test. He asked a question as a parent rather than as a legislator. Has Kentucky's testing mission evolved to assess the schools instead of the students? Is there more priority on holding schools accountable, rather than students?

Dr. Poggio said his response is colored by his profession. He would not want his child assessed and held accountable to a single paper and pencil test. It is his professional opinion that holding a child accountable for performance on a single measure is ill advised, and not recommended. He said his professional standards say that using an educational assessment to make decisions about children is a serious decision that should not be automatically assumed as credible. He said all states in the country right now are making choices, and some states are doing this and some are not.

Dr. Catterall asked if Senator Worley was asking about the evolution of moving towards holding schools accountable. Senator Worley said this was the core of his question. Are we teaching to the test so the schools can have high accountability, or are we creating and teaching students to score well on tests that adequately assess their quality of education? Dr. Catterall said this is an empirical question, and it could be debated throughout schools in Kentucky. He said the Kentucky test is very broad because it covers many topics and content areas in the curriculum. It is broad because a student contemplating taking a test knows they face multiple-choice and open-response questions. He said the student has to write something on-demand, and assemble a portfolio of writing. He said it would be difficult as a teacher in Kentucky to teach to this test because it is so vast with so many different components. He said he does not see anything insidious in terms of teacher or system behavior that is being driven by the presence of this very broad test.

Representative Moberly asked if a parent in Kentucky should be satisfied with the emphasis placed on the test and how it drives the curriculum. Dr. Poggio said there will be abuses that need to be contained, but he would be very satisfied with the system as it exists if he were a parent in Kentucky.

Senator Worley asked if Dr. Poggio has data where classroom teachers were interviewed as to how they teach, or has he spoken to superintendents to see what kind of emphasis they place on the assessment? Dr. Poggio said he has the data on a couple of independent studies conducted in the state, and other data that he relies on are studies performed by the KDE and its contractors reviewed by the NTAPAA panel. He said studies of impact have shown teachers becoming more clear about the focus of core curriculum, and teachers have a chance to respond through state surveys and interviews. Teachers communicate to the administrators if they need assistance and do not understand how to teach items that are on the test, and he believes the system is working the way it was intended.

Senator Worley said he is not for doing away with the CATS test, but is for making adjustments within the test as they need to occur. He said before Kentucky can make adjustments that would be effective on education across the Commonwealth, problems within the CATS test need to be identified. He said at the core of the issue is determining whether the emphasis of the CATS test is on the assessment of the school, or on the assessment of students. If the emphasis is placed on assessing the school, the legislature and interested agencies need to acknowledge this, and change this. If this is the mission, it is misguided.

Dr. Poggio said Senator Worley is talking about motivation of students at the high school level. He cautioned Kentucky against moving towards student accountability to solve motivation problems. He said schools can say they are being held accountable for results, and their students are not trying to do well on the test because they feel it does not count for anything. He said there is no doubt some truth in this, but studies conducted by KDE revealed that 82 - 85 percent of high school students say they tried their hardest on the test. He said another seven percent of the students say they did not try hard because the content was not covered in their curriculum so they did not know how to do it. Some other students say they already knew the material, and therefore did not have to work hard on it. This leaves a pool of about eight percent of the high school students who seemingly have a motivation problem. He suggested that before Kentucky embraces high-stakes student accountability to solve the problems of about eight percent of the students, guidance counselors need to talk to their colleagues about helping to motivate their students.

Senator Worley said he believes Dr. Poggio has completely misunderstood the motivation in high school students as it relates to the CATS test. He said not one college in Kentucky uses a CATS test score for admission into college. The students realize the CATS test score means nothing to their grade in the classroom, and nothing to their admission into a college; it only means something about the evaluation of the school. It is certainly not going to be changed by guidance counselors getting seven percent of the student population to change their own personal conduct.

Senator Westwood said he believes these philosophical debates are healthy. He said there is another corollary to this discussion. He said he is concerned that the material not being tested is not being taught. This is a concern for him as a parent and a former teacher. He liked to believe that the things he taught in the classroom, whether they were tested or not, were going to add to that child's dimension to be a better citizen and a better person. He is fearful that the discussion about only teaching the students what is being tested is robbing Kentucky's students of a rich educational experience.

Dr. Catterall said the breadth of the test, which is helped by the ability to have different tests for different students in a class to broaden the entire coverage of the test, makes it difficult for teachers to teach to the test entirely.

Senator Westwood said it reverts back into the same argument that brought Kentucky into KERA in the first place, which was that things that needed to be taught were not being taught, so a test was created to ensure those items are covered. Why can Kentucky not use a NRT test, which is likely to cover all the material anyway? He gave an example of the ACT test, which is given in the states of Illinois and Colorado to all eleventh grade students, including special education students, who are improving even on the ACT test. This test gives a pretty good indication if the child is going to succeed in college. He would be satisfied as a parent if he received a report card that said his child had scored a 30 on the ACT. ACT scores would tell parents three things: 1) How well is the school preparing my child for college; 2) How is my child performing in different subject areas; and 3) Should money be spent on sending my child to college? He is having trouble with the entire concept of teachers only teaching to the test. Is Kentucky leaving out some significant curriculum, and if not, does Kentucky need this form of a test in the first place?

Dr. Poggio said that if the curriculum in the ACT aligned with 85 - 90 percent of Kentucky's expectations, he would recommend using it. He does not want to be perceived as being opposed to the NRT or a shelf test, but he wants to make sure the alignment is there with the curriculum that Kentucky wants offered to the students. He said the studies that have been conducted address the question: when a state defines its core content and identifies uniquely what it wants, does it fit with a standardized test? He said no, not well enough to justify it as a means of school or student accountability. Tests like the SAT used to be called aptitude tests. He said they have changed the name, but not the test. He said an aptitude test is not an achievement test. Aptitude is a capacity in a specific area, the likelihood of someone acquiring information in a defined curriculum area, while achievement is what someone has mastered under the presumption that there has been instruction. Dr. Poggio summarized by saying that if Kentucky can find a catalog test that aligns with its content, then it should use it. Or, Kentucky can embrace the catalog test's content, and relegate the decision of the core content to the shelf product.

Dr. Catterall said the tone of the discussion on NTAPAA has shifted over time. He said the panel has monitored development, and worked on issues and problems that have arisen over time. He said the last two meetings have centered around the word "purpose." He said Kentucky needs to determine the purpose of its testing system. Is it that high school students need to be motivated? Is the purpose to put high school test scores on transcripts? Is it that schools or students be held accountable? He said a system has to start with its purpose, and let its design meet the purpose. He said purpose should be Kentucky's guideword as it moves forward in its work.

Representative Moberly said Kentucky's general purpose has not changed since the passage of KERA in 1990. He said the state has run into problems with implementation, or what Dr. Poggio referred to as abuses. He said as the curriculum is aligned with the core content, the everyday classroom instruction should facilitate a good score on the test; however, it was not anticipated that teachers would take three weeks and teach the items on the test prior to testing week. He said it was never intended for the writing portfolio to take a lot of extra time; it was intended to be based generally on the instruction during the day and in the class, and samples of what the student produced during class time. He said it suddenly turned into this issue that students were having to spend hours and hours working on the writing portfolio. He said this was not the intent, but the implementation or the abuses of it are what cause the problem. He said student accountability is extremely important, and is a problem area that needs to be addressed.

Dr. Poggio discussed KCCT writing scores. He said some tuning of the writing is probably in order. There are issues with the reliability of the writing portfolios, and KDE is well aware of them. He said discussions have taken place about reorienting the writing piece to more of an on-demand approach. He explained that on-demand means two or three writing samples of specified duration to get the students' best writing, and this could be an adjustment worth giving consideration.

Representative Moberly asked about teacher advancement programs, and whether CATS scores could be used as one of several criteria for measuring the value a teacher adds to the students in a class when considering an increase in the teacher's pay.

Dr. Catterall said his first instinct is yes, the CATS scores could be used with an array of other information to talk about how a teacher is doing. He said the negative is that the teacher has only had the students for one year, and some of what the students are producing on the CATS test is the result of the work of other teachers. He said another big component of what the student is capable of doing in class is his or her parents. He said the CATS could have a small weighting in a package for assessing teachers.

Dr. Poggio said the issue of using students' CATS test scores in evaluating teachers needs additional study. He said it needs more sophisticated, statistical modeling. He said it could be modeled in the state for a couple of years, especially in conjunction with a program that is monitoring a child over time, and building longitudinal databases.

Senator Kelly said he would like to see the dialogue between NTAPAA and the legislature improved. He feels that NTAPAA helped the KDE at a critical time, but at this particular juncture, he is very interested in continuing the type of dialogue that took place in the meeting. He mentioned his concern about creating a single index that becomes a measure of the success of a school. He said Kentucky's purpose is to have instruction in Kentucky schools at a high level. He wants students exposed to an education that has high standards, and wants to make sure that no one is overlooked. He said Kentucky is very fortunate to have had 14 years of intensive research and experience, and needs to make sure the state is not pursuing something that exceeds the technical capability of assessment to accomplish, and that Kentucky is taking advantage of everywhere that we ought to be testing, and testing in the most efficient manner.

Dr. Catterall said it is time to look at the stability and reliability of the school-level scores. He said once the transition is made from the problems with the individual-level student scores and sampling issues, these problems do not replicate themselves at the school level because most of the sampling issues simply wash out. Senator Kelly said it opens up a whole new host of issues at that point. Dr. Catterall said yes, there are substantive issues such as whether the language test is on the mark.

Dr. Poggio said NTAPAA has said that they believe Kentucky's school accountability system is rock solid. He said the comfort in coming to that conclusion is the way the system is designed, which places reward and evaluation on the basis of progress, not standing. He also said he would like to have further meetings with the legislators in a less formal atmosphere to discuss issues with the entire NTAPAA panel.

Dr. Catterall said NTAPAA has changed its operating procedures for the logistics of how the panel interacts with the Capitol. He works with the Legislative Research Commission to build an agenda for quarterly meetings, although he can still request needed information from the KDE.

Representative Moberly said this has been one of the best discussions that he can remember through the years and thanked the experts for coming to the meeting and looked forward to their continuing service in the future. He asked Commissioner Wilhoit to come to the table and give his response to the study report.

Commissioner Wilhoit said he provided the OEA with a 10-page document that included his response to the study report. He characterized the comments first as factual, based on changes that may have occurred since the drafting. Secondly, there would be comments that KDE would submit for expansion or further understanding, but KDE generally feels that OEA has done an excellent job of capturing the spirit of both the legislation and the NCLB. His most substantive comment was around NCLB, and most of the points that he addressed as concerns have been covered in the meeting today. He has completed his analysis and sent it to Ms. Seiler if members need a copy.

Representative Moberly said the next EAARS meeting will be on May 20, 2005. He said members will continue to receive more sections of the report, and begin to discuss proposed changes to the CATS system. He said Dr. Catterall and Dr. Poggio will be back to offer their assistance, and to provide a written summary of the assessment and accountability issues brought to the panel by the KBE.

With no further business before the committee, the meeting adjourned at 12:55 p.m.