December 5, 2021
The 1st keynote speech: 9:30am-10:20am (40-min presentation and 10-min Q & A)
Speaker: Prof. Trevor BOND (AUSTRALIA)
Title: From estimation to consideration: The role of Rasch measurement in promoting understanding and validity
Description: Many come to Rasch measurement when they are put in the position of developing their own instruments for measurement, or of trying to determine whether an instrument they have selected works in the way it was intended. The first step in constructing measures is to have a deep understanding of the variable under investigation. To construct an instrument that measures just one variable at a time, the items generated must be both as similar as possible and as different as possible; Rasch measurement helps us untangle that apparent conundrum. A distinctive feature of Rasch analysis is that items and persons are analysed together, so that their performances may be examined independently. This presentation uses empirical evidence from a recent examination of the ability of second-language English learners to write English-language essays. The results show how the Many-Facets Rasch Model can be used to untangle the effects not only of rater and examinee, but also of the interactions among topic, grade, and a common marking rubric.
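For orientation, the Many-Facets Rasch Model mentioned in this abstract is commonly written, following Linacre's formulation, as below; the notation is a generic sketch rather than the presenter's own, and the choice of facets (examinee, topic, rater, rating category) is assumed from the abstract:

$$\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k,$$

where $B_n$ is the ability of examinee $n$, $D_i$ the difficulty of topic $i$, $C_j$ the severity of rater $j$, and $F_k$ the threshold between categories $k-1$ and $k$ of the common marking rubric. Because each facet enters the model additively, its effect can be estimated and examined separately from the others, which is what allows rater, examinee, and topic effects to be untangled.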
Break: 10:20am-10:40am (20 mins)
The 2nd keynote speech: 10:40am-11:30am (40-min presentation and 10-min Q & A)
Speaker: Prof. Kit Tai HAU (Hong Kong, CHINA)
Title: Large Scale International Educational Assessment: Uses, Limitations and Counter-Intuitive Findings
Description: In the last two decades, policy makers, researchers, and teachers have paid great attention to large-scale international student assessment programs such as PISA. Drawing on results from these programs, we will discuss some counter-intuitive findings, limitations of the research design, and issues of scale comparability.
December 6, 2021
The 1st keynote speech: 8:30am-9:20am (40-min presentation and 10-min Q & A)
Speaker: Prof. Ricardo PRIMI (BRAZIL)
Title: Response styles as Person Differential Functioning: methodological approaches to solve Person DIF
Description: Likert-type self-report scales are frequently used in large-scale educational assessment of social-emotional skills. Such scales rely on the assumption that their items elicit information only about the trait they are supposed to measure. In children especially, the response style of acquiescence is an important source of systematic error. Balanced scales, which include equal numbers of positively and negatively keyed items, have been proposed as a way to control for acquiescence, but the reasons why this design feature works have been underexplored from the perspective of modern psychometric models. Three methods of controlling for acquiescence are compared: a classical method that partials out the person's mean; an item response theory method that measures differential person functioning (DPF); and a multidimensional item response theory (MIRT) model with a random intercept. Comparative analyses are conducted on simulated ratings and on self-ratings provided by 40,649 students (aged 11–18) on a fully balanced 30-item scale assessing conscientious self-management. Acquiescence bias was found to be explained as DPF.
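As background, the random-intercept MIRT approach named in this abstract is usually credited to Maydeu-Olivares and Coffman's random intercept item factor analysis; a generic statement of the model, with notation assumed here rather than taken from the talk, is:

$$x_{ij} = \mu_j + \lambda_j \theta_i + \alpha_i + \varepsilon_{ij},$$

where $x_{ij}$ is person $i$'s rating of item $j$, $\theta_i$ is the substantive trait with keyed loadings $\lambda_j$ (positive for positively keyed items, negative for negatively keyed ones), and $\alpha_i$ is a person-specific random intercept whose loading is fixed to 1 for every item. On a fully balanced scale the $\lambda_j$ approximately cancel across items, so $\alpha_i$ absorbs the individual's tendency to agree regardless of item content, that is, acquiescence.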
Break: 9:20am-9:30am (10 mins)
The 2nd keynote speech: 9:30am-10:20am (40-min presentation and 10-min Q & A)
Speaker: Prof. Kelly BRADLEY (AMERICA)
Title: (Soon)
Description: (Soon)
Break: 10:20am-10:40am (20 mins)
The 3rd keynote speech: 10:40am-11:30am (40-min presentation and 10-min Q & A)
Speaker: Prof. Steven STEMLER (AMERICA)
Title: Better Measurement and Fewer Parameters! The True Value of Rasch over IRT
Description: Proponents of Item Response Theory models have sometimes described the Rasch model as "the one-parameter IRT model". In doing so, however, they miss both the point and the power of the Rasch model. Only the Rasch model can guarantee that the scale being constructed has the same meaning for all test takers, and this provides a powerful advantage over two- and three-parameter IRT models. By modeling a second parameter (item discrimination) and allowing item characteristic curves to cross, as those IRT models do, more information is incorporated into person-ability and item-difficulty estimates, but at an attendant loss in the power to interpret the test scale so that it means the same thing for all test takers. Thus, any approach to assessment that aims to report what all test takers know and can do at each level of ability, or that hopes to use adaptive-testing algorithms to select appropriate items to administer, must necessarily rely on the Rasch model rather than on two- or three-parameter IRT models.
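To make the contrast concrete, the standard dichotomous forms of the models being compared are shown below in common textbook notation (not drawn from the talk itself):

$$P_{\text{Rasch}}(X=1 \mid \theta) = \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}}, \qquad P_{\text{2PL}} = \frac{1}{1 + e^{-a_i(\theta - b_i)}}, \qquad P_{\text{3PL}} = c_i + \frac{1 - c_i}{1 + e^{-a_i(\theta - b_i)}}.$$

Because every item in the Rasch model shares the same discrimination (effectively $a_i = 1$), item characteristic curves are parallel and never cross: items keep the same difficulty ordering at every level of $\theta$, which is what licenses the claim that the scale means the same thing for all test takers. Once $a_i$ is free to vary (2PL) or a guessing floor $c_i$ is added (3PL), curves can cross, and the relative difficulty of two items depends on the ability level at which it is evaluated.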