MATERIALS AND METHODS: This review examines selected issues in standard setting, based on articles published by educational assessment researchers.
RESULTS: A standard, or cut-off score, should determine whether an examinee has attained the requirements to be certified as competent. There is no perfect method for determining the cut score on a test, and no single method is agreed upon as the best. Setting a standard is not an exact science. The legitimacy of a standard is supported when the performance standard is linked to the requirements of practice. Test-curriculum alignment and content validity are important for most educational test validity arguments.
CONCLUSION: A representative percentage of must-know learning objectives in the curriculum may serve as the basis for test items and pass/fail marks. Practice analysis may help identify the must-know areas of the curriculum. A cut score set by this procedure may lend credibility, validity, defensibility and comparability to the standard. Having test items constructed by subject experts and vetted by multidisciplinary faculty members may ensure the reliability of the test as well as of the standard.
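The blueprint-based approach above can be made concrete with a minimal sketch. All numbers here are hypothetical, and the pass rule (requiring mastery of the must-know items) is one simple illustrative reading, not the authors' stated procedure:

```python
# Hypothetical counts of learning objectives in a curriculum blueprint.
objectives = {"must_know": 60, "good_to_know": 30, "nice_to_know": 10}
total = sum(objectives.values())
must_know_share = objectives["must_know"] / total  # proportion of must-know content

TEST_LENGTH = 100  # assumed number of test items
# Test items mirror the curriculum share of must-know objectives.
must_know_items = round(TEST_LENGTH * must_know_share)

# One simple reading: a candidate passes by answering all must-know
# items correctly, so the cut score equals the must-know item count.
cut_score = must_know_items
print(cut_score)  # → 60
```

In practice the cut score would be moderated by judges (e.g. via an Angoff-type exercise), but the sketch shows how the curriculum share of must-know objectives can anchor both item allocation and the pass mark.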
MATERIALS AND METHODS: A survey of the tutors who had used the instrument was conducted to determine whether the assessment instrument, or form, was user-friendly. The 4 competencies assessed, using a 5-point rating scale, were (1) participation and communication skills, (2) cooperation or team-building skills, (3) comprehension or reasoning skills and (4) knowledge or information-gathering skills. Tutors were given a set of criteria guidelines for scoring the students' performance in these 4 competencies. Tutors were not attached to a particular PBL group, but took turns facilitating different groups on different case or problem discussions. Assessment scores for one cohort of undergraduate medical students in their respective PBL groups in Year I (2003/2004) and Year II (2004/2005) were analysed. The consistency of scores was analysed using intraclass correlation.
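The intraclass correlation used to gauge rater consistency can be illustrated with a minimal sketch. The paper does not specify which ICC form was computed; the sketch below assumes ICC(2,1) (two-way random effects, absolute agreement, single rater), and the subjects-by-raters score matrix is entirely hypothetical:

```python
# Minimal sketch of ICC(2,1) for a subjects x raters matrix of scores.
def icc_2_1(ratings):
    """Two-way random effects, absolute agreement, single-rater ICC."""
    n = len(ratings)       # subjects (students)
    k = len(ratings[0])    # raters (tutors)
    grand = sum(map(sum, ratings)) / (n * k)
    row_means = [sum(r) / k for r in ratings]
    col_means = [sum(ratings[i][j] for i in range(n)) / n for j in range(k)]

    ss_total = sum((x - grand) ** 2 for r in ratings for x in r)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)  # between subjects
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)  # between raters
    ss_err = ss_total - ss_rows - ss_cols                   # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)

# Five students rated by three tutors on a 5-point scale (hypothetical data).
scores = [
    [2, 2, 3],
    [4, 4, 4],
    [3, 3, 2],
    [5, 5, 4],
    [1, 2, 1],
]
print(round(icc_2_1(scores), 3))  # → 0.863
```

A value this high would indicate that the tutors rank and score students consistently; markedly "strict" or "indiscriminate" raters pull the coefficient down by inflating the rater and residual variance terms.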
RESULTS: The majority of the tutors surveyed expressed no difficulty in using the instrument and agreed that it helped them assess the students fairly. Analysis of the scores obtained for the above cohort indicated that the different raters were relatively consistent in their assessment of student performance, although a small number consistently showed either "strict" or "indiscriminate" rating practices.
CONCLUSION: The instrument designed for assessing student performance in the PBL tutorial classroom setting is user-friendly and reliable when used judiciously with the criteria guidelines provided.