Friday, May 22, 2009

Evaluation Method: The CIPP Model

Evaluation is methodologically diverse, using both qualitative and quantitative methods, including case studies, survey research, statistical analysis and model building, among others.

Dr. Rosita Santos cited Stufflebeam (1983), who developed a very useful approach to educational evaluation known as the CIPP (Context, Input, Process, Product) approach, although this model has since been expanded to CIPPOI, where the last two letters stand for Outcome and Impact respectively.

The CIPP model systematizes the way to evaluate the different dimensions and aspects of curriculum development and the sum total of student experiences in the educative process. The model requires that stakeholders be involved in the evaluation process. In this approach, the user is asked to go through a series of questions in the context, input, process and product stages. Some questions are listed below:

1. Context
What is the relation of the course to other courses?
Is the time adequate?
What are the critical or important external factors?
Should courses be integrated or separate?
What are the links between the course and research/extension services?
Is there a need for the course?
Is the course relevant to job needs?

2. Inputs
What is the entering ability of students?
What are the learning skills of students?
What is the motivation of students?
What are the living conditions of students?
What is the students' existing knowledge?
Are the aims suitable?
Do the objectives derive from aims?
Are the aims SMART?
Is the course content clearly defined?
Does the content match student abilities?
Is the content relevant to practical problems?
What is the theory/practice balance?
What resources/equipment are available?
What books do teachers have?
What books do the students have?
How strong are the teaching skills of teachers?
What time is available compared with the workload for preparation?
What KSA related to the subject do the teachers have?
How supportive is the classroom environment?
How many students are there?
How many teachers are there?
How is the course organized?
What regulations relate to training?

3. Process
What is the workload of students?
How well/actively do students participate?
Are there any problems related to teaching?
Are there any problems related to learning?
Is there effective two-way communication?
Is knowledge only transferred to students, or do they use and apply it?
Are there any problems which students face in using/applying/analyzing the knowledge and skills?
Are the teaching and learning affected by practical/institutional problems?
What is the level of cooperation/interpersonal relations between teachers/students?
How is discipline maintained?

4. Product
Is there one final exam at the end or several during the course?
Is there any informal assessment?
What is the quality of the assessment (what levels of KSA are assessed)?
What are the students' KSA levels after the course?
How do students use what they have learned?
How was the overall experience for the teachers and for the students?
What are the main lessons learned?
Is there an official report?
Has the teachers' reputation improved or been ruined as a result?

The guide questions are not answered by the teacher alone or by any single individual. Instead, there are many ways in which they can be answered. Some of the more common methods are listed below:

1. discussion with class
2. informal conversation or observation
3. individual student interviews
4. evaluation forms
5. observation in class/session of teacher/trainer by colleagues
6. video-tape of own teaching (micro-teaching)
7. organizational document
8. participant contract
9. performance test
10. questionnaire
11. self-assessment
12. written test

Evaluation Approaches

Evaluation approaches are the various conceptual arrangements made for designing and actually conducting the evaluation process. These approaches are classified below (source: Rosita de Guzman-Santos, 2008):

A. Pseudo-evaluation. These approaches are not acceptable evaluation practice, although the seasoned reader can surely think of a few examples where they have been used.
1. Politically controlled studies. Information obtained through politically controlled studies is released or withheld to meet the special interests of the holder.
2. Public relations studies. Information is used to paint a positive image of an object regardless of the actual situation.

B. Objectivist, elite, quasi-evaluation. These are a highly respected collection of disciplined-inquiry approaches. They are quasi-evaluations because particular studies can legitimately focus only on questions of knowledge without addressing any questions of value. Such studies are, by definition, not evaluations, since they produce only characterizations without appraisals.
1. Experimental research. This is used to determine causal relationships between variables. Its potential problem is that its highly controlled and stylized methodology may not be sufficiently responsive to the dynamically changing needs of most human service programs.
2. Management Information Systems (MIS). These can give detailed information about the dynamic operations of complex programs. However, this information is restricted to readily quantifiable data, usually available at regular intervals.
3. Testing programs. These programs are good at comparing individuals or groups to selected norms in a number of subject areas or to set standards of performance. However, they focus only on testee performance, and they might not adequately sample what is taught or expected.
4. Objectives-based approaches. These relate outcomes to prespecified objectives, allowing judgments to be made about their level of attainment. Unfortunately, they focus only on outcomes, which is too narrow a basis for determining the value of an object.
5. Content analysis. This approach is considered quasi-evaluation when its judgments are based only on knowledge rather than on values, and is thus not true evaluation. On the other hand, when content-analysis judgments are based on values, such studies are evaluations.

C. Objectivist, mass, quasi-evaluation. Accountability is popular with constituents because it is intended to provide an accurate accounting of results that can improve the quality of products and services. However, this approach can quickly turn practitioners and consumers into adversaries when implemented in a heavy-handed fashion.

D. Objectivist, elite, true evaluation. The drawback is that these studies can be corrupted or subverted by the politically motivated actions of the participants.
1. Decision-oriented studies. These are designed to provide a knowledge base for making and defending decisions. They require close collaboration between the evaluator and the decision-maker, which leaves them susceptible to corruption and bias.
2. Policy studies. These provide general guidance and direction on broad issues by identifying and assessing potential costs and benefits of competing policies.

E. Objectivist, mass, true evaluation. Consumer-oriented studies are used to judge the relative merits of goods and services based on generalized needs and values, along with a comprehensive range of effects. However, this approach does not necessarily help practitioners improve their work, and it requires a very good and credible evaluator to do it well.

F. Subjectivist, elite, true evaluation. Accreditation/certification programs are based on self-study and peer review of organizations, programs and personnel. They draw on the insights, experience and expertise of qualified individuals who use established guidelines to determine whether the applicant should be approved to perform specified functions. However, unless performance-based standards are used, the attributes of applicants and the processes they perform are often over-emphasized in relation to measures of outcomes or effects.

G. Subjectivist, mass, true evaluation. These studies help people understand the activities and values involved from a variety of perspectives. However, this responsive approach can lead to low external credibility and a favorable bias toward those who participated in the study.
1. Adversary approach. This focuses on drawing out the pros and cons of controversial issues through quasi-legal proceedings. It helps ensure a balanced presentation of different perspectives on the issues, but it is also likely to discourage later cooperation and heighten animosities between contesting parties if "winners" and "losers" emerge.
2. Client-centered studies. These address the specific concerns and issues of practitioners and other clients of the study in a particular setting.

Educational Evaluation

Evaluation is defined as a systematic, continuous and comprehensive process of determining the growth and progress of the pupil towards objectives or values of the curriculum. It is also a systematic determination of merit, worth, and significance of something or someone. Furthermore, it is used to characterize and appraise subjects of interest in a wide range of human enterprises.

The American Evaluation Association created a set of Guiding Principles for evaluators which can equally apply in the Philippine context:

1. Systematic Inquiry. Evaluation must be based on concrete evidence and data to support the inquiry process.

2. Competence. Evaluators must be people of known competence who are generally acknowledged in the educational field.

3. Integrity/Honesty. Evaluators ensure the honesty and integrity of the entire evaluation process.

4. Respect for People. Evaluators respect the security, dignity and self-worth of the respondents, program participants, clients and other stakeholders with whom they interact.

5. Responsibilities for General and Public Welfare. Evaluators articulate and take into account the diversity of interests and values that may be related to the general and public welfare.

The above-mentioned guiding principles can be used at various levels: at the institutional level (to evaluate learning), at the policy level (to evaluate institutions), and at the international level (to rank and evaluate the performance of various institutions of higher learning). These principles serve as benchmarks for good practice in educational evaluation.

Saturday, May 16, 2009

Authentic Assessment

In 1935, the distinguished educator Ralph Tyler proposed an "enlarged concept of student evaluation," encompassing other approaches besides tests and quizzes. He urged teachers to sample learning by collecting products of their efforts throughout the year. That practice has evolved into what is today termed "authentic assessment," which encompasses a range of approaches including portfolio assessment, journals and logs, products, videotapes of performances, and projects.

Authentic assessments have many potential benefits. Diane Hart, in her excellent introduction to Authentic Assessment: A Handbook for Educators, suggested the following benefits:

1. Students assume an active role in the assessment process. This shift in emphasis may result in reduced test anxiety and enhanced self-esteem.
2. Authentic assessment can be successfully used with students of varying cultural backgrounds, learning styles, and academic ability.
3. Tasks used in authentic assessment are more interesting and reflective of students' daily lives.
4. Ultimately, a more positive attitude toward school and learning may evolve.
5. Authentic assessment promotes a more student-centered approach to teaching.
6. Teachers assume a larger role in the assessment process than through traditional testing programs. This involvement is more likely to assure that the evaluation process reflects course goals and objectives.
7. Authentic assessment provides valuable information to the teacher on student progress as well as the success of instruction.
8. Parents will more readily understand authentic assessments than the abstract percentiles, grade equivalents, and other measures of standardized tests.

Authentic assessments are new to most students. They may be suspicious at first; years of conditioning with paper-pencil tests, searching for the single right answer, are not easily undone. Authentic assessments require a new way of perceiving learning and evaluation. The role of the teacher also changes. Specific assignments or tasks to be evaluated and the assessment criteria need to be clearly identified at the start. It may be best to begin on a small scale. Introduce authentic assessments in one area (for example, on homework assignments) and progress in small steps as students adapt.

Develop a record-keeping system that works for you. Try to keep it simple, allowing students to do as much of the work as feasible.

Types of Authentic Assessment
Performance Assessment
Portfolio Assessment
Self-Assessment

Excerpted from Classroom Teacher's Survival Guide.

Principles of High Quality Assessment

1. Clarity of Learning Targets
Assessment can be made precise, accurate and dependable only if what is to be achieved is clearly stated and feasible. The learning targets, involving knowledge, reasoning, skills, products and effects, need to be stated in behavioral terms that denote something observable in the behavior of the students.
a. Cognitive Targets
Benjamin Bloom (1956) proposed a hierarchy of educational objectives at the cognitive level. These are:
- Knowledge – acquisition of facts, concepts and theories
- Comprehension - understanding, involves cognition or awareness of the interrelationships
- Application – transfer of knowledge from one field of study to another or from one concept to another concept in the same discipline
- Analysis – breaking down of a concept or idea into its components and explaining the concept as a composition of these components
- Synthesis – opposite of analysis, entails putting together the components in order to summarize the concept
- Evaluation and Reasoning – valuing and judgment, or assessing the “worth” of a concept or principle.
b. Skills, Competencies and Abilities Targets
- Skills – specific activities or tasks that a student can proficiently do
- Competencies – cluster of skills
- Abilities – made up of related competencies, categorized as:
i. Cognitive
ii. Affective
iii. Psychomotor
c. Products, Outputs and Project Targets
- tangible and concrete evidence of a student’s ability
- need to clearly specify the level of workmanship of projects
i. expert
ii. skilled
iii. novice
2. Appropriateness of Assessment Methods
a. Written-Response Instruments
- Objective tests – appropriate for assessing the various levels of the hierarchy of educational objectives
- Essays – can test the students’ grasp of the higher level cognitive skills
- Checklists – lists of several characteristics or activities presented to the subjects of a study, who analyze them and place a mark opposite the characteristics that apply.
b. Product Rating Scales
- Used to rate products like book reports, maps, charts, diagrams, notebooks, creative endeavors
- Need to be developed to assess various products over the years
c. Performance Tests - Performance checklist
- Consists of a list of behaviors that make up a certain type of performance
- Used to determine whether or not an individual behaves in a certain way when asked to complete a particular task
d. Oral Questioning – appropriate assessment method when the objectives are to:
- Assess the students’ stock knowledge and/or
- Determine the students’ ability to communicate ideas in coherent verbal sentences.

e. Observation and Self Reports
- Useful supplementary methods when used in conjunction with oral questioning and performance tests

3. Properties of Assessment Methods
a. Validity – appropriateness, correctness, meaningfulness and usefulness of the specific conclusions that a teacher reaches regarding the teaching-learning situation.
- Content validity – content and format of the instrument
i. Students’ adequate experience
ii. Coverage of sufficient material
iii. Reflect the degree of emphasis
- Face validity – outward appearance of the test, the lowest form of test validity
- Criterion-related validity – the test is judged against a specific criterion (see the correlation formula below)
- Construct validity – the test is loaded on a “construct” or factor
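
Criterion-related validity is commonly reported as a validity coefficient: the Pearson correlation between students' test scores and their scores on the criterion measure. A minimal sketch in standard notation (assumed here, not taken from the source):

```latex
% Validity coefficient: Pearson correlation between test scores X_i
% and criterion scores Y_i for n students.
r_{XY} = \frac{\sum_{i=1}^{n}(X_i - \bar{X})(Y_i - \bar{Y})}
              {\sqrt{\sum_{i=1}^{n}(X_i - \bar{X})^{2}}\;\sqrt{\sum_{i=1}^{n}(Y_i - \bar{Y})^{2}}}
```

A value near 1 indicates that the test ranks students much as the criterion measure does.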

b. Reliability – consistency, dependability and stability of test results, which can be estimated by:
- Internal consistency, e.g. the split-half method, calculated using the:
i. Spearman-Brown prophecy formula
ii. Kuder-Richardson formulas (KR-20 and KR-21) (see the formulas below)
- Consistency of test results when the same test is administered at two different time periods, estimated via the:
i. Test-retest method
ii. Correlation between the two sets of test results
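
For reference, the standard forms of these reliability formulas are sketched below, in assumed psychometric notation (not from the source): k is the number of items, p_i the proportion of students answering item i correctly, q_i = 1 - p_i, X-bar the mean total score, and sigma_X^2 the variance of total scores.

```latex
% Spearman-Brown prophecy formula: reliability of the full test,
% given the correlation r_half between the two halves of the test.
r_{\text{full}} = \frac{2\,r_{\text{half}}}{1 + r_{\text{half}}}

% Kuder-Richardson formula 20 (uses item-level difficulty data):
KR_{20} = \frac{k}{k-1}\left(1 - \frac{\sum_{i=1}^{k} p_i q_i}{\sigma_X^2}\right)

% Kuder-Richardson formula 21 (simplification that assumes all
% items are of equal difficulty):
KR_{21} = \frac{k}{k-1}\left(1 - \frac{\bar{X}\,(k - \bar{X})}{k\,\sigma_X^2}\right)
```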

c. Fairness – assessment procedure needs to be fair, which means:
- Students need to know exactly what the learning targets are and what method of assessment will be used
- Assessment has to be viewed as an opportunity to learn rather than an opportunity to weed out poor and slow learners
- Freedom from teacher-stereotyping

d. Practicality and Efficiency
- Teachers should be familiar with the test
- The test should not require too much time to administer
- The test should be implementable

e. Ethics in Assessment – “right and wrong”
- Conforming to the standards of conduct of a given profession or group
- Ethical issues that may be raised
i. Possible harm to the participants.
ii. Confidentiality.
iii. Presence of concealment or deception.
iv. Temptation to assist students.

(Source: Advanced Methods of Educational Assessment by de Guzman)

Final Requirement Due: May 31, 2009

Prepare and submit your Documentation Portfolio (follow the suggested guide in portfolio assessment) for the "Advanced Methods in Educational Assessment" on or before May 30, 2009. Required entries include:

a. Course Syllabus
b. Hand-outs, Outputs
c. Drafts of rubrics presented in the class
d. Corrected /final rubrics presented (prototype)
e. Test Papers
f. Reflection/reaction papers
g. Copy of a certificate or proofs (if any) of any training attended relating to authentic assessment
h. Sample hand-outs from seminar/training on authentic assessment
i. Copy of training session guide (if any) relating to task as Speaker/Trainer in Seminars on Educational Assessment
j. Copy of an abstract relating to a research conducted on authentic assessment either in basic or higher education. (Sample: http://www.amstat.org/publications/jse/v2n1/garfield.html)
k. Other documents relating to learning Advanced Methods in Educational Assessment

Performance-based assessment

http://coe.sdsu.edu/eet/articles/pbassess/index.htm

STANDARDIZED TESTS, the cornerstone of public school assessment, while inexpensive, efficient to administer, and easy to report, nonetheless give an incomplete picture of student achievement. While effective at measuring content knowledge, standardized tests do not measure students' skills or ability to perform higher-level thinking. Performance based assessment, on the other hand, can give us a more complete picture of student achievement.

What is performance based assessment?
Performance based assessments ask students to show what they can do given an authentic task which is then judged using a specific set of criteria.
Where can performance based assessment be used?
Performance based assessments provide teachers with information about how a student understands and applies knowledge. They can be used to evaluate reasoning, products, and skills that can be observed and judged using specific criteria.
Tasks that have more than one acceptable solution often lend themselves well to a performance based assessment, since they may call for the student to use higher-order thinking skills such as experimenting, analyzing or reasoning.
Examples of learning that can be measured well using a performance based assessment include: giving an oral presentation; writing a research paper; operating a piece of equipment; and creating and conducting a science experiment (see a definition of performance assessment).

How do you measure student performance?
Student performance tasks are measured using performance criteria. Creating performance criteria serves two purposes. First, it defines for the student and the teacher what the expectations of the task are. Second, well-defined criteria allow the teacher and student to evaluate the task as objectively as possible.
If performance criteria are well defined, another person acting alone would be able to evaluate the student accurately and easily. In addition, well-written performance criteria allow the teacher to be consistent in scoring over time which is especially good when evaluating skills (Stiggins, 1997).
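
To make the consistency point concrete, a minimal sketch follows: it computes simple percent agreement between two raters who scored the same tasks against shared performance criteria. All names and scores are hypothetical illustrations, not data from the source; well-defined criteria should push agreement toward 1.0.

```python
# Minimal sketch (hypothetical data): percent agreement between two raters
# scoring the same student tasks against shared performance criteria.

def percent_agreement(rater_a: list[int], rater_b: list[int]) -> float:
    """Fraction of tasks on which both raters assigned the same rubric score."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("raters must score the same non-empty set of tasks")
    matches = sum(a == b for a, b in zip(rater_a, rater_b))
    return matches / len(rater_a)

# Hypothetical rubric scores (0 = "not present" ... 4 = "exemplary") for 8 tasks.
teacher_scores   = [4, 3, 2, 4, 1, 3, 3, 0]
colleague_scores = [4, 3, 2, 3, 1, 3, 2, 0]
print(f"Agreement: {percent_agreement(teacher_scores, colleague_scores):.2f}")  # 0.75
```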

How can you document student performance?
Student performance can be documented in four ways (a small sketch of the first two options follows the list):
• Rubric – A rubric is a rating scale that shows to what degree a criterion is met. Most rubrics use a four- or five-degree scale that allows the teacher to evaluate a performance criterion from "not present" to "exemplary".
• Checklist - A checklist is a simpler version of a rubric and usually documents only whether or not certain criteria were met during the task.
• Narrative - A narrative is a written record that explains exactly how well a student has met the performance criteria.
• Memory - Using no mechanical means, the teacher observes the student performing the task. This mental information is then used to determine whether or not the student was successful in meeting the performance criteria.
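
As a closing illustration, here is a minimal sketch of the first two documentation options: one performance task recorded with a five-degree rubric and with a simpler met/not-met checklist. The criteria names and intermediate scale labels are assumptions for illustration only.

```python
# Minimal sketch (illustrative criteria and labels): documenting one
# performance task with a five-degree rubric and with a checklist.

RUBRIC_SCALE = {0: "not present", 1: "emerging", 2: "developing",
                3: "proficient", 4: "exemplary"}  # intermediate labels assumed

# Rubric record: each criterion receives a degree on the scale.
rubric_record = {
    "states hypothesis clearly": 4,
    "controls variables": 3,
    "interprets results": 2,
}

# Checklist record: each criterion is only marked met / not met.
checklist_record = {criterion: score >= 3 for criterion, score in rubric_record.items()}

for criterion, score in rubric_record.items():
    status = "met" if checklist_record[criterion] else "not met"
    print(f"{criterion}: {RUBRIC_SCALE[score]} (checklist: {status})")
```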