Analyzing and Improving a Test Using Statistics

             Question 1  Question 2  Question 3  Question 4  Question 5
Student 1        2           2           0           2           0
Student 2        2           2           0           2           0
Student 3        2           0           0           2           0
Student 4        2           2           0           2           0
Student 5        2           2           2           2           2
Student 6        2           0           0           2           0
Student 7        2           2           0           2           0
Student 8        0           0           0           2           0
Student 9        2           2           0           2           0
Student 10       2           2           0           2           0
Mean grade       1.8         1.2         0.2         2           0.2

The mean scores for questions 1 to 5 are 1.8, 1.2, 0.2, 2, and 0.2 respectively. Each figure is the total score for the question divided by the number of students who sat for the test. This information is represented in the following graph.

The mean score for the entire paper is 1.08. This equals the sum of the mean scores for each question divided by the number of questions in the test.
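The two calculations above can be sketched in a few lines of Python. This is a minimal illustration, not part of the original report: the function names are hypothetical, and the scores are copied from the table as it appears. Run on those tabulated scores, the sketch gives 1.4 for question 2 (seven students scored 2) rather than the reported 1.2, and therefore 1.12 rather than 1.08 for the whole paper, so either the table or the reported mean presumably contains a transcription slip.

```python
# Scores copied from the table: one row per student, one column per question.
scores = [
    [2, 2, 0, 2, 0],  # Student 1
    [2, 2, 0, 2, 0],  # Student 2
    [2, 0, 0, 2, 0],  # Student 3
    [2, 2, 0, 2, 0],  # Student 4
    [2, 2, 2, 2, 2],  # Student 5
    [2, 0, 0, 2, 0],  # Student 6
    [2, 2, 0, 2, 0],  # Student 7
    [0, 0, 0, 2, 0],  # Student 8
    [2, 2, 0, 2, 0],  # Student 9
    [2, 2, 0, 2, 0],  # Student 10
]


def question_means(rows):
    """Mean per question: column total divided by the number of students."""
    n_students = len(rows)
    return [sum(column) / n_students for column in zip(*rows)]


def paper_mean(means):
    """Mean for the paper: sum of question means divided by number of questions."""
    return sum(means) / len(means)


means = question_means(scores)
overall = paper_mean(means)
print([round(m, 2) for m in means])
print(round(overall, 2))
```

Feeding the reported means directly into the same formula, `paper_mean([1.8, 1.2, 0.2, 2, 0.2])`, reproduces the 1.08 quoted in the text.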

The score for each question in the test is represented in the following graph. The number of questions done is represented on the left-hand side of the graph.

From the test report generated, it is evident that most of the students did not do well in questions 3 and 5. Most students scored nothing on these questions. As a result, the mean scores of these two questions are very low compared to the other questions. The mean score for question 4 is the highest possible, showing that all students performed well on that question. Question 2 was fairly well done, and from the report it is evident that most of the students scored above the mean for that question.

In the report above, questions 3 and 5 were performed poorly by most students. This may have resulted from a lack of proper proofreading or from typing errors in these questions. As a result, students may have misinterpreted the questions and provided wrong answers, which led to the poor grades recorded. Compared with question 2, which most students passed, these two questions raise issues that need to be discussed by the exam body, starting with the collection of views from the examinees.

Qualitative analysis procedures mostly include activities such as proofreading the exam to correct typographical errors and to identify grammatical cues that might tip off the correct answers to examinees. A careful reading of the entire material is also important. These procedures can also include small group discussions among exam experts. The group may also include examinees who have already taken the exam, or departmental student representatives. This lets examinees give their views verbally as they respond to every item in the exam. These procedures allow the examination instructors to produce a report on student performance. Through this, the instructors can identify whether a student's poor or good performance resulted from misinterpretation of items in the test, and can determine why such misinterpretation may have occurred among some students and not all (Creswell, 2009).

In addition to the qualitative analysis described above, quantitative procedures also need to be included. In this process the following numerical indicators are derived: item discrimination, distractor power, and item difficulty.

The item difficulty index is an appropriate choice of statistic for achievement or aptitude tests. It focuses on correct versus incorrect answers and can easily be derived for true-false and multiple-choice items. It also applies to essay items, where the instructor is usually in a position to convert the range of possible values into passing and failing categories.

This index, symbolized p, is computed by simply dividing the number of test takers (in our case, the students who sat for the test) who answered the question correctly by the number of students who answered the question. Take question 2 and question 5 as examples. For question 2, 7 students answered correctly out of the 10 who answered, so the difficulty index is 7/10 = 0.7. For question 5, 1 student answered correctly out of the 10 who answered, so the difficulty index is 1/10 = 0.1.
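The calculation of p can be sketched as follows. This is only an illustration under two assumptions that the report does not state explicitly: a score of 2 is treated as a correct answer and a score of 0 as incorrect, and every student is counted as having answered every question. The function name is hypothetical, and the two lists are the question 2 and question 5 columns of the score table.

```python
FULL_MARKS = 2  # assumption: a score of 2 means the item was answered correctly


def difficulty_index(item_scores, full_marks=FULL_MARKS):
    """p = number who answered correctly / number who answered the item."""
    answered = len(item_scores)
    correct = sum(1 for score in item_scores if score == full_marks)
    return correct / answered


# Columns for question 2 and question 5, students 1 to 10.
question_2 = [2, 2, 0, 2, 2, 0, 2, 0, 2, 2]
question_5 = [0, 0, 0, 0, 2, 0, 0, 0, 0, 0]

print(difficulty_index(question_2))  # 0.7
print(difficulty_index(question_5))  # 0.1
```

A lower p means a harder item, so the name "difficulty index" is slightly counterintuitive: p is really the proportion of students who found the item easy enough to answer correctly.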

From the results above, it is evident that question 5 was far more difficult than question 2 and that students performed poorly on it, as the comparison of the two difficulty indices shows. More insight into these test results can be gained by computing the item difficulty for different subgroups in the class, such as those who performed well and those who performed very poorly. It is also important to note that for question 2 the difficulty index is 0.7, which is 70%. This indicates that 70 percent of the students, more than half, answered question 2 correctly, but the index alone does not tell why the others did not. It may point to issues such as the instructors' failure to teach the key areas assessed by these items. It is also not evident whether the students failed in their duty to study. To answer these questions, other quantitative analyses need to be used (Day & Underwood, 1967).

The other quantitative analysis is the item discrimination index. This index takes into account the fact that exam takers often answer an exam in different ways. It also tries to answer a question of key interest to faculty: whether an item distinguishes the students who did well on the overall exam from those who did poorly. In a technical sense, the discrimination index reflects the item's validity, which is simply the extent to which the test items tap the purposes they were intended to assess. The index can be computed with a number of techniques, and the one to apply depends on the nature of the test.

Given the nature of the report above, the most suitable discrimination index is the one that parallels the difficulty index. This technique can be used when the test items are scored in a dichotomous way, that is, simply as correct or incorrect. The main aim of the test is to determine whether the students understand the taught material. Based on this report, we want to determine why students performed extremely well on some questions and poorly on others. The index is calculated as follows. The first step is to divide the test takers into two groups, high scoring and low scoring. Then compute the difficulty index separately for the upper-scoring group and the lower-scoring group. Finally, subtract the lower group's index from the upper group's index to get the discrimination index for the item.
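The three steps above can be sketched in Python as follows. This is a hedged illustration rather than the procedure actually used for the report: it assumes the two groups are the top and bottom halves of the class ranked by total score, that a score of 2 counts as correct, and that tied totals are split in table order, which is an arbitrary choice. The function names are hypothetical.

```python
# Scores copied from the table: one row per student, one column per question.
scores = [
    [2, 2, 0, 2, 0], [2, 2, 0, 2, 0], [2, 0, 0, 2, 0], [2, 2, 0, 2, 0],
    [2, 2, 2, 2, 2], [2, 0, 0, 2, 0], [2, 2, 0, 2, 0], [0, 0, 0, 2, 0],
    [2, 2, 0, 2, 0], [2, 2, 0, 2, 0],
]


def proportion_correct(group, item, full_marks=2):
    """Difficulty index p of one item within one group of students."""
    return sum(1 for row in group if row[item] == full_marks) / len(group)


def discrimination_index(rows, item, full_marks=2):
    """D = p_upper - p_lower for the item at column index `item`."""
    # Step 1: rank students by total score and split into two halves.
    # sorted() is stable, so students with equal totals keep table order.
    ranked = sorted(rows, key=sum, reverse=True)
    half = len(ranked) // 2
    upper, lower = ranked[:half], ranked[-half:]
    # Step 2: compute the difficulty index separately for each group.
    p_upper = proportion_correct(upper, item, full_marks)
    p_lower = proportion_correct(lower, item, full_marks)
    # Step 3: subtract the lower group's index from the upper group's.
    return p_upper - p_lower


for question in range(5):
    d = discrimination_index(scores, question)
    print(f"Question {question + 1}: D = {d:.1f}")
```

For question 4, which every student answered correctly, both groups have p = 1, so D is 0. Because several students share the same total score, a different way of breaking ties would move students between the two groups and change D for the other questions, which is one reason the figures from such a small class should be read with caution.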

In the report above, we can use the example from the difficulty index to compute the discrimination index. In question 2, 7 out of 10 students answered correctly and 3 answered incorrectly. With D = p_upper - p_lower, this gives 0.7 - 0.3, so D = 0.4. Those who answered the question correctly were more than half of the class, which suggests that a high number of students were well prepared in this area. It also suggests that many of the students who passed the item were students who did well on the overall test.

From this report, it can be seen that there are items passed by all students, like question 4, and items failed by almost all, like question 5. There are reasons to include such items in an exam. An easy item might reflect that straightforward facts were taught well and, as a result, were understood by all students. Similarly, an exam instructor may decide to include extremely difficult items like question 5, which, as we have seen in the report, may pose a big challenge even to the most prepared students. Regardless of the reason, the instructor needs to be aware that none of these items functions to discriminate among those taking the test.

