www.au.dk

Multiple Choice Tool

About this tool

This site presents a free tool for conducting multiple choice tests in the style of the paper:

     A Singular Choice for Multiple Choice
     Gudmund S. Frandsen and Michael I. Schwartzbach
     ACM SIGCSE Bulletin (Inroads), December, 2006

which presents the first (and only) mathematically correct scoring strategy for multiple choice tests allowing partial knowledge. To use the tool, first download the Java file multiple.jar.

Describing a test

A test is described as an XML document that contains LaTeX fragments as character data. As an example, consider the file tiny.xml, which also uses the file tree.ps. The format is assumed to be fairly self-explanatory. A test consists of a preamble and a frontpage followed by a sequence of group elements. Each group contains a collection of question elements that deal with a (possibly empty) example. Each question consists of a what element that poses the question and a sequence of answer elements, of which exactly one has a true attribute.

The test may be viewed in LaTeX format through the following Unix commands:

java -classpath multiple.jar Multiple -latex tiny.xml output.tex
latex output.tex
dvips -o output.ps output.dvi
ps2pdf14 output.ps
Look at the resulting files: output.tex and output.pdf. By default, each question is called "Question", but you may choose another label by setting the question attribute in the test root element.

Generating test copies

The test is assumed to be conducted in an ordinary auditorium. To remove any benefit from peeking, the order of questions and answers is randomly permuted in each copy of the test (however, questions are never moved between different group elements). The following Unix commands generate 4 copies of the test:
java -classpath multiple.jar Multiple -generate tiny.xml copies.tex 4
latex copies.tex
dvips -o copies.ps copies.dvi
ps2pdf14 copies.ps
Look at the resulting files: copies.tex and copies.pdf. Note that each copy carries a 4-digit code in its top-left corner.
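
The per-copy permutation described above can be pictured with a minimal Python sketch. It is not the tool's actual algorithm: the function name shuffle_copy, the nested-list representation of a test, the use of a seed, and the choice to keep the groups themselves in their original order are all assumptions made for illustration.

```python
import random

def shuffle_copy(groups, seed):
    """Return one permuted copy of a test.

    groups: list of groups, each a list of (question_text, answers) pairs.
    Questions are shuffled only within their own group; the order of the
    groups themselves is kept (an assumption, not confirmed by the tool).
    """
    rng = random.Random(seed)  # hypothetical: one seed per generated copy
    copy = []
    for group in groups:
        # Permute the answers of every question, then the questions of the group.
        questions = [(text, rng.sample(answers, len(answers)))
                     for text, answers in group]
        rng.shuffle(questions)
        copy.append(questions)
    return copy
```

Because questions never leave their group, a shared example stays next to all the questions that refer to it.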

Each copy starts on a fresh double page, so it is safe to print in two-sided format. If your printer can add staples, you may instead want to use the mk script, which is invoked as:

mk tiny.xml 4
to generate, print, and staple 4 copies (note that you must modify the script to contain the appropriate print command).

Grading a test

After the test, the answers are collected. Now follows a manual process of typing the answers into a file in the following format:
20060001 7948 1bc2d3-4bc5a
20060002 8087 1b2ab3-4-5ab
20060003 9294 1a2a3b4ad5d
20060004 1063 1d2ac3b4ab5b
There is one line for each student, consisting of the student ID, the 4-digit code for the given copy of the test, and a string that records the marks. For redundancy, the number of each question must be written, followed by the letters corresponding to the selected answers (if no answers have been selected, write a "-"). Admittedly, this is a bit tedious; the best strategy is for one person to read the marks while another writes the file. The file enter.jar contains a small application that makes this task easier. It is invoked once for each student and generates a single line that can be appended to the score file. Thus, for the first student above, use the Unix command:
java -classpath enter.jar Enter 20060001 7948 >> scores.txt
and proceed to enter the characters:
b c space d space space b c space a space return
In practice, this approach seems to be about 4 times faster than using a text editor. The tool provides a status window that shows the current question number and lets you edit the most recently entered characters.
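
The score-line format and the keystroke convention described above can be made concrete with a short Python sketch. Both function names are hypothetical, and keys_to_marks only mirrors the keystroke example given here (letters select answers, space moves to the next question, return ends the line); the real Enter application may behave differently in cases not shown.

```python
import re

def parse_score_line(line):
    """Split one score-file line into (student_id, copy_code, {question: letters})."""
    student_id, copy_code, marks = line.split()
    # The marks string is a run of <number><letters> or <number>- items.
    if not re.fullmatch(r"(\d+(-|[a-z]+))+", marks):
        raise ValueError(f"malformed marks string: {marks!r}")
    selections = {}
    for expected, item in enumerate(re.finditer(r"(\d+)(-|[a-z]+)", marks), start=1):
        number, letters = int(item.group(1)), item.group(2)
        # The redundant question numbers let us catch skipped or repeated questions.
        if number != expected:
            raise ValueError(f"expected question {expected}, found {number}")
        selections[number] = "" if letters == "-" else letters
    return student_id, copy_code, selections

def keys_to_marks(keys):
    """Turn a keystroke sequence into a marks string (assumed behaviour)."""
    marks, current, number = [], "", 1
    for key in keys.split():
        if key == "return":
            break
        if key == "space":
            # Close the current question; no letters means a blank answer.
            marks.append(f"{number}{current or '-'}")
            current, number = "", number + 1
        else:
            current += key
    return "".join(marks)
```

For the first student, the keystrokes listed above yield the marks string 1bc2d3-4bc5a, and parsing the full line gives the selections bc, d, none, bc, and a for questions 1 to 5.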

Assume then that the resulting score file is called scores.txt. The test is then scored by the Unix command:

java -classpath multiple.jar Multiple -evaluate tiny.xml scores.txt
This generates output such as:
Question 5 is wrong!
20060001:  23%
------------------------------
Question 1 is wrong!
20060002:  12%
------------------------------
20060003:  88%
------------------------------
20060004:  77%
------------------------------
Questions that are wrong (meaning that the correct answer is not among the selected ones) are listed above the score of the student concerned. It is then easy to manually double-check that they are indeed incorrect. Note that a score may be low even when no answers are listed as wrong, namely if many answers have been selected for a given question or if it has been left blank. The output lists the percentage score for each student ID, which may then be translated into a grade by whatever means are appropriate.
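
If the scores are to be processed further, the output shown above can be read back into a table. The following Python sketch is based only on the sample output on this page; the function name parse_evaluation is hypothetical, and the assumptions that flagged questions always precede the score line they belong to and that a score may carry a minus sign are not confirmed by the tool.

```python
import re

def parse_evaluation(output):
    """Map each student ID to (percent, [question numbers flagged as wrong])."""
    results, wrong = {}, []
    for raw in output.splitlines():
        line = raw.strip()
        flagged = re.fullmatch(r".+ (\d+) is wrong!", line)
        scored = re.fullmatch(r"(\S+):\s*(-?\d+)%", line)
        if flagged:
            # Collect flagged questions until the student's score line arrives.
            wrong.append(int(flagged.group(1)))
        elif scored:
            results[scored.group(1)] = (int(scored.group(2)), wrong)
            wrong = []
        # Separator lines of dashes match neither pattern and are skipped.
    return results
```

Matching any label before the question number keeps the sketch usable when the question attribute has been set to something other than "Question".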

If a question is judged to be flawed (after the test has been conducted), it may be removed from the scoring process by adding an ignore attribute to the question element in the XML file.

Evaluating the results

After the test, it is interesting to see how each question was received by the students. The following Unix commands generate for each question a histogram of the possible answers (with the correct answer marked by an x):
java -classpath multiple.jar Multiple -histogram tiny.xml scores.txt histo.tex
latex histo.tex
dvips -o histo.ps histo.dvi
ps2pdf14 histo.ps
Look at the resulting files: histo.tex and histo.pdf.
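
The -latex, -generate, and -histogram invocations on this page all follow the same four steps, so they can be driven from one helper. The Python sketch below only builds the command lists; the function name pipeline and its parameters are hypothetical, and the commands themselves are copied from the examples above.

```python
def pipeline(mode, inputs, stem, extra=()):
    """Return the four commands that produce <stem>.tex and <stem>.pdf.

    mode:   "latex", "generate", or "histogram"
    inputs: input files that precede the .tex output on the command line
    extra:  arguments that follow it, such as the number of copies
    """
    return [
        ["java", "-classpath", "multiple.jar", "Multiple",
         f"-{mode}", *inputs, f"{stem}.tex", *extra],
        ["latex", f"{stem}.tex"],
        ["dvips", "-o", f"{stem}.ps", f"{stem}.dvi"],
        ["ps2pdf14", f"{stem}.ps"],
    ]
```

For example, pipeline("generate", ["tiny.xml"], "copies", ["4"]) yields the commands for generating 4 copies; each list can then be passed to subprocess.run(cmd, check=True) to execute it.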

Analyzing the test

As described in the paper, it is possible to compute a nonconfidence value for the test, roughly reflecting whether it is large enough to give reliable results. The Unix command:
java -classpath multiple.jar Multiple -analyze tiny.xml
computes a percentile value that ideally should be at most 5% (for our tiny test, which is of course too small, it is a whopping 29%). To lower the percentile, one may add more questions or more possible answers. In practice, 5% is hard to obtain, so settle for 10%.

Feedback

If you are using our system, we would like to know about it and receive your feedback. Ideas for improving the usability are also welcome. Please send e-mail to mis@cs.au.dk.