Given the group's limited experience with HCI-type evaluations, we have decided that this version of the benchmark will not include any such evaluation criteria, e.g., learning costs or usability for people with disabilities, which tend to be harder to define precisely. We have thus confined ourselves to two criteria, which are given below together with their corresponding quantitative measures.
Efficiency: This is measured by the following quantities:
Quality of answer: This is measured by the following quantities:
These terms are not used here in their traditional sense. The intention is to capture the level of satisfaction users express about whether or not they found what they were looking for (assuming they know that), and the level of confidence users have, based on system feedback, that they have succeeded. Either the terms should be carefully defined, or other terms should be used.