IJCAI Report on Changes in the Review Process

Lessons learned from changes in the review process of IJCAI(-PRICAI) 2020

by Christian Bessiere, January 5th, 2021

Summary statement:
The mandatory re-submission declaration requirement was very
successful, providing more information, arguably higher quality, and
an increased acceptance rate for resubmitted papers (15.8% acceptance
rate vs. 10.7% for other papers). We continue to believe that
restricting the author response to factual errors and ethical
concerns, and limiting access to the SPC, results in positive outcomes
in the ideal case -- but there are challenges in "culture change" that
make this somewhat difficult to implement well. Allowing
supplementary materials at PC request during the review process has
suffered from the confusion caused by the restrictions imposed on
author response, but it has shown some unexpected benefits.
Triple-blind reviewing (PCs cannot see each other's names) and an
explicit expectation for authors to also serve as reviewers both help
to increase equity along multiple dimensions.

In this note, I discuss various changes in the review process of
IJCAI(-PRICAI) 2020 (apart from summary reject, which was analyzed in
another letter) and the lessons learned from those changes.

I'll first briefly recap the structure of the program committee of
IJCAI-PRICAI 2020. The IJCAI program committee was composed of PC
members (PCs), who write reviews, senior PCs (SPCs), who monitor
discussions and suggest accept/reject decisions, and area chairs
(ACs), who make the accept/reject decisions. In 2020, there were
associate program chairs (APCCs) on top of the ACs. Each of them was
in charge of supervising/helping a subset of the ACs. At the end of
the review process they checked more than 300 borderline decisions,
overriding about 15% of those decisions.

Discussions inside the board of trustees and with former program
chairs have raised several issues in the review process. I decided to
implement several changes to address some of these issues.

Preliminary remark:
The comments below are based on what I observed in the 200 to 300
papers for which I read the reviews, the discussion, and the
meta-review myself.

1. When a paper had been recently rejected from one of the other major AI conferences, it was mandatory to declare re-submission and to provide the reviews of the rejected submission
2. The response box was to be used only to declare factual errors or unethical review
3. The content of the response box was made visible to the SPC and AC only, that is, NOT visible to PCs
4. It was forbidden to add supplementary material at submission time. It was allowed at response time if requested by PC
5. Triple blind review process (PCs and SPCs didn't know the names of each other)
6. At submission time, authors of a paper were asked to agree to review up to three papers if requested

1. When a paper had been recently rejected from one of the other major AI conferences, it was mandatory to declare re-submission and to provide the reviews of the rejected submission

REASONS FOR CHANGE:
We decided to make the declaration of re-submission mandatory because
of the large proportion of papers which are repeatedly re-submitted
from one conference to the other without addressing any comments from
the previous reviews. The guidelines I gave to PCs were that they
could use the previous submission and reviews to check whether authors
had addressed the comments, but the PCs had the right to decide that a
paper was too weak for IJCAI despite addressing well the former
comments.

PROS:
The first and most visible effect of this re-submission policy is that
among the 6814 abstracts submitted, only 5147 turned into an actual
paper submission. This is probably because several conferences had
scheduled their notifications just before the IJCAI deadline. Many
authors of papers rejected from those conferences probably realized
that they did not have the time needed to address the reviewers'
comments before the IJCAI deadline.

Some authors criticized this new policy. Their argument was that if
you get rejected by bad PCs at the first submission, then this penalty
follows you to the next conference. These authors claimed that PCs
would be more negative if they already knew that the paper had been
previously rejected, and that they would reuse the arguments of the
previous PCs. I followed the review process of many of those
re-submissions, and the general trend contradicted that claim. In
many cases, PCs, even if they didn't like a paper, were positively
influenced by the fact that the paper addressed the comments from the
previous reviews. In extreme cases, some PCs based their review
solely on that fact. In the end, 15.8% of the papers declared as
re-submissions were accepted, compared to only 10.7% of the other
papers.
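
The figures quoted in this section can be put side by side with a few
lines of arithmetic. The sketch below uses only the four numbers given
above; the variable and function names are mine, and the resulting
ratio is a raw comparison that says nothing about why the two
acceptance rates differ.

```python
# Figures quoted in the report for IJCAI-PRICAI 2020.
ABSTRACTS_SUBMITTED = 6814
PAPERS_SUBMITTED = 5147
ACCEPT_RATE_RESUBMISSIONS = 15.8  # percent
ACCEPT_RATE_OTHER_PAPERS = 10.7   # percent


def percent(part: int, whole: int) -> float:
    """Return part/whole as a percentage, rounded to one decimal."""
    return round(100 * part / whole, 1)


# Abstracts that never turned into a full paper submission.
abstracts_dropped = ABSTRACTS_SUBMITTED - PAPERS_SUBMITTED
conversion_rate = percent(PAPERS_SUBMITTED, ABSTRACTS_SUBMITTED)

# Gap between the two acceptance rates, in percentage points and as a ratio.
gap_in_points = round(ACCEPT_RATE_RESUBMISSIONS - ACCEPT_RATE_OTHER_PAPERS, 1)
relative_ratio = round(ACCEPT_RATE_RESUBMISSIONS / ACCEPT_RATE_OTHER_PAPERS, 2)

print(f"Abstracts not followed by a paper: {abstracts_dropped}")
print(f"Abstract-to-paper conversion rate: {conversion_rate}%")
print(f"Acceptance gap: {gap_in_points} percentage points")
print(f"Acceptance ratio (re-submissions / others): {relative_ratio}")
```

In other words, roughly one abstract in four was not followed by a
paper, and declared re-submissions were accepted at about one and a
half times the rate of the other papers.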

CONS:
On the negative side, the submission file was more difficult to
read. It was composed of a cover letter explaining the changes, the
former reviews, the former submission, and finally the IJCAI
submission. A few PCs were puzzled by this long PDF file. In
addition, some authors didn't anonymize the submission properly:
some cover letters contained the authors' names. This led to an
increased volume of emails and to additional clarifications for all
parties.

2. The response box was to be used only to declare factual errors or unethical review

REASONS FOR CHANGE:
I simply wanted to return to the original spirit of the rebuttal
phase. The rebuttal phase was created to limit, to the extent
possible, the frustration of authors whose paper got rejected because
of a review based on erroneous arguments. If we look at the guidelines
of the early years of rebuttal, they explicitly stated that the
rebuttal should be used only for spotting factual errors. But this
nice feature progressively led authors to comment on the whole review,
which makes the rebuttal more verbose and decreases its impact. The
rebuttal often becomes a dialogue between PC and author, and the
chances that it influences the SPC's decision drop significantly. The
second effect is that PCs tend to ask for more information from the
authors. We often see reviews
containing sentences such as: (1) "What would happen if your algorithm
was launched on black and white images?", (2) "Can you clarify why Z
is always greater than the sum of X and Y?", or even worse, a PC who,
during the discussion phase, says: (3) "I reject this paper because
the authors have not replied to my questions" or "...because I am not
convinced by their response". Questions of type (1) are typically
what should be asked at the end of the talk. The problem with
questions of type (2) is that only the PCs will see the clarification.
If clarifications are needed, either they are minor and the PC
recommends making them in the camera-ready copy (CRC), or they are
needed to make the paper acceptable, and a revision would be
required. Unfortunately, the format of IJCAI doesn't give us time for
a revision/re-review round. At least three more months would be
needed, and even more PC time. Finally, comments of type (3)
illustrate the pernicious behavior that this trend creates: it gives
the PC some unhealthy power over the author.

PROS:
PCs were no longer able to recommend rejection on the grounds that the
authors didn't address their comments the way they expected. In
the discussions, some PCs complained about the absence of response to
their questions. In those cases, the SPC had to remind them that the
authors were not allowed to respond.

A large number of authors told me that they appreciated not having to
respond to reviews when there was not much to say.

CONS:
This is the change that created the most fuss in the community.
Why such resistance? I had clearly underestimated the power of
habit. People do what they are used to doing and don't read the
guidelines carefully enough. (I saw some PCs replying to criteria used
in other major conferences but not in IJCAI!) The result was that many
PCs asked questions in their reviews. The way I communicated the
guidelines to PCs could perhaps have been better. My original policy
was to send as few emails as possible, to avoid flooding people with
information. But I quickly understood that PC members didn't read long
emails. (Interestingly, ACs received very few long emails and
immediately got the point.) I then started sending frequent short
emails with one piece of information at a time. It didn't improve the
situation much: some PCs complained that they were receiving too many
emails. Running such a review process in the middle of the first wave
of Covid-19 probably did not help, either. Some PCs told me they were
totally overwhelmed by the situation.

To solve the issue of reviews not following the guidelines, I told
authors to ignore questions from PCs. But this put a lot of stress
on them: they were afraid that their paper would get rejected if they
didn't respond. As a consequence, some authors responded to
questions/comments from PCs even when these did not concern factual
errors. This could have led to an unfair situation for those who had
not responded. Fortunately, PCs didn't have access to the response
box (see change 3 below).

3. The content of the response box was made visible to the SPC and AC only, that is, NOT visible to PCs

REASONS FOR CHANGE:
This change is strongly related to change 2. I made the response box
visible to the SPC (and AC) and not to the PCs because, in the face of
a factual error, authors do not dare to tell PCs that they are
wrong. Even when they do, the PC often evades the question and the
author receives final reviews in which the erroneous judgment is still
present. Concerning unethical reviews, the reason not to send the
response to the incriminated PC is obvious. The usual way is to
contact the PC chair, but very few authors do.

PROS:
I observed that when the factual error was correctly handled by the
SPC, the impact was significantly stronger than when it is sent to PCs
directly. SPCs were asked to resolve the factual errors either by
themselves or by involving PC members. Even when the PCs didn't
respond to the SPC about the claimed errors, the SPC often took the
factual error into account by simply ignoring the refuted review.
Concerning ethical issues, I was made aware of about a dozen clearly
unethical reviews that were detected thanks to this process. I
don't think all of the authors concerned would have dared to email me
directly.

CONS:
This new policy put more responsibility on the shoulders of the
SPCs, and some of them didn't want to take this responsibility. Some
SPCs didn't push the PCs until the error was resolved. A few of them
transferred the whole response box to the PCs despite being urged
to transfer only the points to be resolved.
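
As a rough illustration of the visibility rule of this section, and of
the guideline that SPCs forward only the points to be resolved, here
is a minimal Python sketch. The names Role, ResponsePoint,
can_read_response_box and points_to_forward are hypothetical; they do
not correspond to any CMT feature, and the sketch covers only the
three committee roles discussed here.

```python
from dataclasses import dataclass
from enum import Enum


class Role(Enum):
    PC = "PC"
    SPC = "SPC"
    AC = "AC"


# Under the 2020 policy, only SPCs and ACs could read the response box.
RESPONSE_BOX_READERS = frozenset({Role.SPC, Role.AC})


@dataclass(frozen=True)
class ResponsePoint:
    """One point raised by the authors in the response box (hypothetical)."""
    text: str
    needs_pc_input: bool  # decided by the SPC after reading the response


def can_read_response_box(role: Role) -> bool:
    """Return True if this committee role may see the response box."""
    return role in RESPONSE_BOX_READERS


def points_to_forward(points: list[ResponsePoint]) -> list[str]:
    """Return only the points the SPC wants the PCs to help resolve."""
    return [p.text for p in points if p.needs_pc_input]


response = [
    ResponsePoint("Review 2 says Theorem 1 assumes X; it does not.", True),
    ResponsePoint("General comments on the tone of Review 3.", False),
]
print(can_read_response_box(Role.PC))   # PCs cannot read the box
print(points_to_forward(response))      # only the claimed factual error
```

The point of the second function is the one made above: what reaches
the PCs is a selection made by the SPC, never the whole response box.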

4. It was forbidden to add supplementary material at submission time. It was allowed at response time if requested by PC

REASONS FOR CHANGE:
Appendices and links to supplementary material have been observed to
be an easy way to bypass the page limit. This is more and more common,
and it is unfair to those who respect the page limit. But some papers
really need an appendix/supplementary material (for instance,
theoretical papers with long proofs). Under the new policy, it was the
PC who decided whether the paper needed extra material to be fully
evaluated. Hence, supplementary material was allowed at response time
if requested by the PC.

PROS:
This new policy forced authors to describe their contribution within 6
pages, which was the original goal. When PCs requested full proofs of
theorems, it seems that this new policy worked well. Another
unexpected positive effect is that it has sometimes been used to
provide material that authors would not necessarily have put in
standard supplementary material. For instance, authors often report
experiments on a selection of instances, and the PC may wonder whether
the selection is biased. Here PCs were allowed to request the
full dataset in the supplementary material. I observed several such
cases.

CONS:
Some authors commented on the reviews inside the supplementary
material, bypassing the policy of changes 2 and 3. In addition, the
size was not limited (as opposed to the response box), making it even
more attractive. To avoid unfairness among papers, the IJCAI staff
browsed all supplementary materials (about 500) and blocked access to
the PDFs by PCs when the policy was violated.

All in all, I think that this change could have been good, but it was
somewhat tainted by the collision with changes 2 and 3.

5. Triple blind review process (PCs and SPCs didn't know the names of each other)

REASONS FOR CHANGE:
In recent years, IJCAI (and other conferences as well) has observed
an increase in institutionally organized cheating strategies. Some PCs
leak the names of the other PCs assigned to their friends' papers, and
the authors exert pressure on those PCs to get positive reviews. Triple
blind was a way to protect PCs from having their names disclosed to
authors.

PROS:
A direct effect has been to protect PCs from external pressure by
authors. It is difficult to assess how much it has prevented cheating
in general because we don't have quantified data from previous years
and we don't know whether other cheating strategies adapted to our
change have been applied.

CONS:
SPCs cannot weigh reviews by their amount of trust in a particular
person because they don't know who the PCs are. Some SPCs have used
the trusting score provided by CMT, but it seems that it remained
marginal.

Triple blind doesn't protect against collusion among people who are
both authors and PCs. We have observed groups of authors/PCs who share
the IDs/titles of their papers. They bid 'eager' (the strongest bid) on
their friends' papers to maximize their chances of being assigned
them. Once assigned, they score these papers 'strong accept'.

6. At submission time, authors of a paper were asked to agree to review up to three papers if requested

REASONS FOR CHANGE:
IJCAI receives more and more papers every year and it is difficult to
predict in advance how many PCs will be needed in each subject
area. In addition, we didn't know how many papers would go to full
review after the summary-reject phase, making the estimation even more
uncertain. Asking authors to become reviewers on demand was a kind of
safety net to keep the overall load on the shoulders of PCs light.
After the results of the summary-reject phase, we estimated the need
for reviewers per subject area and decided not to put authors in the
assignment system. (The tightest subject area was covered by relevant
PCs with an upper bound of 7 papers per PC.) Instead, authors were
asked to review papers at the end of the review process, when we
started detecting non-responsive PC members.

PROS-CONS:
Authors who were asked to review papers almost always accepted,
despite the short notice. Regarding the quality of their reviews, we
cannot draw definite conclusions because we did not recruit enough
such emergency reviewers. In addition, the CMT feature to assess the
quality of reviews was seldom used by SPCs on these reviews because
these emergency reviews came at the end of the process. From what I
saw, there were no obviously bad reviews. The main drawback when
looking for authors to whom to assign a paper is that authors had not
participated in the bidding phase, forcing us to assign based on
subject areas only.