Grounds for Rejection – Machine Learning (Theory)

One Reply to “Grounds for Rejection”

Balaji Krishnapuram says:

4/4/2005 at 10:29 pm

The flaws John lists are absolutely critical for papers, and often result in outright rejection. However, there are also some other positive suggestions which are particularly useful to new grad students writing their first papers. These suggestions improve the chances of a paper getting accepted at a conference because they help a reviewer easily understand the paper even when he is pressed for time.

I have made these mistakes in the past, and as a reviewer I see the same problems in some papers I review. While these are not usually grounds for me to reject a paper, I now realize that if they are addressed well, the paper is a lot easier to read. Often these are the papers that are widely read, and hence widely cited and used in practice.

0. Before writing anything in LaTeX, and often even before running any experiments, I find it useful to first summarize the following as short bullets on a piece of paper. For most of these questions, if I can’t write the answer in just one or two lines of text, it usually indicates an incomplete understanding, and I need to spend more time thinking about the problem carefully.

(a) What is the problem solved in the paper, and why is this problem important? Try to be as specific and precise as possible.

(b) What are the most relevant previous papers that attempt to solve the same problem, and what are the relative strengths / weaknesses of those methods? It is important to be comprehensive here, even if in the final paper you don’t list all of them for want of space.

(c) What is the key (novel) idea introduced here, and intuitively why is this going to improve the solution?

(d) Exactly how is the proposed approach different from previous work, and why does this difference matter in practice?

(e) What are the assumptions that I make in deriving my theorems/algorithms/probabilistic models/equations? When will these assumptions help and when will they break? (It is critical to highlight both aspects.)
Also consider: what happens in the limiting cases (e.g., a large amount of data or very little data)?

(f) What “toy” experiments will clearly highlight the ideas from (c) and (d) in a visual and intuitive way that will help the reader fully understand and evaluate the proposed work?

(g) What are the most widely used benchmarks or real life data that can be used to illustrate the proposed methods? What are the benchmarks used by comparable papers from the literature?

(h) What exact details and results from the experiments should be recorded so that a complete stranger can: (1) reproduce the experiments and obtain the same results that I did; and (2) correctly evaluate the merits/weaknesses of the work in a scientific way even if he does not reproduce my experiments? It is a good idea to use public benchmarks or to make your data public, to ensure reproducibility (which is an important goal in any scientific work). It also helps to distribute your code publicly where feasible.

(i) To answer (h.2) one can: provide error bars wherever there is some randomization (e.g., in 10-fold cross-validation); use statistical hypothesis testing to show whether one method is truly superior to another or whether the apparent differences are purely artefactual; or provide Bayesian marginal likelihood (“evidence”) values for the model. Each of these approaches suffers from its own limitations (and has its critics), but seeing at least one of them is very useful for reviewers and readers: it allows them to objectively evaluate whether the method is worth using/publishing.
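
As a minimal sketch of the first two options in (i), the snippet below computes a mean with a standard error over cross-validation folds and a paired t statistic between two methods evaluated on the same folds. The per-fold accuracies and the names acc_a, acc_b, mean_and_stderr, and paired_t_statistic are hypothetical placeholders invented for this illustration, not results from any paper. Cross-validation folds share training data, so the independence assumption behind the t-test holds only approximately; this is one of the limitations mentioned above.

```python
import math
import statistics

# Hypothetical per-fold accuracies from 10-fold cross-validation
# (placeholder numbers for illustration only, not real results).
acc_a = [0.81, 0.79, 0.84, 0.80, 0.83, 0.78, 0.82, 0.85, 0.80, 0.81]
acc_b = [0.78, 0.77, 0.82, 0.79, 0.80, 0.77, 0.80, 0.82, 0.79, 0.78]


def mean_and_stderr(scores):
    """Return the mean and the standard error of the mean."""
    mean = statistics.mean(scores)
    stderr = statistics.stdev(scores) / math.sqrt(len(scores))
    return mean, stderr


def paired_t_statistic(xs, ys):
    """Paired t statistic on per-fold differences (same folds for both methods)."""
    if len(xs) != len(ys):
        raise ValueError("both methods must be evaluated on the same folds")
    diffs = [x - y for x, y in zip(xs, ys)]
    mean_diff, stderr_diff = mean_and_stderr(diffs)
    return mean_diff / stderr_diff


mean_a, se_a = mean_and_stderr(acc_a)
mean_b, se_b = mean_and_stderr(acc_b)
t_stat = paired_t_statistic(acc_a, acc_b)

# Report error bars alongside the means, and the test statistic
# (to be compared against a t distribution with len(folds) - 1 = 9 degrees of freedom).
print(f"method A: {mean_a:.3f} +/- {se_a:.3f}")
print(f"method B: {mean_b:.3f} +/- {se_b:.3f}")
print(f"paired t statistic (9 degrees of freedom): {t_stat:.2f}")
```

The pairing matters: both methods see the same folds, so the test looks at per-fold differences rather than comparing two independent averages.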

If our ultimate goal as scientists is to make meaningful contributions that will be useful to society, each of the above comments in (0) is important. Often several iterations are necessary (especially when experimental results are unexpected or negative) before I understand the problem well enough to write a paper about it.

Once we have done the above (and have performed the necessary experiments), we can get down to writing the paper in a form that makes for easy reading. It is important to point out that even if there are a few spelling or grammatical mistakes, a paper can be well written in the sense that a reader can understand its message easily. Similarly, even with no syntactical errors, a paper can be said to be badly written if it is not easily understood by a technically sophisticated reader.

1. The structure of the paper can convey a lot of information. The main goal of good structure is that a reader knows (even before reading the text) exactly what to expect while reading any part of the paper; he should also be able to quickly locate exactly where in the paper he can find any particular idea explained. This is almost always the key difference between well written papers (which have a higher chance of acceptance at conferences) and the rest of the submissions. Some empirical suggestions/evidence in the context of grant proposals (that got funded) may be useful:
http://class.ee.iastate.edu/berleant/home/me/cv/papers/typography.htm

(A) It is always a good idea to organize the presentation of the material so that related ideas/statements are located in one contiguous space (rather than distributing them throughout the paper). For example, a common beginner’s mistake is to make different (but related) statements throughout the paper about how the paper differs from previous work, usually by pointing this out in each section as each novel idea in the paper is introduced. Instead of this, I find it very helpful to see them all collected in one place: when related ideas are placed together, the reader finds the paper a lot easier to understand.

(B) Similarly, structuring sentences or paragraphs within a section is also important: for example, a whole page written as a single paragraph introduces a small strain or anxiety (and even fear) in a reader, and he will already be mentally “tuned out” even before he starts reading it. The use of colons, semicolons, and other such devices helps to make the structure of a sentence easily parsable. It is not always necessary to break related ideas into separate sentences just to keep each sentence short; good structure can keep a longer sentence understandable.

(C) As far as possible the order of presentation of ideas should be carefully thought out. Ideally, the material in each section should lead seamlessly (and logically) to the ideas presented in the next section.

(D) Clear section/subsection headings, figure/table captions, and boldface headings to start (some particularly important) conclusions or paragraphs are all very useful so that a reader will understand the structure and will know what to expect even before he reads the text. This makes it easier for the reader and reduces the strain that is always there in reading about a new idea in any paper.

(E) As a non-native speaker of English, I have often found it necessary to write some parts of the text and then to subsequently spend a fairly significant amount of time re-structuring it so that the sentence construction is more “natural” and easily read by native speakers of the language. I sometimes need to make a special effort to avoid constructions which may be natural in my mother tongue, especially when they do not necessarily read easily in English. It is important to budget for this additional effort while writing papers for a conference deadline.

2. It is *very* important to write a clear but succinct abstract, introduction, and conclusion section: if these parts are well written, it is a lot easier to understand the rest of the paper. On the other hand, if these parts are poorly written, the rest of the paper is difficult to understand even if the other sections have been well presented. It is also important to clearly mention the related work *in a separate (sub)section*, to highlight exactly how your paper differs from the previous methods in the literature, and what the key contributions of the paper are.

3. While writing equations it is useful to consider the following:
(A) It is absolutely essential to ensure that the notation used in the paper is (a) clear; (b) follows the established conventions in the community which will read the paper; and (c) remains consistent in the whole paper. This is usually a problem overlooked by grad students writing their first papers, but it is essential so that a reader can understand the paper easily.

(B) It is usually an excellent idea to first motivate the idea, then explain the intuition in words, and only subsequently state the equation formally. After an equation is provided, it is very useful to explain the intuition behind each term in the text. Reading a long stream of equations without explanation is not easy for most people, and the investment of time, effort and space (in a conference submission with page limits) is usually well rewarded by the higher chances of the paper being understood and hence accepted.

(C) It is sometimes a good idea to move some of the theorems/equations to the appendix if these theorems/equations do not carry significant intuition and are only necessary as a step in arriving at an important result.

(D) Instead of trying to explain everything about an idea in English, it is useful to introduce notation early, and to then use equations accompanied by a smaller amount of text.
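
As a hypothetical illustration of (B), here is how an equation might be presented: motivation in words first, then the formal statement, then one sentence per term. The ridge regression objective and all symbols in it are assumptions chosen for this sketch, not taken from any particular paper.

```latex
% Motivation (in words, before the equation): we want weights that fit
% the training data but do not grow large, since large weights tend to
% overfit when data is scarce.
\begin{equation}
  \hat{w} = \arg\min_{w} \; \sum_{i=1}^{n} \left( y_i - w^\top x_i \right)^2
            + \lambda \, \| w \|_2^2
\end{equation}
% Intuition for each term (in the text, right after the equation):
%   \sum_i (y_i - w^\top x_i)^2 : squared prediction error on the n
%                                 training pairs (x_i, y_i).
%   \| w \|_2^2                 : penalty that discourages large weights.
%   \lambda \ge 0               : trades off data fit against the penalty;
%                                 \lambda = 0 recovers ordinary least squares.
```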

4. Often, intuitions are better explained with the help of a figure, graph, or a graphical representation of the probabilistic model relating the different variables used in the paper. The latter serves both to explain and summarize the notation, and also to present the big picture graphically so that a long series of equations can be more easily understood.
(When I first started writing papers I relied too much on text and managed to confuse the presentation; this is common in papers by new grad students.)

5. While explaining an abstract notion, definition, etc., it is usually a good idea to state the idea and then immediately present a concrete example of it. For example, after presenting a general theorem, it helps to present a few practical special cases. This is particularly true of intuitive and easy special cases which may be fairly trivial by themselves, but which are a natural consequence of the theorem.

6. None of the above is a rule that has to *always* be followed. Each of these is a suggestion, and should not be followed in special cases where the alternative will make the paper easier to read.

7. It is useful to ensure that you do not commit any of the cardinal sins mentioned here:
http://www.maths.uwa.edu.au/~berwin/humour/invalid.proofs.html

Writing papers clearly is an acquired skill that often requires effort and experience as well as guidance from co-authors (well, that was at least the case for me). I found the notes, tutorials, and review papers by Tom Minka, Matthias Seeger, David MacKay, Ed Jaynes, Stephen Boyd and others very useful, and my thought process has been significantly influenced by their presentation of the material. From their impact on me I surmise that clear explanation has the potential to reach a large community and achieve a large number of citations to boot; on the other hand, even very high quality technical work can take a long time to impact the community if it is not communicated well. “Learning” to write good papers from such examples is a skill that ML people will probably find useful 🙂
