Bayesian and Neural Systems
Machine learning research group in Edinburgh

Writing Rules

Amos Storkey


This document gives some basic rules for all writing that should be followed by students for formal documents (papers, annual reports, theses etc). For informal internal communication, there is more leeway for rough edges, uncertainty or even sloppiness, but even then good writing helps.

First, all students should read Style: Toward Clarity and Grace, inwardly digest it, and work on using its advice in writing. This should be done early in the PhD, and not be put off.

Even beyond the things mentioned in that book, there are specific and common issues for scientific writing. I note some of these issues below.

Point
A piece of writing has one point and only one point. It can have subpoints, but they must aid the main point. It must be blindingly obvious what the point is - everything should support that. A student should be able to answer the question "What is the point of this paper?" with an immediate, concise, one-sentence answer. And there is no exception to this rule. If a part of a document does not support and relate to the point, it doesn't belong in the document. If a piece of research or experiment does not support the point, the point must be changed. If a student thinks there really must be two points to make, then there should be two documents, or they should be unified into a greater singular point.
Audience
Writing is used to communicate to other people, not to get ideas down on paper. Writing and notetaking are completely different things. Students should always focus on how other, real, people who bleed will understand the document. For academic writing, do always think about what both reviewer one and reviewer two might say.
Narrative
Academic writing is not fiction, but it is still telling a story; it needs to be interesting, to unfold, to capture attention and excite the reader. It helps if the writer is excited about the work while writing it. Boring writing is ignored more often than writing that has an engaging narrative.
Structure
There are many ways to structure a paper, and this should be discussed with the supervisor. However, there are two common failures that should be avoided. The first of these failures is the motivational paper: one that spends far too long trying to introduce what it is doing, and motivate the approach, before it gets to the actual point. This can seem like a good idea, but as the reader has no idea where the author is going, it can be tiresome or even painful for that reader, who is basically asking first and foremost, "What is the point? What is the contribution?" However it is achieved, writing should be structured to get to the main contribution sufficiently quickly. One can always give reasoning post-hoc. The second failure is dismissiveness of others: research builds on the foundation of previous work in the community -- this should be established early in the paper as a basic act of humility. We are standing on the shoulders of giants. A short related-work section at the end of the paper doesn't pass muster.
Introduce and conclude
A paper is multi-scale. Almost every level (document, chapter, section, paragraph) is structured something like [context/point, supporting text, take-forward/summary]. E.g. for a typical paragraph: (a) the first sentence tells the reader what point is being addressed, (b) the following sentences elaborate on that point or make the argument, and (c) the final sentence states a summary. Structure is critical to communication.
Order
Scientific writers often choose to state first the premise, then the reasoning, and then finally state the concluding point. That seems traditional -- one should draw conclusions, not presume them. However, it ends up feeling a bit 'German' -- you are waiting for the verb at the end of the sentence before you can get an inkling of where things are going. It's easier on the reader if they can see the destination ahead. This is why much science is written as hypothesis-test-analysis (it's nothing to do with the so-called Scientific Method -- it is just science communication) or maths as theorem-proof. It is easier on the reader. State your point up-front, and then carefully and justifiably take the reader to that destination.
Sentences
Writing things that are supposed to be sentences that are missing the requirements of a sentence is a cardinal sin (titles, sections, paragraph headings don't need to be sentences). A sentence must generally have a subject -- which can be implicit for the imperative mood -- must have a verb and in many cases requires an object. There are very limited exceptions to these requirements. Students should never have such non-sentences in formal writing, and should never need a supervisor to point these issues out. Passing a formal document to a supervisor that has such non-sentences in it will be interpreted by the supervisor as student laziness. If anyone does not know how to tell if something is a sentence or not, then learning that is an immediate priority.
Ill-definition
Students should not use terms that have not been properly defined or introduced. This is really important. It is vital to provide a definition, even if it is just the sentence or paragraph before. This can easily become an issue when students try to say too much in every sentence/paragraph. All writing should build up conceptually. We start with defining key concepts, then the next concepts in terms of those. Once we have introduced something we can then say what is to be done with it. Then we can say what the benefit of that is. But if we try to talk about stuff the reader has not been introduced to, we have likely lost a percentage of our readership. No one can afford to do that. Nor can one use colloquial or common English terms as a proxy for technical terms to avoid defining them. That just adds vagueness to the lack of definition -- it makes things worse.
Vagueness
It is not uncommon for students to choose to use vague terms, where it is then not clear what is meant. This avoids precision, e.g. using "vision" instead of "computer vision", "worse" instead of "poorly performing", terms like "reasonable", "viable", "feasible", etc. where in each of these cases what is actually meant is something much more specific. Being specific makes things clearer not less clear. Also one should not use multiple different words for the same thing. Define a concept once, and stick with that term. Finally the use of "this", "those", "the aforementioned", should be avoided; the same applies to any term where there is any potential ambiguity about which of the many preceding concepts you are referencing back to. Qualifying "this" sufficiently to remove all ambiguity is fine. In technical writing, precision is key. Remove all chance of different interpretations of what you write.
Missing information
Many sentences can have terms that are ambiguous because vital information is missing, leaving it implicit. The writer might know what was meant but the reader likely doesn't. For example, one might say "normalisation" (of what?), "average" (of what?), "computation of the gradient" (of what?) "with respect to the weights", etc. There are many cases of this that supervisors see. All the information must be provided, even if that feels like it lengthens the sentence. Most English terms take arguments (just like functions!). The writer needs to provide all the arguments required for any word for the English to compile. Technical writing should not be assumptive. Even the assumptions should be defined. On another point, students should avoid abbreviations. It is not likely that the reader will remember all the abbreviations (which must always be defined in full at first use). One should write things out in full unless it really can't be done because the result is awful.
Long sentences
Long sentences are often bad. Sentences that try to say too much end up saying nothing. It is best to say less in each sentence. Make every sentence crystal clear. Split a sentence up if it is too long. Or better still, just don't say some of the things being said. Pick the most important.
Non-commensurate terms
A bit of a model can't be a prior. A representation of an item can't be a function. A set can't be an item in the same set. An image can't be in representation space. If there are sentences which effectively, or even implicitly, say A=B where A and B are different conceptual quantities, or reside in different spaces, then something has gone wrong. Also, writers sometimes do a switcheroo between sentences: one sentence talks of A as if it is a set, and the next sentence talks of A as if it is an item in that set. Academic English is a typed language. Don't generate type errors.
Ordering
In paragraphs, later sentences should relate to the first. Usually this involves a hook -- a word or concept in one sentence that is picked up on or elaborated on in another. If paragraphs appear to a casual reader to consist of unrelated sentences (even though they are related), that causes dissonance for the reader. Check paragraphs for this -- it is often straightforward to fix by adding a clearer hook. In sentences, the writer should ensure any concept that is already known is in the first part of the sentence, and the new concept in the second, not the other way round. For example "We denote the diameter of the large circle by d" is better than "d denotes the diameter of the large circle" if the document was previously describing the circle. But the latter is better than the former if an equation has just been introduced that included the term d. Any term used in the subject of, or early in, a sentence sounds to the reader as if it ought to be something they already know about or are aware of.
Assertions and justifications
Every claim or statement in academic writing needs to be justified. Either it is demonstrated by previous work (provided by a reference) or by the current work (a referenceable conclusion, or preceding argument). If it is not known how to handle this while writing, decide if it will be substantiated by previous or current work and put a placemarker (e.g. a blank citation) in the document as a reminder that this must be fixed.
Floats
Figures, tables etc. are not part of the writing itself, but all of them must be referenced within the writing. They must have captions, and those captions should be as self-contained as possible, and fully explain what can be seen in the figure/table. Every figure/table should have a singular reason for its inclusion in the document. That reason should be communicated clearly in the caption. Finally, a caption should always state the conclusion that the author wants the reader to draw from the figure/table. That way the reader can decide if they agree/disagree on the basis of the evidence, and if the reader agrees, then the reader can just remember the conclusion and not have to remember everything from the figure/table.
Consistency
All notation, maths, spelling, punctuation etc. must be consistent. Decide ahead on spelling (e.g. British English), punctuation choice (e.g. dash type). Decide the maths notation you will use and what terms you will use for concepts before doing any writing wherever possible. Keep capitalisation consistent in headings.
Capitalisation, references and footnotes
Students can find capitalisation a challenge. As mentioned above, capitalisation in headers is a choice, but should be consistent. However, in the body of the document, capitalisation should be kept to a minimum. Subjects (e.g. mathematics, machine learning) should not be capitalised, but specific figures, chapters, sections, tables etc. (Table 2, Chapter 3) are proper nouns (each refers to a singular object) and should always be capitalised, as should anything derived from a name (e.g. Laplacian). Learn how to use BibTeX properly to prevent both automatic non-capitalisation of proper nouns in citations and overcapitalisation. References are bracketed concepts. Don't use a reference as a noun, and space the citation properly so it is separated from the text. References come within the sentence. However, sentence-level footnotes should be referred to after the full stop.
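As a minimal sketch of the BibTeX point above: many bibliography styles lowercase everything in a title after the first letter, and wrapping a letter or word in an extra pair of braces protects it from that. The entry below is a hypothetical placeholder (the key, author, title, journal and year are all invented for illustration); only the brace protection is the point.

```bibtex
% Hypothetical placeholder entry, invented for illustration only.
% The inner braces protect the capital letters of words derived from names,
% and of acronyms, from being lowercased by the bibliography style.
@article{placeholder2024,
  author  = {Author, A. N.},
  title   = {A {B}ayesian view of {L}aplacian regularisation for {MCMC}},
  journal = {Journal of Placeholder Results},
  year    = {2024}
}
```

Protecting only the letters that must stay capitalised, rather than wrapping the whole title in a second pair of braces, leaves the style free to apply its own capitalisation to the rest of the title.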
Redundancy
Remove redundancy. Be very careful with any repetition. Don't use the words "obviously" or "trivially" (if it really was obvious or trivial, it wouldn't need stating that it was obvious or trivial) or "clearly" (make it clear, don't tell me it is!) or "of course" (if it is that obvious it doesn't need saying).
Articles
Many students come from countries that speak sensible, well-structured and consistent languages, including languages that don't use articles. We then force such students to write in English. We are very aware of the pain involved. Knowing when to put "a" or "the" in a document is a challenge, and supervisors are generally happy to be a little forgiving about this. But the more students can get this right, the more supervisors can spend time commenting on things that matter. Despite appearances to the contrary, there are generally consistent, if esoteric, rules for article use. Time that students put into learning how to get this right is certainly appreciated. And it is possible!
Brackets
Beware of the overuse of brackets. If you overuse brackets, your work may end up looking like this document.

Rightly, someone will now ask what the point of this document is. My answer is, "Academic writing requires purpose, precision, clarity, economy, authority and consistency throughout. Here are some ways to help ensure that." Yes, I know that is two sentences. I wrote one at first but then split it into two to make it clearer. Touché!*

* This is an exclamatory sentence -- one of those aforementioned (where!?) rare exceptions.

This document is made available under your choice of a CC0 or beerware licence.