Scientific Writing for Computer Science Students
Wilhelmiina Hämäläinen
Aalto University
All content following this page was uploaded by Wilhelmiina Hämäläinen on 06 February 2016.
Contents
1 Introduction
1.1 Goal 1: How to write scientific text in CS?
1.1.1 Problem
1.1.2 Example
1.1.3 Instructions
1.1.4 Writing tree t
1.1.5 Properties of a good tree t
1.2 Goal 2: How to write English?
1.3 Goal 3: How to write a master’s thesis?
1.4 Scientific writing style
1.4.1 Exact
1.4.2 Clear
1.4.3 Compact
1.4.4 Smooth
1.4.5 Objective
4.5.5 Phrases
4.5.6 Relative pronouns
4.5.7 Extra material: Tricks for gender-neutral language
4.6 Adjectives
4.6.1 Vague adjectives
4.6.2 Comparative and superlative
4.6.3 When you compare things
4.7 Adverbs
4.7.1 The position of adverbs in a sentence
4.7.2 Special cases
4.7.3 Extra: How to derive adverbs from adjectives?
4.7.4 Comparing adverbs
4.8 Parallel structures
4.8.1 Basic rules
4.8.2 Parallel items combined by conjunctions and, or, but
4.8.3 Lists
4.8.4 Parallel items combined by conjunction pairs
4.8.5 The comparative – the comparative
4.8.6 Parallel sentences
4.9 Prepositions
4.9.1 Expressing location
4.9.2 Expressing time
4.9.3 Expressing the target or the receiver: to or for?
4.9.4 Special phrases
4.10 Sentences
4.10.1 Terminology
4.10.2 Sentence types
4.10.3 Sentence length?
4.10.4 Word order
4.10.5 Combining clauses
4.10.6 Combining clauses by subordinating conjunctions
4.10.7 Relative clauses
4.10.8 Indirect questions
4.11 Paragraphs
4.11.1 Combining sentences in a paragraph
4.11.2 Dividing a section into paragraphs
4.11.3 Introductory paragraphs
4.12 Punctuation
4.12.1 Full-stop
4.12.2 Comma
4.12.3 Colon
4.12.4 Dash
4.12.5 Semicolon
4.12.6 Quotation marks
4.12.7 Parentheses
4.13 Genitive: ’s or of?
4.13.1 Special cases where ’s genitive is used for inanimate things
4.13.2 When of structure is necessary
4.13.3 Possessive form of pronouns
4.14 Abbreviations
7 Appendices
Appendix A: A simple LaTeX template
Appendix B: A LaTeX template for articles
Appendix C: A check list for the master’s thesis
References
Chapter 1
Introduction
1.1.1 Problem
Writing w is a mapping from a set of ideas I to a set of scientific
texts S, w : I → S.
1.1.2 Example
1.1.3 Instructions
1. Organize your ideas in a hierarchical manner, as a tree of ideas t (a ”minimal spanning tree” of the idea graph).
• title(n): the main title or the name of the chapter, section or subsection. In leaf nodes (paragraphs), title(n) = NULL.
The following algorithm describes how to walk through t in preorder and write it as a sequence s ∈ S (scientific text):
• For all leaf nodes n, the sizes of content(n) are balanced: each paragraph contains at least two sentences, but is not too long (e.g. ≤ 7 or ≤ 10 sentences).
• For all non-leaf nodes m, the sizes of content(m) are balanced. These introductory paragraphs can be very brief. They just give an overview of what will be covered in that chapter or section. Exceptionally, you can use more than one paragraph. Notice that it is possible to skip them entirely, but be systematic!
Alg. 1 WriteTree(t)
Input: tree of ideas t with root node n
Output: scientific text s
begin
    Write title(n)
    if (n is not a leaf node)
    begin
        // Writing an introductory paragraph:
        Write content(n)
        for all u = child(n)
            Write title(u)
        for all u = child(n)
            WriteTree(subtree of t rooted at u)
    end
    else
        // Writing a main paragraph:
        Write content(n)
end
• For all leaf nodes n_i in preorder, content(n_i) can refer only to previously written contents content(n_1), ..., content(n_{i−1}). E.g. you cannot define a deterministic automaton as the opposite of a non-deterministic automaton if you have not yet given the definition of a non-deterministic automaton. Exception: you can briefly advertise what will be described later, e.g. ”This problem is solved in Chapter X”.
The process has the same phases as a software project or any problem-solving activity:
1. Defining the problem: Discuss with your supervisor and define what the problem is. Try to understand it in a larger context: other related problems and subproblems. Read an introductory article about the topic or select the main books written about it. You can already generate several ideas for how to solve the problem, but don’t fix anything yet.
2. Specification: Specify your topic carefully. Don’t take too large a topic! Invent a preliminary title for your thesis and define the content at a coarse level (main chapters). Ask for your supervisor’s approval! Decide with your supervisor what material you should read or what experiments to run.
3. Design: Define the content more carefully: all sections and a brief description of what you will write in each of them. Define the main concepts you will need and fix the notation. Then you can write the chapters in any order you want. Also make a work plan: what you will do and when.
4. Implementation: You can write the thesis after you have read all the material or made all the experiments. However, you can begin to write some parts while you are still working. Often you have to change your design plan, but that is just life! Ask for feedback from your supervisor as your work proceeds.
1.4.1 Exact
• Word choice: make certain that every word means exactly what you want to express. Choose synonyms with care. Do not be afraid of repetition.
• Avoid vague expressions which are typical of spoken language. E.g. the interpretation of words which approximate quantities (”quite large”, ”practically all”, ”very few”) depends on the reader and the context. Avoid them especially when you describe empirical observations.
• Make clear what the pronouns refer to. The reader shouldn’t have to search the previous text to determine their meaning. Simple pronouns like this, that, these, those are often the most problematic, especially when they refer to the previous sentence. Hint: mention the noun, e.g. ”this test”.
→ See Section 4.5 Pronouns.
1.4.2 Clear
• Use illustrative titles which describe the essential content of a chapter or a section.
1.4.3 Compact
• Say only what needs to be said!
• Weed out overly detailed descriptions. E.g. when you describe previous work, avoid unnecessary details. Give a reference to a general survey or a review if available.
Notice: ”reason” and ”because” have the same meaning → don’t use them together!
• Use no more words than are necessary. Redundant words and phrases (which carry no new information) should be omitted.
1.4.4 Smooth
• Verbs: Stay within the chosen tense! No unnecessary shifts in verb
tense within
Hint: sometimes you can move the last word to the beginning and fill
in with verbs and prepositions
• Each pronoun should agree with the referent in number and gender.
1.4.5 Objective
• Use the 3rd person rather than the 1st person.
• Use words which are free from bias (implied or irrelevant evaluation)
Especially, be careful when you talk about
– gender
– marital status
– racial or ethnic groups
– disability
– age
Hints:
• Select an appropriate degree of specificity. When in doubt, prefer the more specific expression. E.g.
• original sources
• the papers should have appeared in a reviewed journal/conference
(i.e. reviewers have checked their correctness!)
• also technical reports and other theses
CHAPTER 2. SEARCHING, READING, AND REFERRING LITERATURE
3. Bibliographies
Task: Can you trust the information you find in Wikipedia? Why or why not? Why can Wikipedia not be used as a reference in a scientific text?
• goal
How to proceed?
• Begin from the familiar: notes and textbooks.
• If you make an internet query, prefer Google Scholar. Always check that the paper has been published!
• Write down the references – they can be hard to find afterwards! (Especially, store the BibTeX files.)
Tasks
• Practise using the most important digital libraries for CS: ACM, IEEE, and Springer (also the series Lecture Notes in Computer Science). Try to find at least one article in each library about Bayesian networks.
• You know only the author and article name, but not any publication
details. How can you find the article?
• Try to find the following articles and write full references (authors, title,
page numbers, where published, publisher, year):
2.4 Reading
• You cannot read everything from beginning to end!
⇒ Read only as much as is needed to
• Ask yourself:
2.5 References
2.5.1 Referring in the text
• The reference is usually immediately after the referred theory, algo-
rithm, author, etc.
Other examples
”Prolog was primarily used for writing compilers [VRo90] and parsing natural language [PeW80].”
• Sometimes numbers
Notes
• If you use only one chapter from a book, you can give the chapter
number and title in the reference list. If you use several chapters, give
the chapter number in the reference: [WMB94, chapter 2]
• The authors: surname and the first letters of the first names. If you
have ≥ 3 authors, give only the first one, and replace the others by ”et
al.” E.g. ”Mitchell, T.M. et al.”
• The title
• The title and the editors of the collection, if the paper has appeared in
a collection (e.g. conference articles).
• The volume (always!) and the issue number after a comma or in parentheses, if the source is a journal paper.
• Series, if the book has appeared in some series. (E.g. Lecture Notes in
Computer Science + number)
1. A journal article:
<Authors>: <Title>. <Journal>, <volume> (<issue>): <pages>,
<year>.
2. A conference article:
<Authors>: <Title>. In <book title>, <pages>, <year>.
Examples:
A journal article:
Cheng, V., Li, C.H., Kwok, J.T. and Li, C.-K.: Dissimilarity learning for
nominal data. Pattern Recognition, 37(7):1471–1477, 2004.
A conference article:
Note 1: In the previous examples, you could replace the last authors by <First author> et al.
Note 2: Sometimes a comma or a full stop is used instead of the colon ”:”.
Books
1. A book:
<Authors>: <Title>. <Publisher>, <year>.
2. An article in a collection:
<Authors>: <Title>. In <Editors>, editors, <Book title>. <Publisher>, <year>.
3. A chapter in a book (by one author):
<Authors>: <Title>, <Book title>, chapter <chapter number>. <Publisher>, <year>.
Examples:
Smyth, P.: Data mining at the interface of computer science and statistics,
volume 2 of Massive Computing, chapter 3. Kluwer Academic Publishers,
Norwell, MA, USA, 2001.
1. A technical report:
<Authors>: <Title>. <Report series> <report number>, <Institution>, <year>.
2. A master’s thesis:
<Author>: <Title>. Master’s thesis, <Department>, <University or institution>, <year>.
Examples:
• If you refer to an article which is available on the Internet but has also been published in paper form, give the normal reference to the paper version. The URL is not necessary, but it can be given to help the reader find the article.
Referring to software
• Standard software tools and programming languages like LaTeX, Matlab, and Java do not need any references.
• If you use special tools or programs with limited distribution, it is advisable to give a reference. E.g.
Examples:
• Notice that the journal and book titles are written with capital letters!
• You can select the style by setting the style parameter for the bibliog-
raphy environment
• Just invent a unique label string for each source, which you use in references by the command \cite. E.g. \cite{whamalai}, or if you want to refer to page 3, \cite[3]{whamalai}.
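The use of labels and \cite described above could be sketched as follows. This is only a minimal example under some assumptions: it assumes that BibTeX is used with the standard plain style, and the file name refs.bib and the label cheng04 are invented for illustration. The entry itself is the journal article given as an example earlier in this section.

```latex
% Writes the hypothetical file refs.bib next to the document.
\begin{filecontents}{refs.bib}
@article{cheng04,
  author  = {Cheng, V. and Li, C. H. and Kwok, J. T. and Li, C.-K.},
  title   = {Dissimilarity learning for nominal data},
  journal = {Pattern Recognition},
  volume  = {37},
  number  = {7},
  pages   = {1471--1477},
  year    = {2004}
}
\end{filecontents}
\documentclass{article}
\begin{document}
Dissimilarity learning for nominal data is studied in \cite{cheng04}.
A page number can be given as an optional argument: \cite[1471]{cheng04}.

\bibliographystyle{plain} % selects the reference style
\bibliography{refs}       % reads the entries from refs.bib
\end{document}
```

To resolve the references, the document is typically processed with latex, then bibtex, and then latex twice more.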
2.6 Citations
Direct citations are seldom used in CS texts.
• If you express somebody else’s ideas in your own words, then put the reference immediately after the idea.
• As a rule of thumb: if you borrow more than 7 words, then use quota-
tion marks.
• An example:
• However, make clear what is borrowed and what are your own opinions!
CHAPTER 3. USE OF TABLES, FIGURES, EXAMPLES, AND SIMILAR ELEMENTS
3.1.3 Captions
• Each table or figure should be understandable on its own. Give a brief but clear explanation or a title in the caption.
• Use the same style in all tables. If you use abbreviation stdev for
standard deviation in one table, then do not use sd in another table.
• If you copy (redraw) a table or a figure from some other source, then give a reference to the original source at the end of the caption, e.g. ”Table 5. Blah-blah-blah. Note. From [ref].” A page number is needed if the table or figure is from a book.
• Inside the table or figure environment, you can write the caption for the figure/table and define a label (after the caption).
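A table with a caption and a label could look like the following sketch. The label tab:results, the column names, and the caption text are invented for illustration, and the cells are left as placeholders instead of real results.

```latex
\documentclass{article}
\begin{document}
Table~\ref{tab:results} summarizes the results.

\begin{table}
  \centering
  \begin{tabular}{lrr}
    \hline
    Method & mean   & stdev  \\
    \hline
    X      & \ldots & \ldots \\
    Y      & \ldots & \ldots \\
    \hline
  \end{tabular}
  \caption{Mean and standard deviation (stdev) of the error for methods X and Y.}
  \label{tab:results} % the label is defined after the caption
\end{table}
\end{document}
```

The same abbreviation (here stdev) would then be used in every table of the text.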
3.1.5 Expressions
When you refer to figures and tables you can use the following expressions:
• Figure 2 illustrates
• etc.
3.2 Lists
• Lists are not separate objects, and they are introduced in the text.
– Criterion 1
– Criterion 2
– ...
• If you list only a couple of items, you can usually write them without a list. Use lists when they clarify things!
• You just have to define a unique label name for the referred chapter.
\chapter{Conclusions}
\label{concl}
• Notice that you can invent the labels yourself, as long as they are unique and not reserved words in LaTeX. E.g. the label above could be simply ”c”, but then there is a danger that you will give the same name to another object.
• etc.
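As a small sketch of how the chapter label defined above could be used, the following document refers to the chapter with \ref and to its page with \pageref. The surrounding sentences are invented for illustration, and the report class is an assumption.

```latex
\documentclass{report}
\begin{document}
\chapter{Introduction}
The main results are summarized in Chapter~\ref{concl}
(see page~\pageref{concl}).

\chapter{Conclusions}
\label{concl}
Write the conclusions here.
\end{document}
```

The document usually has to be processed twice before the numbers appear, because the references are resolved from the previous run.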
3.4 Algorithms
• Give only the main algorithms in the text, and at an appropriate abstraction level (pseudocode)
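One possible way to typeset pseudocode in LaTeX is sketched below, using the WriteTree algorithm of Section 1.1.4 as the example. It assumes the packages algorithm and algpseudocode, which are not mentioned in these notes; other pseudocode packages would work as well. The label alg:writetree is invented for illustration.

```latex
\documentclass{article}
\usepackage{algorithm}     % floating environment with caption and label
\usepackage{algpseudocode} % \Procedure, \If, \ForAll, \State, ...
\begin{document}
Algorithm~\ref{alg:writetree} writes the tree of ideas in preorder.

\begin{algorithm}
  \caption{WriteTree($n$)}
  \label{alg:writetree}
  \begin{algorithmic}
    \Procedure{WriteTree}{$n$}
      \State Write $\mathit{title}(n)$
      \If{$n$ is not a leaf node}
        \State Write $\mathit{content}(n)$ \Comment{introductory paragraph}
        \ForAll{children $u$ of $n$}
          \State Write $\mathit{title}(u)$
        \EndFor
        \ForAll{children $u$ of $n$}
          \State \Call{WriteTree}{$u$}
        \EndFor
      \Else
        \State Write $\mathit{content}(n)$ \Comment{main paragraph}
      \EndIf
    \EndProcedure
  \end{algorithmic}
\end{algorithm}
\end{document}
```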
3.5.2 In LaTeX
In LaTeX, you can easily define environments for writing examples or definitions in a systematic way. The examples or definitions are numbered automatically, and you can refer to them without knowing the actual number.
In the preamble (header) you define \newtheorem{example}{Example}
In text you write
”The problem is demonstrated in the following example:”
\begin{example}
\label{example:bayes}
Write the example here.
\end{example}
When you want to refer to the example afterwards, you can write
”Let the problem be the same as in Example \ref{example:bayes},...”
• Formally, we define
3.6 Equations
3.6.1 Without equation numbers
If you don’t need equation numbers, you can write the equations simply
between double $ characters: $$<equation>$$.
E.g. ”The prior probability of X is updated by Bayes rule, given new evidence Y:
$$P(X|Y) = \frac{P(X)P(Y|X)}{P(Y)}.$$”
Remember the full stop at the end of the equation if the sentence finishes! If the sentence continues, then you need a comma:
”The dependency is described by equation
P(X|Y) = \frac{P(X)P(Y|X)}{P(Y)}    (3.1)
Now the equation is written in math mode, and you don’t need $ characters.
If you want to refer to some previous equation, you have to give it a label, as with examples.
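The two ways of writing equations could be sketched in LaTeX as follows, using Bayes rule from the examples above. The label eq:bayes and the sentences around the equations are invented for illustration.

```latex
\documentclass{article}
\begin{document}
% Without an equation number: double $ characters.
The prior probability of $X$ is updated by Bayes rule, given new evidence $Y$:
$$P(X|Y) = \frac{P(X)P(Y|X)}{P(Y)}.$$

% With an equation number and a label: the equation environment.
The dependency is described by the equation
\begin{equation}
  \label{eq:bayes}
  P(X|Y) = \frac{P(X)P(Y|X)}{P(Y)},
\end{equation}
where $P(X)$ is the prior probability of $X$.
Equation~\ref{eq:bayes} can now be referred to without knowing its number.
\end{document}
```

The first equation ends with a full stop because the sentence finishes there, while the second ends with a comma because the sentence continues after it.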
• adverbs,
• prepositions, and
• conjunctions.
4.1 Verbs
Remember two important rules when you use verbs:
CHAPTER 4. GRAMMAR WITH STYLE NOTES
• If the number of the subject changes, retain the verb in each clause.
E.g. ”The positions in a sequence were changed and the test rerun” →
”The positions in the sequence were changed, and the test was rerun.”
• Past or present perfect (but not both) when you describe previous research (literature review)
• In scientific writing, the default is present (is). With present, you can
combine perfect (has been) (and future, will be) if needed, but not the
other tenses.
• Use past tense (was) only for good reasons. It expresses that something
belongs to the past and has already finished. E.g. when you report your
experiments.
• Past perfect (had been) is seldom needed. It is used when you describe something in the past tense and refer to something which had happened before it. E.g.
”We tested the system with data which had been collected in the Programming 1 course.”
• In the basic form of the passive (”sg is done”), you can also express the actor (”sg is done by sy”). Expressing the actor is always more informative!
• Often the purpose determines the voice. Usually we want to begin with a familiar word and put the new information at the end. E.g. before an equation or a definition, we can say ”The model is defined as follows.”
• ”There is/there are” is a similar expression, but now we don’t need the passive. This expression is used when the real subject (what is somewhere) comes later and we haven’t mentioned it before.
E.g. ”There was only one outlier in data set 1” vs. ”The outlier was in data set 1.”
Person?
• Basic rule: avoid the first person (no opinions, but facts). However, sometimes we can use ”we” as a passive expression. Problem: whom are you referring to if you write alone?
• Referring to yourself: you can talk about ”the author”. E.g. ”All pro-
grams have been implemented by the author.” Notice that I don’t
guarantee that your supervisor likes this! Some supervisors prefer ”I”.
Useful verbs:
represent, analyze, apply, compare, demonstrate, illustrate, summarize, optimize, minimize, maximize, conclude, list, define, report, model, implement, design, consider, involve, simplify, generalize, perform, reduce, obey, fit, contain, consist of, scale up to, be based on sg, take into account sg, depend on sg, increase, decrease, evaluate, predict, assign, require, satisfy, ...
Examples:
”As k increases, the model allows for quite flexible functional forms.”
”Data obeys the assumed functional form.”
”Data increases exponentially with dimensionality.”
”We will discuss examples of each of these approaches.”
Task: What is the difference between the following concepts? Give examples of when they are used!
evaluate – assess
compute – calculate
derive – infer
approximate – estimate
discover – find
Notice: American English is not so strict, and ispell can complain about
correct spelling!
Exercise
Read the given text extract and underline useful expressions. Look especially for the following kinds of expressions:
The same text is given to two people. Thus, you can discuss with your partner if you don’t understand something. However, it does not matter if you don’t understand all the words.
4.2 Nouns
Nouns are usually easy. If you don’t know a word, you can check it in a dictionary – just be careful that the meaning is what you want.
Often a better way is to move a term from your passive vocabulary to the active one – then you also know the context of use!
Notice also:
• The same happens with most words which have suffix -o, unless the
word is abbreviated or of foreign origin. E.g.
cargo – cargoes, but photo – photos, dynamo – dynamos
Notes
• Uncountable words have no plural form!
• The words in group 3 are grammatically singular, but they can also have a plural meaning. If you want to refer to a single piece, you have to express it in another way: ”a piece of information”, ”an item of news”, ”a bit of advice”.
British                             American
colour                              color
neighbour                           neighbor
behaviour                           behavior
favour                              favor
honour                              honor
metre (unit)                        meter
meter (device)                      meter
centre/center                       center
analogue                            analog
dialogue                            dialog
encyclopaedia                       encyclopedia
judgement                           judgment
programme (academic, TV)            program
program (computer)                  program
defence                             defense
practice (noun), practise (verb)    practice
maths                               math
speciality                          specialty
• If the words have become one concept, they are usually written to-
gether, e.g. ”software”, ”keyboard”, ”database”
• A hyphen is often used when the concept consists of more than two words: ”depth-first search”, ”between-cluster variation”, ”feed-forward neural network”, ”first-order logic”
• Notice that many words which are compound in your mother tongue are written separately in English: ”data set”, ”density function” (this is typical especially for long words)
4.4 Articles
4.4.1 Position
Basic rule: before the noun phrase (a noun + preceding attributes)
Exceptions:
[Figure: decision tree for a noun with three branches: definite (familiar), indefinite (unknown), and the whole general class (concept).]
• the context defines what you mean (”The left-most bit is always 1.”, ”The results of process A were correct.”)
• the concept is familiar to everybody (the Earth, the sun, the moon)
Usually expressions of this kind are defining: ”The delay between two processes P1 and P2 is t_end(P1) − t_start(P2).”
Exceptional expressions
Sometimes you can use a/an article with an abstract word:
• with adjectives same, only, right, wrong (”The results were the same”, ”The only model which has this property is X”)
Notice: ”the” is not used with ordinal numbers or the adjective ”last” when you refer to the performance in a competition (”Program X came first and program Y was last when the programs were compared by the Z test.”)
4.4.3 Hints
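A better decision tree for articles:
[Figure: decision tree. Root: noun type (in the context), either countable or uncountable. Each branch then asks ”Definite?” (no/yes). The recoverable leaves are a/an (countable, not definite) and no article (uncountable, not definite).]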
1. The store of things learnt or the power or process of recalling (in our
brains) → generally uncountable. ”Memory can be divided into two
classes: short-term memory and long-term memory. The short-term
memory...” However, you can say: ”I have a good memory”.
Time is another word which can be used in different ways. It can mean a
limited period or interval, an indefinite period or duration, or it can express
an occasion of repeated actions. In addition, it occurs in several phrases. By
default, time is uncountable (either no article or article ”the”).
2. Article ”the”
”all the time”
”at the same time”
3. Article ”a”:
”It is a long time...”
”one at a time” (i.e. one by one)
4. Plural:
”many times”
”modern times”
• When you use the name without any modifying word → no article
”X is independent of Y”, ”S contains no outliers”
• When you use a modifying word like ”set”, ”vector”, ”model” etc. before the name →
Two habits:
1. No article when you mention the entity for the first time. After
that use definite article ”the”, or
2. Never any articles.
Exercises
Task 1: Add the correct articles to the following sentences or mark the
absence of articles by −!
• space
• requirement
• model
• program
• computation
• power
• capacity
• data
• information
• knowledge
• recognition
• software
• hardware
• code
• value
• property
• strength
• weakness
• use
• usability
4.5 Pronouns
Two important rules when you use pronouns:
4.5.5 Phrases
on one’s own, e.g. ”The students solved the task on their own”.
4.6 Adjectives
These seem to be well mastered, just two notes:
• E.g. for statisticians, a data set of 500 rows is quite large, while for a
data miner it is extremely small → numbers are more exact!
• The expressions become even vaguer when you add modifiers like ”quite”, ”rather”, ”very”, etc. Skip them whenever possible!
Basic structure:
Exceptional expressions:
X is different from Y
X is similar to Y
X is the same as Y
X is inferior/superior to Y
X is equal to Y (Notice: use ”X equals Y ” only in math, for X = Y )
4.7 Adverbs
Adverbs answer questions When? Where? What? Why? How?
They express
Notes:
2. at the end, when you express way, time or place. E.g. ”This problem occurs frequently in sparse data.”
• Still (mostly in positive sentences): before the main verb, but after
be-verb. ”These enlargements are still unimplemented”
so and such
Exceptions
Adjective suffix   Adverb             Examples
-y                 -ily               easy – easily
-e                 -ly                whole – wholly, true – truly
-ic                -ally              automatic – automatically, systematic – systematically (exception: public – publicly)
-able/-ible        -l disappears      sensible – sensibly
-ly                in a <adj.> way    in a friendly way
If you are not sure how to derive an adverb, check it in a dictionary!
Adverb = adjective
fast, hard, late, straight, low, wrong, right, long
Notice:
• Parallel items are combined by parallel conjunctions (and, or, but, ...).
• E.g.
”Method X has several advantages: it is easy to implement, it works in polynomial time, and it can use both numeric and categorical data.” contains two parallel structures: three advantages (”it is, it works, it can”) in a list and ”both numeric and categorical data”
”The students were told to make themselves comfortable, to read the instruc-
tions, and that they should ask about anything they did not understand”
→ ”The students were told to make themselves comfortable, to read the in-
structions, and to ask about anything they did not understand”
”The results show that X did not affect the error rate and the model over-
fitted the data”
→ ”The results show that X did not affect the error rate and that the model
overfitted the data”
4.8.3 Lists
Notice that elements in a list should be in a parallel form!
Example 1
”Boud [Bou89] has listed general characteristics which are typical for problem-
based courses:
Example 2
”The clustering methods can be divided into three categories:
Example 3
”The whole procedure is the following:
Example 4
”According to O’Shea [OSh00], an intelligent tutoring system should be
• robust,
• helpful,
• simple,
• transparent,
• flexible,
• ...
• sensitive, and
• powerful.”
Notice! A list of the previous kind should usually be avoided, because it can be written as normal sentences. A list was used above because 13 items were listed (and they were analyzed later). If you list only a couple of items (e.g. fewer than 5), write them as a normal sentence!
• between...and,
• both...and,
• either...or,
• neither...nor, and
• not only...but.
The first conjunction should be immediately before the first part of the par-
allelism.
between – and
”between 20-22 years of age” → ”between 20 and 22 years of age”
”We recorded the difference between the students who completed the first
task and the second task”
→ ”We recorded the difference between the students who completed the
first task and the students who completed the second task.”
both – and
”The task is both easy to solve and efficient.” (Doesn’t make any sense!)
→ ”The task is both easy to solve and can be solved efficiently.”
Or another structure:
”The task is easy and the solution is efficient.”
either – or
”The students either gave the worst answer or the best answer.”
→ ”The students either gave the worst answer or gave the best answer.” or
”The students gave either the worst answer or the best answer.”
neither – nor
”On the one hand, a complex model can describe the data well, but on
the other hand, it overfits easily.”
”There is always a tension between the descriptive power and the generalization ability. On the one hand, too complex a model describes the data well, but it does not generalize to any new data. On the other hand, too simple a model generalizes well, but it does not describe the essential features in the data.”
”The more complex the model is, the better it describes the training
data.”
”X model has three important properties: First, the model structure is easy
to understand. This is a critical feature in adaptive learning environments,
as we have noted before. Second, the model can be learnt efficiently from
data. There are feasible algorithms for both numeric and categorical data.
Third, the model tolerates noise and missing values.”
4.9 Prepositions
• Be careful with prepositions. A wrong preposition can give a totally
different meaning!
• If you are unsure about the use of a preposition, ask yourself what a
cat would do! (Fedor’s sciwri book)
Cats sit on mats, go into rooms, are part of the family, roam among
the flowers.
Notice: ”on page 3”, ”on line 5”, ”on the Internet”
• Longer period of time: in, e.g. ”in the 1970s”, ”in the future”, ”in five minutes”, ”events occur close in time”
• If both are pronouns, then the object comes first (case ii)
Task: Draw a decision tree for deciding when to use ”to” or ”for”!
opportunity of/for sg
in spite of sg (but despite sg)
regardless of sg
take into account
in relation to sg
in contrast with sg
a proportion of sg. (”a large proportion of data”)
in proportion to sg, proportional to sg (”The time complexity of f proportional to n is...”)
the ratio of a to b = a/b
x% of y
under some conditions
by default
contrary to sg
in contrast
by contrast (∼ ”however”)
on the contrary
at an extreme
From Kdict:
Usage: Things are compared with each other in order to learn their relative
value or excellence. Thus we compare Cicero with Demosthenes, for the
sake of deciding which was the greater orator. One thing is compared to
another because of a real or fanciful likeness or similarity which exists between
them. Thus it has been common to compare the eloquence of Demosthenes
to a thunderbolt, on account of its force, and the eloquence of Cicero to
a conflagration, on account of its splendor. Burke compares the parks of
London to the lungs of the human body.
4.10 Sentences
4.10.1 Terminology
• A sentence consists of one or more clauses.
• A clause always contains a subject and a predicate, and usually an object.
– An independent clause (main clause) can make a sentence alone.
– A dependent clause (subordinate clause) needs an independent clause for support.
In scientific writing the default type is the statement. Direct questions and
orders are seldom used.
Questions are best suited to the introduction, where you state your main research
questions clearly and concretely, e.g.
”The main research questions are the following:
Orders can be useful in pseudo code, when you describe some method. E.g.
”Search such ci that d(x, ci ) is minimal”.
Examples:
”The dependency is trivial, because Y = f (X).”
”X and Y are linearly independent, if the correlation coefficient, corr(X, Y ),
is zero”
”Let ci be the cluster which is closest to x.”
”We select the first model that fits the data.”
”First we should study what is the relationship between X and Y .”
”The main problem is whether X can be applied in Z.”
”We analyze the conditions under which X can be applied.”
• 1-3 clauses
3. Check the verb structures and ask yourself if they could be shorter
E.g. verb structure ”has been shown” can often be replaced by ”is”.
Notice! Don’t go to the other extreme when you shorten sentences! If the
clarity suffers, then a longer sentence is better.
Analogy: A good model of data neither overfits nor underfits, i.e. it is
simple enough but still expresses all essential features. Now the sentence is
a model of the idea you want to express.
Why?
• Or begin with a familiar thing and put the new information at the end
”The probabilities are updated by the Bayes rule:” + the equation.
• Often the sentence is most informative, if you express the most impor-
tant topic by the subject.
The adverbs and prepositional phrases occur in order: way, place, time.
Hint: always consider if the word modifies the verb (the action) or the object
(the target).
• before the predicate, if the verb consists of one word and is not the
”be”-verb.
• after the first auxiliary verb, if the verb consists of several words.
E.g.
”X often implies Y .”
”The method sometimes gets stuck at a local optimum.”
”The data was probably biased.”
Problem: some words like ”only” can also modify other words!
→ Put the word ”only” next to the word or phrase it modifies!
E.g. (notice the different meaning):
”X was the only method which could parse the LL(1) grammar”
”X was the method which could only parse the LL(1) grammar”
”X was the method which could parse only the LL(1) grammar”
– and just links one idea to another (doesn’t describe the relation-
ship – typical for the children’s language and dreams where things
just happen). E.g. ”The data is sparse and the model overfits eas-
ily.”
– but establishes an interesting relationship between the ideas →
a higher level of argument. E.g. ”The data was sparse, but the
model did not overfit.” (=”Even if the data was sparse, the model
did not overfit.”)
• Commas? If the clauses have the same subject, then no commas. Oth-
erwise usually a comma, unless the clauses are very short.
Examples:
”The search can be halted as soon as minf r proportion of data is checked”
”The method is time-efficient, because all the parameters can be updated in
one loop”
2. which
3. that
4. what
• E.g.
”WYSIWYG means ’What you see is what you get’.”
”This is what we know so far.”
Hint: turn the subordinate clause around and substitute the relative pro-
noun by a personal pronoun. If you can use ”she” or ”he”, it is subject
• No auxiliary word do
• No comma!
• No question mark
4.11 Paragraphs
How to combine sentences? How to begin paragraphs? How to link para-
graphs to each other? Introductory paragraphs (at the beginning of a chap-
ter)
Task: Search for useful expressions in the text extract given to you!
An iterative process:
1. The main structure of the whole thesis: the main chapters and their
contents in a couple of sentences or key words. The order of chapters.
2. For each chapter (or an article), the main sections + key words, intro-
ductory sentences or phrases. The order of sections.
3. In each section, the subsections or paragraphs. The introductory sen-
tences, key words, and the order of paragraphs. List the related tables
and figures.
Suggestion: put your disposition aside for a while before you begin
writing.
A paragraph
The topic for each paragraph must be clearly stated – usually in the first
sentence = topic sentence.
• If you cannot write a clear topic sentence, ask yourself whether the
paragraph is needed at all!
• Keep the same verb tense (change it only for good reasons).
”In the following, we recall the most common measure for correlation, Pear-
son correlation coefficient. We discuss restrictions and extensions of the
common correlation analysis. Finally, we analyze the ViSCoS data by Pear-
son correlation and correlation ratios to reveal linear and non-linear depen-
dencies.”
”In the following, we define the main types of dependencies for categorial and
numeric data. We introduce three techniques (correlation analysis, corre-
lation ratios, and multiple linear regression) for modelling dependencies in
numeric data and four techniques (χ² independence test, mutual information,
association rules, and Bayesian networks) for categorial data. In both
cases we begin by analyzing pair-wise dependencies between two attributes,
before we analyze dependencies between multiple attributes X1 , ..., Xk and the
target attribute Y. This approach has two benefits. First, we can avoid testing
all 2^k dependencies between subsets of {X1 , ..., Xk } and Y , if Y turns
out to be independent from some Xi . Second, this analysis can reveal im-
portant information about suitable model structures. For example, in some
modelling paradigms, like multiple linear regression and naive Bayes model,
the explanatory variables should be independent from each other. Finally,
we analyze the suitability of described modelling techniques for educational
domain.”
4.12 Punctuation
Goal: to make the text clearer. Unfortunately, the English punctuation
rules (especially the use of comma) do not always coincide with the rules of
your mother tongue.
Usually you manage with just two marks: full-stop and comma! The basic
rules for other marks are:
4.12.1 Full-stop
Full-stop ends a full sentence. Do not use comma instead of full-stop to
separate independent clauses which are not logically related.
4.12.2 Comma
Comma is used
1. To separate introductory phrases and conjunctions (however, thus, sim-
ilarly, etc.):
”Ideally, all references are entered into a bibtex database.”
”Theorem 1 is important for two reasons. First, it allows us to... Sec-
ond, it ...”
4. When two phrases with the same meaning are used side by side.
”One of the most useful statistics is x, the sample mean.”
7. To avoid ambiguity.
”Instead of hundreds, thousands of rows of data are required”
”Instead of 20, 50 students participated...”
”What the actual reason is, is not fully understood”
(better: ”The actual reason is not fully understood”)
No comma is used
1. When an independent clause is followed by a restrictive relative clause
or is embedded with a restrictive rel. clause (especially before that).
Exception: ”It must be remembered, however, that...”
4.12.3 Colon
Use colon between a grammatically complete introductory clause and a final
phrase or clause that illustrates or extends it. If the following clause is a
complete sentence, it begins with a capital letter.
4.12.4 Dash
Dash is nearly always used in pairs. You can always use commas instead of
dashes. Additional details can also be separated by parentheses. Notice that
a dash interrupts the continuity of a sentence!
Advice: Do not use dash, if you are not sure how to use it!
”The two students – one cs student and one maths student – were tested
separately.”
4.12.5 Semicolon
Semicolon separates two independent clauses. It is stronger than a comma
but weaker than a full-stop. You can always replace it by a full-stop, and
sometimes by a comma structure.
You can use them also when you introduce a word or phrase used as an ironic
comment, as slang, or as an invented expression. Use the quotation marks
only when the new term is mentioned for the first time!
Notice: when you use a word or letter as a linguistic example, you can use
a special font, e.g. italicize it (just be systematic with the font you select).
”According to algorithm X, words cat and God were similar.”
Similarly, when you mention variable names, values etc., use a special font
(unless they are mathematical symbols, which are written in math mode between
$ characters). E.g. ”X can have three values low, medium, high.” ”Action1 is
selected with the probability of 0.6 and Action2 with the probability of 0.4.”
In latex
{\tt Action1}
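The font choices can be sketched as follows. This is only an illustration, assuming the commands \emph and \texttt as alternatives to the older {\tt ...} form; the sentences are the examples given above.

```latex
\documentclass{article}
\begin{document}
% A word used as a linguistic example: italics
According to algorithm X, words \emph{cat} and \emph{God} were similar.

% Variable values and action names: typewriter font.
% Mathematical symbols and numbers: math mode between $ characters.
$X$ can have three values \texttt{low}, \texttt{medium}, \texttt{high}.
\texttt{Action1} is selected with the probability of $0.6$ and
\texttt{Action2} with the probability of $0.4$.
\end{document}
```

Whichever fonts you select, use the same one for the same kind of item throughout the text.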
4.12.7 Parentheses
Parentheses are used for two purposes:
• To introduce an abbreviation
”Minimum description length (MDL) principle is often used to...”
• To add extra details. Advice: do not overuse them!
”Two common choices are to represent a cluster by its centroid (central
point) or by its boundary points.”
”In minimum edit distance we define the minimum number of oper-
ations (e.g. insertion, deletion, substitution) needed to transform one
string to another.”
Sometimes you can give extra references (extra reading) in parentheses:
If the possessive pronoun is not followed by a noun, then the special forms {mine,
yours, hers/his, ours, yours, theirs} are used. Seldom needed in scientific writing! (In
spoken language e.g. ”Whose cat is this? It is mine.”)
In some special cases (rarely) you can use the structure ”of it” (referring to
inanimate things) to emphasize the possessed. ”I don’t remember the name
of it.”
4.14 Abbreviations
• Use abbreviations sparingly, especially the abbreviations which you de-
fine yourself for technical terms. E.g.
”The performance of NB and LR classifiers are measured by TP and
TN rates” vs.
”The performance of naive Bayes and linear regression classifiers are
measured by true positive and true negative rates”
• As a rule of thumb, if the term is used less than three times, don’t
introduce any new abbreviation for it.
• Use only those abbreviations that help you to communicate with your
reader.
• When the term is mentioned for the first time, write it out completely and give
the abbreviation immediately in parentheses. E.g.
”According to maximum likelihood (ML) principle ...”
• Notice that standard abbreviations do not have to be written out on
first use! Such abbreviations are a.m., i.e., vol., ed.
Notice that p.=page, pp.=pages
• Do not switch between the abbreviation and complete term in the same
paragraph.
• If you use special abbreviations in figures or tables, describe them in
the caption.
5.1.1 Abstract
• Tells compactly the research problem, methods and results.
5.1.2 Introduction
Typically 4-7 pages.
The introduction should define the problem clearly and give sufficient background
information for the following chapters. However, no details, yet!
86 CHAPTER 5. WRITING MASTER’S THESIS
5.1.4 Conclusions
Just 1-3 pages!
• Tell what was your own contribution and what was based on other
sources.
• No more new results and seldom any references (at most for alternative,
unmentioned approaches)
5.1.5 References
• A rule of thumb: at least 20 references, but no more than 50. 30-35 is
often the ideal.
5.1.6 Appendices
• Additional material which is relevant to the research and is referred to in
the text. E.g. if you have made a questionnaire, you can put the form
in an appendix.
• Conclusions
Literature review
A theory or a model is analyzed based on literature. Often a comparison of
different approaches.
Your own contribution: how the results are described in a uniform manner,
analyzed and compared.
Now the existing literature is referred to in all chapters, so there is no need for
a separate chapter “Related research”.
• Introduction
• Main concepts
• Approaches + their analysis (2-3 chapters)
• Or a chapter for comparison and analysis of all approaches
• Conclusions
Variation: analysis of the suitability of existing approaches to a new problem.
• Introduction
• The new problem + criteria for an ideal solution method
• Potential solution methods + analysis of their suitability (2-3 chapters)
• Possibly discussion (comparison, new solution ideas)
• Conclusions
Empirical research
E.g. a new method or tool is tested with real users or products of students
are analyzed.
• Introduction: Begin by introducing the research problem: what was
the goal of empirical study.
• Main concepts and background theories (one chapter) and
• Related research (one chapter) (or both in one chapter)
• Experiment and results (one chapter), e.g. four sections: Material,
Methods, Results, and Discussion
• Conclusions: what was the problem, what results were achieved
5.2. MASTER’S THESIS PROCESS 89
• Plan how much time you can spend on studying literature! At some
point you have to stop collecting new material and begin to write.
→ Suggestion: By the end of Aug, your it-project is finished and you
have collected and selected relevant material for your thesis.
5.2.2 Planning
Well planned is half done!
• Collect literature and scan through it. Select the most important
sources.
• List the main research problems (in the form of questions) and write
the introductory paragraphs for the chapters.
• Work together with your friend. You can set the deadlines, discuss your
topics, and read each other’s texts. After good work you can reward
yourself by doing something fun.
• Imagine that you are writing to your friend about your research topic!
• Write down ideas when they come – even in the middle of night.
• If some part is difficult to write, begin with an easier one. Write the
difficult parts, when you are in a good working mood.
• Draw figures which describe some method or model and write a
description.
• Collect main concepts and write definitions for them. Fix the notations.
• Don’t spend too much time trying to find an effective beginning – you
can always modify it afterwards.
• Go straight to the point and, if possible, refer to things that you expect
your readers to know (vs. constructivism).
5.2.4 Revising
“The time taken in planning, writing and revising is time for thought. It is
well spent, for when the work is complete your understanding of the subject
will have been improved.” [1, 44]
• First of all, admit that the first draft(s) is not perfect! Ask for criticism and
respect it. Good criticism is really valuable.
• You can write and revise your work forever, but at some point you
have to stop! One trick is that you don’t allow yourself to gather any
more new literature.
• Have a break when your work is finished. At least, sleep one night
before revising the text yourself.
Technical hints:
2. Give a definition yourself and tell that in this work the term is defined
as given.
Symbols
• Don’t use the same symbol for different things!
Equations
Avoid listing mathematical equations! Try to integrate equations into sentences
so that the result is readable.
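The difference can be sketched with a small example. The sentence and its context are invented for this illustration; only the summation formula appears elsewhere in this guide. Notice that the displayed equation is a grammatical part of the sentence and therefore carries the sentence's punctuation.

```latex
\documentclass{article}
\begin{document}
% Listing style (avoid): the equation stands alone without a sentence.
%   \[ 1 + 2 + \ldots + n = \frac{(n+1) \times n}{2} \]

% Integrated style: the equation is read as a part of the sentence.
The inner loop is executed $1 + 2 + \ldots + n$ times in total, and because
\[
  1 + 2 + \ldots + n = \frac{(n+1) \times n}{2},
\]
the time complexity of the algorithm is quadratic in $n$.
\end{document}
```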
Latex files are pure text files (ASCII), which can be compiled to dvi (device
independent format), pdf (portable document format), ps (postscript) or
something else. The modifications become visible only after compilation.
• Good-looking results!
94 CHAPTER 6. LATEXINSTRUCTIONS AND EXERCISES
• You can easily make your own macros (commands) for special purposes.
• Latex is free!
\command[options]{parameter}
• \includegraphics[width=0.6\textwidth]{figure1.eps} includes
figure ”figure1.eps” into the document. The option defines the width
of the figure to be 60% of the text width. Note that \textwidth is a
command without parameters. The width option is optional.
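The command pattern can be seen in a few common cases. This is only a sketch: the file name figure1.eps comes from the text above, the graphicx package is assumed because it provides \includegraphics, and the figure file must exist before the document compiles with the latex command.

```latex
% General form: \command[options]{parameter}
\documentclass[a4paper,12pt]{article} % options: paper size and font size
\usepackage{graphicx}                 % no options; provides \includegraphics
\begin{document}
% With the option: the figure is scaled to 60% of the text width.
\includegraphics[width=0.6\textwidth]{figure1.eps}

% Without the option: the figure is included in its natural size.
\includegraphics{figure1.eps}
\end{document}
```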
9. Open your file in an editor. For example, you can use emacs or
xemacs. xemacs is heavier to run, but maybe easier to use, if you
are used to a graphical interface. In emacs, the file is opened by emacs
latexercise.tex.
10. When you finish, you can transfer your file to cs, where it can be
accessed from windows, if needed. The command is
scp latexercise.tex [email protected]:directory, where user
is your username and directory is the directory name.
6.3.2 Exercises
Give your document title ”Exercises 1” and create a section for each task.
• Understanding domain
• Preprocessing data
• Learning the model from data
• Interpreting the results
• Understanding domain
• Preprocessing data
• Learning the model from data
– Data mining or
– Machine learning step
• Interpreting the results
3. A proper table should have a title and be referred to from the text. Table 6.1
gives an example. Check from http://www.cs.joensuu.fi/pages/whamalai/sciwri/articletemplate.tex
how it is done!
Write Table 6.2 yourself!
Name                      Problem
Satisfiability (SAT)      Given a Boolean formula of variables, parentheses,
                          and connectives ∧ (and), ∨ (or), and ¬ (not),
                          can the formula be true with any truth value
                          assignment?
Independent set (IS)      Does the given undirected graph contain k
                          vertices which are not connected to each other?
Clique                    Does the given undirected graph contain k vertices
                          which are all connected to each other?
Hamiltonian cycle (HC)    Does an undirected graph contain a path which goes
                          through all vertices exactly once and returns to the
                          starting point?
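One possible way to write such a table is sketched below in the style of the other table examples in this guide. The caption text, the label tab:problems and the 9 cm column width are assumptions made for this example; the p{...} column type lets the long problem descriptions wrap onto several lines. The fragment belongs inside the document body.

```latex
\begin{table}[!h]
\begin{center}
\caption{Examples of decision problems.}
\label{tab:problems}
\begin{tabular}{|l|p{9cm}|}
\hline
Name & Problem\\
\hline
Satisfiability (SAT) & Given a Boolean formula of variables, parentheses,
and connectives $\wedge$ (and), $\vee$ (or), and $\neg$ (not), can the
formula be true with any truth value assignment?\\
\hline
Independent set (IS) & Does the given undirected graph contain $k$
vertices which are not connected to each other?\\
\hline
Clique & Does the given undirected graph contain $k$ vertices which are
all connected to each other?\\
\hline
Hamiltonian cycle (HC) & Does an undirected graph contain a path which
goes through all vertices exactly once and returns to the starting point?\\
\hline
\end{tabular}
\end{center}
\end{table}
```

The table is then referred to in the text by Table~\ref{tab:problems}.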
d) a = v/t
e) 1 + 2 + ... + n = ((n+1) × n)/2
2. $, %, #, and are special characters in latex, and you cannot use them
in the text as such. Can you find out how to do it? Hint: an escape
character \.
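A hedged sketch of how formulas d) and e) and some escaped characters might be typed. Other forms are equally valid (for example \frac{v}{t} in d), and the example sentence with the special characters is invented.

```latex
\documentclass{article}
\begin{document}
% d) an inline fraction written with a slash
$a = v/t$

% e) a built-up fraction with \frac{numerator}{denominator}
$1 + 2 + \ldots + n = \frac{(n+1) \times n}{2}$

% Special characters are printed by putting the escape character \ in front
The price rose by 5\% to \$20, see item \#3.
\end{document}
```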
Symbol              Command
ℝ                   \mathbb{R}
𝒫                   \mathcal{P}
∅                   \emptyset
∞                   \infty
x̄                   \overline{x}
n above k           n \atop k
...                 \ldots
⋮                   \vdots
For example, the binomial coefficient (n above k inside parentheses) is achieved by
\begin{thebibliography}{4}
\bibitem{assrule} Agrawal, R., Mannila, H., et al.:
Fast discovery of association rules.
In Fayyad, U.M., Piatetsky-Shapiro, G., Smyth, P., Uthurusamy, R. (eds.):
Advances in knowledge discovery and data mining.
AAAI/MIT Press, Menlo Park, CA (1996) 307--328
\bibitem{boulay} Boulay, B. du:
Can We Learn from ITSs?
Intelligent Tutoring Systems (2000) 9--17
\bibitem{butz} Butz, C.J., Hua, S., Maguire, R.B.:
Web-based intelligent tutoring system for computer programming. Web
Intelligence and Agent Systems: An International Journal 4,
1 (2006) To appear.
\end{thebibliography}
1. Compile the template (http://www.cs.joensuu.fi/pages/whamalai/sciwri/articletemplate.tex)
and check what the list looks like!
2. The reference notations are defined in the header by command
\bibliographystyle{style}. Style alpha is often used in the cs master’s
thesis. Check what happens if you change the style to plain!
3. Referring to sources, like to [AM96], happens by the \cite command. Try
to refer to other sources! Notice that you have to run the latex command
a couple of times, before all references are resolved.
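Referring could look like the following fragment of the document body. The keys assrule, boulay and butz are the \bibitem labels of the list above; the sentences and the page number are placeholders invented for this sketch.

```latex
% One source
Association rules can be discovered efficiently \cite{assrule}.

% Several sources in one pair of brackets
Intelligent tutoring systems are discussed in \cite{boulay,butz}.

% The optional argument adds a note, e.g. a page number, inside the brackets
See \cite[p.~310]{assrule} for details.
```

If a key has no matching \bibitem, latex prints a question mark in place of the reference and gives a warning.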
1. Test how to include cat.ps into your own document! What happens if
you remove commands \begin{quotation} and \end{quotation} ?
2. In scientific text, all figures must have a title (caption) and be referred to
from the text. This is demonstrated in Figure 6.1.
Figure 6.1 is aligned in the center. The figure width is defined to be
60% of the text width. Try what happens if you change it!
3. Load file articletree.eps from
http://cs.joensuu.fi/pages/whamalai/sciwri/articletree.eps.
6.7. DRAWING FIGURES 101
Include the figure into your document. Write some caption, invent a
label, and test referring to it!
6.7.1 Advice
• You can start xfig from shell by command xfig (when the file doesn’t
have any name) or you can already give it a name by command xfig
example.fig. If you didn’t give any file name in the beginning, you
have to save your figure by command save as.
• Click grid mode and select a grid. Now it is easier to draw objects
into positions you want.
• When you are finishing, remember to save your file. (You can save it
during drawing, too. If something goes wrong, you can continue from
the last saved version.)
• In edit menu there is command undo which lets you cancel the last
drawing operation.
• By default, you cannot draw or move objects anywhere, but only in the
grid. If this is too restricting, you can select Point position → Any.
6.7.2 Tasks
1. Draw some of the given figures by xfig, save them as eps. Check the
eps figures by command ghostview example.eps or gv example.eps.
Notice: your figures do not have to be identical to the examples!
4. Extra task (if you have time): test how to include latex math com-
mands into a figure in xfig. Write the math commands (inside $ char-
acters) into your figure. Select Edit command and click the string
which contains latex symbols. Change the Special Flag to Special.
When you export the figure select language Combined PS/Latex (both
parts). This produces two files example.pstex and example.pstex_t
into your working directory. Include the latter into your document by
\input{example.pstex_t} as demonstrated in
• If you run aspell from emacs, select the dictionary from tools →
spell checking → select British dict. Start checking by select-
ing Spell-Check Buffer from the same menu. You get the list of
commands by ctrl-h.
6.8. SPELL CHECKING 103
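[Example figures: a state diagram with states q0, q1 and q2 and transition labels ”+, −” and ”d”; a diagram with inputs x1, x2 and outputs y1, y2, y3.]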
1. Understanding domain
2. Preprocessing data
3. Discovering patterns
4. Postprocessing results
5. Application
6.9.1 Idea
• You collect a database of bibtex records (bibtex entries) for all sources
you may refer to in your document. It can also contain extra entries,
because bibtex selects only those references which are actually referred to.
• Each bibtex entry should have a unique label (above Gettys90). The
labels are referred in the text normally by \cite{label}.
• The resulting database is a common text file, and only the records
have a special format. The file should be called <file name>.bib. For
example, dbase.bib.
\bibliographystyle{alpha}
\bibliography{dbase}
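A minimal sketch of a document that uses these two commands. It assumes that dbase.bib is in the same directory and contains an entry with the label Gettys90; the sentence is a placeholder.

```latex
\documentclass[a4paper,12pt]{article}
\begin{document}
% Gettys90 must be the label of an entry in dbase.bib
This is discussed in \cite{Gettys90}.

% The style of the reference list and of the labels
\bibliographystyle{alpha}
% The name of the database file without the .bib extension
\bibliography{dbase}
\end{document}
```

Run latex first, then bibtex, and then latex twice more, so that the reference list is generated and all references in the text are resolved.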
6.9. WRITING REFERENCES BY BIBTEX 105
• a book → @book
Other types:
When the type is fixed, you should define all required fields. The most often
needed fields are:
• author
• booktitle (if the paper belongs to a book or collection, and already has
a title of its own. Especially, the name of the conference proceedings.)
• pages
• volume (in journals, also if a book has several volumes, and the volumes
in LNCS series)
Note: By default, Bibtex capitalizes only the first letter of the first word in
the titles. If you need other capital letters, you have two choices:
1. Put the letter or letters to be capitalized into braces, e.g.
2. Put the whole field value into braces. Now you don’t need the quotation
marks at all:
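Both choices can be tried with the sketch below. It uses the Boulay reference from the earlier reference list; the labels boulay1 and boulay2 and the file name capitals.bib are invented for this example. The filecontents environment writes the bibtex database when the file is compiled, and bibtex ignores the plain text between the entries. Notice the double braces in the second entry: the outer pair only delimits the field value, and the inner pair protects the capital letters.

```latex
\begin{filecontents}{capitals.bib}
Choice 1: braces protect only the letters ITS.

@inproceedings{boulay1,
  author    = "du Boulay, B.",
  title     = "Can we learn from {ITS}s?",
  booktitle = "Intelligent Tutoring Systems",
  year      = "2000",
  pages     = "9--17"
}

Choice 2: the whole title is protected and no quotation marks are needed.

@inproceedings{boulay2,
  author    = "du Boulay, B.",
  title     = {{Can We Learn from ITSs?}},
  booktitle = "Intelligent Tutoring Systems",
  year      = "2000",
  pages     = "9--17"
}
\end{filecontents}
\documentclass{article}
\begin{document}
Compare the titles of \cite{boulay1} and \cite{boulay2} in the list.
\bibliographystyle{alpha}
\bibliography{capitals}
\end{document}
```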
Notes:
• Journal and book names are usually written such that the first letter
of each word is capitalized!
• Remember all the commas and quotation marks! Otherwise bibtex
cannot parse the entry. The most common error is a missing quotation
mark or a comma at the end of a field.
• In DBLP the entry is often in two separate records: one for the whole
proceedings and one for the article. The article entry does not contain
all fields alone, but it refers to the collection by field crossref and
inherits all fields from it. → copy both entries into your database or
add the missing fields to the article entry.
6.9.4 Exercise
Search or write the bibtex entries for your literature sources. Test that the
bibtex can generate all references! (Now it is important that you also refer
to your sources in the text.)
\usepackage{float}
\usepackage{xspace}
\usepackage{algorithmwh}
Notes
• You can add your own commands to algorithmwh.sty by \newcommand.
Suggestion: give the style file a name of your own, if you make changes
to it.
• Line numbers are useful, if you refer to certain lines in your code. Begin
each code line by \uln. If you don’t need line numbers, drop \uln.
• Fix the style you use for assignments. There are several alternatives:
x = y, x ← y, x := y.
$x \uor y$
1 begin
2   compute all connected components in G = (V, E)
3   for each connected component V′ in G = (V, E) do
4     for all v ∈ V′ dfs({v}, degree(v), min_f, v)
5 end
1 begin
2   if fref(X) ≥ min_f then
3     output X
4   else if (fref(X) < 1 − d(1 − min_f) / ((|X| − 1) min_f))
5     then return // search failed
6   for all vertices u ∈ V′ (u > last and ∃v ∈ X (v, u) ∈ E) do
7     dfs(X ∪ {u}, d, min_f, u)
8 end
6.10.2 Exercises
1. Write Algorithm 4! Test how to refer to it in the text (like here).
2. Test how to write the following kind of method using an itemize list!
Step 1 x = x + 1
Step 2 y = x^2 + 1
Step 3 If y ≤ n return to Step 1.
Alg. 4 PartitioningClustering(S, n, k)
Input: Data set S, n = |S|, number of clusters k
Output: Centroids c1 , ..., ck
1 begin
2 Select randomly k data points p1 , ..., pk ∈ S
3 for all pi // Initialization
4 begin
5 ci = pi
6 Ci = {pi }
7 end
8 while (not converged) // Update clusters
9 begin
10 for all pi ∈ S
11 begin
12 Search cj such that d(pi , cj ) is minimal
13 Cj = Cj ∪ {pi }
14 end
15 Update centroids ci
16 end
17 end
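If the style file algorithmwh.sty is not available, a similar layout can be sketched with the widely used algorithm and algpseudocode packages. This is a substitute technique, not the style used in this guide: the commands \State, \For, \While and \Comment belong to algpseudocode, the label alg:partclust is invented, and the line numbers differ from the listing above because algpseudocode writes the block ends itself.

```latex
\documentclass{article}
\usepackage{algorithm}     % the floating algorithm environment with a caption
\usepackage{algpseudocode} % the pseudo code commands
% Print Input:/Output: instead of the default Require:/Ensure:
\renewcommand{\algorithmicrequire}{\textbf{Input:}}
\renewcommand{\algorithmicensure}{\textbf{Output:}}
\begin{document}
\begin{algorithm}
\caption{PartitioningClustering($S, n, k$)}
\label{alg:partclust}
\begin{algorithmic}[1] % [1] = number every line
\Require Data set $S$, $n = |S|$, number of clusters $k$
\Ensure Centroids $c_1, \ldots, c_k$
\State Select randomly $k$ data points $p_1, \ldots, p_k \in S$
\For{all $p_i$} \Comment{Initialization}
  \State $c_i = p_i$
  \State $C_i = \{p_i\}$
\EndFor
\While{not converged} \Comment{Update clusters}
  \For{all $p_i \in S$}
    \State Search $c_j$ such that $d(p_i, c_j)$ is minimal
    \State $C_j = C_j \cup \{p_i\}$
  \EndFor
  \State Update centroids $c_i$
\EndWhile
\end{algorithmic}
\end{algorithm}

The method is given in Algorithm~\ref{alg:partclust}.
\end{document}
```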
6.11. SPECIAL LATEX NOTES 111
\begin{itemize}
\item[Step 1] $x=x+1$
\item[Step 2] $y=x^2+1$
\end{itemize}
outputs
Step 1 x = x + 1
Step 2 y = x^2 + 1
6.11.3 Footnotes
Footnotes1 are achieved by command \footnote{text}.
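A small sketch of the command in use; the sentence and the footnote text are placeholders. The command is written directly after the word the footnote belongs to, without a space in between.

```latex
\documentclass{article}
\begin{document}
The data was collected in two courses\footnote{Both courses were
given in the same term.} and analyzed separately.
\end{document}
```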
etc. See latex manual! E.g. if your table contains a lot of text, first try to
prune the text, but if it doesn’t help, you can use footnote size:
\begin{center}
\begin{table}[!h]
\caption{plaa-plaa}
\label{tab1:3}
\footnotesize{
\begin{tabular}
\end{tabular}
}
\end{table}
\end{center}
\begin{table}[!h]
\begin{center}
\caption{Comparison of prediction accuracy of {\em LR} and {\em NB} models.
The prediction accuracy is expressed as
true positive $TP$ and true negative $TN$ rates.
All models have been evaluated by 10-fold cross-validation and the
classification rates have been averaged.}
\label{crossval}
\begin{tabular}{|l|c|c|c|c|}
\hline
Model structure&\multicolumn{2}{|l|}{$LR$ rates} &
\multicolumn{2}{|l|}{$NB$ rates}\\
& TP& TN& TP&TN\\
\hline
$A \Rightarrow FR1$ &0.83&0.47&0.96&0.31\\
\hline
$A,B \Rightarrow FR1$ &0.91&0.72&0.80&0.81\\
\hline
$A,B,C \Rightarrow FR1$ &0.93&0.81&0.83&0.81\\
\hline
$TP1 \Rightarrow FR2 $&0.70&0.68&0.96&0.53\\
1 These are not recommended in computer science texts; use them sparingly!
\hline
$TP1,D \Rightarrow FR2 $&0.78&0.84&0.76&0.61\\
\hline
$TP1,D,E \Rightarrow FR2$&0.76&0.89&0.82&0.87\\
\hline
$TP1,D,E,F \Rightarrow FR2$&0.70&0.92&0.80&0.87\\
\hline
\end{tabular}
\end{center}
\end{table}
Notice that you have to define the maximum number of columns in the
tabular definition, and multicolumn is used to combine columns on some
rows.
\begin{sidewaystable}
\begin{center}
\caption{Table caption}
\label{predmodels}
\footnotesize{
\begin{tabular}{|l|l|l|l|l|l|}
\end{tabular}
}
\end{center}
\end{sidewaystable}
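The sidewaystable environment is defined by the rotating package, so the package must be loaded in the header. Below is a complete sketch; the header row of the table is a placeholder, because the table contents are not shown above.

```latex
\documentclass{article}
\usepackage{rotating} % defines the sidewaystable environment
\begin{document}
\begin{sidewaystable}
\begin{center}
\caption{Table caption}
\label{predmodels}
\footnotesize
\begin{tabular}{|l|l|l|l|l|l|}
\hline
Model & A & B & C & D & E\\
\hline
\end{tabular}
\end{center}
\end{sidewaystable}

Table~\ref{predmodels} is turned sideways on a page of its own.
\end{document}
```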
In the article (and master’s thesis) template the default is that all paragraphs
begin with an indentation. This is inconvenient when you just want to leave empty
lines without beginning new paragraphs. You can get rid of the
indentation by command \noindent.
For example:
\noindent
$a \rightarrow action1$ $(0.6)$ $a \rightarrow action2$ $(0.4)$\\
$b \rightarrow action3$ $(0.6)$ $b \rightarrow action2$ $(0.4)$\\
$c \rightarrow action3$ $(0.6)$ $c \rightarrow action2$ $(0.4)$\\
outputs
Appendices
%The paper size, font size and document type are defined in the following
\documentclass[a4paper,12pt]{article}
%The following line is not necessary if you write in English. If you write
%in another language, uncomment the line and change the language
%\usepackage[english]{babel}
\bibliographystyle{alpha}
%If you want to remove the space before paragraphs uncomment the following.
%Remember then to leave an empty line between paragraphs!
%\setlength{\parindent}{0pt}
\author{Your name}
\begin{document}
\maketitle
\end{document}
%The paper size, font size and document type are defined in the following
\documentclass[a4paper,12pt]{article}
%The following line is not necessary if you write in English. If you write
%in another language, uncomment the line and change the language
%\usepackage[english]{babel}
%If you want to remove the space before paragraphs uncomment the following.
%Remember then to leave an empty line between paragraphs!
%\setlength{\parindent}{0pt}
\author{Your name}
\begin{document}
\maketitle
\section{References}
\section{Referred tables}
We have already practised how to make simple tables. Now we will make
tables, like Table \ref{tableexample}, which have titles and are referred to
from the text.
\begin{table}[!h]
\begin{center}
\caption{Useful mathematical symbols: arrows.}
\label{tableexample}
\begin{tabular}{|l|l|}
\hline
$\rightarrow$ & An arrow to the right\\
\hline
\section{Figures}
If you don’t refer to the figure, you can simply include it here like this:
\begin{center}
\includegraphics[width=0.6\textwidth]{cat.ps}
\end{center}
\begin{figure}[!h]
\begin{center}
\includegraphics[width=0.6\textwidth]{cat.ps}
\caption{A cat writing scientific text.}
\label{figexample}
\end{center}
\end{figure}
% Literature references:
% If you use bibtex, uncomment the following. Add the name of your
% own bibtex database instead of dbase (now file dbase.bib)
%\bibliography{dbase}
\end{document}
A check list for the master’s thesis.
Fill this table when a draft version of your master’s thesis is finished. You can fill the table yourself but it is better if somebody else can fill it
for you!
Property Questions Comments
Bibliography
[1] Barrass, R.: Scientists must write. A guide to better writing for scientists,
engineers and students. Chapman and Hall, London, New York, 1978.
[2] Peat, J. et al.: Scientific writing – easy when you know how. BMJ Books,
London, 2002.
[4] Strunk, W.: Elements of Style. Priv. print, Ithaca, NY, 1918. On-line
edition published July 1999 by Bartleby.com. www.bartleby.com/141/.
Loaded 1.3. 2006.