Algorithm project reports (working document)
When doing an algorithm project involving implementation, it is
expected that the project work is documented by a report. The report
should include a description of the problem and the algorithms considered,
describe the decisions made during the implementation, document your
experimental findings, and include a discussion and explanation of the
experimental results obtained.
Below is a list of (more or less random) issues to be aware of
during the project work and report writing.
Report
- The front page should state the group members: name, email, student ID (årskort), and date.
- Use page numbers.
- Spell check report.
- For each plot in the report there should be a discussion of
why the experiment was performed.
- If many symbols/variables are used, then add an appendix with a
table of notation (this also helps to avoid overloading
symbols with different meanings, and to ensure consistent naming
throughout the report).
- Include a table of figures (longer reports).
- Include an index of terms used in the report and the page numbers
of the first appearance/definition of the terms (longer reports).
- The report should include the relevant references, e.g., books,
research papers, online resources (use, e.g., BibTeX).
- The report should include a list of missing parts in the
implementation. Are some parts of the code known to fail? Has some
functionality not been implemented? Are there known special cases
where the implementation does not work? ...
Figures
- Avoid overlap between the plotted data and the descriptions (labels, legends) within the plot.
- Use clearly distinguishable marks when having several data sets in one
plot (the marks should be distinguishable in both black-and-white and color printouts).
- If data points in a plot are connected by lines, then the data
points themselves should still be visible in the plot.
- Include screenshots to illustrate the running program (white
background preferred).
Implementation
- Provide a URL where the source code/executables, test data,
generated output, and the report are available. For binary executable files,
state on what machine the program is expected to run.
- Describe how to run the program(s) and the arguments to the program(s).
Preferably provide a makefile and a README file together with the sources.
- State the machine type and its characteristics (CPU, clock
frequency, cache and RAM sizes), the operating system (and version),
the implementation language, the compiler (and version number), and the
libraries used (a small sketch for recording the compiler version follows this list).
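
Part of this information can be recorded programmatically. The following is a minimal sketch (one possible approach, not the only one) that prints the language standard and, for GCC/Clang, the compiler version, using standard predefined macros:

    #include <iostream>

    int main() {
        // __cplusplus encodes the language standard, e.g. 201703L for C++17.
        std::cout << "__cplusplus = " << __cplusplus << '\n';
    #ifdef __VERSION__
        // GCC and Clang define __VERSION__ as a readable version string.
        std::cout << "compiler: " << __VERSION__ << '\n';
    #endif
        return 0;
    }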
Code tuning
- Use a program profiler (e.g., GNU gprof) to check where the time is
spent in a program.
- Check the generated assembly code: the compiler may not translate
your C++ code into the expected low-level assembly code (code removed,
loops turned around, tail recursion eliminated, code inlined, ...).
- The number of cache misses and other hardware counters can be accessed
using, e.g., the PAPI library (requires a special Linux kernel); see the
sketch after this list.
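
As an illustration, the following sketch counts level-1 data-cache misses around a piece of code using PAPI's low-level interface (assuming PAPI is installed and the program is linked with -lpapi; check the documentation of your PAPI version for details):

    #include <papi.h>
    #include <cstdio>

    int main() {
        if (PAPI_library_init(PAPI_VER_CURRENT) != PAPI_VER_CURRENT)
            return 1;                           // PAPI initialization failed
        int eventset = PAPI_NULL;
        PAPI_create_eventset(&eventset);
        PAPI_add_event(eventset, PAPI_L1_DCM);  // level-1 data-cache misses

        PAPI_start(eventset);
        // ... run the code to be measured here ...
        long long misses;
        PAPI_stop(eventset, &misses);

        printf("L1 data cache misses: %lld\n", misses);
        return 0;
    }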
Experiments
- Repeat experiments. Use the average over several runs of the
program. Avoid using the time of the first run, since it
typically has extra overhead, e.g., loading the program into memory, a cold
cache, etc. (see the timing sketch after this list).
- Use a program like Gnuplot
to plot experimental data sets.
- In plots, use logarithmic scales whenever appropriate, in
particular to study asymptotic behavior.
- To check if data points f(n) follow, e.g., an
n log n curve (hard to identify in a plot), try to plot
the data as f(n)/(n log n) and see if it
converges to a constant for large n; the timing sketch after this list prints this ratio.
- Try to approximate the plotted data points with a curve. Does the
curve follow the theoretical expected behavior?
- Measure time, number of comparisons, tree rotations, cache
misses, branch mispredictions, ... Do they support the theory? Do hardware parameters like
cache size influence the running time? Try to explain the observed
behavior through theoretical arguments.
- Generate many plots - move similar plots to an appendix, and
include in the main report a representative for a group of similar
plots.
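
The points above about averaging over repeated runs and about plotting f(n)/(n log n) can be combined in a small timing driver. The sketch below uses std::sort as a stand-in for the algorithm being measured: for doubling input sizes it discards the first run, averages the remaining runs, and prints the average time and the time divided by n log2 n (if the last column converges to a constant, the measurements are consistent with an n log n running time):

    #include <algorithm>
    #include <chrono>
    #include <cmath>
    #include <cstdio>
    #include <random>
    #include <vector>

    int main() {
        std::mt19937 gen(42);                    // fixed seed: reproducible runs
        const int RUNS = 5;
        for (int n = 1 << 10; n <= 1 << 20; n *= 2) {
            double total = 0.0;
            for (int run = 0; run <= RUNS; run++) {
                std::vector<int> v(n);
                for (int &x : v) x = (int)gen();
                auto start = std::chrono::steady_clock::now();
                std::sort(v.begin(), v.end());   // algorithm under test
                auto stop = std::chrono::steady_clock::now();
                if (run > 0)                     // discard the first (warm-up) run
                    total += std::chrono::duration<double>(stop - start).count();
            }
            double avg = total / RUNS;
            printf("%8d %.6f %.3e\n", n, avg, avg / (n * std::log2((double)n)));
        }
        return 0;
    }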
Testing
- Use invariants/assertions in programs to check that the program makes
the expected computations, that the inputs are as expected, ...
- Use a set of tests for the different parts of the code, e.g., to test if all
back-pointers are correctly set.
- Test if the output is correct, e.g., by testing if different algorithms
generate the same output (see the sketch after this list), using assertions
in the code, adding code to check various conditions, visualizing the output,
and automating the check that the output is valid.
- Automate your test cases (e.g., using a set of scripts) so that you
can easily rerun tests and automatically update the generated plots.
- Describe in the report the different test cases used.
- If generating random input, then describe precisely how you generate
the input and what random distributions you use (and the
library functions used in the implementation for generating random
numbers); see the second sketch after this list.
- Do you handle all degenerate cases, e.g., three points on a line,
four points on a circle, identical points, overlapping line segments, ... ?
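
A cheap way to test correctness, as mentioned above, is to compare the output of your implementation against a trusted reference on many random inputs. A minimal sketch, where the hypothetical my_sort stands in for the implementation under test (compile without -DNDEBUG so the assertion is active):

    #include <algorithm>
    #include <cassert>
    #include <random>
    #include <vector>

    // Placeholder for the implementation being tested.
    void my_sort(std::vector<int> &v) { std::sort(v.begin(), v.end()); }

    int main() {
        std::mt19937 gen(1);
        for (int test = 0; test < 1000; test++) {
            std::vector<int> input(100);
            for (int &x : input) x = (int)(gen() % 1000);
            std::vector<int> a = input, b = input;
            my_sort(a);                     // implementation under test
            std::sort(b.begin(), b.end());  // trusted reference
            assert(a == b);                 // outputs must agree on every test
        }
        return 0;
    }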
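
To make the description of random input generation precise, state the generator, the distribution, and the seed, all of which are explicit when using the C++ <random> library. A sketch with a hypothetical helper random_points:

    #include <random>
    #include <utility>
    #include <vector>

    // Generates n points uniformly at random in the unit square.
    // For the report: generator std::mt19937 (Mersenne Twister) with the
    // given seed, coordinates from std::uniform_real_distribution(0.0, 1.0).
    std::vector<std::pair<double, double>> random_points(int n, unsigned seed) {
        std::mt19937 gen(seed);
        std::uniform_real_distribution<double> coord(0.0, 1.0);
        std::vector<std::pair<double, double>> pts(n);
        for (auto &p : pts) p = { coord(gen), coord(gen) };
        return pts;
    }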
Common LaTeX typos
- $O(log n)$ should be $O(\log n)$
(the functions max, min, log, exp, sin, cos, etc., should be written as
\max, \min, \log, \exp, \sin, \cos).
- Don't write "Århus Universitet". The name is "Aarhus Universitet"
or "Aarhus University".
- Use "~" to avoid unfortunate linebreaks,
e.g. "Brodal~\cite{Brodal95}", "$n$~lines", "the point~$p$".
- Do not start a sentence with mathematics; e.g., "$S$ is the set of
points" should be restated as something like "The set of points is denoted $S$".
- "theorem 7", "figure 2" should be "Theorem 7", "Figure 2"
(captital since name of a theorem or a figure)
- When writing sets "{ ... | ... }", the "|" should be typeset using \mid (see the example after this list).
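
A small LaTeX fragment showing correct forms of the examples above:

    The running time is $O(n \log n)$, as shown by Brodal~\cite{Brodal95}.
    The set of points is denoted~$S$, where $S = \{\, x \mid x \geq 0 \,\}$.
    Figure~2 shows the $n$~lines and the point~$p$; see Theorem~7.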
Page maintained by Gerth S. Brodal