Data sets in large applications are often too massive to fit completely inside the computer's internal memory. The resulting input/output communication (or I/O) between fast internal memory and slower external memory (such as disks) can be a major performance bottleneck. During the last decade a major body of research has been devoted to the development of efficient external memory algorithms, where the goal is to exploit locality in order to reduce the I/O costs. This summer school will survey the state of the art in the design and analysis of external memory algorithms and data structures.
The lecturers and the main topics for the summer school are:
Lars Arge, Duke University | - External geometric data structures |
Erik D. Demaine, MIT | - Cache oblivious algorithms and data structures |
Paolo Ferragina, University of Pisa | - String algorithms and data structures |
Jeff Vitter, Duke University | - Fundamental external memory algorithms and use of parallel disks |
Norbert Zeh, Duke University | - Graph algorithms |
Lars Arge (external geometric data structures): geometric data structures (interval trees, priority search trees), range searching (range trees, kd-B trees, O-trees, cross-trees, R-trees), point location, proximity queries. B-trees, weight-balanced B-trees, persistent B-trees, buffer trees, priority queues Erik Demaine (cache oblivious algorithms and data structures): Cache-oblivious model, basic algorithms, sorting, data structures (search trees, priority queues), graph and geometry algorithms. Paolo Ferragina (string algorithms and data structures): String sorting, the indexing problem (word vs full-text indexes, time vs space trade-off), word indexes (inverted lists, text and index compression, block-addressing indexes, complex queries), full-text indexes (suffix array, suffix tree, string B-tree, opportunistic data structures, experiments). Jeff Vitter (fundamental external memory algorithms and use of parallel disks): I/O model, sorting upper and lower bounds, hashing, algorithms for parallel disks, scientific computing (matrix computations, FFT), hierarchical memory models, batched computational geometry/geographic information systems (distribution sweeping, spatial join, line segment intersection, convex hull), Norbert Zeh (graph algorithms): Data structures useful for solving graph problems I/O-efficiently (time-forward processing, list ranking, Euler tour, graph contraction), fundamental graph algorithms for general graphs (connectivity problems, breadth first search (BFS), depth first search (DFS), single source shortest path (SSSP)), algorithms for sparse graphs (general algorithms that happen to work well on sparse graphs, overview of graph classes where things can be done I/O-efficiently, planar graphs).
The European Educational Forum (EEF) is a joint initiative of European interuniversitary research schools in computer science, and involves 32 universities in Denmark, Finland, France, Germany, Italy, The Netherlands, and the United Kingdom. The summer school on Massive Data Sets is organised as a part of the EEF foundations series of summer schools, and is supported by the European Commission Research DG (Human Potential Programme, High-Level Scientific Conferences, HPCF-CT-1999-00186) and BRICS.
Questions concerning the summer school should be directed to
Summer School on Massive Data Sets
Att.: Gerth S. Brodal
BRICS, Department of Computer Science
University of Aarhus
DK - 8000 Aarhus C
Denmark
Phone: (+45) 8942 3200
Fax: (+45) 8942 3255
E-mail: MassiveData02@brics.dk