Project Description
The following description is an excerpt from the project contract.
Contents
- Project Summary
  - Objectives
  - Introduction
  - Size and Complexity
  - Algorithmic Methods
- Project Workplan
  - Introduction
  - Work Package 1: Massive Data Sets
  - Work Package 2: Networks and Communication
  - Work Package 3: Production and Transportation Planning
  - Work Package 4: Generic Methods
  - Work Package 5: Experimental Algorithmics
  - Work Package 6: Project Management, Dissemination, Evaluation
ALCOM-FT brings together eleven of the leading groups in algorithms
research in Europe in a project that proposes to discover new
algorithmic concepts, identify key algorithmic problems in important
applications, and contribute to the accelerated transfer of advanced
algorithmic techniques into commercial systems.
The main emphasis of the project is on a novel combination of
application-oriented research in three important
areas - massive data sets, massive and complex
communication, and complex problems in production and
planning - with innovative methodological work on
experimental algorithmics and generic algorithmic
methods.
Algorithms and their analysis are at the heart of information
technology. Significant advances in algorithms are a crucial factor
in such technological developments as (1) modern search engines, which
process enormous amounts of data "instantaneously", (2) world-wide
communication, and (3) electronic commerce.
The rapid development of the Internet and other information highways
is radically changing society. Widespread use of advanced algorithmic
techniques in the information infrastructure and in engineering and
business applications will lead to more competitive systems and will
open opportunities for the design and construction of entirely new
systems.
Properly exploiting the revolutionary new technologies will require
new algorithmic methods. The consortium behind this project (eleven
partners in eight EU member states and one NAS country) will
co-operate in the rebuilding of the theory and practice of
algorithmics, in light of technological developments. We have
identified three key challenges and two methodologies as the content
of a long-term (or "fundamental") research project, i.e., of a
high-risk project of the type in which new insight is discovered,
published, applied, and disseminated within a five-to-ten year
horizon.
We shall consider problems of size and
complexity, specifically within the challenges of "massive data
sets," "massive and complex communication," and "complex problems in
production and planning." Our methodologies are "experimental
algorithmics" and "generic algorithmic methods." Dealing efficiently
with size and complexity has always been a primary goal for the theory
of algorithmics; however, the concrete challenges are continually
reshaped by technological revolution, so methodological breakthroughs
will be required.
Ever larger data sets arise in important applications like
Internet search, data mining, electronic libraries, geographic
information systems, computer graphics, scientific computing, etc.
Archives for satellite images, climate simulation or elementary
particle physics already work with petabytes of data, and for many
existing and new applications, no size limits are in sight.
Patterns of communication are changing radically. The Internet has
dramatically altered the way people share, acquire and exchange
information, and technological development will reinforce this change.
All-optical networks carry multimedia data at very high speed,
wireless mobile/radio technology allows for forming temporary networks
without the aid of an established infrastructure, and satellite
networks allow for asymmetric multicast communication of high
bandwidth.
Efficient planning in business and society is as important as ever,
and instances of concrete planning problems are becoming ever larger
and more complex. This leads directly to a host of
optimization problems, the solutions of which are of significant
practical interest and represent a considerable challenge in algorithm
design.
Traditional algorithms, traditional network design methods and
protocols, and traditional approaches to combinatorial optimization
are insufficient for the handling of these problems.
- Algorithms for massive data problems must take into account that
the size of internal memory is limited, that access to external
memory (e.g., hard disks) is currently four to six orders of
magnitude slower than access to internal memory, and that external
memory rewards locality of reference, since data is transferred in
blocks between internal and external memory (see the sorting sketch
after this list).
- The large networks of today (and the very large networks of
tomorrow) are difficult to control, and the increased contention for
key resources such as bandwidth or (optical) frequencies leads to
complexity bottlenecks in a qualitatively new way. Design and
optimization of networks, data allocation in networks, and control of
distributed computations towards common global goals are among the
most outstanding problems.
- Scarce resources must be used efficiently. This can be achieved
by good planning, but planning problems are becoming more and more
complex because of the increasing size of companies, more involved
business processes, expanding governmental regulations, etc. Since
human planners cannot cope with the big planning problems of tomorrow,
efficient computer algorithms are needed to assist them. For a number
of important areas, no adequate algorithms are available.
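To make the external-memory cost model concrete, here is a minimal
illustrative sketch, in Python and not a project deliverable, of a
two-pass external merge sort: memory-sized runs are sorted internally,
then merged in a single sequential pass. The function name and the
block and memory constants are our assumptions, chosen purely for
illustration.

    import heapq
    import os
    import tempfile

    BLOCK = 4096              # assumed disk block size (illustrative)
    MEMORY_LIMIT = 4 * BLOCK  # pretend internal memory holds a few blocks

    def external_sort(in_path, out_path):
        """Two-pass external merge sort over a file of text lines."""
        # Pass 1: read one memory-load at a time, sort it, write it out
        # as a sorted run.
        run_paths = []
        with open(in_path) as f:
            while True:
                chunk = f.readlines(MEMORY_LIMIT)  # ~one memory-load of lines
                if not chunk:
                    break
                chunk.sort()
                fd, path = tempfile.mkstemp(text=True)
                with os.fdopen(fd, "w") as run:
                    run.writelines(chunk)
                run_paths.append(path)
        # Pass 2: k-way merge of the sorted runs; every file is read and
        # written strictly sequentially, the access pattern that block
        # transfer rewards.
        runs = [open(p) for p in run_paths]
        try:
            with open(out_path, "w") as out:
                out.writelines(heapq.merge(*runs))
        finally:
            for r in runs:
                r.close()
            for p in run_paths:
                os.remove(p)

Each pass scans the data sequentially, so the I/O cost is proportional
to N/B block transfers per pass rather than one transfer per item;
this saving is exactly what the external-memory model captures.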
We shall construct new models and algorithms that take into account
multiple disks and processors, memory hierarchies, and networks. We
shall develop mathematical models of the key properties of new
communication systems, and develop algorithmic methods of general
applicability for solving the important problems. We shall
design, theoretically analyze, and experimentally validate efficient,
robust, and stable algorithmic solutions to optimization issues in
communication networks and dynamic distributed environments. We have
selected important planning problems in the area of production and
transportation. We shall develop algorithmic techniques to solve them,
implement these techniques, and test them on real-world instances
supplied by our industrial contacts.
Many of the algorithms for external-memory problems will be delivered
in the form of (extensions to) algorithmic libraries. We shall also
produce software for design and resource allocation in networks as
well as for the efficient implementation of programs in dynamic
distributed environments. We shall build a library containing
real-world problem instances in planning. Our software will be made
available to a broad spectrum of users, including both industry and
academia.
Theoretical computer science has not only established an impressive
body of knowledge about the intrinsic nature of computation, but has
also produced a vast number of generic algorithmic methods leading to
decisive improvements in the solution of large-scale
problems. Important examples of such generic methods are data
structuring, on-line and dynamic algorithms, paradigms for parallel
and distributed computing, randomization, approximation, etc.
This body of knowledge constitutes the scientific platform on which
the consortium meets the challenges of size and complexity. We also
realize that the rapid transfer of relevant algorithmic techniques
into the applications domain requires an expanded view of the design
methodology relative to that of classical algorithmics alone. The
objects of interest in traditional algorithmics are algorithms and the
techniques for their analysis, whereas there is currently no
well-defined methodology for the study of concrete programs. It
is necessary to make programs a first-class object of interest and we
expect this expanded view to lead to a methodological breakthrough.
We shall attack the challenges described, lay the foundation for
future applications, and establish a solid platform for experimental
algorithmics. The vehicle will be continued development of carefully
chosen generic methods with special emphasis on randomization,
combinatorial optimization and approximation.
- Randomization has been shown to apply to remarkably diverse areas
of computer science including communication (e.g., randomization is
vital in protocols such as Ethernet and TCP/IP), very large data sets
(e.g., the treatment of large textual data is strongly dependent on
hashing methods; see the fingerprinting sketch after this list), and
combinatorial optimization. One of our major objectives will be to
develop and organize randomized methods, most notably (but not
exclusively) in the context of these three application areas.
- Discrete optimization techniques are of vital importance for
information systems and production plans. Relatively little is known
about solution methods for unstructured optimization problems, and
our approach there will mainly be based on branch-and-bound
techniques (a minimal branch-and-bound sketch appears after this
list). Much more is known about optimally solving highly structured
integer linear programs. However, exploiting existing software for
this task is difficult because it requires substantial algorithmic
ingenuity and advanced implementation work. Our main emphasis will be
on the development of an easy-to-use solver that understands
high-level specifications and is based on the techniques of
branch-and-cut and constraint programming.
- Approximation is the approach of choice when a problem is known
to be hard. As opposed to heuristics with no guarantees, it can lead
to practical algorithms with solutions demonstrably close to optimal
(see the vertex-cover sketch after this list). The development of
approximation methods requires an in-depth analysis of the
combinatorial structure of optimization problems, and our major
objective will be to greatly extend the classes of approximable
problems, with a view to testing the resulting algorithms on
real-life instances.
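To illustrate the kind of hashing method the randomization bullet
refers to, the following minimal Python sketch, purely expository and
not a deliverable, implements Karp-Rabin fingerprinting; the function
name and constants are ours, and the random choice of base is what
makes spurious fingerprint collisions unlikely on any fixed input.

    import random

    def rabin_karp(text, pattern):
        """Randomized string search via rolling fingerprints (Karp-Rabin)."""
        m = len(pattern)
        if m == 0 or m > len(text):
            return -1
        q = (1 << 61) - 1           # a large Mersenne prime modulus
        b = random.randrange(2, q)  # random base: the source of randomization
        h = pow(b, m - 1, q)        # weight of the window's leading character
        pat_fp = txt_fp = 0
        for i in range(m):
            pat_fp = (pat_fp * b + ord(pattern[i])) % q
            txt_fp = (txt_fp * b + ord(text[i])) % q
        for i in range(len(text) - m + 1):
            # compare fingerprints first; the direct string comparison
            # guards against the (unlikely) collision
            if txt_fp == pat_fp and text[i:i + m] == pattern:
                return i
            if i + m < len(text):
                txt_fp = ((txt_fp - ord(text[i]) * h) * b + ord(text[i + m])) % q
        return -1

Since each window's fingerprint is updated in constant time, the whole
text is scanned in linear time, which is what makes hashing methods
attractive for very large texts.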
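For the optimization bullet, here is a minimal branch-and-bound sketch
for the 0/1 knapsack problem, an illustrative toy rather than the
solver the project proposes: branching enumerates take/skip decisions,
and a fractional relaxation supplies the upper bound used for pruning.

    def knapsack_bnb(values, weights, capacity):
        """Branch-and-bound for 0/1 knapsack (positive weights assumed)."""
        order = sorted(range(len(values)),
                       key=lambda i: values[i] / weights[i], reverse=True)

        def bound(i, value, room):
            # Greedy fractional fill of the remaining items: a valid upper
            # bound, since the relaxation may take a fraction of one item.
            for j in order[i:]:
                if weights[j] <= room:
                    room -= weights[j]
                    value += values[j]
                else:
                    return value + values[j] * room / weights[j]
            return value

        best = 0

        def branch(i, value, room):
            nonlocal best
            best = max(best, value)
            if i == len(order) or bound(i, value, room) <= best:
                return  # prune: the relaxation cannot beat the incumbent
            j = order[i]
            if weights[j] <= room:                      # take item j
                branch(i + 1, value + values[j], room - weights[j])
            branch(i + 1, value, room)                  # skip item j

        branch(0, 0, capacity)
        return best

For example, knapsack_bnb([60, 100, 120], [10, 20, 30], 50) returns
220; the bound lets the search discard subtrees whose relaxation
cannot beat the best solution found so far, which is what makes the
exponential worst case tolerable on structured instances.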
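Finally, as a classical instance of the guarantee the approximation
bullet describes, this purely illustrative sketch computes a vertex
cover at most twice the optimum by taking both endpoints of a greedily
built maximal matching.

    def vertex_cover_2approx(edges):
        """Vertex cover via maximal matching: at most twice the optimum."""
        cover = set()
        for u, v in edges:
            if u not in cover and v not in cover:
                # (u, v) joins a maximal matching; any cover must contain
                # u or v, so taking both at most doubles the optimum.
                cover.add(u)
                cover.add(v)
        return cover

The analysis is one line: the matched edges are disjoint, every cover
needs one endpoint per matched edge, and the algorithm takes two.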
We shall produce methodologies for experimental
algorithmics as well as a testbed for running reproducible
experiments. We shall also provide tools for performance prediction
that can inform users about the expected behaviour of their
algorithms much more accurately than traditional rough analysis.
The work on randomization will address the precise quantification of
efficiency, of probabilities of failure, and other similar questions,
all at a fundamental level. Also, we will produce test sets for
experimental algorithmics that are based on randomization.
We will investigate discrete structures that play a central role in
our combinatorial optimization problems. The new structural results
that we obtain are expected to lead to more efficient algorithms that
will serve as building blocks in the overall solutions.
We will design a high-level description language for combinatorial
optimization problems, implement it and deliver a complete easy-to-use
software system for structured combinatorial optimization problems.
We will test our algorithms on recognized test instances and on
problems supplied by our industrial contacts.
The workplan is organized as five mutually related work packages:
- Massive Data Sets
- Networks and Communication
- Production and Transportation Planning
- Generic Methods
- Experimental Algorithmics
The first three contain the work related to the challenges of size and
complexity, and the last two contain the more methodologically
oriented work. We view the relations between the work packages as
illustrated by the figure below.
[Figure: Relations between the work packages]
Besides these five work packages, an additional work package for
project management, dissemination, and evaluation is included.
Work Package 1: Massive Data Sets
Objectives
Future technologies will allow high-performance computation on the
massive data sets produced by the information society (e.g., the
Internet, digital libraries, data warehouses, geographic information
systems, or virtual reality). To make this possible we want to devise
innovative
algorithms which consider multiple disks and processors, memory
hierarchies, and networks.
Description of Work
Engineering of algorithms for data mining, fundamental data
structures, graph problems, and string problems. Development of
general techniques for handling parallel disks and processors using
realistic machine models. Design
and implementation of an experimental platform for external
computation. Applying our expertise in parallel computing,
probabilistic algorithms, combinatorial optimization, and software
libraries will be an important part of the work.
Work Package 2: Networks and Communication
Objectives
The current explosion in the number of network users will lead to a
rapid increase in communication demand from new applications based on
wireless technology, high-bandwidth services, and multimedia. For this
reason we foresee that, despite the increase in communication
resources, their efficient use will be a key factor in future network
management. Many important issues in managing advanced communication
networks (including high-speed ATM networks, all-optical and wireless
networks, and distributed media server systems) have an inherently
combinatorial and algorithmic nature. We plan to use advanced
algorithmic methods, computational and mathematical techniques, and
software platform development to design, analyze, and validate
efficient algorithmic solutions to such fundamental optimization
problems in modern networks and communications.
Description of Work
In this work package, we will design, theoretically analyze and
experimentally validate efficient, robust, and stable algorithmic
solutions to selected fundamental optimization issues in modern
network communications.
We focus on two central issues: efficient communication in modern
networks, where we will investigate the efficient use of resources
(frequencies, bandwidth), develop basic protocols (routing, contention
resolution) in advanced networks (all-optical, wireless, ATM), and
investigate their stability properties; and dynamic distributed
environments, where we will provide methods for job allocation and
load-balancing, as well as efficient data management and contention
resolution.
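One classical randomized strategy for the job-allocation and
load-balancing theme is the "two random choices" paradigm; the sketch
below is illustrative Python under an assumed model of uniform jobs,
not the method this work package commits to.

    import random

    def two_choice_assign(num_jobs, num_servers, rng=random):
        """Assign each job to the less loaded of two random servers."""
        loads = [0] * num_servers
        for _ in range(num_jobs):
            a = rng.randrange(num_servers)
            b = rng.randrange(num_servers)
            target = a if loads[a] <= loads[b] else b  # pick the emptier one
            loads[target] += 1
        return loads

With n jobs and n servers, the second probe reduces the expected
maximum load from about log n / log log n to about log log n (Azar,
Broder, Karlin, and Upfal), a striking example of how a little
randomness buys a lot of balance.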
We will also develop integrated software platforms for the
implementation and experimental validation of the proposed algorithmic
solutions.
Work Package 3: Production and Transportation Planning
Objectives
More and more companies are becoming aware that, in order to run their
business processes as efficiently as possible, they must improve the
quality of their planning. The planning problems are so complex that
algorithmic support is needed to produce a good solution. In
co-operation with our industrial contacts we will devise algorithms
for a number of real-world planning problems in the areas of
production and transport.
Description of Work
We will study a number of practical planning problems that we have
identified in cooperation with our industrial partners, including job-shop
scheduling, dynamic scheduling, facility location, routing on
structured networks, and timetabling. We shall develop algorithmic
techniques to
tackle these problems and implement the solutions.
Work Package 4: Generic Methods
Objectives
Our main objective is to develop the new algorithmic methods that will
be necessary to properly exploit the new technologies that are
currently available or becoming available. These methods will
include probabilistic methods, optimization methods,
and approximation methods, all of which are of vital importance
to the rest of the project.
Description of Work
There are strong links between Work Packages 1, 2, and 3 and progress
along the lines of a generic methodology. Consequently, a central
theme of the project will be research in generic algorithmic methods.
The main emphasis will be to develop probabilistic techniques,
discrete optimization techniques and approximation techniques. In the
course of this work, we will need to use and improve other
computational and mathematical techniques such as heuristic methods,
computer algebra modeling, derandomization, and tractability
analysis. Many of the problems that we consider will
have an on-line nature (i.e., not all of the data are available at the
start of the computation).
Work Package 5: Experimental Algorithmics
Objectives
The objective of this work package is twofold. First, we will produce
methodologies for experimental algorithmics and a testbed for running
reproducible experiments. This will be extremely beneficial to
researchers and developers interested in practical aspects of
algorithms and their implementation issues. Second, we will produce
cache-aware algorithms and tools for performance prediction, which are
able to predict the behaviour of programs more accurately than
classical asymptotic analysis. This will be beneficial to the
community of users of algorithmic techniques at large.
Description of Work
Engineering of cache-aware data structures and algorithms. Development
of techniques for more realistic performance prediction and for
program checking. Construction of generators of worst-case, random
and simulated real-life problem instances for use in experimental
evaluation of algorithms. Development of techniques for visualization
of program behaviour. Construction of an experimental
testbed - a software tool supporting the experimental
evaluation of algorithms, incorporating the results of the work areas
mentioned above.
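To indicate what cache-awareness looks like in code, here is a minimal
sketch, written in Python for readability although the engineering in
this work package would target compiled settings where cache effects
dominate, of a tiled matrix transpose whose tile parameter would be
tuned to the cache; the function name and default tile size are
assumptions for illustration.

    def transpose_tiled(a, n, tile=32):
        """Cache-aware transpose of an n x n matrix stored row-major
        in the flat list `a`; returns the transpose as a new flat list."""
        out = [0] * (n * n)
        for ii in range(0, n, tile):
            for jj in range(0, n, tile):
                # Each tile touches only a small, contiguous working set,
                # so both source and destination stay cache-resident.
                for i in range(ii, min(ii + tile, n)):
                    for j in range(jj, min(jj + tile, n)):
                        out[j * n + i] = a[i * n + j]
        return out

A naive transpose strides through one of the two arrays with stride n,
touching a new cache line on nearly every access; tiling restores
locality in both arrays at once.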
Work Package 6: Project Management, Dissemination, Evaluation
Objectives
To secure the smooth operation of the project. To ensure proper
dissemination of results, and organise self-assessment and evaluation.
To represent the project in relation to external partners, including
academia, industry and the EU-commission.
Description of Work
The project will be led by a Consortium Board, chaired by the
Coordinator. The board will appoint a Steering Committee and one or
more Work Package Leaders for each Work Package. The board will carry
out and/or supervise the following activities: project development,
budgeting, and all matters related to project financing; maintenance
of the infrastructure of the consortium; organisation of project
meetings, workshops, scientific exchanges etc.; management reporting
from the project; all other duties related to the general operation
and representation of the consortium activities relating to the
project.