Theoretical Computer Science TCS
All content following this page was uploaded by Viggo Kann on 31 January 2014.
The objective of the research in Theoretical Computer Science is to
investigate methods of efficient computation in a mathematically precise
sense, and to find lower bounds on the computational resources required for
a computation. The types of computations studied are either chosen for their
tractability or for their importance in practical applications. The different
areas of study are described below.
During the period 1993-1999 the group has been supported by Nutek,
BFR, TFR, NFR, SSF, and HSFR.
http://www.nada.kth.se/theory/
Approximation algorithms
Johan Håstad, Viggo Kann, Jens Lagergren
Viggo Kann and Pierluigi Crescenzi have compiled a list of the best
lower and upper bounds known for more than two hundred well-studied
NP-complete optimization problems. We try to collect all new results in the
wide area of approximation in order to keep the problem list updated. The
list is included in a new textbook on approximation [Ausiello et al., 1999],
and is also available to everyone on the web at
http://www.nada.kth.se/~viggo/problemlist/. It has frequently been used by
researchers throughout the world in the last five years, see [Crescenzi and
Kann, 1998].
Complexity
Johan Håstad, Mikael Goldmann
Cryptography
Johan Håstad, Mikael Goldmann, Mats Näslund
Decomposability
Stefan Arnborg, Jens Lagergren
ments, expressed in some logic, verification seeks a proof of R while
testing seeks a satisfying assignment for the negation of R. Thus (by Gödel’s
Completeness Theorem) testing and verification are in a strong sense dual.
During the period in question, we have carried out research into
black-box software testing based on formal methods. This research includes:
(i) Case studies of software requirements, particularly in the domains
of computational finance (Black-Scholes) and control systems design
(automotive systems). Characteristic of these areas is the need for
floating-point computation, and thus the limited applicability of
present-day formal verification methods.
(ii) Algorithm design for automated testing. We develop decision
algorithms for the satisfiability problem for first-order logic over finite
data types of large cardinality. First-order formulas are used to define
program pre- and postconditions, which can provide an oracle during testing.
Our approach to algorithm design is based upon function approximation
theory. The basic strategy is to search iteratively for a satisfying
assignment, using the outcome of searches to construct an approximate model
of the system, from which we can prune the search space.
Future research in this area will include:
(iii) Theoretical study of the coverage problem. We are attempting to
find an appropriate stochastic model for the variance between the
functional behaviour of programs and their approximate models in the sense
of (ii). For example, is a random walk model appropriate? Such models could
be used to estimate coverage in a mathematically precise way.
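As an illustration of the testing view described in (ii), the following
Python sketch treats a pre- and postcondition pair as a test oracle and
searches for an input that satisfies the negation of the requirement. It is
a minimal sketch under stated assumptions: the square-root requirement, the
function names, and the plain random search are all hypothetical, and the
approximation-based pruning described in (ii) is not implemented here.

```python
import math
import random
from typing import Callable, Optional


def precondition(x: float) -> bool:
    """Hypothetical precondition: inputs lie in [0, 10^6]."""
    return 0.0 <= x <= 1.0e6


def postcondition(x: float, y: float) -> bool:
    """Hypothetical postcondition: y is a non-negative square root of x."""
    return y >= 0.0 and math.isclose(y * y, x, rel_tol=1e-9, abs_tol=1e-12)


def newton_sqrt_5(x: float) -> float:
    """Program under test (deliberately flawed): only 5 Newton steps."""
    y = max(x, 1.0)
    for _ in range(5):
        y = 0.5 * (y + x / y)
    return y


def find_counterexample(program: Callable[[float], float],
                        trials: int = 1000,
                        seed: int = 0) -> Optional[float]:
    """Random search for x with precondition(x) and not postcondition(x, f(x)).

    Returns a failing input, or None if none was found within `trials`.
    """
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.uniform(0.0, 1.0e6)
        if precondition(x) and not postcondition(x, program(x)):
            return x
    return None


if __name__ == "__main__":
    print("flawed program:", find_counterexample(newton_sqrt_5))
    print("library sqrt:  ", find_counterexample(math.sqrt))
```

A returned value is a satisfying assignment for the negated requirement,
that is, a failing test case; a result of None only means that the search
found nothing, not that the program is correct.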
Results in this project are presented in [Abdulla et al., 1999], [Berg et
al., 1999], and [Meinke and Nielsen, 1999].
A description of our research project for the layman can be found on
the web at http://www.nada.kth.se/~karlm/Testing.htm.
Computational biology
Jens Lagergren, Henrik Eriksson
biologically motivated algorithmic problems. For instance, sequencing of
DNA constitutes a partly algorithmic problem. Moreover, it gives rise to an
enormous amount of discrete data, and the analysis of this data often leads
to algorithmic problems. Many of the most interesting problems are
NP-complete, and thus expected to be difficult to solve, but sometimes there
are ways around this dilemma. One can consider approximation algorithms,
fixed-parameter variations, etc.
One fundamental problem in computational biology is to measure
the similarity between two genomes. Given two DNA sequences, the Sequence
Alignment problem is to insert blanks into the sequences in such a way that
they become as similar as possible.
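The Sequence Alignment problem stated above can be illustrated with the
classical dynamic-programming approach (Needleman-Wunsch global alignment).
The Python sketch below is only an example under assumed scoring values
(+1 for a match, -1 for a mismatch, -1 for each inserted blank); these
scores and the function name are illustrative choices, not taken from the
report.

```python
from typing import Tuple


def align(a: str, b: str, match: int = 1, mismatch: int = -1,
          gap: int = -1) -> Tuple[int, str, str]:
    """Globally align a and b; blanks are written as '-'."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = match if a[i - 1] == b[j - 1] else mismatch
            score[i][j] = max(score[i - 1][j - 1] + sub,
                              score[i - 1][j] + gap,
                              score[i][j - 1] + gap)

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a, out_b = [], []
    i, j = n, m
    while i > 0 or j > 0:
        sub = match if i > 0 and j > 0 and a[i - 1] == b[j - 1] else mismatch
        if i > 0 and j > 0 and score[i][j] == score[i - 1][j - 1] + sub:
            out_a.append(a[i - 1])
            out_b.append(b[j - 1])
            i, j = i - 1, j - 1
        elif i > 0 and score[i][j] == score[i - 1][j] + gap:
            out_a.append(a[i - 1])
            out_b.append("-")
            i -= 1
        else:
            out_a.append("-")
            out_b.append(b[j - 1])
            j -= 1
    return score[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    total, top, bottom = align("ACGT", "AGT")
    print(total)
    print(top)
    print(bottom)
```

The table has (n+1)(m+1) entries, so the running time is proportional to the
product of the two sequence lengths.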
Say that we are given two genomes as sequences of genes. One can
measure the distance between these two genomes by counting the least
number of operations that transform one into the other. One example of an
operation that is reasonable to use (since it mimics an operation on the
genome that takes place during evolution) is reversal, that is, any
contiguous subsequence of the gene sequence may be reversed. We have
studied the algorithmic part of a procedure to obtain the gene sequences of
a chromosome, namely Radiation Hybrid mapping; see [Håstad et al., 1998]
and [Ivansson and Lagergren, 1999].
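To make the reversal distance concrete, the following Python sketch computes
the least number of reversals that transform one gene order into another by
breadth-first search over all orders. It is a brute-force illustration for
very small inputs only (the search space grows factorially), it assumes
unsigned genes that each occur exactly once, and the function name is
hypothetical.

```python
from collections import deque
from typing import Hashable, Sequence


def reversal_distance(source: Sequence[Hashable],
                      target: Sequence[Hashable]) -> int:
    """Least number of segment reversals turning source into target."""
    start, goal = tuple(source), tuple(target)
    if len(set(start)) != len(start) or set(start) != set(goal) \
            or len(start) != len(goal):
        raise ValueError("both orders must contain the same distinct genes")

    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        order, dist = queue.popleft()
        if order == goal:
            return dist
        n = len(order)
        # Reverse every contiguous segment order[i:j] of length >= 2.
        for i in range(n - 1):
            for j in range(i + 2, n + 1):
                nxt = order[:i] + order[i:j][::-1] + order[j:]
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, dist + 1))
    raise RuntimeError("unreachable: reversals generate every order")


if __name__ == "__main__":
    print(reversal_distance(["g3", "g1", "g2"], ["g1", "g2", "g3"]))
```

Because breadth-first search visits orders in increasing number of
reversals, the first time the target is reached gives the minimum.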
Another fundamental problem in computational biology is that of
inferring the evolutionary history of a set of species, given some
representation of the species. The evolutionary history is represented by a
phylogenetic tree. The vertices of a phylogenetic tree are labelled with
species, either given species or inferred species (representing ancestors of
the given species). Two adjacent nodes of the phylogenetic tree should be
labelled with species such that one is the ancestor of the other.
The following strategy is often used for reconstructing the
evolutionary relationships of a set of species (i.e. their species tree): One
begins by constructing phylogenetic trees for a set of distinct gene
families (i.e. gene trees). Typically, these gene trees are built using one
of the standard techniques for constructing phylogenetic trees from
molecular sequence data. However, for many gene families, the gene tree
differs from the species tree (in other terminology, their topologies
disagree). Hence, a single gene tree is not considered sufficient for
inferring a species tree. For this reason, a set of gene trees is often used
in order to increase the reliability of the resulting species tree. We have
studied this algorithmic problem: given a set of disagreeing gene trees,
find the species tree [Hallet and Lagergren, 1999].
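One common way to score how well a candidate species tree explains a set of
disagreeing gene trees is to count the gene duplications that must be
assumed when each gene tree is mapped into the species tree. The Python
sketch below illustrates that idea with the standard least-common-ancestor
mapping; it is not the algorithm of [Hallet and Lagergren, 1999], it ignores
gene losses, and the tree encoding (nested tuples whose leaves are species
names) and all function names are assumptions made for the example.

```python
from typing import FrozenSet, List, Sequence, Union

Tree = Union[str, tuple]  # a leaf is a species name, an inner node a tuple


def clusters(tree: Tree) -> List[FrozenSet[str]]:
    """Leaf sets of all nodes of `tree`; the node's own set comes last."""
    if isinstance(tree, str):
        return [frozenset([tree])]
    out: List[FrozenSet[str]] = []
    leaves: FrozenSet[str] = frozenset()
    for child in tree:
        sub = clusters(child)
        out.extend(sub)
        leaves |= sub[-1]
    out.append(leaves)
    return out


def count_duplications(gene_tree: Tree, species_tree: Tree) -> int:
    """Number of gene-tree nodes mapped to the same species node as a child."""
    species_clusters = clusters(species_tree)

    def lca(species: FrozenSet[str]) -> FrozenSet[str]:
        # The smallest species-tree cluster containing `species`.
        return min((c for c in species_clusters if species <= c), key=len)

    duplications = 0

    def visit(node: Tree) -> FrozenSet[str]:
        nonlocal duplications
        if isinstance(node, str):
            return lca(frozenset([node]))
        child_maps = [visit(child) for child in node]
        here = lca(frozenset().union(*child_maps))
        if any(m == here for m in child_maps):
            duplications += 1
        return here

    visit(gene_tree)
    return duplications


def best_species_tree(gene_trees: Sequence[Tree],
                      candidates: Sequence[Tree]) -> Tree:
    """Candidate species tree with the fewest implied duplications."""
    return min(candidates,
               key=lambda s: sum(count_duplications(g, s) for g in gene_trees))
```

For example, with gene trees (("a","b"),"c"), (("a","b"),"c") and
(("a","c"),"b"), the candidate (("a","b"),"c") needs one duplication in
total, fewer than either of the other two rooted topologies on three
species. Searching over all candidate trees in this way is only feasible for
very few species.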
Algorithms in language engineering
Viggo Kann
The goal of this project is to combine algorithms that have a solid
mathematical base with linguistic knowledge in order to construct tools for
Swedish text processing; tools that are both accurate and capable of
handling large amounts of data without speed degradation. We have earlier
successfully constructed tools for Swedish hyphenation and Swedish spelling
error detection and correction.
Currently we are focusing on Swedish grammar checking and
proofreading. We are constructing Granska, a hybrid system using both
statistical and linguistic methods for checking and correcting Swedish text.
Much work has been devoted to constructing a part-of-speech tagger that tags
each word in a sentence with its word class and inflectional features, see
[Carlberger and Kann, 1999]. By attacking the problems of disambiguation
of ambiguous words and tagging of unknown words we have managed to
obtain a tagger that tags 97% of the words correctly.
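The two problems named above, disambiguating ambiguous words and tagging
unknown words, can be illustrated with a small statistical tagger. The
Python sketch below uses a standard first-order hidden Markov model decoded
with the Viterbi algorithm; it is not the tagger of [Carlberger and Kann,
1999], and the tag set, the tiny lexicon, and every probability in it are
invented for the example. The Swedish word "får" is ambiguous between a noun
("sheep") and a verb ("gets").

```python
import math
from typing import Dict, List

TAGS = ["DET", "PRON", "NOUN", "VERB"]

# Hypothetical toy probabilities, chosen only for illustration.
START: Dict[str, float] = {"DET": 0.4, "PRON": 0.4, "NOUN": 0.1, "VERB": 0.1}
TRANS: Dict[str, Dict[str, float]] = {
    "DET":  {"DET": 0.05, "PRON": 0.05, "NOUN": 0.85, "VERB": 0.05},
    "PRON": {"DET": 0.05, "PRON": 0.05, "NOUN": 0.10, "VERB": 0.80},
    "NOUN": {"DET": 0.10, "PRON": 0.10, "NOUN": 0.20, "VERB": 0.60},
    "VERB": {"DET": 0.40, "PRON": 0.20, "NOUN": 0.30, "VERB": 0.10},
}
EMIT: Dict[str, Dict[str, float]] = {
    "DET":  {"ett": 0.5, "en": 0.5},
    "PRON": {"han": 0.5, "hon": 0.5},
    "NOUN": {"får": 0.5, "hus": 0.5},
    "VERB": {"får": 0.5, "ser": 0.5},
}
LEXICON = {word for table in EMIT.values() for word in table}
UNKNOWN = 1e-3   # same emission for every tag when the word is unknown
FLOOR = 1e-12    # known word that was never seen with this tag


def emission(tag: str, word: str) -> float:
    if word not in LEXICON:
        return UNKNOWN
    return EMIT[tag].get(word, FLOOR)


def viterbi(words: List[str]) -> List[str]:
    """Most probable tag sequence for `words` under the toy model."""
    if not words:
        return []
    # Each column maps tag -> (best log-probability, best previous tag).
    columns = [{t: (math.log(START[t]) + math.log(emission(t, words[0])), None)
                for t in TAGS}]
    for word in words[1:]:
        prev_col = columns[-1]
        col = {}
        for t in TAGS:
            prev, score = max(
                ((p, prev_col[p][0] + math.log(TRANS[p][t])) for p in TAGS),
                key=lambda pair: pair[1])
            col[t] = (score + math.log(emission(t, word)), prev)
        columns.append(col)

    # Follow the back-pointers from the best final tag.
    tag = max(TAGS, key=lambda t: columns[-1][t][0])
    path = [tag]
    for col in reversed(columns[1:]):
        tag = col[tag][1]
        path.append(tag)
    return path[::-1]


if __name__ == "__main__":
    print(viterbi(["ett", "får"]))
    print(viterbi(["han", "får"]))
```

In this toy model the context decides the tag of "får", and an unknown word
receives the tag that the transition probabilities make most likely after
its neighbour.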
We have also constructed and implemented an efficient and powerful
artificial language to express grammatical errors and corrections. In the
IPLab part of the project, many rules in this language have been written
and evaluated, and a graphical user interface has been constructed. There is
a web interface to Granska:
http://www.nada.kth.se/theory/projects/granska/demo.html
The project page is at http://www.nada.kth.se/theory/projects/granska/
The project is to perform research on architecture, design and usage quality
aspects of large-scale brain image databases and related data analysis
management systems. It combines research objectives from the subject areas
of database technology and brain image analysis so as to contribute to the
ECHBD image database and the BINS neuroinformatics analysis center
projects (see below), investigate the application of advanced
object-oriented methodology in heterogeneous, networked scientific image
analysis systems, and develop improved technology for meta-research and data
mining in large-scale brain image databases.
The European Computerized Human Brain Database (ECHBD)
project is supported by an EC Biotech grant and the Brain Image
Neuroinformatics System (BINS) by an SSF grant. Both projects were
initiated and are led by professor Per Roland at the Division of Human Brain
Research of the Karolinska Institute in Stockholm.
There is no mature methodology for management of 3D spatial and
spatiotemporal image databases. Therefore, one objective of the project is
to evaluate and refine such emerging methodology, in particular the
RasDaMan system. We will also investigate the use of mediator technology
(the AMOS II system) in a large and complex scientific database
environment.
The project has a neuroinformatics usage perspective involving
management of very large, inhomogeneous data collections consisting mainly
of 3D raw data images, logical and statistical operations and queries on
sequences of such images, performance requirements arising from
interactive display of 3D images, and web distribution and access of images.
From this usage perspective, the project investigates multidatabase
architecture and data modeling issues, data mining methods for functional
PET and MRI brain imagery, use of a supercomputing center as a backend
storage and computing resource, and performance issues in storage, access,
query, and display of brain imagery.
A brain image database system, based on the RasDaMan raster
database manager and the O2 object-oriented database management system,
was developed for the ECHBD project. The first working version of the
system was made available to the ECHBD project partners in September
1999.
A research group has been formed whose first task is to design the
general architecture of the BINS raw image database system.
Some of the group's results can be found in [Fredriksson, 1999] and
in [Fredriksson et al., 1999].
Current GISs lack efficient facilities for integrating the base system
and a domain model for spatial analysis. Such facilities may be provided by
an object-oriented database language, whose primitives represent
fundamental operations of spatial analysis.
The use of a declarative object-oriented query language for domain
modelling offers several advantages over conventional imperative
programming techniques. It permits clear expression of ad-hoc analysis
problems on a high level of abstraction. Declarative models are easy to
define, inspect and understand and lead to compact, easily reusable, and
powerful domain models. Object-oriented query languages provide object views
which are invoked independently of whether they represent stored or derived
data. This feature supports data independence and schema evolution.
The Amorose project aims to develop a prototype spatial analysis
and database management system based on the AMOS II main-memory
object mediator system (Risch et al., Uppsala University) with spatial
extensions from the ROSE library (Gueting et al., Fernuniversitaet Hagen).
The prototype will be evaluated over a representative set of spatial data
analysis tasks.
In the first phase of the project (1996–98), the Rose library was
brought to a state where it satisfies the basic needs of the Amorose project.
As a result of this work, Rose was used as the basic computational tool in a
study of the concept of rough sets for uncertainty representation in spatial
data classification, see [Ahlqvist et al., 1999].
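The rough set concept mentioned above can be summarised in a few lines of
Python. In rough set theory, objects that cannot be told apart by their
observed attributes form indiscernibility classes, and a target set is
described by a lower approximation (classes entirely inside the target) and
an upper approximation (classes that overlap it). The sketch below is a
generic illustration, not code from the Rose library or from [Ahlqvist et
al., 1999]; the map cells, attribute values, and function names are
hypothetical.

```python
from collections import defaultdict
from typing import Callable, Hashable, Iterable, Set, Tuple


def rough_approximations(objects: Iterable[Hashable],
                         attributes_of: Callable[[Hashable], Hashable],
                         target: Set[Hashable]
                         ) -> Tuple[Set[Hashable], Set[Hashable]]:
    """Return (lower, upper) approximations of `target`."""
    # Group objects that share the same observed attribute value.
    classes = defaultdict(set)
    for obj in objects:
        classes[attributes_of(obj)].add(obj)

    lower: Set[Hashable] = set()
    upper: Set[Hashable] = set()
    for cls in classes.values():
        if cls <= target:      # class certainly belongs to the target
            lower |= cls
        if cls & target:       # class possibly belongs to the target
            upper |= cls
    return lower, upper


def accuracy(lower: Set[Hashable], upper: Set[Hashable]) -> float:
    """Rough-set accuracy |lower| / |upper| (1.0 when the set is crisp)."""
    return len(lower) / len(upper) if upper else 1.0


if __name__ == "__main__":
    # Hypothetical map cells with an observed appearance per cell.
    appearance = {1: "dark", 2: "dark", 3: "light", 4: "light",
                  5: "mixed", 6: "mixed"}
    forest = {1, 2, 3}  # cells that truly belong to the class of interest
    low, up = rough_approximations(appearance, appearance.get, forest)
    print(sorted(low), sorted(up), accuracy(low, up))
```

The difference between the upper and lower approximations is the boundary
region, that is, the cells whose class membership is uncertain given the
available attributes.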
The second phase involved the preliminary integration of Rose into
Amos. This first integrated Amorose prototype showed how complex
spatial queries could be formulated in AmosQL and executed within Amorose.
The ongoing third development phase aims at close integration of
the modified ROSE library and the Windows NT-based Amos II mediator
system. The resulting Amorose system will provide a scalable memory
integration mechanism extending the Amos query optimizer and making use
of query optimization techniques for efficient query evaluation.
References–TCS
Book
[Ausiello et al., 1999] Ausiello, G., Crescenzi, P., Gambosi, G., Kann, V.,
Marchetti Spaccamela, A., and Protasi, M. (1999), Complexity and
Approximation: combinatorial optimization problems and their
approximability properties, Springer Verlag.
Theses
[Engebretsen, 1998] Engebretsen, L. (1998), Approximating
Generalizations of Max Cut, Licentiate thesis.
[Keukelaar, 1999] Keukelaar, J. (1999), A Visual Programming Language
for the Analysis of Uncertain Spatial Data, Licentiate thesis.
[Näslund, 1998] Näslund, M. (1998), Bit Extraction, Hard-Core Predicates,
and the Bit Security of RSA, PhD thesis.
[Ulfberg, 1999] Ulfberg, S. (1999), On Lower Bounds for Circuits and
Selection, PhD thesis.
Journal publications
[Ahlqvist et al., 1999] Ahlqvist, O., Keukelaar, J. H. D., and Oukbir, K.
(1999), Rough classification and accuracy assessment, accepted to
Internat. J. Geographic Information Science.
[Amaldi and Kann, 1998] Amaldi, E. and Kann, V. (1998), On the
approximability of minimizing nonzero variables or unsatisfied
relations in linear systems, Theoretical Comput. Sci. 209, 237–260.
[Andersson and Engebretsen, 1998] Andersson, G. and Engebretsen, L.
(1998), Better approximation algorithms and tighter analysis for Set
Splitting and Not-All-Equal Sat, Inform. Process. Lett., 65:6,
305–311.
[Berg and Ulfberg, 1998] Berg, C. and Ulfberg, S. (1998), A lower bound
for perceptrons and an oracle separation of the PP^PH
hierarchy, J. Comput. and Syst. Sci. 56, 263–271.
[Cai et al., 1998] Cai, L., Chen, J., and Håstad, J. (1998), Circuit bottom
fan-in and computational power, SIAM J. Computing 27:2,
341–355.
[Carlberger and Kann, 1999] Carlberger, J. and Kann, V. (1999),
Implementing an efficient part-of-speech tagger, Software Practice
and Experience, 29, 815–832.
[Crescenzi and Kann, 1998] Crescenzi, P. and Kann, V. (1998), How to find
the best approximation results — a follow-up to Garey and Johnson,
SIGACT News 29:4, 90–97.
[Fernández-Baca and Lagergren, 1998] Fernández-Baca, D. and Lagergren,
J. (1998), On the approximability of the Steiner tree problem in
phylogeny, J. Discrete and Applied Math., Special issue on comp.
molecular biology, 88:127–143.
[Goldmann and Håstad, 1998] Goldmann, M. and Håstad, J. (1998),
Monotone Circuits for Connectivity Have Depth (log n)^{2-o(1)},
SIAM J. Computing, 27:5, 1283–1294.
[Goldmann and Karpinski, 1998] Goldmann, M. and Karpinski, M. (1998),
Simulating threshold circuits by majority circuits, SIAM J.
Computing, 27:1, 230–246.
[Goldreich and Håstad, 1998] Goldreich, O. and Håstad, J. (1998), On the
complexity of interactive proof with bounded communication, Inform.
Process. Lett. 67:4, 205–214.
[Håstad, 1998] Håstad, J. (1998), The shrinkage exponent of De Morgan
formulas is 2, SIAM J. Computing, 27:1, 48–64.
[Håstad, 1999] Håstad, J. (1999), Clique is hard to approximate within
n^{1-ε}, Acta Mathematica, 182, 105–142.
[Håstad et al., 1999] Håstad, J., Impagliazzo, R., Levin, L., and Luby, M.
(1999), A Pseudorandom Generator from any one-way function,
SIAM J. Computing, 28:4, 1364–1396.
[Kann et al., 1999] Kann, V., Domeij, R., Hollman, J., and Tillenius, M.
(1999), Implementation aspects and applications of a spelling
correction algorithm, in R. Koehler, L. Uhlirova, G. Wimmer: Text
as a Linguistic Paradigm: Levels, Constituents, Constructs. Festschrift
in honour of Ludek Hrebicek.
[Kann et al., 1998] Kann, V., Lagergren, J., and Panconesi, A. (1998),
Approximate MAX k-CUT with subgraph guarantee, Inform.
Process. Lett. 65, 145–150.
[Lagergren, 1998] Lagergren, J. (1998), Upper bounds on the size of
obstructions and intertwines. Journal of Combinatorial Theory
Series B, 73:1, 7–40.
[Johansson, 1998] Johansson, Ö. (1998), Clique-decomposition, NLC-
decomposition, and modular decomposition — relationships and
results for random graphs, CGTC 1998, Congressus Numerantium,
132, 39–60.
[Johansson, 1999] Johansson, Ö. (1999), Simple distributed
∆+1-coloring of graphs, Inform. Process. Lett. 70:5, 229–232.
Conference publications
[Abdulla et al., 1999] Abdulla, P., Ciapessoni, E., Marmo, P., Meinke, K.,
and Ratto, E. (1999), FAST: an integrated tool for verification and
validation of real time system requirements, in P.G. Larsen (ed.), Proc.
Fifth FMRail Workshop, electronic publication with LNCS 1708/1709,
Proc. FM ’99, Springer Verlag.
[Andersson, 1999] Andersson, G. (1999), An approximation algorithm for
Max p-section, STACS 1999.
[Andersson and Engebretsen, 1998] Andersson, G. and Engebretsen, L.
(1998), Sampling methods applied to dense instances of non-Boolean
optimization problems, RANDOM 1998, LNCS 1518, 357–368.
[Andersson et al., 1999] Andersson, G., Engebretsen, L., and Håstad, J.
(1999), A new way to use semidefinite programming with
applications to linear equations mod p, SODA 1999, 41–50.
[Arnborg, 1999a] Arnborg, S. (1999a), Learning in Prevision Space,
ISIPTA-99.
[Aumann et al., 1999] Aumann, Y., Håstad, J., Rabin, M., and Sudan, M.
(1999), Linear consistency testing, Proceedings of Workshop on
Randomization and Approximation Techniques in Computer Science,
Berkeley CA.
[Engebretsen, 1999] Engebretsen, L. (1999), An explicit lower bound for
TSP with distances one and two, STACS 1999.
[Fredriksson, 1999] Fredriksson, J. (1999), Design of an Internet
accessible visual human brain database system, IEEE International
Conference on Multimedia Computing and Systems (ICMCS ’99),
Florence, Italy.
[Fredriksson et al., 1999] Fredriksson, J., Roland, P. and Svensson, P. (1999),
Rationale and design of the European computerized human brain
database system, Scientific and Statistical Database Management
(SSDBM ’99), Cleveland.
[Goldmann and Russell, 1999] Goldmann, M. and Russell, A. (1999), The
complexity of solving equations over finite groups, CCC 1999.
[Håstad et al., 1998] Håstad, J., Ivansson, L., and Lagergren, J. (1998),
Fitting points on the real line and its application to RH mapping,
ESA 1998, 465–476.
[Håstad and Näslund, 1998] Håstad, J. and Näslund, M. (1998), The
security of individual RSA bits, FOCS 1998, 510–519.
[Johansson, 1999] Johansson, Ö. (1999), NLC2-decomposition in polynomial
time, WG’99.
[Näslund and Russell, 1998] Näslund, M. and Russell, A. (1998),
Extraction of optimally unbiased bits from a biased source, IEEE
ITW ’98, 90–91.
Misc
[Arnborg, 1999b] Arnborg, S. (1999b), A survey of Bayesian Data Mining
– Part I: Discrete and semi-discrete Data Matrices, SICS
Technical Report.
[Berg et al., 1999] Berg, C., Ekström, R., and Meinke, K. (1999),
Automatic test case generation for derivative trading software,
submitted to: TEST Congress 2000.
[Goldmann et al., 1999] Goldmann, M., Näslund, M., and Russell, A. (1999),
Spectral bounds on general hard core predicates, manuscript.
[Hallet and Lagergren, 1999] Hallet, M.T. and Lagergren, J. (1999), New
Algorithms for the Duplication-Loss Model, manuscript.
[Ivansson and Lagergren, 1999] Ivansson, L. and Lagergren, J. (1999),
Algorithms for RH Mapping: New Ideas and Improved Analysis, manuscript.
[Meinke and Nielsen, 1999] Meinke, K. and Nielsen, J. (1999), User
requirements capture with tabular TRIO: a case study of a cruise
controller, submitted to Formal Aspects of Computing.