This is my personal archive of the Algorithms and Complexity Theory seminars at the Department of Computer Science, Aarhus University, since 1996. Previously these were known as ALCOM seminars. The list is by no means complete; it mainly covers the seminar announcements that ended up in my mailbox. - Gerth Stølting Brodal
June 21, 2024 | KVN Sreenivas, Indian Institute of Science Bengaluru (IISc) Bin Packing under Random Order: Breaking the Barrier of 3/2 |
June 14, 2024 | Debajyoti Kar, Indian Institute of Science, Bengaluru Parameterized Guarantees for Almost Envy-Free Allocations |
June 13, 2024 | Rolf Svenning, Aarhus University The All Nearest Smaller Values Problem Revisited in Practice, Parallel and External Memory |
May 31, 2024 | Bei Wang Phillips, University of Utah Capturing Robust Topology in Data |
May 30, 2024 | Peyman Afshani, Aarhus University Optimal Coresets for Low-Dimensional Geometric Median |
May 23, 2024 | Gerth Stølting Brodal, Aarhus University Funnelselect: Cache-Oblivious Multiple Selection |
May 16, 2024 | Sebastian Homrighausen, Aarhus University Estimating the Expected Social Welfare and Cost of Random Serial Dictatorship |
May 2, 2024 | Amik Raj Behera, Aarhus University Local Correction of Linear Functions over the Boolean Cube |
April 25, 2024 | Mads Bech Toftrup, Aarhus University Deep learning using less memory: parameter efficient training and quantization |
April 18, 2024 | Vincent Cohen-Addad, Google Research Recent Advances on Correlation Clustering |
April 11, 2024 | Mark Simkin, Aarhus University FRIDA: Data Availability Sampling from FRI |
April 9, 2024 | Jeff M. Phillips, University of Utah No Dimensional Sampling Coresets for Classification |
April 4, 2024 | Tianze Wei, City University of Hong Kong Fair Allocations of Items in Multiple Regions |
March 21, 2024 | Andrew Draganov, Aarhus University Total Unimodularity, Homology, Graph Laplacians |
March 14, 2024 | Hannah Josephine Keller, Aarhus University Differentially Private Selection from Secure Distributed Computing |
March 7, 2024 | Sudarshan Shyam, Aarhus University Boosting and Prime numbers |
February 29, 2024 | Arthur da Cunha, Aarhus University Randomized Methods in Integrated Circuit Design |
February 22, 2024 | Chris Schwiegelshohn, Aarhus University Deterministic Clustering in High Dimensional Spaces: Sketches and Approximation |
February 15, 2024 | Amik Raj Behera, Aarhus University How to compute roots of a polynomial? |
February 8, 2024 | Mikael Møller Høgsgaard, Aarhus University Majority-of-Three: The Simplest Optimal Learner? |
December 4, 2023 | Sudarshan Shyam, Aarhus University Low-distortion clustering with ordinal and limited cardinal information |
November 27, 2023 | Arthur da Cunha, Aarhus University Convolutional neural networks contain structured strong lottery tickets |
November 13, 2023 | Aniket Basu Roy, Aarhus University Packing Fréchet Balls |
November 1, 2023 | Sourav Chakraborty, Indian Statistical Institute, Kolkata, India Distinct Elements in Streams and the Klee's Measure Problem |
October 25, 2023 | Yakov Nekrich, Michigan Technological University Planar Nearest Neighbor Queries in Optimal Time: Semi-Online and Semi-Dynamic |
October 23, 2023 | Mads Bech Toftrup, Aarhus University On Generalization Bounds for Projective Clustering |
October 23, 2023 | Amik Raj Behera, Aarhus University Efficient PIT for Sparse Polynomials |
October 2, 2023 | Chris Schwiegelshohn, Aarhus University Towards Optimal Generalization Bounds for k-Means |
September 25, 2023 | Chris Schwiegelshohn, Aarhus University Optimal Coresets for Euclidean k-Means |
June 20, 2023 | Jens Kristian Refsgaard Schou, Aarhus University Space Efficient Functional Off-line Partial Persistent Trees with Applications to Planar Point Location |
May 16, 2023 | Mikael Møller Høgsgaard, Aarhus University AdaBoost is not an Optimal Weak to Strong Learner |
May 9, 2023 | Ioana Bercea, IT University of Copenhagen Locality in data structures |
April 25, 2023 | Nikolaj I. Schwartzbach, Aarhus University Hardness Self-Amplification of the Planted k-XOR Problem |
April 18, 2023 | Aniket Basu Roy, Aarhus University Covering Orthogonal Polygons with Rectangles |
March 21, 2023 | Sudarshan Shyam, Aarhus University Finding Fair Allocations under Budget Constraints |
March 14, 2023 | Ishaq Aden-Ali, University of California, Berkeley The One-Inclusion Graph Algorithm is not Always Optimal |
March 7, 2023 | Tomer Ezra, Sapienza University of Rome Contract design in combinatorial settings |
March 2, 2023 | Matteo Russo, Sapienza University of Rome Prophet Inequalities via the Expected Competitive Ratio |
February 28, 2023 | Mikael Møller Høgsgaard, Aarhus University Barriers for Faster Dimensionality Reduction |
February 14, 2023 | Rolf Svenning, Aarhus University Fully Persistent Search Trees in External Memory |
February 7, 2023 | Kasper Green Larsen, Aarhus University Fast Discrepancy Minimization with Hereditary Guarantees |
January 31, 2023 | Chris Schwiegelshohn, Aarhus University Breaching the 2 LMP Approximation Barrier for Facility Location with Applications to k-Median |
May 22, 2023 | David Saulpic, LIP6, Sorbonne Université, Paris, France Clustering and Differential Privacy |
December 6, 2022 | Zhile Jiang, Aarhus University Computing better approximate pure Nash equilibria in cut games via semidefinite programming |
November 29, 2022 | Maria Kyropoulou, University of Essex Not all Strangers are the Same: The Impact of Tolerance in Schelling Games |
November 22, 2022 | Amik Raj Behera, Aarhus University Shortest Cycle using Algebraic Techniques |
November 15, 2022 | Pingan Cheng, Aarhus University An Optimal Lower Bound for Simplex Range Reporting |
November 8, 2022 | Manaswi Paraashar, Aarhus University Quantum query-to-communication simulation and the role of symmetry |
November 1, 2022 | Mark Simkin, Aarhus University How to Compress Encrypted Data |
October 11, 2022 | Jens Kristian Refsgaard Schou, Aarhus University LP-type problems and Seidel's algorithm revisited |
October 4, 2022 | Nidhi Rathi, Aarhus University ? |
September 20, 2022 | Martin Ritzert, Aarhus University Graph Machine Learning for Fuel Science |
September 1, 2022 | Dominik Peters, CNRS researcher at University Paris-Dauphine Voting Rules in Participatory Budgeting |
August 30, 2022 | Alex Munteanu, TU Dortmund Sketching Logistic Regression |
June 9, 2022 | Christian Wulff-Nilsen, University of Copenhagen Distance Oracles for Planar Graphs |
May 25, 2022 | Pingan Cheng, Aarhus University On Semialgebraic Range Reporting |
May 19, 2022 | Giovanna Varricchio, Goethe-Universität Frankfurt am Main Maximizing Nash Social Welfare in 2-Value Instances |
May 12, 2022 | Gerth Stølting Brodal, Aarhus University Priority Queues with Decreasing Keys |
May 2, 2022 | Andrew Alexander Draganov, Aarhus University tSNE, UMAP, and Dimensionality Reduction via Gradient Descent |
April 7, 2022 | Peyman Afshani, Aarhus University On Cyclic Solutions to the Min-Max Latency Multi-Robot Patrolling Problem |
March 24, 2022 | Zeynep Gündogar, Aarhus University Tensor Concepts and Decomposing High Dimensional Data (continued) |
March 17, 2022 | Zeynep Gündogar, Aarhus University Tensor Concepts and Decomposing High Dimensional Data |
March 10, 2022 | Chris Rene Schwiegelshohn, Aarhus University Maintaining an EDCS in General Graphs: Simpler, Density-Sensitive and with Worst-Case Time Bounds |
February 17, 2022 | Karl Albert Friedrich Fehrs, Aarhus University The Complexity of Learning Approval-Based Multiwinner Voting Rules |
November 25, 2021 | Argyrios Deligkas, Royal Holloway University of London Pizza Sharing is PPA-hard |
November 22, 2021 | Prahladh Harsha, Tata Institute of Fundamental Research, India High Dimensional Expanders: An introduction and some recent applications |
November 19, 2021 | Eldon Chung, National University of Singapore 3Sum-Indexing: Adding to the Data Structure Lower Bounds |
November 18, 2021 | Martin Ritzert, Aarhus University Logic of Graph Neural Networks |
November 11, 2021 | Steffan Christ Sølvsten, Aarhus University Efficient External Memory Algorithms for Binary Decision Diagram Manipulation |
November 4, 2021 | Peyman Afshani, Aarhus University Locality-of-Reference Optimality of Cache-Oblivious Algorithms |
October 14, 2021 | Nodari Sitchinava, University of Hawaii at Manoa Atomic Power in Forks: A Super-Logarithmic Lower Bound for Implementing Butterfly Networks in the Nonatomic Binary Fork-Join Model |
October 7, 2021 | Svend Christian Svendsen, Aarhus University Algorithms for Massive Terrains and Graphs (PhD defence) |
October 7, 2021 | Constantinos Tsirogiannis, Synopsys Inc Algorithmic Problems in Microchip Design |
September 30, 2021 | Rasmus Killmann Brogaard Petersen, Aarhus University Hierarchical categorical range counting |
September 23, 2021 | Aniket Basu Roy, Aarhus Universitet Local Search for Geometric Packing and Covering Problems |
September 16, 2021 | Nidhi Rathi, Aarhus University Connecting Fair Cake Division to Combinatorial Topology |
August 21, 2021 | Christian Kroer, Columbia University Recent Advances in Iterative Methods for Large-Scale Game Solving |
April 29, 2021 | Allan Grønlund, Aarhus University Margins are Insufficient for Explaining Gradient Boosting |
March 25, 2021 | Peyman Afshani, Aarhus University Polynomial technique in computational geometry |
March 18, 2021 | Pingan Cheng, Aarhus University Lower Bounds for Semialgebraic Range Searching and Stabbing Problems |
March 11, 2021 | Svend Christian Svendsen, Aarhus University Multi-Directional Flow on a Terrain |
February 25, 2021 | Mark Simkin, Aarhus University Robust Property-Preserving Hash Functions for Hamming Distance and More |
February 18, 2021 | Ora Nova Fandina, Aarhus University Dimensionality Reduction: A Theoretical Perspective On Practical Measures |
February 11, 2021 | Srikanth Srinivasan, Aarhus University The Probabilistic degree of Boolean functions |
December 8, 2020 | Jesper Steensgaard, Aarhus Universitet Further Unifying the Landscape of Cell Probe Lower Bounds |
December 1, 2020 | Peyman Afshani, Aarhus University A lower bound for dynamic fractional cascading |
November 24, 2020 | Mark Simkin, Aarhus University Optimal Oblivious Priority Queues |
November 17, 2020 | Chris Schwiegelshohn, Aarhus University |
November 10, 2020 | Ioannis Caragiannis, Aarhus University |
November 3, 2020 | Gerth Stølting Brodal, Aarhus University Soft Sequence Heaps |
October 27, 2020 | Alexander Mathiasen, Aarhus University What if Neural Networks had SVDs? |
October 25, 2020 | Mads Bech Toftrup, Aarhus University The Power of Uniform Sampling for Coresets |
October 20, 2020 | Pingan Cheng, Aarhus University 2D Fractional Cascading on Axis-aligned Planar Subdivisions |
March 3, 2020 | Ioannis Caragiannis, Aarhus University Relaxing the independence assumption in sequential posted pricing, prophet inequality, and random bipartite matching |
July 19, 2019 | Daniel Noble, University of Pennsylvania Private Set Intersection with Linear Communication from General Assumptions |
June 25, 2019 | Ueli Maurer, ETH Zürich Constructive Cryptography (and more) |
May 9, 2019 | Mark Abspoel, CWI MPC over rings |
April 1, 2019 | Carsten Baum, Bar Ilan University Insured MPC: Efficient Secure Computation with Financial Penalties |
March 20, 2019 | Jelani Nelson, Harvard University BPTree: an l2 heavy hitters algorithm using constant memory |
March 7, 2019 | Anders Dalskov, Department of Computer Science, Aarhus University Secure Evaluation of Quantized Neural Networks |
February 14, 2019 | Ivan Damgård, Department of Computer Science, Aarhus University A Communication Lower Bound for Statistically Secure MPC with Preprocessing |
October 16, 2018 | Jeffrey Phillips, University of Utah Scalable Spatial Scan Statistics with Coresets |
October 10, 2018 | Chris Brzuska, Aalto University State-separation for game-playing proofs |
August 28, 2018 | Elena Pagnin, Chalmers University of Technology Multi-Key Homomorphic Authenticators |
June 26, 2018 | Andrej Bogdanov, Chinese University of Hong Kong Optimal extractors for generalized Santha-Vazirani sources |
June 25, 2018 | Nodari Sitchinava, University of Hawaii Sorting in the Asymmetric External Memory Model |
June 7, 2018 | Kobbi Nissim, Georgetown University Distribution and privacy: Computing heavy hitters |
October 6, 2017 | Alex Bronstein, Technion, Israel Institute of Technology Deep Neural Networks with Random Gaussian Weights: A Universal Classification Strategy? |
September 12, 2017 | Ninh Pham, University of Copenhagen Similarity Search in Hamming Space using Locality-sensitive Hashing |
August 29, 2017 | Aris Filos-Ratsikas, University of Oxford Hardness Results for Consensus-Halving and Necklace Splitting Problems |
August 24, 2017 | Luisa Siniscalchi, University of Salento Non-malleable protocols and 4-round multi-party coin tossing |
June 12, 2017 | Michele Ciampi, University of Salerno On delayed-input proof systems and round optimal 2PC with simultaneous message exchange channel |
May 22, 2017 | Franck Dufrenois, LISIC One class learning in high dimensions: null space based one class kernel fisher discriminant |
May 9, 2017 | Yilei Chen, Boston University Constraint-hiding constrained PRFs for NC1 from LWE |
May 2, 2017 | Tobias Nilges, Aarhus Universitet Maliciously Secure OLE with Constant Overhead |
May 2, 2017 | Lior Kamma, Weizmann Institute of Science Batch Sparse Recovery, or How to Leverage the Average Sparsity |
April 20, 2017 | Rebekah Mercer, UCL London Möbius: Smart Contracts for Blockchain Privacy |
April 4, 2017 | Claudio Orlandi, Aarhus University Will anyone ever use Secure Multiparty Computation |
March 28, 2017 | Bo Tang, Hong Kong Polytechnic University Exploring Non-Trivial Insights from Multidimensional Dataset |
February 1, 2017 | Stefan Schmid, Aalborg University Consistent Rerouting of Flows (with Applications to Software-Defined Networking) |
January 30, 2017 | Chaya Ganesh, New York University Efficient Zero-Knowledge Proof of Algebraic and Non-Algebraic Statements with Applications to Privacy Preserving Credentials |
December 13, 2016 | Antigoni Polychroniadou, Aarhus University Laconic Receiver Oblivious Transfer and its Applications |
November 22, 2016 | Antonio Faonio, Aarhus University Efficient Public-Key Cryptography with Bounded Leakage and Tamper Resilience |
November 10, 2016 | Thomas Dueholm Hansen, Aarhus University Recent results on linear programming and graph-based two-player zero-sum games |
November 8, 2016 | Daniel Esteban Escudero Ospina, Universidad Nacional de Colombia Multivariate Public Key Cryptosystems |
October 28, 2016 | Vladimir Podolskii, Steklov Mathematical Institute Computing Majority by Constant Depth Majority Circuits with Low Fan-in Gates |
October 17, 2016 | Matteo Campanelli, City University of New York Efficient verifiable computation with cryptocurrencies |
September 30, 2016 | Qin Zhang, Indiana University Bloomington Edit Distance: Sketching, Streaming and Document Exchange |
September 28, 2016 | Isaac A. Canales, Instituto Nacional Electoral (INE), Mexico Implementation of polynomial smoothness test, with an application to the DLP in small characteristic finite fields. |
September 20, 2016 | Antonis Thomas, ETH Zürich The Niceness of Unique Sink Orientations |
September 19, 2016 | Ivan Damgård, Aarhus University Unconditionally Secure Computation with Reduced Interaction |
September 6, 2016 | Rasmus Kyng, Yale University Lipschitz Learning on Graphs |
September 5, 2016 | Mark Simkin, Aarhus University Nearly Optimal Verifiable Data Streaming |
August 31, 2016 | Rafael Dowsley, Karlsruhe Institute of Technology A Look at the Commitment and OT Capacities of Noisy Channels |
August 18, 2016 | Sebastian Krinninger, Berkeley Approximate Shortest Paths via Hop Sets: Distributed and Dynamic Algorithms |
August 17, 2016 | Karl Bringmann, Max-Planck-Institut für Informatik Improved Pseudopolynomial Time Algorithms for Subset Sum |
June 30, 2016 | Huacheng Yu, Stanford University An Improved Combinatorial Algorithm for Boolean Matrix Multiplication |
June 29, 2016 | Omri Weinstein, Princeton University On the Communication Complexity of Approximate Fixed Points |
June 9, 2016 | Michael Raskin, Aarhus University, MADALGO Read count lower bound for nonredundant binary counters |
June 1, 2016 | Jesper With Mikkelsen, University of Southern Denmark, Odense Randomization can be as helpful as a glimpse of the future in online computation |
May 24, 2016 | Mikhail Raskin, Aarhus University Lower bounds for secure multiparty computations based on balanced secret-sharing |
May 19, 2016 | Mathias Bæk Tejs Knudsen, University of Copenhagen Linear Hashing is Awesome |
May 18, 2016 | Søren Dahlgaard, University of Copenhagen Popular Conjectures as a Barrier for Dynamic Planar Graph Algorithms |
May 12, 2016 | Sune K. Jakobsen, Queen Mary, University of London Cryptogenography: Anonymity without trust |
May 3, 2016 | Maciej Obremski, Aarhus University Afternoon of wonders and amazement sponsored by Hashing Families and Krein-Milman |
April 26, 2016 | Guang Yang, Aarhus University Randomness Extraction from Big Sources |
April 26, 2016 | Kasper Damgård, Alexandra Instituttet A/S FRESCO: Framework for Secure Computation - why you need it! |
April 19, 2016 | Jelani Nelson, Harvard University Heavy hitters via cluster-preserving clustering |
April 6, 2016 | Constantinos Tsirogiannis, Aarhus University, MADALGO Fast Phylogenetic Biodiversity Computations Under a Non-Uniform Random Distribution |
March 31, 2016 | Rasmus Pagh, IT University of Copenhagen Locality-sensitive Hashing without False Negatives |
March 29, 2016 | Huacheng Yu, Stanford University Cell-probe Lower Bounds for Dynamic Problems via a New Communication Model |
March 21, 2016 | Mike Rosulek, Oregon State University Faster Malicious 2-party Secure Computation with Online/Offline Dual Execution |
March 17, 2016 | Irene Giacomelli, Aarhus University ZKBoo: Faster Zero-Knowledge for Boolean Circuits |
March 3, 2016 | Carsten Baum, Aarhus University Efficient Secure Multiparty Computation with Identifiable Abort |
February 25, 2016 | Tobias Nilges, Aarhus University (Practical?!) UC-secure 2PC from signature cards |
February 18, 2016 | Samuel Ranellucci, Aarhus University Oblivious Transfer from Any Non-Trivial Elastic Noisy Channels via Secret Key Agreement |
February 10, 2016 | Gerth Stølting Brodal, Aarhus University, MADALGO External Memory Three-Sided Range Reporting and Top-k Queries with Sublogarithmic Updates |
February 4, 2016 | Aris Filos-Ratsikas, University of Oxford Facility Assignment with Resource Augmentation |
February 3, 2016 | Rasmus Ibsen-Jensen Evolutionary games on graphs is PSPACE-complete |
January 27, 2016 | Christian Wulff-Nilsen, University of Copenhagen Approximate Distance Oracles for Planar Graphs with Improved Query Time-Space Tradeoff |
November 30, 2015 | Riko Jacob, IT University of Copenhagen On the Complexity of List Ranking in the Parallel External Memory Model |
November 25, 2015 | Thomas Dueholm Hansen, Aarhus University Simulating Branching Programs with Edit Distance and Friends or: A Polylog Shaved is a Lower Bound Made |
November 16, 2015 | Nodari Sitchinava, University of Hawaii Recent algorithmic advances in GPGPU computing |
October 28, 2015 | John Iacono, New York University The Power and Limitations of Static Binary Search Trees with Lazy Finger |
October 7, 2015 | Mordecai Golin, HKUST Optimal Binary Comparison Search Trees |
September 29, 2015 | Mordecai Golin, HKUST Graph Evacuation Problems |
September 24, 2015 | Johannes Fischer, Technische Universität Dortmund Computing and Approximating the Lempel-Ziv-77 Factorization in Small Space |
September 10, 2015 | Mathias Rav, Aarhus University, MADALGO I/O-Efficient Event Based Depression Flood Risk |
September 2, 2015 | Frank Staals, Aarhus University, MADALGO Trajectory Grouping Structure under Geodesic Distance |
August 27, 2015 | Siu-Wing Cheng, HKUST Approximate Shortest Paths in Weighted Regions |
August 25, 2015 | Mikhail Raskin, Independent University of Moscow Social network games & Minimal mean edge weight cycles |
August 20, 2015 | Mikkel Thorup, University of Copenhagen, Denmark Deterministic Edge Connectivity in Near-Linear Time |
August 12, 2015 | Rasmus Ibsen-Jensen, IST Austria Min-cost Flows in Bipartite Graphs with Constant Targets |
June 10, 2015 | Jelani Nelson, Harvard University Toward a unified theory of sparse dimensionality reduction in Euclidean space |
June 3, 2015 | Tsvi Kopelowitz, University of Michigan Higher lower bounds from the 3SUM conjecture |
June 2, 2015 | Jared Saia, University of New Mexico Faster Agreement via a Spectral Method for Detecting Malicious Behavior |
April 30, 2015 | Moshe Babaioff, Microsoft Research A Simple and Approximately Optimal Mechanism for an Additive Buyer |
April 29, 2015 | Edith Elkind, University of Oxford Justified Representation |
April 22, 2015 | Bryan Wilkinson, Aarhus University, MADALGO Grid Heterogeneity and Square Color Range Counting |
March 27, 2015 | Viliam Lisy, Czech Technical University in Prague Convergence Guarantees of Monte-Carlo Tree Search in Concurrent and Imperfect Information Games |
March 25, 2015 | Edvin Berglin, Aarhus University, MADALGO Improving Your Exponents with Measure and Conquer |
March 11, 2015 | Thomas Dueholm, Aarhus University (CTIC) Hollow heaps |
March 5, 2015 | Raphael Clifford, University of Bristol Element Distinctness, Frequency Moments, and Sliding Windows |
February 25, 2015 | Karl Bringmann, ETH Zürich Quadratic Conditional Lower Bounds for String Problems and Dynamic Time Warping |
December 10, 2014 | Zengfeng Huang, Aarhus University, MADALGO The Deterministic Communication Complexity of Distributed ε-Approximations |
December 3, 2014 | Jesper Sindahl Nielsen, Aarhus University, MADALGO Top-k Term-Proximity in Succinct Space |
November 26, 2014 | Kasper Green Larsen, Aarhus University, MADALGO The Johnson-Lindenstrauss lemma is optimal for linear dimensionality reduction. |
November 19, 2014 | Sune Kristian Jakobsen, Queen Mary University of London Timeability of games |
November 14, 2014 | Jack Snoeyink, UNC Chapel Hill Deriving an optimal algorithm for red/blue segment intersection by degree-driven design of geometric algorithms |
October 27, 2014 | Jia Xu, Tsinghua University Ensemble learning in machine translation |
October 8, 2014 | Constantinos Tsirogiannis, Aarhus University, MADALGO Computing Distance Statistics in Trees: A Key Problem in Phylogeny Research |
October 5, 2014 | Andrea Farruggia, University of Pisa Speeding-up dynamic programming |
September 24, 2014 | Jungwoo Yang, Aarhus University, MADALGO Maintaining Contour Trees of Dynamic Terrains |
September 3, 2014 | Bryan Wilkinson, Aarhus University, MADALGO Generalized Davenport-Schinzel Sequences |
September 1, 2014 | Daniela Maftuleac, University of Waterloo Shortest path problem in rectangular complexes of global nonpositive curvature |
August 22, 2014 | Herman Haverkort, University of Eindhoven Space-filling curves for 3D mesh traversals |
August 20, 2014 | Anne Driemel, University of Eindhoven Data Structures for Trajectories, part 1 |
July 28, 2014 | Minming Li, City University of Hong Kong Algorithmic mechanism design on cloud computing and facility location |
July 16, 2014 | Konstantinos Tsakalidis, Chinese University of Hong Kong Deterministic Rectangle Enclosure and Offline Dominance Reporting |
July 2, 2014 | Karl Bringmann, Max-Planck-Institut für Informatik Why walking the dog takes time: Fréchet distance has no strongly subquadratic algorithms unless SETH fails |
June 11, 2014 | Tsvi Kopelowitz, University of Michigan Order/File-maintenance revisited again |
June 4, 2014 | Jesper Sindahl Nielsen, Aarhus University, MADALGO On Hardness of Several String Indexing Problems |
May 28, 2014 | Ran Duan, Max-Planck-Institut für Informatik A Combinatorial Polynomial Algorithm for the Linear Arrow-Debreu Market |
May 21, 2014 | Suresh Venkatasubramanian, University of Utah A directed isoperimetric inequality with application to Bregman near neighbor lower bounds |
May 19, 2014 | Ilario Bonacina, La Sapienza University of Rome Space in Proof Complexity |
May 15, 2014 | John Iacono, New York University Cache-Oblivious Persistence |
April 22, 2014 | Gerard de Melo, IIIS, Tsinghua University Link Prediction in Big Knowledge Graphs |
April 16, 2014 | Wanbin Son, Aarhus University, MADALGO Group Nearest Neighbor Queries in the L1 Plane |
April 9, 2014 | Ingo van Duijn, Aarhus University, MADALGO Identifying hidden patterns in trajectories |
March 12, 2014 | Peyman Afshani, Aarhus University, MADALGO Optimal Deterministic Shallow Cuttings for 3D Dominance Ranges |
March 5, 2014 | Tsvi Kopelowitz, University of Michigan The Family Holiday Gathering Problem or Fair and Periodic Scheduling of Independent Sets |
February 19, 2014 | Edvin Berglin, Lund University On the performance of edge coloring algorithms for cubic graphs |
February 5, 2014 | Allan Grønlund, Aarhus University, MADALGO On (min,+) matrix multiplication |
January 15, 2014 | Jeremy Barbay, University of Chile From Time to Space: Adaptive algorithms that yield fast compressed data structures |
January 13, 2014 | Hao Song, Tsinghua University Rectangle Overlay and Limited-Memory Communication Models |
December 4, 2013 | Hamid Rahkooy, Research Institute for Symbolic Computation, Hagenberg, Austria, and Centre de Recerca Matemàtica, Barcelona, Spain Solving Polynomial Equations |
December 4, 2013 | Peyman Afshani, Aarhus University, MADALGO Fast Computation of Output-Sensitive Maxima in a Word RAM |
November 29, 2013 | Darius Sidlauskas, MADALGO, Aarhus University Scalable Top-k Spatio-Temporal Term Querying |
November 20, 2013 | Tsvi Kopelowitz, University of Michigan Orienting Fully Dynamic Graphs with Worst-Case Time Bounds |
November 6, 2013 | Hsin-Hao Su, University of Michigan Distributed Algorithms for the Lovasz Local Lemma and Graph Coloring |
October 9, 2013 | Zengfeng Huang, Aarhus University, MADALGO The Communication Complexity of Distributed ε-Approximations |
October 2, 2013 | Gerth Stølting Brodal, MADALGO, Aarhus University The Encoding Complexity of Two Dimensional Range Minimum Data Structures |
September 25, 2013 | Michael Elkin, Ben-Gurion University Distributed Algorithms for Graph Coloring |
September 18, 2013 | Seth Pettie, Aarhus University/University of Michigan Sharp Bounds on Davenport-Schinzel Sequences of Every Order |
September 12, 2013 | Pankaj K. Agarwal, Duke University Range Searching and its Relatives: Theory & Practice |
August 2, 2013 | Felipe Lacerda, ETH Zürich/University of Brasilia Leakage-resilience from quantum fault-tolerance |
July 2, 2013 | Qiang-Sheng Hua, Tsinghua University Nearly Optimal Asynchronous Blind Rendezvous Algorithm for Cognitive Radio Networks |
June 20, 2013 | David Xiao, CNRS, Université Paris 7 Languages with Zero-Knowledge PCPs are in SZK |
June 19, 2013 | Casper Kejlberg-Rasmussen, Aarhus University, MADALGO I/O-Efficient Planar Range Skyline and Attrition Priority Queues |
June 19, 2013 | Bjoern Tackmann, ETH Constructive Cryptography -- Introduction and Current Trends |
June 18, 2013 | Bruno Grenet, Université de Rennes 1 Representation of polynomials, algorithms and lower bounds |
June 6, 2013 | Payman Mohassel, University of Calgary Private Function Evaluation: A General Framework and Efficient Realizations |
May 29, 2013 | Jesper Asbjørn Sindahl Nielsen, Aarhus University, MADALGO Expected Linear Time Sorting for Word Size Ω(log² n log log n) |
May 23, 2013 | Tore Frederiksen, Aarhus University MiniLEGO: Efficient Secure Two-Party Computation From General Assumptions |
May 22, 2013 | Grigory Yaroslavtsev, Pennsylvania State University Beating the Direct Sum in Communication Complexity |
May 16, 2013 | Mikkel Thorup, University of Copenhagen The Power of Tabulation Hashing |
May 8, 2013 | Bryan Wilkinson, Aarhus University, MADALGO Range Searching in Query-Dependent Categories |
May 8, 2013 | Ben Sach, University of Warwick Sparse Suffix Tree Construction in Small Space |
May 8, 2013 | Markus Jalsenius, University of Warwick Lower Bounds for Streaming Problems |
May 7, 2013 | Ilias Diakonikolas, University of Edinburgh A Complexity View on Unsupervised Learning |
May 2, 2013 | Oana Ciobotaru, Saarland University Rational cryptography: novel constructions, automated verification and unified definitions |
April 24, 2013 | Sarfraz Raza, Aarhus University, MADALGO Centerpoints and Tverberg's Technique |
April 24, 2013 | Cristina Onete, CASED - Technische Universität Darmstadt Anonymity and PKE |
April 2, 2013 | Francesco Lettich, University of Venice Iterated spatial joins on GPUs |
March 25, 2013 | Axel Schroepfer, SAP Performance Optimization for Secure Computation Protocols and Prototypical Results |
March 22, 2013 | Goutam Paul, Jadavpur University, Kolkata, India, and visiting RWTH Aachen, Germany A Look into (Non)-Randomness of RC4 Stream Cipher |
March 18, 2013 | Periklis A. Papakonstantinou, ITCS/IIIS Tsinghua University How Hard are the DDH hard Groups? |
March 15, 2013 | Pavel Hubacek, Aarhus University Eliciting Truth using Proper Scoring Rules |
March 11, 2013 | Pratyay Mukherjee, Aarhus University Tamper Resilient Cryptography Without Self-Destruct |
March 6, 2013 | Ludwig Schmidt, Massachusetts Institute of Technology (MIT) The Constrained Earth Mover Distance Model with Applications to Compressive Sensing |
March 5, 2013 | Balagopal Komarath, IIT Madras Pebbling, Entropy and Branching program size lower bounds |
February 28, 2013 | Thore Husfeldt, IT University of Copenhagen The Parity of Directed Hamiltonian Cycles |
February 11, 2013 | Frédéric Dupuis, Aarhus University Min-entropy sampling, with applications to cryptography in the bounded quantum storage model |
January 30, 2013 | Yan Huang, University of Maryland Efficient Secure Two-Party Computation --- Implementation and Design Improvements |
January 29, 2013 | Yan Huang, University of Maryland Efficient Secure Two-Party Computation --- Implementation and Design Improvements |
January 28, 2013 | Tomas Toft, Aarhus University Sublinear Vickrey Auctions |
January 23, 2013 | Stijn Koopal, Technische Universiteit Eindhoven An Experimental Evaluation of Various Approximate Watershed Algorithms |
January 21, 2013 | Ivan Damgård, Aarhus University Monotone Formulae and Protocol Design |
January 16, 2013 | Fang Song, Penn State University Cryptography in a quantum world |
January 14, 2013 | Antigoni Polychroniadou, Aarhus University New Directions In Recovering Noisy RSA Keys |
January 13, 2013 | Kevin Matulef, Aarhus University Mongo DB |
December 12, 2012 | Jesper Sindahl Nielsen, Aarhus University, MADALGO Finger search in the implicit model |
December 5, 2012 | Casper Kejlberg-Rasmussen, Aarhus University, MADALGO I/O-Efficient Planar Range Skyline and Attrition Priority Queues |
November 28, 2012 | Mark de Berg, Technische Universiteit Eindhoven Kinetic Data Structures in the Black-Box Model |
November 21, 2012 | Constantinos Tsirogiannis, Aarhus University, MADALGO Fast Generation of Multiple Resolution Raster Data Sets |
November 14, 2012 | Zhewei Wei, Aarhus University, MADALGO Space Complexity of 2-Dimensional Approximate Range Counting |
November 12, 2012 | Andy Twigg, University of Oxford Persistent Streaming Indexes |
November 12, 2012 | Sebastian Faust, Aarhus University Leakage-Resilient Symmetric Cryptography |
November 9, 2012 | Joshua Brody, Aarhus University Certifying Equality with Limited Interaction |
November 7, 2012 | Kasper Green Larsen, Aarhus University, MADALGO Lower Bounds for Data Structures |
November 2, 2012 | Nadia Heninger, Microsoft Mining Your Ps and Qs: Detection of Widespread Weak Keys in Network Devices |
October 10, 2012 | Wei Yu, Aarhus University Budget Error-Correcting under Earth-Mover Distance |
October 8, 2012 | Angela Zottarel, Aarhus University Users key leakage resilient IBE scheme based on the DLIN assumption in the continual leakage model |
October 3, 2012 | Bryan Wilkinson, Aarhus University, MADALGO Adaptive and Approximate Orthogonal Range Counting |
October 1, 2012 | Claudio Orlandi, Aarhus University Calling out Cheaters: Covert Security With Public Verifiability |
September 26, 2012 | Hossein Jowhari, MADALGO, Aarhus University Fast Protocols for Edit Distance through Locally Consistent Parsing |
September 24, 2012 | Daniele Venturi, Aarhus University Witnessing Equivalence between Leakage Tolerance and Adaptive Security |
September 19, 2012 | Darius Sidlauskas, Aarhus University, MADALGO Parallel Main-Memory Indexing for Moving-Object Query and Update Workloads |
September 17, 2012 | Sunoo Park, Aarhus University Efficient Public Key Cryptography from LPN |
September 10, 2012 | Jesper Buus Nielsen, Aarhus University On the Value of Cryptographic Cheap Talk |
September 6, 2012 | Jakob Truelsen, MADALGO, Aarhus University Simplifying Massive Contour Maps |
August 2, 2012 | Margarita Vald, Boston University Universally Composable Security with Local Adversaries |
August 1, 2012 | Jeff Phillips, University of Utah Discrepancy for Kernel Range Spaces |
July 5, 2012 | Claudio Orlandi, Aarhus University Privacy-Aware Mechanism Design |
June 27, 2012 | Daniele Venturi, Aarhus University Rate-Limited Secure Function Evaluation |
June 26, 2012 | Periklis Papakonstantinou, Tsinghua University Space-Bounded Communication Complexity |
June 25, 2012 | Periklis Papakonstantinou, Tsinghua University Streaming Cryptography |
June 21, 2012 | Ranganath Kondapally, Dartmouth College Time-Space Tradeoff for Mutual Exclusion |
June 18, 2012 | Mehul Bhatt, University of Bremen The Shape of Empty Space - Human-Centred Computational Foundations for Understanding Built-Up Space |
June 15, 2012 | Pavel Pudlak, Institute of Mathematics, Prague How difficult is it to construct a Ramsey graph? |
June 14, 2012 | Irit Katriel Streamulus - A language for real-time event stream processing |
June 13, 2012 | Srinivasa Rao Satti, Seoul National University College of Engineering B-tree indexes for flash memory |
June 6, 2012 | Freek van Walderveen, Aarhus University, MADALGO Computing Betweenness Centrality in External Memory |
June 4, 2012 | Kostas Tzoumas, Technical University of Berlin Big Data Analytics with Stratosphere |
May 31, 2012 | Markus Bläser, Universität des Saarlandes Noncommutativity Makes Determinants Hard |
May 31, 2012 | Antigoni Polychroniadou, Royal Holloway University of London A Coding-Theoretic Approach to Recovering Noisy RSA Keys |
May 30, 2012 | Qin Zhang, MADALGO, Aarhus University Tight Bounds for Distributed Streaming |
May 24, 2012 | Carola Winzen, MPI Saarbrücken Playing Mastermind with Many Colors |
May 22, 2012 | Zhewei Wei, Aarhus University, MADALGO Range Summary Queries |
May 10, 2012 | Christian Konrad, University Paris Diderot Language and Graph Problems in the Streaming Model |
May 9, 2012 | Ulrich Meyer, Goethe University Frankfurt am Main I/O-efficient hierarchical diameter approximation |
May 2, 2012 | Pavel Hubacek, Aarhus University On Solution Concepts for Rational Cryptography |
May 2, 2012 | Hossein Jowhari, Aarhus University, MADALGO Near-optimal space bounds for Lp samplers |
April 25, 2012 | Bernardo David, University of Brasilia Universally Composable Oblivious Transfer from Lossy Encryption and Coding Assumptions |
April 24, 2012 | Nodari Sitchinava, Karlsruhe Institute of Technology A Parallel Buffer Tree |
April 23, 2012 | Jesper Buus Nielsen, Aarhus University Better Models for Cryptography |
April 20, 2012 | Kurt Mehlhorn, Max-Planck-Institut für Informatik Physarum Computations |
April 12, 2012 | Jesper Buus Nielsen, Aarhus University Efficient Homomorphic Commitments |
April 11, 2012 | Peyman Afshani, MADALGO, Aarhus University Improved Pointer Machine and I/O Lower Bounds for Simplex Range Reporting and Related Problems |
April 2, 2012 | Eleftherios Anastasiadis, University of Liverpool Truthful approximation algorithms for combinatorial auctions |
March 30, 2012 | Rikke Bendlin, Aarhus University How to Share a Lattice Trapdoor |
March 12, 2012 | Morten Dahl, Aarhus University Computational Soundness of Observational Equivalence - A paper by Comon-Lundh and Véronique Cortier |
March 2, 2012 | Daniele Venturi, Aarhus University Entangled Cloud Storage |
February 24, 2012 | Mariana Raykova, Columbia University Secure Computation in Heterogeneous Environments: How to Bring Multiparty Computation Closer to Practice? |
February 16, 2012 | Aris Filos-Ratsikas, University of Patras An improved 2-agent kidney exchange mechanism |
February 9, 2012 | Hossein Naeemi, Maastricht University Implementability in multidimensional domains, a network approach |
February 8, 2012 | Jian Li, IIIS, Tsinghua University When LP is the Cure for Your Matching Woes: Improved Bounds for Stochastic Matchings |
February 8, 2012 | Constantinos Tsirogiannis, MADALGO, Aarhus University Flow Modelling on Triangulated Terrains: Computational Problems in Theory and Practice |
January 27, 2012 | Jonas Kölker, Aarhus University The NP-hardness of selected puzzles |
January 26, 2012 | John Steinberger, Tsinghua University IIIS Hellinger Distance and adaptivity |
January 17, 2012 | Kristian Gjøsteen, The Norwegian University of Science and Technology A reformulation of Canetti's UC framework |
December 19, 2011 | Troels Bjerre Sørensen, University of Warwick Repeated zero-sum games with budget |
December 16, 2011 | Uri Zwick, Tel Aviv University Maximum Overhang |
December 14, 2011 | Kasper Green Larsen, Aarhus University, MADALGO On Range Searching in the Group Model and Combinatorial Discrepancy |
December 7, 2011 | Lap-Kei Lee, MADALGO, Aarhus University Edit Distance to Monotonicity in Sliding Windows |
December 5, 2011 | Ioannis Emiris, University of Athens Euclidean embeddings of distance graphs |
November 30, 2011 | Michele Budinich, IMT Lucca Institute for Advanced Studies Bounded Rationality via Computational Limitations: Two Simple Examples |
November 29, 2011 | Sebastian Faust, Department of Computer Science, Aarhus University Leakage-Resilient Cryptography From the Inner-Product Extractor |
November 28, 2011 | Raphael Clifford, Bristol University Lower bounds for online integer multiplication and convolution in the cell-probe model |
November 22, 2011 | Krzysztof Pietrzak, IST, Austria Commitments and Efficient Zero-Knowledge from Hard Learning Problems |
November 17, 2011 | Jun Tarui, University of Electro-Communications (Tokyo) Finding a Duplicate in a Stream |
November 16, 2011 | Wei Yu, Aarhus University, MADALGO Data Structure Lower Bounds from Predecessor |
November 15, 2011 | Angela Zottarel, Department of Computer Science, Aarhus University Signature Schemes Secure against Hard-to-Invert Leakage |
November 9, 2011 | Valerio Pastro, Aarhus University Multiparty Computation from Somewhat Homomorphic Encryption |
November 9, 2011 | Asano Tetsuo, Japan Advanced Institute of Science and Technology Designing Algorithms with Limited Work Space |
October 17, 2011 | Kent Andersen, Department of Mathematics, Aarhus University On the relationship between mixed integer optimization and lattice point free sets |
October 7, 2011 | Christophe Tartary, Tsinghua University Planar Graphs, Colorings and Distributed Computation in Non-Abelian Groups |
October 5, 2011 | Glynn Winskel, Cambridge University Concurrent Games |
September 29, 2011 | Simina Branzei, Aarhus University Weighted Clustering |
September 28, 2011 | Jean Lancrenon, Université Joseph Fourier(UJF) Remote object authentication protocols |
September 28, 2011 | Djamal Belazzougui, Université Paris Diderot Applications of Minimal Perfect Hashing in Compressed Full-Text Indexing |
September 26, 2011 | Jean-Charles Faugère, INRIA and UPMC, France Gröbner Bases of structured polynomial systems. Applications to Cryptology |
September 21, 2011 | Jeff Erickson, University of Illinois Tracing Curves on Triangulated Surfaces |
September 19, 2011 | Ratnik Gandhi, Tata Institute of Fundamental Research Nash Equilibria computation with Group Actions |
September 12, 2011 | Moshe Lewenstein, Bar-Ilan University Fast, precise and dynamic distance queries |
August 30, 2011 | Jens Groth, Department of Computer Science, UCL Gower Street, London Short Non-interactive Zero-Knowledge Proof |
August 29, 2011 | Francesco Silvestri, University of Padova Resilient Dynamic Programming |
August 24, 2011 | Or Sheffet, CMU Stability Yields a PTAS for k-Median and k-Means Clustering |
August 19, 2011 | Swastik Kopparty, MIT The complexity of powering in finite fields |
August 4, 2011 | Nick Gravin, Nanyang Technological University, Singapore Frugal Mechanism Design via Spectral Techniques |
June 29, 2011 | Alejandro (Alex) Lopez-Ortiz, University of Waterloo Efficient scheduling of equal size tasks in multiple machines |
June 27, 2011 | Konstantinos Tsakalidis, MADALGO, Aarhus University Dynamic Planar Range Maxima Queries |
June 23, 2011 | Andrej Brodnik, University of Primorska Vehicle and Crew Scheduling in public transport |
June 22, 2011 | Qin Zhang, MADALGO, Aarhus University An Efficient Sketch for Earth-Mover Distance |
June 8, 2011 | Nodari Sitchinava, MADALGO, Aarhus University I/O-optimal parallel distribution sweeping for private-cache chip multiprocessors |
June 1, 2011 | Gerth Stølting Brodal, Aarhus University Integer Representations towards Efficient Counting in the Bit Probe Model |
May 25, 2011 | Peyman Afshani, Dalhousie University External Memory Lower Bounds for Angular Sorting and Sorted Nearest Neighbor Queries |
May 24, 2011 | Robert Weismantel, Institute for Operations Research, ETH Zürich, Switzerland About optimization of nonlinear functions over integer points in polyhedra |
May 18, 2011 | Pooya Davoodi, Aarhus University Succinct Dynamic Cardinal Trees with Constant Time Operations for Small Alphabet |
May 12, 2011 | Aaron Archer, AT&T Shannon Research Laboratory Improved Approximation Algorithms for the Prize-Collecting Steiner Tree Problem |
May 4, 2011 | Konstantinos Tsichlas, Aristotle University of Thessaloniki Some Complex Problems without Complexities |
April 13, 2011 | Thomas Mølhave, Duke University From Point Clouds to 2D and 3D Grids: A Natural Neighbor Interpolation Algorithm using the GPU |
April 11, 2011 | Mahyar Salek, University of Southern California Frugal Procurement Auction Design |
March 28, 2011 | Philipp Hupp, Technische Universität München Memory Efficient Algorithms for Sparse Grids |
March 28, 2011 | Gero Greiner, Technische Universität München Sparse Matrix Multiplications in the I/O-Model |
March 21, 2011 | Kasper Green Larsen, Aarhus University (Approximate) Uncertain Skylines |
March 21, 2011 | Kasper Green Larsen, Aarhus University Range Selection and Median: Tight Cell Probe Lower Bounds and Adaptive Data Structures |
March 18, 2011 | Jonas Kölker, Aarhus University Multiparty Computation with Storage Servers, or: Combining Secure Computation and I/O-Efficient Algorithms |
March 9, 2011 | Thomas Jakobsen, Aarhus University Key Management in the Cloud |
March 9, 2011 | Thomas Dueholm Hansen, Aarhus University Subexponential lower bounds for randomized pivoting rules for the simplex algorithm |
March 4, 2011 | Elad Verbin, Aarhus University An Information-Theoretic Approach for Black Box Separations in Cryptography |
March 2, 2011 | Dominik Scheder, ETH Zürich A Full Derandomization of Schöning's k-SAT Algorithm |
February 23, 2011 | Rasmus Ibsen-Jensen, Aarhus University The complexity of solving reachability games using value and strategy iteration |
January 12, 2011 | Xiaotie Deng, University of Liverpool Market Making and Solution Concepts |
January 12, 2011 | Andrew McGregor, University of Massachusetts, Amherst Data Streams, Dyck Languages, and Detecting Dubious Data Structures |
December 9, 2010 | Sebastian Faust, K.U. Leuven, Belgium Tamper Resilient Circuits: How to Trade Leakage for Tamper-Resilience? |
December 8, 2010 | Jakob Truelsen, Aarhus University/MADALGO A Cache-Oblivious Implicit Dictionary with the Working Set Property |
December 3, 2010 | Lasse Kosetski Deleuran, Aarhus University/MADALGO Computing Homotopic Simplification in a Plane |
November 22, 2010 | Nicole Immorlica, Northwestern University, USA Dueling Algorithms |
November 17, 2010 | Pooya Davoodi, Aarhus University Path Minima on Dynamic Weighted Trees |
November 14, 2010 | Thomas Stidsen, DTU Dantzig-Wolfe decomposition: A way to change hard problems into more manageable problems |
November 10, 2010 | Elad Verbin, Aarhus University An exposition of Barak et al's direct sum theorem |
October 6, 2010 | Qin Zhang, Aarhus University/MADALGO Optimal Sampling from Distributed Streams |
September 29, 2010 | Kord Eickmeyer, Humboldt-Universität zu Berlin Randomisation and Derandomisation in Descriptive Complexity |
September 29, 2010 | Elad Verbin, Aarhus University The Coin Problem, and Pseudorandomness for Branching Programs |
September 23, 2010 | Philip Bille, Technical University of Denmark Random Access to Grammar Compressed Strings |
September 22, 2010 | Yakov Nekrich, University of Bonn Dynamic External Memory Range Reporting in 3-D |
September 17, 2010 | Elad Verbin, Aarhus University Approximation Algorithms based on Semidefinite Programming |
September 10, 2010 | Jeff Phillips, University of Utah Comparing Distributions and Shapes with the Kernel Distance |
September 1, 2010 | Pooya Davoodi, Aarhus University/MADALGO On Space Efficient Two Dimensional Range Minimum Data Structures |
August 26, 2010 | Jason Hartline, Northwestern Approximation and Mechanism Design |
June 23, 2010 | Freek van Walderveen, Aarhus University, MADALGO Cleaning massive sonar point clouds |
June 9, 2010 | Morten Revsbæk, MADALGO, Aarhus University I/O-Efficient Computation of Water Flow Across a Terrain |
May 20, 2010 | Faith Ellen, University of Toronto The Space Complexity of Unbounded Timestamps |
May 19, 2010 | Deepak Ajwani, MADALGO/Aarhus University I/O-efficient Topological Ordering of DAGs with Small Path Cover |
May 12, 2010 | Casper Kejlberg-Rasmussen, MADALGO, Aarhus University k-order Voronoi Diagrams in External Memory |
May 10, 2010 | Florian Horn, CNRS Complexity of the winner problem in Muller games |
May 5, 2010 | John Iacono, Polytechnic Institute of New York University Mergeable Dictionaries |
April 27, 2010 | Mihai Patrascu, AT&T Labs - Research Dynamic Lower Bounds |
April 21, 2010 | Jens-Christian Svenning, Aarhus University Ecoinformatics - the computing approach to ecology |
April 14, 2010 | Ivan Damgård and Jonas Kölker, Aarhus University Multiparty Computation with Storage Servers, or: Combining Secure Computation and I/O-Efficient Algorithms |
April 13, 2010 | Elias Tsigaridas, Aarhus University Multivariate (aggregate) separation bounds |
March 10, 2010 | Peyman Afshani, Aarhus University, MADALGO Orthogonal Range Reporting: Query lower bounds, optimal structures in 3-d, and higher-dimensional improvements |
February 24, 2010 | Shervin Daneshpajouh, Sharif University of Technology Computing minimum-link homotopic simplification of general paths |
February 23, 2010 | Srikanth Srinivasan, Institute of Mathematical Sciences, Chennai On the Hardness of the Noncommutative Determinant |
February 3, 2010 | Nodari Sitchinava, Aarhus University, MADALGO Computational Geometry in the PEM model |
January 13, 2010 | Kent Andersen, Otto-Von-Guericke University Magdeburg Integer optimization, lattice point free sets and zero-coefficient cuts |
December 10, 2009 | Kostas Tsakalidis, MADALGO, Aarhus University Dynamic 3-sided Planar Range Queries with Expected Doubly Logarithmic Time |
December 9, 2009 | Morteza Monemizadeh, University of Dortmund Coresets and Sketches for High Dimensional Subspace Approximation Problems |
December 2, 2009 | Jakob Truelsen, MADALGO, Aarhus University Approximating the Mode and Determining Labels with Fixed Frequency |
December 2, 2009 | Uffe Heide-Jørgensen, Department of Mathematics, Aarhus University Quadratic complexity of the permanent and dual varieties |
November 25, 2009 | Morten Revsbæk, MADALGO, Aarhus University I/O-efficient Contour Tree Simplification |
November 24, 2009 | Sourav Chakraborty, Technion Reconstruction of Codes |
November 19, 2009 | Sourav Chakraborty, Technion Market Equilibrium with Transaction Costs |
November 18, 2009 | Elad Verbin, ITCS, Tsinghua University, China The Limits of Buffering: A Lower Bound for Membership Data Structures in the External Memory Model |
November 17, 2009 | Joshua Brody, Dartmouth College Multiround lower bounds for Gap Hamming via Round Elimination |
November 16, 2009 | Hamish Carr, University College Dublin Applications & Questions in Topological Visualization |
November 11, 2009 | Allan Grønlund Jørgensen, MADALGO, Aarhus University Data Structures for Range Median Queries |
November 10, 2009 | Vladimir V. Podolskii, Steklov Mathematical Institute Bounds on Coefficients of Integer Polynomials with a Given Boolean Sign-function |
November 4, 2009 | Kasper Dalgaard Larsen, MADALGO, Aarhus University Orthogonal Range Reporting in Three and Higher Dimensions |
November 3, 2009 | Joshua Brody, Dartmouth The NOF Communication Complexity of Multiparty Pointer Jumping |
October 30, 2009 | Kousha Etessami, Edinburgh The complexity of Nash equilibria and other fixed points |
October 30, 2009 | Uri Zwick, Tel Aviv Local improvement algorithms and Policy iteration algorithms |
October 29, 2009 | Rasmus Pagh, IT University of Copenhagen Storing a Compressed Function with Constant Time Access |
October 7, 2009 | Peyman Afshani, MADALGO, Aarhus University Instance-Optimal Geometric Algorithms |
October 6, 2009 | Christian Knauer, Freie Universität Berlin The curse of dimensionality (somewhat) explained |
September 30, 2009 | Norbert Zeh, Dalhousie University Optimal Cache-Oblivious Range Reporting Requires Superlinear Space |
September 16, 2009 | Mohammad Ali Abam, MADALGO, Aarhus University Geometric Spanners for Weighted Point Sets |
August 20, 2009 | Martin Smerek, Masaryk University I/O-efficient Symbolic Model Checking |
August 19, 2009 | Ke Yi, Hong Kong University of Science and Technology Dynamic indexability and lower bounds for dynamic one-dimensional range query indexes |
July 23, 2009 | Elias P. Tsigaridas, INRIA Algebraic Algorithms and (some) Geometric Applications |
June 4, 2009 | Peter Bro Miltersen, Aarhus University, Department of Computer Science Applications of semi-algebraic geometry in (computational) game theory |
May 15, 2009 | Martin Olsen, Aarhus Universitet Maximizing PageRank with new Backlinks |
May 15, 2009 | Kristoffer Arnsfelt Hansen, University of Aarhus, Department of Computer Science Introduction to Semi-algebraic Geometry |
May 12, 2009 | Bireswar Das, IMSc, Chennai SZK Proofs for Black Box Group Problems |
May 1, 2009 | Thomas Mølhave, MADALGO/Aarhus Universitet I/O-Efficient Algorithms for Computing Contour Lines on a Terrain |
April 24, 2009 | Jelani Nelson, Massachusetts Institute of Technology (MIT) Revisiting Norm Estimation in Data Streams |
April 23, 2009 | Eric Price, Massachusetts Institute of Technology (MIT) Lower Bounds in Compressed Sensing |
April 21, 2009 | Peyman Afshani, University of Aarhus, Department of Computer Science Maximum Cliques in Unit-disk Graphs and its Relation to Shape Covering and Graph Embedding |
April 17, 2009 | Mohammad Ali Abam, MADALGO, Aarhus University Kinetic spanner |
March 9, 2009 | Nguyen Kim Thang, Ecole Polytechnique Scheduling Games in the Dark |
March 6, 2009 | Kasper Dalgaard Larsen, MADALGO, Aarhus University Towards Optimal Three-Dimensional Range Search Indexing |
February 24, 2009 | Kristoffer Arnsfelt Hansen, University of Aarhus, Department of Computer Science The Kakeya problem in finite fields and applications to randomness extraction |
February 20, 2009 | Freek van Walderveen, Aarhus University Space-filling curves for efficient spatial index structures |
February 13, 2009 | Deepak Ajwani, MADALGO/Aarhus University Computing on Solid-State disks: Modeling and Algorithmic challenges |
February 10, 2009 | Thore Husfeldt, ITU Copenhagen and Lund University Computing the Tutte Polynomial in Vertex-Exponential Time |
February 3, 2009 | Peyman Afshani, MADALGO, Aarhus University Optimal Halfspace Range Reporting in 3-d |
February 3, 2009 | Taso Viglas, University of Sydney The good, the bad and the uninformed |
December 19, 2008 | Jérémy Barbay, Universidad de Chile Compressed Representations of Permutations, and Applications |
December 18, 2008 | Kostas Tsichlas and Spyros Sioutas, Aristotle University of Thessaloniki and Ionian University Deterministic Structures over P2P Networks |
November 27, 2008 | Allan Grønlund Jørgensen, Aarhus University, MADALGO Selecting Sums in Arrays |
November 20, 2008 | Peter Hachenberger, MADALGO, Aarhus University Using the right BVH in each situation |
November 18, 2008 | Jesper Buus Nielsen, CAGT, Aarhus University Privacy-enhancing first-price auctions using rational cryptography |
November 13, 2008 | Mark Greve, Aarhus University, MADALGO Online Sorted Range Reporting |
November 11, 2008 | Daniel Andersson, DAIMI/CAGT Deterministic Graphical Games Revisited |
November 6, 2008 | Deepak Ajwani, MADALGO/DAIMI Incremental Topological Ordering |
November 4, 2008 | Orestis Telelis, CAGT/DAIMI On Pure and (approximate) Strong Equilibria of Facility Location Games |
October 30, 2008 | Srinivasa Rao, MADALGO, University of Aarhus On Secondary Indexing in One Dimension |
October 28, 2008 | Peter Bro Miltersen, CAGT, Department of Computer Science, Aarhus University On the computational complexity of solving stochastic mean-payoff games |
September 24, 2008 | Rasmus Pagh, IT University of Copenhagen Searching a Sorted Table with O(1) Accesses OR Bee-Trees: How to find your way with a very little Brain |
September 18, 2008 | Michael T. Goodrich, University of California, Irvine Studying Road Networks Through an Algorithmic Lens |
September 4, 2008 | Mikkel Thorup, AT&T Labs-Research Efficient Cuts via Greedy Tree Packing |
August 27, 2008 | Michal Koucky, Czech Academy of Sciences Amplifying Lower Bounds by Means of Self-Reducibility |
August 15, 2008 | John Iacono, Polytechnic Institute of New York University Blasting Transdichotomous Atomic Rambo |
July 2, 2008 | Jeff Phillips, Duke University Creating ε-Samples for Terrains |
July 1, 2008 | Jeremy T. Fineman, Massachusetts Institute of Technology Cache-Oblivious Streaming B-Trees |
July 1, 2008 | Vladimir Gurvich, University of Aarhus and RUTCOR, Rutgers University Generating Vertices of a Polyhedron is Hard |
June 11, 2008 | Maurice Jansen, Centre for Theory in Natural Sciences, University of Aarhus Lower Bounds for Syntactically Multilinear Algebraic Branching Programs |
June 6, 2008 | Seth Pettie, University of Michigan Analyzing Splay Trees Using Davenport-Schinzel Sequences |
May 22, 2008 | Kord Eickmeyer, Humboldt-Universität Berlin Approximation of natural W[P]-complete Minimisation Problems is hard |
May 20, 2008 | Thomas Dueholm Hansen, CAGT/DAIMI On Range of Skill |
May 19, 2008 | Dmitriy Morozov, Duke University Persistence-Sensitive Simplification Simplified |
May 6, 2008 | Lars Bach, University of Lund Examples of evolutionary game theory |
April 22, 2008 | Hugo Gimbert, LABRI, CNRS, Bordeaux, France Solving Simple Stochastic Games |
April 22, 2008 | Peter Hachenberger, Eindhoven University of Technology Boolean Operations on Polyhedra and 3D Minkowski Sums |
March 25, 2008 | David C. Parkes, Harvard University Coordination Mechanisms for Dynamic Multi-Agent Environments |
March 18, 2008 | Oded Lachish, University of Warwick Sound 3-query PCPPs are Long |
March 11, 2008 | Wan Huang, London School of Economics Computing Extensive Form Correlated Equilibrium |
March 11, 2008 | Bernhard von Stengel, London School of Economics Strategic characterization of the index of an equilibrium |
February 26, 2008 | Anastasios Sidiropoulos, Massachusetts Institute of Technology Algorithmic Embeddings into Low-Dimensional Spaces |
February 26, 2008 | Martin M. Andersen, School of Economics and Management How to Maximize the Likelihood Function for a Dynamic Stochastic General Equilibrium Model |
February 26, 2008 | Rune Mølgaard, School of Economics and Management Optimal Consumption, Portfolio, Leisure and House Size Choice with Stochastic House Prices, Wage and Interest Rates |
February 15, 2008 | Norbert Zeh, Dalhousie University A faster cache-oblivious shortest-path algorithm for undirected graphs with bounded edge lengths |
February 14, 2008 | Ian Munro, University of Waterloo Integer Representation and Counting in the Bit Probe Model |
February 13, 2008 | Vangelis Markakis, CWI Algorithms for Computing Approximate Nash Equilibria in Bimatrix Games |
January 22, 2008 | Nicole Immorlica Secretary Problems and Extensions |
January 17, 2008 | Herman Haverkort, Eindhoven University of Technology I/O-efficient flow modeling on fat-triangulated surfaces |
December 11, 2007 | Rocio Santillan, DAIMI Phase transitions in satisfiability |
November 27, 2007 | Peter Bro Miltersen, CAGT/DAIMI The "Names in boxes" game and the problem of efficiently answering EVERY database query |
November 26, 2007 | Oren Weimann, Massachusetts Institute of Technology (MIT) Finding an Optimal Tree Searching Strategy in Linear Time |
November 22, 2007 | Kevin Chang, Max Planck Institute for Computer Science, Saarbrucken, Germany Multiple pass algorithms for model selection and other clustering problems |
November 13, 2007 | Arne Andersson, Uppsala University and Trade Extensions From Theory to Practice - Running a High-Tech E-Commerce Company |
November 13, 2007 | Jim Wilenius, Uppsala University Analyzing Combinatorial vs. Single-Bid Auctions |
November 9, 2007 | Rajeev Raman, University of Leicester On the size of succinct indices |
November 2, 2007 | Andreas Emil Feldmann, RWTH Aachen University Computing Approximate Equilibria in Network Congestion Games |
October 30, 2007 | Orestis Telelis, DAIMI/CAGT Distributed Selfish Replication |
October 24, 2007 | Inge Li Gørtz, DTU The Dial-a-Ride Problem |
October 16, 2007 | Daniel Andersson, CAGT/DAIMI Widest Path Interdiction |
October 2, 2007 | Nikolaos Triandopoulos, CAGT/DAIMI Survey of Rational Cryptography |
September 25, 2007 | Søren Asmussen and Lester Lipsky, IMF, Aarhus University and Dept. of Computer Science, Storrs, CT A probabilistic study of the RESTART scheme for failure recovery. |
September 18, 2007 | Troels Bjerre Sørensen, CAGT/DAIMI Computing Proper Equilibria of Extensive Form Games |
September 6, 2007 | Jonathan Richard Shewchuk, Computer Science Division, University of California at Berkeley Tetrahedral Meshes with Good Dihedral Angles |
September 5, 2007 | Srinivasa Rao Satti, MADALGO Succinct representations of trees |
September 4, 2007 | Jonathan Richard Shewchuk, Computer Science Division, University of California at Berkeley Streaming Computation of Delaunay Triangulations |
August 30, 2007 | Michael Goodrich, University of California-Irvine Blood on the Computer: How Algorithms for Testing Blood Samples can be used in Modern Applications |
June 15, 2007 | Martin Olsen, BRICS Nash Stability in Additively Separable Hedonic Games is NP-hard |
June 1, 2007 | Daniel Andersson, BRICS Hiroimono is NP-complete |
May 22, 2007 | Bradford G. Nickerson, Faculty of Computer Science, University of New Brunswick |
April 26, 2007 | Mohammad Abam, TU Eindhoven Kinetic Kd-tree |
April 20, 2007 | Vincenzo Bonifaci, Technical University Berlin The complexity of uniform Nash equilibria |
April 12, 2007 | Martin Höfer, Universität Konstanz Non-cooperative Competition in Large Networks |
February 19, 2007 | Shripad Thite, Department of Mathematics and Computer Science, TU Eindhoven IO-Efficient Map Overlay and Point Location in Low-Density Subdivisions |
November 14, 2006 | Eric Allender, Rutgers Grid Graph Reachability Problems |
October 4, 2006 | Thore Husfeldt, Lund University How to compute the chromatic number and the chromatic polynomial |
September 15, 2006 | Daniel Andersson, BRICS Improved Algorithms for Discounted Payoff Games |
September 8, 2006 | Sariel Har-Peled, University of Illinois, Urbana-Champaign (UIUC) On low dimensional coresets |
August 29, 2006 | Adam Buchsbaum, AT&T Research Restricted Strip Covering and the Sensor Cover Problem |
March 21, 2006 | Loukas Georgiadis, BRICS, Department of Computer Science, University of Aarhus Finding Dominators Efficiently: Theory and Practice |
March 10, 2006 | Jakob Krarup, Department of Computer Science, University of Copenhagen Unweighted set cover: beat the computer! |
February 22, 2006 | Norbert Zeh, Faculty of Computer Science, Dalhousie University, Canada Simplified Cache-Oblivious Planar Orthogonal Range Searching |
December 20, 2005 | Anders Yeo, Department of Computer Science, Royal Holloway, University of London Transversals in hypergraphs and total domination in graphs |
December 1, 2005 | Irit Katriel, BRICS, Dept. of Computer Science, University of Aarhus Simultaneous Matchings |
November 17, 2005 | Loukas Georgiadis, BRICS, Dept. of Computer Science, University of Aarhus Design of Data Structures for Mergeable Trees |
September 27, 2005 | Irit Katriel, BRICS Suboptimal solutions of optimisation problems |
September 9, 2005 | Gerhard J Woeginger, TU Eindhoven Three problems and one theorem |
June 30, 2005 | Martin Kutz, Max-Planck-Institut für Informatik, Saarbrücken, Germany Reachability substitutes for planar digraphs |
May 25, 2005 | Jeff Erickson, Department of Computer Science, University of Illinois at Urbana-Champaign Tight Regular Arrangements and Shortest Homotopic Paths |
May 24, 2005 | Marios Hadjieleftheriou, Computer Science Department, Boston University Efficient Indexing Techniques for Range and Nearest Neighbor Queries |
May 11, 2005 | Kostas Tsichlas, King's College London Dynamic Interpolation Search Revisited |
January 17, 2005 | Gabriel Moruz, BRICS, Department of Computer Science, University of Aarhus On the Adaptiveness of Quicksort |
November 9, 2004 | Deepak Ajwani, Max-Planck-Institute for Computer Science, Saarbrücken, Germany Implementing external memory BFS algorithms |
November 8, 2004 | Anna Pagh, IT University of Copenhagen On Adaptive Integer Sorting |
September 27, 2004 | Troy Lee, CWI, The Netherlands The Language Compression Problem |
May 17, 2004 | Vinodchandran N. Variyam, Department of Computer Science and Engineering, University of Nebraska-Lincoln Separating NP-completeness using partial-biimmunity |
April 16, 2004 | Johan Nilsson, Lund University, Sweden Improved Approximation Algorithms for Optimization Problems in Graphs with Superlogarithmic Treewidth |
March 8, 2004 | Jesper Makholm Byskov, BRICS Exact Algorithms for Colouring using Maximal Independent Sets |
November 10, 2003 | Thore Husfeldt, University of Lund Longest Paths |
September 29, 2003 | Maciej Kurowski, Faculty of Mathematics, Informatics and Mechanics, Warsaw University Processing Short Path Queries in Planar Graphs |
September 22, 2003 | Lukasz Kowalik, Faculty of Mathematics, Informatics and Mechanics, Warsaw University Short Cycles in Planar Graphs |
June 6, 2003 | Rolf Fagerberg, Department of Computer Science, University of Aarhus On the Limits of Cache-Obliviousness |
May 9, 2003 | Anna Östlin and Rasmus Pagh, IT University of Copenhagen Uniform Hashing in Constant Time and Linear Space |
March 18, 2003 | John Iacono, Polytechnic University Input-Sensitive Data Structures |
March 11, 2003 | Lars Arge, Department of Computer Science, Duke University Cache-Oblivious Multidimensional Range Searching |
February 19, 2003 | Seth Pettie, University of Texas at Austin A new shortest path algorithm for real-weighted graphs |
January 27, 2003 | Gerth Stølting Brodal, BRICS Lower Bounds for External Memory Dictionaries |
December 2, 2002 | Gerth Stølting Brodal, BRICS Funnel Heap - A Cache Oblivious Priority Queue |
November 11, 2002 | Peter Bro Miltersen, BRICS Circuits on Cylinders |
October 10, 2002 | Martin Dietzfelbinger, Technische Universitaet Ilmenau The probability of a rendezvous is minimal in complete graphs |
September 2, 2002 | Kristoffer Arnsfelt Hansen, BRICS Primes is in P |
June 19, 2002 | Jaikumar Radhakrishnan, Tata Institute of Fundamental Research, Homi Bhabha Road, Mumbai, India Lower Bounds for locally decodable codes |
May 27, 2002 | S. Srinivasa Rao, University of Leicester Space Efficient Suffix Trees |
February 11, 2002 | Lene Monrad Favrholdt, Department of Mathematics and Computer Science, University of Southern Denmark, Odense Paging with Locality of Reference |
December 17, 2001 | Rasmus Pagh, BRICS How Perfect Hashing Saves Christmas |
December 14, 2001 | Riko Jacob, BRICS Cache Oblivious Search Trees via Binary Trees of Small Height |
August 17, 2001 | V. Arvind, IMSc, Chennai, India Symmetry Breaking in Graphs |
May 15, 2001 | Lars Arge, Department of Computer Science, Duke University, NC I/O-Efficient Dynamic Planar Point Location |
January 5, 2001 | Rasmus Pagh, BRICS Optimal Time-Space Trade-Offs for Non-Comparison-Based Sorting |
December 11, 2000 | Peter Bro Miltersen, BRICS Are Bitvectors Optimal? |
November 30, 2000 | Anna Östlin, Lund University, Sweden Efficient Merging, Construction, and Maintenance of Evolutionary Trees |
November 6, 2000 | Gerth Stølting Brodal, BRICS Optimal Static Range-Reporting in One Dimension |
October 30, 2000 | Riko Jacob, BRICS Dynamic Planar Convex Hull with Optimal Query Time and O(log n · log log n) Update Time |
June 30, 2000 | Jakob Pagter, BRICS I/O-Space Trade-Offs |
June 30, 2000 | Rasmus Pagh, BRICS A Trade-Off for Deterministic Dictionaries |
May 10, 2000 | Lance Fortnow, NEC Research Institute Diagonalization |
February 11, 2000 | Meena Mahajan, Institute of Mathematical Sciences, Chennai, India A Combinatorial Algorithm for Pfaffians |
February 7, 2000 | Meena Mahajan, Institute of Mathematical Sciences, Chennai, India When counting helps search or A new NC-algorithm for finding a perfect matching in bipartite planar and bounded genus graphs |
October 11, 1999 | Mary Cryan, BRICS Approximation Algorithms for the Fixed-Topology Phylogenetic Number Problem |
October 5, 1999 | Peter Bro Miltersen, BRICS Derandomizing Arthur-Merlin Games Using Hitting Sets |
October 4, 1999 | Anders Yeo, BRICS Packing Paths in Digraphs |
September 13, 1999 | David Pisinger, Department of Computer Science, University of Copenhagen, Denmark The p-dispersion problem |
September 10, 1999 | Jens Stoye, Deutsches Krebsforschungszentrum, Theoretische Bioinformatik, Heidelberg, Germany Linear Time Algorithms for Finding and Representing all the Tandem Repeats in a String |
August 23, 1999 | Mark Ettinger, Los Alamos National Laboratory The Game of "Twenty Questions" in the Quantum World |
August 10, 1999 | Anna Gal, University of Texas at Austin A theorem on sensitivity and private computation |
August 9, 1999 | Rolf Fagerberg, BRICS Dynamic Representations of Sparse Graphs |
August 6, 1999 | Uri Zwick, Tel Aviv University, Israel All pairs shortest paths using fast matrix multiplication |
June 30, 1999 | András Recski, Technical University of Budapest, Hungary Applications of matroid theory in engineering |
June 24, 1999 | Dominic Mayers, NEC Research Institute, Princeton Selfchecking Quantum Apparatus |
June 14, 1999 | Sean Hallgren, CS, Berkeley Quantum Fourier Sampling Simplified |
June 7, 1999 | Srinivasan Venkatesh, School of Technology and Computer Science, Mumbai, India The Communication Complexity of Pointer Chasing |
May 31, 1999 | Morten Nyhave Nielsen, University of Southern Denmark The Accommodating Function |
May 17, 1999 | Jakob Pagter Time-Space Trade-Offs for Decision Problems |
May 10, 1999 | Kevin Compton, BRICS and University of Michigan Poissonization and Depoissonization in the Analysis of Algorithms |
May 5, 1999 | Mary Cryan, University of Warwick Evolutionary Trees can be learned in Polynomial-Time in the Two-State General Markov Model |
April 26, 1999 | N.S. Narayanaswamy, IISc, Bangalore, India Derandomizing Space Bounded Algorithms |
April 22, 1999 | Faith Fich, University of Toronto Searching is Not Simple |
April 8, 1999 | Anders Yeo, University of Victoria Polynomial algorithms for the Travelling Salesman Problem, which finds solutions that are better than a huge number of other solutions |
April 7, 1999 | Dieter van Melkebeek, University of Chicago Graph Isomorphism and Derandomization |
March 15, 1999 | Riko Jacob A Strongly Polynomial Time Algorithm for the Constrained Maximum Flow Problem |
March 1, 1999 | Jyrki Katajainen, DIKU Meticulous Analysis of Heap Construction Programs |
February 15, 1999 | Peter Bro Miltersen Constant width planar computation - three new characterizations of AC0 |
December 21, 1998 | Everybody Real games (bring your own) and pebernødder |
December 14, 1998 | Søren Riis Complexity Gaps for Unsatisfiability Problems |
December 7, 1998 | Peter Bro Miltersen Relativizable Pseudorandom generators and extractors |
November 30, 1998 | Rolf Fagerberg, Department of Mathematics and Computer Science, Odense University The Exact Complexity of Rebalancing Binary Search Trees |
November 23, 1998 | Theis Rauhe The Marked Ancestor Problem |
November 9, 1998 | Ivan Damgård How to use noise to your advantage |
November 2, 1998 | Jakob Pagter Optimal Time-Space Trade-Offs for Sorting |
October 26, 1998 | Tibor Jordán, INPG, Laboratoire LEIBNIZ, Grenoble A Graph Theoretical Solution to a Problem in Statics |
October 19, 1998 | Rasmus Pagh Static Dictionaries and the Battle Against Redundancy |
October 5, 1998 | Gerth Stølting Brodal Functional Queues and Catenable Lists |
September 28, 1998 | Riko Jacob Towards Syntactic Characterizations of Approximation Schemes via Predicate and Graph Decompositions |
September 7, 1998 | Peter Bro Miltersen The Impagliazzo/Wigderson Gap Theorem |
May 11, 1998 | Andrzej Czygrinow, Emory University The Regularity Lemma and its algorithmic applications |
April 27, 1998 | Gudmund S. Frandsen Lower bounds for dynamic algebraic problems |
April 20, 1998 | Peter Bro Miltersen Models for dynamic algebraic computations |
April 6, 1998 | Alessandro Panconesi On computing maximal matchings in a distributed setting |
March 30, 1998 | Johan Kjeldgaard-Pedersen The complexity of arithmetic computations |
March 16, 1998 | Arto Salomaa, Turku Centre for Computer Science and Academy of Finland Computing Paradigms based on DNA complementarity |
March 13, 1998 | Faith E. Fich, University of Toronto and Fields Institute End-to-end communication |
March 12, 1998 | Mikkel Thorup, DIKU Dynamic Graph Algorithms |
March 2, 1998 | Ivan Damgård Some problems in Monotone Span Program Complexity with applications to Multiparty Computations |
February 23, 1998 | Jakob Pagter Time-space tradeoffs for sorting and related problems |
February 16, 1998 | Aleksandar Pekec A new and improved winning strategy for the Ramsey Graph Game |
December 8, 1997 | Theis Rauhe Dynamic Pattern Matching in Polylogarithmic Time |
December 1, 1997 | Christian Nørgaard Storm Pedersen Comparison of coding DNA |
November 24, 1997 | Zsolt Tuza, Hungarian Academy of Sciences, Budapest Complexity Results Related to Precoloring Extension and List Colorings of Graphs |
November 17, 1997 | Tibor Jordan, Odense University Connectivity of Graphs - Some Optimization Problems |
November 10, 1997 | Rune Bang Lyngsø The new and improved Arora Approximation Scheme for Euclidean TSP |
November 3, 1997 | Ivan Damgård Communication-Optimal Zero-Knowledge Protocols and Efficient Multiparty Computations |
October 27, 1997 | Peter Høyer, Odense Universitet Multiparty quantum communication complexity |
September 29, 1997 | Peter Bro Miltersen Error correcting codes, perfect hashing circuits, deterministic dynamic dictionaries |
May 21, 1997 | Aleksandar Pekec On optimal orientations of graphs |
May 14, 1997 | Mike Fellows, Dept. Computer Science, Univ. Victoria, Victoria, B.C., Canada Computing Obstruction Sets |
April 30, 1997 | Christian N.S. Pedersen Combining DNA and Protein Alignment |
April 30, 1997 | Ivan Damgård Efficient Non-interactive Zero-Knowledge with Preprocessing |
April 23, 1997 | Stephan Karisch, DIKU Recent Developments in Solving Quadratic Assignment Problems |
April 16, 1997 | Arne Andersson, BRICS and Lund Two results on deterministic sorting and searching |
April 9, 1997 | Thore Husfeldt P=BPP unless E has sub-exponential circuits |
April 2, 1997 | Thore Husfeldt Hardness vs. Randomness |
March 19, 1997 | Thore Husfeldt (continued from March 12) |
March 14, 1997 | Mikkel Thorup, DIKU Undirected Single Source Shortest Path in Linear Time |
March 12, 1997 | Thore Husfeldt Even easier even harder cell probe lower bounds for dynamic graph, string, and finite function problems |
March 5, 1997 | Ivan Damgård On Span programs |
February 26, 1997 | Gudmund S. Frandsen Computing the determinant |
December 20, 1996 | Everybody Real games (bring your own) and vanillekranse |
December 13, 1996 | Aleksandar Pekec Still more games |
December 6, 1996 | Aleksandar Pekec More games |
November 29, 1996 | Peter Bro Miltersen How Small is the Big Bucket? |
November 22, 1996 | Peter Bro Miltersen How Big is the Big Bucket? |
November 15, 1996 | Christian N.S. Pedersen Approximation algorithms for certain string matching problems |
November 8, 1996 | Theis Rauhe Refining the Fredman-Saks time stamp method II |
November 1, 1996 | Theis Rauhe Refining the Fredman-Saks time stamp method I |
October 25, 1996 | Rune Bang Lyngsø Arora's polynomial time approximation scheme for the Euclidean travelling salesman problem |
October 18, 1996 | Igor Shparlinski Polynomial Approximation and the parallel complexity of the discrete logarithm and breaking the Diffie-Hellman cryptosystem |
October 11, 1996 | Aleksandar Pekec Combinatorial games |
October 4, 1996 | Kousha Etessami Complexity results for temporal logic |
September 27, 1996 | Gerth Stølting Brodal Approximate dictionary queries |
September 20, 1996 | Gerth Stølting Brodal Predecessor queries in dynamic integer sets |
September 13, 1996 | Peter Bro Miltersen Last(?) instruction set dependent bounds for the static dictionary problem |
September 5, 1996 | Peter Bro Miltersen More instruction set dependent bounds for the static dictionary problem |
August 29, 1996 | Peter Bro Miltersen Instruction set dependent bounds for the static dictionary problem |
May 17, 1996 | Thore Husfeldt and Thomas Hildebrandt Semi-definite programming for graph colouring (LAMCCS-5) |
May 10, 1996 | Thore Husfeldt and Thomas Hildebrandt Semi-definite programming for graph colouring (LAMCCS-4) |
April 26, 1996 | Pawel Winter, DIKU Euclidean Steiner Tree Problem |
April 19, 1996 | Devdatt Dubhashi The Lovasz Theta function (LAMCCS-3) |
April 12, 1996 | Devdatt Dubhashi The Lovasz Theta function (LAMCCS-2) |
March 29, 1996 | Lars Arge Optimal Interval Management in External Memory |
March 22, 1996 | Mikkel Thorup, DIKU Improved Sampling with Applications to Dynamic Graph Algorithms |
March 15, 1996 | Devdatt Dubhashi Introduction to algebraic techniques in graph theory (LAMCCS-1) |
March 8, 1996 | Gerth Stølting Brodal Priority Queues on Parallel Machines |
March 1, 1996 | Roberto Grossi, University of Florence, Italy Theoretical and Experimental Results on External Dynamic Substring Search |
February 23, 1996 | Rune Bang Lyngsø and Christian N.S. Pedersen Multiple Sequence Alignment |
February 16, 1996 | Ivan Damgård On Monotone Function Closure of Perfect and Statistical Zero-Knowledge |
February 9, 1996 | Gudmund S. Frandsen Counting Bottlenecks to Show Monotone P≠NP |
Best-Fit is one of the most prominent and practically used algorithms for the bin packing problem, where a set of items with associated sizes needs to be packed in the minimum number of unit-capacity bins. Kenyon [SODA '96] studied online bin packing under random-order arrival, where the adversary chooses the list of items, but the items arrive one by one according to an arrival order drawn uniformly at random from the set of all permutations of the items. Kenyon's seminal result established an upper bound of 1.5 and a lower bound of 1.08 on the random-order ratio of Best-Fit, and it was conjectured that the true ratio is ≈ 1.15. The conjecture, if true, would also imply that Best-Fit (on randomly permuted input) has the best performance guarantee among all the widely used simple algorithms for (offline) bin packing. This conjecture has remained one of the major open problems in the area, as highlighted in the recent survey on random-order models by Gupta and Singla [Beyond the Worst-Case Analysis of Algorithms '20]. Recently, Albers et al. [Algorithmica '21] improved the upper bound to 1.25 for the special case when all the item sizes are greater than 1/3, and improved the lower bound to 1.1. Ayyadevara et al. [ICALP '22] obtained an improved result for the special case when all the item sizes lie in (1/4,1/2], which corresponds to the 3-partition problem. The upper bound of 3/2 for the general case, however, has remained unimproved. In this work, we make the first progress towards the conjecture by showing that Best-Fit achieves a random-order ratio of at most 1.5−ε, for a small constant ε > 0. Furthermore, we establish an improved lower bound of 1.144 on the random-order ratio of Best-Fit, nearly reaching the conjectured ratio. This is joint work with Anish Hebbar and Arindam Khan, and appeared in SODA 2024.
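For readers unfamiliar with the algorithm itself, here is a minimal Python sketch of Best-Fit under a random arrival order; the item sizes are illustrative and the code is not from the paper.

```python
import random

def best_fit(items, capacity=1.0):
    """Pack each arriving item into the fullest bin that still has room;
    open a new bin if no bin fits. Returns the list of bin loads."""
    bins = []
    for size in items:
        # choose the feasible bin with the least remaining capacity
        best = None
        for i, load in enumerate(bins):
            if load + size <= capacity and (best is None or load > bins[best]):
                best = i
        if best is None:
            bins.append(size)
        else:
            bins[best] += size
    return bins

# Random-order arrival: the adversary fixes the multiset of sizes,
# the order is a uniformly random permutation (hypothetical sizes).
items = [0.6, 0.35, 0.55, 0.4, 0.3, 0.7, 0.25]
random.shuffle(items)
print(len(best_fit(items)), "bins used")
```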
We study fair allocation of indivisible goods among agents with additive valuations. We obtain novel approximation guarantees for one of the strongest fairness notions in discrete fair division, namely envy-freeness up to the removal of any positively-valued good (EFx). Our approximation guarantees are in terms of an instance-dependent parameter γ ∈ (0, 1] that upper bounds, for each indivisible good in the given instance, the multiplicative range of nonzero values for the good across the agents. Specifically, we develop a polynomial-time algorithm that computes an f(γ)-approximately EFx allocation for any given fair division instance with range parameter γ ∈ (0,1]. For instances with γ ≥ 0.511, the function f(γ) surpasses the previously best-known approximation bound of (φ−1) ≈ 0.618; here φ denotes the golden ratio. Based on joint work with Siddharth Barman and Shraddha Pathak (full paper at AAMAS 2024).
We present an investigation of the All Nearest Smaller Values (ANSV) problem. Given an array A of n values, the ANSV problem involves finding for each A[i] the nearest smaller values to its left and right. The problem was optimally solved in the PRAM model by Berkman, Schieber, and Vishkin (BSV) with O(n) work and O(log n) span, but their algorithm is considered too complex for practical use and lacks a public implementation. The best practical solution is the simpler O(n log n)-work algorithm by Shun and Zhao (BSZ), which includes a heuristic by Blelloch and Shun. We implement the BSV algorithm and demonstrate its practical efficiency, showing comparable performance to the BSZ algorithm. We also provide the first theoretical analysis of the BSZ heuristic, showing it achieves O(n(1 + (log n)/k)) work and O(k(1 + log n/k)) span for any 1 ≤ k ≤ n, allowing a tunable trade-off between work and span. For k = Θ(log n), the BSZ algorithm becomes work-optimal with increased span compared to BSV. Our discussion includes an analysis of different input types, showing simple algorithms are efficient for random inputs. We also examine the input/output complexities of the BSV algorithm.
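For intuition, the sequential version of ANSV is a textbook linear-time stack scan; the PRAM algorithms discussed in the talk parallelize this idea. A minimal sketch of the left-hand answers (the right-hand answers are symmetric); not code from the paper.

```python
def nearest_smaller_left(A):
    """For each index i, return the index of the nearest element to the
    left of A[i] that is strictly smaller, or -1 if none exists.
    Runs in O(n) time using a monotone stack."""
    result, stack = [], []          # stack holds indices with increasing values
    for i, x in enumerate(A):
        while stack and A[stack[-1]] >= x:
            stack.pop()
        result.append(stack[-1] if stack else -1)
        stack.append(i)
    return result

print(nearest_smaller_left([3, 1, 4, 1, 5, 9, 2, 6]))  # [-1, -1, 1, -1, 3, 4, 3, 6]
```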
A Reeb graph is a graphical representation of a scalar function on a topological space that encodes the topology of the level sets. A Reeb space is a generalization of the Reeb graph to a multiparameter function. We propose novel constructions of Reeb graphs and Reeb spaces that incorporate the use of a measure. Specifically, we introduce measure-theoretic Reeb graphs and Reeb spaces when the domain or the range is modeled as a metric measure space (i.e., a metric space equipped with a measure). Our main goal is to enhance the robustness of the Reeb graph and Reeb space in representing the topological features of a scalar field while accounting for the distribution of the measure. We first introduce a Reeb graph with local smoothing and prove its stability with respect to the interleaving distance. We then prove the stability of a Reeb graph of a metric measure space with respect to the measure, defined using the distance to a measure or the kernel distance to a measure, respectively. Our measure-theoretic approach allows Reeb graphs to capture robust topology in data, in line with recent advances in building robust topological descriptors. This is a joint work with Qingsong Wang, Guanquan Ma, and Raghavendra Sridharamurthy.
We investigate coresets for approximating the cost with respect to median queries. In this problem, we are given a set of points P ⊂ Rd, and the cost of a median query for a center c ∈ Rd is ∑p∈P ||p − c||. Our goal is to compute a small weighted summary S ⊂ P such that the cost of any median query is approximated within a multiplicative (1 ± ε) factor. We provide matching upper and lower bounds.
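Not from the talk, but to make the guarantee concrete: a minimal sketch of what a (1 ± ε) coreset for median queries promises, checked on a finite set of hypothetical test centers (the real definition quantifies over every center in R^d).

```python
import math

def median_cost(points, c):
    """Sum of Euclidean distances from the points to the candidate center c."""
    return sum(math.dist(p, c) for p in points)

def weighted_median_cost(weighted_points, c):
    """Same cost evaluated on a weighted summary S: pairs (point, weight)."""
    return sum(w * math.dist(p, c) for p, w in weighted_points)

def approximates(points, weighted_points, test_centers, eps):
    """Check the (1 ± eps) guarantee on the given test centers only."""
    for c in test_centers:
        full = median_cost(points, c)
        summary = weighted_median_cost(weighted_points, c)
        if not (1 - eps) * full <= summary <= (1 + eps) * full:
            return False
    return True
```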
In the multiple-selection problem one is given an unsorted array of N elements and an array of query ranks r1 < ··· < rq, and the task is to return, in sorted order, the elements of rank r1, ..., rq, respectively. The asymptotic deterministic comparison complexity of the problem was settled by Dobkin and Munro [JACM 1981]. In the I/O model an optimal I/O complexity was achieved by Hu et al. [SPAA 2014]. In this talk we present a cache-oblivious algorithm with matching I/O complexity, named funnelselect, since it heavily borrows ideas from the cache-oblivious sorting algorithm funnelsort from the seminal paper by Frigo, Leiserson, Prokop and Ramachandran [FOCS 1999]. Joint work with Sebastian Wild [ESA 2023, SWAT 2024].
In the widely known assignment problem, n agents have to be matched to n items. The algorithm receives a preference ranking of each agent over all items. Our main focus will be on the Random Serial Dictatorship mechanism: after choosing a random permutation of the agent indices, we consider the agents in order of this permutation and allocate to each agent its highest-ranked remaining item (according to its preferences). The goal is to estimate the expected social welfare and social cost (in their respective settings), after a hardness result rules out computing them exactly.
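For concreteness, a minimal sketch of the Random Serial Dictatorship mechanism itself, on toy preference lists; the estimation results are the subject of the talk and are not reproduced here.

```python
import random

def random_serial_dictatorship(preferences):
    """preferences[a] is agent a's ranking of all items, most preferred first.
    Returns a dict mapping each agent to the item it receives."""
    agents = list(preferences)
    random.shuffle(agents)                        # uniformly random dictatorship order
    remaining = set().union(*map(set, preferences.values()))
    allocation = {}
    for agent in agents:
        # the agent takes its highest-ranked item that is still available
        pick = next(item for item in preferences[agent] if item in remaining)
        allocation[agent] = pick
        remaining.remove(pick)
    return allocation

prefs = {"a1": ["x", "y", "z"], "a2": ["x", "z", "y"], "a3": ["y", "x", "z"]}
print(random_serial_dictatorship(prefs))
```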
In the problem of local correction, we are given query access to a function that is close to a codeword, such as a linear function in our case. The goal is to determine a coordinate of the codeword using a small number of queries. In this work, we consider the task of local correction of linear functions over large fields. This is joint work with Prashanth Amireddy, Manaswi Parashaar, Srikanth Srinivasan, and Madhu Sudan.
The rapid growth in the size of large language models presents significant challenges due to the limitations in available GPU memory. This talk will explore techniques that enable the training and deployment of increasingly large models without requiring proportional increases in GPU memory. We will focus on an ongoing project that integrates three techniques: Low-Rank Adaptation (LoRA), GaLore, and quantization methods.
In Correlation Clustering, we are given a complete graph in which edges indicate affinity (+1) or aversion (-1). Our goal is to find a clustering of the nodes such that the total number of affinity edges that are cut plus the number of aversion edges that are retained is minimized. We will present new results on approximation algorithms for this problem.
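To make the objective concrete, here is a small sketch that merely evaluates the disagreement cost of a given clustering on a signed graph; it is illustrative only and does not reproduce the approximation algorithms from the talk.

```python
def disagreements(labels, clustering):
    """labels[(u, v)] is +1 (affinity) or -1 (aversion) for each unordered pair;
    clustering maps each node to a cluster id. Counts cut +1 edges plus
    retained -1 edges, the quantity Correlation Clustering minimizes."""
    cost = 0
    for (u, v), sign in labels.items():
        same = clustering[u] == clustering[v]
        if (sign == +1 and not same) or (sign == -1 and same):
            cost += 1
    return cost

labels = {("a", "b"): +1, ("a", "c"): -1, ("b", "c"): -1}
print(disagreements(labels, {"a": 0, "b": 0, "c": 1}))  # 0 disagreements
```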
As blockchains like Ethereum continue to grow, clients with limited resources can no longer store the entire chain. Light nodes that want to use the blockchain, without verifying that it is in a good state overall, can just download the block headers without the corresponding block contents. As those light nodes may eventually need some of the block contents, they would like to ensure that they are in principle available. Data availability sampling, introduced by Al-Bassam et al., is a process that allows light nodes to check the availability of data without downloading it. In a recent effort, Hall-Andersen, Simkin, and Wagner have introduced formal definitions and analyzed several constructions. While their work thoroughly lays the formal foundations for data availability sampling, the constructions are either prohibitively expensive, use a trusted setup, or have a download complexity for light clients that scales with the square root of the data size. In this work, we make a significant step forward by proposing an efficient data availability sampling scheme without a trusted setup and with only polylogarithmic overhead. To this end, we find a novel connection with interactive oracle proofs of proximity (IOPPs). Specifically, we prove that any IOPP meeting an additional consistency criterion can be turned into an erasure code commitment, and then, leveraging a compiler due to Hall-Andersen, Simkin, and Wagner, into a data availability sampling scheme. This new connection enables data availability sampling to benefit from future results on IOPPs. We then show that the widely used FRI IOPP satisfies our consistency criterion and demonstrate that the resulting data availability sampling scheme outperforms the state of the art asymptotically and concretely in multiple parameters.
We refine and generalize what is known about coresets for classification problems via the sensitivity sampling framework. Such coresets seek the smallest possible subsets of input data, so one can optimize a loss function on the coreset and ensure approximation guarantees with respect to the original data. Our analysis provides the first no dimensional coresets, so the size does not depend on the dimension. Moreover, our results are general: they apply to distributional inputs, can use i.i.d. samples (and hence provide sample complexity bounds), and work for a variety of loss functions. A key tool we develop is a Rademacher complexity version of the main sensitivity sampling approach, which can be of independent interest.
We study fair allocation where a set of divisible or indivisible items is distributed across multiple regions, and each agent can only obtain items from one region. In particular, we introduce two different region settings, named the diverse-region and equal-region settings. In this work, we consider two widely studied fairness concepts: envy-based notions, including envy-freeness and envy-freeness up to one/any item, and share-based notions, including proportionality and proportionality up to one/any item. On the negative side, we show NP-hardness and some inapproximability results for the aforementioned fairness notions. On the positive side, we propose several algorithms that compute partial allocations satisfying the envy-based notions and allocations that approximate the above fairness notions.
There are tons of results recently relating topology to computer science. These fields turn out to be extraordinarily similar. I try to describe when, how, why, where, etc.
Given a collection of vectors x(1),...,x(n) ∈ {0,1}d, the selection problem asks to report the index of an "approximately largest" entry in the sum x = x(1) + ··· + x(n). Selection abstracts a host of problems: in machine learning it can be used for hyperparameter tuning, feature selection, or to model empirical risk minimization. We study selection under differential privacy, where a released index guarantees privacy for each vector. Though selection can be solved with an excellent utility guarantee in the central model of differential privacy, the distributed setting lacks solutions. Specifically, strong privacy guarantees with high utility are offered in high-trust settings, but not in low-trust settings. For example, in the popular shuffle model of distributed differential privacy, there are strong lower bounds suggesting that the utility of the central model cannot be obtained. In this paper we design a protocol for differentially private selection in a trust setting similar to the shuffle model, with the crucial difference that our protocol tolerates corrupted servers while maintaining privacy. Our protocol uses techniques from secure multi-party computation (MPC) to implement a protocol that: (i) has utility on par with the best mechanisms in the central model, (ii) scales to large, distributed collections of high-dimensional vectors, and (iii) uses k ≥ 3 servers that collaborate to compute the result, where differential privacy holds assuming an honest majority. Since general-purpose MPC techniques are not sufficiently scalable, we propose a novel application of integer secret sharing, and evaluate the utility and efficiency of our protocol theoretically and empirically. Our protocol is the first to demonstrate that large-scale differentially private selection is possible in a distributed setting.
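As a toy illustration of one building block, plain additive secret sharing over a public modulus lets k servers jointly reconstruct an aggregate without any single server learning an individual client's value; the paper's integer secret-sharing scheme, noise addition, and argmax selection are not reproduced here, and the modulus and values below are illustrative.

```python
import random

MOD = 2**31 - 1  # public modulus (illustrative choice)

def share(value, k):
    """Split an integer into k additive shares that sum to value mod MOD."""
    shares = [random.randrange(MOD) for _ in range(k - 1)]
    shares.append((value - sum(shares)) % MOD)
    return shares

def reconstruct(shares):
    return sum(shares) % MOD

# Each client shares its value among k = 3 servers; the servers add up the
# shares they hold locally, and only the total is ever reconstructed.
k, client_values = 3, [5, 2, 7, 1]
server_totals = [0] * k
for v in client_values:
    for server, s in enumerate(share(v, k)):
        server_totals[server] = (server_totals[server] + s) % MOD
print(reconstruct(server_totals))  # 15: the sum, without exposing any single value
```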
In this talk, we will discuss the connections between complexity theory, machine learning (boosting) and some statements about prime numbers. The thesis is that though we have these artificial separations between them, they are all the same underlying thing. This is related to work I did and stuff I read during my master's degree many years back.
We'll try to win "weirdest seminar of the year". Concretely, we'll have a quick review of analogue circuits for extremely efficient matrix multiplication. Then, we'll recap some results from the last seminar (on the expected optimum of the SSP). Finally, we'll present some solutions that came from mixing those ideas.
Most efficient algorithms for data analysis use a lot of randomization. In this talk, we will focus on deterministic alternatives, with a focus on clustering problems. Based on joint work with Vincent Cohen-Addad and David Saulpic.
If a polynomial can be computed efficiently, can its roots also be computed efficiently? It is not a priori clear whether the roots of a polynomial have the same complexity as the polynomial itself. Besides being a clean and seemingly fundamental question in its own right, this is also useful in proving the algebraic analogue of hardness vs. randomness. The talk will introduce the technique of Newton's iteration and show how to use it to prove some positive results.
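As background, the basic Newton step is x_{t+1} = x_t − f(x_t)/f'(x_t). Below is its numeric form only; the talk uses the power-series analogue, which this sketch does not reproduce.

```python
def newton_root(f, df, x0, iters=50, tol=1e-12):
    """Approximate a root of f starting from x0, assuming df = f' and that
    the iteration converges (quadratically, near a simple root)."""
    x = x0
    for _ in range(iters):
        step = f(x) / df(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# Newton's own example: a root of x^3 - 2x - 5, starting near x = 2.
print(newton_root(lambda x: x**3 - 2*x - 5, lambda x: 3*x**2 - 2, 2.0))
```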
Developing an optimal PAC learning algorithm in the realizable setting, where empirical risk minimization (ERM) is suboptimal, was a major open problem in learning theory for decades. The problem was finally resolved by Hanneke a few years ago. Unfortunately, Hanneke's algorithm is quite complex as it returns the majority vote of many ERM classifiers that are trained on carefully selected subsets of the data. It is thus a natural goal to determine the simplest algorithm that is optimal. In this work we study the arguably simplest algorithm that could be optimal: returning the majority vote of three ERM classifiers. We show that this algorithm achieves the optimal in-expectation bound on its error which is provably unattainable by a single ERM classifier. Furthermore, we prove a near-optimal high-probability bound on this algorithm's error. We conjecture that a better analysis will prove that this algorithm is in fact optimal in the high-probability regime.
Motivated by recent work in computational social choice, we extend the metric distortion framework to clustering problems. Given a set of n agents located in an underlying metric space, our goal is to partition them into k clusters, optimizing some social cost objective. The metric space is defined by a distance function d between the agent locations. Information about d is available only implicitly via n rankings, through which each agent ranks all other agents in terms of their distance from her. Still, even though no cardinal information (i.e., the exact distance values) is available, we would like to evaluate clustering algorithms in terms of social cost objectives that are defined using d. This is done using the notion of distortion, which measures how far from optimality a clustering can be, taking into account all underlying metrics that are consistent with the ordinal information available. Unfortunately, the most important clustering objectives (e.g., those used in the well-known k-median and k-center problems) do not admit algorithms with finite distortion. To sidestep this disappointing fact, we follow two alternative approaches: We first explore whether resource augmentation can be beneficial. We consider algorithms that use more than k clusters but compare their social cost to that of the optimal k-clusterings. We show that using exponentially (in terms of k) many clusters, we can get low (constant or logarithmic) distortion for the k-center and k-median objectives. Interestingly, such an exponential blowup is shown to be necessary. More importantly, we explore whether limited cardinal information can be used to obtain better results. Somewhat surprisingly, for k-median and k-center, we show that a number of queries that is polynomial in k and only logarithmic in n (i.e., only sublinear in the number of agents for the most relevant scenarios in practice) is enough to get constant distortion.
The Strong Lottery Ticket Hypothesis (SLTH) states that randomly initialised neural networks contain subnetworks that can perform well without any training. Although unstructured pruning has been extensively studied in this context, its structured counterpart, which can deliver significant computational and memory efficiency gains, has been largely unexplored. One of the main reasons for this gap is the limitations of the underlying mathematical tools used in formal analyses of the SLTH. In this paper, we overcome these limitations: we leverage recent advances in the multidimensional generalisation of the Random Subset-Sum Problem and obtain a variant that admits the stochastic dependencies that arise when addressing structured pruning in the SLTH. We apply this result to prove, for a wide class of random Convolutional Neural Networks, the existence of structured subnetworks that can approximate any sufficiently smaller network. This is the first work to address the SLTH for structured pruning, opening up new avenues for further research on the hypothesis and contributing to the understanding of the role of overparameterization in deep learning.
Given a collection of polygonal chains we define a ball for every chain under the Fréchet and Hausdorff metrics and study the intersection graphs of these balls. We show that computing the maximum independent set for the discrete and continuous variants is a different ball game altogether. In particular, we show that the discrete variant admits a PTAS. However, the problem becomes hard to approximate beyond a constant even when the polygonal chains are as long as 7 and lie in the plane. For the discrete case, we use the fact that both the Fréchet and Hausdorff distance metrics have a constant doubling dimension for constant ambient dimension and constant length of the polygonal chains. Thus, one can use the known sublinear separators to run a polynomial-time local search algorithm. On the other hand, for the continuous variant, we reduce the problem of finding the maximum independent set of boxes in d dimensions to a unit ball graph for curves of length O(d). For d = 2, the former problem, known as the Maximum Independent Set of Rectangles, enjoys a constant-factor approximation algorithm [Mitchell 2021, Gálvez et al. 2022]. It is already APX-hard for d > 2 [Chlebík and Chlebíková 2007], thus implying that finding a maximum independent set of unit balls under the continuous (weak) Fréchet or Hausdorff metric is hard to approximate even for small polygonal chains in the plane.
We will present a very simple streaming algorithm for F0 estimation that also caught the eye of Donald E. Knuth. In a recent article, Donald E. Knuth started with the following two paragraphs: "Sourav Chakraborty, N. V. Vinodchandran, and Kuldeep S. Meel have recently proposed an interesting algorithm for the following problem: A stream of elements (a1, a2,...,am) is input, one at a time, and we want to know how many of them are distinct. In other words, if A = {a1, a2,...,am} is the set of elements in the stream, with multiplicities ignored, we want to know |A|, the size of that set. But we don’t have much memory; in fact, |A| is probably a lot larger than the number of elements that we can hold in memory at any one time. What is a good strategy for computing an unbiased estimate of |A|? Their algorithm is not only interesting, it is extremely simple. Furthermore, it’s wonderfully suited to teaching students who are learning the basics of computer science. (Indeed, ever since I saw it, a few days ago, I’ve been unable to resist trying to explain the ideas to just about everybody I meet.) Therefore I’m pretty sure that something like this will eventually become a standard textbook topic. This note is an initial approximation to what I might write about it if I were preparing a textbook about data streams." This simple algorithm comes out of the first ever "efficient" streaming algorithm (from PODS 2021) for the Klee's Measure problem, which was a big open problem in the world of streaming for many years. This work is based on joint work with N. V. Vinodchandran and Kuldeep S. Meel across multiple articles, notably the following: * Estimating the Size of Union of Sets in Streaming Models. PODS 2021 * Estimation of the Size of Union of Delphic Sets: Achieving Independence from Stream Size. PODS 2022 * Distinct Elements in Streams: An Algorithm for the (Text) Book. ESA 2022
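In the spirit of the algorithm quoted above, here is a short sketch of my reading of the ESA 2022 approach; the threshold formula is illustrative rather than the paper's exact constants.

```python
import math, random

def distinct_estimate(stream, eps=0.5, delta=0.01):
    """Estimate the number of distinct elements using a small buffer,
    in the spirit of the Chakraborty-Vinodchandran-Meel algorithm."""
    m = len(stream)
    thresh = math.ceil((12 / eps**2) * math.log2(8 * m / delta))  # illustrative
    p, buf = 1.0, set()
    for a in stream:
        buf.discard(a)                     # keep at most one copy per element
        if random.random() < p:
            buf.add(a)
        if len(buf) == thresh:             # buffer full: subsample and halve p
            buf = {x for x in buf if random.random() < 0.5}
            p /= 2
            if len(buf) == thresh:
                raise RuntimeError("estimation failed (very unlikely)")
    return len(buf) / p

stream = [random.randrange(1_000_000) for _ in range(100_000)]
print(round(distinct_estimate(stream)))    # close to the true distinct count
```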
In the nearest neighbor problem we store a set of points S in a data structure so that for any query point q the point p∈ S that is closest to q can be found efficiently. In this talk we study dynamic data structures for the Euclidean nearest neighbor problem in two dimensions. We show that two-dimensional nearest neighbor queries can be answered in optimal O(log n) time in some restricted dynamic scenarios. Joint work with John Iacono (Université libre de Bruxelles).
Given a set of points, clustering consists of finding a partition of a point set into k clusters such that the center to which a point is assigned is as close as possible. Most commonly, centers are points themselves, which leads to the famous k-median and k-means objectives. One may also choose centers to be j-dimensional subspaces, which gives rise to subspace clustering. In this paper, we consider learning bounds for these problems. That is, given a set of n samples P drawn independently from some unknown, but fixed distribution D, how quickly does a solution computed on P converge to the optimal clustering of D? We give several near-optimal results. In particular, * For center-based objectives, we show a convergence rate of Õ(√(k/n)). This matches the known optimal bounds of [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical Society 2016] and [Bartlett, Linder, and Lugosi, IEEE Trans. Inf. Theory 1998] for k-means and extends it to other important objectives such as k-median. * For subspace clustering with j-dimensional subspaces, we show a convergence rate of Õ(√(kj²/n)). These are the first provable bounds for most of these problems. For the specific case of projective clustering, which generalizes k-means, we show that a convergence rate of Ω(√(kj/n)) is necessary, thereby proving that the bounds from [Fefferman, Mitter, and Narayanan, Journal of the American Mathematical Society 2016] are essentially optimal.
Polynomial Identity Testing (PIT) asks the following problem: Given an oracle access to a multivariate polynomial, is the polynomial equal to the zero polynomial? This innocent looking problem turns out to be really difficult and is one of the central problems in algebraic version of P vs NP. Even after roughly three decades of combined effort, we know very little about it. In this talk, I will explain an efficient algorithm for this problem when the polynomial is sparse, i.e. the polynomial has very few monomials. This is a result by Klivans and Spielman from STOC 2001. Link to paper: https://dl.acm.org/doi/10.1145/380752.380801
This talk will present some ongoing work on improving state-of-the-art generalization bounds for k-means. To recall: learning asks for the sample size necessary for a solution computed on the sample to extend well to the distribution from which the data is drawn. Kasper in particular gave many results for binary classification. For k-means, this problem is still open. In this talk, Chris will give some preliminary results and highlight further directions. Also, Chris is trying to make a point. If the point is lost on you, you are probably the target audience.
Coresets are a type of summary that allows us to query the cost with respect to any candidate query. If the queries happen to be sets of k centers and the costs are the squared Euclidean distances between points and their closest centers, we are studying coresets for Euclidean k-means, arguably the most important and widely researched coreset question. Following a series of works, we now know that a coreset of size O(k/ε² · min(√k, 1/ε²)) exists, and this has recently been shown to be optimal. In this talk we will outline how the construction works and what the main insights are towards proving this result.
In 1989 Driscoll, Sarnak, Sleator, and Tarjan presented general space-efficient techniques/transformations for making ephemeral data structures persistent. The main contribution of this paper is to adapt this transformation to the functional model. We present a general transformation of an ephemeral, linked data structure into an off-line, partially persistent, purely functional data structure with additive O(n log n) construction time and O(n) space overhead, where n denotes the number of ephemeral updates. An application of our transformation allows the elegant slab-based algorithm for planar point location by Sarnak and Tarjan (1986) to be implemented space-efficiently in the functional model using linear space.
AdaBoost is a classic boosting algorithm for combining multiple inaccurate classifiers produced by a weak learner, to produce a strong learner with arbitrarily high accuracy when given enough training data. Determining the optimal number of samples necessary to obtain a given accuracy of the strong learner is a basic learning-theoretic question. Larsen and Ritzert (NeurIPS'22) recently presented the first provably optimal weak-to-strong learner. However, their algorithm is somewhat complicated and it remains an intriguing question whether the prototypical boosting algorithm AdaBoost also makes optimal use of training samples. In this work, we answer this question in the negative. Concretely, we show that the sample complexity of AdaBoost, and other classic variations thereof, is sub-optimal by at least one logarithmic factor in the desired accuracy of the strong learner.
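For reference, this is the prototypical AdaBoost loop the result concerns, in a standard textbook form with axis-aligned decision stumps as the weak learner; a sketch, not code from the paper.

```python
import numpy as np

def adaboost(X, y, rounds=20):
    """Classic AdaBoost with threshold stumps. X: (n, d) array, y: labels in
    {-1, +1}. Returns a list of (alpha, feature, threshold, polarity)."""
    n, d = X.shape
    w = np.full(n, 1.0 / n)                       # distribution over examples
    ensemble = []
    for _ in range(rounds):
        best = None
        for j in range(d):                        # exhaustive stump search
            for thr in np.unique(X[:, j]):
                for pol in (+1, -1):
                    pred = np.where(X[:, j] <= thr, pol, -pol)
                    err = w[pred != y].sum()
                    if best is None or err < best[0]:
                        best = (err, j, thr, pol, pred)
        err, j, thr, pol, pred = best
        err = max(err, 1e-12)
        alpha = 0.5 * np.log((1 - err) / err)     # weight of this weak classifier
        w *= np.exp(-alpha * y * pred)            # reweight the examples
        w /= w.sum()
        ensemble.append((alpha, j, thr, pol))
    return ensemble

def predict(ensemble, X):
    score = sum(a * np.where(X[:, j] <= thr, pol, -pol) for a, j, thr, pol in ensemble)
    return np.sign(score)
```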
We will talk about ways in which locality is employed in the design of data structures. We will focus on dictionaries, which maintain sets under insertions and deletions and support membership queries of the form "is an element x in the set?". We will discuss how state-of-the-art techniques for dictionaries exploit locality in the word RAM model to obtain space and time gains and also how locality is exploited in the choice of hash functions employed for the design.
In the average-case k-SUM problem, given r integers chosen uniformly at random from {0,...,M-1}, the objective is to find a set of k numbers that sum to 0 modulo M (this set is called a "solution"). In the related k-XOR problem, given r uniformly random Boolean vectors of length log M, the objective is to find a set of k of them whose bitwise XOR is the all-zero vector. Both of these problems have widespread applications in the study of fine-grained complexity and cryptanalysis. The feasibility and complexity of these problems depend on the relative values of k, r, and M. The dense regime of M ≤ r^k, where solutions exist with high probability, is quite well understood, and we have several non-trivial algorithms and hardness conjectures here. Much less is known about the sparse regime of M ≫ r^k, where solutions are unlikely to exist. The best answers we have for many fundamental questions here are limited to whatever carries over from the dense or worst-case settings. We study the planted k-SUM and k-XOR problems in the sparse regime. In these problems, a random solution is planted in a randomly generated instance and has to be recovered. As M increases past r^k, these planted solutions tend to be the only solutions with increasing probability, potentially becoming easier to find. We show several results about the complexity and applications of these problems. In this talk, however, we focus on showing a hardness self-amplification procedure for k-XOR. We show that if there is an algorithm that runs in time T and solves planted k-XOR recovery with probability Ω(1 / polylog(r)), then there is an algorithm that runs in time Õ(T) and solves planted k-XOR with probability 1 - o(1). We show this by constructing a rapidly mixing random walk over k-XOR instances that preserves the planted solution. Based on joint work with Sagnik Saha and Prashant Vasudevan.
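To fix notation, here is a small sketch of how a planted k-XOR instance can be generated and recovered by brute force; it is illustrative only (the planting below simply overwrites one vector) and does not reproduce the talk's reductions or random walk.

```python
import random
from itertools import combinations

def planted_k_xor(r, logM, k):
    """Generate r random bit-vectors of length logM, then force a chosen
    k-subset (the planted solution) to XOR to the all-zero vector."""
    vecs = [random.getrandbits(logM) for _ in range(r)]
    planted = random.sample(range(r), k)
    acc = 0
    for i in planted[:-1]:
        acc ^= vecs[i]
    vecs[planted[-1]] = acc           # XOR of the planted set is now zero
    return vecs, sorted(planted)

def brute_force_k_xor(vecs, k):
    """Exponential-time search for any k-subset with XOR zero."""
    for subset in combinations(range(len(vecs)), k):
        acc = 0
        for i in subset:
            acc ^= vecs[i]
        if acc == 0:
            return list(subset)
    return None

vecs, planted = planted_k_xor(r=24, logM=20, k=3)   # sparse: 2^20 >> 24^3
print(planted, brute_force_k_xor(vecs, k=3))
```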
We survey the problem of Covering Orthogonal Polygons with Rectangles. For the general problem, the best-known approximation factor achieved in polynomial time is O(√log n) [Kumar and Ramesh '99], whereas, when the polygons do not have holes, there is a 2-approximation algorithm [Franzblau '89]. Furthermore, we discuss a conjecture by Paul Erdős on a related problem [Chaiken et al. '81]. The problem is also studied when we are only interested in covering the boundary of the polygons. For the general polygons, the best-known approximation factor is 4, and the problem is also known to be APX-hard [Berman and DasGupta '97]; and for hole-free polygons, it is only known to be NP-hard [Culberson and Reckhow '94]. We prove that a simple Local Search algorithm yields a PTAS for the Boundary Cover problem for simple polygons. We do this by proving the existence of planar support graphs for the hypergraphs defined on the area-maximal rectangles contained in the polygons where every critical point on its boundary induces a hyperedge.
We study the fair allocation of indivisible goods among agents with identical, additive valuations but individual budget constraints. Here, the indivisible goods—each with a specific size and value—need to be allocated such that the bundle assigned to each agent is of total size at most the agent’s budget. Since envy-free allocations do not necessarily exist in the indivisible goods context, compelling relaxations—in particular, the notion of envy-freeness up to k goods (EFk)—have received significant attention in recent years. In an EFk allocation, each agent prefers its own bundle over that of any other agent, up to the removal of k goods, and the agents have similarly bounded envy against the charity (which corresponds to the set of all unallocated goods). Recently, Wu et al. (2021) showed that an allocation that satisfies the budget constraints and maximizes the Nash social welfare is 1/4-approximately EF1. However, the computation (or even existence) of exact EFk allocations remained an intriguing open problem. We make notable progress towards this by proposing a simple, greedy, polynomial-time algorithm that computes EF2 allocations under budget constraints. Our algorithmic result implies the universal existence of EF2 allocations in this fair division context. The analysis of the algorithm exploits intricate structural properties of envy-freeness. Interestingly, the same algorithm also provides EF1 guarantees for important special cases. Specifically, we settle the existence of EF1 allocations for instances in which: (i) the value of each good is proportional to its size, (ii) all goods have the same size, or (iii) all the goods have the same value. Our EF2 result extends to the setting wherein the goods’ sizes are agent-specific.
The one-inclusion graph algorithm of Haussler, Littlestone, and Warmuth achieves an optimal in-expectation risk bound in the standard PAC classification setup. In one of the first COLT open problems, Warmuth conjectured that this prediction strategy always implies an optimal high probability bound on the risk, and hence is also an optimal PAC algorithm. We refute this conjecture in the strongest sense: for any practically interesting Vapnik-Chervonenkis class, we provide an in-expectation optimal one-inclusion graph algorithm whose high probability risk bound cannot go beyond that implied by Markov's inequality. Our construction of these poorly performing one-inclusion graph algorithms uses Varshamov-Tenengolts error correcting codes. Our negative result has several implications. First, it shows that the same poor high-probability performance is inherited by several recent prediction strategies based on generalizations of the one-inclusion graph algorithm. Second, our analysis shows yet another statistical problem that enjoys an estimator that is provably optimal in expectation via a leave-one-out argument, but fails in the high-probability regime. This discrepancy occurs despite the boundedness of the binary loss for which arguments based on concentration inequalities often provide sharp high probability risk bounds. Based on joint work with Yeshwanth Cherapanamjeri, Abhishek Shetty, and Nikita Zhivotovskiy (arXiv:2212.09270).
We study two combinatorial settings of the contract design problem, in which a principal wants to delegate the execution of a costly task. In the first setting, the principal delegates the task to an agent that can take any subset of a given set of unobservable actions, each of which has an associated cost. The principal receives a reward which is a combinatorial function of the actions taken by the agent. In the second setting, we study the single-principal multi-agent contract problem, in which the principal motivates a team of agents to exert effort toward a given task. We design (approximately) optimal algorithms for both settings along with impossibility results for various classes of combinatorial functions. In particular, for the single agent setting, we show that if the reward function is gross substitutes, then an optimal contract can be computed with polynomially many value queries, whereas if it is submodular, the optimal contract is NP-hard. For the multi-agent setting, we show how using demand and value queries, it is possible to obtain a constant approximation, where for subadditive reward functions it is impossible to achieve an approximation of o(√n). Our analysis uncovers key properties of gross substitutes and XOS functions, and reveals many interesting connections between combinatorial contracts and combinatorial auctions. This talk is based on joint work with Paul Duetting, Michal Feldman, and Thomas Kesselheim. Bio: Tomer Ezra is a postdoc at Sapienza University of Rome, hosted by Prof. Stefano Leonardi. He completed his Ph.D. in Tel Aviv University under the supervision of Prof. Michal Feldman. His research interests lie in the border of Computer Science and Economics, focusing on the analysis and design of simple mechanisms and algorithms in limited information settings.
We consider prophet inequalities under general downward-closed constraints. In a prophet inequality problem, a decision-maker sees a series of online elements with values, and needs to decide immediately and irrevocably whether or not to select each element upon its arrival, subject to an underlying feasibility constraint. Traditionally, the decision-maker's expected performance has been compared to the expected performance of the prophet, i.e., the expected offline optimum. We refer to this measure as the Ratio of Expectations (or, in short, RoE). However, a major limitation of the RoE measure is that it only gives a guarantee against what the optimum would be on average, while, in theory, algorithms still might perform poorly compared to the realized ex-post optimal value. Hence, we study alternative performance measures. In particular, we suggest the Expected Ratio (or, in short, EoR), which is the expectation of the ratio between the value of the algorithm and the value of the prophet. This measure yields desirable guarantees, e.g., a constant EoR implies achieving a constant fraction of the ex-post offline optimum with constant probability. Moreover, in the single-choice setting, we show that the EoR is equivalent (in the worst case) to the probability of selecting the maximum, a well-studied measure in the literature. This is no longer the case for combinatorial constraints (beyond single-choice), which is the main focus of this paper. Our main goal is, thus, to understand the relation between RoE and EoR in combinatorial settings. Specifically, we establish two reductions: for every feasibility constraint, the RoE and the EoR are at most a constant factor apart. Additionally, we show that the EoR is a stronger benchmark than the RoE in that for every instance (feasibility constraint and product distribution) the RoE is at least a constant of the EoR, but not vice versa. Both these reductions imply a wealth of EoR results in multiple settings where RoE results are known.
The Johnson-Lindenstrauss transform allows one to embed a dataset of n points in R^d into R^m, while preserving the pairwise distance between any pair of points up to a factor (1 ± ε), provided that m = Ω(ε^(-2) lg n). The transform has found an overwhelming number of algorithmic applications, allowing one to speed up algorithms and reduce memory consumption at the price of a small loss in accuracy. A central line of research on such transforms focuses on developing fast embedding algorithms, with the classic example being the Fast JL transform by Ailon and Chazelle. All known such algorithms have an embedding time of Ω(d lg d), but no lower bounds rule out a clean O(d) embedding time. In this work, we establish the first non-trivial lower bounds (of magnitude Ω(m lg m)) for a large class of embedding algorithms, including in particular most known upper bounds.
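For readers unfamiliar with the transform itself, here is a minimal dense Gaussian JL embedding in Python (an illustrative sketch of the classical construction, not of the fast algorithms or the lower bound discussed in the talk; the constant 8 in the target dimension is arbitrary):

    import numpy as np

    def jl_embed(X, eps, seed=0):
        """Embed the rows of X (n points in R^d) into R^m with a dense Gaussian map.
        With m = O(eps^-2 log n), all pairwise distances are preserved up to (1 +- eps) w.h.p."""
        n, d = X.shape
        m = int(np.ceil(8 * np.log(n) / eps ** 2))        # illustrative constant
        A = np.random.default_rng(seed).normal(0.0, 1.0 / np.sqrt(m), size=(m, d))
        return X @ A.T                                     # dense method: O(m d) time per point

    # usage: embedded pairwise distances are within a (1 +- eps) factor of the originals w.h.p.
    X = np.random.default_rng(1).normal(size=(100, 1000))
    Y = jl_embed(X, eps=0.25)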
We present the first fully persistent external memory search tree achieving amortized IO bounds matching those of the classic (ephemeral) B-tree by Bayer and McCreight. The insertion and deletion of a value in any version requires amortized O(log_B N_v) IOs and a range reporting query in any version requires worst-case O(log_B N_v + K/B) IOs, where K is the number of values reported, N_v is the number of values in the version v of the tree queried or updated, and B is the external memory block size. The data structure requires space linear in the total number of updates. Compared to the previous best bounds for fully persistent B-trees [Brodal, Sioutas, Tsakalidis, and Tsichlas, SODA 2012], this paper eliminates from the update bound an additive term of O(log^2 B) IOs. This result matches the previous best bounds for the restricted case of partially persistent B-trees [Arge, Danner and Teh, JEA 2003]. Central to our approach is to consider the problem as a dynamic set of two-dimensional rectangles that can be merged and split.
Efficiently computing low discrepancy colorings of various set systems has been studied extensively since the breakthrough work by Bansal (FOCS 2010), who gave the first polynomial time algorithms for several important settings, including for general set systems, sparse set systems and set systems with bounded hereditary discrepancy. The hereditary discrepancy of a set system is the maximum discrepancy over all set systems obtainable by deleting a subset of the ground elements. While polynomial time, Bansal's algorithms were not practical, with e.g. his algorithm for the hereditary setup running in time Ω(mn^4.5) for set systems with m sets over a ground set of n elements. More efficient algorithms have since been developed for general and sparse set systems; however, for the hereditary case, Bansal's algorithm remains state-of-the-art. In this work, we give a significantly faster algorithm with hereditary guarantees, running in O(mn^2 lg(2+m/n) + n^3) time. Our algorithm is based on new structural insights into set systems with bounded hereditary discrepancy. We also implement our algorithm and show experimentally that it computes colorings that are significantly better than random and finishes in a reasonable amount of time, even on set systems with thousands of sets over a ground set of thousands of elements.
In this talk, we will present some ongoing work on combining fully dynamic algorithms for clustering with privacy considerations.
Cut games are among the most fundamental strategic games in algorithmic game theory. It is well-known that computing an exact pure Nash equilibrium in these games is PLS-hard, so research has focused on computing approximate equilibria. We present a polynomial-time algorithm that computes 2.7371-approximate pure Nash equilibria in cut games. This is the first improvement to the previously best-known bound of 3, due to the work of Bhalgat, Chakraborty, and Khanna from EC 2010. Our algorithm is based on a general recipe proposed by Caragiannis, Fanelli, Gravin, and Skopalik from FOCS 2011 and applied to several potential games since then. The first novelty of our work is the introduction of a phase that can identify subsets of players who can simultaneously improve their utilities considerably. This is done via semidefinite programming and randomized rounding. In particular, a negative objective value of the semidefinite program guarantees that no such considerable improvement is possible for a given set of players. Otherwise, randomized rounding of the SDP solution is used to identify a set of players who can simultaneously improve their strategies considerably and allows the algorithm to make progress. The way rounding is performed is another important novelty of our work. Here, we exploit an idea that dates back to a paper by Feige and Goemans from 1995, but we take it to an extreme that has not been analyzed before. Based on joint work with Ioannis Caragiannis.
Schelling's famous model of segregation assumes agents of different types, who would like to be located in neighborhoods having at least a certain fraction of agents of the same type. We consider natural generalizations that allow for the possibility of agents being tolerant towards other agents, even if they are not of the same type. In particular, we consider an ordering of the types, and make the realistic assumption that the agents are in principle more tolerant towards agents of types that are closer to their own according to the ordering. Based on this, we study the strategic games induced when the agents aim to maximize their utility, for a variety of tolerance levels. We provide a collection of results about the existence of equilibria, and their quality in terms of social welfare. Joint work with Panagiotis Kanellopoulos and Alexandros A. Voudouris.
In this talk, we will look at an algorithm to find a shortest cycle in a weighted directed graph. Let G be a directed graph with weights on the edges, where the weight of every edge is an integer in [-W,...,W]. The weight of a cycle is defined to be the sum of the weights of the edges in the cycle. Cygan, Gabow, and Sankowski gave an algorithm to find a shortest cycle in time O(W n^ω log W), where O(n^ω) is the time to multiply two n × n matrices. The algorithm uses algebraic tools to compute the determinant of a polynomial matrix and find an edge in a shortest cycle efficiently.
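For contrast with the algebraic O(W n^ω log W) algorithm, a naive O(n^3) baseline (my own sketch, under the assumption that the graph contains no negative cycle) computes all-pairs distances with Floyd-Warshall and then closes each edge into a cycle:

    from itertools import product

    def shortest_cycle_weight(n, edges):
        """edges: list of (u, v, w) with integer weights; assumes no negative cycle exists.
        Returns the minimum weight of a cycle, or None if the graph is acyclic."""
        INF = float("inf")
        dist = [[INF] * n for _ in range(n)]
        for v in range(n):
            dist[v][v] = 0
        for u, v, w in edges:
            dist[u][v] = min(dist[u][v], w)
        for k, i, j in product(range(n), repeat=3):          # Floyd-Warshall, O(n^3)
            if dist[i][k] + dist[k][j] < dist[i][j]:
                dist[i][j] = dist[i][k] + dist[k][j]
        # every cycle uses some edge (u, v); close it with a shortest path from v back to u
        candidates = [w + dist[v][u] for u, v, w in edges if dist[v][u] < INF]
        return min(candidates) if candidates else None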
We give a simplified and improved lower bound for the simplex range reporting problem. We show that given a set P of n points in R^d, any data structure that uses S(n) space to answer such queries must have Q(n) = Ω((n^2/S(n))^((d-1)/d) + k) query time, where k is the output size. For near-linear space data structures, i.e., S(n) = O(n log^O(1) n), this improves the previous lower bounds by Chazelle and Rosenberg [CR96] and Afshani [A12] but perhaps more importantly, it is the first ever tight lower bound for any variant of simplex range searching for d ≥ 3 dimensions. We obtain our lower bound by making a simple connection to well-studied problems in incidence geometry which allows us to use known constructions in the area. We observe that a small modification of a simple already existing construction can lead to our lower bound. We believe that our proof is accessible to a much wider audience, at least compared to the previous intricate probabilistic proofs based on measure arguments by Chazelle and Rosenberg [CR96] and Afshani [A12]. The lack of tight or almost-tight (up to polylogarithmic factor) lower bounds for near-linear space data structures is a major bottleneck in making progress on problems such as proving lower bounds for multilevel data structures. It is our hope that this new line of attack based on incidence geometry can lead to further progress in this area.
Buhrman, Cleve and Wigderson (STOC'98) showed that for every Boolean function f : {0,1}^n → {0,1} and G ∈ {AND2, XOR2}, the bounded-error quantum communication complexity of the composed function f ∘ G is O(Q(f) log n), where Q(f) denotes the bounded-error quantum query complexity of f. This is known as the BCW Simulation Theorem. This is in contrast with the classical setting, where it is easy to show that R^cc(f ∘ G) ≤ 2R(f), where R^cc and R denote bounded-error communication and query complexity, respectively. We exhibit a total function for which the log n overhead in the BCW simulation is required. We also show that the log n overhead is not required when f is symmetric (i.e., depends only on the Hamming weight of its input), generalizing a result of Aaronson and Ambainis for the Set-Disjointness function (Theory of Computing'05).
We study the task of obliviously compressing a vector comprised of n ciphertexts of size ξ bits each, where at most t of the corresponding plaintexts are non-zero. This problem commonly features in applications involving encrypted outsourced storages, such as searchable encryption or oblivious message retrieval. We present two new algorithms with provable worst-case guarantees, solving this problem by using only homomorphic additions and multiplications by constants. Both of our new constructions improve upon the state of the art asymptotically and concretely. Our first construction, based on sparse polynomials, is perfectly correct and the first to achieve an asymptotically optimal compression rate by compressing the input vector into O(tξ) bits. Compression can be performed homomorphically by performing O(n log n) homomorphic additions and multiplications by constants. The main drawback of this construction is a decoding complexity of Ω(√n). Our second construction is based on a novel variant of invertible bloom lookup tables and is correct with probability 1 - 2^(-κ). It has a slightly worse compression rate compared to our first construction as it compresses the input vector into O(ξκt/log t) bits. In exchange, both compression and decompression of this construction are highly efficient. The compression complexity is dominated by O(nκ) homomorphic additions and multiplications by constants. The decompression complexity is dominated by O(ξκt/log t) decryption operations and equally many inversions of a pseudorandom permutation.
Using participatory budgeting (PB), cities give their residents an opportunity to influence how the city's budget is spent. This is often done by collecting project proposals, and then voting on those proposals. This talk gives a survey on work in computational social choice on this voting problem. First, we discuss the voting systems in use in major cities, which mostly involve a naive greedy algorithm based on approval scores. Then we formalize a model of voting under a knapsack constraint with additive valuations, and discuss how standard approval votes fit in that model. We then discuss a new voting rule, the Method of Equal Shares, that I have recently proposed with my coauthor Piotr Skowron. This rule provides proportional representation, which I'll argue is a desirable goal in PB. Finally, I will mention PB work on the core, strategyproofness, computational complexity, and several extensions which could form directions for future research: allowing negative votes, allowing for constraints, as well as analyzing input formats and the process of agenda setting, among others.
A distance oracle of a graph is a preferably compact data structure that can efficiently answer queries for the shortest path distance from one query vertex to another. A trivial distance oracle is a look-up table that stores the shortest path distance between any pair of vertices. This oracle has constant query time but requires Θ(n^2) space. For general graphs, this is the best one can hope for when exact shortest path distance queries must be supported. In this talk, I will present some recent results I have been involved in which show that for planar graphs, it is possible to get an oracle that reports exact shortest path distances using space truly subquadratic in n (i.e., O(n^(2-ε)) for some constant ε > 0) and constant or O(log n) query time. The previous best space bounds with polylogarithmic query time in planar graphs were only below Θ(n^2) by log n factors. I will also briefly mention progress I have obtained for approximate distance oracles in planar graphs.
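As a point of reference for the space bounds above, the trivial Θ(n^2)-space oracle can be sketched in a few lines of Python (a generic baseline for unweighted graphs, not the planar-graph construction from the talk):

    from collections import deque

    def build_lookup_oracle(n, adj):
        """adj[u] = list of neighbours of u in an unweighted graph.
        Builds the Theta(n^2)-size distance table with n BFS runs; each query is one lookup."""
        table = []
        for s in range(n):
            dist = [-1] * n
            dist[s] = 0
            queue = deque([s])
            while queue:
                u = queue.popleft()
                for v in adj[u]:
                    if dist[v] == -1:
                        dist[v] = dist[u] + 1
                        queue.append(v)
            table.append(dist)
        return lambda u, v: table[u][v]      # O(1) query time, Theta(n^2) space

    # usage: a 4-cycle
    oracle = build_lookup_oracle(4, [[1, 3], [0, 2], [1, 3], [2, 0]])
    print(oracle(0, 2))                      # 2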
In the problem of semialgebraic range searching, we are to preprocess a set of points in R^D such that the subset of points inside a semialgebraic region described by O(1) polynomial inequalities of degree Δ can be found efficiently. Relatively recently, several major advances were made on this problem. Using algebraic techniques, "near-linear space" structures [AMS13,MP15] with almost optimal query time of Q(n) = O(n^(1-1/D+o(1))) were obtained. For "fast query" structures (i.e., when Q(n) = n^o(1)), it was conjectured that a structure with space S(n) = O(n^(D+o(1))) is possible. The conjecture was refuted recently by Afshani and Cheng [AC21]. In the plane, they proved that S(n) = Ω(n^(Δ+1-o(1)) / Q(n)^((Δ+3)Δ/2)), which shows Ω(n^(Δ+1-o(1))) space is needed for Q(n) = n^o(1). While this refutes the conjecture, it still leaves a number of unresolved issues: the lower bound only works in 2D and for fast queries, and neither the exponent of n nor that of Q(n) seems to be tight even for D = 2, as the current upper bound is S(n) = O(n^(m+o(1)) / Q(n)^((m-1)D/(D-1))) where m = (D+Δ choose D) - 1 = Ω(Δ^D) is the maximum number of parameters needed to define a monic degree-Δ D-variate polynomial, for any D, Δ = O(1). In this paper, we resolve two of the issues: we prove a lower bound in D dimensions and show that when Q(n) = n^o(1) + O(k), S(n) = Ω(n^(m-o(1))), which is almost tight as far as the exponent of n is concerned, in the pointer machine model. When considering the exponent of Q(n), we show that the analysis in [AC21] is tight for D = 2, by presenting matching upper bounds for uniform random point sets. This shows either the existing upper bounds can be improved or a fundamentally different input set is needed to get a better lower bound.
We consider the problem of maximizing the Nash social welfare when allocating a set G of indivisible goods to a set N of agents. We study instances, in which all agents have 2-value additive valuations: The value of every agent i ∈ N for every good j ∈ G is either p or q, for p,q ∈ N, p<q. In this work, we design an algorithm to compute an optimal allocation in polynomial time if p divides q, i.e., when p = 1 and q ∈ N, after appropriate scaling. The problem is NP-hard whenever p and q are coprime and p ≥ 3. In terms of approximation, we present positive and negative results for general p and q. We show that our algorithm obtains an approximation ratio of at most 1.0345. Moreover, we prove that the problem is APX-hard, with a lower bound of 1.000015 achieved at p/q = 4/5.
A priority queue stores a set of items with associated keys and supports the insertion of a new item and extraction of an item with minimum key. In applications like Dijkstra's single source shortest path algorithm and Prim-Jarnik's minimum spanning tree algorithm, the key of an item can decrease over time. Usually this is handled by either using a priority queue supporting the deletion of an arbitrary item or a dedicated DecreaseKey operation, or by inserting the same item multiple times but with decreasing keys. We study what happens if the keys associated with items in a priority queue can decrease over time without informing the priority queue, and how such a priority queue can be used in Dijkstra's algorithm. We show that binary heaps with bottom-up insertions fail to report items with unchanged keys in correct order, while binary heaps with top-down insertions report items with unchanged keys in correct order. Furthermore, we show that skew heaps, leftist heaps, and priority queues based on linking roots of heap-ordered trees, like pairing heaps, binomial queues and Fibonacci heaps, work correctly with decreasing keys without any modifications. Finally, we show that the post-order heap by Harvey and Zatloukal, a variant of a binary heap with amortized constant time insertions and amortized logarithmic time deletions, works correctly with decreasing keys and is a strong contender for an implicit priority queue supporting decreasing keys in practice. To be presented at FUN22.
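For context, the "insert the same item multiple times" workaround mentioned above looks roughly as follows in a standard binary-heap Dijkstra (an illustrative sketch, not a construction from the paper):

    import heapq

    def dijkstra_lazy(n, adj, source):
        """adj[u] = list of (v, w) with w >= 0. Instead of a DecreaseKey operation,
        a vertex is re-inserted whenever its tentative distance decreases, and stale
        heap entries are discarded upon extraction."""
        dist = [float("inf")] * n
        dist[source] = 0
        heap = [(0, source)]
        while heap:
            d, u = heapq.heappop(heap)
            if d > dist[u]:                     # stale entry: a smaller key was inserted later
                continue
            for v, w in adj[u]:
                if d + w < dist[v]:
                    dist[v] = d + w
                    heapq.heappush(heap, (dist[v], v))   # duplicate insertion, smaller key
        return dist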
tSNE and UMAP are popular dimensionality reduction algorithms due to their speed and interpretable low-dimensional embeddings. Despite their popularity, however, little work has been done to study their full span of differences. We theoretically and experimentally evaluate the space of parameters in both tSNE and UMAP and observe that a single one – the normalization – is responsible for switching between them. This, in turn, implies that a majority of the algorithmic differences can be toggled without affecting the embeddings. We discuss the implications this has on several theoretic claims behind UMAP, as well as how to reconcile them with existing tSNE interpretations. Based on our analysis, we provide a method (GDR) that combines previously incompatible techniques from tSNE and UMAP and can replicate the results of either algorithm. This allows our method to incorporate further improvements, such as an acceleration that obtains either method’s outputs faster than UMAP. We release improved versions of tSNE, UMAP, and GDR that are fully plug-and-play with the traditional libraries.
We consider the following surveillance problem: Given a set P of n sites in a metric space and a set of k robots with the same maximum speed, compute a patrol schedule of minimum latency for the robots. Here a patrol schedule specifies for each robot an infinite sequence of sites to visit (in the given order) and the latency L of a schedule is the maximum latency of any site, where the latency of a site s is the supremum of the lengths of the time intervals between consecutive visits to s. When k=1 the problem is equivalent to the travelling salesman problem (TSP) and thus it is NP-hard. We have two main results. We consider cyclic solutions in which the set of sites must be partitioned into l groups, for some l ≤ k, and each group is assigned a subset of the robots that move along the travelling salesman tour of the group at equal distance from each other. Our first main result is that approximating the optimal latency of the class of cyclic solutions can be reduced to approximating the optimal travelling salesman tour on some input, with only a 1 + ε factor loss in the approximation factor and an O((k/ε)^k) factor loss in the runtime, for any ε > 0. Our second main result shows that an optimal cyclic solution is a 2(1-1/k)-approximation of the overall optimal solution. Note that for k=2 this implies that an optimal cyclic solution is optimal overall. The results have a number of consequences. For the Euclidean version of the problem, for instance, combining our results with known results on Euclidean TSP yields a PTAS for approximating an optimal cyclic solution, and it yields a (2(1-1/k)+ε)-approximation of the optimal unrestricted solution. If the conjecture that an optimal cyclic solution is always optimal overall holds, then our algorithm is actually a PTAS for the general problem in the Euclidean setting.
Tensors are multidimensional arrays of numerical values, and they are used for representing high dimensional data in many scientific problems. Tensor decomposition methods therefore have a wide range of applications, such as in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, machine learning, etc. The scope of this talk is to introduce basic tensor concepts and some well-known tensor decomposition methods from the literature, and to give information about my research studies related to tensor decomposition.
We consider maximum matching in dynamic graphs. Given a sequence of edge insertions and deletions, we aim to maintain a large matching using as little update time per edge as possible. In this talk, we will present a deterministic algorithm that maintains a (3/2+ε)-approximate matching using O(m^(1/4)) worst-case update time. The talk will focus on efficiently maintaining a subgraph known as an edge degree constrained subgraph (EDCS) with strong structural properties, which may be of independent interest. Based on joint work with Fabrizio Grandoni, Shay Solomon and Amitai Uzrad.
We study the PAC learnability of multiwinner voting, focusing on the class of approval-based committee scoring (ABCS) rules. These are voting rules applied to profiles with approval ballots, where each voter approves some of the candidates. According to ABCS rules, each committee of k candidates collects from each voter a score that depends on the size of the voter's ballot and on the size of its intersection with the committee. Then, committees of maximum score are the winning ones. Our goal is to learn a target rule (i.e., to learn the corresponding scoring function) using information about the winning committees of a small number of sampled profiles. Despite the existence of exponentially many outcomes compared to single-winner elections, we show that the sample complexity is still low: a polynomial number of samples carries enough information for learning the target committee with high confidence and accuracy. Unfortunately, even simple tasks that need to be solved for learning from these samples are intractable. We prove that deciding whether there exists some ABCS rule that makes a given committee winning in a given profile is a computationally hard problem. Our results extend to the class of sequential Thiele rules, which have received attention due to their simplicity.
We study the computational complexity of computing solutions for the straight-cut and square-cut pizza sharing problems. We show that finding an approximate solution is PPA-hard for the straight-cut problem, and PPA-complete for the square-cut problem, while finding an exact solution for the square-cut problem is FIXP-hard and in BU. Our PPA-hardness results apply even when all mass distributions are unions of non-overlapping squares, and our FIXP-hardness result applies even when all mass distributions are unions of weighted squares and right-angled triangles. We also show that decision variants of the square-cut problem are hard: we show that the approximate problem is NP-complete, and the exact problem is ETR-complete. Joint work with John Fearnley and Themistoklis Melissourgos. Link: https://arxiv.org/abs/2012.14236
Expander graphs, over the last few decades, have played a pervasive role in almost all areas of theoretical computer science. Loosely speaking, an expander graph is an extremely well-connected graph despite being sparse. Recently, various high-dimensional analogues of these objects have been studied in mathematics and, even more recently, there have been some surprising applications in computer science, especially in the areas of property testing, coding theory and approximate counting. In this talk, I'll give a high-level introduction to these high-dimensional expanders (HDXs). In particular, we will view them through the perspective of random walks on graphs. Time permitting, we will then see some applications of these HDXs towards property testing, coding theory and matroid counting.
In 2017, Goldstein, Kopelowitz, Lewenstein, and Porat introduced the data structure variant of the 3SUM problem called 3SUM-Indexing in the cell probe model and furthermore drew links between its lower bounds and lower bounds for the Set Disjointness problem. Golovnev, Guo, Horel, Park, and Vaikuntanathan in 2020 then gave the first non-trivial data structure lower bound for small query times which match the best known lower bounds for static data structure problems and reductions to other problems in computational geometry. In this talk, we will present two small improvements to their lower bound result. Namely, we show that for any sufficiently small query time T, we require a data structure of size at least n^(1+2/T) (which is a factor of n^(1/T) larger than the existing lower bound). Furthermore, we show that if each memory cell only contains a single bit, the size of the data structure must be at least n^3.
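To fix ideas about the problem (formulations vary slightly between papers; this sketch follows the common two-set version, where a query asks whether some a in A and b in B sum to z), the two trivial extremes of the space/query trade-off look like this:

    from bisect import bisect_left

    class ThreeSumIndexLinear:
        """O(n) space: a query scans A and looks each complement up in a hash set of B."""
        def __init__(self, A, B):
            self.A, self.B = list(A), set(B)
        def query(self, z):
            return any(z - a in self.B for a in self.A)           # O(n) query time

    class ThreeSumIndexQuadratic:
        """O(n^2) space: all pairwise sums pre-sorted; a query is a single binary search."""
        def __init__(self, A, B):
            self.sums = sorted(a + b for a in A for b in B)
        def query(self, z):
            i = bisect_left(self.sums, z)
            return i < len(self.sums) and self.sums[i] == z       # O(log n) query time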
Graph neural networks are the de-facto standard way of applying machine learning to graphs. In this talk, we will cover the basic idea of GNNs and how they are typically used in graph-based machine learning. We then look into the expressivity of GNNs and present the connection to the 1-dimensional Weisfeiler-Leman algorithm, also known as color refinement. Article: https://arxiv.org/abs/1810.02244
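A compact sketch of the color refinement (1-WL) procedure mentioned above, in Python; vertex colors are repeatedly replaced by a canonical label of (own color, multiset of neighbour colors) until the partition stabilises:

    def color_refinement(adj):
        """adj[u] = list of neighbours of u. Returns a stable vertex coloring; two graphs
        whose color histograms differ are certainly non-isomorphic (the converse can fail)."""
        n = len(adj)
        colors = [0] * n                          # start from the uniform coloring
        for _ in range(n):                        # stabilises after at most n rounds
            signatures = [(colors[u], tuple(sorted(colors[v] for v in adj[u])))
                          for u in range(n)]
            relabel = {sig: i for i, sig in enumerate(sorted(set(signatures)))}
            new_colors = [relabel[sig] for sig in signatures]
            if new_colors == colors:
                break
            colors = new_colors
        return colors

    # usage: a triangle vs. a path on three vertices get different color histograms
    print(color_refinement([[1, 2], [0, 2], [0, 1]]))   # [0, 0, 0]
    print(color_refinement([[1], [0, 2], [1]]))         # [0, 1, 0]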
We follow up on the idea of Lars Arge to rephrase the Reduce and Apply procedures of Binary Decision Diagrams (BDDs) as iterative I/O-efficient algorithms. We identify multiple avenues to simplify and improve the performance of his proposed algorithms. Furthermore, we extend the technique to other common BDD operations, many of which are not derivable using Apply operations alone, and we provide asymptotic improvements for the procedures that can be derived using Apply. These algorithms are implemented in a new BDD package, named Adiar. We see very promising results when comparing the performance of Adiar with conventional BDD packages that use recursive depth-first algorithms. For instances larger than 9.5 GiB, our algorithms, which partly use the disk, are 1.47 to 3.7 times slower than CUDD and Sylvan, which exclusively use main memory. Yet, our proposed techniques are able to obtain this performance at a fraction of the main memory needed by Sylvan to function. Furthermore, with Adiar we are able to manipulate BDDs that outgrow main memory and so surpass the limits of other BDD packages. arXiv Paper: https://arxiv.org/abs/2104.12101 GitHub: https://github.com/ssoelvsten/adiar
The program performance on modern hardware is characterized by locality of reference, that is, it is faster to access data that is close in address space to data that has been accessed recently than data in a random location. This is due to many architectural features including caches, prefetching, virtual address translation and the physical properties of a hard disk drive; attempting to model all the components that constitute the performance of a modern machine is impossible, especially for general algorithm design purposes. What if one could prove an algorithm is asymptotically optimal on all systems that reward locality of reference, no matter how it manifests itself within reasonable limits? We show that this is possible, and that excluding some pathological cases, cache-oblivious algorithms that are asymptotically optimal in the ideal-cache model are asymptotically optimal in any reasonable setting that rewards locality of reference. This is surprising as the cache-oblivious framework envisions a particular architectural model involving blocked memory transfer into a multi-level hierarchy of caches of varying sizes, and was not designed to directly model locality-of-reference-correlated performance.
We prove an Ω(log n loglog n) lower bound for the span of implementing the n-input, log n-depth FFT circuit (also known as the butterfly network) in the nonatomic binary fork-join model. In this model, memory-access synchronizations occur only through fork operations, which spawn two child threads, and join operations, which resume a parent thread when its child threads terminate. Our bound is asymptotically tight for the nonatomic binary fork-join model, which has been of interest of late, due to its conceptual elegance and ability to capture asynchrony. Our bound implies a super-logarithmic lower bound in the nonatomic binary fork-join model for implementing the butterfly merging networks used, e.g., in Batcher's bitonic and odd-even mergesort networks. This lower bound also implies an asymptotic separation result for the atomic and nonatomic versions of the fork-join model, since, as we point out, FFT circuits can be implemented in the atomic binary fork-join model with span equal to their circuit depth. Joint work with Michael T. Goodrich and Riko Jacob, presented at SODA 2021.
Microchip design is a process which consists of several stages, each stage posing its own flavour of algorithmic challenges. These challenges fall within different fields of algorithmic research, such as computational geometry, graph theory, parallel algorithms, and combinatorial optimization. In this talk we present an overview of the modern chip design process, and we briefly outline some of the main problems currently encountered in this field.
I will introduce a new direction in categorical range counting. Categorical range counting is the problem of storing a colored point set in d dimensions in a data structure, such that we can answer the following query: given a range, output the number of distinct colors within the query range. In the classical setting the categories/colors are thought of as independent. In this talk I will present a version of the problem where the color set has dependencies, and show some basic lower bounds and upper bounds in this setting.
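To fix notation, the classical (independent-colors) query in one dimension reads as follows; the talk concerns data structures that answer it without scanning all points, and a variant where the colors are dependent (a minimal sketch of the problem statement only):

    def distinct_colors_in_range(points, lo, hi):
        """points: list of (coordinate, color) pairs in 1-D; [lo, hi] is the query range.
        The naive linear scan; range-counting data structures answer the same query faster."""
        return len({color for x, color in points if lo <= x <= hi})

    # usage
    pts = [(1, "red"), (2, "blue"), (3, "red"), (7, "green")]
    print(distinct_colors_in_range(pts, 1, 3))   # 2 distinct colors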
I will give a brief overview of Fair Division, and then describe its elegant connection to a celebrated result in combinatorial topology, namely Sperner's Lemma.
Iterative methods for approximating zero-sum Nash equilibria in extensive-form games have been a core component of recent advances in superhuman poker AIs. In this talk, I will first give an optimization-oriented description of how these methods work. Then, I will discuss and contrast two recent results: First, the development of a new entropy-based regularization method for the decision spaces associated with extensive-form games, which is simultaneously simpler to analyze and has better theoretical properties than the current state of the art. Second, I will discuss new algorithms based on optimistic variants of regret matching and CFR, which lead to very strong practical performance, in spite of inferior theoretical guarantees.
Robust property-preserving hash (PPH) functions compress large inputs x and y into short digests h(x) and h(y) in a manner that allows for evaluating a predicate P on x and y while only having access to the corresponding hash values. In contrast to locality-sensitive hash functions, a robust PPH function guarantees to correctly evaluate a predicate on h(x) and h(y) even if x and y are chosen adversarially after seeing h. In this talk, I'll give an introduction to this new area and I will present the first construction of a robust PPH function for the exact Hamming distance predicate (i.e. "is the Hamming distance between inputs x and y larger than some parameter t?"). The talk will be based on recent results that will appear at Eurocrypt 2021.
I will introduce the notion of a probabilistic polynomial for a Boolean function, which arose in the 1980s in the study of circuit lower bounds (Razborov 1987) and has more recently found use in algorithms for combinatorial problems such as All Pairs Shortest Paths (Williams 2014). Informally, a probabilistic polynomial is a randomized algorithm, where the algorithm is a polynomial and its efficiency is measured by its degree. I will try to describe two results. 1. A result from 2016 where we proved lower bounds on the probabilistic polynomial degree of the OR function. This is proved via interesting results about anti-concentration of low-degree polynomials. 2. A recent result where we show lower bounds on the minimum possible probabilistic degree of an n-variable function. This is joint work with Prahladh Harsha and S Venkitesh.
In a landmark paper, Patrascu demonstrated how a single lower bound for the static data structure problem of reachability in the butterfly graph could be used to derive a wealth of new and previously known lower bounds via reductions. These lower bounds are tight for numerous static data structure problems. Moreover, he also showed that reachability in the butterfly graph reduces to dynamic marked ancestor, a classic problem used to prove lower bounds for dynamic data structures. Unfortunately, Patrascu's reduction to marked ancestor loses a lg lg n factor and therefore falls short of fully recovering all the previous dynamic data structure lower bounds that follow from marked ancestor. In this paper, we revisit Patrascu's work and give a new lossless reduction to dynamic marked ancestor, thereby establishing reachability in the butterfly graph as a single seed problem from which a range of tight static and dynamic data structure lower bounds follow. To be presented at SOSA 2021.
We investigate the limits of one of the fundamental ideas in data structures: fractional cascading. This is an important data structure technique to speed up repeated searches for the same key in multiple lists and it has numerous applications. Specifically, the input is a "catalog" graph, G, of constant degree together with a list of values assigned to every vertex of G. The goal is to preprocess the input such that given a connected subgraph G' of G and a single query value q, one can find the predecessor of q in every list that belongs to G'. The classical result by Chazelle and Guibas shows that in a pointer machine, this can be done in the optimal time of O(log n + |G'|) where n is the total number of values. However, if insertion and deletion of values are allowed, then the query time slows down to O(log n + |G'| loglog n). If only insertions (or deletions) are allowed, then once again, an optimal query time can be obtained but by using amortization at update time. We prove a lower bound of Ω(log n √(loglog n)) on the worst-case query time of dynamic fractional cascading, when queries are paths of length O(log n). The lower bound applies both to fully dynamic data structures with amortized polylogarithmic update time and incremental data structures with polylogarithmic worst-case update time. As a side result, this also proves that amortization is crucial for obtaining an optimal incremental data structure. This is the first non-trivial pointer machine lower bound for a dynamic data structure that breaks the Ω(log n) barrier. In order to obtain this result, we develop a number of new ideas and techniques that hopefully can be useful to obtain additional dynamic lower bounds in the pointer machine model. To be presented at SODA 2021.
In this work, we present the first asymptotically optimal oblivious priority queue, which matches the lower bound of Jacob, Larsen, and Nielsen (SODA'19). Our construction is conceptually simple and statistically secure. We illustrate the power of our optimal oblivious priority queue by presenting a conceptually equally simple construction of statistically secure offline ORAMs with O(log n) bandwidth overhead. To be presented at SODA 2021.
Chazelle introduced the soft heap as a building block for efficient minimum spanning tree algorithms, and recently Kaplan et al. showed how soft heaps can be applied to achieve simpler algorithms for various selection problems. A soft heap trades off accuracy for efficiency by allowing εN of the items in a heap to be corrupted after a total of N insertions, where a corrupted item is an item with an artificially increased key and 0 < ε ≤ 1/2 is a fixed error parameter. Chazelle's soft heaps are based on binomial trees and support insertions in amortized O(log(1/ε)) time and extract-min operations in amortized O(1) time. In this paper we explore the design space of soft heaps. The main contribution of this paper is an alternative soft heap implementation based on merging sorted sequences, with time bounds matching those of Chazelle's soft heaps. We also discuss a variation of the soft heap by Kaplan et al., where we avoid performing insertions lazily. It is based on ternary trees instead of binary trees and matches the time bounds of Kaplan et al., i.e. amortized O(1) insertions and amortized O(log(1/ε)) extract-min. Both our data structures only introduce corruptions after extract-min operations, which return the set of items corrupted by the operation. To be presented at SOSA 2021.
Various Neural Networks employ time-consuming matrix operations like matrix inversion. Many such matrix operations are faster to compute given the Singular Value Decomposition (SVD). Previous work allows using the SVD in Neural Networks without computing it. In theory, the techniques can speed up matrix operations, however, in practice, they are not fast enough. We present an algorithm that is fast enough to speed up several matrix operations. The algorithm increases the degree of parallelism of an underlying matrix multiplication H· X where H is an orthogonal matrix represented by a product of Householder matrices. To be presented at NeurIPS 2020.
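For reference, the sequential baseline that such work speeds up: multiplying X by a product of Householder matrices H = H_1 ... H_k, where each H_i = I - 2 v_i v_i^T is applied as a rank-one update (a sketch of the standard method, not of the algorithm from the talk):

    import numpy as np

    def apply_householder_product(V, X):
        """V: k x d array whose rows are unit vectors v_1, ..., v_k; X: d x m matrix.
        Computes (H_1 H_2 ... H_k) X with H_i = I - 2 v_i v_i^T, one reflection at a time."""
        Y = X.copy()
        for v in V[::-1]:                     # H_k is applied first, H_1 last
            Y -= 2.0 * np.outer(v, v @ Y)     # rank-one update, O(d m) per reflection
        return Y

    # usage: each H_i is orthogonal, so the product preserves column norms
    rng = np.random.default_rng(0)
    V = rng.normal(size=(5, 8)); V /= np.linalg.norm(V, axis=1, keepdims=True)
    X = rng.normal(size=(8, 3))
    print(np.allclose(np.linalg.norm(apply_householder_product(V, X), axis=0),
                      np.linalg.norm(X, axis=0)))   # True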
Motivated by practical generalizations of the classic k-median and k-means objectives, such as clustering with size constraints, fair clustering, and Wasserstein barycenter, we introduce a meta-theorem for designing coresets for constrained-clustering problems. The meta-theorem reduces the task of coreset construction to one on a bounded number of ring-instances with a much-relaxed additive error. This enables us to use uniform sampling, in contrast to the widely used importance sampling, when constructing our coreset, and consequently we can easily handle constrained objectives. Notably and perhaps surprisingly, we show that this simpler sampling scheme can yield bounds that are independent of n, the number of input points. Our technique yields better coreset bounds, and sometimes the first coreset, for a large number of constrained clustering problems, including capacitated clustering, fair clustering, clustering with anonymity, clustering with diversity, Euclidean Wasserstein barycenter, clustering in minor-excluded graphs, and polygon clustering under the Fréchet and Hausdorff distances. Finally, our technique also yields improved coresets for 1-Median in low-dimensional Euclidean spaces.
Fractional cascading is one of the most influential and important techniques in data structures, as it provides a general framework for solving a common important problem: the iterative search problem. In the problem, the input is a graph G with constant degree. Also as input, we are given a set of values for every vertex of G. The goal is to preprocess G such that when we are given a query value q, and a connected subgraph π of G, we can find the predecessor of q in all the sets associated with the vertices of π. The fundamental result of fractional cascading, by Chazelle and Guibas, is that there exists a data structure that uses linear space and can answer queries in O(log n + |π|) time, at essentially constant time per predecessor. While this technique has received plenty of attention in the past decades, an almost quadratic space lower bound for "two-dimensional fractional cascading" by Chazelle and Liu in STOC 2001 has convinced the researchers that fractional cascading is fundamentally a one-dimensional technique. In two-dimensional fractional cascading, the input includes a planar subdivision for every vertex of G and the query is a point q and a subgraph π, and the goal is to locate the cell containing q in all the subdivisions associated with the vertices of π. In this paper, we show that it is actually possible to circumvent the lower bound of Chazelle and Liu for axis-aligned planar subdivisions. We present a number of upper and lower bounds which reveal that in two dimensions, the problem has a much richer structure. When G is a tree and π is a path, then queries can be answered in O(log n + |π| + min{|π|√(log n), α(n)√(|π| log n)}) time using linear space, where α is an inverse Ackermann function; surprisingly, we show both branches of this bound are tight, up to the inverse Ackermann factor. When G is a general graph or when π is a general subgraph, then the query bound becomes O(log n + |π|√(log n)) and this bound is once again tight in both cases.
We reexamine three classical settings of optimization under uncertainty, which have been extensively studied in the past, assuming that the several random events involved are mutually independent. Here, we assume that such events are only pair-wise independent; this gives rise to a much richer space of instances. Our aim has been to explore whether positive results are possible even under the more general assumptions. We show that this is indeed the case. Indicatively, we show that, when applied to pair-wise independent distributions of buyer values, sequential posted pricing mechanisms get at least a 1/1.299 fraction of the revenue they get from mutually independent distributions with the same marginals. We also adapt the well-known prophet inequality to pair-wise independent distributions of prize values to get a 1/3-approximation using a non-standard uniform threshold strategy. Finally, in a stochastic model of generating random bipartite graphs with pair-wise independence on the edges, we show that the expected size of the maximum matching is large but considerably smaller than in Erdős-Rényi random graph models where edges are selected independently. Our techniques include a technical lemma that might find applications in other interesting settings involving pair-wise independence. Joint work with Nick Gravin, Pinyan Lu, and Zihe Wang.
This work presents an improved hashing-based algorithm for Private Set Intersection (PSI) in the honest-but-curious setting. The protocol is generic, modular and provides both asymptotic and concrete efficiency improvements over existing PSI protocols. If each player has m elements, our scheme requires only O(mλ) communication between the parties, where λ is a security parameter. Our protocol builds on the hashing-based PSI protocol of Pinkas et al. (USENIX 2014, USENIX 2015), but we replace one of the sub-protocols (handling the cuckoo "stash") with a special-purpose PSI protocol that is optimized for comparing sets of unbalanced size. This brings the asymptotic communication complexity of the overall protocol down from ω(mλ) to O(mλ), and provides concrete performance improvements (10-15% reduction in communication costs) over Kolesnikov et al. (CCS 2016) under real-world parameter choices. Our protocol is simple, generic and benefits from the permutation-hashing optimizations of Pinkas et al. (USENIX 2015) and the Batched, Relaxed Oblivious Pseudo Random Functions of Kolesnikov et al. (CCS 2016).
This interactive presentation consists of several parts, including: 1) A simple abstract resource theory, where resources are elements of a partially ordered set and where adjoining resources, relaxations of resources, and constructions (of resources from resources) are captured as order-preserving functions (i.e., as homomorphisms) satisfying certain axioms, for example (one-sided) commutativity of certain homomorphisms. 2) A theory of discrete probabilistic systems, where most systems discussed in cryptography (and beyond) can be understood as descriptions (in a particular language, for example a specific pseudo-code language) of such discrete systems. An instantiation of discrete systems for modeling time is also discussed, which is important for modeling guarantees in asynchronous systems such as blockchain systems. 3) Constructive cryptography as an instantiation of the resource theory, where resources are (specifications, i.e., sets of) discrete systems. Many diverse types of traditional statements in cryptography can be naturally unified and seen as being of the same type, for example information-theoretic and cryptographic security, static and adaptive corruption, and synchronous and asynchronous models as well as models capturing time. If time permits, I will also discuss an abstract theory of reductions that requires no computational model but nevertheless allows to capture typical complexity-based statements in cryptography (and beyond). The presentation is targeted at two different audiences, cryptographers and formal method specialists, and it aims at providing a bridge between them to allow cryptographers to make clean security statements that are formalizable and amenable to machine-checkable proofs. But the presented framework has a wider scope than cryptography; the ultimate goal (not yet worked out) is to capture statements at various levels, from the physical layer over digital circuits and software to the application, with cryptographic statements naturally integrated as construction steps in a single framework. Based partially on joint work with Renato Renner and with many other researchers.
Traditionally, arithmetic multiparty computation protocols have mostly worked over finite fields. Recently, more general commutative rings have been considered, most notably the integers modulo a prime power. The SPDZ2k protocol (CRYPTO 2018) showed that in the dishonest majority setting, MPC over these rings can be obtained with efficiency comparable to MPC over fields. Since then, several works have shown that in some cases, working over these rings can be much faster than working over fields. We discuss some of the differences of doing MPC over fields versus these rings, and show that efficiency comparable to fields can also be obtained in the information-theoretic honest-majority setting.
Fairness in Secure Multiparty Computation (MPC) is known to be impossible to achieve in the presence of a dishonest majority. Previous works have proposed combining MPC protocols with Cryptocurrencies in order to financially punish aborting adversaries, providing an incentive for parties to honestly follow the protocol. This approach also yields privacy-preserving Smart Contracts, where private inputs can be processed with MPC in order to determine the distribution of funds given to the contract. Unfortunately, the focus of existing work is on proving that this approach is possible, and the constructions presented are monolithic and mostly inefficient. In this work, we put forth the first modular construction of "Insured MPC", where either the output of the private computation (which describes how to distribute funds) is fairly delivered or a proof that a set of parties has misbehaved is produced, allowing for financial punishments. Moreover, both the output and the proof of cheating are publicly verifiable, allowing third parties to independently validate an execution. We present a highly efficient compiler that uses any MPC protocol with certain properties together with a standard (non-private) Smart Contract and a publicly verifiable homomorphic commitment scheme to implement Insured MPC. As an intermediate step, we propose the first construction of a publicly verifiable homomorphic commitment scheme achieving composability guarantees and concrete efficiency. Our results are proven in the Global Universal Composability framework using a Global Random Oracle as the setup assumption. From a theoretical perspective, our general results provide the first characterization of sufficient properties that MPC protocols must achieve in order to be efficiently combined with Cryptocurrencies, as well as insights into publicly verifiable protocols. On the other hand, our constructions have highly efficient concrete instantiations, allowing for fast implementations.
In the 'frequent items' problem one sees a sequence of items in a stream (e.g. a stream of words coming into a search query engine like Google) and wants to report a small list of items containing all frequent items. We would like algorithms for this problem that use memory substantially sublinear in the length of the stream. In this talk we describe a new state-of-the-art solution to this problem, called the "BPTree". We make use of chaining methods to control the suprema of Rademacher processes to develop this new algorithm which has provably near-optimal memory consumption for the "l2" heavy hitters problem, improving upon the CountSieve and CountSketch from previous work. Based on joint work with Vladimir Braverman, Stephen Chestnut, Nikita Ivkin, Zhengyu Wang, and David P. Woodruff
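For background on the problem statement (not on the BPTree itself), a classical deterministic small-memory baseline is the Misra-Gries summary, which solves the l1 version of frequent items with k-1 counters:

    def misra_gries(stream, k):
        """Keeps at most k-1 counters; every item occurring more than len(stream)/k times
        is guaranteed to be among the returned keys (possibly with an undercounted value)."""
        counters = {}
        for item in stream:
            if item in counters:
                counters[item] += 1
            elif len(counters) < k - 1:
                counters[item] = 1
            else:
                for key in list(counters):       # decrement everything, drop zeros
                    counters[key] -= 1
                    if counters[key] == 0:
                        del counters[key]
        return counters

    # usage: 'a' occurs 4 > 7/3 times, so it must survive
    print(misra_gries(["a", "b", "a", "c", "a", "b", "a"], k=3))   # {'a': 3, 'b': 1}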
Machine Learning models, and especially convolutional neural networks (CNNs), are at the heart of many day-to-day applications like image classification and speech recognition. The need for evaluating such models whilst preserving the privacy of the input provided increases as the models are used for more information-sensitive tasks like DNA analysis or facial recognition. Research on evaluating CNNs securely has been very active during the last couple of years, e.g. Mohassel & Zhang (S&P'17) and Liu et al. (CCS'17), leading to very efficient frameworks like SecureNN (ePrint:2018:442), which can perform evaluation of some CNNs with a multiplicative overhead of only 17-33 with respect to evaluation in the clear. We contribute to this line of research by introducing a technique from the Machine Learning domain, namely quantization, which allows us to scale secure evaluation of CNNs to much larger networks without the accuracy loss that could happen by adapting the network to the MPC setting. Quantization is motivated by the deployment of ML models in resource-constrained devices, and we show it to be useful in the MPC setting as well. Our results show that it is possible to evaluate realistic models, specifically Google's MobileNets line of models for image recognition, within seconds. Our performance gain can be mainly attributed to two key ingredients: One is the use of the three-party MPC protocol based on replicated secret sharing by Araki et al. (S&P'17), whose multiplication only requires sending one number per party. Moreover, it allows evaluating arbitrarily long dot products at the same communication cost as a single multiplication, which facilitates matrix multiplications considerably. The second main ingredient is the use of arithmetic modulo 2^64, for which we develop a set of primitives of independent interest that are necessary for the quantization, such as comparison and truncation by a secret shift. This talk is based on joint work with Daniel Escudero (Aarhus), Assi Barak (BIU) and Marcel Keller (data61). The paper is on eprint at https://eprint.iacr.org/2019/131
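The quantization referred to above is, in its most common affine form, the mapping real ≈ scale · (q - zero_point) with q a small integer; a minimal sketch (illustrative of the general idea, not the exact scheme used in the paper):

    import numpy as np

    def quantize(x, num_bits=8):
        """Affine (asymmetric) quantization: real ~= scale * (q - zero_point), q in [0, 2^b - 1]."""
        qmax = 2 ** num_bits - 1
        lo, hi = float(x.min()), float(x.max())
        scale = (hi - lo) / qmax if hi > lo else 1.0
        zero_point = int(round(-lo / scale))
        q = np.clip(np.round(x / scale) + zero_point, 0, qmax).astype(np.uint8)
        return q, scale, zero_point

    def dequantize(q, scale, zero_point):
        return scale * (q.astype(np.float32) - zero_point)

    # usage: values survive the round trip up to roughly one quantization step
    w = np.random.default_rng(0).normal(size=1000).astype(np.float32)
    q, s, z = quantize(w)
    print(float(np.max(np.abs(w - dequantize(q, s, z)))), "max error, scale =", s)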
We prove a lower bound on the communication complexity of unconditionally secure multiparty computation with n = 2t+1 parties of which t are corrupted. We prove that for any g ∈ N there exists a Boolean circuit C with g gates, where any secure protocol implementing C must communicate Ω(t g) bits, even if only passive and statistical security is required, and even if parties are given access to correlated randomness. The result easily extends to constructing similar circuits over any fixed finite field. This shows that for all sizes of circuits, the O(n) overhead of all known protocols when t ≥ ⌊n/2⌋ is inherent. It also shows that security comes at a price: the circuit we consider could namely be computed among n parties with communication of only O(g) bits if no security was required. Our results extend to the case where the threshold t is suboptimal, and this shows that the known optimizations via packed secret-sharing can only be obtained if one accepts that the threshold is t = (1/2 - c)n for a constant c. For the case of honest majority and no correlated randomness, we also show an upper bound that matches the lower bound up to a constant factor (existing upper bounds are a factor log n off for Boolean circuits). Joint work with Jesper Buus Nielsen and Kasper Green Larsen.
Spatial Scan Statistics measure and detect anomalous spatial behavior, specifically they identify geometric regions where significantly more of a measured characteristic is found than would be expected from the background distribution. These techniques have been used widely in geographic information science, such as to pinpoint disease outbreaks. However, until recently, available algorithms and software only scaled to at most a thousand or so spatial records. In this work I will describe how using coresets, efficient constructions, and scanning algorithms, we have developed new algorithms and software that easily scales to millions or more data points. Along the way we provide new efficient algorithms and constructions for ε-samples and ε-nets for various geometric range spaces. This is a case where subtle theoretical improvements of old structures from discrete geometry actually result in substantial empirical improvements.
We start our talk with reflections on the similarity between (some) game-based and (some) simulation-based definitions. We then explain briefly what a typical game-playing proof for a real-life protocol such as TLS looks like and why/how we quickly reach the limits of human understanding in such proofs. We then present a new technique for carrying out game-playing proofs: we package pieces of code that share state and allow information to flow between packages only via calls. Thereby, we obtain a call graph where the packages are the nodes. We then show how reductions can be represented via cuts in the graph and show proof examples that become more manageable when thinking on the level of packages rather than on the level of individual code lines. At the end of the seminar, we would like to discuss the complexity of MPC proofs with the audience.
Using a homomorphic authenticator (HA) scheme (e.g., a homomorphic MAC or signature), a user, Alice, can authenticate a collection of data items using her secret key and send the authenticated data to an untrusted server. At any later point in time, the server can generate a short authenticator vouching for the correctness of the output of a function computed on the outsourced data to another user, Bob. However, how can we authenticate homomorphic computation of functions that involve data authenticated by different users holding independent secret keys? In this talk I will present two solutions to create multi-key homomorphic authenticators. I will describe the first multi-key homomorphic MAC construction by Fiore et al. (Asiacrypt 2016) and the first compiler to enhance any sufficiently expressive single-key homomorphic signature with multi-key properties (Fiore and Pagnin, to appear at SCN 2018).
Take a finite set of biased dice that share some common faces. An adversary repeatedly tosses them, with each choice of die possibly depending on the previous outcomes. Can you extract true randomness? In 1986 Santha and Vazirani gave a negative answer when the dice are (two-sided) coins. In 2015 Beigi, Etesami, and Gohari showed how to obtain an almost-unbiased bit for other sets of dice. The sample complexity of their extractor is polynomial in the inverse of the error. We completely classify all non-trivial randomness sources of this type into: (1) non-extractable ones; (2) extractable from polynomially many samples; and (3) extractable from logarithmically many samples (in the inverse of the error). The extraction algorithms are efficient and easy to describe. I will discuss the relevance to distributed and cryptographic computation from imperfect randomness and point out some open questions in this context.
The Asymmetric External Memory (AEM) model, introduced by Blelloch et al. at SPAA 2015, models computations on new non-volatile memory (NVM), rather than traditional DRAM technology, for storage purposes. The significant difference in NVMs is the fact that the cost of writing data is significantly more expensive than the cost of reading it. This asymmetry requires new algorithmic techniques that emphasize the ability to trade off write accesses for extra read accesses during computation. In this talk I will present sorting algorithms which achieve such a trade-off between read and write accesses in the AEM model. I will also show a lower bound that proves that these algorithms are asymptotically optimal. This is joint work with Riko Jacob, and appeared in SPAA 2017.
We will present a heavy-hitters protocol in the model of local differential privacy. Our protocol achieves optimal or near optimal worst-case error, running time, and memory, improving on the prior state-of-the-art result of Bassily and Smith [STOC 2015]. The improvement is crucial for making locally-private heavy-hitters algorithms usable when the number of participants is in the millions. Joint work with Raef Bassily, Kobbi Nissim, and Abhradeep Thakurta.
Three important properties of classification machinery are: (i) the system preserves the core information of the input data; (ii) the training examples convey information about unseen data; and (iii) the system is able to treat differently points from different classes. In this talk, I will show that these fundamental properties are satisfied by the architecture of deep neural networks. I will highlight a formal proof that such networks with random Gaussian weights perform a distance-preserving embedding of the data, with a special treatment for in-class and out-of-class data. Similar points at the input of the network are likely to have a similar output. The theoretical analysis of deep networks I will present exploits tools used in the compressed sensing and dictionary learning literature, thereby making a formal connection between these important topics. The derived results allow drawing conclusions about the metric learning properties of the network and their relation to its structure, as well as providing bounds on the required size of the training set such that the training examples would represent faithfully the unseen data. I will show how these conclusions are validated by state-of-the-art trained networks. I believe that these results provide an insight why deep learning works and potentially allow designing more efficient network architectures and learning strategies.
Locality-sensitive hashing (LSH) has recently emerged as the dominant algorithmic technique for similarity search due to its sublinear performance guarantees. In the talk we will cover the background and some recent advances of LSH in Hamming space, including bit sampling LSH, Covering LSH, and Fast Covering LSH. Experiments on synthetic and real-world data sets demonstrate that the recent Fast Covering LSH is comparable (and often superior) to traditional hashing-based approaches for search radii up to 20 in high-dimensional Hamming space.
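For illustration, here is a minimal Python sketch of the classic bit-sampling LSH for Hamming space (the baseline technique mentioned above, not the Covering or Fast Covering constructions from the talk); the dimension, number of sampled bits, and toy vectors are illustrative.

import random

def make_bit_sampling_hash(dim, k, seed=None):
    # LSH for {0,1}^dim: the hash is the tuple of k randomly chosen coordinates.
    # Vectors at Hamming distance d collide with probability (1 - d/dim)^k.
    rng = random.Random(seed)
    coords = [rng.randrange(dim) for _ in range(k)]
    def h(x):
        return tuple(x[i] for i in coords)
    return h

dim, k = 16, 4
h = make_bit_sampling_hash(dim, k, seed=42)
x = [1, 0] * 8
y = list(x)
y[3] ^= 1                      # Hamming distance 1 from x
print(h(x) == h(y))            # True with probability (1 - 1/16)^4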
We study the consensus-halving problem of dividing an object into two portions, such that each of n agents has equal valuation for the two portions, given any set of (succinctly representable) valuation functions. The ε-approximate consensus-halving problem allows each agent to have an ε discrepancy on the values of the portions. We prove that computing an ε-approximate consensus-halving solution using n cuts is in PPA, and is PPAD-hard, where ε is some positive constant; the problem remains PPAD-hard when we allow a constant number of additional cuts. We also prove that it is NP-hard to decide whether a solution with n-1 cuts exists for the problem. Finally, we establish a reduction from the more general problem of approximate consensus-(1/k)-division to the known Necklace Splitting Problem, proving that the latter problem is PPAD-hard as a corollary. Joint work with Paul W. Goldberg, Søren Kristoffer Stiil Frederiksen and Jie Zhang.
This talk is a brief overview of my topics of research. In particular, the talk focuses on non-malleable protocols. Loosely speaking, in a non-malleable setting a man-in-the-middle (MIM) is an adversary that can participate in many executions of the protocol and has the power to omit, insert or modify messages at will. It also has full control over the scheduling of the messages. A non-malleable protocol is a protocol that remains secure even in the presence of a MIM adversary. We will present the following results: - 3-round delayed-input concurrent non-malleable commitment from one-way permutations, secure w.r.t. sub-exponential-time adversaries; - 4-round delayed-input parallel non-malleable zero-knowledge (NMZK) from one-way functions. As an application of our 4-round delayed-input parallel NMZK we construct a 4-round multi-party protocol for the coin-tossing functionality based on one-way permutations.
The talk is divided into two parts. 1) Proofs of partial knowledge allow a prover to prove knowledge of witnesses for k out of n instances of NP languages. Cramer, Schoenmakers and Damgaard [CDS94] provided an efficient construction of a 3-round public-coin witness-indistinguishable (k,n)-proof of partial knowledge for any NP language, by cleverly combining n executions of sigma-protocols for that language. This transform assumes that all n instances are fully specified before the proof starts, and thus directly rules out the possibility of choosing some of the instances after the first round. In the first part of the talk we will see how to obtain a proof of partial knowledge where the specification of the instances can be postponed. Previously, this property was achieved only by inefficient constructions requiring NP reductions (Lapidot and Shamir, CRYPTO 1990). 2) Katz and Ostrovsky in CRYPTO 2004 showed a 5-round construction assuming trapdoor permutations for the general case where both players receive the output. They also proved that their result is round optimal. This lower bound has recently been revisited by Garg et al. in Eurocrypt 2016, where a 4-round (optimal) protocol is shown assuming a simultaneous message exchange channel. Unfortunately there is no instantiation of the protocol of Garg et al. under standard polynomial-time hardness assumptions. In the second part of the talk I will show how to construct a 4-round protocol for secure two-party computation (when a simultaneous message exchange channel is available) with black-box simulation, assuming trapdoor permutations against polynomial-time adversaries. Moreover, in order to provide such a construction two additional results will be shown: a) an oblivious transfer protocol from trapdoor permutations that enjoys special properties; b) a new security proof approach that avoids "rewinding issues".
One-class learning denotes the set of methods designed to isolate a target data set from unexpected data which are considered anomalies or outliers. This problem is commonly considered a challenging task because we are often faced with a double barrier: scarcity and dimensionality. "Scarcity" means that it is difficult to collect enough examples to build a representative class. This is often the case in applications where anomalies are rare and therefore difficult to collect. This limited representativeness exposes the failings of standard machine learning algorithms for classification, which rely on the statistical balance of the classes. Secondly, searching for anomalies in high-dimensional data is another challenging issue, directly related to the general problem of the 'curse of dimensionality'. Indeed, in high-dimensional space the data is sparse and the notion of proximity fails to retain its meaningfulness. In this talk, in order to address these issues, I propose an extension of a previous work which is based on a new contrast measure to separate a target data set from outliers. This generalization is formulated from null space theory. More precisely, the null-space-based strategy is a way to solve our problem when the dimensionality of the data is higher than the sample size. This is known as the small sample size problem, a common case in image processing, for instance in face recognition. In this specific subspace, our learning data set is reduced to a single point, making the classification easier (up to a given threshold). The decision boundary is built from a target data sample collected beforehand and one counter-example "artificially" located at the origin of the feature space. Next, I will compare our approach with other novelty detectors on numerous data sets. Lastly, we apply the proposed methodology to extract stationary/moving objects in video sequences subject to varying weather conditions. The goal is to learn the lighting variations of the background, which is considered the target data set, and then to classify the moving image structures as outliers.
Constraint-hiding constrained PRFs (CHCPRFs), initially studied by Boneh, Lewi and Wu [PKC 2017], are constrained PRFs where the constrained key hides the description of the constraint. Envisioned with powerful applications such as searchable encryption, private-detectable watermarking and symmetric deniable encryption, the only known candidates of CHCPRFs were based on indistinguishability obfuscation or multilinear maps with strong security properties. In this work we build CHCPRFs for all NC1 circuits from the Learning with Errors assumption. The construction is based on the graph-induced multilinear maps proposed by Gentry, Gorbunov and Halevi [TCC 2015]. In the talk I will demonstrate how to use GGH15 safely, and mention some open problems related to GGH15 and CHCPRF. Based on joint work with Ran Canetti. https://eprint.iacr.org/2017/143
The topic of this talk is the oblivious evaluation of a linear function f(x) = ax + b. This problem is non-trivial in the sense that the sender chooses (a, b) and the receiver chooses x, but the receiver may only learn f(x), while the sender learns nothing about x. We present a highly efficient and UC-secure construction of OLE in the OT-hybrid model that requires only O(1) OTs per OLE. The construction is based on noisy encodings. Our main technical contribution solves a problem left open in previous work, namely we show in a generic way how to achieve full simulation-based security from noisy encodings. All previous constructions using noisy encodings achieve only passive security. From our OLE construction we derive improvements in MPC and OPE over existing results from the literature.
We introduce a batch version of sparse recovery, where the goal is to construct A'_1,...,A'_m ∈ R^n that estimate a sequence of unknown signals A_1,...,A_m ∈ R^n, using linear measurements, each involving exactly one signal vector, under an assumption of average sparsity. More precisely, we want to have ∑_{j ∈ [m]} |A_j - A'_j|_p^p ≤ C · min { ∑_{j ∈ [m]} |A_j - A*_j|_p^p }  (1), for predetermined constants C ≥ 1 and p, where the minimum is over all A*_1,...,A*_m ∈ R^n that are k-sparse on average (strictly speaking, k is given as input). The special case m=1 is known as stable sparse recovery. The main question is what is the minimal number of measurements required to satisfy (1). We resolve the question for p ∈ {1,2} up to polylogarithmic factors, by presenting a randomized adaptive scheme that with high probability performs Õ(km) measurements and gives an output satisfying (1). Additionally, we show that adaptivity is necessary for every non-trivial scheme that solves the batch sparse recovery problem. Joint work with Alexandr Andoni, Robert Krauthgamer and Eric Price.
Blockchains offer users the ability to securely transfer money without a trusted party or intermediary, and the transparency of blockchain transactions also enables public verifiability. This openness, however, comes at a cost to user privacy, as even though the pseudonyms users go by within the blockchain are not linked to their real-world identities, all movements of money among these pseudonyms are still traceable. This talk will introduce Möbius, a system that prevents anyone from being able to link senders and recipients and is compatible with the Ethereum blockchain. Möbius achieves strong notions of unlinkability, as even senders cannot identify which pseudonyms belong to the recipients to whom they sent money, and is also able to resist denial-of-service attacks. It also achieves a much lower off-chain communication complexity than all previous schemes, with senders and recipients needing to send only one initial message in order to engage in an arbitrary number of transactions.
We study exchange of information among rational agents, that is, mechanism design in the setting where agents are rewarded using information only. This is an interesting setting motivated, among other things, by the increasing interest in secure multi-party computation techniques. Moreover, it is a very challenging setting, since information (as opposed to money) can be easily replicated and is not fungible, i.e., the same piece of information might have different values for different agents. More specifically, we consider the setting of a joint computation where different agents have inputs of different quality or value, and their utility incorporates both correctness and exclusiveness, i.e., every agent is interested in improving the quality of his own piece of information while preventing other agents from doing so. We then ask whether we can design mechanisms that motivate all agents (even those with high-quality input) to participate in the computation. We begin answering this fascinating question by proposing mechanisms for natural joint computation tasks such as intersection, union, and average. Based on joint work with Guang Yang.
Data exploration is about efficiently extracting knowledge from data. It has a wide range of applications in exploratory analysis. In this seminar, I will present my recent research in this area, namely top-k insights extraction and the k-shortlist preference space, which will appear in SIGMOD 2017. The background of these two problems is given below. OLAP tools have been extensively used by enterprises to make better and faster decisions. Nevertheless, they require users to specify group-by attributes and know precisely what they are looking for. Our top-k insights extraction problem makes a first attempt at automatically extracting top-k insights from multi-dimensional data in data warehouses. This is useful not only for non-expert users, but it also reduces the manual effort of data analysts. Determining the impact regions of a given object (i.e., a specific product) is a key subroutine in many market analysis applications. Our k-shortlist preference space problem aims to identify the impact regions of user preferences for which the given object is highly preferable (i.e., in the top-k result list). This problem is important in potential customer identification, profile-based marketing, targeted advertising, etc. Biography: Bo Tang is currently a PhD candidate in the Department of Computing at the Hong Kong Polytechnic University. He is a member of the Database Research Group. His main research interests are in the area of multidimensional data management, i.e., similarity search on high-dimensional data and data exploration techniques on multidimensional datasets.
In this talk, we will consider the problem of how to consistently reroute flows in a network: this problem has recently received much attention by the networking community, and is motivated by the advent of so-called software-defined networks (a novel paradigm in computer networking). While many fundamental research questions are still open, I will provide an overview of the state-of-the-art. In particular, I will discuss three problem variants: (1) algorithms to consistently reroute flows in uncapacitated networks, such that loop-freedom is preserved, (2) algorithms to reroute flows such that basic security policies (namely waypointing) are preserved, and (3) algorithms to reroute flows in capacitated networks, ensuring congestion-freedom. If time permits (and there is interest), I will also discuss some security implications on today's shift toward more programmable and virtualized networks. The talk will be based on our HotNets 2014, PODC 2015, DSN 2016, SIGMETRICS 2016 papers, as well as on the arXiv paper: https://arxiv.org/abs/1611.09296 See also our recent survey on the topic: https://net.t-labs.tu-berlin.de/ stefan/survey-network-update-sdn.pdf Bio: Stefan Schmid is an Associate Professor at Aalborg University, Denmark. Before that, he was a senior research scientist at T-Labs, Berlin, a postdoc at TU Munich, and a PhD student at ETH Zürich. Stefan's research interests revolve around the fundamental problems of dynamic distributed systems and networks. Stefan has received the ComSoc ITC early-career award 2016 and is currently looking for PhD and Postdoc students. For more information, see: https://net.t-labs.tu-berlin.de/ stefan/
Practical anonymous credential systems are generally built around sigma-protocol ZK proofs. This requires that credentials be based on specially formed signatures. In this work, we ask whether we can instead use a standard (say, RSA, or (EC)DSA) signature that includes formatting and hashing messages, as a credential, and still provide privacy. Existing techniques do not provide efficient solutions for proving knowledge of such a signature: On the one hand, ZK proofs based on garbled circuits (Jawurek et al. 2013) give efficient proofs for checking formatting of messages and evaluating hash functions. On the other hand they are expensive for checking algebraic relations such as RSA or discrete-log, which can be done efficiently with sigma protocols. We design new constructions obtaining the best of both worlds: combining the efficiency of the garbled circuit approach for non-algebraic statements and that of sigma protocols for algebraic ones. We then show how to use these as building-blocks to construct privacy-preserving credential systems based on standard RSA and (EC)DSA signatures. Other applications of our techniques include anonymous credentials with more complex policies, the ability to efficiently switch between commitments (and signatures) in different groups, and secure two-party computation on committed/signed inputs. Joint work with Melissa Chase and Payman Mohassel.
In this talk, we will introduce a novel technique for secure computation over large inputs. Based on the Decisional Diffie-Hellman (DDH) assumption, we provide a new oblivious transfer (OT) protocol with a laconic receiver. In particular, the laconic OT allows a receiver to commit to a large input D (of length m) via a short message. Subsequently, a single short message by a sender allows the receiver to learn s_{D[i]}, where s_0, s_1 and i ∈ [m] are dynamically chosen by the sender. All prior constructions of OT required the receiver message to grow with m. Such an OT is apt for realizing secure computation over large data. More specifically, we show applications of laconic OT to non-interactive secure computation and homomorphic encryption for RAM programs.
We revisit the question of constructing public-key encryption and signature schemes with security in the presence of bounded leakage and tampering memory attacks. For signatures we obtain the first construction in the standard model; for public-key encryption we obtain the first construction free of pairing (avoiding non-interactive zero-knowledge proofs). Our constructions are based on generic building blocks, and, as we show, also admit efficient instantiations under fairly standard number-theoretic assumptions.
In this talk I give an introduction to turn-based stochastic games, a special case of Shapley's stochastic games from 1953, and show how they relate to linear programming and Markov decision processes. I present some recent algorithmic breakthroughs from the area, focusing in particular on my own work, and describe the future research directions that I find most promising. My main focus will be on simplex algorithms, the classical combinatorial algorithms for linear programming.
Many cryptosystems widely used today can be efficiently broken using a quantum computer. Experts estimate that within the next two decades large enough quantum computers will be constructed. In order to keep our communications secure, the cryptographic community is developing new cryptosystems whose security can be based on quantum-resistant problems. One of these problems is solving systems of multivariate quadratic polynomial equations over finite fields. Cryptosystems built from the hardness of this problem are known as Multivariate Public Key Cryptosystems. In this seminar we will take a look at some of the cryptosystems developed within this field and explore attacks against them as well as improvements.
We study the following computational problem: for which values of k can the majority of n bits, MAJ_n, be computed by a depth-two formula in which each gate computes a majority function of at most k bits? The corresponding computational model is denoted by MAJ_k ∘ MAJ_k. We observe that the minimum value of k for which there exists a MAJ_k ∘ MAJ_k circuit that has high correlation with the majority of n bits is equal to Θ(n^{1/2}). We then show that for a randomized MAJ_k ∘ MAJ_k circuit computing the majority of n input bits with high probability for every input, the minimum value of k is equal to n^{2/3+o(1)}. We show a worst-case lower bound: if a MAJ_k ∘ MAJ_k circuit computes the majority of n bits correctly on all inputs, then k ≥ n^{13/19+o(1)}. This lower bound exceeds the optimal value for randomized circuits and thus is unreachable by purely randomized techniques. For depth-3 circuits we show that a circuit with k = O(n^{2/3}) can compute MAJ_n correctly on all inputs. The talk is based on joint results with Alexander Kulikov.
In this talk I will propose a way to carry out verifiable computation over cryptocurrency networks. Cryptocurrencies (e.g. Bitcoin, Ethereum) have two appealing features as a base for verifiable computation: (i) they can execute code in their blockchain (although with certain limitations); (ii) as platforms for financial transactions, they are a natural choice for rationality assumptions. Within research on cryptographic protocols, rationality assumptions have been shown to achieve complexities that are otherwise impossible. At a high level, my proposal is to run a rational protocol for delegation of computation on the Ethereum network. The material in this talk is still work in progress.
We show that in the document exchange problem, where Alice holds x ∈ {0,1}^n and Bob holds y ∈ {0,1}^n, Alice can send Bob a message of size O(K(log^2 K + log n)) bits such that Bob can recover x using the message and his input y if the edit distance between x and y is no more than K, and output ``error'' otherwise. Both the encoding and decoding can be done in time Õ(n+poly(K)). This result significantly improves the previous communication bounds under polynomial encoding/decoding time. We also show that in the referee model, where Alice and Bob hold x and y respectively, they can compute sketches of x and y of sizes poly(K log n) bits (the encoding), and send to the referee, who can then compute the edit distance between x and y together with all the edit operations if the edit distance is no more than K, and output ``error'' otherwise (the decoding). To the best of our knowledge, this is the first result for sketching edit distance using poly(K log n) bits. Moreover, the encoding phase of our sketching algorithm can be performed by scanning the input string in one pass. Thus our sketching algorithm also implies the first streaming algorithm for computing edit distance and all the edits exactly using poly(K log n) bits of space.
The discrete logarithm problem (DLP) is classified as a one-way function, and therefore used within public-key cryptography. Nowadays, the standard algorithm to compute the discrete logarithm is the index-calculus algorithm. This algorithm has sub-exponential time complexity and consists of four phases: (i) factor base, (ii) generation of relations, (iii) linear algebra and (iv) descent. The descent phase is performed in steps, and the most complex operation in these steps is to determine whether a polynomial is smooth with respect to a given degree. For this, one can either factorise the polynomial or perform a smoothness test on it. The latter approach is more efficient from a computational perspective. This work focuses on the smoothness test for polynomials. The main objective was to develop an efficient implementation in C of this test in order to compute the discrete logarithm in a field of cryptographic interest. We decided to work with the finite field F_{3^{6·509}}. Specifically, our smoothness test implementation was used in the continued-fraction and classical steps of the descent phase of the index-calculus algorithm. Before this work, there were no reported results on the computation of the discrete logarithm for the chosen field. In July 2016, it was announced that our computation in F_{3^{6·509}} had finished, taking about 200 core-years and setting a new record in a finite field of characteristic 3.
In this talk we are interested in Unique Sink Orientations (USO): an abstract optimization framework that generalizes (among other things) Linear Programming. These are orientations of the edges of the hypercube graph that satisfy certain properties. USO is a concept that has been well-studied in the last 15 years; we will introduce and motivate it with examples and references. In addition, we consider Random Edge (RE). This is arguably the most natural randomized pivot rule for the simplex algorithm: at every vertex pick an edge uniformly at random from the set of outgoing edges and move to the other endpoint of this edge. Back in 2001, Welzl introduced the concept of niceness of a USO, in an effort to better understand the behavior of RE. Niceness implies natural upper bounds for RE and, moreover, it inspires a very simple derandomization. We introduce this concept in detail and give answers to the related questions asked by Welzl in 2001. The results presented are joint work with Bernd Gärtner.
We study the question of how much interaction is needed for unconditionally secure multiparty computation. We first consider the number of messages that need to be sent to compute a Boolean function with semi-honest security, where all n parties learn the result. We consider two classes of functions called t-difficult and t-very difficult functions, where t refers to the number of corrupted players. For instance, the AND of an input bit from each player is t-very difficult while the XOR is t-difficult but not t-very difficult. We show lower bounds on the message complexity of both types of functions, considering two notions of message complexity called conservative and liberal, where conservative is the more standard one. In all cases the bounds are Ω(nt). We also show (almost) matching upper bounds for t=1 and functions in a rich class PSM_eff including non-deterministic log-space, as well as a stronger upper bound for the XOR function. In particular, we find that the conservative message complexity of 1-very difficult functions in PSM_eff is 2n, while the conservative message complexity for XOR (and t=1) is 2n-1. Next, we consider round complexity. It is a long-standing open problem to determine whether all efficiently computable functions can also be efficiently computed in constant rounds with unconditional security. Motivated by this, we ask whether we can compute any function securely while minimizing the interaction of some of the players, and if so, how many players this can apply to. Note that we still want the standard security guarantees (correctness, privacy, termination) and we consider the standard communication model with secure point-to-point channels. We answer the questions as follows: for passive security, with n=2t+1 players and t corruptions, up to t players can have minimal interaction, i.e., they send 1 message in the first round to each of the t+1 remaining players and receive one message from each of them in the last round. Using our result on message complexity, we show that this is (unconditionally) optimal. For malicious security with n=3t+1 players and t corruptions, up to t players can have minimal interaction, and we show that this is also optimal. Joint work with Jesper Buus Nielsen, Rafail Ostrovsky and Adi Rosén.
In this talk, we will discuss how to infer a function on all vertices of a graph from observations on a few of the vertices. Given real-valued observations on some vertices, we find a smooth extension to all vertices. We study both minimal Lipschitz extensions and the absolutely minimal Lipschitz extension (AMLE) which is the limit of p-Laplacian regularization. These extensions generalize naturally to directed graphs. We develop provably fast algorithms for computing these extensions. Our implementation of these algorithms runs in minutes on graphs with millions of vertices. We provide experimental evidence that AMLE performs well for spam webpage detection. To handle noisy input graphs, we develop regularization techniques for Lipschitz extensions and give algorithms for outlier removal. The latter is surprising, as the analogous least-squares problem is NP-hard.
The problem of verifiable data streaming (VDS) considers a client with limited computational and storage capacities that streams an a-priori unknown number of elements to an untrusted server. The client may retrieve and update any outsourced element. Other parties may verify each outsourced element's integrity using the client's public-key. All previous VDS constructions incur a bandwidth and computational overhead on both client and server side, which is at least logarithmic in the number of transmitted elements. We propose two novel, fundamentally different approaches to constructing VDS. The first scheme is based on a new cryptographic primitive called Chameleon Vector Commitment (CVC). A CVC is a trapdoor commitment scheme to a vector of messages where both commitments and openings have constant size. Using CVCs we construct a tree-based VDS protocol that has constant computational and bandwidth overhead on the client side. The second scheme shifts the workload to the server side by combining signature schemes with cryptographic accumulators. Here, all computations are constant, except for queries, where the computational cost of the server is linear in the total number of updates.
Noisy channels can be very useful in cryptography, allowing to design unconditionally secure protocols for primitives like commitment and OT. As it turns out that noisy channels are valuable resources for cryptography, it becomes important to understand the optimal way in which these noisy channels can be used for implementing cryptographic tasks. In this talk we will explore the concepts of commitment and OT capacities of noisy channels. More specifically, we will focus on the commitment capacity of Unfair Noisy Channels and on the OT capacity of Generalized Erasure Channel.
In this talk, I discuss a technique for computing approximate single-source shortest paths using a hop set. A hop set is a set of weighted edges that, when added to the graph, makes it possible to approximate all distances using only a few edges (hops). This approach was previously used by Cohen in the PRAM model [STOC '94]. I will show how to obtain almost tight approximate SSSP algorithms in both the distributed setting (CONGEST model) and the centralized deletions-only setting using a suitable hop set. Joint work with Monika Henzinger and Danupon Nanongkai.
Given a set Z of n positive integers and a target value t, the SubsetSum problem asks whether any subset of Z sums to t. A textbook pseudopolynomial time algorithm by Bellman from 1957 solves SubsetSum in time O(nt). Here we present a simple and elegant randomized algorithm running in time Õ(n+t). This improves upon a classic result and is likely to be near-optimal, since it matches conditional lower bounds from SetCover and k-Clique. We also present a new algorithm with pseudopolynomial time and polynomial space.
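As context for the quoted O(nt) baseline, here is a minimal Python sketch of Bellman's classic dynamic program (the randomized Õ(n+t) algorithm from the talk is not reproduced here); the example instance is illustrative.

def subset_sum(zs, t):
    # Bellman's O(n*t) dynamic program: does any subset of zs sum to t?
    reachable = [False] * (t + 1)
    reachable[0] = True                      # the empty subset
    for z in zs:
        # iterate downwards so each element is used at most once
        for s in range(t, z - 1, -1):
            if reachable[s - z]:
                reachable[s] = True
    return reachable[t]

print(subset_sum([3, 34, 4, 12, 5, 2], 9))   # True (4 + 5)
print(subset_sum([3, 34, 4, 12, 5, 2], 30))  # False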
We present a new combinatorial algorithm for triangle finding and Boolean matrix multiplication that runs in Ô(n^3/log^4 n) time, where the Ô notation suppresses poly(loglog) factors. This improves the previous best combinatorial algorithm by Chan that runs in Ô(n^3/log^3 n) time. Our algorithm generalizes the divide-and-conquer strategy of Chan's algorithm. Moreover, we propose a general framework for detecting triangles in graphs and computing Boolean matrix multiplication. Roughly speaking, if we can find the ``easy parts'' of a given instance efficiently, we can solve the whole problem faster than n^3.
We study the two-party communication complexity of finding an approximate Brouwer fixed-point of a composition of two Lipschitz functions g∘f, where Alice knows f and Bob knows g. We prove an essentially tight deterministic communication lower bound on this problem, using a "geometric" adaptation of the Raz-McKenzie simulation theorem, allowing us to "smoothly lift" the query lower bound for approximate fixed-point ([HPV'89]) from the oracle model to the two-party model. We show that a slightly "smoother" version of our lower bound would imply an N^{Ω(1)} deterministic lower bound on the well known open problem of finding an approximate Nash equilibrium in an N×N 2-player game, where each player initially knows only his own payoff matrix. In contrast, the non-deterministic communication complexity of this problem is only polylog(N). Such an improvement would also imply an exp(k) communication lower bound for finding an Ω(1)-Nash equilibrium in k-player games. Joint work with Tim Roughgarden.
I consider the problem of representing integers in a close to optimal number of bits, supporting increment and decrement operations efficiently, as described in "Integer Representations towards Efficient Counting in the Bit Probe Model" by Gerth Stølting Brodal, Mark Greve, Vineet Pandey, and Srinivasa Rao Satti [BGPS2013]. The problem is studied in the bit probe model. I analyse the number of bits read to perform the operation in the worst case. Such representations, together with the corresponding increment and decrement algorithms, are called counters. The article [BGPS2013] contains two basic constructions: one uses n bits to represent exactly 2^n values and requires n-1 reads in the worst case, the other is redundant (represents fewer than 2^n values) but requires only a logarithmic number of reads for incrementing. The only known lower bound on the number of reads to increment a non-redundant counter was logarithmic. Reducing the gap between the linear upper bound and the logarithmic lower bound remained an open problem. I present a proof of a linear lower bound for the number of bits read in the worst case.
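For illustration, here is a small Python sketch of the naive n-bit binary counter in the bit-probe view, counting how many bits an increment reads; this is only the textbook baseline (worst case n reads), not the constructions from [BGPS2013].

def increment(bits):
    # Increment an n-bit binary counter stored LSB first; return #bits read.
    # The naive algorithm scans from the least significant bit until it
    # sees a 0, so in the worst case (all bits 1) it reads all n bits.
    reads = 0
    for i in range(len(bits)):
        reads += 1                    # probe bits[i]
        if bits[i] == 0:
            bits[i] = 1               # 0 -> 1, carry stops here
            return reads
        bits[i] = 0                   # 1 -> 0, carry continues
    return reads                      # counter wrapped around to 0

counter = [1, 1, 1, 0]                # the value 7 in a 4-bit counter
print(increment(counter), counter)    # 4 probes; counter now holds 8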
For competitive analysis of online algorithms, it is assumed that an online algorithm has no information at all about the input before it arrives. This leaves open the question of whether or not a small amount of information (advice) about the input could significantly improve the performance of an online algorithm. We introduce a simple but surprisingly useful technique for proving lower bounds on online algorithms with a limited amount of advice about the future. For example, using this technique, we show that: (1) A paging algorithm needs linear advice to achieve a competitive ratio better than log k, where k is the cache size. Previously, it was only known that linear advice was necessary to achieve a constant competitive ratio smaller than 5/4. (2) Every vertex coloring algorithm with a sublinear competitive ratio must use superlinear advice. Previously, it was only known that superlinear advice was necessary to be optimal. For certain online problems, including the MTS, k-server, paging, list update, and dynamic binary search tree problem, we show that randomization and sublinear advice are equally powerful (if the underlying metric space or node set is finite). This means that several long-standing open questions regarding randomized online algorithms can be equivalently stated as questions regarding online algorithms with sublinear advice. For example, we show that there exists a deterministic O(log k)-competitive k-server algorithm with advice complexity o(n) if and only if there exists a randomized O(log k)-competitive k-server algorithm without advice.
I will show how constraints on a secret-sharing scheme can be used to prove a linear communication lower bound for three-party parallel bitwise AND.
The most classic textbook hash function, e.g. taught in CLRS [MIT Press '09], is h(x) = ((ax+b) mod p) mod m (*), where a,b ∈ {0,1,...,p-1} are chosen uniformly at random. It is known that (*) is 2-independent and almost uniform provided p >> m. This implies that when using (*) to build a hash table with chaining, the expected query time is O(1) and the expected length of the longest chain is O(√n). This result holds for any 2-independent hash function. No hash function can improve on the expected query time, but the upper bound on the expected length of the longest chain is not known to be tight for (*). Partially addressing this problem, Alon et al. [STOC '97] proved the existence of a class of linear hash functions such that the expected length of the longest chain is Ω(√n), and left as an open problem to decide which non-trivial properties (*) has. We make the first progress on this fundamental problem, showing that the expected length of the longest chain is at most n^{1/3+o(1)}, i.e., that the performance of (*) is similar to that of a 3-independent hash function, for which we can prove an upper bound of O(n^{1/3}). As a lemma we show that within a fixed set of integers there are few pairs such that the height of the ratio of the pair is small. This is proved using a mixture of techniques from additive combinatorics and number theory, and we believe that the result might be of independent interest. For a natural variation of (*) where we look at the most significant instead of the least significant bits of (ax+b) mod p to create the hash value, we show that it is possible to apply second-order moment bounds even when a hash value is fixed. As a consequence: - For min-wise hashing it was known that any key from a set of n keys has the smallest hash value with probability O(1/√n). We improve this to n^{-1+o(1)}. - For linear probing it was known that the worst-case expected query time is O(√n). We improve this to n^{o(1)}.
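For concreteness, here is a minimal Python sketch of the hash function (*) used with chaining; the prime p, table size m, and toy key set are illustrative.

import random

def make_hash(p, m, seed=None):
    # The classic h(x) = ((a*x + b) mod p) mod m with a, b uniform in {0,...,p-1}.
    rng = random.Random(seed)
    a, b = rng.randrange(p), rng.randrange(p)
    return lambda x: ((a * x + b) % p) % m

p, m = 2**31 - 1, 1024                      # p is prime and exceeds all keys
h = make_hash(p, m, seed=7)
table = [[] for _ in range(m)]
for key in range(10_000):
    table[h(key)].append(key)
print(max(len(chain) for chain in table))   # length of the longest chain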
The dynamic shortest paths problem on planar graphs asks us to preprocess a planar graph G such that we may support insertions and deletions of edges in G as well as distance queries between any two nodes u,v subject to the constraint that the graph remains planar at all times. This problem has been extensively studied in both the theory and experimental communities over the past decades and gets solved millions of times every day by companies like Google, Microsoft, and Uber. The best known algorithm performs queries and updates in Õ(n^{2/3}) time, based on ideas of a seminal paper by Fakcharoenphol and Rao [FOCS'01]. A (1+ε)-approximation algorithm of Abraham et al. [STOC'12] performs updates and queries in Õ(n^{1/2}) time. An algorithm with O(polylog(n)) runtime would be a major breakthrough. However, such runtimes are only known for a (1+ε)-approximation in a model where only restricted weight updates are allowed due to Abraham et al. [SODA'16], or for easier problems like connectivity. In this paper, we follow a recent and very active line of work on showing lower bounds for polynomial time problems based on popular conjectures, obtaining the first such results for natural problems in planar graphs. Such results were previously out of reach due to the highly non-planar nature of known reductions and the impossibility of "planarizing gadgets". We introduce a new framework which is inspired by techniques from the literatures on distance labelling schemes and on parameterized complexity. Using our framework, we show that no algorithm for dynamic shortest paths or maximum weight bipartite matching in planar graphs can support both updates and queries in amortized O(n^{1/2-ε}) time, for any ε>0, unless the classical all-pairs-shortest-paths problem can be solved in truly subcubic time, which is widely believed to be impossible. We extend these results to obtain strong lower bounds for other related problems as well as for possible trade-offs between query and update time. Interestingly, our lower bounds hold even in very restrictive models where only weight updates are allowed. See also: http://arxiv.org/abs/1605.03797 Joint work with Amir Abboud, Stanford University
Most research in anonymous communication assumes some amount of trust: for example that there is at least one server that can be trusted. In this talk we will see what happens if no active participant can be trusted and an adversary can see all communication. We consider both the model where the adversary's computational power is bounded and the model where it is unbounded. In both models, we will see that it is possible to achieve something anonymously. However, we will also show very low upper bounds on how much information you can send. This justifies the assumption of trust in anonymity research. Part of this talk is based on joint work with Claudio Orlandi.
The talk will focus on a few cool* proofs of quite easy yet important (and sometimes misquoted) facts. *-I mean seriously cool. 1) We will discuss sizes of (universal) hashing families: H is a universal hashing family if for all x ≠ y, Pr_{h←H}[h(x)=h(y)] = 2^{-m}, where m is the output length of h. We will discuss the sizes of hashing families (this is not rocket science, but for some reason very few people have actually seen the proofs). We will relax the universality condition (the collision probability will not be optimal) and show that the family can then be significantly smaller; the second argument is a probabilistic one, very similar to the proof of the Johnson-Lindenstrauss lemma (used in data mining and elsewhere). 2) Then we will discuss the following lemma: if H_∞(X)=k then X can be represented as a convex combination of flat distributions Y_i such that H_∞(Y_i)=k. The lemma allows us to prove things for flat distributions and then immediately generalize them to any distribution. Obviously the lemma above is true only if 2^k is a natural number. This lemma is almost always cited incorrectly. We will present a short but cool proof of the above lemma using extreme point methods (the Krein-Milman theorem); ALL conditions in K-M are trivially fulfilled in the finite-dimensional setting, thus making it a nice and EASY tool.
We pioneer techniques to extract randomness from massive unstructured data. This is the first work that achieves theoretical guarantees with a practical implementation. It includes novel constructions, lower bounds, an implementation, and extensive validation of our systems-level software. Computation over huge inputs is often modeled by streaming algorithms. In this work we consider a pragmatic multi-stream model: every algorithm uses a small (e.g., polylog(n)) local memory and a constant number (e.g., 2) of streams, which are space-unbounded tapes that the algorithm reads and writes sequentially from left to right. Unlike the traditional view in Machine Learning or Data Mining, we view Big Data as a low-quality source of entropy, from which we can extract high-quality (i.e. almost-uniform) random bits. We propose a randomness extractor that uses O(loglog n) passes over 2 streams. This is asymptotically optimal according to our lower bound result. Furthermore, we implement and experimentally validate our randomness extractor on real-world data including videos, social network feeds, DNA sequencing data, and so on. Its efficiency outperforms every known extractor (e.g. 10 hours vs. 100,000 years), and the quality of the extracted bits is certified by standardized tests. The related papers are joint work with Periklis Papakonstantinou.
How many times have you or a colleague of yours implemented a prototype of an MPC protocol or application from scratch? Each time building the network stack, storage options and a newly invented architecture. Well, now your days of doing tedious work are over. Introducing FRESCO, created by the Alexandra Institute as well as a few guys formerly from the CS department. This framework will grant you: 1) Easy-to-produce protocols. 2) Fair comparison with other protocols also implemented in FRESCO. 3) Access for your protocol to a large set of tests and applications already written. 4) The ability to produce applications independent of the underlying MPC protocol. What will this cost you? Nothing, it's free - open source. Find it at https://github.com/aicis/fresco.
Previously it was known that all ℓ_2 ε-heavy hitters could be found in the turnstile model in optimal O((1/ε^2) lg n) words of space with O(lg n) update time and very slow O(n lg n) query time with 1/poly(n) failure probability, via the CountSketch of Charikar, Chen and Farach-Colton. The query time could be improved drastically using the "dyadic trick" to O((1/ε^2) poly(lg n)), but the space and update time worsened to lg^2 n to achieve such a small failure probability. We show that the best of all worlds is possible: we give a new algorithm, ExpanderSketch, achieving O((1/ε^2) lg n) space, O(lg n) update time, and O((1/ε^2) poly(lg n)) query time with 1/poly(n) failure probability. This is accomplished via a new reduction from the heavy hitters problem to a graph clustering problem of independent interest, which we solve using a new clustering algorithm CutCloseGrab which is guaranteed to find every single cluster regardless of the number of clusters or the comparative sizes between the cluster and the entire graph.
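For background, here is a minimal Python sketch of the CountSketch point-query data structure referred to above (not the ExpanderSketch algorithm or its clustering subroutine); the table dimensions and the use of Python's built-in hash are illustrative.

import random
from statistics import median

class CountSketch:
    # Minimal CountSketch: d rows of w signed counters, point query by median.
    def __init__(self, d, w, seed=0):
        rng = random.Random(seed)
        self.d, self.w = d, w
        self.table = [[0] * w for _ in range(d)]
        self.salts = [(rng.random(), rng.random()) for _ in range(d)]

    def _bucket_sign(self, row, x):
        s1, s2 = self.salts[row]
        return hash((s1, x)) % self.w, 1 if hash((s2, x)) % 2 == 0 else -1

    def update(self, x, delta=1):
        for i in range(self.d):
            b, s = self._bucket_sign(i, x)
            self.table[i][b] += s * delta

    def estimate(self, x):
        ests = []
        for i in range(self.d):
            b, s = self._bucket_sign(i, x)
            ests.append(s * self.table[i][b])
        return median(ests)

cs = CountSketch(d=5, w=256)
for _ in range(1000):
    cs.update("heavy")
for item in range(500):
    cs.update(item)
print(cs.estimate("heavy"))   # close to 1000 with good probability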
Computing the phylogenetic diversity of a set of species is an important part of many ecological case studies. More specifically, let T be a phylogenetic tree, and let R be a subset of its leaves representing the species under study. Specialists in ecology want to evaluate a function f(T,R) (a phylogenetic measure) that quantifies the evolutionary distance between the elements in R. But, in most applications, it is also important to examine how f(T,R) behaves when R is selected at random. The standard way to do this is to compute the mean and the variance of f among all subsets of leaves in T that consist of exactly |R| = r elements. For certain measures, there exist algorithms that can compute these statistics, under the condition that all subsets of r leaves are equiprobable. Yet, so far there are no algorithms that can do this exactly when the leaves in T are weighted with unequal probabilities. As a consequence, for this general setting, specialists try to compute the statistics of phylogenetic measures using methods which are both inexact and very slow. We present for the first time exact and efficient algorithms for computing the mean and the variance of phylogenetic measures when leaf subsets of fixed size are selected from T under a non-uniform random distribution. In particular, let T be a tree that has n nodes and depth d, and let r be a non-negative integer. We show how to compute in O((d + log n)n log n) time and O(n) space the mean and the variance for any measure that belongs to a well-defined class. We show that two of the most popular phylogenetic measures belong to this class: the Phylogenetic Diversity (PD) and the Mean Pairwise Distance (MPD). The random distribution that we consider is the Poisson binomial distribution restricted to subsets of fixed size r. More than that, we provide a stronger result; specifically for the PD and the MPD we describe algorithms that compute in a batched manner the mean and variance on T for all possible leaf-subset sizes in O((d + log n)n log n) time and O(n) space. For the PD and MPD, we implemented our algorithms that perform batched computations of the mean and variance. We also developed alternative implementations that compute in O((d + log n)n^2) time the same output. For both types of implementations, we conducted experiments and measured their performance in practice. Despite the difference in the theoretical performance, we show that the algorithms that run in O((d + log n)n^2) time are more efficient on datasets encountered in ecology, and numerically more stable. We also compared the performance of these algorithms with standard inexact methods that can be used in case studies. We show that our algorithms are outstandingly faster, making it possible to process much larger datasets than before. Our implementations will become publicly available through the R package PhyloMeasures.
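To make the PD measure concrete, here is a small Python sketch computing PD(T, R) for one fixed leaf subset R, as the total length of the edges whose subtree contains a leaf of R; the tree encoding and branch lengths are illustrative, and the talk's algorithms for means and variances over random subsets are not reproduced.

def phylogenetic_diversity(parent, length, R):
    # parent[v] is v's parent (None for the root); length[v] is the length
    # of the edge from v to its parent; R is a set of leaves.
    counted, total = set(), 0.0
    for leaf in R:
        v = leaf
        # walk towards the root, charging each edge at most once
        while v is not None and parent[v] is not None and v not in counted:
            counted.add(v)
            total += length[v]
            v = parent[v]
    return total

# toy rooted tree:  root -> a, b ;  a -> leaf1, leaf2 ;  b -> leaf3
parent = {"root": None, "a": "root", "b": "root",
          "leaf1": "a", "leaf2": "a", "leaf3": "b"}
length = {"a": 2.0, "b": 1.0, "leaf1": 0.5, "leaf2": 0.5, "leaf3": 3.0}
print(phylogenetic_diversity(parent, length, {"leaf1", "leaf3"}))  # 6.5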
We consider a new construction of locality-sensitive hash functions for Hamming space that is covering in the sense that it is guaranteed to produce a collision for every pair of vectors within a given radius r. The construction is efficient in the sense that the expected number of hash collisions between vectors at distance cr, for a given c > 1, comes close to that of the best possible data-independent LSH without the covering guarantee, namely, the seminal LSH construction of Indyk and Motwani (FOCS '98). The efficiency of the new construction essentially matches their bound if cr = log(n)/k, where n is the number of points in the data set and k ∈ N, and differs from it by at most a factor ln 4 in the exponent for general values of cr. As a consequence, LSH-based similarity search in Hamming space can avoid the problem of false negatives at little or no cost in efficiency.
In this paper, we develop a new communication model to prove a data structure lower bound for the dynamic interval union problem. The problem is to maintain a multiset of intervals I over [0, n] with integer coordinates, supporting insertion and deletion of intervals as well as queries for the total length of the union of the intervals in I.
In this work we describe the ZKBoo protocol, a new proposal for practically efficient zero-knowledge arguments especially tailored for Boolean circuits, and report on a proof-of-concept implementation. As a highlight, we can generate (resp. verify) a non-interactive proof for the SHA-1 circuit in approximately 13ms (resp. 5ms), with a proof size of 444KB. Our techniques are based on the "MPC-in-the-head" approach to zero-knowledge of Ishai et al. (IKOS), which has been successfully used to achieve significant asymptotic improvements. Our contributions include: (1) A thorough analysis of the different variants of IKOS, which highlights their pros and cons for practically relevant soundness parameters; (2) A generalization and simplification of their approach, which leads to faster σ-protocols for statements of the form "I know x such that y=f(x)" (where f is a circuit and y a public value); (3) A case study, where we provide explicit protocols, implementations and benchmarking of zero-knowledge protocols for the SHA-1 and SHA-256 circuits.
In this work, we present the first efficient MPC protocol with identifiable abort. Our protocol has an information-theoretic online phase, with roughly the same performance as the SPDZ protocol, requiring O(n) messages to be broadcast for each secure multiplication. A key component of our protocol is a linearly homomorphic information-theoretic signature scheme, for which we provide the first definitions and construction based on a previous non-homomorphic scheme.
Tamper-proof hardware in cryptographic protocols has proven to provide strong security guarantees in the context of secure two-party computation. Currently, each protocol needs protocol-specific tamper-proof hardware tokens, which cannot be reused in another protocol. We propose a functionality (digital signatures) and a new model that allow very efficient UC-secure 2PC. It can even be implemented in practice (using signature cards)!
A (γ,δ)-elastic channel is a binary symmetric channel between a sender and a receiver where the error rate of an honest receiver is δ while the error rate of a dishonest receiver lies within the interval [γ, δ]. In this paper, we show that from any non-trivial elastic channel (i.e., 0<γ<δ<1/2) we can implement oblivious transfer with information theoretic security. This was previously (Khurana et al., Eurocrypt 2016) only known for a subset of these parameters. Our technique relies on a new way to exploit protocols for information-theoretic key agreement from noisy channels. We also show that information theoretically secure commitments where the receiver commits follow from any non-trivial elastic channel.
An external memory data structure is presented for maintaining a dynamic set of N two-dimensional points under the insertion and deletion of points, and supporting unsorted 3-sided range reporting queries and top-k queries, where top-k queries report the k points with highest y-value within a given x-range. For any constant 0 < ε ≤ 1/2, a data structure is constructed that supports updates in amortized O((1/(ε B^{1-ε})) log_B N) IOs and queries in amortized O((1/ε) log_B N + K/B) IOs, where B is the external memory block size, and K is the size of the output to the query (for top-k queries K is the minimum of k and the number of points in the query interval). The data structure uses linear space. The update bound is a significant factor B^{1-ε} improvement over the previous best update bounds for these two query problems, while staying within the same query and space bounds.
We study the problem of assigning n agents to m facilities of finite capacities, when the most preferred points of the agents are private, and both those points and the locations of the facilities lie in a metric space. We study the performance of truthful mechanisms for the social cost objective and we propose a resource augmentation framework, where the cost of the optimal assignment is compared with the cost of the mechanism on the same instances with augmented facility capacities. For perhaps the most well-known matching mechanism, Serial Dictatorship, we prove that while the approximation ratio without augmentation is 2m-1, the ratio becomes log n if we double the capacities, and g/(g-2), i.e., constant, for augmentation factor g ≥ 3. For Randomized Serial Dictatorship, we prove that the approximation ratio without augmentation is Θ(n). This is ongoing joint work with Ioannis Caragiannis, Kristoffer Hansen, Søren Frederiksen and Zihan Tan. The talk will be a board talk without slides.
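For concreteness, here is a small Python sketch of the Serial Dictatorship mechanism in this facility-assignment setting: agents, in a fixed order, are each assigned to the nearest facility with spare capacity. The one-dimensional positions and capacities are illustrative, and the resource-augmentation analysis from the talk is not reproduced.

def serial_dictatorship(agent_points, facilities, capacities):
    # Assign agents (in the given order) to the nearest facility that still
    # has spare capacity; positions are 1-D and distance is |x - y|.
    remaining = list(capacities)       # total capacity must cover all agents
    assignment = []
    for x in agent_points:
        j = min((j for j in range(len(facilities)) if remaining[j] > 0),
                key=lambda j: abs(x - facilities[j]))
        remaining[j] -= 1
        assignment.append(j)
    return assignment

agents = [0.1, 0.2, 0.9, 1.0]
facilities = [0.0, 1.0]
print(serial_dictatorship(agents, facilities, capacities=[2, 2]))  # [0, 0, 1, 1]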
A classic equilibrium concept in symmetric bimatrix games is evolutionary stable strategies (ESS). ESS considers an infinite unstructured population and asks whether a small number of mutants can take over the population. The now classic model (in theoretical evolution) of evolutionary games on graphs (EGG) is the same concept on finite structured populations. The talk is about how I, together with Krishnendu Chatterjee and Martin Nowak, recently showed that the most fundamental problem for EGG is PSPACE-complete, basically showing that it cannot be done better than sampling in the worst case. The paper was published as "The computational complexity of ecological and evolutionary spatial dynamics" in PNAS.
We consider approximate distance oracles for edge-weighted n-vertex undirected planar graphs. Given fixed ε > 0, we present a (1 + ε)-approximate distance oracle with O(n (loglog n)^2) space and O((loglog n)^3) query time. This improves the previous best product of query time and space of the oracles of Thorup (FOCS 2001, J. ACM 2004) and Klein (SODA 2002) from O(n log n) to O(n (loglog n)^5).
We study the problem of list ranking in the parallel external memory (PEM) model. We observe an interesting dual nature for the hardness of the problem due to limited information exchange among the processors about the structure of the list, on the one hand, and its close relationship to the problem of permuting data, which is known to be hard for the external memory models, on the other hand. By carefully defining the power of the computational model, we prove a permuting lower bound in the PEM model. Furthermore, we present a stronger Ω(log^2 N) lower bound for a special variant of the problem and for a specific range of the model parameters, which takes us a step closer toward proving a non-trivial lower bound for the list ranking problem in the bulk-synchronous parallel (BSP) and MapReduce models. Finally, we also present an algorithm that is tight for a larger range of parameters of the model than in prior work.
A recent and very active line of work achieves tight lower bounds for fundamental problems under the Strong Exponential Time Hypothesis (SETH). A celebrated result of Backurs and Indyk (STOC'15) proves that the Edit Distance of two sequences of equal length cannot be computed in strongly subquadratic time under SETH. The result was extended by follow-up works to simpler looking problems like finding the Longest Common Subsequence (LCS). SETH is a very strong assumption: it asserts that even linear size CNF formulas cannot be analyzed for satisfiability with an exponential speedup over exhaustive search. We consider much safer assumptions, e.g. that such a speedup is impossible for SAT on much more expressive representations, like subexponential-size NC circuits. Intuitively, this seems much more plausible: NC circuits can implement complex cryptographic primitives, while CNFs cannot even approximately compute an XOR of bits. Our main result is a surprising reduction from SAT on Branching Programs to fundamental problems like Edit Distance, LCS, and many others. Truly subquadratic algorithms for these problems therefore have consequences that we consider to be far more remarkable than merely faster CNF SAT algorithms. A very interesting feature of our work is that this is true even for mildly subquadratic algorithms. For example, we show that if we can shave an arbitrarily large polylog factor from the complexity of Edit Distance then NEXP does not have non-uniform NC^1 circuits. Joint work with Amir Abboud, Virginia Vassilevska Williams, and Ryan Williams.
Graphics Processing Units (GPUs) have emerged as a powerful platform for general purpose computations due to their massive hardware parallelism. However, there is very little understanding from the theoretical perspective, what makes various parallel algorithms fast on GPUs. In this talk I will review recent advances in modeling GPUs from the algorithmic perspective and will present our recent algorithmic results, identifying some non-trivial and somewhat unexpected open problems.
A static binary search tree where every search starts from where the previous one ends (lazy finger) is considered. Such a search method is more powerful than that of the classic optimal static trees, where every search starts from the root (root finger), and less powerful than when rotations are allowed---where finding the best rotation-based tree is the topic of the dynamic optimality conjecture of Sleator and Tarjan. The runtime of the classic root-finger tree can be expressed in terms of the entropy of the distribution of the searches, but we show that this is not the case for the optimal lazy-finger tree. A non-entropy-based, asymptotically tight expression for the runtime of the optimal lazy-finger trees is derived, and a dynamic-programming-based method is presented to compute the optimal tree.
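To illustrate the lazy-finger model, here is a small Python sketch that charges each access the tree distance from the previously accessed node in a given static BST (the first access starts at the root); the exact additive convention for the cost and the toy tree are illustrative, and the dynamic program for the optimal tree is not reproduced.

def lazy_finger_cost(parent, depth, accesses, root):
    # parent/depth describe the static tree; the distance between two nodes
    # is found by walking both up to their lowest common ancestor.
    def dist(u, v):
        d = 0
        while depth[u] > depth[v]:
            u, d = parent[u], d + 1
        while depth[v] > depth[u]:
            v, d = parent[v], d + 1
        while u != v:
            u, v, d = parent[u], parent[v], d + 2
        return d

    finger, total = root, 0
    for x in accesses:
        total += dist(finger, x)   # each search starts where the last one ended
        finger = x
    return total

# toy tree over keys 1..3: key 2 is the root, keys 1 and 3 are its children
parent = {2: None, 1: 2, 3: 2}
depth = {2: 0, 1: 1, 3: 1}
print(lazy_finger_cost(parent, depth, accesses=[1, 1, 3, 2], root=2))  # 1+0+2+1 = 4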
Constructing optimal (minimum average search time) binary search trees (BSTs) is one of the canonical early uses of dynamic programming. In 1971, Knuth described how to solve this problem in O(n^2) time, with input specifying the probability of the different successful and unsuccessful searches. While the trees constructed were binary, the comparisons used were ternary. Successful searches terminated at internal nodes and unsuccessful searches at leaves. By contrast, in binary comparison trees (BCSTs), internal nodes perform binary comparisons; the search branches left or right depending upon the comparison outcome and all searches terminate at leaves. Polynomial algorithms exist for solving the optimal BCST problem in special cases with input restricted to successful searches. Hu and Tucker gave an O(n log n) algorithm when all comparisons are the inequality "<"; Anderson et al. developed an O(n^4) algorithm when both "<" and "=" comparisons are allowed. In this talk we present the first polynomial time algorithms for solving the optimal BCST problem when unsuccessful searches are included in the input and any set of comparisons is permitted. Our running times depend upon the comparisons allowed. If equality is not allowed, our algorithm runs in O(n log n) time; if equality is allowed, O(n^4). We also demonstrate O(n) time algorithms that yield almost optimal binary comparison trees, with tree cost within a constant additive factor of optimal. This is joint work with Marek Chrobak, Ian Munro and Neal Young.
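For readers who want a concrete baseline for the optimal-BST problem discussed above, here is a minimal sketch of the classical dynamic program (successful searches only, the ternary-comparison setting); it is not the new BCST algorithms from the talk. As written it runs in O(n^3) time; Knuth's monotonicity observation brings it down to O(n^2).

```python
# Classical optimal BST dynamic program (illustrative baseline, not the BCST
# algorithms above). p[i] is the access probability/frequency of the i-th key,
# with keys assumed to be given in sorted order.
def optimal_bst_cost(p):
    n = len(p)
    pref = [0.0] * (n + 1)                       # prefix sums of probabilities
    for i, x in enumerate(p):
        pref[i + 1] = pref[i] + x
    weight = lambda i, j: pref[j + 1] - pref[i]  # total probability of keys i..j

    # cost[i][j] = minimum expected search cost of a BST over keys i..j
    cost = [[0.0] * n for _ in range(n)]
    for i in range(n):
        cost[i][i] = p[i]
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            j = i + length - 1
            best = min((cost[i][r - 1] if r > i else 0.0) +
                       (cost[r + 1][j] if r < j else 0.0)
                       for r in range(i, j + 1))      # r = root of the subtree
            cost[i][j] = best + weight(i, j)
    return cost[0][n - 1] if n else 0.0

print(optimal_bst_cost([0.25, 0.2, 0.05, 0.2, 0.3]))
```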
In Dynamic Flow Networks, edge capacities represent the amount of flow that can enter edges in unit time, i.e., the width of an edge. Congestion occurs at vertices when more items are waiting at the vertex than can immediately leave it. The Quickest Flow Problem is to find dynamic flows that move items from sources to sinks in minimal time. If the flow is restricted to be confluent, i.e., only one flow edge leaves every vertex, Dynamic Flow Networks can be used to model evacuation protocols. Each vertex is a source, with a given initial number of people. The goal is to find an evacuation protocol that moves all of the people to sinks (evacuation exits) in a minimal amount of time. In some versions of the problem, the sinks are given as input; in others only the number of sinks is input and finding the location of sinks that minimize evacuation time is part of the problem. A min-max regret version of the problem also exists. This talk discusses some recent work on evacuation protocols on Dynamic Flow Networks. This is joint work with Guru Prakash Arumugam, John Augustine, Di Chen, Yuya Higashikawa, Naoki Katoh and Prashanth Srikanthan.
The Lempel-Ziv-77 algorithm greedily factorizes a text of length n into z maximal substrings that have previous occurrences, which is particularly useful for text compression. We review two recent algorithms for this task: 1. A linear-time algorithm using essentially only one integer array of length n in addition to the text. (Joint work with Tomohiro I and Dominik Köppl.) 2. An even more space-conscious algorithm using O(z) space, computing a 2-approximation of the LZ77 parse in O(n lg n) time w.h.p. (Joint work with Travis Gagie, Pawel Gawrychowski and Thomas Kociumaka.)
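As a point of reference for the parsing problem in the abstract above, the following is a naive quadratic-time sketch of greedy LZ77 factorization; the algorithms from the talk compute the same (or an approximate) parse in linear time and with far less working space.

```python
# Naive greedy LZ77 factorization (quadratic time, illustration only).
def lz77_factorize(text):
    """Each factor is a fresh character ('char', c) or a maximal copy
    ('copy', src, length) of an occurrence starting at an earlier position."""
    factors, i, n = [], 0, len(text)
    while i < n:
        best_len, best_src = 0, 0
        for j in range(i):                       # candidate earlier start
            l = 0
            while i + l < n and text[j + l] == text[i + l]:
                l += 1
            if l > best_len:
                best_len, best_src = l, j
        if best_len == 0:
            factors.append(('char', text[i])); i += 1
        else:
            factors.append(('copy', best_src, best_len)); i += best_len
    return factors

print(lz77_factorize("abababbbb"))
# [('char', 'a'), ('char', 'b'), ('copy', 0, 4), ('copy', 5, 3)]
```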
An important problem in terrain analysis is modeling how water flows across a terrain and creates floods by filling up depressions. The accuracy of such modeling depends critically on the precision of the terrain data, and available high-resolution terrain models of even fairly small geographic regions often exceed the size of a computer's main memory. In such cases movement of data between main memory and external memory (such as disk) is often the bottleneck in the computation. Thus it is important to develop I/O-efficient modeling algorithms, that is, algorithms that minimize the movement of blocks of data between main memory and disk. In this paper we develop I/O-efficient algorithms for the problem of computing the areas of a terrain that are flooded in a given flash flood event due to water collecting in depressions. Previous work only considered events where rain falls at a constant uniform rate on the entire terrain. In reality, local extreme flash floods can affect downstream areas that do not receive heavy rainfall directly, so it is important to model such non-uniform events. Our main algorithm uses O(Sort(N) + Scan(H · X)) I/Os, where N is the size of the terrain, Sort(N) and Scan(N) are the number of I/Os required to sort and read N elements in the standard two-level I/O-model, respectively, X is the number of sinks in the terrain and H the height of the so-called merge-tree, which is a hierarchical representation of the depressions of the terrain. Under practically realistic assumptions about the main memory size compared to X and H, we also develop O(Sort(N)) I/O-algorithms. One of these algorithms can handle an event in optimal O(Scan(N)) I/Os after using O(Sort(N)) I/Os on preprocessing the terrain. We have implemented our algorithms and show that they work very well in practice. To appear at MASSIVE 2015. Joint work with Lars Arge, Morten Revsbæk, Sarfraz Raza.
In recent years trajectory data has become one of the main types of geographic data, and hence algorithmic tools to handle large quantities of trajectories are essential. A single trajectory is typically represented as a sequence of time-stamped points in the plane. In a collection of trajectories one wants to detect maximal groups of moving entities and their behaviour (merges and splits) over time. This information can be summarized in the *trajectory grouping structure*. Significantly extending the work of Buchin et al. [WADS 2013] into a realistic setting, we show that the trajectory grouping structure can be computed efficiently also if obstacles are present and the distance between the entities is measured by geodesic distance. We bound the number of *critical events*: times at which the distance between two subsets of moving entities is exactly ε, where ε is the threshold distance that determines whether two entities are close enough to be in one group. In case the n entities move in a simple polygon along trajectories with τ vertices each we give an O(τ n^2) upper bound, which is tight in the worst case. In case of *well-spaced* obstacles we give an O(τ(n^2 + m λ_4(n))) upper bound, where m is the total complexity of the obstacles, and λ_s(n) denotes the maximum length of a Davenport-Schinzel sequence of n symbols of order s. In case of general obstacles we give an O(τ(n^2 + m^2 λ_4(n))) upper bound. We also present lower bounds that show that the last two upper bounds are close to optimal. Furthermore, for all cases we provide efficient algorithms to compute the critical events, which in turn leads to efficient algorithms to compute the trajectory grouping structure. Appeared at SoCG 2015, joint work with: Irina Kostitsyna, Marc van Kreveld, Maarten Loffler, and Bettina Speckmann.
The weighted region problem in the plane models the varying difficulties of traversing different regions by associating different weights with them, and the cost of a path inside a region is equal to the product of its length and the region weight. There has been extensive research on this problem, but only one FPTAS is known so far, and it has a running time proportional to n^8, where n is the number of vertices in the polygonal environment. In this talk, we sketch a new FPTAS that has a worst-case running time proportional to n^4. In fact, when there are only O(1) small angles in the environment, our running time is linear in n. We will also discuss some related future research directions.
This is a talk in two parts. Part 1: Examples of social network games with specific properties. Social network games, as defined in an article by K. Apt and S. Simon, model a situation where multiple players choose products to use and each player wants to use the same product as the people who influence her. K. Apt, S. Simon and E. Markakis provided a few cases of paradoxical social network games, but the existence of the strongest form of paradox of choice (a so-called vulnerable network with optional product choice) was an open question. In this part of the talk I will present a construction, designed in collaboration with N. Nikitenkov, that gives uniform examples of all the types of paradoxical social network games, including vulnerable networks with optional product choice. Part 2: Finding cycles with approximately minimal mean edge weight. It is well-known that some problems related to paths in graphs can be reduced to matrix multiplication. K. Chatterjee, M. Henzinger, S. Krinninger and V. Loitzenbauer studied applications of recent results about approximate matrix multiplication to approximating the minimum mean edge weight along a cycle in the graph. In this part of the talk, I will present a way to extract a specific cycle with approximately minimal mean edge weight in the same asymptotic time as finding the weight itself.
We present a deterministic algorithm that computes the edge-connectivity of a graph in near-linear time. This is for a simple undirected unweighted graph G with n vertices and m edges. This is the first o(mn) time deterministic algorithm for the problem. Our algorithm is easily extended to find a concrete minimum edge-cut. In fact, we can construct the classic cactus representation of all minimum cuts in near-linear time. The previous fastest deterministic algorithm by Gabow from STOC'91 took Õ(m + k^2 n) time, where k is the edge connectivity, but k could be as big as n-1. At STOC'96 Karger presented a randomized near linear time Monte Carlo algorithm for the minimum cut problem. As he points out, there is no better way of certifying the minimality of the returned cut than to use Gabow's slower deterministic algorithm and compare sizes. Our main technical contribution is a near-linear time algorithm that contracts vertex sets of a simple input graph G with minimum degree d, producing a multigraph with Õ(m/d) edges which preserves all minimum cuts of G with at least 2 vertices on each side. In our deterministic near-linear time algorithm, we will decompose the problem via low-conductance cuts found using PageRank a la Brin and Page (1998), as analyzed by Andersen, Chung, and Lang at FOCS'06. Normally such algorithms for low-conductance cuts are randomized Monte Carlo algorithms, because they rely on guessing a good start vertex. However, in our case, we have so much structure that no guessing is needed. Joint work with Ken-ichi Kawarabayashi, National Institute of Informatics, Tokyo, Japan.
We consider the min-cost flow problem in bipartite graphs with the sources on one side and targets on the other side. In numerous applications the number of targets is small, typically constant. Hence we study the problem for a small number k of targets, and also assume that every source has unit capacity. We consider both the static problem to compute an optimal flow, as well as the dynamic problem, where the problem is to compute an optimal flow when a source is added or deleted, or the capacity of a target is incremented or decremented. Our main results are as follows. We present the first strongly linear-time algorithm for the static problem for a constant number of targets. For the static problem we present two algorithms with running time O((k^3 + k^2 log n) n) and 2^{O(k^2)} n, respectively. For the dynamic problem we show that the dynamic operations can be handled in (a) O(k^3 + k^2 log n) time; (b) O(k^3 + k^2 log log n) time on a standard RAM. For small k (constant or poly-logarithmic k) we present the first poly-logarithmic dynamic algorithm for the problem. A key technique used is to make the problem use a special sequence of priority query operations and then solve this sequence fast.
This talk will discuss sparse Johnson-Lindenstrauss transforms, i.e. sparse linear maps into much lower dimension which preserve the Euclidean geometry of a set of vectors. We derive upper bounds on the sufficient target dimension and sparsity of the projection matrix to achieve good dimensionality reduction. Our bounds depend on the geometry of the set of vectors, moving us away from worst-case analysis and toward instance-optimality. Joint work with Jean Bourgain (IAS) and Sjoerd Dirksen (RWTH Aachen)
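The following small sketch shows the shape of object the talk is about: a sparse linear map where each column of the m x d matrix carries s nonzero entries of value +-1/sqrt(s) in random rows. The concrete choices of m and s below are illustrative only, not the bounds derived in the talk.

```python
# Sketch of a sparse Johnson-Lindenstrauss matrix (illustrative parameters).
import numpy as np

def sparse_jl_matrix(d, m, s, rng):
    A = np.zeros((m, d))
    for col in range(d):
        rows = rng.choice(m, size=s, replace=False)    # s nonzeros per column
        signs = rng.choice([-1.0, 1.0], size=s)
        A[rows, col] = signs / np.sqrt(s)
    return A

rng = np.random.default_rng(0)
X = rng.normal(size=(20, 1000))          # 20 vectors in dimension d = 1000
A = sparse_jl_matrix(d=1000, m=200, s=8, rng=rng)
Y = X @ A.T                              # embedded into dimension m = 200
ratios = np.linalg.norm(Y, axis=1) / np.linalg.norm(X, axis=1)
print(ratios.min(), ratios.max())        # typically close to 1
```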
The 3SUM hardness conjecture has proven to be a valuable and popular tool for proving conditional lower bounds on the complexities of dynamic data structures and graph problems. This line of work was initiated by Patrascu [STOC 2010] and has received a lot of recent attention. Most of these lower bounds are based on reductions from 3SUM to a special set intersection problem introduced by Patrascu, which we call Patrascu's Problem. However, the framework introduced by Patrascu that reduces 3SUM to Patrascu's Problem suffers from some limitations, which in turn produce polynomial gaps between the achievable lower bounds via this framework and the known upper bounds. We address these issues by providing a tighter and more versatile framework for proving 3SUM lower bounds via a new reduction to Patrascu's Problem. Furthermore, our framework does not become weaker if 3SUM can be solved in truly subquadratic time, and provides some immediate higher conditional lower bounds for several problems, including for set intersection data-structures. For some problems, the new higher lower bounds meet known upper bounds, giving evidence to the optimality of such algorithms. During the talk, we will discuss this new framework, and show some new (optimal) lower bounds conditioned on the 3SUM hardness conjecture. In particular, we will demonstrate how some old and new triangle listing algorithms are optimal for any graph density, and, time permitting, prove a conditional lower bound for incremental Maximum Cardinality Matching which introduces new techniques for obtaining amortized lower bounds. This is joint work with Ely Porat and Seth Pettie
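For context, this is the textbook quadratic-time algorithm whose (near-)optimality the 3SUM hardness conjecture asserts; everything in the talk is conditioned on this barrier.

```python
# Classic O(n^2) 3SUM: find three (index-distinct) elements summing to zero.
def three_sum(a):
    a = sorted(a)
    n = len(a)
    for i in range(n - 2):
        lo, hi = i + 1, n - 1
        while lo < hi:
            s = a[i] + a[lo] + a[hi]
            if s == 0:
                return a[i], a[lo], a[hi]
            if s < 0:
                lo += 1          # need a larger sum
            else:
                hi -= 1          # need a smaller sum
    return None

print(three_sum([5, -2, 9, -7, 3]))   # (-7, -2, 9)
```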
Random bits are used in computer science to break symmetry and deadlock, ensure load-balancing, find a representative sample, maximize utility, and foil an adversary. Unfortunately, true randomness is difficult to guarantee, especially in a distributed setting where some agents may not be trustworthy. What happens if a hidden cabal is generating bits that are not random? Can we detect and neutralize such behavior? In this talk, we address this question in the context of a classic problem in distributed computing: Byzantine agreement. In Byzantine agreement, n agents, each with a private input, must agree on a single common output that is equal to some agent's input. Randomization is provably necessary to solve this problem, but past random algorithms required expected exponential time. We describe a new spectral algorithm that requires expected polynomial time. Our algorithm is designed so that in order to thwart it, corrupted agents must engage in statistically deviant behavior that is detectable by examining the top eigenvector of a certain matrix. This suggests a new paradigm for reliable distributed computing: the design of algorithms that force an attacker into behavior that is statistically deviant and computationally detectable.
We consider a monopolist seller with n heterogeneous items, facing a single buyer. The buyer has a value for each item drawn independently according to (non-identical) distributions, and his value for a set of items is additive. The seller aims to maximize his revenue. It is known that an optimal mechanism in this setting may be quite complex, requiring randomization [HR12] and menus of infinite size. Hart and Nisan have initiated a study of two very simple pricing schemes for this setting: item pricing, in which each item is priced at its monopoly reserve; and bundle pricing, in which the entire set of items is priced and sold as one bundle. Hart and Nisan have shown that neither scheme can guarantee more than a vanishingly small fraction of the optimal revenue. In sharp contrast, we show that for any distributions, the better of item and bundle pricing is a constant-factor approximation to the optimal revenue. We further discuss extensions to multiple buyers and to valuations that are correlated across items. Joint work with Nicole Immorlica, Brendan Lucier, S. Matthew Weinberg, appeared at FOCS.
We consider approval-based committee voting, i.e., the setting where each voter approves a subset of candidates, and these votes are then used to select a fixed-size set of winners (committee). We propose a natural axiom for this setting, which we call justified representation (JR). This axiom requires that if a large enough group of voters exhibits agreement by supporting the same candidate, then at least one voter in this group has an approved candidate in the winning committee. We show that for every list of ballots it is possible to select a committee that provides JR. We then check if this axiom is fulfilled by well-known approval-based voting rules. We show that the answer is negative for most of these rules, with notable exceptions of PAV (Proportional Approval Voting), an extreme version of RAV (Reweighted Approval Voting), and, for a restricted preference domain, MAV (Minimax Approval Voting). We then introduce a stronger version of the JR axiom, which we call extended justified representation (EJR), and show that PAV satisfies EJR, while other rules do not. We also consider several other questions related to JR and EJR, including the relationship between JR/EJR and unanimity, and the complexity of the associated algorithmic problems. Based on joint work with Haris Aziz, Markus Brill, Vince Conitzer, Rupert Freeman, and Toby Walsh (AAAI'15)
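As a small illustration of the axiom, here is a checker for JR on explicit approval ballots, assuming the usual group-size threshold of n/k voters; the ballots are a toy example, not data from the paper.

```python
# Check whether a committee provides justified representation (JR):
# no group of >= n/k voters may agree on a common approved candidate while
# none of them has any approved candidate in the committee.
def satisfies_jr(ballots, committee, k):
    n = len(ballots)
    unrepresented = [b for b in ballots if not (b & committee)]
    candidates = set().union(*ballots) if ballots else set()
    for c in candidates:
        supporters = sum(1 for b in unrepresented if c in b)
        if supporters * k >= n:        # supporters >= n/k, avoiding floats
            return False
    return True

ballots = [{'a'}, {'a'}, {'a', 'b'}, {'c'}]
print(satisfies_jr(ballots, committee={'a', 'c'}, k=2))   # True
print(satisfies_jr(ballots, committee={'b', 'c'}, k=2))   # False: the two
# voters approving only 'a' form a large enough unrepresented group
```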
Given a grid of colors, the heterogeneity of a cell is the number of different colors in a fixed-size window surrounding the cell. We consider the problem of computing the heterogeneity of all cells. Solutions to this problem can be applied to compute, for example, a visualization of the heterogeneity of soil throughout a raster-based terrain. Our algorithm for square windows runs in linear time (with respect to the number of cells). We also look at the related data structure problem of color range counting, showing that there is an efficient solution if queries are restricted to be squares (of any size). In contrast, color range counting with arbitrary rectangular queries is plausibly hard due to a known reduction from Boolean matrix multiplication to the offline variant. Joint work with Constantinos Tsirogiannis and Mark de Berg.
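The obvious baseline for the problem looks as follows; it spends time proportional to the window size for every cell, whereas the algorithm from the talk is linear in the number of cells for square windows.

```python
# Naive heterogeneity: for each cell, count distinct colors in the
# (2r+1) x (2r+1) window around it (windows are clipped at the boundary).
def heterogeneity(grid, r):
    rows, cols = len(grid), len(grid[0])
    out = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            colors = set()
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    if 0 <= i + di < rows and 0 <= j + dj < cols:
                        colors.add(grid[i + di][j + dj])
            out[i][j] = len(colors)
    return out

grid = [[1, 1, 2],
        [1, 3, 2],
        [4, 4, 2]]
print(heterogeneity(grid, r=1))   # the centre cell sees colors {1, 2, 3, 4}
```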
Monte Carlo Tree Search (MCTS) is a popular class of game playing algorithms based on online learning of the strategy from a large number of simulations. In this talk, I will briefly introduce the class of MCTS algorithms for solving zero-sum extensive form games. I will explain the main difficulties caused by imperfect information that prevent commonly used MCTS algorithms from converging to the optimal strategy. Afterwards, I will present two recent variants of these algorithms that guarantee eventual convergence to the Nash equilibrium. I will sketch the main ideas of the proofs of their convergence and present examples demonstrating their limitations.
Branch-and-reduce is a common strategy for finding exact solutions to NP-complete problems. For decades, such algorithms got more complex while the analyses thereof remained simple. I present the "measure and conquer" technique, enabling refined analysis through the insight that not all parts of the input contribute equally to its complexity. Applied to two simple algorithms -- for minimum dominating set and maximum independent set -- we drastically improve on, resp. become competitive with, the leading algorithms.
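To make the kind of algorithm the analysis applies to concrete, here is a toy branch-and-reduce procedure for maximum independent set; the reduction and branching rules are the standard ones, and measure-and-conquer is about analyzing such recursions more tightly, not about the code itself.

```python
# Toy branch-and-reduce for maximum independent set (returns the size only).
def mis_size(adj):
    """adj: dict mapping each vertex to the set of its neighbours."""
    if not adj:
        return 0
    # Reduction: a vertex of degree <= 1 is always in some optimal solution.
    v = next((u for u in adj if len(adj[u]) <= 1), None)
    if v is not None:
        return 1 + mis_size(induced(adj, set(adj) - {v} - adj[v]))
    # Branch on a maximum-degree vertex: either discard it, or take it and
    # discard all of its neighbours.
    v = max(adj, key=lambda u: len(adj[u]))
    return max(mis_size(induced(adj, set(adj) - {v})),
               1 + mis_size(induced(adj, set(adj) - {v} - adj[v])))

def induced(adj, keep):
    return {u: adj[u] & keep for u in keep}

cycle5 = {i: {(i - 1) % 5, (i + 1) % 5} for i in range(5)}
print(mis_size(cycle5))   # 2
```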
We introduce the hollow heap, a very simple data structure with the same amortized efficiency as the classical Fibonacci heap. All heap operations, except delete and delete-min, take O(1) time, worst case as well as amortized; delete and delete-min take O(log n) amortized time. Hollow heaps are by far the simplest structure to achieve this. Hollow heaps combine two novel ideas: the use of lazy deletion and re-insertion to do decrease-key operations, and the use of a dag (directed acyclic graph) instead of a tree or set of trees to represent a heap. Lazy deletion produces hollow nodes (nodes without items), giving the data structure its name. Joint work with Haim Kaplan, Robert E. Tarjan, and Uri Zwick.
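The sketch below is emphatically not a hollow heap; it only illustrates the lazy deletion/re-insertion idea the structure builds on, using Python's heapq: decrease-key pushes a fresh entry and leaves the old one behind as a "hollow" entry that delete-min discards when it surfaces.

```python
# Lazy-deletion priority queue sketch (illustration of the idea only).
import heapq, itertools

class LazyHeap:
    def __init__(self):
        self.heap, self.entry, self.counter = [], {}, itertools.count()

    def insert(self, item, key):
        e = [key, next(self.counter), item, False]   # last field: hollow flag
        self.entry[item] = e
        heapq.heappush(self.heap, e)

    def decrease_key(self, item, new_key):
        self.entry[item][3] = True                   # hollow out the old entry
        self.insert(item, new_key)                   # re-insert with the new key

    def delete_min(self):
        while self.heap:
            key, _, item, hollow = heapq.heappop(self.heap)
            if not hollow:
                del self.entry[item]
                return item, key
        raise KeyError("heap is empty")

h = LazyHeap()
h.insert('a', 7); h.insert('b', 3); h.decrease_key('a', 1)
print(h.delete_min())   # ('a', 1)
print(h.delete_min())   # ('b', 3)
```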
I will present time-space tradeoff lower bounds and algorithms for exactly computing statistics of input data, including frequency moments, element distinctness, and order statistics. These problems are simple to solve for sorted data. Determining the complexity of these problems where space is limited, on the other hand, requires considerably more effort. I will first discuss a new randomised algorithm for the element distinctness problem that, using space S, runs in time roughly O(n^{3/2}/S^{1/2}), below previous time-space lower bounds for comparison-based algorithms. I will then show how to extend this method to a sliding window version with only log factor overheads. If there is time, I will then show tight (to within log factors) time-space tradeoff bounds for computing the number of distinct elements, F_0, over sliding windows. The same lower bound holds for computing the low-order bit of F_0 and computing any frequency moment F_k for k ≠ 1. This shows that the frequency moments F_k, k ≠ 1, and even the decision problem F_0 mod 2 are strictly harder than element distinctness. The paper is available online at http://arxiv.org/abs/1309.3690. Joint work with Paul Beame and Widad Machmouchi.
We study similarity measures on strings, such as longest common subsequence and edit distance, and on curves, such as Fréchet distance and dynamic time warping. All of these measures have simple quadratic time dynamic programs (at least the decision versions). Recently quadratic-time lower bounds based on the Strong Exponential Time Hypothesis have been shown for Fréchet distance and edit distance. We extend these lower bounds by building a framework for proving quadratic-time hardness of similarity measures. We use this framework to obtain quadratic-time lower bounds for longest common subsequence (on binary strings) and dynamic time warping (on time-series). Moreover, we improve the hardness of edit distance by proving it on binary strings and for any non-trivial costs of the edit distance operations.
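These are the "simple quadratic time dynamic programs" the abstract refers to, written out for longest common subsequence and dynamic time warping; the talk's contribution is the matching conditional lower bounds, not these algorithms.

```python
# Quadratic DPs for LCS and DTW (the upper bounds the hardness results match).
def lcs_length(a, b):
    n, m = len(a), len(b)
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            if a[i - 1] == b[j - 1]:
                dp[i][j] = dp[i - 1][j - 1] + 1
            else:
                dp[i][j] = max(dp[i - 1][j], dp[i][j - 1])
    return dp[n][m]

def dtw_distance(x, y):
    n, m = len(x), len(y)
    INF = float('inf')
    dp = [[INF] * (m + 1) for _ in range(n + 1)]
    dp[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = abs(x[i - 1] - y[j - 1]) + min(dp[i - 1][j - 1],
                                                      dp[i - 1][j],
                                                      dp[i][j - 1])
    return dp[n][m]

print(lcs_length("1011", "0110"))          # 3
print(dtw_distance([1, 2, 3], [1, 3, 3]))  # 1.0
```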
Data summarization is an effective approach to dealing with the "big data" problem. While data summarization problems traditionally have been studied in the streaming model, the focus is starting to shift to distributed models, as distributed/parallel computation seems to be the only viable way to handle today's massive data sets. In this talk, I will talk about ε-approximations, a classical data summary that, intuitively speaking, preserves approximately the density of the underlying data set over a certain range space. We consider the problem of computing ε-approximations for a data set which is held jointly by k players, and prove an optimal deterministic communication lower bound for halfplanes.
Let D = {T_1, T_2, ..., T_D} be a collection of D string documents of n characters in total, drawn from an alphabet set Σ = [σ]. The top-k document retrieval problem is to preprocess D into a data structure that, given a query (P[1..p], k), can return the k documents of D most relevant to the pattern P. The relevance is captured using a predefined ranking function, which depends on the set of occurrences of P in T_d. For example, it can be the term frequency (i.e., the number of occurrences of P in T_d), or it can be the term proximity (i.e., the distance between the closest pair of occurrences of P in T_d), or a pattern-independent importance score of T_d such as PageRank. Linear space and optimal query time solutions already exist for this problem. Compressed and compact space solutions are also known, but only for a few ranking functions such as term frequency and importance. However, space efficient data structures for term proximity based retrieval have been evasive. In this talk we present the first sub-linear space data structure for the term-proximity relevance function, which uses only o(n) bits on top of any compressed suffix array of D and solves queries in time O((p + k) polylog n).
The Johnson-Lindenstrauss lemma, dating back to 1984, says that for any set P of n points in d-dimensional Euclidean space, there exists a map f : P → R^m, with m = O(ε^{-2} lg n), such that all pairwise distances between points in P are preserved to within a factor of (1+ε) when mapped to m-dimensional Euclidean space using f, i.e. for all p, q in P, it holds that |p-q|^2/(1+ε) < |f(p)-f(q)|^2 < (1+ε)|p-q|^2. Observe that m does not depend on d, and hence the transformation works even for very high dimensional points (d >> n). Furthermore, all known proofs of the JL-lemma even provide a mapping f which is linear (the mapping f can be represented by an m × d matrix A, and a point p is mapped by computing f(p) = Ap). The Johnson-Lindenstrauss lemma has found numerous applications in computer science, signal processing (e.g. compressed sensing), statistics and mathematics. The main idea in algorithmic applications is that one can transform a high-dimensional problem into a low-dimensional one such that an optimal solution to the low-dimensional problem can be lifted to a near-optimal solution to the original problem. Due to the decreased dimension, the lower dimensional problem requires fewer resources (time, memory etc.) to solve. Given the widespread use of dimensionality reduction across several domains, it is a natural and often asked question whether the JL-lemma is tight: Does there exist some set of n points P in R^d such that any map f : P → R^m providing the JL-guarantee must have m = Ω(ε^{-2} lg n)? In the original paper of Johnson and Lindenstrauss (1984), the authors proved that there exists a set P requiring m = Ω(lg n). Much later, Alon (2003) proved that there exists a set of n points, P, requiring m = Ω(ε^{-2} lg n / lg(1/ε)), matching the upper bound up to the factor of lg(1/ε). This remained the strongest lower bound prior to our work. 30 years after Johnson and Lindenstrauss' seminal paper, we prove a tight lower bound of m = Ω(ε^{-2} lg n). The lower bound holds for any mapping f which is linear, but as mentioned, all known proofs of the JL-lemma provide a mapping f which is in fact linear. Thus one can interpret our lower bound in two ways, either as strong evidence that the JL-lemma is optimal, or as saying that any further progress must come by designing mappings that are non-linear. This seems to require a completely new approach to dimensionality reduction. Joint work with Jelani Nelson, Harvard University.
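The upper-bound side of the lemma is easy to see experimentally; the snippet below uses the standard Gaussian construction of a linear map (one of the known proofs' mappings, unrelated to the lower-bound construction of the talk), with an illustrative, non-optimized constant in the choice of m.

```python
# Numerical illustration of Johnson-Lindenstrauss dimensionality reduction.
import numpy as np

rng = np.random.default_rng(0)
n, d, eps = 50, 10_000, 0.25
m = int(8 * np.log(n) / eps**2)            # m = O(eps^-2 lg n), rough constant

P = rng.normal(size=(n, d))                # n points in dimension d
A = rng.normal(size=(m, d)) / np.sqrt(m)   # the linear map f(p) = A p
Q = P @ A.T

worst = 0.0
for i in range(n):
    for j in range(i + 1, n):
        ratio = np.linalg.norm(Q[i] - Q[j])**2 / np.linalg.norm(P[i] - P[j])**2
        worst = max(worst, abs(ratio - 1))
print(m, worst)   # the worst pairwise distortion is typically well below eps
```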
In game theory, we usually assume that the players have no sense of time. In this model, if Alice wants to have a meeting with Bob and another meeting with Charlie, she can do so without giving them any information about whom she met first. However, in practice the time of the meetings would reveal some information. If the meetings are one hour long and have to be between 2pm and 4pm on a particular day, Bob and Charlie would know if they had the first meeting, but if the meetings only had to be in 2014 they would get almost no information. In this talk we define a measure of how much information the players learn, and we show that there exist games where we need time 2^2^...^2^(1/ε) if we want to reveal at most ε information, where the power tower can be arbitrarily high. We argue that "timeability" is similar to perfect recall - you cannot play untimeable games in the real world - and that timeability should be a standard assumption just as perfect recall is. Joint work with Vincent Conitzer and Troels B. Sørensen.
It has always bothered me that geometric algorithms that we prove are correct can still fail due to numerical precision issues. This is because we prove them correct for a model of computation that supports operation on real numbers, but we implement them on computers that use floating point. Liotta, Preparata and Tamassia suggested that considerations of precision could be incorporated into the design of geometric algorithms by limiting the degree of polynomials used to compute predicates. For example, for testing if two line segments cross at an intersection point, degree 2 (so roughly double precision) is enough, since you simply test if each segment's endpoints are on opposite sides of the line through the other segment. The coordinates of an intersection are rational polynomials of degree 3 over degree 2, however, so comparing the x-coordinates of intersections is a degree-5 operation (roughly 5 times input precision). This is why the classic Bentley-Ottmann sweep, which finds all k intersections of n line segments in O((n+k) log n) time, has numerical problems. For the special case of red/blue segment intersection, in which there are no red/red or blue/blue crossings (cases occur in CAD/CAM computing intersections or unions of two polygonal regions, or in GIS overlaying two maps), Andrea Mantler and I derived a degree 2 algorithm that runs in O(nlog n + k) time several years ago. Recently I've simplified and extended this algorithm as an example of how degree driven design can lead to correct geometric algorithms that handle all degenerate inputs in a consistent way; it is the basis of a series of programming assignments for an undergraduate algorithms class. I'll describe the algorithm and comment on what I and the students are learning through this assignment.
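As a concrete sketch of the degree-2 predicate mentioned above: two segments properly cross exactly when the endpoints of each lie strictly on opposite sides of the line through the other, and each side test is the sign of a 2x2 determinant in the input coordinates. Degenerate (touching or collinear) configurations are deliberately left out of this toy version, although handling them consistently is precisely the point of the degree-driven design in the talk.

```python
# Degree-2 test for proper crossing of two segments p1p2 and q1q2.
def orient(a, b, c):
    """Sign of (b - a) x (c - a): +1 left turn, -1 right turn, 0 collinear."""
    det = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    return (det > 0) - (det < 0)

def properly_cross(p1, p2, q1, q2):
    return (orient(p1, p2, q1) * orient(p1, p2, q2) < 0 and
            orient(q1, q2, p1) * orient(q1, q2, p2) < 0)

print(properly_cross((0, 0), (4, 4), (0, 4), (4, 0)))   # True
print(properly_cross((0, 0), (4, 4), (5, 0), (9, 4)))   # False
```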
Ensemble methods are machine learning algorithms that construct a set of classifiers and combine their results in a way that outperforms each of them. In a big data world, Natural Language Processing (NLP) is essential to analyze the data related to human communication. We will study one of the most challenging NLP tasks, machine translation (multi-label classification), and discuss the bridge between applications and theory of ensemble learning applied in contemporary statistical machine translation, with a focus on two issues: (1) How to produce different classifiers through approaches such as corpus mining (via IR, pivot language and comparable corpus) and bagging; (2) How to combine the results of single classifiers on the level of models and systems, for instance synchronous model training and mixture of experts.
Many case studies in Ecology examine the interactions between groups of species that live in the same region. A very common problem that appears in these studies is the following: given a group of species R, we want to measure how closely related these species are. The standard way to do this is by using a phylogenetic tree; this is the tree that represents the evolution history of species. Each tree node corresponds to a different kind of species; the leaves represent species that exist today, while internal nodes represent ancestor species of the past. The tree edges indicate how species have evolved from each other, and the (positive) weights of the edges represent some kind of distance between populations e.g. amount of time since a speciation event. Therefore, the problem that we want to solve can be formulated as follows: given a tree T and a set R of its leaf nodes (that represent the species that we want to examine) we want to measure what is the "distance" in T between the nodes in R. There are several ways to calculate some distance value between these nodes; we can consider the cost of the min-weight Steiner tree between these nodes, the average path cost etc.. By calculating one of these measures we get a value for the set R, say d(R). However, this value by itself is not enough to show if the species in R are closely related or not; to find that out we need to have a point of reference. Thus, we would like to compute the mean value of d(.) among all possible sets of leaves in T that consist of the same number of elements as R. Comparing this mean with d(R) would then solve the problem. Also, other than the mean, ecologists are interested to compute higher order statistics for d(.) among all subsets of leafs that have the same size as R. In this talk, we present several efficient algorithms for computing statistics of distance measures in a tree. We present results for a variety of measures that are used in biological case studies, but we also describe a large collection of interesting open problems that are related with this topic. The presented results are part of a collaboration project between MADALGO and the Ecoinformatics group at AU.
Dynamic programming is a well-known algorithm design technique which can be applied to a broad class of problems for which the so-called "principle of optimality" holds. The solution of any dynamic program is described through a recurrence relation, so that its solution involves solving the same problem on smaller instances, which may overlap. Instead of solving the recurrence directly, which would mean solving the same subproblems many times, a "table" is defined, in which each entry is a sub-problem whose resolution depends on some sub-problems located in entries of the table with smaller coordinates. A naive algorithm for solving a dynamic program involves scanning the entire matrix in a suitable order, evaluating every dependency of each sub-problem. This leads to a time complexity which is proportional to the size of the matrix times the mean number of dependencies. This is indeed optimal when no structural properties of the input data can be shown. However, this is not the case in many applications, where the dependencies among subproblems reflect the structural properties of the problem at hand. In these cases, those properties might be exploited to "prune" redundant computation, leading to polynomial speed-ups. In this talk we expose some general techniques to achieve polynomial speed-ups when some specific properties of the dynamic programming formulation hold. We also illustrate a new general technique which has been used to speed up the computation of a data compression problem, a result which suggests a promising research direction in this topic.
We consider maintaining the contour tree T of a piecewise-linear triangulation M that is the graph of a time varying height function h : R^2 → R. In this talk, we describe the combinatorial changes in T that happen as h varies over time and how these changes relate to topological changes in M. Also, we present a kinetic data structure that maintains the contour tree of M over time. Our data structure maintains certificates that fail only when h(v) = h(u) for two adjacent vertices v and u in M, or when saddle vertices lie on the same contour of M. A certificate failure is handled in O(log n) time. Joint work with Pankaj Agarwal, Lars Arge, Thomas Mølhave, and Morten Revsbæk.
We prove the linearity (of the lengths) of some generalized Davenport-Schinzel sequences. Standard Davenport-Schinzel sequences of order 2 (avoiding abab) are linear, while those of order 3 (avoiding ababa) and higher can be superlinear. Our goal is to determine what pattern(s), in addition to ababa, must be forbidden to regain linearity.
CAT(0) metric spaces (or metric spaces of global non-positive curvature) constitute a far-reaching common generalization of Euclidean and hyperbolic spaces and simple polygons: any two points x and y of a CAT(0) metric space are connected by a unique shortest path γ(x, y). In this talk, we present an efficient algorithm for answering two-point distance queries in CAT(0) rectangular complexes. Namely, we show that for a CAT(0) rectangular complex K with n vertices, one can construct a data structure D of size O(n^2) so that, given any two points x, y in K, the shortest path γ(x, y) between x and y can be computed in O(d(p, q)) time, where p and q are vertices of two faces of K containing the points x and y, respectively, such that γ(x, y) is contained in the subcomplex K(I(p, q)), d(p, q) is the distance between p and q in the underlying graph of K, and I(p, q) is the interval between p and q in the underlying graph of K.
Two-dimensional finite element methods require repeated traversals of a mesh of squares or triangles. By processing the mesh elements in order along a well-chosen space-filling curve, one can avoid storing the mesh in complicated and cache-inefficient data structures: only a small number of stacks are needed. For three-dimensional meshes, however, known space-filling curves do not quite suffice. In this presentation I present our latest results on what desirable properties of such space-filling curves can and cannot be realized.
Emerging technologies such as smartphones and GPS enable the effortless collection of trajectories and other tracking data. More generally, a time-series is a recording of a signal that changes over time. This type of data is important in many different domains: consider monitoring credit card transactions, river water levels, or vital parameters of a medical patient, for example. There is a growing need for efficient algorithms to handle time-series data. An important tool for the efficient handling of large amounts of data is clustering, since it provides a "summary" of the data. It enables the discovery of hidden structures by grouping similar elements together. It is fundamental in performing tasks as diverse as data aggregation, similarity retrieval, prediction, and anomaly detection. I will first talk about the problem of clustering trajectories and time series. The need for data structures which support efficient distance queries naturally arises in this context. I will also talk about other applications, where data structures for trajectories would be useful. In the second part of the talk, I will describe a data structure which can be used for some of those applications.
Algorithmic mechanism design is now widely studied for various scenarios. In this talk, we discuss two applications: CPU time auction and facility location problem. In CPU time auction, we designed two greedy frameworks which can achieve truthfulness (approximate-truthfulness) from the bidders while at the same time a certain global objective is optimized or nearly optimized. In facility location problem, we introduce weight to the traditional study and prove that those mechanisms that ignore weight are the best we can have. Furthermore, we also propose a new threshold based model where the solution that optimizes the social welfare is incentive compatible. Bio: Minming Li is currently an associate professor in the Department of Computer Science, City University of Hong Kong. He received his Ph. D. and B.E. degree in the Department of Computer Science and Technology at Tsinghua University in 2006 and 2002 respectively. His research interests include algorithm design and analysis in wireless networks and embedded systems, combinatorial optimization and algorithmic game theory.
The rectangle enclosure problem asks to report all k enclosing pairs of n input rectangles in 2D. I will present the first deterministic algorithm that takes O(n log n + k) worst-case time and O(n) space in the word-RAM model. It improves previous deterministic algorithms with O((n log n + k) log log n) running time. The result is achieved by derandomizing the algorithm of Chan, Larsen and Patrascu [SoCG'11] that attains the same time complexity but in expectation. The 2D rectangle enclosure problem is related to the offline dominance range reporting problem in 4D, and our result leads to the currently fastest deterministic algorithm for offline dominance reporting in any constant dimension d. Joint work with: Peyman Afshani and Timothy M. Chan.
The Fréchet distance is a well-studied and very popular measure of similarity of two curves. Many variants and extensions have been studied since Alt and Godau introduced this measure to computational geometry in 1991. Their original algorithm to compute the Fréchet distance of two polygonal curves with n vertices has a runtime of O(n^2 log n). More than 20 years later, the state of the art algorithms for most variants still take time more than O(n^2/log n), but no matching lower bounds are known, not even under reasonable complexity theoretic assumptions. To obtain a conditional lower bound, we assume the Strong Exponential Time Hypothesis. Under this assumption we show that the Fréchet distance cannot be computed in strongly subquadratic time, i.e., in time O(n^{2-δ}) for any δ > 0. This means that finding faster algorithms for the Fréchet distance is as hard as finding faster SAT algorithms, and the existence of a strongly subquadratic algorithm can be considered unlikely.
The order-maintenance data structure is one of the most used black-boxes for many dynamic problems. The data structure allows to maintain a dynamic ordered list in constant time per update (assuming a pointer is given to the location) while supporting constant time order queries in which one wishes to determine the order of two elements in the list. All of these time bounds are in a worst case sense. Unfortunately, the correctness of the known solutions is considered by many to be questionable, due to a black-box usage of another unclear/complex data structure for another problem known as the file-maintenance problem. We will survey the interesting history of the order maintenance problem and describe a new simple solution that has some additional properties which are useful for applications such as online suffix tree construction and fully persistent arrays. Finally, if time permits we will discuss a new understandable solution for the file-maintenance problem.
Let S = {d_1, d_2, ..., d_D} be a collection of D string documents of n characters in total. The two-pattern matching problems ask to index S for answering the following queries efficiently: 1) report/count the unique documents containing P_1 and P_2; 2) report/count the unique documents containing P_1, but not P_2. Here P_1 and P_2 represent input patterns of length p_1 and p_2 respectively. Linear space data structures with O(p_1 + p_2 + √(nk) log^{O(1)} n) query cost are already known for the reporting version, where k represents the output size. For the counting version (i.e., report the value k), a simple linear-space index with O(p_1 + p_2 + √n) query cost can be constructed in O(n^{3/2}) time. However, it is still not known if these are the best possible bounds for these problems. In this talk we show a strong connection between these string indexing problems and the boolean matrix multiplication problem. Based on this, we argue that these results cannot be improved significantly using purely combinatorial techniques. We also provide an improved upper bound for a related problem known as two-dimensional substring indexing. Joint work with Kasper Green Larsen, J. Ian Munro, Jesper Sindahl Nielsen, and Sharma V. Thankachan.
We present the first combinatorial polynomial time algorithm for computing the equilibrium of the Arrow-Debreu market model with linear utilities. Our algorithm views the allocation of money as flows and iteratively improves the balanced flow as in [Devanur et al. 2008] for Fisher's model. We develop new methods to carefully deal with the flows and surpluses during price adjustments. Our algorithm performs O(n^6 log(nU)) maximum flow computations, where n is the number of agents and U is the maximum integer utility. The flows have to be presented as numbers of bitlength O(n log(nU)) to guarantee an exact solution. Previously, [Jain 2007, Ye 2007] have given polynomial time algorithms for this problem, which are based on solving convex programs using the ellipsoid algorithm and the interior-point method, respectively.
The Bregman divergences D_φ are a class of distance measures parametrized by a convex function φ. They include well known distance functions like ℓ_2^2 and the Kullback-Leibler divergence from information theory and are used extensively in data-driven applications such as machine learning, computer vision, text mining, and speech processing. There has been extensive research on algorithms for problems like clustering and near neighbor search with respect to Bregman divergences; in all cases, the algorithms depend not just on standard parameters like the data size n, dimensionality d, number of probes allowed t, and error ε, but also on a structure constant μ ≥ 1 that depends solely on φ and can grow without bound independent of the other parameters. This dependence has withstood attempts to remove it, and appears to be intrinsic to the study of these measures. In this paper, we provide the first evidence that this dependence might be intrinsic. In particular, we focus on the problem of approximate near neighbor search for Bregman divergences, a problem of interest in its own right. We show that under the cell probe model, any non-adaptive data structure (like locality-sensitive hashing) for c-approximate near-neighbor search that admits r probes must use space Ω(n^{1 + μ/(cr)}). In contrast, for LSH under ℓ_1 the best bound is Ω(n^{1 + 1/(cr)}). In proving this lower bound, we follow and extend the Fourier-analytic approach that has yielded many lower bounds for such problems. Our new tool is a directed variant of the standard boolean noise operator, for which we prove a generalization of the Bonami-Beckner hypercontractivity inequality. While such an inequality cannot exist in general, we show that it exists "in expectation" or upon restriction to certain subsets of the Hamming cube, and that this is sufficient to prove the desired isoperimetric inequality that we use in our data structure lower bound. This operator might be of independent interest in the analysis of boolean functions. We also present a structural result reducing the Hamming cube to a Bregman cube. This structure allows us to obtain lower bounds for problems under Bregman divergences from their ℓ_1 analog. In particular, we get a (weaker) lower bound for approximate near neighbor search of the form Ω(n^{1 + 1/(cr)}) for an r-query non-adaptive data structure, and new cell probe lower bounds for a number of other related near neighbor questions in Bregman space. Joint work with Amirali Abdullah.
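For readers unfamiliar with the definition, the snippet below just evaluates D_φ(x, y) = φ(x) - φ(y) - ⟨∇φ(y), x - y⟩ and checks the two standard special cases named above (squared Euclidean distance and KL divergence on probability vectors); it has no bearing on the lower-bound machinery.

```python
# Bregman divergence and two standard instances (illustration only).
import numpy as np

def bregman(phi, grad_phi, x, y):
    return phi(x) - phi(y) - np.dot(grad_phi(y), x - y)

sq = lambda v: np.dot(v, v)                      # phi(v) = ||v||^2
grad_sq = lambda v: 2 * v
neg_ent = lambda v: np.sum(v * np.log(v))        # negative entropy
grad_neg_ent = lambda v: np.log(v) + 1

x = np.array([0.2, 0.3, 0.5])
y = np.array([0.3, 0.3, 0.4])
print(bregman(sq, grad_sq, x, y), sq(x - y))                 # equal: l_2^2
print(bregman(neg_ent, grad_neg_ent, x, y),
      np.sum(x * np.log(x / y)))                             # equal: KL
```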
The goal of this seminar is to present some recent results regarding space in proof complexity. It will consist first of a gentle introduction to proof complexity and in particular to weak proof systems, such as Resolution and Polynomial Calculus. Then we will focus on space complexity measures, how they relate to other complexity measures, and on a recent result on Total Space in Resolution by me, Galesi and Thapen: an asymptotically optimal Total Space lower bound for random k-CNFs.
Partial persistence is a general transformation that takes a data structure and allows queries to be executed on any past state of the structure. The cache-oblivious model is the leading model of a modern multi-level memory hierarchy. We present the first general transformation for making cache-oblivious model data structures partially persistent. Joint work with: Pooya Davoodi, Jeremy T. Fineman and Özgür Özkan
In recent years, there has been a growing conviction, both in academia and at companies like Google, that so-called knowledge graphs will play an important role in improving natural language processing, Web search, and artificial intelligence. Edges in such graphs reflect relationships between entities, e.g. people, places, or words and their meanings. In this talk, I provide an overview of recent advances on collecting such knowledge from the Web using novel information extraction and joint link prediction methods. I will highlight some semantic applications of these methods, e.g. for taxonomy induction and adjective intensities. I will also present new resources like Lexvo.org, WebChild, and UWN/MENTA, a large multilingual knowledge graph covering over 100 languages. Bio: Gerard de Melo is an Assistant Professor at Tsinghua University, where he is heading the Web Mining and Language Technology group. Previously, he was a post-doctoral researcher at UC Berkeley working in the ICSI AI group, and a doctoral candidate at the Max Planck Institute for Informatics. He has published over 40 research papers on Web Mining and Natural Language Processing, winning Best Paper Awards at conferences like CIKM and ICGL. For more information, please visit http://gerard.demelo.org/.
Let P be a set of n points in the plane. The k-nearest neighbor (k-NN) query problem is to preprocess P into a data structure that quickly reports the k closest points in P for a query point q. The group nearest neighbor problem is a generalization of the k-NN query problem to a query set Q of points. More precisely, a query consists of a set Q of at most m points and a positive integer k with k ≤ n, and the distance between a point p and a query set Q is defined as the sum of distances from p to all q ∈ Q. We can solve this problem in O(nm) time in a straightforward way. Without any preprocessing we cannot avoid Ω(n) time. In this presentation, we show algorithms that take o(n) time to compute the group k-nearest neighbors for small m and k after preprocessing the points in P.
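The "straightforward way" mentioned above is simply the following O(nm + n log n) computation with no preprocessing; the talk's data structures are about beating this for small m and k.

```python
# Naive group k-nearest neighbors under the sum-of-distances aggregate.
import math

def group_k_nn(P, Q, k):
    def aggregate(p):
        return sum(math.dist(p, q) for q in Q)   # sum of distances to Q
    return sorted(P, key=aggregate)[:k]

P = [(0, 0), (5, 5), (1, 2), (9, 1)]
Q = [(1, 1), (2, 2)]
print(group_k_nn(P, Q, k=2))   # [(1, 2), (0, 0)]
```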
Nowadays location aware devices are commonplace and produce large amounts of trajectories of moving objects like humans, cars, etc. To analyse this data there is need for proper visualisation techniques and methods to automatically extract knowledge from the data. We propose to enrich existing visualisations of trajectories by indicating significant occurrences of specific patterns. We give definitions of two patterns, junctions and stop areas, and study the geometric properties of these formalisations. We also give efficient algorithms to compute a given number of significant locations, and we implemented a variation of these to demonstrate the proposed technique.
We present the first efficient deterministic algorithms that, given a set of n three-dimensional points, construct optimal size (single and multiple) shallow cuttings for orthogonal dominance ranges. In particular, we show how to construct a single shallow cutting in O(n log n) worst case time, using O(n) space. We also show how to construct, in the same complexity, a logarithmic number of shallow cuttings. Only polynomial guarantees were previously achieved for the deterministic construction of shallow cuttings, even in three dimensions. We will also discuss a few interesting questions left open by this work.
We introduce and examine the Holiday Gathering Problem which models the difficulty that couples have when trying to decide with which parents should they spend the holiday. Our goal is to schedule the family gatherings so that the parents will be happy, i.e., all their children will be home simultaneously for the holiday festivities, while minimizing the number of consecutive holidays in which parents are not happy. The holiday gathering problem generalizes and is related to several classical problems in computer science, such as the dining philosophers problem on a general graph and periodic scheduling. We also show interesting connections between periodic scheduling, coloring, and universal prefix free encodings. We will cover the following: (1) A combinatorial definition of the Holiday Gathering Problem. (2) A `heavyweight' non-periodic simple solution that guarantees that the parents of d children will host all of them at least once in every d+1 years. (3) A lightweight perfectly-periodic solution in which the parents of d children are guaranteed to host them all together at least once in every 2^⌈log d⌉ ≤ 2d years. (4) A lightweight coloring-based technique in which parents host their children every set number of years, and for parents with color c the number of holidays between being happy is at most 2^{1+log* c} · ∏_{i=0}^{log* c} log^{(i)} c, where log^{(i)} means iterating the log function i times. This is achieved via a connection with prefix-free encodings. We also prove that the performance of this algorithm almost matches a lower bound for coloring-based algorithms of this scheme. This is joint work with Amihood Amir, Oren Kapah, Moni Naor, and Ely Porat.
This thesis visits the forefront of algorithmic research on edge coloring of cubic graphs. We select a set of algorithms that are among the asymptotically fastest known today. Each algorithm has exponential time complexity, owing to the NP-completeness of edge coloring, but their space complexities differ greatly. They are implemented in a popular high-level programming language to compare their performance on a set of real instances. We also explore ways to parallelize each of the algorithms and discuss what benefits and detriments those implementations hold.
In this talk I will discuss a new result on (min,+) matrix multiplication by Ryan Williams that recently appeared on the arXiv: http://arxiv.org/abs/1312.6680 - Faster all-pairs shortest paths via circuit complexity. The paper gives an algorithm for (min,+) matrix multiplication running in n^3 / 2^{Ω((log n / log log n)^{1/2})} time, which is an impressive result that gives improved bounds for a wide range of problems (including APSP). The previous best was n^3/log n. I will give some background on APSP and how it relates to (min,+) matrix multiplication and continue to explain RW's new algorithm, which is not nearly as complicated as it might sound from the title (circuit complexity!!!).
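For background, the cubic (min,+) product and its textbook connection to APSP look as follows; the point of the paper is to beat this by a superpolylogarithmic factor.

```python
# Textbook (min,+) matrix product and APSP via repeated (min,+) squaring.
INF = float('inf')

def min_plus(A, B):
    n = len(A)
    C = [[INF] * n for _ in range(n)]
    for i in range(n):
        for k in range(n):
            if A[i][k] == INF:
                continue
            for j in range(n):
                if A[i][k] + B[k][j] < C[i][j]:
                    C[i][j] = A[i][k] + B[k][j]
    return C

def apsp(W):
    """W[i][j]: edge weight (INF if absent, 0 on the diagonal)."""
    D, covered = W, 1
    while covered < len(W) - 1:          # O(log n) squarings cover all paths
        D, covered = min_plus(D, D), covered * 2
    return D

W = [[0, 3, INF],
     [INF, 0, 1],
     [2, INF, 0]]
print(apsp(W))   # [[0, 3, 4], [3, 0, 1], [2, 5, 0]]
```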
Information theory is the source of many computational lower bounds, but the relationship between time and space is rarely used in the other direction. In this talk, I will describe three examples of problems where computational upper bounds defined by algorithms (e.g. sorted search, sorting, merging) yield compressed encodings and inspire efficient compressed data structures (e.g. integers, permutations, bitvectors), and some related open problems. The techniques and open problems described are understandable by any graduate student in computer science (not only theory). Short Vita: Jeremy received a BSc in Mathematics in 1997 in Rouen, a Master in 1998 and a PhD in 2002, both in Computer Science in Orsay. He was a posdoctoral fellow at the university of British Columbia until 2004 and an assistant professor at the university of Waterloo until 2008. He is now assistant professor at the university of Chile. Jeremy's main research is about the analysis of algorithms and data-structures on finer classes of instances than those merely defined by their size, which yields to adaptive algorithms, instance optimality, output sensitive and parameterized complexity, compressed data structures and indexes, with formal measures of compressibility. Jeremy experiments as part of his teaching with new techniques (e.g. organisation of two yearly android programming contests, creation of project oriented courses using http://www.alice.org, usage of concept questions in more traditional courses, course where university students teach high-school students) and designs tools to help instructors to share and evaluate collectively teaching material over time (database of solved problem still in use at the University of Waterloo), and between institutions (project in development: https://github.com/jyby/repositorium/). Jeremy was born in 1976. So far he has lived in France, United States, Canada and Chile. He speaks native French, is fluent in English and in Spanish. He plays music (e.g. clarinette, xaphoon, bandoneon, piano,... ), practices sports (e.g. swimming, roller blading and unicycling) and art (etching glasses, copper and wood), and loves cooking and taking pictures.
Since its introduction in 1979 by Prof. Andrew Yao, communication complexity has emerged as a foundational tool for proving lower bounds in many areas of computer science. In this model, two (or more) parties try to cooperatively compute a function while each only has part of the input, and the goal is to minimize the amount of information exchange. Aaronson and Wigderson in 2009 revived the study of the communication complexity analogue of the polynomial time hierarchy, by showing its connection to a new barrier in complexity theory: algebrization. The lower levels of this hierarchy, including its randomized version called the Arthur-Merlin communication model, and oracle query classes such as the communication analogue of P^NP, are being actively studied in recent years. In particular, Impagliazzo and Williams in their CCC 2010 paper proved new separation results concerning the communication class P^NP. In this talk, I'll talk about how our research into space-bounded communication complexity accidentally provided new tools for the research of the communication polynomial hierarchy. I'll show that a new communication model we introduced called the one-way limited-memory model has close connections to the communication polynomial hierarchy. These connections provide nice combinatorial characterizations for old complexity classes, more elegant proofs for known results, and new tools for the study of well-known complexity problems. In particular, I'll talk about how to put the recent separation results presented in Impagliazzo and Williams' CCC 2010 paper in a much more elegant framework. I'll also discuss the connection and separation between this model and the bounded-depth NC circuit classes and bounded-width branching programs. This talk is based on joint work with Joshua Brody, Shiteng Cheng, Periklis A. Papakonstantinou, Dominik Scheder and Xiaoming Sun. Bio: Hao Song is a Ph.D. candidate at the Institute for Interdisciplinary Information Sciences, Tsinghua University. He got his bachelor's degree from the department of computer science and technology of Tsinghua University in July of 2008, and joined the IIIS in the same year. His research interest lies in computational complexity, and in particular communication complexity.
How do you find the solutions of a system of linear equations? We learn the answer in high school: Gaussian elimination. What if we have polynomial equations? Classically, resultants were used to give a partial answer to this question. A complete answer has been given via Groebner bases, which were introduced in 1965 by Buchberger. The first part of this talk will contain a quick review of Groebner bases, their complexity and their applications to solving polynomial equations and graph coloring. The second part of the talk will be on resultants and their use in solving polynomial equations. We will report on our ongoing work on the relations and the differences between resultants and Groebner bases.
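As a small aside (my own illustration, not part of the talk), a computer algebra system can compute such a basis directly; the sketch below, assuming SymPy is available, uses a lexicographic Groebner basis to triangularize a tiny polynomial system, much as Gaussian elimination triangularizes a linear one:

# Minimal sketch: solving a small polynomial system via a lexicographic
# Groebner basis with SymPy (an illustration, not the speaker's code).
from sympy import symbols, groebner, solve

x, y = symbols('x y')
system = [x**2 + y**2 - 1, x - y]      # a circle intersected with a line

# A lex Groebner basis eliminates variables, yielding a triangular system
# containing a univariate polynomial that can be solved by back-substitution.
G = groebner(system, x, y, order='lex')
print(G)
print(solve(system, [x, y]))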
We look at the problem of computing the maxima of a set of points in three dimensions and we present an efficient deterministic output-sensitive algorithm in the Word RAM model. We observe that improving our algorithm is most likely difficult since it requires breaking a number of important barriers, even if randomization is allowed. Throughout the talk, we will also discuss many interesting relevant open problems.
With the rapidly increasing deployment of Internet-connected, location-aware mobile devices, very large and increasing amounts of geo-tagged and timestamped user-generated content, such as microblog posts, are being generated. We present indexing, update, and query processing techniques that are capable of providing the top-k terms seen in posts in a user-specified spatio-temporal range. The techniques enable interactive response times in the millisecond range in a realistic setting where the arrival rate of posts exceeds today's average tweet arrival rate by a factor of 4 to 10. The techniques adaptively maintain the most frequent items at various spatial and temporal granularities. They extend existing frequent item counting techniques to maintain exact counts rather than approximations. An extensive empirical study with a large collection of geo-tagged tweets shows that the proposed techniques enable online aggregation and query processing at scale in realistic settings. Joint work with Anders Skovsgaard and Christian S. Jensen.
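For readers unfamiliar with frequent item counting, the sketch below (my own toy example, not the authors' implementation) shows the classical Misra-Gries summary that such techniques extend; it approximates term frequencies in one pass using a bounded number of counters:

# Minimal sketch of a Misra-Gries frequent-items summary (approximate counts),
# given only to illustrate the kind of counter the talk builds on.
def misra_gries(stream, k):
    counters = {}
    for item in stream:
        if item in counters:
            counters[item] += 1
        elif len(counters) < k - 1:
            counters[item] = 1
        else:
            # Decrement all counters; drop those that reach zero.
            for key in list(counters):
                counters[key] -= 1
                if counters[key] == 0:
                    del counters[key]
    # Every item occurring more than n/k times is guaranteed to survive.
    return counters

print(misra_gries(["aarhus", "tweet", "aarhus", "geo", "aarhus", "tweet"], k=3))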
In edge orientations, the goal is usually to orient (direct) the edges of an undirected n-vertex graph G such that all out-degrees are bounded. When the graph G is fully dynamic, i.e., admits edge insertions and deletions, we wish to maintain such an orientation while keeping tabs on the update time. Low out-degree orientations turned out to be a surprisingly useful tool, with several algorithmic applications involving static or dynamic graphs. Brodal and Fagerberg [1999] initiated the study of the edge orientation problem in terms of the graph's arboricity, which is very natural in this context. They provided a solution with constant out-degree and amortized logarithmic update time for all graphs with constant arboricity, which include all planar and excluded-minor graphs. However, it remained an open question (first proposed by Brodal and Fagerberg, later by others) to obtain similar bounds with worst-case update time. We resolve this 15-year-old question in the affirmative, by providing a simple algorithm with worst-case bounds that nearly match the previous amortized bounds. Our algorithm is based on a new approach built around a combinatorial invariant, and achieves a logarithmic out-degree with logarithmic worst-case update times. This result has applications in various dynamic graph problems such as maintaining a maximal matching, where we obtain O(log n) worst-case update time compared to the O((log n)/(loglog n)) amortized update time of Neiman and Solomon [2013]. This is joint work with Robert Krauthgamer, Ely Porat, and Shay Solomon.
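To fix ideas (a naive sketch of the flavour of such schemes, not the authors' worst-case algorithm), one can insert each edge directed out of the endpoint with smaller out-degree and flip the outgoing edges of any vertex whose out-degree exceeds a chosen cap; the cap below is a hypothetical constant, whereas the actual algorithms tie it to the arboricity:

# Naive sketch of maintaining a low out-degree orientation under edge insertions.
from collections import defaultdict

class Orientation:
    def __init__(self, cap):
        self.out = defaultdict(set)   # out[u] = vertices that u points to
        self.cap = cap                # hypothetical out-degree threshold

    def insert(self, u, v):
        # Orient the new edge out of the endpoint with smaller out-degree.
        a, b = (u, v) if len(self.out[u]) <= len(self.out[v]) else (v, u)
        self.out[a].add(b)
        self._fix(a)

    def _fix(self, u):
        # If u's out-degree exceeds the cap, flip all of u's outgoing edges
        # (the amortized rule of Brodal-Fagerberg style schemes) and recurse.
        if len(self.out[u]) > self.cap:
            targets = list(self.out[u])
            self.out[u].clear()
            for w in targets:
                self.out[w].add(u)
                self._fix(w)

o = Orientation(cap=2)
for e in [(1, 2), (1, 3), (1, 4), (2, 3), (2, 4), (3, 4)]:   # K4, arboricity 2
    o.insert(*e)
print({v: sorted(ws) for v, ws in o.out.items() if ws})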
The Lovasz Local Lemma (LLL), introduced by Erdos and Lovasz in 1975, is a powerful tool of the probabilistic method that allows one to prove that, with non-zero probability, none of a set of n "bad" events happens, provided that the events have limited dependence. However, the LLL itself does not suggest how to find a point avoiding all bad events. Since the work of Beck (1991) there has been a sustained effort to find a constructive proof (i.e. an algorithm) for the LLL or weaker versions of it. In a major breakthrough, Moser and Tardos (2010) showed that a point avoiding all bad events can be found efficiently. They also proposed a distributed/parallel version of their algorithm that requires O(log^2 n) rounds of communication in a distributed network. In this talk I will present new distributed algorithms for the LLL that improve on both the efficiency and simplicity of the Moser-Tardos algorithm. Let p bound the probability of any bad event and d be the maximum degree in the dependency graph of the bad events. 1. When e p d^2 < 1 we give a truly simple LLL algorithm running in O(log_{1/(e p d^2)} n) rounds. 2. Under the tighter condition e p (d+1) < 1, we give a slightly slower algorithm running in O(log^2 d * log_{1/(e p (d+1))} n) rounds. 3. Under the stronger condition e^2 p (d+1)^4 2^d < 1, we give a sublogarithmic algorithm running in O(log n / loglog n) rounds. Although the conditions of the LLL are locally verifiable, we prove that any distributed LLL algorithm requires Ω(log* n) rounds. In many graph coloring problems the existence of a valid coloring is established by one or more applications of the LLL. Using our LLL algorithms, frugal coloring, defective coloring, coloring girth-4 (triangle-free) and girth-5 graphs, edge coloring, and list coloring can be obtained in a logarithmic number of distributed rounds. Joint work with Kai-Min Chung and Seth Pettie.
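For concreteness (my own toy example, not from the talk), the sequential Moser-Tardos algorithm for k-SAT is just the following resampling loop; the LLL condition guarantees fast convergence when every clause shares variables with few other clauses:

# Toy sequential Moser-Tardos resampling for k-SAT.
import random

def moser_tardos_ksat(clauses, num_vars, seed=0):
    rng = random.Random(seed)
    assignment = [rng.random() < 0.5 for _ in range(num_vars)]

    def violated(clause):
        # A clause is a list of literals: +i means x_i, -i means not x_i (1-indexed).
        return not any(assignment[abs(l) - 1] == (l > 0) for l in clause)

    while True:
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assignment
        # Resample every variable of one violated clause uniformly at random.
        for l in bad[0]:
            assignment[abs(l) - 1] = rng.random() < 0.5

clauses = [[1, 2, 3], [-1, 2, 4], [-2, -3, 4], [1, -4, 5]]
print(moser_tardos_ksat(clauses, num_vars=5))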
Sampling is an effective approach to dealing with massive distributed data sets, and various notions of sampling have been studied in the literature. In this paper, we study ε-approximations, namely a "uniform" sample that approximates the fraction of the underlying population within any range over a certain range space. We consider the problem of computing ε-approximations for a data set that is held jointly by k players, and give general communication upper and lower bounds that hold for any range space whose discrepancy is known.
In the two-dimensional range minimum query problem an input matrix A of dimension m x n, m ≤ n, has to be preprocessed into a data structure such that given a query rectangle within the matrix, the position of a minimum element within the query range can be reported. We consider the space complexity of the encoding variant of the problem where queries have access to the constructed data structure but cannot access the input matrix A, i.e. all information must be encoded in the data structure. Previously it was known how to solve the problem with space O(mn * min{m, log n}) bits (and with constant query time), but the best lower bound was Ω(mn log m) bits, i.e. leaving a gap between the upper and lower bounds for non-quadratic matrices. We show that this space lower bound is optimal by presenting an encoding scheme using O(mn log m) bits. We do not consider query time. Joint work with Andrej Brodnik and Pooya Davoodi. Results presented at ESA 2013.
Suppose that every vertex of an n-vertex graph G = (V,E) of maximum degree Δ hosts a processor. These processors wake up simultaneously and communicate in discrete rounds. In each round each vertex is allowed to send an arbitrarily large message to all its neighbors. We are interested in coloring this graph with relatively few colors within a small number of rounds. At the end of the algorithm each vertex needs to know its own color. This problem is closely related to problems of computing a maximal independent set and a maximal matching in the distributed setting. In this talk I plan to overview the rich history of the research on these problems, to mention the most important known results, and to describe some of the basic techniques which are used to achieve them. Then I will turn to my own recent work on both deterministic (joint with Barenboim, JACM'11) and randomized (joint with Barenboim, Pettie and Schneider, FOCS'12) variants of these problems. Specifically, I will show an outline of the deterministic Δ^{1+ε}-coloring algorithm that requires a polylogarithmic (in n) number of rounds. (Here ε > 0 is an arbitrarily small constant.) This result solves an open problem posed by Linial in 1987. If time permits I will also show an outline of a randomized (Δ+1)-coloring algorithm that requires O(log Δ) + exp(O(√(loglog n))) rounds. There is a huge number of open problems in this area. During the talk I will present and discuss some of them.
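As background (a folklore sketch, not the speaker's algorithms), one round of the basic randomized (Δ+1)-coloring procedure that such results refine looks as follows when simulated sequentially: every uncolored vertex proposes a random color not used by its colored neighbors, and keeps it if no neighbor proposed the same color.

# Simulation of the classic randomized (Delta+1)-coloring rounds, shown only
# to fix ideas about the synchronous distributed model.
import random

def randomized_coloring(adj, seed=0):
    rng = random.Random(seed)
    n = len(adj)
    delta = max(len(nbrs) for nbrs in adj)
    palette = range(delta + 1)
    color = [None] * n
    while any(c is None for c in color):
        # Every uncolored vertex proposes a color not used by colored neighbors.
        proposal = {}
        for v in range(n):
            if color[v] is None:
                taken = {color[u] for u in adj[v]}
                proposal[v] = rng.choice([c for c in palette if c not in taken])
        # A proposal is kept only if no neighbor proposed the same color.
        for v, c in proposal.items():
            if all(proposal.get(u) != c for u in adj[v]):
                color[v] = c
    return color

adj = [[1, 4], [0, 2], [1, 3], [2, 4], [3, 0]]   # a 5-cycle
print(randomized_coloring(adj))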
A Davenport-Schinzel sequence of order s is a sequence over an n-letter alphabet that avoids subsequences of the form a..b..a..b.. of length s+2. They were originally used to bound the complexity of the lower envelope of degree-s polynomials or any class of functions that cross at most s times. They have numerous applications in computational geometry. Let DS_s(n) be the maximum length of such a sequence. In this talk I'll present a new method for obtaining sharp bounds on DS_s(n) for every order s. This work reveals the unexpected fact that sequences with odd order s behave essentially like sequences with even order s-1. The results refute both common sense and a conjecture of Alon, Kaplan, Nivasch, Sharir, and Smorodinsky [2008]. Prior to this work, tight upper and lower bounds were only known for s up to 3 and all even s > 3. A manuscript is available at arXiv:1204.1086
Range searching is a classical problem that has been studied extensively both in computational geometry and databases. Over the last three decades, several sophisticated geometric techniques and data structures (e.g. kd-trees, range trees, ε-nets, cuttings, simplicial partitions) have been proposed for range searching that have had a profound impact on the field, much beyond range searching. Despite this tremendous progress on range searching, there is a big gap between theoretical results and the approaches used in practice for this problem, partly because the theoretically best known results are not easy to implement and partly because the goals have shifted. This talk begins by reviewing some of the theoretical results, and then focuses on new variants of range searching that have emerged in the last few years and discusses some of the approaches that are being used in practice.
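As a point of reference for one of the classical structures mentioned (a textbook sketch of my own, not material from the talk), a 2D kd-tree answers an orthogonal range query by recursing only into subtrees that the query rectangle can still intersect:

# Minimal 2D kd-tree with orthogonal range reporting; not tuned for practice.
def build(points, depth=0):
    if not points:
        return None
    axis = depth % 2
    points = sorted(points, key=lambda p: p[axis])
    mid = len(points) // 2
    return {"point": points[mid], "axis": axis,
            "left": build(points[:mid], depth + 1),
            "right": build(points[mid + 1:], depth + 1)}

def range_query(node, lo, hi, out):
    if node is None:
        return
    p, axis = node["point"], node["axis"]
    if all(lo[i] <= p[i] <= hi[i] for i in range(2)):
        out.append(p)
    if lo[axis] <= p[axis]:            # the query may extend into the left subtree
        range_query(node["left"], lo, hi, out)
    if p[axis] <= hi[axis]:            # ...and/or into the right subtree
        range_query(node["right"], lo, hi, out)

tree = build([(2, 3), (5, 4), (9, 6), (4, 7), (8, 1), (7, 2)])
hits = []
range_query(tree, lo=(3, 1), hi=(9, 5), out=hits)
print(hits)                            # the points inside [3,9] x [1,5]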
The physical implementation of a cryptographic protocol invariably leaks information. Leakage makes the implementation vulnerable to side-channel attacks, which can be used to reveal internal states of the implementation. A growing body of research is concerned with the development of leakage-resilient cryptography, which focuses on keeping computations secret even in the presence of general forms of leakage. In this work, we show how to use techniques from quantum fault tolerance, whose goal is to protect a quantum computer from noise, to construct classical leakage-resilient circuits.
Rendezvous is a fundamental process in Cognitive Radio Networks, through which a user establishes a link to communicate with a neighbor on a common channel. Most previous solutions use either a central controller or a Common Control Channel (CCC) to simplify the problem, which are inflexible and vulnerable to faults and attacks. Some blind rendezvous algorithms have been proposed that rely on no centralization. Channel Hopping (CH) is a representative technique used in blind rendezvous, with which each user hops among the available channels according to a pre-defined sequence. However, no existing algorithms can work efficiently for both symmetric (both parties have the same set of channels) and asymmetric users. In this paper, we introduce a new notion called Disjoint Relaxed Difference Set (DRDS) and present a linear time constant approximation algorithm for its construction. Then based on the DRDS, we propose a distributed asynchronous algorithm that can achieve and guarantee fast rendezvous for both symmetric and asymmetric users. We also derive a lower bound for any algorithm using the CH technique. This lower bound shows that our proposed DRDS based distributed rendezvous algorithm is nearly optimal. Extensive simulation results corroborate our theoretical analysis.
Probabilistically Checkable Proofs (PCPs) allow a verifier to check the validity of a proof by querying very few random positions in the proof string. Zero Knowledge (ZK) Proofs allow a prover to convince a verifier of a statement without revealing any information beyond the validity of the statement. We study for what class of languages it is possible to achieve both, namely to build ZK-PCPs, where additionally we require that the proof be generated efficiently. Such ZK-PCPs could potentially be useful for building UC-secure protocols in the tamper-proof token model. We show that all languages with efficient statistical ZK-PCPs (i.e. where the ZK property holds against unbounded verifiers) must be in SZK (the class of languages with interactive statistical ZK proofs). We do this by reducing any ZK-PCP to an instance of the Conditional Entropy Approximation problem, which is known to be in SZK. This implies in particular that such ZK-PCPs are unlikely to exist for NP-complete problems such as SAT. This is joint work with Mohammad Mahmoody.
In the planar range skyline reporting problem, we store a set P of n 2D points in a structure such that, given a query rectangle Q = [a_1, a_2] x [b_1, b_2], the maxima (a.k.a. skyline) of P ∩ Q can be reported efficiently. The query is 3-sided if an edge of Q is grounded, giving rise to two variants: top-open (b_2 = ∞) and left-open (a_1 = -∞) queries. All our results are in external memory under the O(n/B) space budget, for both the static and dynamic settings: * For static P, we give structures that answer top-open queries in O(log_B n + k/B), O(loglog_B U + k/B), and O(1 + k/B) I/Os when the universe is R^2, a U x U grid, and a rank space grid [O(n)]^2, respectively (where k is the number of reported points). The query complexity is optimal in all cases. * We show that the left-open case is harder, such that any linear-size structure must incur Ω((n/B)^ε + k/B) I/Os for a query. We show that this case is as difficult as the general 4-sided queries, for which we give a static structure with the optimal query cost O((n/B)^ε + k/B). * We give a dynamic structure that supports top-open queries in O(log_{2B^ε}(n/B) + k/B^{1-ε}) I/Os, and updates in O(log_{2B^ε}(n/B)) I/Os, for any ε satisfying 0 ≤ ε ≤ 1. This leads to a dynamic structure for 4-sided queries with optimal query cost O((n/B)^ε + k/B), and amortized update cost O(log(n/B)). As a contribution of independent interest, we propose an I/O-efficient version of the fundamental structure priority queue with attrition (PQA). Our PQA supports FindMin, DeleteMin, and InsertAndAttrite all in O(1) worst-case I/Os, and O(1/B) amortized I/Os per operation. We also add the new CatenateAndAttrite operation that catenates two PQAs in O(1) worst-case and O(1/B) amortized I/Os. This operation is a non-trivial extension to the classic PQA of Sundar, even in internal memory. Joint work with: Yufei Tao, Konstantinos Tsakalidis, Kostas Tsichlas, Jeonghun Yoon
Constructive cryptography is a paradigm for defining security which has been introduced by Maurer and Renner in the context of the abstract cryptography framework. The underlying idea of constructive cryptography is that one understands a protocol as a construction of a desired resource (e.g. a secure channel) from one or more assumed resources (e.g. an insecure channel and a shared key). Advantages of this approach are the clear semantics of the resulting security statements and the strong compositionality which allows for a modular protocol design and proof. The talk starts with an introduction to the constructive cryptography paradigm, shows security definitions that one obtains in the context of secure communication, and sketches the relation to previous definitions that are based on other paradigms such as game-based security or UC.
My talk is based on some of the results I obtained during my PhD, mainly focused on algebraic complexity. I will first describe an NP-completeness result for the resolution of polynomial systems over finite fields. I will then show how one can represent any arithmetic formula (or weakly-skew circuit) by a determinant of a symmetric matrix, extending a classical result of Valiant. The construction is valid for any field of characteristic different from 2, and I will explain why the case of characteristic 2 is very different. The largest part of the talk will then be focused on "sparse-like" polynomials. I will first present Koiran's real τ-conjecture related to the question "VP = VNP?" as well as the first bounds obtained on this conjecture. Then I will explain how some techniques used in the study of this conjecture can be used to compute partial factorizations of very sparse polynomials, known as lacunary polynomials. This problem leads to interesting complexity-theoretic phenomena.
Private function evaluation (PFE) is a variant of secure multiparty computation wherein the function (circuit) being computed is also considered to be a private input of one of the participants. In this talk, I will first review the existing solutions for PFE in a variety of settings and identify their shortcomings. Then, I will describe our new framework for designing PFE and show how it allows us to obtain new constructions with better asymptotic and practical efficiency. I will end by pointing out a few directions for future research.
Integer sorting in the RAM model is a fundamental problem, and a long-standing open problem is whether we can sort in linear time when the word size is ω(log n). In this paper we give an algorithm for sorting integers in expected linear time when the word size is Ω(log^2 n loglog n). Our algorithm uses a new packed sorting algorithm with expected running time O(n/b (log n + log^2 b)), where n is the number of integers and b is the number of integers packed in a word. Joint work with: Djamal Belazzougui and Gerth Stølting Brodal
One of the main tools to construct secure two-party computation protocols is Yao garbled circuits. Using the cut-and-choose technique, one can get reasonably efficient Yao-based protocols with security against malicious adversaries. At TCC 2009, Nielsen and Orlandi suggested applying cut-and-choose at the gate level, whereas previously cut-and-choose was applied to the circuit as a whole. This appealing idea allows for a speed-up of practical significance (on the order of the logarithm of the size of the circuit) and has become known as the "LEGO" construction. Unfortunately the construction by Nielsen and Orlandi is based on a specific number-theoretic assumption and requires public-key operations per gate of the circuit. The main technical contribution of this work is a new XOR-homomorphic commitment scheme based on oblivious transfer, which we use to cope with the problem of connecting the gates in the LEGO construction. Our new protocol has the following advantages: - It maintains the efficiency of the LEGO cut-and-choose. - After a number of seed oblivious transfers linear in the security parameter, the construction uses only primitives from Minicrypt (i.e., private-key cryptography) per gate in the circuit (hence the name MiniLEGO). - In contrast to the original LEGO, MiniLEGO is compatible with all known optimizations for Yao garbled gates (row reduction, free-XORs, point-and-permute). Paper available at http://eprint.iacr.org/2013/155
A direct sum theorem for two parties and a function f states that the communication cost of solving k copies of f simultaneously with error probability 1/3 is at least k * R_{1/3}(f), where R_{1/3}(f) is the communication required to solve a single copy of f with error probability 1/3. We improve this for a natural family of functions f, showing that the 1-way communication required to solve k copies of f simultaneously with error probability 1/3 is Ω(k * R_{1/k}(f)). Since R_{1/k}(f) may be as large as R_{1/3}(f) * log k, we asymptotically beat the standard direct sum bound for such functions, showing that the trivial upper bound of solving each of the k copies of f with probability 1 - O(1/k) and taking a union bound is optimal. Our results imply optimal communication/space lower bounds for several sketching problems in a setting where the algorithm should be correct on a sequence of k queries. Joint work with Marco Molinaro and David Woodruff (SODA'13)
Randomized algorithms are often enjoyed for their simplicity, but the hash functions used to yield the desired theoretical guarantees are often neither simple nor practical. Here we show that the simplest possible tabulation hashing provides unexpectedly strong guarantees. The scheme itself dates back to Zobrist [1970]. Keys are viewed as consisting of c characters. We initialize c tables T_1, ..., T_c mapping characters to random hash codes. A key x = (x_1, ..., x_c) is hashed to T_1[x_1] ⊕ ··· ⊕ T_c[x_c], where ⊕ denotes xor. While this scheme is not even 4-independent, we show that it provides many of the guarantees that are normally obtained via higher independence, e.g., min-wise hashing for estimating set intersection, and cuckoo hashing. We shall also discuss a twist to simple tabulation that leads to reliable statistics with Chernoff-type concentration and extremely robust performance for linear probing with small buffers. Joint work with: Mihai Patrascu.
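A minimal sketch of the scheme itself (my own illustration, assuming 32-bit keys split into c = 4 byte characters):

# Simple tabulation hashing: c random tables of hash codes, combined by xor.
import random

C, HASH_BITS = 4, 32                      # 4 eight-bit characters, 32-bit codes
rng = random.Random(2013)
TABLES = [[rng.getrandbits(HASH_BITS) for _ in range(256)] for _ in range(C)]

def tab_hash(x):
    # View the 32-bit key x as characters x_1..x_4 and xor the table entries.
    h = 0
    for i in range(C):
        h ^= TABLES[i][(x >> (8 * i)) & 0xFF]
    return h

print(hex(tab_hash(0xDEADBEEF)), hex(tab_hash(0xDEADBEF0)))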
We consider a variant of range searching in which points are coloured. A query, whose input consists of an axis-aligned rectangle as well as a subset of colours, must report all points that lie within the query range and match one of the query colours. Such queries are useful, for example, in searching databases with both numerical and categorical attributes. While previous work has solved the 1-D case in the pointer machine model, we make steps towards solving the 2-D case. We give an optimal data structure for 2-sided (dominance) queries and a data structure for 3-sided queries that is a mere log* factor from optimal. Joint work with: Peyman Afshani
We consider the problem of constructing a sparse suffix tree (or suffix array) for b suffixes of a given text T of size n, using only O(b) words of space during construction, breaking the naive bound of O(nb) time for this problem, which can be traced back to the origins of string indexing in 1968. The first results were obtained only in 1996, and only for the case where the suffixes are evenly spaced in T; here there is no constraint on the locations of the suffixes. We show that the sparse suffix tree can be constructed in O(n log^2 b) time. To achieve this we develop a technique, which may be of independent interest, that allows us to efficiently answer b longest common prefix queries on suffixes of T, using only O(b) space. We expect that this technique will prove useful in many other applications in which space usage is a concern.
In this talk I give time lower bounds for two fundamental streaming problems: computing the convolution/cross-correlation (i.e. the inner product) between a fixed vector of length n and the last n numbers of a stream, and computing the Hamming distance between a fixed string of length n and the last n symbols of a stream. For each arriving value/symbol in the stream, we have to output the answer before the next value arrives. The main focus will be on our newer results for the Hamming distance problem. For both these problems, we obtain in the cell-probe model with w bits per cell a lower bound of Ω(d * log n / (w + loglog n)) time per output, where d is the number of bits needed to represent a value/symbol in the stream. The lower bounds hold under randomisation and amortisation, and apply to any value of w. It is joint work with Raphael Clifford and Benjamin Sach. Details can be found at http://arxiv.org/abs/1101.0768 and http://arxiv.org/abs/1207.1885
Valiant's Probably Approximately Correct (PAC) learning model brought a computational complexity perspective to the study of machine learning. The PAC framework deals with *supervised* learning problems, where data points are labeled by some target function and the goal is to infer a high-accuracy hypothesis that is close to the target function. The PAC learning model and its variants provide a useful and productive setting for studying how the complexity of learning different types of Boolean functions scales with the complexity of the functions being learned. A large portion of contemporary machine learning deals with *unsupervised* learning problems. In problems of this sort data points are unlabeled, so there is no "target function" to be learned; instead the goal is to infer some structure from a sample of unlabeled data points. This talk will focus on the problem of learning an unknown probability distribution given access to independent samples drawn from it. Analogous to the PAC model for learning Boolean functions, the broad goal here is to understand how the complexity of learning different types of distributions scales with the complexity of the distributions being learned. We survey recent results in this area and identify questions for future work.
Rational cryptography has recently emerged as a very promising field of research by combining notions and techniques from cryptography and game theory, because it offers an alternative to the rather inflexible traditional cryptographic model. In contrast to the classical view of cryptography where protocol participants are considered either honest or arbitrarily malicious, rational cryptography models participants as rational players that try to maximize their benefit and thus deviate from the protocol only if they gain an advantage by doing so. The main research goals for rational cryptography are the design of more efficient protocols when players adhere to a rational model, the design and implementation of automated proofs for rational security notions and the study of the intrinsic connections between game theoretic and cryptographic notions. In this talk, we address all these issues. First we present the mathematical model and the design for a new rational file sharing protocol which we call RatFish. Next, we develop a general method for automated verification for rational cryptographic protocols and we show how to apply our technique in order to automatically derive the rational security property for RatFish. Finally, we study the intrinsic connections between game theory and cryptography by defining a new game theoretic notion, which we call game universal implementation, and by showing its equivalence with the notion of weak stand-alone security.
The centerpoint theorem is one of the fundamental theorems in discrete geometry, and it states the following: given any set P of n points in R^d, there exists a point q such that any closed half-space containing q contains at least n/(d+1) points of P. The point q need not be a point of P. This theorem has found several applications in combinatorial geometry, statistics, geometric algorithms and related areas. An even more fundamental theorem, encompassing the centerpoint theorem, was first proven by Tverberg in 1966: given any set P of n points in d dimensions, one can partition P into (roughly) n/(d+1) sets, each of d+1 points, such that the simplices spanned by the sets have a non-empty intersection (Tverberg's theorem). Tverberg and Vrecica gave an ingenious algorithmic proof of Tverberg's theorem. In the talk we will see its proof, and we will see how two more fundamental results of geometry, the centerpoint theorem and Helly's theorem, follow using the same technique. We will then see a generalization of the centerpoint theorem to a new theorem about "centerdisks", obtained by extending this idea so that the centerpoint theorem becomes a special case of the following more general question: does there exist a disk D such that any half-space containing D contains a larger fraction of the points of P than n/(d+1)? In the talk we will see upper and lower bounds for this question.
A veritable Holy Grail of modern-day cryptography is the apparently simple notion of "Secure, Anonymous Communication". In the literature, the notion of security ranges between flavours of message confidentiality in encryption (semantic, CCA1, CCA2 security, etc), unforgeability for signatures, impersonation resistance for authentication, key secrecy for key agreement, etc. Security essentially guarantees attack detection and prevention, or sometimes pin-pointing and catching an attacker. By contrast, anonymity ranges from unlinkability or identity-hiding in authentication to sender/receiver anonymity in encryption, forward security in key agreement, anonymous credentials, and others. This property must guarantee that the identity associated with an entity is fully hidden. Sometimes privacy and secrecy work at odds with each other; other times, they are merely orthogonal. One challenging aspect is understanding how anonymity behaves in a network of users exchanging (possibly encrypted) communication. My presentation considers two scenarios. I first focus on a recent result (PETS 2013), analysing minimal assumptions and inherent restrictions in receiver-anonymity-preserving PKE. I talk about receiver-anonymous channels, i.e. channels across which senders send messages to receivers without an adversary learning the intended recipient. By using the constructive cryptography approach of Maurer and Renner we interpret cryptographic PKE schemes (with special properties) as ways to construct an ideal resource (the elusive confidential, anonymous channel) from a real resource (e.g. a broadcast channel). We show that a very natural ideal resource can be constructed by using a PKE scheme that is IND-CCA, key-private, and weakly robust. In particular, strong robustness is NOT necessary (though it gives a tighter security bound). Yet, a desirable, stronger variant of the same resource, which also prevents tracking receiver behaviour by using trial-deliveries, is unachievable by any PKE scheme, no matter how strong. Thus, it seems that PKE schemes preserve anonymity, but cannot create it. While this seems a negative result, the use of constructive cryptography helps us understand more about the exact security and privacy guarantees PKE schemes offer, and in particular we can maximize efficiency in the construction of the ideal channel. In the second part of the talk (much shorter than the first) I will consider a resource that seems to create anonymity, rather than just preserve it. Namely, I consider TOR networks and traffic analysis attacks within such networks. In this context, anonymity is much less related to confidentiality, but rather can be considered a truly orthogonal property. The idea is to analyze the behaviour of the network, and model traffic analysis generically as leakage, in particular a leakage oracle which might, or might not, depend on the secret key. A much trickier issue is how to use such an oracle -- which is in itself an abstract, and rather rigid object -- in such a way as to capture realistically the threat of traffic analysis. This part of the presentation is open-ended, since it is not yet completed research, but rather an interesting step in understanding the complicated relationship between anonymity and encryption.
An increasing number of innovative applications rely on the processing of huge volumes of spatial data. In this regard, we are interested in contexts where deluges of such spatial data are continuously produced, such as mobile phone infrastructures and aircraft anti-collision systems, but also contexts related to virtual and simulated worlds, like those pertaining to either Massively Multiplayer Online Games (MMOG) or behavioral simulations (also known as agent-based simulations), where agents may base their behavior decisions on other agents within a given range. In many of the application scenarios deriving from the contexts mentioned above, we have multitudes of moving objects that routinely and frequently report their new positions and, at the same time, may search for other objects in their interaction range. In our work we explored the usage of modern GPUs in order to speed up the management of these scenarios through an approach we called "iterated spatial joins". Nowadays GPUs can be programmed for solving general scientific problems; however, despite their computational power, the architecture of these devices poses some important limitations which must be taken into account to achieve significant advantages through their use. We will therefore show what the main problems which had to be tackled were and the solutions adopted in the work, the experimental results obtained so far, and the open problems which still have to be solved for achieving optimal performance when processing skewed spatial object distributions. Joint work with: Salvatore Orlando and Claudio Silvestri
Performance of secure computation is still often an obstacle to its practical adoption. There are different protocols for secure computation that compete for the best performance. It is current practice to evaluate the performance of SC protocols using complexity approximations of computation and communication. Due to the disparate complexity measures and constants, this approach fails at reliably predicting the performance. We contribute a performance model for forecasting run-times of secure two-party computations. We show that our performance model can be used to make an optimal selection of an algorithm and cryptographic protocol combination, as well as to determine the implicit security tradeoffs. Next, we propose an automatic protocol selection which selects a protocol for each operation in the input algorithm, resulting in a mix with the best performance so far. Based on the performance model, we propose an optimization algorithm and an efficient heuristic for this more fine-grained selection problem. We show that our mixed protocols achieve the best performance on a set of use cases and demonstrate that the selection problem is so complicated that a programmer is unlikely to manually make the optimal selection. Our proposed algorithms can nevertheless be integrated into a compiler in order to yield the best (or near-optimal) performance. Rounding off this talk, we present some practical feasibility results of implemented prototypes for use cases that build upon secure computation protocols: linear programming for supply chain optimization, browser-based garbled circuits for economic lot-size planning, and cloud-based benchmarking of key performance indicators. We also shortly present an intermediate language compiler for mixed secure protocols that has been used to implement some of them.
A stream cipher, a symmetric-key cryptographic primitive, is essentially a pseudo-random number generator (PRNG) that generates a long stream of data (called the keystream) based on a short seed (the secret key). RC4 is the most popular and widely deployed byte-oriented software stream cipher. It is expected that any internal state byte or any output byte of RC4 would be (nearly) uniformly distributed. However, in practice, there exist many short-term and a few long-term biases in the RC4 state and keystream, not only towards specific integers in Z_256, but also towards certain key/state/output-byte combinations. In this talk, we discuss the most important ones among these biases, some of which lead to interesting practical attacks. Short bio of speaker: Dr. Goutam Paul did his Ph.D. in cryptology from Indian Statistical Institute in 2009. Currently, he is an Assistant Professor in the Department of Computer Science & Engineering of Jadavpur University, Kolkata, India, and is visiting RWTH Aachen, Germany, as an Alexander von Humboldt Fellow. Apart from the design and cryptanalysis of stream ciphers, his other research interests include steganography, privacy preservation for location-based services and quantum cryptography.
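For reference (a textbook description, not the speaker's code), RC4's key scheduling and keystream generation fit in a few lines; the biases discussed in the talk concern the statistics of the bytes this generator outputs, such as the well-known bias of the second keystream byte towards 0:

# Textbook RC4: key-scheduling algorithm (KSA) and pseudo-random generation (PRGA).
def rc4_keystream(key, n):
    # KSA: initialize and scramble the 256-byte state S using the key.
    S = list(range(256))
    j = 0
    for i in range(256):
        j = (j + S[i] + key[i % len(key)]) % 256
        S[i], S[j] = S[j], S[i]
    # PRGA: output n keystream bytes.
    out, i, j = [], 0, 0
    for _ in range(n):
        i = (i + 1) % 256
        j = (j + S[i]) % 256
        S[i], S[j] = S[j], S[i]
        out.append(S[(S[i] + S[j]) % 256])
    return out

print(rc4_keystream(b"Key", 8))   # first 8 keystream bytes for the key "Key"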
Whether the Decisional Diffie-Hellman (DDH) assumption can be used to construct Identity-Based Encryption (IBE) was an important open question regarding DDH. We show that this is impossible for constructions that make generic (black-box) use of a DDH-hard group. In 2007 Boneh, Vahlis, Rackoff, Waters, and I started proving this impossibility. The argument was completed recently. One of our early failures led to a successful attack (in 2008) separating IBE from PKE. The DDH setting is far more complicated and subtle to rule out. It bears no "syntactic" technical similarity to the PKE case. In this talk I'll assume no prior knowledge of cryptography or of black-box separations. I will introduce black-box (oracle) separations: what they mean, together with a toy example. Then, I'll discuss some of the features of an attack algorithm that breaks the security of every IBE system, given that this system is constructed by making only generic use of a randomly labeled group. In particular, this concludes the DDH impossibility. This is joint work with Charles Rackoff and Yevgeniy Vahlis. (The 2008 PKE separation result is joint work with Dan Boneh, Charles Rackoff, Yevgeniy Vahlis, and Brent Waters)
Strictly proper scoring rules were introduced in the middle of the last century in the context of weather forecasting, where they serve as a tool for ex post evaluation of the quality of a forecast. Recently, interesting applications of scoring rules have appeared both in algorithmic game theory and in cryptographic protocol theory. The former is a line of work about prediction markets, i.e., markets that serve as an aggregation tool for information held by the buyers. The latter was introduced in the work of Azar and Micali (STOC 2012) that presented so-called rational proofs. This new model of interactive proofs mimics the classical setting, with the difference that the prover is additionally seeking to maximize a reward paid out to him by the verifier at the end of their interaction. Such a model turns out to be more powerful than the classical one; Azar and Micali showed that every language in #P admits a one-round rational proof, whereas this is not known to be true in the classical setting. In this talk we will overview the basics of proper scoring rules and their applications in the theory of interactive proofs. Additionally, we will suggest some open problems related to the use of proper scoring rules for verification of computation and the construction of succinct cryptographic protocols.
We initiate a general study of schemes resilient to both tampering and leakage attacks. Tampering attacks are powerful cryptanalytic attacks where an adversary can change the secret state and observe the effect of such changes at the output. Our contributions are outlined below: 1. We propose a general construction showing that any cryptographic primitive where the secret key can be chosen as a uniformly random string can be made secure against bounded tampering and leakage. This holds in a restricted model where the tampering functions must be chosen from a set of bounded size after the public parameters have been sampled. Our result covers pseudorandom functions, and many encryption and signature schemes. 2. We show that standard ID and signature schemes constructed from a large class of Σ-protocols (including the Okamoto scheme, for instance) are secure even if the adversary can arbitrarily tamper with the prover's state a bounded number of times and/or obtain some bounded amount of leakage. Interestingly, for the Okamoto scheme we can also allow independent tampering with the public parameters. 3. We show a bounded tamper and leakage resilient CCA-secure public key cryptosystem based on the DDH assumption. We first define a weaker CPA-like security notion that we can instantiate based on DDH, and then we give a general compiler that yields CCA security with tamper and leakage resilience. This requires a public tamper-proof common reference string. 4. Finally, we explain how to boost bounded tampering and leakage resilience (as in 2. and 3. above) to continuous tampering and leakage resilience, in the so-called floppy model where each user has a personal floppy (containing leak- and tamper-free information) which can be used to refresh the secret key (note that if the key is not updated, continuous tamper resilience is known to be impossible). For the case of ID schemes, we also show that if the underlying protocol is secure in the bounded retrieval model, then our compiler remains secure, even if the adversary can tamper with the computation performed by the device. In some earlier work, the implementation of the tamper-resilient primitive was assumed to be aware of the possibility of tampering, in that it would switch to a special mode and, e.g., self-destruct if tampering was detected. None of our results require this assumption. Joint work with: Ivan Damgaard, Sebastian Faust and Daniele Venturi.
Sparse signal representations have emerged as powerful tools in signal processing theory and applications, and serve as the basis of the now-popular field of compressive sensing (CS). However, several practical signal ensembles exhibit additional, richer structure beyond mere sparsity. Our particular focus in this talk is on signals and images where, owing to physical constraints, the positions of the nonzero coefficients do not change significantly as a function of spatial (or temporal) location. Such signal and image classes are often encountered in seismic exploration, astronomical sensing, and biological imaging. Our contributions are threefold: (i) We propose a simple, deterministic model based on the Earth Mover Distance that effectively captures the structure of the sparse nonzeros of signals belonging to such classes. (ii) We formulate an approach for approximating any arbitrary signal by a signal belonging to our model. The key idea in our approach is a min-cost max-flow graph optimization problem that can be solved efficiently. (iii) We develop a CS algorithm for efficiently reconstructing signals belonging to our model, and numerically demonstrate its benefits over state-of-the-art CS approaches.
The tree evaluation problem was introduced by Cook et al. as a candidate for separating the complexity classes LogSpace and PTIME. We prove tight super-polynomial size lower bounds for non-deterministic, thrifty, bitwise-independent branching programs solving the tree evaluation problem. This model is powerful enough to achieve all known upper bounds for the problem. Furthermore, the lower bound is proved by showing a connection with fractional black-white pebbling of complete binary trees, using Jukna and Zak's "entropy method". We also show that existing lower bounds for the deterministic case and other related problems can be formulated using this method. Reference: Balagopal Komarath and Jayalal M.N. Sarma, Pebbling, Entropy and Branching Program Size Lower Bounds (accepted at STACS 2013)
We present a deterministic algorithm that given any directed graph on n vertices computes the parity of its number of Hamiltonian cycles in O(1.618^n) time and polynomial space. For bipartite graphs, we give a 1.5^n poly(n) expected time algorithm. Our algorithms are based on a new combinatorial formula for the number of Hamiltonian cycles modulo a positive integer. Joint work with Andreas Björklund (arxiv.org/abs/1301.7250)
While it is impossible to realize oblivious transfer using a quantum protocol without further assumptions, it becomes possible if the adversary has a restricted amount of quantum memory at his disposal. Here I will present a new tool (called "min-entropy sampling") that allows us to prove very strong bounds in this model: namely that there exists a protocol that achieves OT securely if the adversary cannot store more than n - O(log^2 n) qubits, where n is the number of qubits transmitted in the protocol, while previous results could only guarantee security with a quantum memory bound of 2n/3. In other words, this is as strong as one could hope for: unless the adversary stores everything that happens in the protocol, he cannot cheat. If time permits, I will also present some other applications of min-entropy sampling. This is joint work with Omar Fawzi, Stephanie Wehner and Thomas Vidick
Many practitioners have been working on exciting experimental prototypes (or even more mature software systems) exploiting the idea of secure computation to solve real-world needs. This talk begins with a brief summary of some effective implementation techniques, such as pipelining, circuit width reduction, library-based construction, program partitioning, layered execution, etc. Then we go through an efficient scheme (with performance comparable to semi-honest protocols) for malicious adversaries but in the 1-bit leak model. We conclude with some recent discoveries in developing symmetric cut-and-choose protocols that are both provably secure in the standard malicious threat model and many times more efficient (thanks to the balanced workloads and minimal idling waits).
Modeling how water flows on landscapes, and therefore predicting floods and other environmental phenomena, has always been of major importance in hydrology and ecological sciences. Today, flow modeling on landscapes is performed in a computer-based environment using digital representations of real-world terrains. Among the most popular digital terrain representations are the so-called Triangulated Irregular Networks (TINs), which are piecewise linear surfaces that consist of 3D triangles. A natural way to model water flow on a surface is to consider that, at any point on the surface, water follows the direction of steepest descent (DSD). However, even this simple flow model has been proven computationally infeasible when applied to TINs. On the other hand, there exist many methods that compute flow paths on TINs approximately, that is, without strictly following the DSD on the surface. Such methods do not suffer from the computational problems of the exact flow model. However, flow structures that are produced using these methods may provide a poor approximation of the original structures that are implied by the exact flow model. In this talk we present various approximate algorithms for computing drainage basins on TINs. We evaluate these algorithms both in terms of efficiency and approximation quality in comparison to the exact flow model. We consider two different categories of methods; the first category involves methods that use exact arithmetic, and can introduce new vertices on the TIN with coordinates of possibly large bit-size. The methods of the second category are restricted to performing comparisons between floating point numbers. Among other results, we show that certain methods provide a good quality of approximation for terrains of certain morphology; some methods do well on relatively mountainous terrains whereas others do better on nearly flat terrains. Target group: This talk will be presented at a technical level that is accessible to anyone with an interest in flow analysis.
Benaloh and Leichter, and later Hirt and Maurer showed that secret sharing schemes and MPC protocols can be designed from certain monotone Boolean formulae. The idea is that players in the protocol correspond to input bits to the formula, and the protocol will be secure as long as each set of players that may be corrupted corresponds to an input string that the formula rejects. This leads to a general approach to solving protocol design problems. For instance, if the formula F takes n inputs and is built from majority gates with 3 inputs, one first designs a protocol for 3 players. Then, combining this with F, one gets "automatically" an n-player protocol for the same task. In this work, we get new results for broadcast protocols and black-box group MPC using this approach and we also get new and much simpler proofs for known feasibility results based on new explicit constructions of monotone formulae.
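A small sketch of the underlying secret-sharing idea (my own illustration in the AND/OR formulation of Benaloh-Leichter, rather than the 3-input majority gates emphasized in the talk): an OR gate replicates the secret to its children, an AND gate splits it additively, and majority-of-three can be written as an OR of ANDs.

# Sketch of Benaloh-Leichter style secret sharing from a monotone formula.
import random

P = 2**61 - 1
rng = random.Random(0)

def share(secret, formula, shares):
    kind = formula[0]
    if kind == "leaf":                      # formula = ("leaf", player_id)
        shares.setdefault(formula[1], []).append(secret)
    elif kind == "or":                      # any single child suffices
        for child in formula[1:]:
            share(secret, child, shares)
    elif kind == "and":                     # all children are needed
        parts = [rng.randrange(P) for _ in formula[1:-1]]
        parts.append((secret - sum(parts)) % P)
        for part, child in zip(parts, formula[1:]):
            share(part, child, shares)

# Majority of three players, written as (1 and 2) or (2 and 3) or (1 and 3).
maj3 = ("or", ("and", ("leaf", 1), ("leaf", 2)),
              ("and", ("leaf", 2), ("leaf", 3)),
              ("and", ("leaf", 1), ("leaf", 3)))
shares = {}
share(42, maj3, shares)
print(shares)   # any two players can add up the shares of their common AND gate to recover 42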
Cryptography can change dramatically in a quantum world where people can process quantum information. On the one hand, there are quantum attacks that break classically secure constructions, and in general security against classical adversaries may become invalid in the presence of quantum adversaries. On the other hand, we can also design quantum protocols which sometimes can achieve tasks that are otherwise impossible classically. In this talk, I will present two main results of my research work. First, I will show a general feasibility result that there exist classical protocols for 2-party Secure Function Evaluation against quantum adversaries, under proper assumptions [HSS'11]. Second, I will show a quantum protocol that realizes Oblivious Transfer with statistical security from a trusted setup called 2-bit Cut-and-Choose [FKSZZ'13]. This reduction is provably impossible using classical protocols only. [HSS'11] Classical Cryptographic Protocols in a Quantum World. Sean Hallgren, Adam Smith and Fang Song. In Crypto 2011
The fact that a cryptographic device itself leaks physical information leads to side-channel cryptanalysis. Physical leakage can be captured in order to reconstruct secret keys of well-known cryptographic algorithms. Inspired by cold boot attacks (methods for extracting noisy keys from DRAMs), Heninger and Shacham (Crypto 2009) initiated the study of recovering an RSA private key from a noisy version of that key. They gave an algorithm that recovers RSA private keys when a fraction of at least 0.27 of the private key bits are known with certainty. A follow-up paper by Henecka, May and Meurer (Crypto 2010) works when all the key bits are subject to error, with bit flips from 0 to 1 and from 1 to 0 being equally likely. Their algorithm successfully recovers the private key in polynomial time if the probability of a bit flip is less than 0.237. However, we explain why neither of these previous works solves the motivating cold boot problem. But is there any algorithm that solves the true cold boot problem? Is there any general algorithm that works for other types of side-channel attack? Are the previous bounds the best possible for the cold boot setting? Is there any ultimate limit on the noise level at which our new algorithm can reconstruct RSA private keys under different settings? We answer these questions by recasting the problem of noisy RSA key recovery as a problem in coding theory. Thus, we derive a key recovery algorithm that works for any (memoryless) binary channel. Moreover, we derive better upper bounds on the possible error rates for the previous algorithms, and we find the best upper bounds for our new algorithms.
Despite my background as a theoretician, for the past year I have worked in the "real world" as a software engineer on MongoDB, the leading NoSQL database. NoSQL - short for "Not Only SQL" - describes a collection of non-relational databases that have been developed to solve many of the practical problems encountered in modern software development. In this talk, I'll describe what the NoSQL movement is all about, why it's all the rage in industry right now, and how MongoDB fits in. This will definitely *not* be a theory talk, but based on my observations, I hope to mention some areas where I think theorists have the potential to contribute.
We address the problem of creating a dictionary with the finger search property in the strict implicit model, where no information is stored between operations, except the array of elements. We show that for any implicit dictionary supporting finger searches in q(t) = Ω(log t) time, the time to move the finger to another element is Ω(q^{-1}(log n)), where t is the rank distance between the query element and the finger. We present an optimal implicit static structure matching this lower bound. We furthermore present a near-optimal implicit dynamic structure supporting search, change-finger, insert, and delete in times O(q(t)), O(q^{-1}(log n) log n), O(log n), and O(log n), respectively, for any q(t) = Ω(log t). Finally we show that the search operation must take Ω(log n) time for the special case where the finger is always changed to the element returned by the last query. Joint work with: Gerth Stølting Brodal & Jakob Truelsen.
In the planar range skyline reporting problem, the goal is to store a set P of n 2D points in a structure such that, given a rectangle Q = [a_1, a_2] × [b_1, b_2], the maxima (a.k.a. skyline) of P ∩ Q can be reported efficiently. Q is 3-sided if one of its edges is grounded, giving rise to two variants: top-open (b_2 = ∞) and left-open (a_1 = -∞) queries. This paper presents comprehensive results in external memory under the O(n/B) space budget (B is the block size), covering both the static and dynamic settings: * For static P, we give structures that support a top-open query in O(log_B n + k/B), O(loglog_B U + k/B), and O(1 + k/B) I/Os when the universe is R^2, a U x U grid, and the rank space [O(n)]^2, respectively (where k is the number of points reported). The query complexity is optimal in all cases. * We show that the left-open case is harder, such that any linear-size structure must incur Ω((n/B)^ε + k/B) I/Os to answer a query. In fact, this case turns out to be just as difficult as the general 4-sided queries, for which we provide a static structure with the optimal query cost O((n/B)^ε + k/B). Interestingly, these lower and upper bounds coincide with those of orthogonal range reporting in R^2, i.e., the skyline requirement does not alter the problem difficulty at all. * For dynamic P, we present a fully dynamic structure that supports a top-open query in O(log_{2B^ε}(n/B) + k/B^{1-ε}) I/Os, and an insertion/deletion in O(log_{2B^ε}(n/B)) I/Os, where ε can be any parameter satisfying 0 ≤ ε ≤ 1. This result also leads to a dynamic structure for 4-sided queries with the optimal O((n/B)^ε + k/B) query time, and O(log (n/B)) amortized update time. As a contribution of independent interest, we propose an I/O-efficient version of the fundamental structure priority queue with attrition (PQA). Our PQA supports FindMin, DeleteMin, and InsertAndAttrite all in O(1) worst-case I/Os, and O(1/B) amortized I/Os per operation. Furthermore, it allows the additional CatenateAndAttrite operation that merges two PQAs in O(1) worst-case and O(1/B) amortized I/Os. The last operation is a non-trivial extension to the classic PQA of Sundar, even in internal memory. The new PQA is a crucial component of our dynamic structure for range skyline reporting. Joint work: Yufei Tao, Konstantinos Tsakalidis, Kostas Tsichlas, and Jeonghun Yoon
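To give the flavour of this component (a toy internal-memory version of my own, ignoring the I/O-efficiency and CatenateAndAttrite issues that make the paper's version non-trivial), a PQA can be kept as a monotone deque: InsertAndAttrite removes every stored element larger than the new one, so the contents stay sorted and FindMin is the front element.

# Toy internal-memory priority queue with attrition (PQA) semantics.
from collections import deque

class PQA:
    def __init__(self):
        self.q = deque()
    def find_min(self):
        return self.q[0]
    def delete_min(self):
        return self.q.popleft()
    def insert_and_attrite(self, x):
        while self.q and self.q[-1] > x:   # attrition of larger elements
            self.q.pop()
        self.q.append(x)

pqa = PQA()
for v in [5, 3, 8, 7, 2, 9]:
    pqa.insert_and_attrite(v)
print(list(pqa.q))        # [2, 9]
print(pqa.find_min())     # 2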
Over the past decade, the kinetic-data-structures framework has become the standard in computational geometry for dealing with moving objects. A fundamental assumption underlying the framework is that the motions of the objects are known in advance. This assumption severely limits the applicability of KDSs. In this talk I will discuss some of the recent work on KDSs in the so-called black-box model, which is a hybrid of the KDS model and the traditional time-slicing approach. In this more practical model we receive the position of each object at regular time steps and we have an upper bound on the maximum displacement of any object in one time step. Joint work with Marcel Roeloffzen and Bettina Speckmann.
In many GIS applications it is important to study the characteristics of a raster data set at multiple resolutions. Often this is done by generating several coarser resolution rasters from a fine resolution raster. In this talk we describe efficient algorithms for different variants of this problem. Given a raster G of √N × √N cells we first consider the problem of computing for every 2 ≤ m ≤ √N a raster Gm of √N/m × √N/m cells such that each cell of Gm stores the average of the values of m × m cells of G. We describe an algorithm that solves this problem in Θ(N) time when the handled data fit in the main memory of the computer. We also describe three algorithms that solve this problem in external memory, that is when the input raster is larger than the main memory. The first external algorithm is very easy to implement and requires O(sort(N)) data block transfers from/to the external memory. The second algorithm requires only O(scan(N)) transfers, however this algorithm is cache-aware and assumes that the main memory of the computer can store a considerable number of data blocks. The third algorithm is a fresh, yet-to-be-published upgrade from the previous results; this algorithm requires O(scan(N)) transfers, is cache-oblivious, and makes no assumption on the size of the main memory. We also study a variant of the problem where instead of the full input raster we handle only a connected subregion of arbitrary shape. For this variant we describe an algorithm that runs in Θ(U log N) time in internal memory, where U is the size of the output. The results that are presented in this talk derive from joint works with Lars Arge, Gerth Brodal, Herman Haverkort and Jakob Truelsen.
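As a small illustration of the internal-memory flavour of the problem (my own sketch of the standard summed-area-table idea, not necessarily the algorithm from the talk), prefix sums let one average any m × m block of a raster in constant time:

# Sketch: averaging m x m blocks of a raster via a summed-area table (2D prefix sums).
def summed_area(grid):
    h, w = len(grid), len(grid[0])
    S = [[0] * (w + 1) for _ in range(h + 1)]
    for i in range(h):
        for j in range(w):
            S[i + 1][j + 1] = grid[i][j] + S[i][j + 1] + S[i + 1][j] - S[i][j]
    return S

def coarsen(grid, m):
    # Each output cell is the average of an m x m block of the input raster.
    S = summed_area(grid)
    h, w = len(grid) // m, len(grid[0]) // m
    return [[(S[(i + 1) * m][(j + 1) * m] - S[i * m][(j + 1) * m]
              - S[(i + 1) * m][j * m] + S[i * m][j * m]) / (m * m)
             for j in range(w)] for i in range(h)]

raster = [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12], [13, 14, 15, 16]]
print(coarsen(raster, 2))   # [[3.5, 5.5], [11.5, 13.5]]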
We study the problem of 2-dimensional orthogonal range counting with absolute additive error. Given a set P of n points drawn from an n × n grid and an error parameter ε, the goal is to build a data structure such that for any orthogonal range R, the data structure can return the number of points in P ∩ R with additive error εn. A well-known solution for this problem is the ε-approximation. Informally speaking, an ε-approximation of P is a subset A of P that allows us to estimate the number of points in P ∩ R by counting the number of points in A ∩ R. It is known that an ε-approximation of size O((1/ε) log^{2.5}(1/ε)) exists for any P with respect to orthogonal ranges, and the best lower bound is Ω((1/ε) log(1/ε)). The ε-approximation is a rather restricted data structure, as we are not allowed to store any information other than the coordinates of a subset of points in P. In this talk, we explore what can be achieved without any restriction on the data structure. We first describe a data structure that uses O((1/ε) log(1/ε) loglog(1/ε) log n) bits and answers queries with error εn. We then prove a lower bound that any data structure that answers queries with error O(log n) must use Ω(n log n) bits. This lower bound has two consequences: 1) answering queries with error O(log n) is as hard as answering the queries exactly; and 2) our upper bound cannot be improved in general by more than an O(loglog(1/ε)) factor. Joint work with Ke Yi.
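A small illustration of how an ε-approximation is used, with a uniform random sample standing in for a genuine ε-approximation (constructing one of the quoted size is the hard part; all names below are illustrative):

    import random

    def estimate_count(A, n_P, rect):
        # Count the points of A inside rect and rescale by |P| / |A|.
        x1, x2, y1, y2 = rect
        inside = sum(1 for (x, y) in A if x1 <= x <= x2 and y1 <= y <= y2)
        return inside * n_P / len(A)

    P = [(random.random(), random.random()) for _ in range(10000)]
    A = random.sample(P, 500)                               # stand-in for an ε-approximation
    print(estimate_count(A, len(P), (0.0, 0.5, 0.0, 0.5)))  # true count is about 2500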
Cache-oblivious streaming B-trees (indexes) were introduced by Bender et al. in 2007 to support fast updates. I will describe a partially-persistent cache-oblivious streaming index. It uses linear space, and supports updates in O((log N)/B) I/Os, and queries in O(log N + K/B) I/Os in an average-case sense.
Leakage resilient cryptography attempts to incorporate side-channel leakage into the black-box security model and designs cryptographic schemes that are provably secure within it. Informally, a scheme is leakage-resilient if it remains secure even if an adversary learns a bounded amount of arbitrary information about the scheme's internal state. Unfortunately, most leakage resilient schemes are unnecessarily complicated in order to achieve strong provable security guarantees. As advocated by Yu et al. [CCS'10], this is mostly an artefact of the security proof, and in practice much simpler constructions may already suffice to protect against realistic side-channel attacks. In this paper, we show that indeed for simpler constructions leakage-resilience can be obtained when we aim for relaxed security notions where the leakage-functions and/or the inputs to the primitive are chosen non-adaptively. For example, we show that a three round Feistel network instantiated with a leakage resilient PRF yields a leakage resilient PRP if the inputs are chosen non-adaptively. (This complements the result of Dodis and Pietrzak [CRYPTO'10], who show that if adaptive queries are allowed, a superlogarithmic number of rounds is necessary.) We also show that a minor variation of the classical GGM construction gives a leakage resilient PRF if both the leakage-function and the inputs are chosen non-adaptively. This is joint work with Krzysztof Pietrzak and Joachim Schipper.
The EQUALITY problem is one of the oldest problems in communication complexity. We find several new insights into this problem by looking at the following three resources: (a) the expected communication required to show that x != y, (b) the amount of interaction available to Alice and Bob, and (c) the error guarantee of the protocol. We obtain tight tradeoffs between all three parameters, giving both a combinatorial and an information-complexity (icost) proof. In hindsight, the combinatorial proof is surprisingly simple; the icost proof is surprisingly involved. As an application of our icost result, we also get rounds-vs-communication lower bounds for the k-DISJOINTNESS problem. Lower bounds for k-DISJ have recently been very useful in achieving lower bounds for property testing. In the first part of the talk (Friday 9 November, 15:15-16:15) I'll go over the problem and related work, state our new results, and (time permitting) present the combinatorial lower bound. I'll devote the bulk of the second talk (next Friday 16 November, 14:00-16:00) to giving the information complexity lower bound and its application to k-DISJ. These talks assume the audience has a minimal understanding of communication complexity. The second talk also assumes a minimal understanding of information theory. If there is sufficient interest there will be a crash course in information theory next week. Joint work with Amit Chakrabarti and Ranganath Kondapally, both from Dartmouth College.
This talk will survey the area of data structure lower bounds. It will consist of two separate parts. In the first 45 minutes, I'll give a tutorial on lower bounds for dynamic data structures. This tutorial was presented at the FOCS'12 Workshop on Data Structures, in memory of Mihai Patrascu. The talk will focus in particular on Mihai's contributions to the area. In the last 25 minutes, I'll give a (shorter) overview of the techniques used for proving lower bounds for static data structures. This part of the talk will essentially be the FOCS'12 conference talk for my paper "Higher Cell Probe Lower Bounds for Evaluating Polynomials". Target group: This talk will be presented at a technical level that is accessible to anyone with an interest in data structures and/or lower bounds.
RSA and DSA can fail catastrophically when used with malfunctioning random number generators, but the extent to which these problems arise in practice has never been comprehensively studied at Internet scale. We perform the largest ever network survey of TLS and SSH servers and present evidence that vulnerable keys are surprisingly widespread. We find that 0.75% of TLS certificates share keys due to insufficient entropy during key generation, and we suspect that another 1.70% come from the same faulty implementations and may be susceptible to compromise. Even more alarmingly, we are able to obtain RSA private keys for 0.50% of TLS hosts, because their public keys shared nontrivial common factors due to entropy problems, and DSA private keys for 1.03% of SSH hosts, because of insufficient signature randomness. We cluster and investigate the vulnerable hosts, finding that the vast majority appear to be headless or embedded devices. In experiments with three software components commonly used by these devices, we are able to reproduce the vulnerabilities and identify specific software behaviors that induce them, including a boot-time entropy hole in the Linux random number generator. Finally, we suggest defenses and draw lessons for developers, users, and the security community. Joint work with Zakir Durumeric, Eric Wustrow, and J. Alex Halderman.
In this talk I will discuss the following budget error-correcting problem: Alice has a point set X and Bob has a point set Y in the d-dimensional grid. Alice wants to send a short message to Bob so that Bob can use this information to adjust his point set Y towards X to minimize the Earth-Mover Distance between the two point sets. A more intuitive way to understand this problem is: Alice tries to help Bob to recall Eve's face by sending him a short message. Of course Bob will fail to recall it if he does not know Eve, but if he knows something about Eve, the message could help a lot. Naturally, there is a trade-off between the message size and the quality of such an adjustment. Now given a quality constraint, we want to minimize the message size. That is, when X and Y are close, Bob wants to recover X from Y using a very small message from Alice. This problem is motivated by applications including image exchange/synchronization and video compression. In this paper we give the first upper and lower bounds for this problem. These bounds are almost tight in the case when d = O(1).
In this work we present an identity-based encryption scheme that achieves full security also when an adversary can obtain leakage from the secret keys of the users. Moreover, using special updating algorithms, we do not put any a priori length bound on the leakage allowed to the adversary; we just require a periodical refreshment of the keys. The scheme uses techniques from the recent work of Lewko et al. (TCC 2011), but, exploiting the methodology of dual pairing vector spaces, we show how to implement our leakage resilient IBE scheme under the DLIN assumption. This is joint work with Ryo Nishimaki and Tatsuaki Okamoto.
We present new results on one of the most basic problems in geometric data structures, 2-D orthogonal range counting. All of our data structures operate under the w-bit word RAM model. It is well known that there are linear-space data structures for 2-D orthogonal range counting with worst-case optimal query time O(log_w n). We give an O(n loglog n)-space adaptive data structure that improves the query time to O(loglog n + log_w k), where k is the output count. When k = O(1), our bounds match the state of the art for the 2-D orthogonal range emptiness problem [Chan, Larsen, and Patrascu, SoCG 2011]. We give an O(n loglog n)-space data structure for approximate 2-D orthogonal range counting that can compute a (1 + δ)-factor approximation to the count in O(loglog n) time for any fixed constant δ > 0. Again, our bounds match the state of the art for the 2-D orthogonal range emptiness problem. Joint work with Timothy M. Chan, University of Waterloo.
We introduce the notion of covert security with public verifiability, building on the covert security model introduced by Aumann and Lindell (TCC 2007). Protocols that satisfy covert security guarantee that the honest parties involved in the protocol will notice any cheating attempt with some constant probability ε. The idea behind the model is that the fear of being caught cheating will be enough of a deterrent to prevent any cheating attempt. However, in the basic covert security model, the honest parties are not able to persuade any third party (say, a judge) that cheating occurred. We propose (and formally define) an extension of the model where, when an honest party detects cheating, it also receives a certificate that can be published and used to persuade other parties, without revealing any information about the honest party's input. In addition, malicious parties cannot create fake certificates in an attempt to frame innocent parties. Finally, we construct a secure two-party computation protocol for any functionality f that satisfies our definition, and our protocol is almost as efficient as the one of Aumann and Lindell. We believe that the fear of public humiliation or even legal consequences vastly exceeds the deterrent given by standard covert security. Therefore, even a small value of the deterrent factor ε will suffice in discouraging any cheating attempt. Joint work with Gilad Asharov.
In this talk, I am going to present two communication protocols for computing edit distance. In the first part, I will show a one-way protocol for the following problem: Alice is given a string x and Bob a string y, and Alice sends a message to Bob so that he either learns x or reports that the edit distance between x and y is greater than k. Following that, I will show a simultaneous protocol for edit distance over permutations. Here Alice and Bob both send a message to a third party (the referee) who does not have access to the input strings. Given the messages, the referee decides whether the edit distance between x and y is at most k or not. For both these problems I will show protocols in which the parties run in near-linear time and transmit at most O(k polylog n) bits. These results are obtained by mapping strings to the Hamming cube. For this, I have used the Locally Consistent Parsing method in combination with Karp-Rabin fingerprints. In addition to yielding non-trivial bounds for the edit distance problem, these results suggest a new conceptual framework and raise a new type of question regarding the embeddability of edit distance into the Hamming cube, which might be of independent interest.
We revisit the context of leakage-tolerant interactive protocols as defined by Bitansky, Canetti and Halevi (TCC 2012). Our contributions can be summarized as follows: 1) For the purpose of secure message transmission, any encryption protocol with message space M and secret key space SK tolerating poly-logarithmic leakage on the secret state of the receiver must satisfy |SK| ≥ (1-ε)|M|, for every 0 < ε ≤ 1, and if |SK| = |M|, then the scheme must use a fresh key pair to encrypt each message. 2) More generally, we prove that an encryption protocol for secure message transmission tolerates leakage of ≈ poly(log κ) bits on the receiver side at the end of the protocol execution, if and only if the protocol has passive security against an adaptive corruption of the receiver at the end of the protocol execution. Indeed, there is nothing special about there being two parties or the communication setting: any n-party protocol tolerates leakage of ≈ poly(log κ) bits from party i at the end of the protocol execution, if and only if the protocol has passive security against an adaptive corruption of party i at the end of the protocol execution. This shows that as soon as a little leakage is tolerated, one needs full adaptive security. 3) Our result in (2) can be generalized to arbitrary corruptions in a leakage-tolerant n-party protocol. In case more than one party can be corrupted, we get that leakage tolerance is equivalent to a weaker form of adaptivity, which we call semi-adaptivity. Roughly, a protocol has semi-adaptive security if there exists a simulator which can simulate the internal state of corrupted parties, i.e., it can output some internal state consistent with what the party has sent and received. However, such a state is not required to be indistinguishable from a real state, only to be one that would have led to the simulated communication. The results above complement the ones in Bitansky et al., who already showed that semi-honest adaptive security is sufficient for leakage tolerance. Our techniques rely on a novel way to exploit succinct interactive arguments of knowledge for NP, and can be instantiated based on the assumption that collision-resistant function ensembles exist. Based on joint work with Jesper Buus Nielsen and Angela Zottarel.
We are witnessing a proliferation of Internet-worked, geo-positioned mobile devices such as smartphones and personal navigation devices. Likewise, location-related services that target the users of such devices are proliferating. Consequently, server-side infrastructures are needed that are capable of supporting the location-related query and update workloads generated by very large populations of such moving objects. This paper presents a main-memory indexing technique that aims to support such workloads. The technique, called PGrid, uses a grid structure that is capable of exploiting the parallelism offered by modern processors. Unlike earlier proposals that maintain separate structures for updates and queries, PGrid allows both long-running queries and rapid updates to operate on a single data structure and thus offers up-to-date query results. Because PGrid does not rely on creating snapshots, it avoids the stop-the-world problem that occurs when workload processing is interrupted to perform such snapshotting. Its concurrency control mechanism relies instead on hardware-assisted atomic updates as well as object-level copying, and it treats updates as non-divisible operations rather than as combinations of deletions and insertions; thus, the query semantics guarantee that no objects are missed in query results. Empirical studies demonstrate that PGrid scales near-linearly with the number of hardware threads on four modern multi-core processors. Since both updates and queries are processed on the same current data-store state, PGrid outperforms snapshot-based techniques in terms of both query freshness and CPU cycle-wise efficiency.
The Learning Parity with Noise (LPN) problem has thus far been more fruitfully investigated as a basis for symmetric cryptography, but the simplicity of the problem compared to other learning problems commonly invoked for similar purposes makes the possibility of an (efficient) LPN-based PKE scheme very attractive. We have examined some variants of Alekhnovich's original proposal of a public key cryptosystem based on LPN, which bear close resemblance to Regev's cryptosystem (based on the similar Learning With Errors problem). Practically, we find that the LPN-based systems can be very efficient, due to highly parallelisable bit operations; and the parameter sizes required for security against known attacks are small enough that the LPN scheme compares favourably with widely used public key cryptosystems such as RSA.
I'll talk about which normal form equilibria of a finite two-player game can be simulated by two parties sending back and forth messages before they make a simultaneous move, a.k.a. cheap talk games. We will limit the running time of the agents such that they can use cryptography if it exists. It has previously been shown that any correlated equilibrium can be implemented by using a cheap talk protocol which is a computational NE (no party can gain non-negligibly more by deviating); this previous result uses oblivious transfer as part of the construction. It has previously also been shown that any so-called convex hull Nash equilibrium, CHNE, can be implemented by a cheap talk protocol which is an empty-threat free computational NE; this result uses only a one-way function. We extend and complement these results by showing that: 1) If all correlated equilibria can be implemented by a cheap talk protocol which is a computational NE, then oblivious transfer exists. 2) If all CHNE can be implemented by a cheap talk protocol which is a computational NE, then one-way functions exist. 3) We show that if oblivious transfer exists then a class of correlated equilibria much larger than the class of CHNE can be implemented by a cheap talk protocol which is an empty-threat free, computational NE. For lack of a better name we for now call this class Reasonable Correlated Equilibria, RCE. 4) We show that if a given utility profile can be obtained by a cheap talk protocol which is an empty-threat free, computational NE, then there exists an RCE with the same utility profile, showing that 3) is optimal.
We present a simple, efficient and practical algorithm for constructing and subsequently simplifying contour maps from massive high-resolution DEMs, under some practically realistic assumptions on the DEM and contours. Joint work with Lars Arge, Lasse Deleuran, Thomas Mølhave, Morten Revsbæk, and Jakob Truelsen
The traditional approach to formalizing ideal-model based definitions of security for multi-party protocols models adversaries (both real and ideal) as centralized entities that control all parties that deviate from the protocol. While this centralized-adversary modeling suffices for capturing basic security properties such as secrecy of local inputs and correctness of outputs against coordinated attacks, it turns out to be inadequate for capturing security properties that involve restricting the sharing of information between separate adversarial entities. Indeed, to capture collusion-freeness and game-theoretic solution concepts, Alwen et al. [Crypto, 2012] propose a new ideal-model based definitional framework that involves a de-centralized adversary. We propose an alternative framework to that of Alwen et al. We then observe that our framework allows capturing not only collusion-freeness and game-theoretic solution concepts, but also several other properties that involve the restriction of information flow among adversarial entities. These include some natural flavors of anonymity, deniability, timing separation, and information-confinement. We also demonstrate the inability of existing formalisms to capture these properties. We then prove strong composition properties for the proposed framework, and use these properties to demonstrate the security, within the new framework, of two very different protocols for securely evaluating any function of the parties' inputs.
We study the worst case error of kernel density estimates via subset approximation. A kernel density estimate of a distribution is the convolution of that distribution with a fixed kernel (e.g. Gaussian kernel). Given a subset (i.e. a point set) of the input distribution, we can compare the kernel density estimates of the input distribution with that of the subset and bound the worst case error. If the maximum error is ε, then this subset can be thought of as an ε-sample (a.k.a. an ε-approximation) of the range space defined with the input distribution as the ground set and the fixed kernel representing the family of ranges. Interestingly, in this case the ranges are not binary, but have a continuous range (for simplicity we focus on kernels with range [0,1]); these allow for smoother notions of range spaces. It turns out that the use of this smoother family of range spaces has an added benefit of greatly decreasing the size required for ε-samples. For instance, in the plane the size is O((1/ε^{4/3}) log^{2/3}(1/ε)) for disks (based on VC-dimension arguments) but is only O((1/ε) √(log(1/ε))) for Gaussian kernels and for kernels with bounded slope that only affect a bounded domain. These bounds are accomplished by studying the discrepancy of these "kernel" range spaces, and here the improvements in bounds are even more pronounced. In the plane, we show the discrepancy is O(√(log n)) for these kernels, whereas for balls there is a lower bound of Ω(n^{1/4}).
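A minimal sketch of the quantity being bounded, using a random subset as a stand-in for a carefully chosen one and random query locations (the bandwidth sigma, the sample sizes and all names are arbitrary choices of mine):

    import numpy as np

    def kde(points, queries, sigma=0.2):
        # Gaussian kernel density estimate of a point set, evaluated at the
        # query locations (averaged kernel values, unnormalized).
        d = queries[:, None, :] - points[None, :, :]
        return np.exp(-(d ** 2).sum(-1) / (2 * sigma ** 2)).mean(axis=1)

    rng = np.random.default_rng(0)
    P = rng.random((2000, 2))                       # input point set
    S = P[rng.choice(len(P), 100, replace=False)]   # candidate subset
    Q = rng.random((500, 2))                        # query locations
    print(np.abs(kde(P, Q) - kde(S, Q)).max())      # empirical worst-case error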
In traditional mechanism design, agents only care about the utility they derive from the outcome of the mechanism. We look at a richer model where agents also assign non-negative dis-utility to the information about their private types leaked by the outcome of the mechanism. We present a new model for privacy-aware mechanism design, where we only assume an upper bound on the agents' loss due to leakage, as opposed to previous work where a full characterization of the loss was required. In this model, under a mild assumption on the distribution of how agents value their privacy, we show a generic construction of privacy-aware mechanisms and demonstrate its applicability to electronic polling and pricing of a digital good. This is joint work with Kobbi Nissim and Rann Smorodinsky.
We introduce the notion of rate-limited secure function evaluation (RL-SFE). Loosely speaking, in a RL-SFE protocol participants can monitor and limit the number of distinct inputs used by their counterparts in multiple executions, in a private and verifiable manner. RL-SFE helps prevent oracle attacks in SFE, and allows service providers to efficiently enforce a metering mechanism. We consider three variants of RL-SFE providing different levels of privacy, and formalize them using the real/ideal simulation-based paradigm. As a stepping stone, we also formalize the notion of commit-first SFE (cf-SFE) wherein a protocol can be naturally divided into a committing and a function evaluation phase; at the end of the first phase parties are committed to their input and are not able to modify it in the second phase. We show that several existing SFE constructions are either commit-first or can be transformed into one with little overhead: this includes variants of Yao's garbled circuit protocol, and special-purpose protocols for oblivious polynomial evaluation (OPE), private set intersection, and oblivious automaton evaluation. We design three compilers for transforming any cf-SFE protocol to each of the three RL-SFE variants. All our compilers are accompanied with simulation-based proofs of security in the standard model. As a case study, we take a closer look at the OPE protocol of Hazay and Lindell [ePrint '09], show that it is commit-first and instantiate efficient rate-limited variants of it. Finally, motivated by the fact that in many natural client-server applications of SFE in practice clients do not keep state, we describe a general approach for transforming our compilers into stateless ones. We believe our results demonstrate that RL-SFE is a useful notion that arises naturally in practice and deserves further study. Based on joint work with Özgür Dagdelen and Payman Mohassel.
We put forward a new program for developing tools towards understanding the limitations of space-bounded computation. This new, pure information-theoretic model restricts the classical 2-player communication complexity model by limiting (in some sense) the space each player can use to compress communication. We set the foundations of this model, and we develop two fundamentally new (model-specific) techniques for proving lower bounds. As a first step we aim to provide a concrete explanation of what is different in the use of space when computing each of the following three basic functions: EQUALITY, INNER-PRODUCT, and st-CONNECTIVITY. This is joint work in progress with Joshua Brody, Shiteng Chen, Hao Song, and Xiaoming Sun.
Three years ago we introduced the concept of Streaming Cryptography, i.e. the possibility of computing cryptographic primitives using a device severely restricted in memory and in its ability to scan its external tape(s). Is it possible to do cryptography in such a setting, or do the limitations of these devices rule out such a possibility? For several natural black-box settings, and pretty much all popular cryptographic assumptions used today, we can show the impossibility of realizing cryptography in a streaming setting. Recently we complemented the above results by devising a non-black-box technique, showing that Streaming Cryptography is possible if one uses two external streams, and in total 3 passes over them. Roughly speaking, instead of computing a function (e.g. multiplying integers), which is impossible in the streaming setting, we encrypt enough information from its computation (when we view this computation appropriately). Thus, we are able to make somewhat counter-intuitive statements, such as: computing the product of two numbers (and any of its permuted variants) is unconditionally impossible in a streaming fashion, yet we are still able to base streaming cryptography on the hardness of factoring a composite integer. To the best of our knowledge, this is the first practical, non-black-box cryptographic construction (note that the vast majority of cryptographic constructions are black-box). This is joint work with Guang Yang. Preliminary impossibility results appear in joint works with Josh Bronson, Ali Juma, and Guang Yang.
Mutual Exclusion is a fundamental problem in distributed computing, and the problem of proving upper and lower bounds on the time complexity of this problem has been extensively studied. I will present lower and (matching) upper bounds on how time complexity trades off with space. Two important implications of the bounds are that constant time complexity is impossible with subpolynomial space and subpolynomial time complexity is impossible with constant space for cache-coherent multiprocessors, regardless of how strong the hardware synchronization operations are. An interesting feature of these results is that the complexity of mutual exclusion is captured precisely by a simple and purely combinatorial game that we design. Most of the talk will focus on the game. Joint work with Nikhil Bansal, Vibhor Bhatt, and Prasad Jayanti.
In our ongoing research activities in the field of spatial cognition for architecture design, project DesignSpace is concerned with developing the cognitively-driven foundational spatial informatics for user-centred architecture design systems. In this talk, I will demonstrate a human-centred model of abstraction, modelling, and computing for function-driven architecture design (analysis). The primitive entities of our design conception ontology are driven by classic notions of 'structure, function, and affordance' in design, and are directly based on the fundamental human perceptual and analytical modalities of visual and locomotive exploration of space. With an emphasis on design semantics, our model for spatial design marks a fundamental shift from contemporary modelling and computational foundations underlying engineering-centred computer aided design systems. I will demonstrate the application of our model within a prototype system for user-centred computational design analysis and simulation. I will also illustrate the manner in which our design modelling and computing framework seamlessly builds on contemporary industry data modelling standards, namely the Building Information Model (BIM) and Industry Foundation Classes (IFC), within the architecture and construction informatics communities. Time permitting, I will also demonstrate the usability of our general methods beyond computational design analysis, within real-time spatial services for indoor or built-up environments (e.g., navigation assistance for ordinary and handicapped people, emergency spatial assistance).
One of the first applications of the probabilistic method in finite combinatorics was Erdos's proof of the existence of graphs on n vertices that do not contain a clique or independent set of size 2 log_2 n. Since then, mathematicians have been trying to find an explicit construction of such "Ramsey" graphs. It is, however, possible that no polynomial time construction of Ramsey graphs exists. In this talk we will show a hardness result. We will prove that given any Ramsey graph G, any resolution proof showing that G is Ramsey has superpolynomial size.
Continuous computations over event streams require a data structure (typically a graph) that describes how the input streams are to be processed. A language for specifying such a computation would ideally hide the construction of the graph from the user. General purpose programming languages such as Java and C++ are considered to be ill suited for such tasks, prompting event processing systems such as StreamBase and Esper to support one of the variations of StreamSQL, i.e., a language that augments SQL by adding sliding window semantics as well as the possibility to write user-defined operators and embed them in continuous queries. These languages are very useful for many applications. Like SQL, they can be learned by non-programmers and used relatively easily. Streamulus takes a different approach to the stream programming problem. It is a C++ domain-specific embedded language (DSEL), meaning that the statements of the language are written in valid C++ syntax. The programmer needs to specify only what happens to a single event of the stream. For example, if x and y are streams, the expression (x+y)/2 is the stream of point-wise averages of x and y. The task of constructing the computation graph is delegated to the C++ compiler. This magic is achieved by use of template metaprogramming techniques via the Boost Proto library, which I will review in the talk. Streamulus is not intended as a replacement for the StreamSQL based systems. Its target users are C++ programmers who build real-time systems and need a lightweight tool that simplifies the task of writing stream processing code. For those users, it offers an expressive language for efficient stream computations. Bio: My academic background is in algorithms and data structures (PhD at MPII, post-docs at Aarhus and Brown Universities). In the last few years I have been working in industry, mostly in financial software development where I often needed to process event streams. Streamulus is a private research project which was inspired by the amount of boiler-plate code I found myself writing in my day job.
Flash memory devices are becoming ubiquitous and indispensable storage devices, partly even replacing traditional hard disks. To store data efficiently on these devices, it is necessary to adapt the existing file systems and indexing structures to work well on flash memory, and a significant amount of research in this field has been devoted to designing such structures. But it is hard to compare these structures, owing to the lack of theoretical models for flash memory and because the existing external memory models fail to capture the full potential of flash-based storage devices. Starting with a brief introduction to the I/O model and B-trees, this talk presents some of the proposed index structures that have been shown to perform well on flash memory. It also presents the recently proposed flash memory model(s), and shows the comparison of the existing structures in this model.
Betweenness centrality is one of the most well-known measures of the importance of nodes in a social-network graph. In this paper we describe the first known external-memory and cache-oblivious algorithms for computing betweenness centrality, including general algorithms for networks with weighted and unweighted edges and a specialized algorithm for networks with small diameters, as is common in social networks exhibiting the "small worlds" phenomenon. Joint work with Lars Arge and Michael T. Goodrich
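For reference, a sketch of the classical internal-memory baseline (Brandes' algorithm) on an unweighted graph; the external-memory and cache-oblivious algorithms of the talk are of course organized differently, and this sketch counts ordered source-target pairs without halving for undirected graphs:

    from collections import deque

    def betweenness(adj):
        # Brandes' algorithm for an unweighted graph given as {vertex: [neighbours]}.
        bc = {v: 0.0 for v in adj}
        for s in adj:
            stack, pred = [], {v: [] for v in adj}
            sigma = {v: 0 for v in adj}; sigma[s] = 1   # number of shortest s-v paths
            dist = {v: -1 for v in adj}; dist[s] = 0
            q = deque([s])
            while q:                                    # BFS from s
                v = q.popleft(); stack.append(v)
                for w in adj[v]:
                    if dist[w] < 0:
                        dist[w] = dist[v] + 1; q.append(w)
                    if dist[w] == dist[v] + 1:
                        sigma[w] += sigma[v]; pred[w].append(v)
            delta = {v: 0.0 for v in adj}
            while stack:                                # accumulate dependencies
                w = stack.pop()
                for v in pred[w]:
                    delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
                if w != s:
                    bc[w] += delta[w]
        return bc

    g = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}  # a path; the middle vertices score highest
    print(betweenness(g))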
Stratosphere is an open-source research platform for Big Data Analytics, co-developed at TU Berlin. It features a cloud-enabled execution engine with flexible fault tolerance schemes, a novel programming model centered around second-order functions that extends MapReduce, and a cost-based query optimizer. In this talk, I will first present the system architecture and the main design decisions. Then, I will present our recent research on optimizing data flows composed of user-defined functions, and efficiently executing iterative and recursive algorithms.
We consider the complexity of computing the determinant over arbitrary finite-dimensional algebras. In particular, we prove the following dichotomy when the algebra A is fixed and not a part of the input: Computing the determinant over A is hard if A/rad A is noncommutative. "Hard" here means #P-hard over fields of characteristic 0 and Mod_pP-hard over fields of characteristic p > 0. On the other hand, if A/rad A is commutative and the underlying field is perfect, then we can compute the determinant over A in polynomial time by a result of Chen et al.
Inspired by cold boot attacks, Heninger and Shacham (Crypto 2009) initiated the study of the problem of how to recover an RSA private key from a noisy version of that key. They gave an algorithm for the case where some bits of the private key are known with certainty. Their ideas were extended by Henecka, May and Meurer (Crypto 2010) to produce an algorithm that works when all the key bits are subject to error. In this paper, we bring a coding-theoretic viewpoint to bear on the problem of noisy RSA key recovery. This viewpoint allows us to cast the previous work as part of a more general framework. In turn, this enables us to explain why the previous algorithms do not solve the motivating cold boot problem, and to design a new algorithm that does solve this problem (and more). In addition, we are able to use concepts and tools from coding theory - channel capacity, list decoding algorithms, and random coding techniques - to derive bounds on the performance of the previous and our new algorithm. A joint work with Kenneth G. Paterson and Dale L. Sibborn.
In this talk we will discuss the distributed streaming model. In this model, we have k sites, each receiving a stream of elements over time. There is a designated coordinator who would like to track, that is, maintain continuously at all times, some function f of all the elements received from the k sites. There is a two-way communication channel between each site and the coordinator, and the goal is to track f with minimum communication. This model is motivated by applications in distributed databases, network monitoring and sensor networks. In this talk we will first introduce the model and review the existing results in this model, and then focus on one particular problem: tracking the number of distinct elements (F0). We will discuss both upper bound and lower bound for the F0 problem, so as to give an example for designing algorithms / analyzing complexities in the distributed streaming model.
We consider the generalized version of the classic board game Mastermind with k colors and n positions. It has been known for a long time that for any constant number k of colors, there exist winning strategies using at most Θ(n / log n) queries. This bound is tight. In this talk, we study the black-peg version of Mastermind with k = O(n) colors. We present a winning strategy that uses only O(n loglog k) guesses for identifying the secret code. This improves the previously best known bounds of Chvátal, Goodrich, and others, which are all of order n log n, both for black-peg and the original Mastermind game. This is joint work with Benjamin Doerr, Reto Spöhel, and Henning Thomas.
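For concreteness, the black-peg answer to a query is simply the number of positions where the guess agrees with the secret; a minimal sketch of that oracle (how to choose the O(n loglog k) guesses is the actual content of the talk, and the parameters below are arbitrary):

    import random

    def black_pegs(secret, guess):
        # Black-peg answer: number of positions where the guess matches the secret.
        return sum(s == g for s, g in zip(secret, guess))

    n, k = 8, 8
    secret = [random.randrange(k) for _ in range(n)]
    print(black_pegs(secret, [0] * n))   # how many positions of the secret hold colour 0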
Database queries can be broadly classified into two categories: reporting queries and aggregation queries. The former retrieves a collection of records from the database that match the query's conditions, while the latter returns an aggregate, such as count, sum, average, or max (min), of a particular attribute of these records. Aggregation queries are especially useful in business intelligence and data analysis applications where users are interested not in the actual records, but some statistics of them. They can also be executed much more efficiently than reporting queries, by embedding properly precomputed aggregates into an index. However, reporting and aggregation queries provide only two extremes for exploring the data. Data analysts often need more insight into the data distribution than what those simple aggregates provide, and yet certainly do not want the sheer volume of data returned by reporting queries. In this paper, we design indexing techniques that allow for extracting a statistical summary of all the records in the query. The summaries we support include frequent items, quantiles, various sketches, and wavelets, all of which are of central importance in massive data analysis. Our indexes require linear space and extract a summary with the optimal or near-optimal query cost. Joint work with: Ke Yi
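As a small illustration of one of the summaries mentioned (frequent items), here is the classical Misra-Gries summary with k counters; it is one standard way to maintain such a summary, and embedding summaries like this into an index, as the paper does, is a separate matter:

    def misra_gries(stream, k):
        # Misra-Gries summary with k counters: any item occurring more than
        # len(stream)/(k+1) times is guaranteed to survive among the keys.
        counters = {}
        for x in stream:
            if x in counters:
                counters[x] += 1
            elif len(counters) < k:
                counters[x] = 1
            else:
                for y in list(counters):   # decrement everything, dropping zeros
                    counters[y] -= 1
                    if counters[y] == 0:
                        del counters[y]
        return counters

    print(misra_gries("abracadabra", 2))   # 'a' (5 of 11 characters) survives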
This talk is an overview of some recent results in streaming algorithms. Streaming algorithms process the input data via passes while using sublinear memory. We consider language membership problems as well as graph problems. The talk starts out with the problem of recognizing well-parenthesized expressions in the streaming model [Magniez et al. STOC 2010]. Recognizing well-parenthesized expressions allows one to check well-formedness of XML documents. This link gave rise to a recent work on the problem of validating XML documents in the streaming setting [Konrad, Magniez ICDT 2012] that we discuss subsequently. In the second part of the talk, we consider graph problems in the semi-streaming model. Here, the input stream is a sequence of the edges of an input graph. We focus on the matching problem, and we briefly touch on the semi-matching problem. The difficulty of these problems depends strongly on the arrival order of the input stream. We place particular emphasis on the arrival order, and we discuss worst-case order as well as random order. Concerning bipartite graphs, we also consider vertex-based orderings of the edges with respect to the vertices of one bipartition.
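For intuition about why the parenthesis problem is interesting in the streaming model: with a single parenthesis type a depth counter suffices in one pass, and it is the multi-type case (as in XML tags) that requires the techniques of the cited papers; a minimal sketch of the easy case:

    def balanced_one_type(stream):
        # One pass, O(log n) bits of state: just the current nesting depth.
        depth = 0
        for c in stream:
            depth += 1 if c == '(' else -1
            if depth < 0:
                return False
        return depth == 0

    print(balanced_one_type("(()())"))   # True
    print(balanced_one_type("())("))     # False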
Computing diameters of huge graphs is a key challenge in complex network analysis. As long as the graphs fit into main memory, diameters can be efficiently approximated (and frequently even exactly determined) using heuristics that apply a limited number of BFS traversals. If the input graphs have to be kept and processed on external storage, however, even a single BFS run may cause an unacceptable amount of time-consuming I/O-operations. In [SWAT08] we proposed the first parameterized diameter approximation algorithm with fewer I/Os than that required for exact BFS traversal. In recent ongoing work we derive hierarchical extensions of this randomized approach and experimentally compare their trade-offs between actually achieved running times and approximation ratios. It turns out that the hierarchical approach is frequently capable of producing surprisingly good diameter approximations in shorter time than BFS. We also provide theoretical and practical insights into worst-case input classes. Joint work with Deepak Ajwani and David Veith.
Rational Cryptography is a branch of modern cryptography analyzing cryptographic protocols in a game-theoretical setting. One of the most important problems in this field is to find a solution concept that would take into account computational power of the players as well as the sequential nature of the protocols. However, there is still no solution concept available that would overcome all the issues and allow a satisfactory analysis. In this talk I will illustrate the need for new solution concepts suitable for Rational Cryptography. Also, I will give an overview of the solution concepts proposed recently such as the Renegotiation-safeness by Pass and shelat and the Threat-free Nash Equilibrium of Gradwohl et al.
In this talk, I will present near-optimal space bounds for Lp-samplers in data streams. Given a stream of additions and subtractions to the coordinates of an underlying vector x, an Lp sampler is a streaming algorithm that reads the updates once and outputs a sample coordinate with probability proportional to the Lp distribution of x. More precisely, the i-th coordinate is picked with probability proportional to the weight |x_i|^p. Here I will present an ε-relative error Lp sampler requiring roughly O(ε^{-p} log^2 n) space for p in (0,2). This result improves the previous bounds by Monemizadeh and Woodruff (SODA 2010) and Andoni, Krauthgamer and Onak (FOCS 2011). As an application of these samplers, an upper bound will be demonstrated for finding duplicates in data streams using L1 samplers. In case the length of the stream is long enough, our L1 sampler leads to an O(log^2 n) space algorithm for this problem, thus improving the previous bound due to Gopalan and Radhakrishnan. If time permits, I will also show an Ω(log^2 n) lower bound for sampling from {0, ±1} vectors. This matches the space of our sampling algorithms for constant ε > 0. These bounds are obtained using reductions from the communication complexity problem of Augmented Indexing. Joint work with Mert Saglam and Gabor Tardos in PODS 2011.
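A sketch of the target distribution of an Lp sampler, computed here with full offline access to x; achieving this, approximately and in small space, over a stream of additions and subtractions is the point of the algorithms above (names are illustrative):

    import random

    def lp_sample(x, p):
        # Draw index i with probability |x_i|^p / sum_j |x_j|^p.
        weights = [abs(v) ** p for v in x]
        return random.choices(range(len(x)), weights=weights, k=1)[0]

    x = [3, -1, 0, 2]
    print(lp_sample(x, 1))   # 0 with prob 3/6, 1 with prob 1/6, 3 with prob 2/6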
In this talk I will introduce a generic construction of universally composable (UC) oblivious transfer (OT) based on lossy encryption. This generic construction is then used to obtain the first UC secure OT protocol based on coding assumptions, namely the McEliece assumptions. Both constructions are presented in the common reference string (CRS) model and are based on a technique introduced by Lindell for fully simulatable OT protocols. The general construction can be realized from a broader range of assumptions than that of the dual mode encryption based protocol presented by Peikert et al. Moreover, it eliminates the need for universally composable commitments of the general compiler introduced by Choi et al., which could be used to obtain similar results if it did not require such a hybrid functionality. Later on I'll discuss possible generalizations of this result and directions for future research.
We present the parallel buffer tree, a parallel external-memory (PEM) data structure for batched search problems. Our data structure is a non-trivial extension of Arge's sequential buffer tree to a private-cache multiprocessor environment and reduces the number of I/O operations by the number of available processor cores compared to its sequential counterpart, thereby taking full advantage of multi-core parallelism. The basic parallel buffer tree is a batched search tree data structure that supports batched parallel processing of updates to the structure and membership queries. We also show how to extend the basic parallel buffer tree to support range queries. Our data structure is able to process a sequence of N updates and queries in optimal O(sort(N) + scan(K)) parallel I/O complexity, where K is the size of the output reported in the process and sort(N) and scan(N) are the parallel I/O complexities of, respectively, sorting and scanning N elements using P processors.
I will talk about why I think cryptography needs better security models: the ones we know either deem secure protocols which are clearly insecure and/or they deem insecure protocols which we actually think are secure. This is an unsatisfactory state of affairs, and I think we can do better. Then I will talk about some of the work I have done on this problem in the past, and finally about some recent progress on better modeling so-called adaptive corruptions.
Physarum is a slime mold. It was observed over the past 10 years that the mold is able to solve shortest path problems and to construct good Steiner networks (Nakagaki, Yamada, Toth, Tero, and Takagi). In a nutshell, the shortest path experiment is as follows: A maze is built and the mold is made to cover the entire maze. Food is then provided at two positions s and t and the evolution of the slime is observed. Over time, the slime retracts to the shortest s-t-path. A mathematical model of the slime's dynamic behavior was proposed in 2007 by Tero, Kobayashi, and Nakagaki. Extensive computer simulations of the mathematical model confirm the experimental findings. For the edges on the shortest path, the diameter converges to one, and for the edges off the shortest path, the diameter converges to zero. We review the wet-lab and the computer experiments and provide a proof for these experimental findings. A video showing the wet-lab experiment can be found at http://www.youtube.com/watch?v=tLO2n3YMcXw&t=4m43s
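A minimal discretized sketch of the Tero-Kobayashi-Nakagaki dynamics on a toy graph, assuming the standard formulation in which flows are computed from an electrical network and each edge conductivity evolves as D' = |Q| - D (the graph, step size and iteration count below are arbitrary choices of mine):

    import numpy as np

    edges = [(0, 1, 1.0), (1, 3, 1.0), (0, 2, 1.0), (2, 3, 3.0)]   # (u, v, length)
    n, s, t, h = 4, 0, 3, 0.1
    D = np.ones(len(edges))                     # edge conductivities ("diameters")

    for _ in range(200):
        C = np.array([D[i] / L for i, (u, v, L) in enumerate(edges)])
        Lap = np.zeros((n, n))
        for i, (u, v, _) in enumerate(edges):   # weighted graph Laplacian
            Lap[u, u] += C[i]; Lap[v, v] += C[i]
            Lap[u, v] -= C[i]; Lap[v, u] -= C[i]
        b = np.zeros(n); b[s] = 1.0             # push one unit of flow from s to t
        Lap[t, :] = 0.0; Lap[t, t] = 1.0        # ground node t (potential 0)
        p = np.linalg.solve(Lap, b)
        Q = np.array([C[i] * (p[u] - p[v]) for i, (u, v, _) in enumerate(edges)])
        D += h * (np.abs(Q) - D)                # the dynamics D' = |Q| - D

    print(np.round(D, 2))   # edges on the shortest s-t path approach 1, the rest 0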
I describe how to use O(k) seed OTs and O(k) pseudorandomly generated strings, each of length l, to generate l homomorphic commitments. This gives very efficient homomorphic commitments, allowing one to optimize several practical applications of homomorphic commitments, including active-secure two-party computation. Joint work with Peter Sebastian Nordholt.
In simplex range reporting the goal is to preprocess a set of points such that given a query simplex, the points inside the query can be outputted efficiently. This is one of the fundamental and most extensively studied problems in computational geometry. In this talk, we will focus on proving lower bounds for the problem in the pointer machine model as well as the I/O model. First, we will briefly review known results and some general techniques that are used to answer simplex range reporting queries. Next, we will review the previous lower bound techniques and then we will move to present our new improvements. We will conclude with a list of open problems.
In combinatorial auctions, a set S of items is auctioned to bidders who express their preferences on subsets of S. We want to find a maximum-cost subcollection of pairwise disjoint sets. Since the problem is intractable, we study algorithms which approximate the problem with a good approximation ratio and are also truthful. A good approximation and truthfulness are very important for this problem since such auctions have direct applications. The problem can be approached as a set packing problem. We also study a set of techniques to construct truthful mechanisms for combinatorial auctions where each bidder wants a specific subset of the items and only the valuation is unknown to the mechanism.
In this talk I will present the work done with Sara Krehbiel and Chris Peikert while visiting Georgia Tech. We develop secure threshold protocols for two important operations in lattice-based cryptographic schemes, namely, generating a hard lattice L together with a "strong" trapdoor, and using the trapdoor to sample from a discrete Gaussian distribution over a desired coset of L. Because these are exactly the key-generation and signing (or private key extraction) operations, respectively, for several lattice-based digital signature and (hierarchical) identity-based encryption schemes, our protocols can be plugged directly in to transfer those systems to the threshold setting.
Many security properties are naturally expressed as indistinguishability between two versions of a protocol. In this paper, we show that computational proofs of indistinguishability can be considerably simplified, for a class of processes that covers most existing protocols. More precisely, we show a soundness theorem, following the line of research launched by Abadi and Rogaway in 2000: computational indistinguishability in presence of an active attacker is implied by the observational equivalence of the corresponding symbolic processes. We prove our result for symmetric encryption, but the same techniques can be applied to other security primitives such as signatures and public-key encryption. The proof requires the introduction of new concepts, which are general and can be reused in other settings.
Entangled cloud storage enables a set of clients P_i to "entangle" files F_i into a single clew C to be outsourced to a (potentially malicious) cloud provider S. The entanglement makes it impossible to modify or delete a single bit without inevitably affecting all files in C. The clew C keeps the files in it private but allows only the clients to recover their own data. At the same time, the cloud provider S is dissuaded from altering C or, worse, denying certain users access to their files. In this paper we provide theoretical foundations for entangled cloud storage, introducing strong security requirements capturing the properties above. We also propose a concrete construction based on privacy-preserving polynomial interpolation. This is joint work with Giuseppe Ateniese and Özgür Dagdelen.
Secure multiparty computation (MPC) studies the question of how to compute functions that depend on the private inputs of multiple parties in a secure way that reveals only the output to the designated receiver while protecting the inputs and other intermediate state information. While cryptographic protocols, which demonstrate the feasibility of secure computation for any function, have been around for more than 30 years, such protocols still have efficiency overheads that are prohibitive for many practical applications. They are also not designed to meet the requirements of many heterogeneous environments where secure computation protocols will be useful, in which different parties have different resources (computation, communication, memory), have different incentives for adversarial behavior, and often operate in highly distributed systems. In my research I have explored different approaches to bring secure computation closer to practice. These include using new computational models that overcome inherent inefficiencies in existing MPC schemes, defining new adversarial models that better reflect practical setups, constructing protocols for outsourced computation and verification, as well as constructing protocols tailored for distributed computation resources. In this talk I would give an overview of my research work. Then I would focus on three examples of my projects. I would present a new paradigm for constructing verifiable computation schemes from attribute-based encryption. Second, I would talk about how we can achieve secure computation protocols with sublinear computation complexity in the size of their inputs (crucial for computation on big databases) and what our solution to this question is. I would also present a different approach to secure computation in the setting of encrypted search, where given strict efficiency requirements we optimize the privacy guarantees achieved and define relaxed, yet meaningful, security notions.
We study a mechanism design version of matching computation in graphs that models the game played by hospitals participating in pairwise kidney exchange programs. We present a new randomized matching mechanism for two agents which is truthful in expectation and has an approximation ratio of 3/2 to the maximum cardinality matching. This is an improvement over a recent upper bound of 2 [Ashlagi et al., EC 2010] and, furthermore, our mechanism beats for the first time the lower bound on the approximation ratio of deterministic truthful mechanisms. We complement our positive result with new lower bounds. Among other statements, we prove that the weaker incentive compatibility property of truthfulness in expectation in our mechanism is necessary; universally truthful mechanisms that have an inclusion-maximality property have an approximation ratio of at least 2. Joint work with Ioannis Caragiannis and Ariel D. Procaccia
We study implementable (incentive compatible) allocation rules based on a network approach when types are multi-dimensional, the mechanism designer may use monetary transfers, and agents have quasi-linear preferences over outcomes and transfers. We provide a general characterization of implementability. The characterization is based on implementability on lines, called line implementability, and we then substantiate the characterization for different classes of valuation functions. Motivated by results in Archer and Kleinberg (2008) for the case of valuations that are linear in the type, we introduce the notion of local implementability, and we show that whenever revenue equivalence holds on lines, local implementability together with line implementability implies implementability. Finally we simplify the characterizations in the case of a finite outcome space. Thereby we derive a generalization of a well-known theorem by Saks and Yu (2005).
Consider a random graph model where each possible edge e is present independently with some probability p_e. Given these probabilities, we want to build a large/heavy matching in the randomly generated graph. However, the only way we can find out whether an edge is present or not is to query it, and if the edge is indeed present in the graph, we are forced to add it to our matching. Further, each vertex i is allowed to be queried at most t_i times. How should we adaptively query the edges to maximize the expected weight of the matching? We consider several matching problems in this general framework (some of which arise in kidney exchanges and online dating, and others arise in modeling online advertisements); we give LP-rounding based constant-factor approximation algorithms for these problems. Our main results are the following: We give a 4-approximation for weighted stochastic matching on general graphs, and a 3-approximation on bipartite graphs. This answers an open question from [Chen et al. ICALP 09]. We introduce a generalization of the stochastic online matching problem [Feldman et al. FOCS 09] that also models preference-uncertainty and timeouts of buyers, and give a constant-factor approximation algorithm. Joint work with Nikhil Bansal, Anupam Gupta, Julian Mestre, Viswanath Nagarajan, Atri Rudra
One of the most popular ways to represent a terrain is the Triangulated Irregular Network (TIN), that is, a 2D triangulation where each vertex also has an elevation value. The natural way of modelling water flow on a triangulated terrain is to make the simple assumption that water follows the direction of steepest descent. Using this assumption, it is possible to design algorithms for constructing drainage structures on a TIN, such as watersheds. Unfortunately, there exist instances of TINs for which the combinatorial complexity of drainage structures is very high. In this presentation we describe the main computational problems that arise, both in theory and in practice, when attempting to construct drainage structures on TINs. We present efficient algorithms that extract information on the drainage structures of a TIN without explicitly computing those structures themselves. We also provide experimental results that show which are the actual bottlenecks that appear when using this flow-modelling in practice. Joint work with Mark de Berg and Herman Haverkort.
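A minimal sketch of the steepest-descent assumption on a single TIN triangle: fit the plane through its three vertices and descend along the negated gradient (the drainage-structure algorithms of the talk build on this local rule; the function name and example triangle are mine):

    import numpy as np

    def steepest_descent_direction(p1, p2, p3):
        # Fit the plane z = a*x + b*y + c through the three (x, y, z) vertices
        # of a TIN triangle; water flows in the in-plane direction -(a, b).
        A = np.array([[x, y, 1.0] for (x, y, _) in (p1, p2, p3)])
        z = np.array([v[2] for v in (p1, p2, p3)])
        a, b, _ = np.linalg.solve(A, z)
        return -np.array([a, b])

    # A triangle rising in the x-direction: water flows towards -x.
    print(steepest_descent_direction((0, 0, 0.0), (1, 0, 2.0), (0, 1, 0.0)))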
Solving puzzles is a popular pastime. There's a vast literature on the complexity of finding solutions to puzzles. Most interesting puzzles are NP-complete, yet lend themselves well to human intuition. A few well-known examples of puzzles for which finding solutions is NP-complete are Sudoku, Minesweeper and Lemmings. Also, finding shortest solutions for generalized 15-puzzles as well as staying alive in Tetris (with the block sequence known in advance) is NP-complete. Along with Sudoku, several other pen-and-paper puzzles are known to be NP-complete as well. In this talk, we add to this family of results, by analyzing the hardness of three such puzzles: Magnets, Kurodoko and Slither Link (on several grid types). All three can be played at http://www.chiark.greenend.org.uk/~sgtatham/puzzles/, under the names Magnets, Range and Loopy. In Magnets, the player is to place magnetic or non-magnetic blocks in a domino-tiled rectangle, subject to the non-adjacency of magnetic poles and the presence of a given number of positive and negative poles in each row and column. In Slither Link, the player is to select a subset of edges in a plane graph that forms a simple cycle, subject to that cycle having n_f edges adjacent to each face f that contains a clue n_f, and any number of edges adjacent to faces not containing clues. In Kurodoko, the player is to divide a rectangular grid into black and white squares such that the black squares are non-adjacent, the white squares are connected, and the black squares have distances to numeric clues specified by those clues. We show all three puzzles to be NP-complete; Slither Link we show to be NP-complete on particular classes of graphs.
We will review and introduce the notion of Hellinger distance, which is a distance measure for probability distributions that is particularly useful for working with product distributions. We will show how Hellinger distance can be used to upper bound the advantage of adaptive distinguishers in certain general settings by the advantage of non-adaptive distinguishers (loosely speaking). We will finally show how these ideas apply in a concrete crypto setting, more precisely for giving provable security results on (abstractions of) the AES blockcipher. The latter is joint work with Andrey Bogdanov, Gregor Leander, Lars Knudsen, Francois-Xavier Standaert and Elmar Tischhauser.
We describe a novel reformulation of Canetti's Universal Composability (UC) framework for the analysis of cryptographic protocols. Our framework is different mainly in that it is (a) based on systems of interactive Turing machines with a fixed communication graph and (b) augmented with a global message queue that allows the sending of multiple messages per activation. The first feature significantly simplifies the proofs of some framework results, such as the UC theorem, while the second can lead to more natural descriptions of protocols and ideal functionalities. (Joint work with George Petrides and Asgeir Steine.)
When a zero-sum game is played once, a risk-neutral player will want to maximize his expected outcome in that single play. However, if that single play instead only determines how much one player must pay to the other, and the same game must be played again until either player runs out of money, optimal play may differ. Optimal play may require using different strategies depending on how much money has been won or lost. Computing these strategies is rarely feasible, as the state space is often large. This can be addressed by playing the same strategy in all situations, though this will in general sacrifice optimality. Purely maximizing expectation for each round in this way can be arbitrarily bad. We therefore propose a new solution concept that has guaranteed performance bounds, and we provide an efficient algorithm for computing it. The solution concept is closely related to the Aumann-Serrano index of riskiness, which is used to evaluate different gambles against each other. The primary difference is that instead of being offered fixed gambles, the game is adversarial.
How far off the edge of the table can we reach by stacking n identical, homogeneous, frictionless blocks of length 1? A 150 year old solution to this problem of static mechanics achieves an overhang of H(n)/2, where H(n) = 1/1 + 1/2 + ... + 1/n is the n-th harmonic number. This solution was widely believed to be optimal. However, recent work of Mike Paterson, Yuval Peres, Mikkel Thorup, Peter Winkler and the speaker of today, Uri Zwick, shows that it is exponentially far from optimal. Professor Uri Zwick from Tel Aviv University will present his work constructing simple n-block stacks that achieve an overhang of c·n^{1/3}, for some constant c > 0. It is further shown that this is best possible, up to a constant factor. This work won the David P. Robbins Prize of the MAA for an "outstanding paper in algebra, combinatorics, or discrete mathematics" and the Lester R. Ford Award of the MAA for "authors of articles of expository excellence published in The American Mathematical Monthly or Mathematics Magazine".
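As a quick numerical illustration of the gap described above (a sketch added for this archive, not part of the talk; the constant c below is purely hypothetical), one can compare the classical harmonic overhang H(n)/2 with the cube-root growth:

    # Compare the classical harmonic-stack overhang H(n)/2 with an
    # illustrative c * n^(1/3); c = 0.5 is a made-up constant, not the talk's.
    def harmonic_overhang(n):
        return sum(1.0 / k for k in range(1, n + 1)) / 2.0

    def cube_root_overhang(n, c=0.5):
        return c * n ** (1.0 / 3.0)

    for n in (10, 100, 1000, 10000):
        print(n, round(harmonic_overhang(n), 2), round(cube_root_overhang(n), 2))

Already for a few thousand blocks the cube-root construction far outgrows the harmonic stack, whose overhang grows only logarithmically.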
In this talk we establish an intimate connection between dynamic range searching in the group model and combinatorial discrepancy. Our result states that, for a broad class of range searching data structures (including all known upper bounds), it must hold that t_u · t_q = Ω(disc^2 / lg n), where t_u is the worst case update time, t_q the worst case query time and disc is the combinatorial discrepancy of the range searching problem in question. This relation immediately implies a whole range of exceptionally high and near-tight lower bounds for all of the basic range searching problems; a few of these bounds will be listed in the talk.
Given a stream of items each associated with a numerical value, its edit distance to monotonicity is the minimum number of items to remove so that the remaining items are non-decreasing with respect to the numerical value. The space complexity of estimating the edit distance to monotonicity of a data stream has become well understood over the past few years. Motivated by applications in network quality monitoring, we extend the study to estimating the edit distance to monotonicity of a sliding window covering the w most recent items in the stream, for any w > 1. We give a deterministic algorithm which can return an estimate within a factor of (4+ε) using O((1/ε)^2 log^2(εw)) space. We also extend the study to consider an out-of-order stream. In an out-of-order stream, each item is associated with a creation time and a numerical value, and items may be out of order with respect to their creation times. The goal is to estimate the edit distance to monotonicity with respect to the numerical value of items arranged in the order of creation times. We show that any randomized constant-approximate algorithm requires linear space.
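For reference, the quantity being approximated has a simple offline characterization: the edit distance to monotonicity equals the number of items minus the length of a longest non-decreasing subsequence. A minimal O(n log n) offline baseline (a sketch, not the streaming algorithm from the talk) is:

    import bisect

    def edit_distance_to_monotonicity(values):
        """Offline baseline: items to delete = n - longest non-decreasing subsequence."""
        tails = []  # tails[k] = smallest possible tail of a non-decreasing subsequence of length k+1
        for v in values:
            # bisect_right keeps equal values, so the subsequence stays non-decreasing
            pos = bisect.bisect_right(tails, v)
            if pos == len(tails):
                tails.append(v)
            else:
                tails[pos] = v
        return len(values) - len(tails)

    print(edit_distance_to_monotonicity([1, 5, 2, 3, 3, 1, 4]))  # 2 (remove 5 and the second 1)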
We study the maximum number of Euclidean embeddings of minimally rigid graphs whose edges represent distance constraints. We focus on small-dimensional Euclidean spaces and obtain exact counts for graphs with up to 7 vertices. Our approach employs distance matrices and sparse polynomial systems. Application areas include robotics, structural bioinformatics, and architecture. Joint work with G. Moroz (INRIA Nancy), E. Tsigaridas (U. Aarhus), and A. Varvitsiotis (CWI Amsterdam).
Bounded rationality considers scenarios in Game Theory in which agents' capabilities are limited. We consider two problems which expose some of the key questions arising in bounded rationality: * an n-round matching pennies game in which players have limited access to randomness, and * rationalizability of a set of price-consumption observations in a single agent economic environment. For the first problem we show how the limited randomness assumption has different implications, depending on whether or not the agents' strategies can be computed in time polynomial in n. For the second problem we start from the fact that classical results in economic theory give easy to test conditions that characterize rationalizable datasets. We show how, when such conditions fail, we can interpret the observed data as the output of a consumer who is approximately maximizing a linear utility function. This result is part of ongoing research.
Leakage-Resilient Cryptography From the Inner-Product Extractor We present a generic method to secure various widely-used cryptosystems against arbitrary side-channel leakage, as long as the leakage adheres to three restrictions: first, it is bounded per observation but in total can be arbitrarily large. Second, memory parts leak independently, and, third, the randomness that is used for certain operations comes from a simple (non-uniform) distribution. As a fundamental building block, we construct a scheme to store a cryptographic secret such that it remains information theoretically hidden, even given arbitrary continuous leakage from the storage. To this end, we use a randomized encoding and develop a method to securely refresh these encodings even in the presence of leakage. We then show that our encoding scheme exhibits an efficient additive homomorphism which can be used to protect important cryptographic tasks such as identification, signing and encryption. More precisely, we propose efficient implementations of the Okamoto identification scheme, and of an ElGamal-based cryptosystem with security against continuous leakage, as long as the leakage adheres to the above-mentioned restrictions. We prove security of the Okamoto scheme under the DL assumption and CCA2 security of our encryption scheme under the DDH assumption. This is joint work with Stefan Dziembowski
We will discuss time lower bounds for both online integer multiplication and convolution in the cell-probe model. For the multiplication problem, one pair of digits, each from one of two n-digit numbers that are to be multiplied, is given as input at step i. The online algorithm outputs a single new digit from the product of the numbers before step i+1. We give a lower bound of Ω((d/w)·log n) time on average per output digit for this problem, where 2^d is the maximum value of a digit and w is the word size. In the convolution problem, we are given a fixed vector V of length n and we consider a stream in which numbers arrive one at a time. We output the inner product of V and the vector that consists of the last n numbers of the stream. We show an Ω((d/w)·log n) lower bound for the time required per new number in the stream. All the bounds presented hold under randomization and amortization. These are the first unconditional lower bounds for online multiplication or convolution in this popular model of computation.
In this talk I'll first show a very simple (non-interactive) perfectly binding string commitment scheme whose security (i.e. hiding property) relies on the learning parity with noise (LPN) problem. Next, I'll give an efficient zero-knowledge proof of knowledge (a Σ-protocol) for any linear function of the secret used to generate LPN instances. Combining these results, we get a very simple string commitment scheme which allows one to efficiently (but interactively) open any linear function (e.g. a subset) of the committed string, while revealing no other information. We borrow ideas from Stern [CRYPTO'93], and for the special case where one opens an ``empty'' commitment, our protocol can be seen as a ``dual'' version of Stern's public-key identification protocol.
We consider the computational complexity of the following very simple problem for streaming algorithms. An input stream is a sequence of m integers a_1, ..., a_m, where each a_i is between 1 and n. When m = n+1, or m > n in general, by the pigeonhole principle there is a duplicate, i.e., an integer d such that d = a_i = a_j for some distinct i and j. Can we find a duplicate using O(log n) space and O(1) sequential passes over the input a_1, ..., a_m? By simple arguments we show that the answer is negative for the case m = n+1. Similar questions for deterministic algorithms for the case m = 2n and for randomized algorithms for the case m = n+1 remain open.
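For contrast with the O(1)-pass impossibility above, the folklore multi-pass approach below (a sketch, not from the talk) finds a duplicate for m = n+1 using O(log n) bits of working memory but O(log n) passes, by binary searching over the value range via the pigeonhole principle:

    def find_duplicate_multipass(stream_factory, n):
        """stream_factory() returns a fresh iterator over the n+1 values in [1, n].
        Each while-iteration is one pass; O(log n) passes, O(log n) bits of state."""
        lo, hi = 1, n
        while lo < hi:
            mid = (lo + hi) // 2
            # Count how many stream values fall into [lo, mid].
            count = sum(1 for x in stream_factory() if lo <= x <= mid)
            if count > mid - lo + 1:   # pigeonhole: a duplicate lies in [lo, mid]
                hi = mid
            else:                      # otherwise it lies in (mid, hi]
                lo = mid + 1
        return lo

    data = [3, 1, 4, 2, 4]             # n = 4, one duplicate
    print(find_duplicate_multipass(lambda: iter(data), 4))  # 4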
In this talk I will discuss how to derive super-constant query lower bounds by reductions from the predecessor problem. More concretely, the talk will include a round-trip reduction between the 2D range minimum query problem and the 2D rectangular point location problem, and a reduction from the predecessor problem to these two problems. A brief sketch of the proof of the predecessor lower bound will also be given. Joint work with Mihai Patrascu and Elad Verbin
In the auxiliary input model an adversary is allowed to see a computationally hard-to-invert function of the secret key. The auxiliary input model weakens the bounded leakage assumption commonly made in leakage resilient cryptography, as the hard-to-invert function may information-theoretically reveal the entire secret key. In this work, we propose the first constructions of digital signature schemes that are secure in the auxiliary input model. Our main contribution is a digital signature scheme that is secure against chosen message attacks when given an exponentially hard-to-invert function of the secret key. As a second contribution, we construct a signature scheme that achieves security for random messages assuming that the adversary is given a polynomial-time hard-to-compute function. Here, polynomial-hardness is required even when given the entire public key -- so-called weak auxiliary input security. We show that such signature schemes readily give us auxiliary input secure identification schemes. A joint work with Sebastian Faust, Carmit Hazay, Jesper Buus Nielsen and Peter Sebastian Nordholt.
We propose a general multiparty computation protocol secure against a dishonest majority, for computing securely arithmetic circuits over a finite field F_{p^k}. As in several earlier works, our protocol consists of a preprocessing phase that is both independent of the function to be computed and of the inputs, and a much more efficient online phase where the actual computation takes place. Our preprocessing is based on a somewhat homomorphic cryptosystem. We extend a scheme by Brakerski et al., allowing us to perform distributed decryption and to handle many values in parallel. Our preprocessing phase improves significantly over earlier work both asymptotically and in practice. The online phase may use an existing protocol by Bendlin et al., based on unconditionally secure MACs, but we also propose a new online phase that scales better with n, the number of players. The total amount of data the players need to store from the preprocessing is linear in n rather than quadratic as in earlier work. Furthermore, the cost of a secure multiplication in our online phase is O(n) multiplications in F_{p^k} plus O(n^2) additions, rather than O(n^2) multiplications as in earlier work. A joint work with Ivan Damgård, Nigel Smart and Sarah Zakarias.
Recent progress in computer systems has provided programmers with a virtually unlimited amount of working storage for their programs. This leads to space-inefficient programs that use too much storage and become too slow if sufficiently large memory is not available. Thus, I believe that space-efficient algorithms or memory-constrained algorithms deserve more attention. Constant-work-space algorithms have been extensively studied under a different name, log-space algorithms. Input data are given on a read-only array of n elements, each having O(log n) bits, and work space is limited to O(log n) bits, in other words, a constant number of pointers and counters, each of O(log n) bits. This memory constraint in log-space algorithms may be too severe for practical applications. For problems related to an image with n pixels, for example, it is quite reasonable to use O(√n) work space, which amounts to a constant number of rows and columns. I will start my talk with a simple algorithm for detecting a cycle in a graph using only a constant amount of work space (more exactly, O(log n) bits in total) and then its applications. Then, I will introduce some paradigms for designing such memory-constrained algorithms and their applications to interesting problems including those in computational geometry and computer vision.
From a practical perspective, mixed integer optimization represents a very powerful modeling paradigm. However, the presence of both integer and continuous variables results in a significant increase in complexity over the pure integer case with respect to geometric, algebraic, combinatorial and algorithmic properties. In this talk we survey some of the recent developments that we expect to contribute to the development of this field of optimization. In particular, we present a new geometric approach based on lattice point free polyhedra, and we discuss some open questions that emerge from this approach.
Due to the expansion of communication networks, achieving secure distributed computation has become a major focus point for the research community. Even if some generic solutions have been known since 1988, those protocols are computationally inefficient. In this talk, we present a new way of designing unconditionally secure and efficient multiparty computation algorithms for non-Abelian groups in the passive (also known as semi-honest) case. By a result (due to Barrington) on performing secure computation in the symmetric group S_5, our protocols can be used to securely compute arbitrary functions. Our approach is based on a security reduction to the existence of a particular class of colorings for planar graphs. The computational complexity of our black-box construction is a small polynomial in the number of participants and is independent of the size of the circuit used to compute the distributed function, representing a major improvement over the generic 1988 solutions. This is joint work with Yvo Desmedt, Josef Pieprzyk, Ron Steinfeld, Xiaoming Sun, Huaxiong Wang and Andrew Chi-Chih Yao.
I will present and motivate recent results on concurrent games. These represent a situation in which a Player (or a team of players) compete against an Opponent (a team of opponents), possibly in a highly distributed fashion. As usual the dichotomy Player vs. Opponent can be interpreted in a variety of ways, as Process vs. Environment, or Proof vs. Refutation. Both games and nondeterministic concurrent strategies are represented by event structures, a `partial-order model' of computation, with an extra function expressing the polarity (the Player/Opponent nature) of each event. Although this work has grown from the needs of semantics - I'll explain how - I believe it could have much more general interest.
We investigate clustering in the weighted setting, in which every data point is assigned a real valued weight. We analyze the influence of weighted data on standard clustering algorithms in each of the partitional and hierarchical settings, characterising the conditions under which such algorithms react to weights, and classifying clustering methods into three broad categories: weight-responsive, weight-considering, and weight-robust. The results contribute to a property-based classification of standard clustering algorithms, and in particular, distinguish between algorithms such as single-linkage and k-means. This is joint work with Margareta Ackerman, Shai Ben-David, and David Loker
In this talk, we present protocols designed to remotely authenticate objects. The underlying principle is simple: a bitstring characterizing an object affiliated with the system is extracted and stored in a database before the object is released. Subsequent authentication is then done by extracting a fresh fingerprint and comparing it with that in the database. Such protocols can be applied to anti-counterfeiting. We also describe a model capturing the necessary security requirements, namely integrity of communications and certain privacy properties. These goals are reached using certain combinations of homomorphic encryption, digital signatures, and private information retrieval protocols.
Given a set S of n numbers from the range [1..U], a monotone minimal perfect hash function [SODA 2009] is a function which associates to each element x of S its rank among all elements of S. The function is allowed to return an arbitrary answer when evaluated on an element outside of S. In [SODA 2009] it is shown that a monotone minimal perfect hash function can be encoded using O(n log log(U/n)) bits of space (which is sublinear when compared with the Θ(n log(U/n)) bits needed to encode the set S) such that the evaluation takes O(1) time. Given a text T, a full-text index on T is an index which supports queries which can efficiently search for patterns in T. The full-text index is said to be compressed if it occupies space close to the space occupied by the compressed text. In this talk, I will show how monotone minimal perfect hash functions can be used to speed up some known compressed text indexes. Joint work with Gonzalo Navarro
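To make the query contract concrete, here is a trivial (and deliberately space-wasteful) stand-in for a monotone minimal perfect hash function; the point of the [SODA 2009] construction is to provide the same interface in only O(n log log(U/n)) bits, which this dict-based sketch does not attempt:

    def build_monotone_rank(S):
        """Trivial stand-in for a monotone minimal perfect hash function:
        correct rank for x in S, arbitrary answer for x outside S.
        Real constructions use only O(n log log(U/n)) bits; this dict does not."""
        rank = {x: i for i, x in enumerate(sorted(S))}
        return lambda x: rank.get(x, 0)   # arbitrary answer (here 0) outside S

    h = build_monotone_rank({5, 17, 42, 99})
    print(h(17), h(99), h(7))  # 1 3 <arbitrary>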
One of the most efficient methods to solve polynomial systems over finite fields is to compute Gröbner bases. However, little is known about the theoretical and practical complexity of computing Gröbner bases of polynomial systems with additional structure (for instance symmetries or multihomogeneous structure). Surprisingly, such structured systems occur frequently in algebraic cryptanalysis: for instance, the MinRank problem, which is at the heart of the security of many multivariate public key cryptosystems such as HFE, and the McEliece cryptosystem, whose security is based on the hardness of decoding general linear codes, are strongly related to multihomogeneous polynomial systems. Polynomial systems with symmetries also occur when solving the DLP for Edwards elliptic curves defined over a non-prime finite field F_{q^n}. We show how to take advantage of some symmetries of twisted Edwards curves to gain an exponential factor of 2^{3(n-1)} when solving the underlying polynomial systems using Gröbner bases.
Let Σ be an arbitrary triangulated surface, possibly with boundary, composed of n triangles. A simple path or cycle in Σ is normal if it avoids every vertex of Σ and it does not cross any edge of Σ twice in a row. We describe an algorithm to trace an arbitrary normal curve in O(min{X, n^2 log X}) time, where X is the number of times the curve crosses edges of the input triangulation. In particular, our algorithm runs in polynomial time even when the number of crossings is exponential in n. Our tracing algorithm computes a new cellular decomposition of the surface with complexity O(n); the traced curve appears as a simple path or cycle in the 1-skeleton of the new decomposition. Our tracing strategy implies fast algorithms for (multi)curves represented by normal coordinates, which record the number of times the curves cross each edge of the surface triangulation. For example, we can count the components of a normal curve, determine whether two disjoint normal curves are isotopic, or determine whether a normal curve disconnects two given points on the surface, all in O(min{X, n^2 log X}) time. Our results are competitive with and conceptually simpler than earlier algorithms by Schaefer, Sedgwick, and Stefankovic [COCOON 2002] based on word equations and text compression. As another application, we describe an algorithm to trace simple geodesics in piecewise-linear surfaces, represented by a set of Euclidean triangles with pairs of equal-length edges identified. Given a starting point and a direction, in the local coordinates of one of the constituent triangles, our algorithm locates the first point of self-intersection of the resulting geodesic path in O(min{X, n^2 log X}) time. Joint work with PhD student Amir Nayyeri (unpublished)
We consider the problem of computing Nash equilibria of a subclass of generic finite normal form games. Games in the subclass that we consider have all their payoff values rational numbers, while all their equilibrium solutions are irrational. We will present a purely algebraic method for computing all Nash equilibria of these games.
We present an approximate distance oracle for a point set S with n points and doubling dimension λ. For every ε > 0, the oracle supports (1+ε)-approximate distance queries in (universal) constant time, occupies space [ε^{-O(λ)} + 2^{O(λ log λ)}]·n, and can be constructed in [2^{O(λ)} log^3 n + ε^{-O(λ)} + 2^{O(λ log λ)}]·n expected time. This improves upon the best previously known constructions, presented by Har-Peled and Mendel. Furthermore, the oracle can be made fully dynamic with expected O(1) query time and only 2^{O(λ)} log n + ε^{-O(λ)} + 2^{O(λ log λ)} update time. This is the first fully dynamic (1+ε)-distance oracle. Joint work with: Yair Bartal, Lee-Ad Gottlieb, Tsvi Kopelowitz, Liam Roditty
We show that probabilistically checkable proofs can be used to shorten non-interactive zero-knowledge proofs. We obtain publicly verifiable non-interactive zero-knowledge proofs for circuit satisfiability with adaptive and unconditional soundness where the size grows quasi-linearly in the number of gates. The zero-knowledge property relies on the existence of trapdoor permutations, or it can be based on a specific number theoretic assumption related to factoring to get better efficiency. As an example of the latter, we suggest a non-interactive zero-knowledge proof for circuit satisfiability based on the Naccache-Stern cryptosystem consisting of a quasi-linear number of bits. This yields the shortest known non-interactive zero-knowledge proof for circuit satisfiability.
Random access memories suffer from failures that can lead the logical state of some bits to be read differently from how they were last written. Silent data corruptions may be harmful to the correctness and performance of software. In this talk we will address the problem of computing in unreliable (hierarchical) memories, focusing on the design of resilient dynamic programming algorithms. We will present algorithms that are correct with high probability (i.e., obtain exactly the same result as a non-resilient implementation in the absence of memory faults) and can tolerate a polynomial number of faults while maintaining asymptotically the same space and performance as their non-resilient counterparts. Joint work with: Saverio Caminiti, Irene Finocchi, and Emanuele Fusco
We consider k-median clustering in finite metric spaces and k-means clustering in Euclidean spaces. In 2006 Ostrovsky et al. proposed an interesting condition under which one can achieve better k-means approximations in time polynomial in n and k. They consider k-means instances where the optimal k-clustering has cost noticeably smaller than the cost of any (k-1)-clustering, motivated by the idea that ``if a near-optimal k-clustering can be achieved by a partition into fewer than k clusters, then that smaller value of k should be used to cluster the data''. Under the assumption that the ratio of the cost of the optimal (k-1)-means clustering to the cost of the optimal k-means clustering is at least max{100, 1/ε^2}, Ostrovsky et al. show that one can obtain a (1+f(ε))-approximation for k-means in time polynomial in n and k, by using a variant of Lloyd's algorithm. In this work we improve their result by giving a PTAS for the k-means problem under the assumption that the ratio of the cost of the optimal (k-1)-means clustering to the cost of the optimal k-means clustering is at least (1+α), for some constant α > 0. This result also extends to the k-median clustering problem. In addition, our technique also obtains a PTAS under the assumption of Balcan et al. that all (1+α)-approximations are δ-close to a desired target clustering, in the case that all target clusters have size greater than δ·n and α > 0 is constant. Our results are based on a new notion of clustering stability, which extends both stability assumptions mentioned above. This is joint work with Pranjal Awasthi and Avrim Blum.
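For concreteness, the basic Lloyd iteration that the above-mentioned variant builds on is the standard assign/recenter loop; a minimal numpy sketch of the unmodified heuristic (not the algorithm from the talk) is:

    import numpy as np

    def lloyd_kmeans(points, k, iters=50, seed=0):
        """Basic Lloyd iteration: assign each point to its nearest center,
        then move each center to the mean of its cluster."""
        rng = np.random.default_rng(seed)
        centers = points[rng.choice(len(points), k, replace=False)]
        for _ in range(iters):
            dists = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = dists.argmin(axis=1)
            for j in range(k):
                members = points[labels == j]
                if len(members):
                    centers[j] = members.mean(axis=0)
        return centers, labels

    pts = np.vstack([np.random.default_rng(1).normal(c, 0.1, (20, 2)) for c in (0.0, 5.0)])
    print(lloyd_kmeans(pts, 2)[0])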
We study the complexity of powering in GF(2^n) by constant depth arithmetic circuits over GF(2) (also known as AC^0(parity)). Our study encompasses basic arithmetic operations such as computing cube-roots and cubic-residuosity of elements of GF(2^n). Our main result is that these operations require exponential size circuits. The proof revisits the classical Razborov-Smolensky method, and executes an analogue of it in the land of univariate polynomials over GF(2^n). We also derive strong average-case versions of these results. For example, we show that no subexponential-size, constant-depth, arithmetic circuit over GF(2) can correctly compute the cubic residue symbol for more than a 1/3 + o(1) fraction of the elements of GF(2^n). As a corollary, we deduce a character sum bound showing that the cubic residue character over GF(2^n) is uncorrelated with all degree-d n-variate GF(2) polynomials (viewed as functions over GF(2^n) in a natural way), provided d << n^{0.1}.
The talk will be devoted to the design of truthful mechanisms for set systems, i.e., scenarios where a customer needs to hire a team of agents to perform a complex task. In this setting, frugality provides a measure to evaluate the ``cost of truthfulness'', that is, the overpayment of a truthful mechanism relative to the ``fair'' payment. In the talk we will consider a uniform scheme for designing frugal truthful mechanisms for general set systems. The scheme is based on scaling the agents' bids using the eigenvector of a matrix that encodes the interdependencies between the agents. We then apply it to two classes of set systems, namely, vertex cover systems and k-path systems, in which a customer needs to purchase k edge-disjoint source-sink paths. For both settings, we bound the frugality of our mechanism in terms of the largest eigenvalue of the respective interdependency matrix. The mechanism turns out to be optimal for a large subclass of vertex cover systems satisfying a simple local sparsity condition and optimal for k-path systems. Our lower bound argument combines spectral techniques and Young's inequality, and is applicable to all set systems.
Consider the problem of scheduling a set of tasks of length p without preemption on m identical machines with given release and deadline times. We introduce a novel graph representation of this task as a "scheduling graph" and show that there exists a feasible schedule if and only if a certain graph property holds. We then propose a compact representation of the graph and auxiliary data structures to test for said property, leading to an algorithm with time complexity O(min(1, p/m) · n^2). This improves substantially over the best previously known algorithm with complexity O(m·n^2). Interestingly, the algorithm produces a schedule which minimizes both completion time and makespan. Joint work with: Claude-Guy Quimper
In this talk we consider the dynamic two-dimensional range maxima query problem. Given n points in the plane, a point is "maximal" if it is not dominated by any other point. 4-sided (resp. 3-sided) range maxima queries report the maximal points among the points that lie within a given orthogonal rectangle (resp. with one unbounded side). Dominance range maxima queries report the maximal points that dominate a given query point. We present a linear space data structure that supports 3-sided range maxima queries in O(log n + t) worst case time, where t is the size of the output. Insertions and deletions of points in the plane take O(log n) worst case time. This improves by a logarithmic factor the worst case deletion time of [Janardan, IPL '91]. The structure of [Kapoor, SIAM J.Comp. '00] also supports deletion in O(log n) worst case time, however it only supports dominance range maxima queries in O(log n + t) amortized time. We present an adaptation of our structure to the RAM model that supports 3-sided range maxima queries in O((log n)/(log log n) + t) worst case time and updates in O((log n)/(log log n)) worst case time. This is the first dynamic data structure for the RAM model that supports all operations in sublogarithmic worst case time. Finally, using a common technique, we obtain a structure that supports 4-sided range maxima queries in O(log^2 n + t) worst case time, updates in O(log^2 n) worst case time, using O(n log n) space. This improves by a logarithmic factor the worst case deletion time of the structure of [Overmars, Wood, J.Alg. '88] for rectangular visibility queries, a special case of 4-sided range maxima queries.
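The static building block behind these queries, computing the maximal (non-dominated) points of a point set, is a simple O(n log n) staircase scan; a sketch is below (the talk's contribution is supporting such queries dynamically and restricted to query ranges):

    def maximal_points(points):
        """Return the points not dominated by any other point
        (p dominates q if p.x >= q.x and p.y >= q.y, with p != q)."""
        result = []
        best_y = float("-inf")
        # Scan by decreasing x; a point is maximal iff its y exceeds all y's seen so far.
        for x, y in sorted(points, key=lambda p: (-p[0], -p[1])):
            if y > best_y:
                result.append((x, y))
                best_y = y
        return result

    print(maximal_points([(1, 5), (2, 3), (3, 4), (4, 1)]))  # [(4, 1), (3, 4), (1, 5)]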
We will present a method for the vehicle and crew scheduling problem (CSP) in public transport that is based on the ACO meta-heuristic. We show that the CSP does not satisfy a fundamental assumption of the meta-heuristic and that the classic Ant System does not work. We accurately implemented Huisman's model for the MDVSP (multi-depot vehicle scheduling problem) and CSP with changeovers and solved his random instances. The quality of the solutions deviates from those of the IP approach by at most 7 percent, and they were computed in a few seconds. The acceptability of the solution quality is shown on real-world data from the public urban bus transit in Ljubljana, Slovenia. Joint work with David Pas.
In this talk we will discuss an efficient sketch for Earth-Mover Distance. Our sketch achieves the same space complexity and approximation ratio as Andoni et al., but (1) our sketch is a linear mapping (embedding) into a product norm space, (2) our sketch algorithm is much simpler, and (3) our effective use of the low-dimensionality property of a norm may be of independent interest. Joint work with Elad Verbin
The talk will describe an I/O-optimal parallel algorithm for solving the orthogonal line segment intersection reporting problem in the Parallel External Memory (PEM) model. The PEM model is a parallel model that has been introduced recently to model the cache-oriented memory design of modern multicore architectures. This is a joint work with Deepak Ajwani and Norbert Zeh.
We consider the problem of representing numbers in close to optimal space and supporting increment, decrement, addition and subtraction operations efficiently. We study the problem in the bit probe model and analyse the number of bits read and written to perform the operations, both in the worst case and in the average case. A counter is space-optimal if it represents any number in the range [0, ..., 2^n - 1] using exactly n bits. We provide a space-optimal counter which supports increment and decrement operations by reading at most n-1 bits and writing at most 3 bits in the worst case. To the best of our knowledge, this is the first such representation which supports these operations by always reading strictly less than n bits. For redundant counters, where we only need to represent numbers in the range [0, ..., L] for some integer L < 2^n - 1 using n bits, we define the efficiency of the counter as the ratio between L+1 and 2^n. We present various representations that achieve different trade-offs between the read and write complexities and the efficiency. We also give another representation of integers that uses n + O(log n) bits to represent integers in the range [0, ..., 2^n - 1] and supports efficient addition and subtraction operations, improving the space complexity of an earlier representation by Munro and Rahman [Algorithmica, 2010]. Joint work with Mark Greve, Vineet Pandey, and S. Srinivasa Rao.
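As a baseline for these bounds, the standard binary increment must in the worst case read and write all n bits (a long carry chain); the naive counter below, with explicit read/write accounting, is a sketch added only to make that baseline concrete:

    def increment(bits):
        """Naive binary increment on a little-endian bit list.
        Worst case (a maximal carry chain): reads and writes all n bits."""
        reads = writes = 0
        for i in range(len(bits)):
            reads += 1
            if bits[i] == 0:
                bits[i] = 1
                writes += 1
                return reads, writes
            bits[i] = 0            # carry: 1 -> 0 and continue
            writes += 1
        return reads, writes       # counter wrapped around to zero

    print(increment([1, 1, 1, 0]))  # (4, 4): value 7 -> 8 touches every bit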
Sorted range reporting has recently emerged as a non-classical area of range searching with interesting and sometimes challenging problems. In traditional range reporting problems, the goal is to report a subset of the input that matches a certain criterion given by the query, but the order of the output elements can be arbitrary; in sorted range reporting the order of the output elements matters. In this talk, we will consider two main problems: angular sorting, which is the problem of sorting a set of input points "around" a query point, and sorted nearest neighbor queries, which is the problem of outputting the input points sorted by their distance to a query point. We study these (and some other related problems) in the external memory model and prove lower bounds for data structures that can answer queries using an optimal number of I/Os. We will also discuss many challenges associated with obtaining lower and upper bounds for the new class of problems that we study. The open problems that emerge from this work will be discussed at the end. Joint work with Norbert Zeh.
This talk deals with the problem of optimizing nonlinear functions over the lattice points in a polyhedral set. We present families of polynomial time algorithms for special cases of the general problem. Each such algorithm makes use of combinatorial, algebraic or geometric properties of the underlying problem. A particular focus of the talk concerns the special case when we aim at minimizing a nonlinear convex objective function.
A k-ary cardinal tree is a rooted tree in which each node has at most k children, and each edge is labeled with a symbol from the alphabet {1,...,k}. We present a succinct representation for k-ary cardinal trees of n nodes where k = O(polylog(n)). Our data structure requires 2n + n log k + o(n log k) bits and performs the following operations in O(1) time: parent, i-th child, label-child, degree, subtree-size, preorder, is-ancestor, insert-leaf, delete-leaf. The update times are amortized. The space is close to the information theoretic lower bound. The operations are performed in the course of traversing the tree. This improves the succinct dynamic k-ary cardinal tree representation of Arroyuelo (CPM'08) for small alphabets, by speeding up both the query time of O(log log n) and the update time of O((log log n)^2 / log log log n) to O(1), solving an open problem which has been previously reported. Joint work with S. Srinivasa Rao
Given a graph (V,E) with a cost on each edge and a penalty (a.k.a. prize) on each node, the prize-collecting Steiner tree (PCST) problem asks for the tree that minimizes the sum of the edge costs in the tree and the penalties of the nodes not spanned by it. In addition to being a useful theoretical tool for helping to solve other optimization problems, PCST has been applied fruitfully by AT&T to the optimization of real-world telecommunications networks. This problem is NP-hard, so research has focused on approximation algorithms. The most recent improvement was the famous 2-approximation algorithm of Goemans and Williamson, which first appeared in 1992. The natural linear programming relaxation of PCST has an integrality gap of 2, which has been a barrier to further improvements for this problem. We present a 1.9672-approximation algorithm for PCST, using a new technique for improving prize-collecting algorithms that allows us to circumvent the integrality gap barrier. We have also applied the same technique to obtain improved approximation algorithms for the prize-collecting path and traveling salesman problems. Joint work with Mohammad Hossein Bateni (Princeton), Mohammad Taghi Hajiaghayi (AT&T), and Howard Karloff (AT&T).
There are a lot of problems that have solutions that seem to "work" but have no guarantees with respect to the efficiency metrics related to the programming model (or model of computation) used. In this talk, I am going to report on such problems, such as top-k dominating queries, outlier detection and core decomposition in graphs. In particular, anomaly detection is considered an important task aiming at discovering elements (known as outliers) that show significant deviation from the expected case. It is a very general problem and has been attacked by making use of statistical methods, neural networks, machine learning etc. We will look at this problem from the streaming perspective and report on recent results and open problems. Core decomposition in graphs is yet another decomposition method for graphs which, due to its efficiency, has been used by researchers in (social) network analysis, network visualization, protein prediction etc. The main advantage of core decomposition is that there exist simple and efficient algorithms for computing the k-cores, whereas other similar-in-concept decompositions, such as k-cliques and k-plexes, are algorithmically difficult in the sense that they are either NP-hard or at least quadratic. We are going to discuss the problem, report on some recent advances and state some interesting open problems.
With modern LiDAR technology the amount of topographic data, in the form of massive point clouds, has increased dramatically. One of the most fundamental GIS tasks is to construct a grid digital elevation model (DEM) from these point clouds. We present a simple yet very fast natural neighbor interpolation algorithm for constructing a grid DEM from massive point clouds. We use the graphics processing unit (GPU) to significantly speed up the computation. To handle the large data sets and to deal with graphics hardware limitations clever blocking schemes are used to partition the point cloud. This algorithm is about an order of magnitude faster than the much simpler linear interpolation, which produces a much less smooth surface. We also show how to extend our algorithm to higher dimensions, which is useful for constructing 3D grids, such as from spatial-temporal topographic data. We describe different algorithms to attain speed and memory trade-offs. Joint work with Alex Beutel and Pankaj K. Agarwal
Recent developments in the internet and online markets have made the world a gigantic marketplace. Nowadays, many services require a combination of multiple sub-services belonging to different agents. In addition, governments and companies face problems of great complexity which can only be addressed by a diverse combination of experts. As a result, understanding economic mechanisms for hiring a team of experts is of great importance. I will talk about designing auctions with desirable properties as a solution to the problem above. Suppose we want to hire a team of selfish agents to perform a task via an auction. Each agent is assumed to have a private cost for being hired. Moreover, only certain combinations of agents are feasible, which are expressed as a set system. For instance, in the well-studied case of path auctions, the agents are the edges of a graph, and we are trying to buy an s-t path. A natural goal for us (the auctioneer) is to seek a mechanism that pays 'the least'. I will present optimal truthful mechanisms in three classes of set systems: Vertex Covers, k-Flows (a generalization of paths) and Cuts. For Vertex Covers, we achieve optimal mechanisms using spectral techniques. Our mechanism first scales each bid by a multiplier derived from the dominant eigenvector of a certain matrix related to the adjacency matrix and then runs VCG. Additionally, we use the Vertex Cover mechanism as a primitive and propose a methodology to obtain 'frugal' mechanisms for a given set system by pruning it down to a Vertex Cover instance. We show the power of our methodology by achieving optimal mechanisms for flows and cuts. Bio: Mahyar Salek is a fifth year PhD student in the computer science department at University of Southern California, advised by David Kempe. His research interests lie in the intersection of theoretical computer science, game theory, economics and mathematical sociology. In particular, he is interested in auction and mechanism design, algorithmic game theory and social networks analysis. He completed his undergraduate degree in computer engineering from Sharif University of Technology (Iran). Additionally, he has spent two summers working as a Research Intern, at CWI and MSR NE.
High dimensional problems easily lead to huge datasets, even if they are tackled using sparse grids, a method developed to reduce the number of data points in high dimensions. So the question of what memory-efficient algorithms for sparse grids should look like arises naturally. This talk introduces this topic of my planned PhD work, and as a first result we determine the I/O complexity of a special stencil-like traversal of a regular full 2D grid.
The talk overviews the multiplication of a sparse matrix with both vectors and matrices in the semiring I/O-model. Multiplying a sparse N x N matrix A, having kN non-zero entries, with a dense vector was considered by Bender, Brodal, Fagerberg, Jacob and Vicari in 2007. They determined its complexity for different layouts for storing A up to a certain sparsity. In this talk, we present results extending their work towards (i) creating the matrix-vector product of A with multiple vectors simultaneously, (ii) multiplying A with a dense matrix B, and (iii) transposing A, which also yields a lower bound for creating the product of two sparse matrices A and B. Furthermore, we show that in our model, for most parameter settings, creating the matrix-vector product has the same complexity as evaluating the bilinear form. For all the considered tasks, upper and lower bounds that match up to constant factors are obtained for a wide range of parameters. Only the task of creating the product of two sparse matrices eludes matching bounds so far. Finally, we consider the task of permuting N elements, aiming to examine what makes a given instance difficult.
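For orientation, the computation whose I/O cost is being analysed is the plain sparse matrix-vector product; the sketch below uses a row-by-row (CSR-like) layout, which is an assumption of this illustration and only one of the layouts one might analyse:

    def spmv_csr(values, col_idx, row_ptr, x):
        """RAM-model sparse matrix-vector product y = A*x for A stored row by row (CSR).
        values/col_idx hold the kN non-zeros; row_ptr[i] marks where row i starts."""
        n = len(row_ptr) - 1
        y = [0] * n
        for i in range(n):
            for t in range(row_ptr[i], row_ptr[i + 1]):
                y[i] += values[t] * x[col_idx[t]]
        return y

    # 3x3 example:  [[2, 0, 1], [0, 3, 0], [4, 0, 0]] times [1, 1, 1]
    print(spmv_csr([2, 1, 3, 4], [0, 2, 1, 0], [0, 2, 3, 4], [1, 1, 1]))  # [3, 3, 4]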
Given a set of points with uncertain locations, we consider the problem of computing the probability of each point lying on the skyline, that is, the probability that it is not dominated by any other input point. If each point's uncertainty is described as a probability distribution over a discrete set of locations, we improve the best known exact solution. We also suggest why we believe our solution might be optimal. Next, we describe simple, near-linear time approximation algorithms for computing the probability of each point lying on the skyline. Joint work with Peyman Afshani, Pankaj K. Agarwal, Lars Arge, and Jeff M. Phillips
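Under the stated model, the skyline probability of a point has a direct but slow expression; the naive sketch below assumes independence between points, that each point's location probabilities sum to 1, and a strict-domination convention (all assumptions of this illustration, not the talk's algorithm, which is of course much faster):

    def skyline_probabilities(points):
        """points[i] = list of (prob, (x, y)) locations for uncertain point i.
        A location is off the skyline if some other point strictly dominates it
        in both coordinates.  Naive O(n^2 * s^2) computation."""
        def dominates(a, b):
            return a[0] > b[0] and a[1] > b[1]
        result = []
        for i, dist_i in enumerate(points):
            prob_on_skyline = 0.0
            for p_loc, loc in dist_i:
                survive = 1.0
                for j, dist_j in enumerate(points):
                    if j == i:
                        continue
                    p_dominated = sum(q for q, other in dist_j if dominates(other, loc))
                    survive *= 1.0 - p_dominated
                prob_on_skyline += p_loc * survive
            result.append(prob_on_skyline)
        return result

    pts = [[(0.5, (1, 1)), (0.5, (3, 3))],   # point 0
           [(1.0, (2, 2))]]                  # point 1
    print(skyline_probabilities(pts))        # [0.5, 0.5]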
Range selection is the problem of preprocessing an input array A of n unique integers, such that given a query (i,j,k), one can report the k'th smallest integer in the subarray A[i], A[i+1], ..., A[j]. In this talk we consider static data structures in the word-RAM for range selection and several natural special cases thereof. The first special case is known as range median, which arises when k is fixed to floor((j-i+1)/2). The second case, denoted prefix selection, arises when i is fixed to 0. We prove cell probe lower bounds for range selection, prefix selection and range median, stating that any data structure that uses S words of space needs Ω(lg n / lg(Sw/n)) time to answer a query. In particular, any data structure that uses n·lg^{O(1)} n space needs Ω(lg n / lg lg n) time to answer a query, and any data structure that supports queries in constant time needs n^{1+Ω(1)} space. For data structures that use n·lg^{O(1)} n space this matches the best known upper bound. Joint work with Allan Grønlund Jørgensen
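For reference, a single range-selection query can be answered naively without any preprocessing; the sketch below is only the trivial baseline that the data structures above are designed to beat:

    import heapq

    def range_select(A, i, j, k):
        """Naive answer to a single range-selection query: the k-th smallest
        element of A[i..j] (1-indexed k, inclusive bounds), in O((j - i) log k) time."""
        return heapq.nsmallest(k, A[i:j + 1])[-1]

    A = [9, 2, 7, 4, 11, 5, 3]
    print(range_select(A, 1, 5, 3))   # 3rd smallest of [2, 7, 4, 11, 5] -> 5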
We consider a setting where a set of n players use a set of m servers to store a large, private data set. Later the players decide on one or more functions they want to compute on the data without the servers needing to know which computation is done, while the computation should be secure against a malicious adversary corrupting a constant fraction of the players and servers. Using packed secret sharing, the data can be stored in a compact way but will only be accessible in a block-wise fashion. We explore the possibility of using I/O-efficient algorithms to nevertheless compute on the data as efficiently as if random access were possible. We show that for sorting, priority queues and data mining, this is indeed the case. In particular, even if a constant fraction of servers and players are malicious, we can keep the complexity of our protocols within a constant factor of the passively secure solution. As a technical contribution towards this goal, we develop techniques for generating values of the form (r, g^r) for random secret-shared r ∈ Z_q, with g^r in a group of order q. This costs a constant number of exponentiations per player per value generated, even if less than n/3 of the players are malicious. This can be used for efficient distributed computing of Schnorr signatures. Specifically, we further develop the technique so we can sign secret data in a distributed fashion at essentially the same cost. A joint work with Ivan Damgaard and Tomas Toft.
The lifetime of software systems such as auctions and voting can often be naturally divided into a number of "online" phases in which computation takes place, separated by potentially long "offline" phases in which only storage is used and no computation takes place. This is especially true for systems running in the cloud, because of the easy scalability of resources, e.g. computing power, and the fact that one only pays for the resources that are actually consumed. In this presentation we discuss security issues for distributed systems in such a cloud setting, and we present a protocol enabling a set of servers running in the cloud, each server having a sensitive state (say, some secret keys), to achieve a strong notion of security during the offline phases of the system. This is joint work with Ivan Damgård, Jesper Buus Nielsen, Jakob Pagter and Tim Rasmussen.
The simplex algorithm is among the most widely used algorithms for solving linear programs in practice. Most deterministic pivoting rules are known, however, to need an exponential number of steps to solve some linear programs (Klee-Minty examples). No non-polynomial lower bounds on the expected number of steps were known, prior to this work, for randomized pivoting rules. We provide the first subexponential (i.e., of the form 2^{Ω(n^α)}, for some α > 0) lower bounds for the two most natural, and most studied, randomized pivoting rules suggested to date, thereby solving a problem open since the early 1970s. Joint work with Oliver Friedmann and Uri Zwick, to appear at STOC'11.
In this lecture I'll talk about black box separations in cryptography. Here are two examples of questions in this area: 1. Suppose you get an oracle to a one-way function. You don't know what this one-way function does, but you know it's a good one-way function, i.e. it is not breakable by the adversary. Can you implement a public-key cryptosystem based on this oracle, without any assumptions (computational or otherwise) besides the assumption that the one-way function is unbreakable? 2. A similar setting to #1, except now you get an oracle to a public-key cryptosystem, and the question is whether you can implement an identity-based cryptosystem based on that? (For a definition of identity-based encryption, see e.g. here: http://en.wikipedia.org/wiki/ID-based_encryption) The answer to both questions is "NO". NO on #1 is a famous result of Impagliazzo and Rudich, while NO on #2 is a less-famous but still very interesting result of Boneh, Papakonstantinou, Vahlis, Rackoff and Waters. The papers are here: 1. http://cseweb.ucsd.edu/~russell/secret.ps 2. http://www.itcs.tsinghua.edu.cn/~papakons/pdfs/IBE_focs08.pdf In this talk I will sketch some ideas behind these two results (as time permits). Furthermore, I will say some additional hopefully-interesting things about black box separations, and present a (conjectured) approach to re-proving these (difficult) results in a natural and hopefully more intuitive way by using information theory. The notions that one needs to develop in information theory in order to approach these questions from an information-theoretic direction seem to be of independent interest. They are related to computational entropy and to black-box-computational entropy. (All notions will be defined.) No background on information theory is assumed. I hope to make the talk clear to those who are not too familiar with cryptography as well, but I'm not entirely sure I'll succeed in that. This is based on joint work with Periklis Papakonstantinou.
Schoening in 1999 presented a simple randomized algorithm for 3-SAT with running time 1.334^n. We give a deterministic version of this algorithm with almost the same running time. Joint work with Robin Moser. This work will also be presented at STOC'11.
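The randomized algorithm referred to is the classic random-walk procedure; a minimal sketch of it (without the derandomization that is the subject of the talk) is:

    import random

    def schoening_3sat(clauses, n, tries=None):
        """Schoening's random walk: repeat ~(4/3)^n times: pick a random assignment,
        then do 3n local steps, each flipping a random variable of some unsatisfied clause.
        clauses: list of 3-tuples of non-zero ints; literal v means x_v, -v means not x_v."""
        tries = tries or int((4.0 / 3.0) ** n) + 1
        for _ in range(tries):
            assign = {v: random.random() < 0.5 for v in range(1, n + 1)}
            for _ in range(3 * n):
                unsat = [c for c in clauses
                         if not any((lit > 0) == assign[abs(lit)] for lit in c)]
                if not unsat:
                    return assign            # satisfying assignment found
                lit = random.choice(random.choice(unsat))
                assign[abs(lit)] = not assign[abs(lit)]
        return None                          # probably unsatisfiable

    formula = [(1, 2, 3), (-1, 2, -3), (1, -2, 3)]
    print(schoening_3sat(formula, 3, tries=100))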
Concurrent reachability games are a class of games heavily studied by the computer science community, in particular by the formal methods community. Two standard algorithms for approximately solving two-player zero-sum concurrent reachability games are value iteration and strategy iteration. We prove upper and lower bounds of 2^{m^{Θ(N)}} on the worst case number of iterations needed for both of these algorithms to provide non-trivial approximations to the value of a game with N non-terminal positions and m actions for each player in each position. In particular, both algorithms have at least doubly-exponential complexity. The instances establishing the lower bound may be regarded as natural, and our proofs of the lower bounds proceed by arguing about these instances as games, using game-theoretic concepts and tools. The talk will focus on a proof of the lower bound for m=2. Joint work with Kristoffer Arnsfelt Hansen and Peter Bro Miltersen
There have been two major paradigms in market price determination. One is the market equilibrium approach, which has been very important in understanding the structure of a global economy with various types of products, and is very useful, when combined with realistic economic data, for developing numerical evaluations of the performance of an economy under the influence of different policies in the framework of Computable General Equilibrium. Another is auction protocols under the general framework of mechanism design, first started for goods of scarcity, and now widely used in E-commerce for all kinds of goods. Despite being studies of the same subject matter of pricing and allocation of goods, those two approaches are based on different methodologies. There is a weakness of the market equilibrium approach that has been criticised from the mechanism design paradigm: individual agents in the market may report their utility functions untruthfully, which may render an outcome different from the market equilibrium under the true utilities. In this talk, we explore this issue to study the interface of the two paradigms. We focus on the matching market and its variants developed from the seminal work of Shapley and Shubik. Here, in many applications, centralized prices are used to determine the allocation of indivisible goods under the principles of individual optimization and market clearance, based on public knowledge of individual preferences. On the other hand, auction mechanisms have been used with a different set of principles for the determination of prices, based on individuals' incentives to truthfully report their preferences. We consider a single seller's market with the objective of maximizing the revenue of the seller, who employs market equilibrium pricing for allocation. We show that the maximum revenue market equilibrium mechanism converges, under a chosen best response bidding strategy of the buyers, to a solution equivalent to the minimum revenue equilibrium under the true preferences of the buyers, which in turn is revenue equivalent to a VCG solution. This reconfirms the revenue equivalence theory of Myerson, but at the interface of the two different paradigms. In addition, we will also discuss other issues involved in incentive analysis and market equilibrium. This work is based on recent works with several co-authors, including Ning Chen, as well as Jie Zhang and others.
In this talk, we present algorithms and lower bounds for recognizing various languages in the data stream model. In particular, we resolve an open problem of Magniez, Mathieu and Nayak [STOC, 2010] concerning the multi-pass complexity of recognizing Dyck languages. This results in a natural separation between the standard multi-pass model and the multi-pass model that permits reverse passes. We also present the first passive memory checkers that verify the interaction transcripts of priority queues, stacks, and double-ended queues. We obtain tight upper and lower bounds for these problems, thereby addressing an important sub-class of the memory checking framework of Blum et al. [Algorithmica, 1994]. A key contribution of our work is a new bound on the information complexity of AUGMENTED-INDEX. Joint work with Amit Chakrabarti, Graham Cormode, and Ranganath Kondapally.
Tampering attacks are cryptanalytic attacks on the implementation of cryptographic algorithms (e.g. smart cards), where an adversary introduces faults with the hope that the tampered device will reveal secret information. Inspired by the work of Ishai et al. [Eurocrypt'06], we propose a compiler that transforms any circuit into one with the same functionality but resilient against a well-defined and powerful tampering adversary. More concretely, our transformed circuits remain secure even if the adversary can adaptively tamper with every wire in the circuit, as long as the tampering fails with some probability δ. This additional requirement is motivated by practical tampering attacks, where it is often difficult to guarantee the success of a specific attack. Formally, we show that a q-query tampering attack against the transformed circuit can be ``simulated'' with only black-box access to the original circuit and log(q) bits of additional auxiliary information. Thus if the implemented cryptographic scheme is secure against log(q) bits of leakage, then our implementation is tamper-proof in the above sense. Surprisingly, allowing such a small amount of leakage significantly improves the efficiency. In contrast to Ishai et al., who allow tampering of up to t wires with δ = 0, for some security parameter k we save a factor of O(k^2·t) in circuit size, and do not require any randomness during evaluation. We also show that, allowing such leakage, similar efficiency gains can be achieved for the case of δ = 0, when one is willing to make additional assumptions. Similar to earlier work, our construction requires small, stateless and deterministic tamper-proof gadgets. Thus, our result can be interpreted as reducing the problem of shielding arbitrarily complex computation to protecting simple components. This is joint work with Krzysztof Pietrzak and Daniele Venturi
We present an implicit dictionary with the working set property, i.e., a dictionary supporting insert(e), delete(x) and predecessor(x) in O(log n) time and search(x) in O(log l) time, where n is the number of elements stored in the dictionary and l is the number of distinct elements searched for since the element with key x was last searched for. The dictionary stores the elements in an array of size n using no additional space. In the cache-oblivious model the operations insert(e), delete(x) and predecessor(x) cause O(log_B n) cache-misses and search(x) causes O(log_B l) cache-misses. Joint work with Gerth Stølting Brodal and Casper Kejlberg-Rasmussen.
We study a variant of the line-simplification problem where we are given a polygonal path P = p1, p2, ..., pn and a set O of m point obstacles in a plane, and the goal is to find the optimal homotopic simplification, that is, a minimum subsequence Q = q1, q2, ..., qk (q1 = p1 and qk = pn) of P defining a polygonal path which approximates P within the given error e and is homotopic to P. We present a general method running in time O(m(m+n) log(nm)) for identifying every shortcut pi pj that is homotopic to the sub-path pi, ..., pj of P, called a homotopic shortcut. Under any desired measure, this method can simply be combined with Imai and Iri's framework to obtain the optimal homotopic simplification. Joint work with Shervin Daneshpajouh, Mohammad Ali Abam, and Mohammad Ghodsi.
We revisit classic algorithmic search and optimization problems from the perspective of competition. Rather than a single optimizer minimizing expected cost, we consider a zero-sum game in which a search problem is presented to two players, whose only goal is to outperform the opponent. Such games are typically exponentially large zero-sum games, but they often have a rich structure. We provide general techniques by which such structure can be leveraged to find minmax-optimal and approximate minmax-optimal strategies. We give examples of ranking, hiring, compression, and binary search duels, among others. We give bounds on how often one can beat the classic optimization algorithms in such duels. Joint work with Adam Tauman Kalai, Brendan Lucier, Ankur Moitra, Andrew Postlewaite, and Moshe Tennenholtz.
Consider an unrooted tree where every edge has a weight associated with it. A path minima query asks for the edge of minimum weight along the path between two given vertices. In the Semigroup, Comparison and RAM models, we propose data structures for the following variants of the problem: 1) the weight of each edge can be changed using an update operation; 2) new leaves can be inserted into and deleted from the tree. The Dynamic Trees of Sleator and Tarjan (STOC'81) solve the Path Minima problem on a forest of unrooted trees under edge insertion/deletion. There are various results for different variants of the problem, where one or some of the operations are ignored. We also give several simple reductions from different problems to the variants of the Path Minima problem. Joint work with Gerth Stølting Brodal and Srinivasa Rao Satti.
Operations Research deals with hard optimization problems, and the main tool for this is Mixed Integer Programming (MIP) models. One of the most successful tools for handling hard MIP models is Dantzig-Wolfe decomposition. This method reformulates the MIP models in order to be able to solve larger MIP models. For computer scientists it may be interesting that the solution of simpler combinatorial problems, like the knapsack problem, becomes very important. Handling these combinatorial problems then requires good Computer Science skills. Dantzig-Wolfe decomposition is briefly described in general and then exemplified with a problem from communication network protection. The presented Dantzig-Wolfe decomposition scheme has been published in the Networks journal: "Optimal routing with failure-independent path protection", Stidsen, T. and Petersen, B. and Spoorendonk, S. and Zachariasen, M. and Rasmussen, K.
In 2009, Barak, Braverman, Chen and Rao proved a first-of-its-kind direct sum result for randomized communication complexity. They prove that if f requires C bits of communication, then f^n requires at least √n·C bits of communication. (I'm ignoring some error terms, polylog terms, etc.) More details, full definition of the setting, etc., can be found in their ECCC paper here: http://eccc.uni-trier.de/report/2009/044 Their result relies on some standard tools from the area, along with two new tools: a new method for "message compression" which is more efficient in some sense than all previous methods, and a new definition of "information content" which suits their purposes. In this talk I will give a short exposition of their result, and outline some open research directions. I assume no prior knowledge of information theory.
A fundamental problem in data management is to draw a sample of a large data set, for approximate query answering, selectivity estimation, and query planning. With large, streaming data sets, this problem becomes particularly difficult when the data is shared across multiple distributed sites. The challenge is to ensure that a sample is drawn uniformly across the union of the data while minimizing the communication needed to run the protocol on the evolving data. At the same time, it is also necessary to make the protocol lightweight, by keeping the space and time costs low for each participant. In this paper, we present communication-efficient protocols for sampling (both with and without replacement) from k distributed streams. These apply to the case when we want a sample from the full streams, and to the sliding window cases of only the W most recent elements, or arrivals within the last w time units. We show that our protocols are optimal (up to logarithmic factors), not just in terms of the communication used, but also the time and space costs for each participant. Joint work with: Graham Cormode, S. Muthukrishnan and Ke Yi.
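As background for the single-stream side of this problem, the following is a minimal sketch (assumed illustration, not the distributed protocol from the talk) of classical reservoir sampling, which maintains a uniform sample of size k over one stream in a single pass:

    import random

    def reservoir_sample(stream, k):
        # Keep the first k items; thereafter item i (1-indexed) replaces a
        # uniformly random reservoir slot with probability k/i, which keeps
        # the sample uniform over everything seen so far.
        reservoir = []
        for i, item in enumerate(stream, start=1):
            if i <= k:
                reservoir.append(item)
            else:
                j = random.randrange(i)
                if j < k:
                    reservoir[j] = item
        return reservoir

    print(reservoir_sample(range(10000), k=5))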
We study probabilistic complexity classes and questions of derandomisation from a logical point of view. For each logic L we introduce a new logic BPL (bounded error probabilistic L), which is defined from L in a similar way as the complexity class BPP is defined from PTIME. Our main focus lies on questions of derandomisation, and we prove that there is a query which is definable in BPFO, the probabilistic version of first-order logic, but not in finite variable infinitary logic with counting. This implies that many of the standard logics of finite model theory, like transitive closure logic and fixed-point logic, both with and without counting, cannot be derandomised. We prove similar results for ordered structures and structures with an addition relation, showing that certain uniform variants of AC0 (bounded-depth polynomial sized circuits) cannot be derandomised. These results are in contrast to the general belief that most standard complexity classes can be derandomised.
The "Coin Problem" is the following problem: a coin is given, which lands on head with probability either 1/2+β or 1/2-β. We are given the outcome of n independent tosses of this coin, and the goal is to guess which way the coin is biased, and to be correct with probability ≥ 2/3. When our computational model is unrestricted, the majority function is optimal, and succeeds when β ≥ c / √n for a large enough constant c. The coin problem is open and interesting in models that cannot compute the majority function. In this talk I will present results on the coin problem in the model of read-once width-w branching programs. We prove that in order to succeed in this model, beta must be at least 1/ (log n)Θ(w). For constant w this is tight by considering the recursive tribes function. I will also discuss various generalizations and variants of this. Finally, I will suggest one application for this kind of theorems: I'll show that the INW generator epsilon-fools width-w read-once permutation branching programs, using seed length O(log n*loglog n) when epsilon and w are both constant. I'll also show why we get this only for permutation branching programs, and what stops us from getting this for the non-permutation case. We are looking for applications of the coin problem in other domains (e.g. streaming lower bounds). Joint work with Joshua Brody
Grammar based compression, where one replaces a long string by a small context-free grammar that generates the string, is a simple and powerful paradigm that captures many of the popular compression schemes, including the Lempel-Ziv family, Run-Length Encoding, Byte-Pair Encoding, Sequitur, and Re-Pair. In this paper, we present a novel grammar representation that allows efficient random access to any character or substring without decompressing the string. Let S be a string of length N compressed into a context-free grammar S of size n. We present two representations of S achieving O(log N) random access time, and either O(n·α_k(n)) construction time and space on the pointer machine model, or O(n) construction time and space on the RAM. Here, α_k(n) is the inverse of the k-th row of Ackermann's function. Our representations also efficiently support decompression of any substring in S: we can decompress any substring of length m in the same complexity as a single random access query and additional O(m) time. Combining these results with fast algorithms for uncompressed approximate string matching leads to several efficient algorithms for approximate string matching on grammar compressed strings without decompression. For instance, we can find all approximate occurrences of a pattern P with at most k errors in time O(n(min{|P|k, k^4 + |P|} + log N) + occ), where occ is the number of occurrences of P in S. Finally, we are able to generalize our results to navigation and other operations on grammar-compressed trees. All of the above bounds significantly improve the currently best known results. To achieve these bounds, we introduce several new techniques and data structures of independent interest, including a predecessor data structure, two "biased" weighted ancestor data structures, and a compact representation of heavy-paths in grammars. Joint work with: Gad M. Landau, Rajeev Raman, Kunihiko Sadakane, Srinivasa Rao Satti, and Oren Weimann.
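As a point of reference, here is a minimal sketch (assumed illustration, not the representation from the talk) of random access into a grammar-compressed string: store the expansion length of every nonterminal and walk down the parse tree. This costs time proportional to the grammar height, rather than the O(log N) achieved above. The grammar format used here (each nonterminal maps to a pair of symbols, literals are single characters) is an assumption for illustration only.

    def expansion_lengths(rules):
        # Length of the string derived from each nonterminal.
        lengths = {}
        def length(sym):
            if sym not in rules:          # literal character
                return 1
            if sym not in lengths:
                left, right = rules[sym]
                lengths[sym] = length(left) + length(right)
            return lengths[sym]
        for nt in rules:
            length(nt)
        return lengths

    def access(rules, lengths, start, i):
        # Return the i-th character (0-indexed) of the string derived from start.
        sym = start
        while sym in rules:
            left, right = rules[sym]
            left_len = lengths.get(left, 1)
            if i < left_len:
                sym = left
            else:
                i -= left_len
                sym = right
        return sym

    if __name__ == '__main__':
        # Derives "abab": X -> Y Y, Y -> a b
        rules = {'X': ('Y', 'Y'), 'Y': ('a', 'b')}
        lengths = expansion_lengths(rules)
        print(''.join(access(rules, lengths, 'X', i) for i in range(4)))  # abab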
In this talk we describe a dynamic external memory data structure that supports three-dimensional orthogonal range reporting queries in O(log_B^2 N + k/B) I/O operations, where k is the number of points in the answer and B is the block size. Our data structure uses O((N/B) log_2^2 N log_2^2 B) blocks of space and supports updates in O(log_2^3 N) amortized I/Os. This is the first dynamic data structure that answers three-dimensional range reporting queries in log_B^{O(1)} N + O(k/B) I/Os.
In this talk I will describe some approximation algorithms based on Semidefinite Programming, mainly MAX-CUT (the Goemans-Williamson algorithm) and an outline of some algorithms for coloring 3-colorable graphs. The lecture will more or less be around the topics of lectures 20 and 21 here: http://pages.cs.wisc.edu/~shuchi/courses/880-S07/ If there is interest I'll give some more talks in the next few weeks, talking about other approximation algorithms (e.g. embedding-based, iterative rounding) and maybe hardness of approximation results (based on the unique games conjecture).
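A compact sketch of the Goemans-Williamson approach (an assumed illustration, not code from the lecture): solve the MAX-CUT semidefinite relaxation and round with a random hyperplane. It assumes the numpy and cvxpy packages (with an SDP-capable solver) are available; the 4-cycle instance at the end is hypothetical.

    import numpy as np
    import cvxpy as cp

    def goemans_williamson_maxcut(W):
        # W is the symmetric weighted adjacency matrix of the graph.
        n = W.shape[0]
        X = cp.Variable((n, n), PSD=True)
        # SDP relaxation: maximize (1/4) * sum_{i,j} W_ij (1 - X_ij) s.t. X_ii = 1.
        problem = cp.Problem(cp.Maximize(cp.sum(cp.multiply(W, 1 - X)) / 4),
                             [cp.diag(X) == 1])
        problem.solve()
        # Factor X = V V^T (rows of V are the vectors) and round with a random hyperplane.
        eigvals, eigvecs = np.linalg.eigh(X.value)
        V = eigvecs * np.sqrt(np.clip(eigvals, 0, None))
        return np.sign(V @ np.random.randn(n))

    # Hypothetical instance: a 4-cycle, whose maximum cut has value 4.
    W = np.array([[0, 1, 0, 1],
                  [1, 0, 1, 0],
                  [0, 1, 0, 1],
                  [1, 0, 1, 0]], dtype=float)
    signs = goemans_williamson_maxcut(W)
    cut = sum(W[i, j] for i in range(4) for j in range(i + 1, 4) if signs[i] != signs[j])
    print(signs, cut)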
The kernel distance is the metric formed using a kernel (or similarity function). Specifically, given a positive definite K : R^d × R^d → R, then for two point sets P, Q in R^d the kernel distance is defined as D_K(P,Q) = √(K(P,P) + K(Q,Q) − 2K(P,Q)), where K(P,Q) = Σ_{p in P} Σ_{q in Q} K(p,q). This definition generalizes naturally to shapes (curves, surfaces), distributions, clusters, graphs, and trees. In the past 5 years or so, a flurry of work in medical imaging (where D_K is called the current distance) and machine learning (where D_K is called maximum mean discrepancy or MMD) has shown the practicality of this measure as well as its favorable relation to more classic measures such as EMD. In this talk, I will provide the first rigorous algorithmic analysis of the kernel distance. I will first reduce the kernel distance on smooth shapes and distributions with bounded error to the kernel distance on finite point sets. Then I will present near-linear algorithms on point sets that preserve the same bounded error, beating the naive quadratic runtime bound. Joint work with: Sarang Joshi, Raj Varma Kommaraju, and Suresh Venkatasubramanian.
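A minimal sketch (assumed, not from the talk) computing the kernel distance between two finite point sets with a Gaussian kernel; the bandwidth sigma and the random point sets are illustrative choices, and the quadratic running time is exactly the naive bound the talk improves upon.

    import numpy as np

    def gaussian_kernel(p, q, sigma=1.0):
        return np.exp(-np.sum((p - q) ** 2) / (2 * sigma ** 2))

    def set_kernel(P, Q, sigma=1.0):
        # K(P, Q) = sum over all pairs (p, q) of K(p, q); quadratic time.
        return sum(gaussian_kernel(p, q, sigma) for p in P for q in Q)

    def kernel_distance(P, Q, sigma=1.0):
        return np.sqrt(set_kernel(P, P, sigma) + set_kernel(Q, Q, sigma)
                       - 2 * set_kernel(P, Q, sigma))

    if __name__ == '__main__':
        rng = np.random.default_rng(0)
        P = rng.normal(0.0, 1.0, size=(50, 2))
        Q = rng.normal(0.5, 1.0, size=(50, 2))
        print(kernel_distance(P, Q))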
The two dimensional range minimum query problem is to preprocess a static two dimensional m by n array A of size N=mn, such that subsequent queries, asking for the position of the minimum element in a rectangular range within A, can be answered efficiently. We study the trade-off between the space and query time of the problem. We show that every algorithm enabled to access A during the query and using O(N/c) bits additional space requires Ω(c) query time, for any c where 1 ≤ c ≤ N. This lower bound holds for any dimension. In particular, for the one dimensional version of the problem, the lower bound is tight up to a constant factor. In two dimensions, we complement the lower bound with an indexing data structure of size O(N/c) bits additional space which can be preprocessed in O(N) time and achieves O(c log^2 c) query time. For c=O(1), this is the first O(1) query time algorithm using optimal O(N) bits additional space. For the case where queries cannot probe A, we give a data structure of size O(N min{m, log n}) bits with O(1) query time, assuming m ≤ n. This leaves a gap to the lower bound of Ω(N log m) bits for this version of the problem. Joint work with Gerth Stølting Brodal and Srinivasa S. Rao.
The first half of the talk reviews the standard economic approach to mechanism design. We assume that the preferences of the participants are random from a known distribution and solve for the mechanism that maximizes the objective, in our case, profit. The second half of the talk surveys three challenge areas for mechanism design and describes the role approximation plays in resolving them. Challenge 1: optimal mechanisms are parameterized by knowledge of the distribution of agents' private types. Challenge 2: optimal mechanisms require precise distributional information. Challenge 3: in multi-dimensional settings economic analysis has failed to characterize optimal mechanisms. The theory of approximation is well suited to address these challenges. While the optimal mechanism may be parameterized by the distribution of agents' private types, there may be a single mechanism that approximates the optimal mechanism for any distribution. While the optimal mechanism may require precise distributional assumptions, there may be an approximately optimal mechanism that depends only on natural characteristics of the distribution. While the multi-dimensional optimal mechanism may resist precise economic characterization, there may be a simple description of approximately optimal mechanisms. Finally, these approximately optimal mechanisms, because of their simplicity and tractability, may be much more likely to arise in practice, thus making the theory of approximately optimal mechanisms more descriptive than that of (precisely) optimal mechanisms. The talk will cover positive resolutions to these challenges with emphasis on basic techniques, relevance to practice, and future research directions.
Recent advances in echo-sounder technology mean that hydrographers can now obtain up to 2.2 billion soundings a day using a single multibeam echo sounder. Apart from the seabed and interesting features on it (such as pipelines), such data sets often also contain a lot of noise appearing as a result of scans of (shoals of) fish, multiple reflections, scanner self-reflections, refraction in gas bubbles, and so on. In this talk I consider the problem of automatically removing noisy points (cleaning) from massive sonar data point clouds. Existing cleaning methods are mostly based on considering data in a local neighborhood around a point to decide whether it is noise or not. Therefore they often fail to recognize large clusters of noisy points. I will describe a new algorithm that avoids the problems of local-neighborhood based algorithms and can identify large clusters of noisy points. The algorithm is theoretically I/O-efficient, but also relatively simple and thus practically efficient, partly due to the development of a new simple algorithm for computing the connected components of a graph embedded in the plane. The connected component algorithm is theoretically I/O-optimal under a practically realistic assumption about the input graph, and we believe it is of independent interest. I will conclude with an extensive discussion of possible future directions and open problems related to our cleaning approach. Joint work with: Lars Arge, Kasper Dalgaard Larsen, and Thomas Mølhave.
We consider rain falling at a uniform rate onto a terrain T represented as a TIN. Over time, water collects in the basins of T, forming lakes that spill into adjacent basins. Our goal is to compute, for each terrain vertex, the time at which this vertex is covered by water. We present an I/O-efficient algorithm that solves this problem using O(sort(X) log(X/M) + sort(N)) I/Os, where N is the number of terrain vertices and X is the number of pits of the terrain. Our algorithm assumes that the volumes and watersheds of the basins of T have been precomputed using existing methods. Joint work with Lars Arge and Norbert Zeh.
The timestamp problem captures a fundamental aspect of asynchronous distributed computing. It allows processes to label events with timestamps that provide information about the real-time ordering of those events. We consider the space complexity of wait-free implementations of timestamps from shared registers in a system of n processes. We prove an Ω(√n) lower bound on the number of registers required. If the timestamps are elements of a nowhere dense set, for example the integers, we prove a stronger, and tight, lower bound of n. However, if timestamps can be from an arbitrary set, we show how this bound can be beaten. This work is joint with Panagiota Fatourou and Eric Ruppert.
I will present an O(sort(m))-I/O algorithm for topologically sorting a directed acyclic graph (DAG) G with width O(M/B), assuming that a chain decomposition of G is given. Here M and B are the number of elements that fit in internal memory and a block, respectively, m is the number of edges in the DAG, and sort(m) denotes the I/O complexity of sorting m data items. I will then show that this result can be generalized to all DAGs and the assumption can be relaxed to obtaining a vertex-disjoint path cover of an acyclic supergraph of G consisting of O(M/B) directed paths. For some classes of DAGs, such a path cover can be obtained in O(sort(m) polylog(m)) I/Os, thereby obtaining a topological sorting algorithm of the same complexity for these graph classes. Joint work with: Norbert Zeh.
The k-order Voronoi diagram of a set of n sites S is a subdivision of R^d into a collection of convex cells, where every point in the interior of each d-face has the same k nearest sites. This talk will be devoted to defining the problem, showing the best known solution in the RAM model, which uses random incremental construction. Then I will cover known techniques to do random incremental construction in external memory by use of gradations and sampling.
Regular games provide a very useful model for the control and synthesis of reactive systems. The complexity of these games depends in large part on the representation of the winning condition: if it is represented through a win-set, a coloured condition, a Zielonka-DAG or an Emerson-Lei formula, the problem is PSPACE-complete; if it is represented as a Zielonka tree, the problem is in NP and in co-NP. In this talk, we show that explicit Muller games can be solved in polynomial time, and provide an effective algorithm.
A data structure is presented for the Mergeable Dictionary abstract data type, which supports the operations Predecessor-Search, Split, and Merge on a collection of disjoint sets of totally ordered data. While in a typical mergeable dictionary (e.g. 2-4 trees), the Merge operation can only be performed on sets that span disjoint intervals in keyspace, the structure here has no such limitation. A data structure which can handle arbitrary Merge operations in O(log n) amortized time in the absence of Split operations was presented by Brown and Tarjan. A data structure which can handle both Split and Merge operations in O(log^2 n) amortized time was presented by Farach and Thorup. In contrast, our data structure supports all operations, including Split and Merge, in O(log n) amortized time, thus showing that interleaved Merge operations can be supported at no additional cost vis-à-vis disjoint Merge operations. Joint work with Özgür Özkan.
I will survey the state of the art in lower bounds for dynamic data structures in the cell-probe model. This will include a recent (STOC'10) paper in which I describe a plausible attack on Ω(n^ε) lower bounds, and a conditional proof based on 3SUM-hardness. Also, I will talk about recent progress that yields bounds of the form: "if the update time is o(lg n / lg lg n), the query time must be Ω(n^{1-ε})".
This talk introduces the novel ecoinformatics research field, outlines the ECOINF group's research in the field, and provides an outlook towards future aims and possibilities, notably the potential for biology-computer science collaborations. Ecoinformatics is a novel quantitative computing approach to ecology and environmental science, analogous to the bioinformatics approach to genetics and evolutionary biology. The ecoinformatics approach relies on recent exponential gains in computing power and data storage capacity, advances in statistics and mathematics, and dramatic increases in availability of environmental, ecological, and biological data (including genetic data), resulting from large digitization efforts, and increasingly automated data capture. It involves the application of advanced techniques from computer science and statistics to manage and analyze large amounts of biological, environmental and geographical data. It is representative of the broader change from an experimental paradigm to a computing paradigm that is happening across science (see Nature's 2006 "2020 computing" and 2008 "Big data" special features). Ecoinformatics is revolutionizing ecology and environmental science by making it possible effectively and comprehensively to investigate the complex and often large-scale problems that are at the core of this research field (such as "What determines species diversity?", which Science highlighted as one of the 25 most important topics for modern scientific research in its 125th anniversary issue in 2005) and/or that biology and ecology is increasingly expected to provide quantitative solutions for (such as climate change impacts, adaptation and mitigation possibilities).
We introduce a new connection between IO-efficient algorithms and secure computation. We use this to design protocols for a setting where a set of n players use a set of m servers to store a large data set. Later the players want to compute on the data without the servers needing to know which computation is done, while the computation should be secure against an adversary corrupting a constant fraction of the players and servers. Using packed secret sharing the data can be stored in a compact way but will only be accessible in a block-wise fashion. We explore the possibility of using IO-efficient algorithms to nevertheless compute on the data as efficiently as if random access was possible. We show that for sorting this is indeed the case, by showing how to evaluate the odd-even merge sort network I/O-efficiently, and by giving efficient protocols for reading and writing blocks of data to the servers.
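A minimal sketch (assumed, not the protocol above) of Batcher's odd-even merge sorting network: the comparator sequence is data-oblivious, which is what makes it suitable for evaluation inside secure computation. The input length is assumed to be a power of two.

    def oddeven_merge(lo, hi, r):
        # Yield the comparators that merge two sorted interleaved subsequences.
        step = r * 2
        if step < hi - lo:
            yield from oddeven_merge(lo, hi, step)
            yield from oddeven_merge(lo + r, hi, step)
            yield from ((i, i + r) for i in range(lo + r, hi - r, step))
        else:
            yield (lo, lo + r)

    def oddeven_merge_sort_range(lo, hi):
        # Sort positions lo..hi (inclusive) by sorting both halves, then merging.
        if (hi - lo) >= 1:
            mid = lo + ((hi - lo) // 2)
            yield from oddeven_merge_sort_range(lo, mid)
            yield from oddeven_merge_sort_range(mid + 1, hi)
            yield from oddeven_merge(lo, hi, 1)

    def oddeven_merge_sort(length):
        yield from oddeven_merge_sort_range(0, length - 1)

    def sort_with_network(values):
        a = list(values)
        for i, j in oddeven_merge_sort(len(a)):
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
        return a

    print(sort_with_network([5, 3, 8, 1, 9, 2, 7, 4]))  # length must be a power of two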
We present aggregate separation bounds, named after Davenport-Mahler-Mignotte (DMM), on the isolated roots of polynomial systems, specifically on the minimum distance between any two such roots. The bounds exploit the structure of the system and the height of the sparse (or toric) resultant by means of mixed volume, as well as recent advances on aggregate root bounds for univariate polynomials.
Orthogonal range reporting is the problem of storing a set of n points in d-dimensional space, such that the k points in an axis-orthogonal query box can be reported efficiently. While the 2-d version of the problem was completely characterized in the pointer machine model more than two decades ago, this is not the case in higher dimensions. In this talk we provide a space-optimal pointer machine data structure for 3-d orthogonal range reporting that answers queries in O(log n+k) time. Thus, we settle the complexity of the problem in 3-d. We use this result to obtain improved structures in higher dimensions, namely structures with a log n / log log n factor increase in space and query time per dimension. We also prove the first non-trivial query lower bound for the problem in the pointer machine model of computation. At the end, we shall discuss some intriguing open problems that come from our results. Joint work with: Lars Arge and Kasper Dalgaard Larsen
We study the well-known problem of simplifying a polygonal path P by a coarser one Q, whose vertices are a subset of P's vertices. For a given error, our goal is to find the minimum number of vertices while preserving the homotopy with respect to the geometric objects on both sides of P. De Berg et al. [deBerg95] presented an O(n(m+n) log n) time algorithm which works on x-monotone paths, where n is the number of points of the polygonal path and m is the number of extra points on both sides of the input path. Here, we present a general method for homotopy-preserving simplification under any desired measure. This algorithm runs in O((n+m) log(n+m) + k) time, where k is the number of eligible edges. Using this method we improve the running time of the algorithm of de Berg et al. for the Hausdorff measure while working on general paths. To the best of our knowledge this is the first homotopy-preserving simplification algorithm that guarantees the minimum number of links on general paths. Joint work with Mohammad Ghodsi.
We study the computational complexity of computing the noncommutative determinant. We first consider the arithmetic circuit complexity of computing the noncommutative determinant polynomial. Then, more generally, we also examine the complexity of computing the determinant (as a function) over noncommutative domains. Our hardness results are summarized below: 1. We show that if the noncommutative determinant polynomial has small noncommutative arithmetic circuits then so does the noncommutative permanent, which would imply that the commutative permanent polynomial would have small commutative arithmetic circuits. 2. For any field F we show that computing the n × n permanent over F is polynomial-time reducible to computing the 2n × 2n (noncommutative) determinant whose entries are O(n^2) × O(n^2) matrices over the field F. 3. We also derive as a consequence that computing the n × n permanent over nonnegative rationals is polynomial-time reducible to computing the noncommutative determinant over Clifford algebras of n^{O(1)} dimension. Our techniques are elementary and use primarily the notion of the Hadamard Product of noncommutative polynomials. Joint work with Vikraman Arvind.
We study computational geometry problems in the parallel external memory (PEM) model. PEM is a parallel version of the external memory model of Aggarwal and Vitter. We solve the problems of 2-d dominance counting, 3-d maxima, visibility from a point and 2-d convex hull using simple techniques of the PEM model. We also introduce a parallel version of the distribution sweeping technique, which we use to solve output-sensitive problems, such as orthogonal line segment intersection reporting, batched range searching and other related problems. Joint work with Deepak Ajwani and Norbert Zeh.
Integer optimization is an important component of Operations Research, and it is used for solving many problems that arise in practice. The state-of-the-art approach for solving integer problems uses so-called cutting-planes to cut off points that have fractional coordinates. Recently a new cutting-plane theory has emerged which is based on lattice-free sets. Lattice-free sets are convex sets whose interior does not contain integer points. The interior of a lattice-free set therefore does not contain any feasible solutions to an integer optimization problem. In this talk we survey this new approach and present some of the results that have been obtained. We also consider algorithmic consequences. In particular we present a new class of cutting planes that have a number of desirable theoretical properties. We call these cutting-planes zero-coefficient cuts. In terms of quality, zero-coefficient cuts are those cuts that are violated the most, and in terms of efficiency, zero-coefficient cuts can be identified in polynomial time. Initial computational experiments suggest that zero-coefficient cuts could be an interesting class of cutting-planes to include in practical software for mixed integer optimization.
We consider the problem of maintaining dynamically a set of points in the plane and supporting range queries of the type [a,b] × (−∞, c]. We assume that the inserted points have their x-coordinates drawn from a class of smooth distributions, whereas the y-coordinates are arbitrarily distributed. The points to be deleted are selected uniformly at random among the inserted points. For the RAM model, we present a linear space data structure that supports queries in O(log log n + t) expected time with high probability and updates in O(log log n) expected amortized time, where n is the number of points stored and t is the size of the output of the query. For the I/O model we support queries in O(log log_B n + t/B) expected I/Os with high probability and updates in O(log_B log n) expected amortized I/Os using linear space, where B is the disk block size. The data structures are deterministic and the expectation is with respect to the input distribution. Joint work with: G.S. Brodal, A. Kaporis, S. Sioutas, K. Tsichlas.
We consider the problem of approximating a set P of n points in R^d by a j-dimensional subspace under the l_p measure, in which we wish to minimize the sum of l_p distances from each point of P to this subspace. More generally, the F_q(l_p)-subspace approximation problem asks for a j-subspace that minimizes the sum of q-th powers of l_p-distances to this subspace, up to a multiplicative factor of (1+ε). We develop techniques for subspace approximation, regression, and matrix approximation that can be used to deal with massive data sets in high dimensional spaces. In particular, we develop coresets and sketches, i.e. small space representations that approximate the input point set P with respect to the subspace approximation problem. Among the results we propose a dimensionality reduction method for various clustering measures, strong coresets, a PTAS, and streaming algorithms in the bounded and unbounded precision models for F_1(l_2)-subspace approximation in high-dimensional spaces. Joint work with: Dan Feldman, Christian Sohler, David P. Woodruff.
In this talk we present a data structure giving a 3-approximation of the range mode with constant time queries and linear space. We present a data structure giving a (1+ε)-approximation of the range mode in O(log(1/ε)) time and O(n/ε) space. Finally we present some work on the related Fixed Range Frequency problem. Joint work with: Mark Greve, Allan Grønlund Jørgensen and Kasper Dalgaard Larsen.
Everyone knows the determinant of a square matrix. The permanent is less known though it is almost the same thing, one just omits all the switching of signs. Note that in characteristic 2 the permanent and determinant are the same thing. One can compute the determinant of a given matrix in polynomial time using Gaussian elimination. However this does not hold for the permanent, assuming the characteristic is not 2, since it is not multiplicative. One might ask if there is some way to compute the permanent by "blowing up and applying the determinant". More precisely can one compute the permanent by constructing a bigger matrix with linear (affine) polynomials as entries, such that the permanent equals the determinant applied to this bigger matrix? Valiant has shown that such maps exist; in fact all polynomials can be written as the determinant of some matrix with linear entries. The topic of this talk is to establish a lower bound of how fast these matrices grow as the original matrices grow. A result due to Mignon and Ressayre gives by use of Hessian matrices that the growth is at least quadratic, when working over a field of characteristic 0. This has been generalized by Cai, Chen and Li to hold over fields of odd characteristic. For a variety X there is a link from the rank of a Hessian matrix to the dimension of the dual variety. We will outline a translation of the work of Mignon, Ressayre, Cai, Chen and Li into the language of dual varieties.
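A small worked example (assumed, not from the talk) showing the two polynomials side by side by brute-force expansion over permutations; for [[1, 2], [3, 4]] the determinant is 1·4 − 2·3 = −2 while the permanent is 1·4 + 2·3 = 10.

    from itertools import permutations

    def det(A):
        # Determinant by the Leibniz formula: signed sum over all permutations.
        n = len(A)
        total = 0
        for p in permutations(range(n)):
            sign = 1
            for i in range(n):            # sign via inversion count
                for j in range(i + 1, n):
                    if p[i] > p[j]:
                        sign = -sign
            prod = 1
            for i in range(n):
                prod *= A[i][p[i]]
            total += sign * prod
        return total

    def perm(A):
        # Permanent: same expansion with all signs +1.
        n = len(A)
        total = 0
        for p in permutations(range(n)):
            prod = 1
            for i in range(n):
                prod *= A[i][p[i]]
            total += prod
        return total

    if __name__ == '__main__':
        A = [[1, 2], [3, 4]]
        print(det(A), perm(A))  # -2 and 10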
In this talk we present an I/O-efficient version of an algorithm for simplifying contour trees of two- and three-dimensional scalar fields described by Carr et al. Our algorithm uses optimal O(sort(N)) I/Os where N is the size of the contour tree. Like the algorithm of Carr et al. our algorithm can perform the simplification based on a number of local geometric measures associated with the individual contours. Joint work with: Lars Arge
Consider the following problem: a large data string x ∈ {0,1}^k is stored on a storage device in some encoded form. A fraction of the stored information might have been corrupted. An algorithm needs the data x as input. But most likely the algorithm will only need a small fraction (say t bits) of the data for its execution. Thus decoding the whole message is not economical. Also the algorithm is adaptive and thus does not know which bits it will need beforehand. So ideally it would like to decode any bit of x only when the need arises, and would like to decode the bit by looking at a very small fraction of the encoded data. To ensure correctness of the algorithm we would hope that the algorithm manages to correctly decode all the t bits with high probability (say > 2/3). This example motivated us to define reconstructible codes. A k-query t-bit reconstructible code allows one to probabilistically decode any bit of an encoded message by probing only k bits of its corrupted encoding, such that any t bits of the message are decoded correctly with high probability. An LDC can be converted into a reconstructible code by amplification of the success probability. But the question we ask is what is the best query complexity of a t-bit reconstructible code. We will study the limitations on reconstructible codes and see how they relate to other areas in computer science.
Identical products being sold at different prices in different locations is a common phenomenon. Price differences might occur due to different reasons such as shipping costs, trade restrictions and price discrimination. This is modeled in traditional market models by adding a production that transfers the goods from one location to another. However, this approach is not always satisfactory since it would have the market determine the cost of production as well, as opposed to it being exogenously specified, and it would unnecessarily blow up the number of goods in the market, which affects the computational complexity of finding an equilibrium. We give a more direct way to model such scenarios by supplementing the classical Fisher model of a market by introducing transaction costs. For every buyer i and every good j, there is a transaction cost of c(i,j); if the price of good j is p(j) then the cost to the buyer i per unit of j is p(j) + c(i,j). This allows the same good to be sold at different (effective) prices to different buyers. We study questions regarding existence, uniqueness and computability of equilibrium in such a model. Joint work with Chinmay Karande and Nikhil Devanur.
In this talk I will describe a new technique for proving data structure lower bounds based on statistical reasoning. I will present it in the context of new (as in submitted-5-days-ago new) lower bounds for dynamic membership in the external memory model; the general technique, however, might (and should!) have further applications. The external memory model is just like the cell probe model, except that it has a free-to-access cache of size m, and the cell size is typically w = polylog(n). The cell-probe model for data structures counts only cell-accesses, so computation is free. One of the most interesting features of the external memory model is that it allows one to achieve sub-constant update time by writing multiple items to the cache, and then writing them to memory at the same time using only one probe (just like what is done in practice when paging). This is called *buffering*. There is a data structure called the buffer tree, which achieves update time roughly O(log^2(n)/w) and query time O(log n); it works for multiple problems, among them membership, predecessor search, rank/select, 1-d range counting, etc. For w = log^9(n), for example, the update time here is subconstant. We prove that if one wants to keep the update time less than 0.999 (or any constant < 1), it is impossible to reduce the query time to less than logarithmic (namely, to o(log_{w log n}(n/m))). Thus one has a choice between two sides of a dichotomy: either buffer very well but take at least logarithmic query time, or use the old data structures from the RAM model which do not buffer at all, and have a shot at sublogarithmic query time. To restate, we prove that for membership data structures, in order to get update time 0.999, the query time has to be at least logarithmic. This is a *threshold phenomenon for data structures*, since when the update time is allowed to be 1+o(1), then a bit vector or hash table gives query time O(1). The proof of our lower bound is based on statistical reasoning, and in particular on a new combinatorial lemma called the Lemma Of Surprising Intersections (LOSI). The LOSI allows us to use a proof methodology where we first analyze the intersection structure of the positive queries by using encoding arguments, and then use statistical arguments to deduce properties of the intersection structure of *all* queries, even the negative ones. We have not previously seen this way of arguing about the negative queries, and we suspect it might have further applications. Joint work with: Qin Zhang, HKUST.
The past fifteen years have seen an amazing burst of research in streaming algorithms, starting with the famous paper by Alon, Matias, and Szegedy. In particular, space-efficient algorithms are possible for several functions, when you allow both randomness and approximation. However, this comes at a price: most of the algorithms pay a penalty in space that is quadratic in the approximation factor. In 2003, Woodruff and Indyk showed that this penalty was necessary in the case of one-pass data streams, using a reduction from the one-way complexity of the Gap Hamming problem. However, it has been a long-standing open problem whether the same lower bound applies to multipass algorithms. We extend this lower bound so it holds even when a constant number of rounds of communication are allowed. As a result, we show that even with O(1) passes, streaming algorithms must pay a high price for their approximation. Our main technique combines a Round Elimination Lemma with several isoperimetric inequalities. Joint work with Amit Chakrabarti, Ishay Haviv, Oded Regev, Thomas Vidick, and Ronald de Wolf.
Computational topology is attractive for visualization tasks because it provides abstractions of scalar fields that can be exploited either directly or indirectly to extract information and knowledge from underlying data sets. The first part of this talk will provide a brief survey of the applications reported to date, while the second part will focus on unresolved questions relating to adaptive meshing, higher-order interpolation, and applications to multi-variate data.
In this talk we design data structures supporting range median queries, i.e. report the median element in a sub-range of an array. We consider static and dynamic data structures and batched queries. Our data structures support range selection queries, which are more general, and dominance queries (range rank). In the static case our data structure uses linear space and queries are supported in O(log n / log log n) time. Our dynamic data structure uses O(n log n / log log n) space and supports queries and updates in O((log n / log log n)^2) time.
A Boolean function f on n inputs is called a sign-function of an integer polynomial p on n variables if for all n-bit strings x it holds that p(x)>0 if f(x)=1 and p(x)<0 if f(x)=0. In this case p is called a perceptron for f. The degree of the perceptron is simply the degree of the polynomial and the weight of the perceptron is the sum of the absolute values of its coefficients. In this talk we will be interested in the smallest possible weight of a perceptron of a given degree for a given function. We will review upper and lower bounds on this quantity and give some proof ideas. The talk will be based on the following papers: Vladimir V. Podolskii. Perceptrons of Large Weight. Problems of Information Transmission. 45(1):51-59, 2009. Vladimir V. Podolskii. A Uniform Lower Bound on Weights of Perceptrons. Proc. of the Third International Computer Science Symposium in Russia (CSR), pp. 261-272, 2008.
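A tiny illustrative check (assumed, not from the talk): the 3-bit majority function is sign-represented by the degree-1 perceptron p(x) = 2(x1 + x2 + x3) − 3, whose weight is |2|+|2|+|2|+|−3| = 9.

    from itertools import product

    def majority(bits):
        return 1 if sum(bits) > len(bits) / 2 else 0

    def p(bits):
        return 2 * sum(bits) - 3

    def sign_represents(f, poly, n):
        # Check p(x) > 0 whenever f(x) = 1 and p(x) < 0 whenever f(x) = 0.
        for x in product([0, 1], repeat=n):
            value = poly(x)
            if f(x) == 1 and not value > 0:
                return False
            if f(x) == 0 and not value < 0:
                return False
        return True

    print(sign_represents(majority, p, 3))       # True
    print(sum(abs(c) for c in (2, 2, 2, -3)))    # weight of the perceptron: 9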
In orthogonal range reporting we are to preprocess N points in d-dimensional space so that the points inside a d-dimensional axis-aligned query box can be reported efficiently. This is a fundamental problem in various fields, including spatial databases and computational geometry. In this talk we show a number of improvements for three and higher dimensional orthogonal range reporting: * In the pointer machine model, we improve all the best previous results, some of which have not seen any improvements in almost two decades. * In the I/O-model, we improve the previously known three-dimensional structures and provide the first (nontrivial) structures for four and higher dimensions. Joint work with: Peyman Afshani and Lars Arge.
We give new results on the number-on-the-forehead (NOF) communication complexity of the multiparty pointer jumping problem. The original motivation for this problem comes from circuit complexity. Specifically, there is no explicit function known to lie outside the complexity class ACC^0. However, a long line of research in the early 90's showed that a sufficiently strong NOF communication lower bound for a function would place it outside ACC^0. Pointer jumping is widely considered to be a strong candidate for such a lower bound. We give a surprising general upper bound for this problem, as well as tight upper and lower bounds for a restricted class of protocols. Part of this talk was joint work with Amit Chakrabarti.
We reexamine what it means to compute Nash equilibria and, more generally, what it means to compute a fixed point of a given Brouwer function, and we investigate the complexity of the associated problems. Specifically, we study the complexity of the following problem: given a finite game with 3 or more players, and given ε > 0, approximate some actual Nash equilibrium to within distance ε. We show that for games with 3 (or more) players, approximating an actual NE within any non-trivial distance is at least as hard as long-standing open problems in numerical computation (the square-root-sum problem and arithmetic circuit decision problems), which are not known to be in NP nor in the polynomial time hierarchy. Thus, approximating an actual NE for 3-player games is much harder, in our present state of knowledge, than the PPAD-complete problems of computing an ε-NE for 3-player games, and of computing an actual NE for 2-player games. We define a new complexity class, FIXP, which captures search problems that can be cast as fixed point computation problems for functions represented by algebraic circuits (straight line programs) over the basis {+, *, max} with rational coefficients. We show that the computation (approximate or otherwise) of actual Nash equilibria for 3 or more players is FIXP-complete. We show that PPAD is precisely equal to the piecewise-linear fragment of FIXP where the basis is restricted to the operators {+, max}. Many other important problems in game theory, economics, and probability theory can be formulated as fixed point problems for such algebraic functions. We discuss several such problems: computing economic market equilibria, computing the value of Shapley's stochastic games and Condon's simpler games, and computing extinction probabilities of branching processes. We show that for approximation, or even exact computation, some of these problems can be placed in PPAD, while others are at least as hard as the square-root-sum and arithmetic circuit decision problems, and some are FIXP-complete.
Policy iteration algorithms are a simple family of algorithms that can be applied in many different settings, ranging from the relatively simple problem of finding a minimum mean weight cycle in a graph, through the more challenging solution of Markov Decision Processes (MDPs), to the solution of 2-player full information stochastic games, also known as Simple Stochastic Games (SSGs). It was recently shown by Friedmann that the worst case running time of a natural deterministic version of the policy iteration algorithm, when applied to Parity Games (PGs), is exponential. It is still open, however, whether deterministic policy iteration algorithms can solve Markov Decision Processes in polynomial time, and whether randomized policy iteration algorithms can solve Simple Stochastic Games in polynomial time. The talk will survey what is known regarding policy iteration algorithms and mention many intriguing open problems.
We consider the problem of representing, in a space-efficient way, a function f: S → Σ such that any function value can be computed in constant time on a RAM. Specifically, our aim is to achieve space usage close to the 0th order entropy of the sequence of function values. Our technique works for any set S of machine words, without storing S, which is crucial for applications. Our contribution consists of two new techniques, of independent interest, that we use in combination with an existing result of Dietzfelbinger and Pagh (ICALP 2008). First of all, we introduce a way to support more space efficient approximate membership queries (Bloom filter functionality) with arbitrary false positive rate. Second, we present a variation of Huffman coding using approximate membership, providing an alternative that improves the classical bounds of Gallager (IEEE Trans. Information Theory, 1978) in some cases. The end result is an entropy-compressed function supporting constant time random access to values associated with a given set S. This improves both space and time compared to a recent result by Talbot and Talbot (ANALCO 2008). Joint work with Johannes Hreinsson and Morten Krøyer.
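A minimal sketch (assumed, not the construction from the talk) of plain approximate membership, i.e. Bloom filter functionality: false positives occur at a tunable rate and there are no false negatives. The hashing scheme and parameters here are purely illustrative.

    import hashlib

    class BloomFilter:
        def __init__(self, m_bits, k_hashes):
            self.m = m_bits
            self.k = k_hashes
            self.bits = bytearray((m_bits + 7) // 8)

        def _positions(self, item):
            # Derive k bit positions from salted SHA-256 digests (illustrative only).
            for i in range(self.k):
                h = hashlib.sha256(f"{i}:{item}".encode()).hexdigest()
                yield int(h, 16) % self.m

        def add(self, item):
            for pos in self._positions(item):
                self.bits[pos // 8] |= 1 << (pos % 8)

        def __contains__(self, item):
            return all(self.bits[pos // 8] & (1 << (pos % 8))
                       for pos in self._positions(item))

    bf = BloomFilter(m_bits=1024, k_hashes=4)
    for word in ["foo", "bar", "baz"]:
        bf.add(word)
    print("foo" in bf, "qux" in bf)  # True, and almost certainly False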
Standard worst-case analysis of algorithms has often been criticized as overly pessimistic. As a remedy, some researchers have turned towards adaptive analysis where the cost of algorithms is measured as a function of not just the input size but other parameters, such as the output size. The ultimate in adaptive algorithms is an instance-optimal algorithm, i.e., an algorithm whose cost is at most a constant factor from the cost of any other algorithm running on the same input, for every input instance. In other words, an instance-optimal algorithm cannot be beaten by any other algorithm on any input. For many problems, this requirement is too stringent, but we show that if we ignore the order of the input elements (i.e., we assume the input is given in the worst case order), then adaptive algorithms exist for many fundamental geometric problems such as convex hull. Thus, these convex hull algorithms are optimal with respect to all the measures of difficulty that are independent of the order, such as output-size, spread of the input point set or more complicated quantities like the expected size of the convex hull of a random sample. Joint work with: Jeremy Barbay and Timothy M. Chan.
Parameterized complexity aims to design exact algorithms whose running times depend on certain parameters of the input data that are naturally related to the problem at hand and in a way capture its complexity. A problem is called fixed-parameter tractable (FPT) with respect to a parameter k if there is an efficient algorithm to solve the problem for the cases where the parameter k is small. Another objective of this theory is to show that such algorithms are unlikely to exist for certain problems (and parameters). Not many geometric problems have been studied from the parameterized complexity point of view. Most research has focused on special (combinatorial) parameters for geometric problems, like, e.g., the number of inner points (i.e., points in the interior of the convex hull) for the TSP problem or for the problem of computing minimum convex decompositions. Also, on the negative side, only few connections between geometric problems and known hard parameterized problems are known to date. We provide a brief tour of results from parameterized complexity theory for various geometric problems (e.g. hyperplane depth, clustering) with a focus on the dimension as the parameter. Our results indicate that all these problems are inherently difficult in higher dimensions. Joint work with: Sergio Cabello, Panos Giannopoulos and Günter Rote.
In recent years, a number of cache-oblivious range search structures for 3-sided range reporting in the plane, 3-d dominance reporting, and 3-d halfspace range reporting with the optimal query bound of O(log_B N + K/B) I/Os have been developed. These structures use O(N log N) space, while the best cache-aware structures for these problems use (almost) linear space. This raises the question whether linear-space cache-oblivious structures with the optimal query bound exist for these problems. We give a negative answer to this question by proving that Ω(N (log log N)^ε) space is necessary to achieve the optimal query bound cache-obliviously. This is the first result that provides a gap between the resource consumption of a cache-oblivious algorithm or data structure and its cache-aware counterpart that grows with the input size.
Let (S, d) be a finite metric space, where each element p ∈ S has a non-negative weight wt(p). We study t-spanners for the set S with respect to the following weighted distance function d_w: d_w(p, q) = 0 if p = q, and d_w(p, q) = wt(p) + d(p, q) + wt(q) if p ≠ q. We present a general method for turning spanners with respect to the d-metric into spanners with respect to the d_w-metric. For any given ε > 0, we can apply our method to obtain (5+ε)-spanners with a linear number of edges for three cases: points in Euclidean space R^d, points in spaces of bounded doubling dimension, and points on the boundary of a convex body in R^d where d is the geodesic distance function. We also describe an alternative method that leads to (2+ε)-spanners for points in R^d and for points on the boundary of a convex body in R^d. The number of edges in these spanners is O(n log n). This bound on the stretch factor is nearly optimal: in any finite metric space and for any ε > 0, it is possible to assign weights to the elements such that any non-complete graph has stretch factor larger than 2 − ε.
Model checking is a popular approach for formal verification of reactive systems. However, usage of this method is limited by the so-called state space explosion. One way to cope with this problem is to represent the model and the state space symbolically by using Binary Decision Diagrams (BDDs). Unfortunately, during the computation the BDD can become too large to fit into the available main memory and it becomes essential to minimize the number of I/O operations. In this presentation, we will first talk about model checking and related problems in general. After that we will give a short introduction to BDDs and we will discuss their usage in symbolic model checking. Finally, we will briefly mention previous work on I/O-efficient BDD manipulation and we will talk about our research in this area. We will present the general idea of our new vector existential quantification algorithm, which is the first I/O-efficient algorithm to solve this or a similarly complex problem connected to BDDs.
The B-tree is a fundamental external index structure that is widely used for answering one-dimensional range reporting queries. Given a set of N keys, a range query can be answered in O(log_B(N/M) + K/B) I/Os, where B is the disk block size, K the output size, and M the size of the main memory buffer. When keys are inserted or deleted, the B-tree is updated in O(log_B N) I/Os, if we require the resulting changes to be committed to disk right away. Otherwise, the memory buffer can be used to buffer the recent updates, and changes can be written to disk in batches, which significantly lowers the amortized update cost. A systematic way of batching up updates is to use the logarithmic method, combined with fractional cascading, resulting in a dynamic B-tree that supports insertions in O((1/B) log(N/M)) I/Os and queries in O(log(N/M) + K/B) I/Os. Such bounds have also been matched by several known dynamic B-tree variants in the database literature. Note, however, that the query cost of these dynamic B-trees is substantially worse than the O(log_B(N/M) + K/B) bound of the static B-tree, by a factor of O(log B). In this paper, we prove that for any dynamic one dimensional range query index structure with query cost O(q + K/B) and amortized insertion cost O(u/B), the tradeoff q log(u/q) = Ω(log B) must hold if q = O(log B). For most reasonable values of the parameters, we have N/M = B^{O(1)}, in which case our query-insertion tradeoff implies that the bounds mentioned above are already optimal. We also prove a lower bound of u log q = Ω(log B), which is relevant for larger values of q. Our lower bounds hold in a dynamic version of the indexability model, which is of independent interest. Dynamic indexability is a clean yet powerful model for studying dynamic indexing problems, and can potentially lead to more interesting complexity results.
Real algebraic numbers are the numbers that are real solutions of a univariate polynomial with integer coefficients, and they are usually represented using the isolating interval representation, that is, a square-free polynomial and an isolating interval. To construct such numbers we need to isolate the real roots of a univariate polynomial. We will present some recent complexity and algorithmic results for the problem of real root isolation, and we will sketch ongoing work on random polynomials. Algebraic algorithms, and in particular computations with real algebraic numbers and (real) solving of univariate polynomials and polynomial systems, are powerful tools that could be used in many applications. We will present some recent advances in the problem of symmetric tensor decomposition, and in non-linear computational geometry. The latter includes the arrangement of conic arcs, the Voronoi diagram of convex smooth pseudo-circles, and the computation of the topology of a real plane algebraic curve. If time permits, we will also present algorithms that, given a convex lattice polygon, test whether it can be decomposed into a Minkowski sum of two other such polygons and if so, find one or all such decompositions. The problem is closely related to the problem of bivariate polynomial factorization.
Abstract: Through the example of the BIG MATCH, we give a gentle invitation to the theory of two-player zero-sum stochastic games with limiting average payoffs. Such games were shown to have values in a celebrated 1981 paper by Mertens and Neyman, building on earlier work of Bewley and Kohlberg. A key ingredient of this earlier work is an elegant use of the Tarski transfer theorem. In 2009, Solan and Vieille published an algorithmic version of the theorem of Mertens and Neyman, but without any non-trivial upper bound on the computational resources used by the algorithm. Such a bound would seem to require new theorems of semi-algebraic geometry. We shall discuss as much of the above as the patience of the audience and the ability of the lecturer permits, and save the rest of the story for a later meeting.
The founders of Google introduced the PageRank algorithm that computes an estimate of the popularity of each web page based solely on the link structure of the web graph - these estimates are the so-called PageRank values. A page will achieve one of the top spots of search results if it has a high PageRank value and it matches the search criteria for the actual Google search. For a given node t in a directed graph G(V,E) and a positive integer k, we study the problem of computing a set of k new links pointing to t - so-called backlinks to t - producing the maximum increase in the PageRank value of t. This problem is known as "Link Building" in the WWW context. We present a theorem showing how the topology of the graph comes into play when evaluating potential new backlinks. Based on the theorem we show that no FPTAS exists for Link Building under the assumption NP ≠ P, and we also show that Link Building is W[1]-hard - strongly suggesting that Link Building is not fixed-parameter tractable. We prove these results by reduction from the independent set problem on regular graphs. Finally, we use the theorem to characterize sets of backlinks producing a significant increase in the PageRank value of t.
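As a purely illustrative sketch (not the method of the talk, which analyzes the problem structurally), one can score candidate backlinks empirically by recomputing PageRank after adding each candidate edge. The damping factor 0.85, the toy graph, and the brute-force recomputation are assumptions for the example only.

```python
# Illustrative only: score single candidate backlinks to a target node by
# recomputing PageRank. The talk concerns the hardness of choosing an optimal
# *set* of k backlinks, which this greedy probe does not solve.

def pagerank(edges, n, d=0.85, iters=100):
    """Power iteration for PageRank on a directed graph given as a set of (u, v) edges."""
    out = [[] for _ in range(n)]
    for u, v in edges:
        out[u].append(v)
    pr = [1.0 / n] * n
    for _ in range(iters):
        nxt = [(1.0 - d) / n] * n
        for u in range(n):
            if out[u]:
                share = d * pr[u] / len(out[u])
                for v in out[u]:
                    nxt[v] += share
            else:                       # dangling node: spread its mass uniformly
                for v in range(n):
                    nxt[v] += d * pr[u] / n
        pr = nxt
    return pr

def backlink_gain(edges, n, target, source):
    """PageRank increase of `target` when the single backlink (source, target) is added."""
    base = pagerank(edges, n)[target]
    return pagerank(edges | {(source, target)}, n)[target] - base

if __name__ == "__main__":
    edges = {(0, 1), (1, 2), (2, 0), (3, 0)}     # assumed toy 4-node web graph
    gains = {s: backlink_gain(edges, 4, target=2, source=s) for s in (0, 3)}
    print(gains)                                  # which candidate source helps node 2 most?
```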
Semi-algebraic geometry is concerned with the set of real solutions of systems of polynomial equations and inequalities, called semi-algebraic sets. In this lecture we will introduce semi-algebraic sets from basics. Next we will prove the Tarski-Seidenberg theorem, that semi-algebraic sets in (n+1)-dimensional space project to semi-algebraic sets in n-dimensional space. Coupled with parametric versions of real root counting techniques such as Sturm sequences (which we will also cover from basics), this gives us a decision procedure for determining if the given system has a real solution. This lecture marks the beginning of a series of lectures on semi-algebraic geometry. The intent is that it will over time transform into a reading/lecture series working towards research problems.
In this talk we shall classify several group-theoretic computational problems into the classes PZK and SZK (problems with perfect/statistical zero-knowledge proofs respectively). Previously, these problems were known to be in AM and coAM. As PZK and SZK are subsets of AM as well as coAM, we have a tighter upper bound for these problems. Specifically: 1) We will show that the permutation group problems Coset Intersection, Double Coset Membership, Group Conjugacy are in PZK. We will also show that permutation group isomorphism for solvable groups is in PZK. As an ingredient of this protocol, we design a randomized algorithm for sampling short presentations of solvable permutation groups. 2) We shall show that the above problems for black-box groups are in SZK. This talk is based on a joint work with V. Arvind.
A terrain S is the graph of a bivariate function. We assume that S is represented as a triangulated surface with n vertices. A "contour" of S is a connected component of a level set of S. Generically, each contour is a closed polygonal curve; at "critical" levels these curves may touch each other. We present I/O-efficient algorithms for the following two problems related to computing contours of S: Given two real parameters h and d > 0, we present an I/O-optimal algorithm that reports all contours of S at heights h + kd, for every positive integer k, using O(Sort(N) + T/B) I/Os, where T is the total number of edges in the output contours, B is the "block size," and Sort(N) is the number of I/Os needed to sort N elements. The algorithm uses O(N/B) disk blocks. Each contour is generated individually with its composing segments sorted in clockwise order. We can preprocess S, using O(Sort(N)) I/Os, into a linear-size data structure so that all contours at a given height can be reported using O(log_B N + T/B) I/Os, where T is the output size. Each contour is generated individually with its composing segments sorted in clockwise order. Joint work with: Lars Arge, Pankaj K. Agarwal, and Bardia Sadri
The problem of estimating the pth moment F_p (p nonnegative and real) in data streams is as follows. There is a vector x which starts at 0, and many updates of the form x_i <-- x_i + v come sequentially in a stream. The algorithm also receives an error parameter 0 < eps < 1. The goal is then to output an approximation with relative error at most eps to F_p = ||x||_p^p. Previously, it was known that polylogarithmic space (in the vector length n) was achievable if and only if p <= 2. We make several new contributions in this regime, including: (*) An optimal space algorithm for 0 < p < 2, which, unlike previous algorithms which had optimal dependence on 1/eps but sub-optimal dependence on n, does not rely on Nisan's PRG. (*) A near-optimal space algorithm for p = 0 with optimal update and query time. (*) A near-optimal space algorithm for the "distinct elements" problem (p = 0 and all updates have v = 1) with optimal update and query time. (*) Improved L_2 --> L_2 dimensionality reduction in a stream. (*) New 1-pass lower bounds to show optimality and near-optimality of our algorithms, as well as of some previous algorithms (the "AMS sketch" for p = 2, and the L_1-difference algorithm of Feigenbaum et al.). As corollaries of our work, we also obtain a few separations in the complexity of moment estimation problems: F_0 in 1 pass vs. 2 passes, p = 0 vs. p > 0, and F_0 with strictly positive updates vs. arbitrary updates. Joint work with Daniel Kane (Harvard) and David Woodruff (IBM Almaden).
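For a concrete reference point, here is a minimal sketch of the classic AMS estimator for F_2 mentioned above (the baseline, not the new algorithms of the talk): each counter keeps a random ±1-signed sum of the updates, squared counters are averaged per row, and the median over rows is returned. The fully random signs are a simplification; the formal guarantee only needs 4-wise independent hashing, and the sketch sizes are assumptions.

```python
import random

class AMSF2:
    """Minimal AMS-style sketch for F_2 = sum_i x_i^2 under turnstile updates."""
    def __init__(self, rows=5, cols=20, universe=1000, seed=0):
        rng = random.Random(seed)
        # one random +/-1 sign per (row, column, item); fully random for simplicity
        self.sign = [[[rng.choice((-1, 1)) for _ in range(universe)]
                      for _ in range(cols)] for _ in range(rows)]
        self.z = [[0] * cols for _ in range(rows)]     # counters Z[r][c] = sum_i s(i) * x_i

    def update(self, i, v=1):
        # process the stream update x_i <- x_i + v
        for r in range(len(self.z)):
            for c in range(len(self.z[r])):
                self.z[r][c] += self.sign[r][c][i] * v

    def estimate(self):
        # E[Z^2] = F_2: average Z^2 over columns, take the median over rows
        ests = sorted(sum(zc * zc for zc in row) / len(row) for row in self.z)
        return ests[len(ests) // 2]

if __name__ == "__main__":
    sk, x = AMSF2(), [0] * 1000
    rng = random.Random(1)
    for _ in range(2000):
        i, v = rng.randrange(1000), rng.choice((-2, -1, 1, 2))
        sk.update(i, v)
        x[i] += v
    print(sk.estimate(), sum(xi * xi for xi in x))     # estimate vs. exact F_2
```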
Compressed sensing is a method for reconstructing a sparse signal from a low-rank linear sketch. This is useful because many real-world signals have sparse representations in some basis; for example, images are sparse in the wavelet basis and music is sparse in the Fourier basis. In the past five years, a variety of sketch matrices and reconstruction techniques have emerged that can efficiently and stably recover an N-dimensional vector with K non-zero components from O(K log(N/K)) linear measurements. We prove that any stable recovery scheme requires Ω(K log(N/K)) linear measurements, matching the upper bound. This contrasts with the Θ(K) measurements required for unstable recovery.
Given a set P of n points in d-dimensional space, a unit-disk graph is a graph built with P as the vertex set, by connecting two vertices (i.e., points) if and only if their Euclidean distance is at most one. We are interested in problems related to the maximum cliques in a unit-disk graph. These graphs have numerous applications (for instance, they can be used to model wireless networks) and are well studied in the computational geometry literature and other fields. In this talk, after a brief introduction, we will survey a few previous results on this problem. First, we will review how the algorithmic problem of computing the maximum clique corresponds to the following non-algorithmic covering problem: for R^d, find the minimum value of k and a set S of k objects of diameter one, such that *any* shape of diameter one can be covered using k elements of S. For d=2 the answer is two; for d=3 the best result so far is five, and no non-trivial result is known for higher dimensions. We will also examine a related problem of embedding graphs as unit-disk graphs, including some embedding results. References * Finding k points with minimum diameter and related problems, Alok Aggarwal and Hiroshi Imai and Naoki Katoh and Subhash Suri, Journal of Algorithms, volume 12, pages 38-56, 1991 * Approximation algorithms for maximum cliques in 3D unit-disk graphs, Peyman Afshani and Timothy Chan, CCCG, pages 6-9, 2005 * Approximation and inapproximability results for maximum clique of disc graphs in high dimensions, Peyman Afshani and Hamed Hatami, Information Processing Letters, volume 105, pages 83-87, January 2008 * On finding a large number of 3D points with a small diameter, Minghui Jiang, Discrete Applied Mathematics, 155:2355-2361, 2007
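To make the definition concrete, here is a tiny illustrative sketch that builds the unit-disk graph of a planar point set and finds a maximum clique by brute force. The brute-force search is exponential and only meant for toy instances; the talk is about much better algorithms and the covering reformulation. The sample points are an assumption.

```python
from itertools import combinations
from math import dist  # Python 3.8+

def unit_disk_graph(points):
    """Adjacency sets: two points are adjacent iff their Euclidean distance is at most 1."""
    n = len(points)
    adj = [set() for _ in range(n)]
    for i, j in combinations(range(n), 2):
        if dist(points[i], points[j]) <= 1.0:
            adj[i].add(j)
            adj[j].add(i)
    return adj

def max_clique_bruteforce(adj):
    """Check every subset from largest to smallest; fine only for toy instances."""
    n = len(adj)
    for r in range(n, 0, -1):
        for sub in combinations(range(n), r):
            if all(v in adj[u] for u, v in combinations(sub, 2)):
                return list(sub)
    return []

if __name__ == "__main__":
    pts = [(0, 0), (0.5, 0.3), (0.9, 0.0), (3, 3), (3.4, 3.2)]   # assumed toy input
    print(max_clique_bruteforce(unit_disk_graph(pts)))
```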
We present a new (1+ε)-spanner for sets of n points in R^d. Our spanner has size O(n/ε^{d-1}) and maximum degree O(log^d n). The main advantage of our spanner is that it can be maintained efficiently as the points move: Assuming the trajectories of the points can be described by bounded-degree polynomials, the number of topological changes to the spanner is O(n^2/ε^{d-1}), and using a supporting data structure of size O(n log^d n) we can handle events in time O(log^{d+1} n). Moreover, the spanner can be updated in time O(log n) if the flight plan of a point changes. This is the first kinetic spanner for points in R^d whose performance does not depend on the spread of the point set.
The goal of coordination mechanisms is to design policies in such a way that the policies reflect the social cost in the individual costs, so that selfish agents' behaviors result in a socially desirable solution. In this talk, we are interested in coordination mechanisms in scheduling games where every player has to choose a machine on which to execute its job. The cost of a player is the completion time of its job and the social cost is the makespan - the largest load over all machines. Most previously studied policies depend on the processing times and need the jobs to announce their processing times. The jobs could try to influence the schedule to their advantage by misreporting their processing times. In our work, we study so-called non-clairvoyant policies - policies that do not require knowledge of the processing times of jobs - in scheduling games. We introduce a non-clairvoyant policy under which the scheduling game always admits an equilibrium and the price of anarchy matches that of the best (strongly local) clairvoyant policy. Joint work with Christoph Durr.
In this talk I will show how we improve both the upper and lower bounds for three-dimensional range search indexing, the problem of storing a set of points in three dimensions such that the points in a three-dimensional axis-parallel query hyper-rectangle can be found efficiently. I first describe a disk-based index structure for three-dimensional range searching that answers queries in optimal O(log_B N + T/B) I/Os using O(N (log N / log log_B N)^3) space, where B is the disk block size, N the number of points, and T the query output size. The previously best known structure uses O(N (log N)^3) space. I will also describe improved structures for several infinite range variants of the problem. Next I will show how we apply the theory of indexability to show that any d-dimensional range search index answering queries in O(PolyLog N + T/B) I/Os has to use Ω(N (log N / log log_B N)^{d-1}) space. The previously best known lower bound was Ω(N (log B / log log_B N)^{d-1}) space. Our results narrow the space gap between the lower and upper bound to a factor of log N / log log_B N, thus moving us closer to optimal three-dimensional range search indexing.
A Kakeya set is any set of points in n-dimensional Euclidean space which contains a unit line segment in every direction. The famous Kakeya conjecture is that such sets must have Hausdorff dimension and Minkowski dimension n. In this talk we consider the finite field version of the Kakeya question, and cover a recent optimal lower bound on the size of Kakeya sets due to Zeev Dvir. We will next cover applications to the problem of randomness extraction studied in theoretical computer science. This problem is the following: We are given access to a source of "dirty" randomness with a guarantee that it has min-entropy k. Using a short seed of pure random bits we wish to transform the source of "dirty" randomness into a source of "pure" randomness, i.e., close in statistical distance to pure random bits. The talk will be based on the following papers (available at http://www.math.ias.edu/~dvir/papers/lop.html). * Z. Dvir. On the size of Kakeya sets in finite fields. J. Amer. Math. Soc. (to appear), 2008. * Z. Dvir and A. Wigderson. Kakeya sets, new mergers and old extractors. In FOCS '08, 2008. * Z. Dvir, S. Kopparty, S. Saraf, and M. Sudan. Extensions to the method of multiplicities, with applications to Kakeya sets and mergers.
R-trees are a class of spatial index structures in which objects are arranged to enable fast window queries: report all objects that intersect a given query window. One of the most successful methods of arranging the objects in the index structure is based on sorting them along a space-filling curve. In this talk, I will discuss how the choice of space-filling curve influences the query performance of such index structures. We take a look at two types of input objects. For sets of points in the plane we can qualify the efficiency of two-dimensional space-filling curves using quality measures. We develop new measures and prove general lower bounds for a number of cases. I will also discuss the results of our approximation algorithm for such measures for a number of space-filling curves. For sets of rectangles in the plane we propose new four-dimensional space-filling curves and test their performance on several real-world and synthetic data sets. The new curves combine the strengths of earlier approaches based on two- and four-dimensional curves, while avoiding their apparent weaknesses. Joint work with Herman Haverkort.
Flash memory based solid-state disks are fast becoming the dominant form of end-user storage technology for mobile devices, partly even replacing the traditional hard disks. The I/O characteristics of the currently available solid-state disks fundamentally differ from those of the hard disks. In this talk, I will present new computation models that can capture the intricacies of the flash devices better. I will then discuss the relationship between the proposed models and the existing memory hierarchy models and show how a large body of merging-based external memory algorithms can be adapted to the flash models. Joint work with Andreas Beckmann, Ulrich Meyer, Gabriel Moruz and Riko Jacob.
The Tutte polynomial sits in the intersection of enumerative combinatorics, graph theory, and statistical physics. Depending on who looks at it, it counts the colourings of a graph, its number of spanning forests, the reliability of a network, the topology of a knot, and many other things; perhaps most importantly, it computes the partition function of the Ising, Potts, and random cluster model. All in all, tens of thousands of research papers have been written about it in some guise or another. In the first half of my talk I will spend some time introducing the Tutte polynomial in several ways. For the algorithmic audience this will go via the deletion-contraction algorithm. For the more algebraically inclined I will approach it via the chromatic polynomial from algebraic graph theory. I will also show the connection to statistical mechanics via the Potts model, demonstrating all the theoretical physics I know, which isn't a lot. Then I will show how the Tutte polynomial for a given graph can be computed in time exponential in the number of vertices. (It's straightforward to do this in time exponential in the number of edges.) Joint work with A. Björklund (Lund), P. Kaski (Helsinki), and M. Koivisto (Helsinki), FOCS 2008
Halfspace range reporting is a data structure problem in computational geometry where we are to preprocess an input set of points into a data structure that is capable of answering the following queries: report all the points inside a given halfspace. This fundamental problem has been studied for more than 25 years, and during this time many efficient solutions were given for the problem in three dimensions, but unfortunately none of them was optimal. In this talk, after a brief introduction to the problem, we will describe a recent simple idea which allows us to obtain the first optimal data structure for this problem. We will also investigate the implications of our techniques in higher dimensions and in the I/O model. Joint work with Timothy Chan.
This work discusses extensions and results on the price of anarchy for network congestion games, which in general quantifies the loss of efficiency due to lack of central control. In this work, the users in the system have different aims (utility functions). In particular we discuss the impact of malicious and oblivious behavior in routing games; a malicious user attempts to increase congestion for the system, while an oblivious user acts without any knowledge of the congestion of the network. Speaker's short bio: Taso Viglas is a senior lecturer at the University of Sydney. Taso received his PhD from Princeton University in 2002, and was a postdoctoral fellow at the University of Toronto for two years, before joining the University of Sydney in 2004. Taso's main research interests include computational complexity theory and algorithmic game theory.
We explore various techniques to compress a permutation π over n integers, taking advantage of ordered subsequences in π, while supporting its application π(i) and the application of its inverse π^{-1}(i) in small time. Our compression schemes yield several interesting byproducts, in many cases matching, improving or extending the best existing results on applications such as the encoding of a permutation in order to support iterated applications π^k(i) of it, of integer functions, and of inverted lists and suffix arrays. Joint work with Gonzalo Navarro
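To fix the supported operations, here is a naive, uncompressed baseline (array-based, nothing like the compressed representations of the talk): the inverse is computed by one pass, and iterated applications π^k(i) are answered by walking the cycle containing i and reducing k modulo its length. The example permutation is an assumption.

```python
def inverse(pi):
    """pi is a list with pi[i] = π(i); return the inverse permutation as a list."""
    inv = [0] * len(pi)
    for i, p in enumerate(pi):
        inv[p] = i
    return inv

def power_apply(pi, i, k):
    """Compute π^k(i) by walking the cycle of i; k is reduced modulo the cycle length."""
    cycle = [i]
    j = pi[i]
    while j != i:
        cycle.append(j)
        j = pi[j]
    return cycle[k % len(cycle)]

if __name__ == "__main__":
    pi = [2, 0, 1, 4, 3]           # π over {0,...,4}
    print(inverse(pi))             # [1, 2, 0, 4, 3]
    print(power_apply(pi, 0, 10))  # cycle (0 2 1) has length 3, so π^10(0) = π(0) = 2
```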
We design a new P2P data structure, called the Deterministic Distributed tree (DDtree). The DDtree compares favourably to other designs for the following reasons: a) it separates the overlay structure of the P2P environment from the actual elements stored in it, and b) it provides better complexities (which are deterministic) compared to all previous solutions. Additionally, the separation between elements and nodes results in a load balancing problem for which we provide an innovative and very efficient solution. This load-balancing scheme can also be applied to any other tree structure in a P2P environment. Finally, a small discussion on models of P2P Networks is initiated.
In an array of n numbers each of the (n choose 2) + n contiguous subarrays defines a sum. In this paper we focus on algorithms for selecting and reporting maximal sums from an array of numbers. First, we consider the problem of reporting k subarrays inducing the k largest sums among all subarrays of length at least l and at most u. For this problem we design an optimal O(n+k) time algorithm. Secondly, we consider the problem of selecting a subarray storing the k'th largest sum. For this problem we prove a time bound of Θ(n · max{1, log(k/n)}) by describing an algorithm with this running time and by proving a matching lower bound. Finally, we combine the ideas and obtain an O(n · max{1, log(k/n)}) time algorithm that selects a subarray storing the k'th largest sum among all subarrays of length at least l and at most u.
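For the simpler task of finding a single maximum-sum subarray with length between l and u (as opposed to reporting or selecting among the k largest sums, which is what the talk addresses), a linear-time sketch using prefix sums and a sliding-window minimum looks like the following; the bounds 1 ≤ lo ≤ hi ≤ n and the sample array are assumptions.

```python
from collections import deque

def max_sum_with_length_bounds(a, lo, hi):
    """Maximum sum of a contiguous subarray of a with length between lo and hi, in O(n) time."""
    n = len(a)
    prefix = [0] * (n + 1)
    for i, x in enumerate(a):
        prefix[i + 1] = prefix[i] + x
    best = None
    window = deque()                # candidate start indices s, with prefix[s] increasing
    for end in range(lo, n + 1):    # subarray is a[s:end], so we need end-hi <= s <= end-lo
        s = end - lo
        while window and prefix[window[-1]] >= prefix[s]:
            window.pop()
        window.append(s)
        while window[0] < end - hi:
            window.popleft()
        cand = prefix[end] - prefix[window[0]]
        best = cand if best is None else max(best, cand)
    return best

if __name__ == "__main__":
    print(max_sum_with_length_bounds([2, -5, 4, 3, -1, 2], lo=2, hi=3))   # 4 + 3 = 7
```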
Bounding volume hierarchies (BVHs) are efficient and versatile search data structures for geometric data, which are similar to binary space partitions (BSPs). Like BSPs, they store the geometric data in the leaves of a search tree. While each non-leaf node of a BSP stores a half-plane that decides what data is stored in which sub-tree, the non-leaf nodes of a BVH store a bounding volume of the data of each of its sub-trees. Different BVHs differ mostly in two ways: in the shape of the bounding volumes, and in the way they are constructed. The first criterion usually determines the name of a BVH. The simplest BVH is the box tree, which stores axis-aligned boxes as bounding volumes. Its strength is its simplicity and the efficiency of intersection tests with a box. Its weakness is input data aligned to a diagonal, which may cause linear instead of the usual logarithmic query running time. A c-DOP tree stores convex polygonal bounding volumes whose sides are parallel to a set of c pre-determined directions. c-DOP trees have better worst-case query running times than box trees, but lack the simplicity of box trees and are therefore often slower. We introduce a new BVH, the c-rotating-box tree (c-rb tree), which is a compromise between the box tree and the c-DOP tree. Every bounding volume stored in a c-rb tree is a box, but they are not necessarily axis-aligned. The boxes are aligned to c pre-determined directions. The c-rb tree allows more efficient intersection operations than c-DOP trees and has the same worst-case behavior. We prove that c-rb trees indeed have the same worst-case complexity as c-DOP trees, and we compare several variants of the three data structures by experiments. Joint work with Mark de Berg.
We consider enhancing a sealed-bid single-item auction with privacy concerns, our assumption being that bidders primarily care about monetary payoff and secondarily worry about exposing information about their type to other players and learning information about other players' types. To treat privacy explicitly within the game theoretic context, we put forward a novel hybrid utility model that considers both fiscal and privacy components in the players' payoffs. We show how to use rational cryptography to approximately implement a given ex interim individually strictly rational equilibrium of such an auction (or any game with a winner) without a trusted mediator, through a cryptographic protocol that uses only point-to-point authenticated channels between the players. By ex interim individually strictly rational we mean that, given its type and before making its move, each player has a strictly positive expected utility, i.e., it becomes the winner of the auction with positive probability. By approximately implement we mean that, under cryptographic assumptions, running the protocol is a computational Nash equilibrium with a payoff profile negligibly close to the original equilibrium. In addition the protocol has the stronger property that no collusion, of any size, can obtain more by deviating in the implementation than by deviating in the ideal mediated setting which the mechanism was designed in. Also, despite the non-symmetric payoff profile, the protocol always correctly terminates. Joint work with Nikolaos Triandopoulos and Peter Bro Miltersen.
We study the following extension of the static one-dimensional range reporting problem. For an array A of n elements, build a data structure that supports the query: Given two indices i ≤ j and an integer k, report the k smallest elements in the subarray A[i..j] in sorted order. We present a static data structure that uses O(n) words of space, supports queries in O(k) time, and can be constructed in O(n log n) time on the RAM model. We also extend the data structure to solve the online version of the problem where the elements in A[i..j] are reported in sorted order one-by-one, each element being reported in O(1) worst-case time. The data structure has applications to e.g. top-k queries in databases, prioritized suffix tree reporting, and three-sided planar sorted range reporting.
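As a naive baseline for the query (only to make the problem statement concrete; it is nowhere near the O(k) query time of the talk's structure), one can sort the queried slice, or report its elements online from a heap:

```python
import heapq

def topk_sorted_range(a, i, j, k):
    """Report the k smallest elements of a[i..j] (inclusive) in sorted order.
    Naive O((j-i) log(j-i)) baseline."""
    return sorted(a[i:j + 1])[:k]

def topk_sorted_range_online(a, i, j):
    """Online version: yield the elements of a[i..j] one by one in sorted order."""
    heap = a[i:j + 1]
    heapq.heapify(heap)
    while heap:
        yield heapq.heappop(heap)

if __name__ == "__main__":
    a = [5, 1, 4, 2, 8, 3]
    print(topk_sorted_range(a, 1, 4, 3))     # [1, 2, 4]
    gen = topk_sorted_range_online(a, 1, 4)
    print(next(gen), next(gen))              # 1 2
```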
We revisit Washburn's deterministic graphical games, a natural generalization of the perfect information win/lose games commonly solved by retrograde analysis. We study the complexity of solving deterministic graphical games and obtain an almost-linear time comparison-based algorithm for computing an equilibrium of such a game. The existence of a linear time comparison-based algorithm remains an open problem. Joint work with Kristoffer Arnsfelt Hansen, Peter Bro Miltersen, and Troels Bjerre Sørensen.
I will present a simple algorithm which maintains the topological order of a directed acyclic graph with n nodes under an online edge insertion sequence in O(n^{2.75}) time, independent of the number of edges inserted. I will then show an average case analysis of incremental topological ordering algorithms. After discussing some recent advances in this area, I will talk about dynamic topological ordering in the external memory model. Joint work with Tobias Friedrich, MPI and Ulrich Meyer, J.W. Goethe Universität
We study social cost losses in Facility Location games, where n selfish agents install facilities over a network and connect to them, so as to forward their local demand (expressed by a non-negative weight per agent). Agents using the same facility share fairly its installation cost, but every agent pays individually a (weighted) connection cost to the chosen location. We study the Price of Stability (PoS) of pure Nash equilibria and the Price of Anarchy of strong equilibria (SPoA), that generalize pure equilibria by being resilient to coalitional deviations. A special case of recently studied network design games, Facility Location merits separate study as a classic model with numerous applications and individual characteristics: our analysis for unweighted agents on metric networks reveals constant upper and lower bounds for the PoS, while an O(ln n) upper bound implied by previous work is tight for non-metric networks. Strong equilibria do not always exist, even for the unweighted metric case. For this case we prove existence of 2.36-approximate strong equilibria, whereas in general we show that e-approximate strong equilibria exist (e=2.718...). The SPoA is generally upper bounded by O(ln W) (W is the sum of agents' weights), which becomes tight for unweighted agents. For the unweighted metric case we prove a constant upper bound. We point out challenging open questions that arise. Joint work with Thomas Dueholm Hansen.
Let X = x_1 x_2 ... x_n be a string over a finite, ordered alphabet S. A secondary index for X answers alphabet range queries of the form: Given a range [l,r] over S, return the set I[l,r] = {i | x_i ∈ [l,r]}. Secondary indexes are heavily used in relational databases and scientific data analysis. It is well-known that the obvious solution of storing a dictionary for the position set associated with each character does not always give optimal query time. In this paper we give the first theoretically optimal data structure for the secondary indexing problem. In the I/O model, the amount of data read when answering a query is within a constant factor of the minimum space needed to represent I[l,r], assuming that the size of internal memory is S^{Ω(1)} blocks. The space usage is O(n log |S|) bits in the worst case, and we also show how to bound the size of the data structure in terms of the 0th order entropy of X. Updates can be done in essentially the same time bound as for buffered B-trees. We also consider an approximate version of the basic secondary indexing problem, where a query reports a superset of I[l,r] containing each element not in I[l,r] with probability at most ε, where ε > 0 is the false positive probability. For this problem the amount of data that needs to be read by the query algorithm is reduced to |I[l,r]| log(1/ε) bits.
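The "obvious solution" mentioned above, as a minimal in-memory sketch: keep one position list per character and concatenate the lists of every character in the query range. The point of the talk is precisely that this can be far from optimal in the I/O model; the sketch only fixes the problem statement, and the example string is an assumption.

```python
class NaiveSecondaryIndex:
    """Position-list-per-character index for alphabet range queries over a string X.
    query(lo, hi) returns {i : lo <= X[i] <= hi} as a sorted list. Illustrative baseline only."""
    def __init__(self, x):
        self.positions = {}
        for i, ch in enumerate(x):
            self.positions.setdefault(ch, []).append(i)

    def query(self, lo, hi):
        out = []
        for ch, pos in self.positions.items():
            if lo <= ch <= hi:
                out.extend(pos)
        return sorted(out)

if __name__ == "__main__":
    idx = NaiveSecondaryIndex("abracadabra")
    print(idx.query("a", "b"))   # positions of all characters in the range ['a', 'b']
```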
We consider some well known families of two-player, zero-sum, turn-based, perfect information games that can be viewed as special cases of Shapley's stochastic games. Strengthening a theorem of Zwick and Paterson, we show that the following tasks are polynomial time equivalent: 1. Solving simple stochastic games, 2. Solving stochastic mean-payoff games with rewards and probabilities given in unary, and 3. Solving stochastic mean-payoff games with rewards and probabilities given in binary. Joint work with Vladimir Gurvich.
Tagline: "I am a Bear of Very Little Brain, and long words bother me." -- Winnie the Pooh Abstract: A minimal perfect hash function maps a set S of n keys into the set {0, 1, ..., n - 1} bijectively. Classical results state that minimal perfect hashing is possible in constant time using a structure occupying space close to the lower bound of log e bits per element. Here we consider the problem of monotone minimal perfect hashing, in which the bijection is required to preserve the lexicographical ordering of the keys. A monotone minimal perfect hash function can be seen as a very weak form of index that provides just ranking on the set S (and answers randomly outside of S). Our goal is to minimise the description size of the hash function: we show that, for a set S of n elements out of a universe of 2^w elements, O(n log log w) bits are sufficient to hash monotonically in time O(log w). Alternatively, we can get space O(n log w) bits with O(1) query time. Both of these data structures improve a straightforward construction with O(n log w) space and O(log w) query time. As a consequence, it is possible to search a sorted table with O(1) accesses to the table (using additional O(n log log w) bits). Our results are based on a structure (of independent interest) that represents very compactly a trie, but allows for errors. As a further application of the same structure, we show how to compute the predecessor (in the sorted order of S) of an arbitrary element, using O(1) accesses in expectation and an index of O(n log w) bits, improving the trivial result of O(nw) bits. This implies an efficient index for searching a blocked memory. Joint work with Djamal Belazzougui, Paolo Boldi, and Sebastiano Vigna. To appear at SODA '09.
This paper studies real-world road networks from an algorithmic perspective, motivated from empirical studies that yield useful properties of road networks that can be exploited in the design of fast algorithms that deal with geographic data. Unlike previous approaches, our study is not based on the assumption that road networks are planar graphs. Indeed, based on the number of experiments we have performed on the road networks of the 50 United States and District of Columbia, we provide strong empirical evidence that road networks are quite non-planar. Our approach therefore instead is directed at finding algorithmically-motivated properties of road networks as non-planar geometric graphs, focusing on alternative properties of road networks that can still lead to efficient algorithms for such problems as shortest paths and Voronoi diagrams. Our empirical analysis uses the U.S. TIGER/Line road network database, as provided by the Ninth DIMACS Implementation Challenge, which is comprised of over 24 million vertices and 29 million edges.
We study a simple greedy tree packing of a graph and use it to derive better algorithms for fully-dynamic min-cut and for the static k-way cut problem. A greedy tree packing is a sequence of spanning trees where each new tree is a minimum spanning tree with respect to the edge loads from the previous trees, that is, the load of an edge is the number of times it has been used by the previous trees. A minimum k-way cut is a minimum set of edges whose removal splits the graph in k components. A min-cut is a minimum 2-way cut. If the (unknown) edge connectivity of the graph is c, we show that if we pack c^7 log^3 m trees, then some min-cut is crossed exactly once by some tree. This leads to the best fully-dynamic min-cut algorithm (presented at STOC'01). If we pack k^3 log n trees, then every minimum k-way cut is crossed 2k-2 times by some tree. This leads to the best deterministic algorithm for k-way cut (presented at STOC'08).
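A minimal sketch of the greedy tree packing itself (the analysis and its use for min-cut and k-way cut are the talk's contribution): repeatedly compute a minimum spanning tree with respect to the current edge loads, then increment the loads of the edges just used. Kruskal with a union-find serves as the MST routine; the toy graph and the number of trees are assumptions.

```python
def greedy_tree_packing(n, edges, num_trees):
    """edges: list of (u, v) pairs on vertices 0..n-1; the graph is assumed connected.
    Returns a list of spanning trees, each a minimum spanning tree w.r.t. the loads
    accumulated from the previous trees."""
    load = {frozenset(e): 0 for e in edges}
    trees = []
    for _ in range(num_trees):
        parent = list(range(n))

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        tree = []
        # Kruskal: least-loaded edges first
        for u, v in sorted(edges, key=lambda e: load[frozenset(e)]):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((u, v))
        for e in tree:
            load[frozenset(e)] += 1
        trees.append(tree)
    return trees

if __name__ == "__main__":
    edges = [(0, 1), (1, 2), (2, 0), (2, 3), (3, 0)]   # assumed small graph
    for t in greedy_tree_packing(4, edges, 3):
        print(t)
```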
We observe that many important computational problems in NC^1 share a simple self-reducibility property. We then show that, for any problem A having this self-reducibility property, A has polynomial size TC^0 circuits if and only if it has TC^0 circuits of size n^{1+ε} for every ε > 0 (counting the number of wires in a circuit as the size of the circuit). As an example of what this observation yields, consider the Boolean Formula Evaluation problem (BFE), which is complete for NC^1 and has the self-reducibility property. It follows from a lower bound of Impagliazzo, Paturi, and Saks, that BFE requires depth d TC^0 circuits of size n^{1+ε_d}. If one were able to improve this lower bound to show that there is some constant ε > 0 such that every TC^0 circuit family recognizing BFE has size n^{1+ε}, then it would follow that TC^0 ≠ NC^1. We show that proving lower bounds of the form n^{1+ε} is not ruled out by the Natural Proof framework of Razborov and Rudich and hence there is currently no known barrier for separating classes such as ACC^0, TC^0 and NC^1 via existing ``natural'' approaches to proving circuit lower bounds. We also show that problems with small uniform constant-depth circuits have algorithms that simultaneously have small space and time bounds. We then make use of known time-space tradeoff lower bounds to show that SAT requires uniform depth d TC^0 and AC^0[6] circuits of size n^{1+c} for some constant c depending on d. Joint work with Eric Allender
Suppose you have a fixed set of n numbers and you want to create a data structure that allows a fast search in this set. How fast can such a search be? Well, it depends on what computer you have. But theoreticians don't have computers; instead we have models of computation. But what is a model of computation other than saying what things you can do in what amount of time? Being able to set the rules by which our algorithms are evaluated should make it easier to design fast algorithms. Traditionally, models of computation have had some relation to actual computers, but insisting on this relationship can result in slower algorithms than if one is unconstrained by reality. In particular, we discuss how non-standard, but plausible, CPU and memory designs can allow the design of asymptotically faster fundamental algorithms.
Consider a point set D with a measure function μ : D → R. Let A be the set of subsets of D induced by containment in a shape from some geometric family (e.g. axis-aligned rectangles, half planes, balls, k-oriented polygons). We say a range space (D, A) has an ε-sample (a.k.a. ε-approximation) P if, for every range A ∈ A, the fraction of P contained in A differs from the relative measure μ(A)/μ(D) by at most ε.
The B-tree is the classic external-memory dictionary data structure. The B-tree is typically analyzed in a two-level memory model (called the DAM or I/O model) in which internal memory of size M is organized into size-B blocks, and there is an arbitrarily large external memory. The cost in the model is the number of block transfers between internal and external memory. An N-element B-tree supports searches, insertions, and deletions in O(log_B N) block transfers. In fact, there is a tradeoff between the cost of searching and inserting in external-memory dictionaries [Brodal, Fagerberg 03], and the B-tree achieves only one point on this tradeoff. A more general tradeoff is exhibited by their buffered B-tree. This talk presents two points on the insert/search tradeoff for cache-oblivious (CO) data structures, the "cache-oblivious lookahead array (COLA)," and the "shuttle tree." The CO model is similar to the DAM model, except that the block size B and memory size M are unknown to the algorithm. The buffered B-tree is not cache oblivious---buffer sizes are chosen according to B. The COLA implements searches in O(log_2 N) I/Os and inserts in amortized O((log_2 N)/B) I/Os. Notice that the searches are worse than in the B-tree by a log_2(B) factor, but the inserts are better by a B/log_2(B) factor. These bounds represent one optimal point on the insert/search tradeoff space. In fact, when made into a cache-aware data structure, the lookahead array achieves the same tradeoff as the buffered B-tree. The shuttle tree implements searches in the optimal O(log_B N) block transfers. Inserts cost o(log_B N) block transfers, improving on the B-tree by a superconstant factor. Joint work with: Michael A. Bender, Martin Farach-Colton, Yonatan R. Fogel, Bradley C. Kuszmaul, and Jelani Nelson.
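A minimal sketch of the lookahead array's core idea, without the lookahead pointers (so a search here does a binary search per level rather than using fractional cascading) and without deamortization: level i holds a sorted array of size 2^i that is either full or empty, and an insertion merges cascading full levels downward. This is only an illustration of the logarithmic method the COLA builds on, not the paper's data structure.

```python
from bisect import bisect_left

class LookaheadArraySketch:
    """Logarithmic-method dictionary: sorted levels of sizes 1, 2, 4, ...
    Insertion merges cascading full levels; search probes every non-empty level."""
    def __init__(self):
        self.levels = []            # levels[i] is [] or a sorted list of length 2**i

    def insert(self, key):
        carry = [key]
        i = 0
        while True:
            if i == len(self.levels):
                self.levels.append([])
            if not self.levels[i]:
                self.levels[i] = carry
                return
            # merge two sorted runs of length 2**i and carry the result to level i+1
            carry = sorted(self.levels[i] + carry)
            self.levels[i] = []
            i += 1

    def contains(self, key):
        for level in self.levels:
            if level:
                pos = bisect_left(level, key)
                if pos < len(level) and level[pos] == key:
                    return True
        return False

if __name__ == "__main__":
    d = LookaheadArraySketch()
    for x in [5, 3, 9, 1, 7]:
        d.insert(x)
    print(d.contains(7), d.contains(4))   # True False
```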
We show that generating all negative cycles of a weighted graph is hard in both the directed and undirected cases. More precisely, all negative cycles cannot be generated in time polynomial in the number of such cycles, unless P=NP. As a corollary we settle in the negative two well-known generating problems from linear programming: * Given an infeasible system of linear inequalities, generating all minimal infeasible subsystems (so-called Helly subsystems) is hard. Yet, for generating maximal feasible subsystems the complexity remains open. * Given a feasible system of linear inequalities, generating all vertices of the corresponding polyhedron is hard. Yet, the problem remains open in the case of bounded polyhedra. Joint work with L. Khachiyan, E. Boros, K. Borys, K. Elbassioni and H.R. Tiwary
It is shown that any weakly-skew circuit can be converted into a skew circuit with constant factor overhead, while preserving either syntactic or semantic multilinearity. This leads to considering syntactically multilinear algebraic branching programs (ABPs), which are defined by a natural read-once property. A 2^{n/4} size lower bound is proven for ordered syntactically multilinear ABPs computing an explicitly constructed multilinear polynomial in 2n variables. Without the ordering restriction, a lower bound of Ω(n^{3/2}/log n) is observed, by considering the generalization of a hypercube covering problem by Galvin.
The splay tree is a dynamic binary search tree that is conjectured to be universally efficient in the following way. On any sequence of accesses the splay tree is conjectured to take time within a constant factor of the optimal (offline) dynamic binary search tree. Splay trees are traditionally analyzed using potential functions or some other counting argument. In this talk I will present a new way to analyze splay trees (and dynamic data structures in general). The three-part strategy is to (1) transcribe the operations of the data structure as some combinatorial object, (2) show the object has some forbidden substructure, and (3) prove upper bounds on the size of such a combinatorial object. As an example of this strategy, we show that splay trees execute a sequence of N deque operations (push, pop, inject, and eject) in O(N a*(N)) time, where a* is the iterated-inverse-Ackermann function. This bound is within a tiny a*(N) factor of that conjectured by Tarjan in 1985.
We prove that the weighted monotone circuit satisfiability problem has no fixed-parameter tractable approximation algorithm with constant or polylogarithmic approximation ratio unless FPT = W[P]. The parameterized complexity class FPT consists of all fixed-parameter tractable problems, and the class W[P] may be viewed as an analogue of the class NP in the world of parameterized complexity theory. Our result answers a question of Alekhnovich and Razborov, who proved that the weighted monotone circuit satisfiability problem has no fixed-parameter tractable 2-approximation algorithm unless every problem in the parameterized complexity class W[P] can be solved by a randomized fpt algorithm and asked whether their result can be derandomized. The decision version of the monotone circuit satisfiability problem is known to be complete for the class W[P]. By reducing them to the monotone circuit satisfiability problem with suitable approximation preserving reductions, we prove similar inapproximability results for all other natural minimisation problems known to be W[P]-complete. Joint work with Martin Grohe and Magdalena Grüber
At AAAI'07, Zinkevich, Bowling and Burch introduced the Range of Skill measure of a two-player game and used it as a parameter in the analysis of the running time of an algorithm for finding approximate solutions to such games. They suggested that the Range of Skill of a typical natural game is a small number, but only gave heuristic arguments for this. In this work, we provide the first methods for rigorously estimating the Range of Skill of a given game. We provide some general, asymptotic bounds that imply that the Range of Skill of a perfectly balanced game tree is almost exponential in its size (and doubly exponential in its depth). We also provide techniques that yield concrete bounds for unbalanced game trees and apply these to estimate the Range of Skill of Tic-Tac-Toe and Heads-Up Limit Texas Hold'em Poker. In particular, we show that the Range of Skill of Tic-Tac-Toe is more than 100,000. Joint work with Peter Bro Miltersen and Troels Bjerre Sørensen, to appear at AAAI'08.
We revisit the question of persistence-sensitive simplification on 2-manifolds. We call function g an epsilon-simplification of function f if the L_∞ distance between f and g is no more than epsilon, and the persistence diagrams of g are the same as those of f except all points within L_1-distance at most epsilon from the diagonal are removed. We give an algorithm for constructing epsilon-simplifications that is considerably simpler than its predecessor, allows for hierarchical simplification, and results in a bounded subdivision of the domain. Joint work with Dominique Attali, Marc Glisse, Samuel Hornus, and Francis Lazarus.
When introducing the notion of an Evolutionarily Stable Strategy (ESS) to Game Theory, J. Maynard Smith and others simultaneously began a tradition of applying Game Theory to Evolutionary thinking. (The ESS describes a more restricted case of a Nash Equilibrium.) Game theory has since been an integral part of evolutionary biology. Expressions such as evolution of cooperation and evolution of altruism can often be observed in the titles of evolutionary game theory papers. Obviously the Prisoner's Dilemma (PD) game is ubiquitous in evolutionary game theory, but Maynard Smith's Hawk-Dove (HD) game has, for example, also played a principal role. Unlike the PD game, it can result in a stable polymorphic state of the population containing both strategies. Here, I introduce a few diverse examples of game theoretical applications from biology, ranging from competition among proliferating cells to competing mating strategies and species interactions. The former, constructed as the simplest possible n-player Prisoner's Dilemma game, showed interesting patterns such as bifurcation dynamics and hysteresis. Mating competition among male lizards has been shown to exhibit Rock-Scissors-Paper dynamics, also allowing the population to simultaneously sustain all three strategies.
A Simple Stochastic Game is played by two players called Min and Max, moving a pebble turn by turn along edges of a graph. Some vertices are random and the next vertex is chosen randomly with fixed transition probabilities. Player Max wants the pebble to reach a special vertex called the target vertex. Solving a simple stochastic game consists in computing the maximal probability with which player Max can enforce the pebble to reach the target vertex. In this talk, we will review known algorithms for solving simple stochastic games and we will present a new algorithm especially efficient for games with few random vertices.
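A minimal value-iteration sketch (one of the standard known approaches, not the new algorithm of the talk): iterate the optimality equations, taking the maximum over successors at Max vertices, the minimum at Min vertices, and the average at random vertices, with the target fixed at value 1. Uniform transition probabilities at random vertices, the iteration count, and the toy game are assumptions; convergence to the exact values can be slow in general.

```python
def value_iteration(kind, succ, target, iters=2000):
    """kind[v] in {'max', 'min', 'avg'}; succ[v] lists the successors of v;
    target is an absorbing vertex Max wants to reach. Returns approximate values."""
    n = len(kind)
    v = [0.0] * n
    v[target] = 1.0
    for _ in range(iters):
        nv = v[:]
        for u in range(n):
            if u == target or not succ[u]:
                continue
            vals = [v[w] for w in succ[u]]
            if kind[u] == "max":
                nv[u] = max(vals)
            elif kind[u] == "min":
                nv[u] = min(vals)
            else:                       # random vertex, uniform transitions assumed
                nv[u] = sum(vals) / len(vals)
        v = nv
    return v

if __name__ == "__main__":
    # vertices: 0 = Max, 1 = random, 2 = target (win), 3 = sink (loss)
    kind = ["max", "avg", "max", "max"]
    succ = [[1, 3], [2, 3], [], []]
    print(value_iteration(kind, succ, target=2))   # value of vertex 0 approaches 0.5
```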
Boolean operations on polyhedra are a basic building block of many geometric applications such as the visual hull, robot motion planning, computer-aided design and packing problems. Since exact geometry is considered to be slow, industrial applications usually rely on data structures and algorithms that are neither complete nor robust. We implemented Boolean operations on Nef polyhedra as a part of the Computational Geometry Algorithms Library (CGAL). The implementation is complete, exact and still fast enough to compete with industry products. In comparison to the widely used ACIS CAD kernel our CGAL implementation is about 4-7 times slower, while ACIS does not always supply correct results. In scenarios that are (close to) degenerate our Nef polyhedra can be faster than ACIS. Based on Nef polyhedra we present the first robust implementation of the Minkowski sum of two non-convex polyhedra.
Computational mechanism design (CMD) seeks to understand how to design games that induce desirable outcomes in multi-agent systems despite private information, self-interest and limited computational resources. CMD finds application in many settings, from allocating wireless spectrum and airport landing slots, to Internet advertising, to expressive sourcing in the supply chain, to allocating shared computational resources. In meeting the demands for CMD in these rich domains, we often need to bridge from the classic theory of economic mechanism design to the practice of deployable, scalable mechanisms. Of particular interest is the problem of designing coordination mechanisms for dynamic environments, with agent arrivals and departures and agents that experience changes to local information (e.g., about preferences or capabilities), or more generally changes to their local state. We can conceptualize these systems as loosely-coupled Markov Decision Processes (MDPs), with each agent's local problem modeled as a privately known and privately observable MDP and constraints on joint action profiles. This formulation also allows for environments in which agents are learning about their value for resources with "information-state MDPs" modeling Bayesian-optimal learning. I sketch a sequence of models for dynamic coordination problems, and an overview of how the family of Groves mechanisms can be generalized to these environments. In offering some remarks about computational challenges and opportunities in these domains, I will also outline a complementary direction in "computational ironing", which dynamically modifies online stochastic combinatorial optimization algorithms to provide them with appropriate monotonicity properties, and therefore make them non-manipulable. Time permitting, I will touch on some directions for future work. Joint work with Ruggiero Cavallo, Quang Duong, and Satinder Singh.
We present a tradeoff between the length of a 3-query probabilistically checkable proof of proximity (PCPP) and the best possible soundness obtained by querying it. Consider the task of distinguishing between ``good'' inputs w in {0,1}^n that are codewords of a linear error correcting code C over the binary alphabet and ``bad'' inputs that differ from every word of C on 1/2 of their bits. To perform this task, we allow oracle access to w and an auxiliary proof pi; however, we place the following limitations on our verifier: (i) it can read at most 3 bits from w and pi, (ii) it must accept every ``good'' input with probability one (when accompanied by a suitable proof), and (iii) its decision must be *linear* --- i.e., based on the parity of the sum of the queried bits. We notice that all known techniques for PCPP constructions yield verifiers for linear codes that satisfy these conditions, so our tradeoff applies to all of them. Our main result implies that for certain codes, every verifier accessing a proof of polynomial length (in the length of the input) will accept some ``bad'' words with probability at least 2/3. In contrast, if no limitation is placed on the proof length, we can construct a verifier that rejects any ``bad'' word with the largest possible probability of 1/2. In other words, the closer the rejection probability is to the best possible, the longer the proof our verifier will require. This tradeoff between proof length and soundness holds for any code that is not locally testable, including codes with large dual distance and most random Low Density Parity Check (LDPC) codes.
We develop a polynomial-time algorithm for finding the extensive form correlated equilibrium (EFCE) for multiplayer extensive games with perfect recall. The EFCE concept is defined by von Stengel and Forges (2007). We describe the set of EFCE with a polynomial number of incentive constraints for multiplayer perfect-recall extensive games without chance moves. We explain that linear programming duality, the ellipsoid algorithm, and Markov chain steady state computations employed by Papadimitriou and Roughgarden (2007) for computing correlated equilibrium can also be used for computing EFCE. We explain how to apply this algorithm to multiplayer perfect-recall extensive games with chance moves. We also describe a possible interpretation of the variables in the dual system.
The index of a Nash equilibrium is an integer that is related to notions of ``stability'' of the equilibrium. The index is a relatively complicated topological notion, essentially a geometric orientation of the equilibrium. We prove the following theorem, first conjectured by Hofbauer (2003), which characterizes the index in much simpler strategic terms: Theorem. A generic bimatrix game has index +1 if and only if it can be made the unique equilibrium of an extended game with additional strategies of one player. In an m×n game, it suffices to add 3m strategies of the column player. The main tool to prove this theorem is a novel geometric-combinatorial method that we call the ``dual construction'', which we think is of interest in its own right. It allows us to visualize all equilibria of an m×n game in a diagram of dimension m-1. For example, all equilibria of a 3×n game are visualized with a diagram (essentially, of suitably connected n+3 points) in the plane. This should provide new insights into the geometry of Nash equilibria. Joint work with Arndt von Schemde.
We consider the problem of computing a minimum-distortion embedding of a finite metric space into a low-dimensional Euclidean space. It has been shown by Matousek [Mat90] that for any d ≥ 1, any n-point metric can be embedded into R^d with distortion O(n^{2/d}) via a random projection, and that in the worst case this bound is essentially optimal. This clearly also implies an O(n^{2/d})-approximation algorithm for minimizing the distortion. We show that for any fixed d ≥ 2, there is no polynomial-time algorithm for embedding into R^d with approximation ratio better than Ω(n^{1/(17d)}), unless P=NP. Our result establishes that random projection is not too far, concerning the dependence on d, from the best possible approximation algorithm for this problem. Our proof uses a result from Combinatorial Topology due to Sarkaria, that characterizes the embeddability of a simplicial complex in terms of the chromatic number of a certain Kneser graph. We complement the above result by showing that for the special case where the input space is an ultrametric, there exists a polynomial-time algorithm for embedding into R^d with poly-logarithmic approximation ratio. Joint work with Jiri Matousek and Krzysztof Onak.
This paper adapts two optimization routines to deal with objective functions for DSGE models. The optimization routines are i) a version of Simulated Annealing developed by Corana, Marchesi & Ridella (1987) and ii) the evolutionary algorithm CMA-ES developed by Hansen, Muller & Koumoutsakos (2003). Following these modifications we examine the ability of the two routines to maximize the likelihood function for a DSGE model. Our results show that the CMA-ES routine clearly outperforms Simulated Annealing in its ability to find the global optimum and in efficiency. With 10 unknown structural parameters in the likelihood function the CMA-ES routine is able to find the global optimum in 95
The optimal consumption, portfolio, house size and labor choice of an economic agent is analyzed in a continuous-time, continuous-state model that allows for stochastic house prices, wage and interest rates. The wage rates and the house prices can be instantaneously correlated with both interest rates, bond prices and stock prices. The model allows for the expected wage rate growth to be an affine function of the real short-term interest rate in order to encompass business cycle variations. In this setup I show how the optimal stock/bond/cash allocation of a long-term investor is affected in the presence of housing. In the absence of portfolio restrictions and liquidity constraints, and when both the wage rate and the house prices are spanned by the financial market, i.e. when the economy is complete, I derive the solution in closed form and state the optimal strategies. The model is analyzed by numerical methods when the economy is incomplete.
We present a cache-oblivious algorithm for computing single-source shortest paths in undirected graphs with non-negative edge lengths. The algorithm incurs O(√((nm log W)/B) + (m/B) log n + MST(n, m)) memory transfers on a graph with n vertices, m edges, and real edge lengths between 1 and W; B denotes the cache block size, and MST(n, m) denotes the number of memory transfers required to compute a minimum spanning tree of a graph with n vertices and m edges. Our algorithm is the first cache-oblivious shortest-path algorithm incurring less than one memory transfer per vertex if the graph is sparse (m = O(n)) and W = 2^{O(B)}.
We examine the problem of integer representation in a near minimal number of bits so that the increment and the decrement (and indeed the addition and the subtraction) operations can be performed using few bit inspections and fewer bit changes. In particular, we prove a new lower bound of Ω(√n) for the increment and the decrement operations, where n is the minimum number of bits required to represent the number. The model of computation we consider is the bit probe model, where the complexity measure counts only the bitwise accesses to the data structure. We present several efficient data structures for representing integers that use a logarithmic number of bit inspections and a constant number of bit changes per operation. Finally, we present an extension of our data structure to support a special form of addition and subtraction while retaining the same asymptotic bounds for the increment and the decrement operations.
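For contrast, here is a minimal illustration (not the paper's data structure) of the classic trade-off between bit inspections and bit changes: a standard binary counter changes few bits on average but up to n bits in the worst case, while a reflected-Gray-code counter changes exactly one bit per increment at the price of inspecting many bits (its parity and lowest set bit) to decide which one.

```python
def binary_increment(bits):
    """Increment an n-bit binary counter in place (list of 0/1, least significant first).
    The worst case flips all n bits (overflow wraps to zero here)."""
    i = 0
    while i < len(bits) and bits[i] == 1:
        bits[i] = 0
        i += 1
    if i < len(bits):
        bits[i] = 1
    return bits

def gray_increment(bits):
    """Increment an n-bit reflected-Gray-code counter in place: exactly one bit flips,
    but choosing which bit requires inspecting the parity of all the bits."""
    if sum(bits) % 2 == 0:           # even parity: flip the lowest bit
        bits[0] ^= 1
    else:                            # odd parity: flip the bit above the lowest set bit
        i = bits.index(1)
        if i + 1 < len(bits):
            bits[i + 1] ^= 1
        else:                        # wrap-around at the end of the Gray cycle
            bits[i] ^= 1
    return bits

if __name__ == "__main__":
    g, seq = [0, 0, 0], []
    for _ in range(8):
        seq.append(tuple(g))
        gray_increment(g)
    print(seq)   # visits all 8 codes, changing exactly one bit per step
```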
A Nash equilibrium of a noncooperative game is a pair of strategies such that no player has an incentive to unilaterally deviate from his current strategy. Recent results (e.g. Daskalakis, Goldberg, Papadimitriou '06 and Chen, Deng, Teng '06) indicate that polynomial time algorithms for finding an exact Nash equilibrium are unlikely to exist. In this talk we will consider the question of computing approximate Nash equilibrium strategies in bimatrix games. In particular, given a bimatrix game where the entries of both payoff matrices are in the interval [0,1], we will say that a pair of strategies is an epsilon-Nash equilibrium if no player can gain more than epsilon by deviating. I will first summarize the existing results on the complexity of this problem. Then I will present a new polynomial time approximation algorithm that achieves an improved approximation guarantee of epsilon = 0.36.
The classical secretary problem studies the problem of selecting online an element (a "secretary") with maximum value in a randomly ordered sequence. The difficulty lies in the fact that an element must be either selected or discarded upon its arrival, and this decision is irrevocable. Constant-competitive algorithms are known for the classical secretary problems and several variants. In this talk, I present extensions of this problem to combinatorial settings and settings with weights and discounts. These generalizations can be used to build mechanisms for a class of online combinatorial auctions. Parts of this talk are based on joint work with Moshe Babaioff, Michael Dinitz, Anupam Gupta, David Kempe, Robert Kleinberg, and Kunal Talwar.
We study the flow of water on fat terrains, that is, triangulated surfaces where the minimum angle of any triangle is bounded from below by a positive constant. We show that the worst-case complexity of any path of steepest descent on a fat terrain of n triangles is Θ(n), and that the worst-case complexity of the river network on such terrains is Θ(n^2). This improves the corresponding bounds for arbitrary terrains by a linear factor. We prove that in general similar bounds cannot be proven for Delaunay triangulations: these can have river networks of complexity Θ(n^3). Moreover, we present an acyclic graph, the descent graph, that enables us to trace flow paths on triangulated surfaces I/O-efficiently. We use the descent graph to obtain I/O-efficient algorithms for computing river networks and watershed-area maps of fat terrains in O(sort(r + n)) I/Os, where r is the complexity of the river network. We also describe a data structure for reporting the boundary of the watershed of a query point q (or the flow path from q) in O(l + k) I/Os, where l is the number of I/Os used for planar point location and k is the size of the reported output.
Phase transitions not only occur in physics but also in many problems in computer science. An especially important problem is the satisfiability problem, SAT, for Boolean formulas. K-SAT is the problem where the formulas are restricted to clauses consisting of exactly K literals. When the density of clauses (i.e., the ratio between the number of clauses and the number of variables) is increased in random K-SAT problems, there is an abrupt change in the probability from being satisfiable to being unsatisfiable. Numerous experimental results are available, but the exact location of the phase transition is not known for the random K-SAT problem with K > 2. There are only lower and upper bounds which are rigorously proven. We consider formulas with more structure, the model of fixed balanced shapes introduced by Navarro and Voronkov in 2005. They provided first upper bounds for the location of the critical value, i.e., where the phase transition occurs. These upper bounds were obtained by using the first moment method (FMM). In this talk, we present how to improve the upper bounds by a method which is based on locally maximal solutions. This method has been proposed by Creignou and Daudé in 2007. Since this method requires a sensitivity polynomial as an input, we show how such polynomials can be computed for shapes in general.
The "Names in boxes" game is presented by Peter Winkler (College Math. J, vol 37:4, page 260) as follows: "The names of one hundred prisoners are placed in one hundred wooden boxes, one name to each box, and the boxes are lined up on a table in a room. One by one, the prisoners are led into the room; they may look into up to fifty of the boxes to try to find their own name but must leave the room exactly as it was. They are permitted no further communication after leaving the room. The prisoners have a chance to plot a strategy in advance and they are going to need it, because unless they all find their own names they will all be executed. There is a strategy that has a probability of success exceeding thirty percent - find it" The game first appeared (in a somewhat less elegant version) in Anna Gal and Peter Bro Miltersen, "The cell probe complexity of succinct data structures", at ICALP 2003. In this talk we present the history of the game and explain its original motivation from the study of cell probe complexity, a combinatorial theory of compact representation and retrieval of data. In particular, we explain how the "Names in boxes" game relates to the following fundamental question of data retrieval: Can data be stored, so that EVERY data base query with a polynomial time algorithm can be answered in logarithmic time by a sequential algorithm?
We address the extension of the binary search technique from sorted arrays and totally ordered sets to trees and tree-like partially ordered sets. As in the sorted array case, the goal is to minimize the number of queries required to find a target element in the worst case. However, while the optimal strategy for searching an array is straightforward (always query the middle element), the optimal strategy for searching a tree depends on the tree's structure and is harder to compute. We present an O(n)-time algorithm that finds the optimal strategy for binary searching a tree, improving the previous best O(n^3)-time algorithm. The significant improvement is due to a novel approach for computing subproblems, as well as a method for reusing parts of already computed subproblems, and a linear-time transformation from a solution in the form of an edge-weighted tree into a solution in the form of a decision tree.
In this talk, I will present a framework for algorithms for learning statistical models of clustering in the streaming model of computation. The streaming model is a paradigm for the design of algorithms for massive data sets, in which the algorithm may make only a small number of sequential passes over the input while using a very small amount of working memory in order to accomplish the computational task at hand. Our algorithms will consider the following statistical model for clustering, known as a mixture of distributions. We are given k different probability distributions F1, ..., Fk, each of which is given a weight wi >0. A point is drawn according to the mixture by choosing a distribution Fi with probability proportional to wi and then choosing a point according to Fi. The data are then ordered arbitrarily and placed into a data stream and the algorithm's task is to learn the density function of the mixture. I will present a multiple pass streaming algorithm that uses P passes over the data, where P is an input parameter to the algorithm. The memory requirement of the algorithm falls significantly as a function of P, showing a strong trade-off between the number of passes and memory required. Using communication complexity, this trade-off can be proved to be nearly tight, for a slightly stronger learning problem. I will show how this framework can be adapted to solving clustering problems in combinatorial optimization.
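A minimal sketch of the generative model described above (the component distributions and weights are hypothetical; the streaming algorithm itself is not shown):

```python
import random

def sample_mixture(components, weights, m):
    """Draw m points from a mixture: pick component i with probability
    proportional to weights[i], then draw a point from that component."""
    return [random.choices(components, weights=weights)[0]() for _ in range(m)]

# Hypothetical one-dimensional Gaussian mixture with k = 2 components
components = [lambda: random.gauss(0.0, 1.0), lambda: random.gauss(5.0, 0.5)]
weights = [0.3, 0.7]
stream = sample_mixture(components, weights, 10)   # this stream would be fed to the learner
print(stream)
```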
A story of academic researchers taking their findings to the market will be told. Trade Extensions was founded in 2000 as an offspring of theoretical research on how to combine electronic negotiations with optimization techniques and algorithm design. The company now has offices in Sweden and the UK, a presence in more countries, and a global customer base. Some reflections will be given on the challenges in taking academic research to the "real world".
We study a multi-commodity auction where some bidders have synergies between commodities. That is, a bidder gets an extra value if he manages to win all commodities in a specific group. In such a situation, it is a common belief that the auctioneer gets a better revenue by letting the bidder bid on explicit combinations, a so-called combinatorial auction. However, so far there has been no proof that a combinatorial auction actually gives higher revenue. Indeed, the only existing analysis is for a small auction with two commodities, and there the result is the opposite; it is better for the auctioneer to only allow single bids. In this work, we provide the first analytical comparison of the revenue with and without combinatorial bids for a multi-commodity auction with more than two commodities. By giving bounds on the optimal bidding strategies for combinatorial and non-combinatorial auctions, we manage to prove that a combinatorial auction indeed gives higher revenue to the auctioneer. Joint work with Arne Andersson.
A succinct data structure occupies an amount of space that is close to the information-theoretic minimum plus an additional term. The latter is not necessarily a lower-order term and, in several cases, completely dominates the space occupancy both in theory and in practice. We present several solutions to partially overcome this problem, introducing new techniques of independent interest that allow us to improve over previously known upper and lower bounds. Joint work with: Alex Golynski, Ankur Gupta, Roberto Grossi, and Srinivasa Rao
In recent years, there has been an increased interest in understanding selfish routing in large networks like the Internet. Since the Internet is operated by different economic entities with varying interests, it is natural to model these entities as selfish agents who are only interested in maximizing their own benefit. Congestion games are a classical model for resource allocation among selfish agents. We consider the special case of network congestion games, in which the resources are the edges of a graph and every player wants to allocate a path between her source and target node. The delay of an edge increases with the number of players allocating it, and every player is interested in allocating a routing path with minimum delay. It is well known that there are stable states of the game in which no player has an incentive to change her strategy, since all alternative paths have greater delays than the one chosen. These states are known as Nash equilibria and are not necessarily optimal with respect to some overall measure of the state the game is in. In the development process of a network architecture one might be interested in analysing its performance when a Nash equilibrium is reached, for which such a state has to be computed. However, it has been shown that the problem of computing Nash equilibria in network congestion games is PLS-complete, which basically means that there is no known efficient algorithm to solve the problem. A standard approach in computer science to circumvent such impediments is to approximate a solution to the problem. But there is also a different reason to consider approximations in the first place. In many applications, players incur some costs when they change their strategy. Hence, it is reasonable to assume that a player is only interested in changing her strategy if this decreases her delay significantly. These considerations lead to the notion of an ε-approximate Nash equilibrium, which is a state in which no player can decrease her delay by more than a factor of (1 + ε) by unilaterally changing her strategy. Unfortunately, it was lately shown that the problem of computing an ε-approximate Nash equilibrium in congestion games is PLS-complete too, and it is strongly believed that this result also holds for network congestion games. However, the delay functions used in the reductions are quite artificial. Therefore we formulate some restrictions on the delay functions, and by that obtain an efficient algorithm using the method of randomised rounding to compute ε-approximate Nash equilibria for network congestion games. We then study the delay function classes that result from these restrictions, including polynomials, exponential functions, a mixture of the first two, and functions from queuing theory. The fully worked out thesis treating the described issues can be reviewed at http://www-users.rwth-aachen.de/andreas.feldmann/thesis.pdf.
Study of Network Formation Games constitutes an exciting research branch of Algorithmic Game Theory, aiming at understanding performance of networks formed by selfish autonomous entities. In this talk we give a brief overview of results on network formation games, and focus on distributed caching/replication networks in particular. We study implications of selfish behavior in total bandwidth consumption over a distributed replication network formed by autonomous users/nodes. A user may choose to replicate (up to constrained local storage capacity) a subset of frequently accessed information objects, and retrieve the rest from neighboring nodes, so as to minimize his individual access cost. For the related strategic game we prove existence of pure strategy Nash equilibria on certain network topologies, and study their quality in terms of the prices of anarchy and stability. We also discuss related open problems.
In the finite capacity dial-a-ride problem the input is a metric space, a set of objects, where each object di specifies a source si and a destination ti, and an integer k - the capacity of the vehicle used for making the deliveries. The goal is to compute a shortest tour for the vehicle in which all objects can be delivered to their destinations (from their sources) while ensuring that the vehicle carries at most k objects at any point in time. There are two variants of the problem: the non-preemptive case, in which an object once loaded on the vehicle stays on it until delivered to its destination, and the preemptive case in which an object may be dropped at intermediate locations and then picked up later by the vehicle and delivered. The dial-a-ride problem generalizes the Traveling Salesman problem (TSP) even for k=1 and is thus NP-hard. Let N be the number of nodes in the input graph, i.e., the number of points that are either sources or destinations. Charikar and Raghavachari gave a min{O(log N), O(k)}-approximation algorithm for the preemptive version of the problem. We show that the preemptive dial-a-ride problem has no min{O(log^(1/4-ε) N), k^(1-ε)}-approximation algorithm for any ε>0 unless NP is a subset of ZP(n^polylog(n)). ZP(n^polylog(n)) is the class of problems solvable by a randomized algorithm that always returns the right answer and has expected running time O(n^polylog(n)), where n is the size of the input. We will also discuss why this lower bound does not extend to the non-preemptive case and discuss ideas for showing hardness of approximation of this variant of the problem.
We define and study new variants of the network interdiction problem, where the objective is to destroy high-capacity paths from the source to the destination. We consider both global and local, node-wise limited, budgets of arc removals. We suggest a polynomial time algorithm for global budgets and an almost-linear time comparison based algorithm for local budgets. The existence of a linear time algorithm remains an open problem. This talk is based on joint work with Peter Bro Miltersen, Troels Bjerre Sorensen, and Kristoffer Arnsfelt Hansen.
In recent years there has been growing interest in research that explores the connections between game theory and cryptography, especially in the context of protocol design. In this talk, we first give a brief overview of some existing work in the area of rational cryptography, which studies game-theoretic analysis of cryptographic protocols and cryptography-based implementations of game-theoretic concepts. We then describe a multi-party protocol for securely computing a class of functions in a model where none of the participating parties are honest: they are either rational, acting in their selfish interest to maximize their utility, or adversarial, acting arbitrarily.
Consider a task like the execution of a computer program or a file transfer that may fail before being completed. Standard schemes for failure recovery are RESUME (continue after repair), REPLACE (start a new task of the same type) and RESTART (start the same task from the beginning), but also ideas like checkpointing have been discussed. Failure times are modeled as random, and often the ideal task times are as well. In both cases, the total actual task time is random. RESUME and REPLACE have been analyzed in work by Trivedi, Kulkarni and others, whereas RESTART has resisted detailed analysis until recently. We present a discussion of the various schemes and more detailed results on the structure of the total task time for RESTART. Also some applications to parallel computing are outlined.
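A Monte Carlo sketch of the RESTART scheme, assuming exponentially distributed failure times and a fixed ideal task time (both assumptions are mine, chosen only for illustration):

```python
import random

def restart_total_time(ideal_time, failure_rate, rng=random):
    """Total actual task time under RESTART: if a failure occurs before the task
    of length `ideal_time` completes, the whole task starts over from scratch."""
    total = 0.0
    while True:
        failure = rng.expovariate(failure_rate)   # hypothetical exponential failure time
        if failure >= ideal_time:
            return total + ideal_time             # the task finishes before the next failure
        total += failure                          # wasted work, then restart

samples = [restart_total_time(ideal_time=1.0, failure_rate=0.5) for _ in range(100000)]
print(sum(samples) / len(samples))   # empirical mean of the total task time
```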
It is well known that Nash equilibria of two-player zero-sum extensive form games can be found using linear programming. Despite the widespread use of this technique, the strategies computed suffer from certain deficiencies as they sometimes fail to prescribe sensible play in all parts of the game. This issue is addressed by the concept of equilibrium refinements. A well established refinement is the normal form proper equilibrium, which has many desirable properties. In this talk, we show how to find a normal form proper equilibrium in behavior strategies of a given two-player zero-sum extensive form game with imperfect information but perfect recall. Our algorithm solves a finite sequence of linear programs and runs in polynomial time. For the case of a perfect information game, we show how to find a normal form proper equilibrium in linear time by a simple backwards induction procedure. This is joint work with Peter Bro Miltersen.
We develop two new methods for creating high-quality tetrahedral meshes: one with guaranteed good dihedral angles, and one that in practice produces far better dihedral angles than any prior method. The isosurface stuffing algorithm fills an isosurface with a uniformly sized tetrahedral mesh whose dihedral angles are bounded between 10.7 degrees and 165 degrees. The algorithm is whip fast, numerically robust, and easy to implement because, like Marching Cubes, it generates tetrahedra from a small set of precomputed stencils. Our angle bounds are guaranteed by a computer-assisted proof. Our second contribution is a mesh improvement method that uses optimization-based smoothing, topological transformations, and vertex insertions and deletions to achieve extremely high quality tetrahedra. Joint work with Francois Labelle and Bryan Klingner.
Trees are one of the most fundamental structures in computing. They are used in almost every aspect of modeling and representation for explicit computations. Standard representations of trees using pointers are quite wasteful of space, and could account for the dominant space cost in applications such as storing a suffix tree. For example, a standard representation of a binary tree on n nodes uses 2n pointers or 2n log n bits. This is a factor of log n more than the minimum number of bits necessary, as there are fewer than 4^n distinct binary trees on n nodes. Also, this only supports finding the left/right child of a node efficiently. In this talk, starting with a brief introduction to succinct or highly space efficient data structures, I will present some tree representations that take only 2n + o(n) bits and support various useful queries efficiently. I will also briefly mention some applications where these can be used.
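As an illustration of how a binary tree shape can be stored in about 2n bits (a level-order, two-bits-per-node encoding in the spirit of the representations mentioned above; the o(n)-bit auxiliary structures that make navigation fast are omitted):

```python
from collections import deque

class Node:
    def __init__(self, left=None, right=None):
        self.left, self.right = left, right

def shape_bits(root):
    """Level-order encoding: two bits per node (has left child?, has right child?),
    2n bits in total, from which the tree shape can be reconstructed."""
    bits, queue = [], deque([root])
    while queue:
        node = queue.popleft()
        bits += [1 if node.left else 0, 1 if node.right else 0]
        queue.extend(child for child in (node.left, node.right) if child)
    return bits

# A hypothetical 4-node tree: root with two children; the left child has a right child
tree = Node(Node(None, Node()), Node())
print(shape_bits(tree))   # [1, 1, 0, 1, 0, 0, 0, 0] -- 8 bits for 4 nodes
```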
We show how to compute Delaunay triangulations of utterly huge, well-distributed point sets in 2D and 3D on an ordinary computer by exploiting the natural spatial coherence in a stream of points. We achieve large performance gains by introducing spatial finalization into point streams: we partition space into regions, and augment a stream of input points with finalization tags that indicate when a point is the last in its region. By extending an incremental algorithm for Delaunay triangulation to use finalization tags and produce streaming mesh output, we compute a billion-triangle terrain representation for the Neuse River system from 11.2 GB of LIDAR data in 48 minutes using only 70 MB of memory on a laptop with two hard drives. This is a factor of twelve faster than the previous fastest out-of-core Delaunay triangulation software.
This talk discusses combinatorial group testing, which began from work on detecting diseases in blood samples taken from GIs in WWII. Given a parameter d, which provides an upper bound on the number of defective (e.g., diseased) samples, the main objective of such problems is to design algorithms that identify all the defective samples without explicitly testing all n samples. This classic problem has a number of interesting modern applications, and we provide several new efficient algorithms that can be applied in these new contexts. In particular, modern applications we will discuss include problems in DNA sequencing, wireless broadcasting, and network security. Biography: Prof. Goodrich received his B.A. in Mathematics and Computer Science from Calvin College in 1983 and his PhD in Computer Sciences from Purdue University in 1987. He is a Chancellor's Professor at the University of California, Irvine, where he has been a faculty member in the Department of Computer Science since 2001. Dr. Goodrich's research is directed at the design of high performance algorithms and data structures for solving large-scale problems motivated from information assurance and security, the Internet, information visualization, and geometric computing.
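A sketch of the classical adaptive binary-splitting idea (test a pool; if it is positive, split it and recurse), which conveys why far fewer than n tests can suffice when d is small; the modern applications and constructions of the talk are different, and the ground truth below is hypothetical:

```python
def find_defectives(samples, pooled_test):
    """Classical adaptive binary splitting: test a pool; if it is positive,
    split it in two and recurse, down to single samples."""
    if not samples or not pooled_test(samples):
        return []
    if len(samples) == 1:
        return list(samples)
    mid = len(samples) // 2
    return (find_defectives(samples[:mid], pooled_test)
            + find_defectives(samples[mid:], pooled_test))

defective = {3, 17}          # hypothetical ground truth, unknown to the algorithm
tests = []
def pooled_test(group):
    tests.append(group)
    return any(s in defective for s in group)   # positive iff the pool contains a defective

print(find_defectives(list(range(32)), pooled_test))          # [3, 17]
print(len(tests), "pooled tests instead of 32 individual ones")  # 19 here
```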
In a hedonic game a set of players N splits up into coalitions and the payoff for a player depends only on the members of that player's coalition. A hedonic game is additively separable if a utility function vp exists for each player p so that the payoff for p for belonging to coalition C is ∑i ∈ C vp(i). A partition of N is Nash stable if no player would be strictly better off by moving to another coalition in the partition. Ballester has shown that the problem of deciding whether a Nash stable partition exists in a hedonic game with arbitrary preferences is NP-complete. We show that the problem remains NP-complete even when restricting to additively separable hedonic games. Bogomolnaia and Jackson have shown that a Nash stable partition exists in every additively separable hedonic game with symmetric preferences. We show that the problem of deciding whether a non-trivial Nash stable partition exists in an additively separable hedonic game with non-negative and symmetric preferences is NP-complete. This presentation will also be given at the conference CiE 2007 - a conference on computation and logic in the real world.
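To make the definitions concrete, a small checker for Nash stability in an additively separable hedonic game (the players and utilities below are hypothetical):

```python
def payoff(player, coalition, v):
    """Additively separable payoff: sum of v[player][i] over the other members i."""
    return sum(v[player][i] for i in coalition if i != player)

def is_nash_stable(partition, v):
    """Nash stable: no player strictly gains by moving to another coalition of the partition."""
    for current in partition:
        for p in current:
            others = [c for c in partition if c is not current]
            if any(payoff(p, c | {p}, v) > payoff(p, current, v) for c in others):
                return False
    return True

# Hypothetical 3-player game with symmetric utilities
v = {0: {1: 2, 2: -1}, 1: {0: 2, 2: 1}, 2: {0: -1, 1: 1}}
print(is_nash_stable([{0, 1}, {2}], v))   # True: nobody gains by switching
print(is_nash_stable([{0, 2}, {1}], v))   # False: player 0 prefers joining {1}
```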
Tetris, Sudoku and Minesweeper are some of the well known puzzles that have been studied from a complexity theoretic point of view. While these are inventions of the late 20th century, there is one puzzle that has been around for more than 600 years(!) and has never had its complexity investigated. Until now, that is... This presentation will also be given at the Fourth International Conference on Fun with Algorithms (FUN 2007).
Some recent results on high dimensional search are presented. We say that d-dimensional search is high-dimensional when d approaches log n, where n is the number of points or objects to be searched. For example, the l-level k-range is reported to have orthogonal range time Q(n, d) = O(log n + A), where A is the number of points or objects in range. Our results show that the l-level k-range requires Q(n, d, l) = O((2l)^(d-1)(log N + A)) time for orthogonal range search, making it impractical for range search, even for relatively low d. For d-dimensional point data, the venerable k-d tree is found to be competitive with the Patricia trie adapted for d-dimensional search. For large d, we present a technique based on the pyramid technique that we call the PKD-tree. The PKD-tree shows good performance in testing with uniformly distributed random data points (n ≤ 1,000,000 and d ≤ 100) and with 68,040 32-d data points from a colour histogram dataset. We adapted the pyramid technique to implement a k-nearest neighbour algorithm called the decreasing radius or DR pyramid technique. Results indicate that for uniformly distributed random data, the DR pyramid and BBD-tree algorithms are comparable. For d ≥ 16, we discovered that a naive (brute force) search was faster than six other algorithms for k-nearest neighbour search. The talk presents some observations about why efficient search in high dimensions is challenging.
We propose a simple variant of the kd-tree, called the rank-based kd-tree, for sets of points in R^d. We show that a rank-based kd-tree, like an ordinary kd-tree, supports range search queries in O(n^(1-1/d) + k) time, where k is the output size. The main advantage of rank-based kd-trees is that they can be efficiently kinetized: the KDS processes O(n^2) events in the worst case, each event can be handled in O(log n) time, and each point is involved in O(1) certificates.
We investigate the complexity of finding Nash equilibria in which the strategy of each player is uniform on its support set. We show that, even for a restricted class of win-lose bimatrix games, deciding the existence of such uniform equilibria is an NP-complete problem. Our proof exploits a connection between uniform equilibria and some appropriately defined graph structures associated to the game.
This talk will present some of my recent and current work on non-cooperative games in large networks. The first part will treat a class of games for cost sharing between selfish users. The games encompass a variety of optimization settings, e.g. for covering, facility location, and Steiner tree creation problems. Pure Nash equilibria in these games do not necessarily exist. If they exist, they can be very inefficient and hard to compute. However, all games considered admit cheap and stable approximate Nash equilibria. They can be computed in polynomial time, and the talk will present algorithmic procedures based on primal-dual approximation algorithms and problem-specific combinatorial approaches. In addition, special cases of the games are presented, which admit pure Nash equilibria that are efficient and/or computable in polynomial time. The second part of the talk will give some ideas of my current work in progress on Stackelberg network pricing and average-case aspects of selfish routing.
We present improved and simplified IO-efficient algorithms for map overlay and point location in planar subdivisions (also called planar maps). These are two important problems in computational geometry with application in Geographic Information Systems (GIS). We analyze the IO-complexity of our algorithms on the popular external-memory IO-model. We show that the IO-complexity scales well with both the parameters of the memory hierarchy and the geometric complexity of the input. Our algorithms and data structures improve on the previous best known bounds for general subdivisions both in the number of IOs and storage space; they are significantly simpler and easier to implement. Specifically, we show how to preprocess a so-called low-density subdivision with n edges in O(sort(n)) IOs into a compressed linear quadtree such that one can: (i) compute the overlay of two such preprocessed subdivisions in O(scan(n)) IOs, where n is the total number of edges in the two subdivisions, and (ii) answer a single point location query in O(log_B n) IOs and k batched point location queries in O(scan(n) + sort(k)) IOs. For the special case where the subdivision is a fat triangulation, we show how to obtain the same bounds with an ordinary (uncompressed) quadtree, and we show how to make the structure fully dynamic using O(log_B n) IOs per update.
We study the complexity of restricted versions of st-connectivity, which is the standard complete problem for NL. Grid graphs are a useful tool in this regard, since * reachability on grid graphs is logspace-equivalent to reachability in general planar digraphs, and * reachability on certain classes of grid graphs gives natural examples of problems that are hard for NC1 under AC0 reductions but are not known to be hard for DLOG; they thus give insight into the structure of DLOG. In addition to explicating the structure of DLOG, another of our goals is to expand the class of digraphs for which connectivity can be solved in logspace, by building on the work of Jakoby et al. who showed that reachability in series-parallel digraphs is solvable in DLOG. Our main results are: * Reachability on planar graphs (and grid graphs) is logspace-equivalent to reachability on graphs of genus 1. Nothing is known about genus 2 and higher, except for the trivial NL upper bound. * Reachability on "layered" grid graphs can be done in UL (a subclass of NL). * Many of the natural restrictions on grid-graph reachability (GGR) are equivalent under AC0 reductions (for instance, undirected GGR, out-degree-one GGR, and indegree-one-outdegree-one GGR are all equivalent). These problems are all equivalent to the problem of determining if a completed game position in HEX is a winning position, as well as to the problem of reachability in mazes studied by Blum and Kozen. This gives rise to a hierarchy of complexity classes between NC1 and L. * Series-Parallel digraphs are a special case of single-source-single-sink planar dags; reachability for such graphs logspace reduces to single-source-single-sink acyclic grid graphs. We show that reachability on such grid graphs AC0 reduces to undirected GGR. * We build on this to show that reachability for single-source multiple-sink planar dags is solvable in DLOG. This is joint work with David A. Mix Barrington, Tanmoy Chakraborty, Samir Datta, and Sambuddha Roy.
We present exact algorithms to compute the chromatic number and the chromatic polynomial in time and space within a polynomial factor of 2^n. The result is based on a novel inclusion-exclusion characterisation of set covering, simple and completely self-contained. A brief historical overview is given, including Birkhoff's 1912 definition of the chromatic polynomial. Apart from graph colouring, the algorithm works for a family of related problems such as Domatic Number, Bin Packing, and other graph partitioning problems. We also present polynomial space variants. Joint work with Andreas Björklund.
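A tiny, exponential-time rendition of the inclusion-exclusion idea for graph colouring (my own spelled-out version of the standard formula, only meant for very small graphs; the actual algorithm organises this computation to run within a polynomial factor of 2^n):

```python
from itertools import combinations

def chromatic_number(n, edges):
    """G is k-colourable iff sum over X subseteq V of (-1)^(n-|X|) * i(X)^k > 0,
    where i(X) is the number of independent sets contained in X (empty set included)."""
    vertices = range(n)

    def independent_sets_within(mask):
        members = [v for v in vertices if mask >> v & 1]
        count = 0
        for r in range(len(members) + 1):
            for S in combinations(members, r):
                if all((u, v) not in edges and (v, u) not in edges
                       for u, v in combinations(S, 2)):
                    count += 1
        return count

    i = [independent_sets_within(mask) for mask in range(1 << n)]
    for k in range(1, n + 1):
        if sum((-1) ** (n - bin(mask).count('1')) * i[mask] ** k
               for mask in range(1 << n)) > 0:
            return k

edges = {(0, 1), (1, 2), (0, 2), (2, 3)}   # a triangle with a pendant vertex
print(chromatic_number(4, edges))          # 3
```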
This talk is devoted to the design and analysis of combinatorial algorithms for solving one-player versions of several prominent infinite duration games pertinent to automated verification of computerized systems. We present the first two strongly polynomial algorithms for solving one-player discounted payoff games, running in time O(mn^2) and O(mn^2 log m), where the latter algorithm allows edges to have different discounting factors. As applications, we are able to improve the best previously known strongly subexponential algorithms for solving two-player discounted payoff games and the ergodic partitioning problem for mean payoff games.
In this talk we will review some low-dimension geometric approximation algorithms that work by extracting a small subset of the input, and performing the computation on this small subset. Such subsets, referred to as coresets, had emerged as a powerful tool, and we will survey some of the resulting algorithms and future challenges associated with coresets.
Suppose we are given a set of objects that cover a region and a duration associated with each object. Viewing the objects as jobs, can we schedule their beginning times to maximize the length of time that the original region remains covered? This is the Sensor Cover Problem. For example, suppose you wish to monitor activity along a fence (interval) by sensors placed at various fixed locations. Each sensor has a range (also an interval) and limited battery life. The problem is then to schedule when to turn on the sensors so that the fence is fully monitored for as long as possible. This one-dimensional problem involves intervals on the real line. Associating a duration to each yields a set of rectangles in space and time, each specified by a pair of fixed horizontal endpoints and a height. The objective is to assign a bottom position to each rectangle (by moving them up or down) so as to maximize the height at which the spanning interval is fully covered. We call this one-dimensional problem Restricted Strip Covering. If we replace the covering constraint by a packing constraint (rectangles may not overlap, and the goal is to minimize the highest point covered), then the problem becomes identical to Dynamic Storage Allocation, a well-studied scheduling problem, which is in turn a restricted case of the well known problem Strip Packing. We present a collection of algorithms for Restricted Strip Covering. We show that the problem is NP-hard and present an O(logloglog n)-approximation algorithm. We also present better approximation or exact algorithms for some special cases, including when all intervals have equal width. For the general Sensor Cover Problem, we distinguish between cases in which elements have uniform or variable durations. The results depend on the structure of the region to be covered: We give a polynomial-time, exact algorithm for the uniform-duration case of Restricted Strip Covering but prove that the uniform-duration case for higher-dimensional regions is NP-hard. We give some more specific results for two-dimensional regions. This is joint work with Alon Efrat, Shaili Jain, Suresh Venkatasubramanian, and Ke Yi.
The problem of finding dominators in a directed graph calls for computing, for each vertex v, the set of vertices that are visited in all paths from r to v, where r is a distinguished root vertex of the graph. This is a fundamental problem in graph algorithms with several applications, including program optimization, circuit testing, constraint programming and theoretical biology. In the first part of this talk we present an experimental study of practical algorithms for finding dominators. These include a simple, iterative algorithm proposed by Cooper et al., the well-known algorithm of Lengauer and Tarjan, and a new, hybrid algorithm that we introduce. In the second part we present the first linear-time algorithm for computing dominators on a pointer-machine. Previous linear-time algorithms, given by Alstrup et al. and Buchsbaum et al., were implementable only on the random-access model of computation. Although our linear-time algorithm settles the complexity of finding dominators, some variants of this problem remain open. We conclude with a discussion of such problems and their motivating applications. This is joint work with R. E. Tarjan. The experimental part was performed in collaboration with R. F. Werneck, S. Triantafyllis and D. I. August.
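For orientation, the textbook iterative fixed-point formulation of dominator sets (in the spirit of the simple iterative algorithm mentioned above, though not the specific engineered variant or the linear-time algorithm of the talk); the control-flow graph is hypothetical:

```python
def dominators(graph, root):
    """Iterative fixed-point computation of dominator sets:
    Dom(root) = {root}; Dom(v) = {v} union intersection of Dom(p) over predecessors p of v."""
    nodes = set(graph)
    preds = {v: set() for v in nodes}
    for u, succs in graph.items():
        for v in succs:
            preds[v].add(u)
    dom = {v: set(nodes) for v in nodes}
    dom[root] = {root}
    changed = True
    while changed:
        changed = False
        for v in nodes - {root}:
            incoming = [dom[p] for p in preds[v]]
            new = {v} | (set.intersection(*incoming) if incoming else set())
            if new != dom[v]:
                dom[v], changed = new, True
    return dom

# Hypothetical control-flow graph: r -> a, r -> b, a -> b, b -> c
graph = {'r': ['a', 'b'], 'a': ['b'], 'b': ['c'], 'c': []}
print(dominators(graph, 'r'))   # Dom(c) = {'r', 'b', 'c'}
```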
For a 0-1 matrix C = {cij}, column j is said to cover row i if cij = 1. A cover is a subset of columns covering all rows. Unweighted SET COVER (USC) is the problem of finding a cover containing as few columns as possible. A seemingly innocent brain teaser was published in a Danish journal in 2003. Not much reflection is needed to realize that the solution of a highly structured instance C of USC will answer the question posed. The instance C, or rather the family of instances C(n), is a square matrix of size 3^n, n = 1,2, ... . Provided that C(1) is known, the fractal-like structure of the matrices allows for C(n) to be constructed recursively for any value of n. Let r(n) be the number of columns in an optimal solution to C(n). Some preliminary investigations have enabled us to determine r(n), n=1,2,3,4, and to show that r(5) must be either 12 or 13. 12 or 13? Unfortunately LP-based bounds take us nowhere in this case since the optimum is flat as a pancake. It offers some consolation though that experiments with CPLEX were not too encouraging either. For n=5, C(5) is a matrix of size 243x243. Nothing was returned after 24 hours CPU time. Eventually, upon an investment of 72 hours CPU time, CPLEX managed to come up with r(5)=12. The original problem asks for r(12), that is, an optimal solution to a square matrix of size 531,441. After a week the originator of the problem, using an invalid argument, announced his own answer, r(12)=512, and cashed the award. We have so far shown that r(12) is bounded to belong to the interval [210, 377]. We have also concluded that no brute force approach such as further experiments with CPLEX is likely to work, whereas paper and pencil may suffice in providing the final result. The text above is a slightly extended version of the abstract used when a few preliminary results for n < 6 were presented at meetings in 2004 and 2005. State-of-affairs as per today: the transition from patches to tiles in the investigation of feasible solutions has reduced the dimension by a factor of 3^4. In terms of tiles a full-scale verification of the optimal solution found for n=6 (729x729) occupies only a single A4 sheet. Moreover, n=7 has led to new insight into an important notion called compatibility, to the corresponding STU forms, to the concept of even/odd keys, and to the discovery of matching cliques. February 2006: paper and pencil have been supplemented by a puzzle with pieces of transparent strips showing three or six letters each. The puzzle is readily solvable within a few minutes, for example, on an overhead projector. It is believed that the patterns thereby generated represent a significant step ahead towards the general "n -> n+1" (recursive) solution procedure aimed for. Being born optimists we also conjecture that the lower bounds are tight for all values of n. If true, r(12) = 210. Disregarding the distant and, for our purpose, largely irrelevant familiarity with the "Coupon Collector Problem", nothing useful whatsoever has been found in the literature that we are familiar with. Thus, if the nut can be fully cracked, the visible result will presumably be an indeed self-contained paper as reflected by the total absence of references.
The talk will discuss cache-oblivious structures for 2- and 3-sided planar orthogonal range searching. The structure for 2-sided range searching is the first cache-oblivious structure for this problem that achieves the optimal query bound and linear space. Using this structure, a new structure for 3-sided range searching can be obtained that matches the query and space bounds of the best previous structure, but is simpler. At the expense of a slightly worse query bound, the 2-sided structure can be made semi-dynamic, supporting insertions in the same amortized bound as cache-oblivious B-trees.
A total dominating set S in a graph G=(V(G),E(G)) is a set of vertices such that every vertex in G is adjacent to a vertex in S. In other words, ∀ x ∈ V(G) ∃ s ∈ S: xs ∈ E(G). The minimum size of a total dominating set of a graph G, denoted γt(G), is well studied. We will talk about new bounds on γt(G) in terms of δ(G), the minimum degree of G, and Δ(G), the maximum degree of G.
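A one-line checker for the definition above, on a hypothetical 5-cycle (note how totality also forces the vertices of S themselves to have a neighbour in S):

```python
def is_total_dominating(S, adj):
    """S is a total dominating set iff every vertex (including those in S)
    has at least one neighbour in S."""
    return all(any(nb in S for nb in adj[x]) for x in adj)

# Hypothetical 5-cycle 0-1-2-3-4-0
adj = {0: [1, 4], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [0, 3]}
print(is_total_dominating({0, 1, 2}, adj))   # True
print(is_total_dominating({0, 2}, adj))      # False: an ordinary dominating set,
                                             # but vertices 0 and 2 have no neighbour in S
```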
Given a bipartite graph G=(X ⊍ D, E ⊆ X × D), an X-perfect matching is a matching in G that covers every node in X. In this talk we study the following generalisation of the X-perfect matching problem, which has applications in constraint programming: Given a bipartite graph as above and a collection F ⊆ 2^X of k subsets of X, find a subset M ⊆ E of the edges such that for each C ∈ F, the edge set M ∩ (C × D) is a C-perfect matching in G (or report that no such set exists). We show that the decision problem is NP-complete and that the corresponding optimisation problem is in APX when k=O(1) and even APX-complete already for k=2. On the positive side, we show that a 2/(k+1)-approximation can be found in 2^k poly(k, |X ∪ D|) time. This is joint work with Khaled Elbassioni, Martin Kutz and Meena Mahajan, to be presented at ISAAC 2005. The paper can be downloaded from http://www.brics.dk/~irit/pub/SM.ps
Motivated by an application in computational topology, we consider a novel variant of the problem of efficiently maintaining dynamic rooted trees. This variant allows an operation that merges two tree paths. In contrast to the standard problem, in which only one tree arc at a time changes, a single merge operation can change many arcs. In spite of this, we develop a data structure that supports merges and all other standard tree operations in O(log^2 n) amortized time on an n-node forest. For the special case that occurs in the motivating application, in which arbitrary arc deletions are not allowed, we give a data structure with an O(log n) amortized time bound per operation, which is asymptotically optimal. The analysis of both algorithms is not straightforward and requires ideas not previously used in the study of dynamic trees. We explore the design space of algorithms for the problem and also consider lower bounds for it.
A solution to an optimisation problem can be viewed as a collection of decisions (e.g., "node v1 is in the independent set, node v2 is not in the independent set, etc..."). An optimal solution O is then a collection of choices that maximizes the profit or minimizes the cost of the solution. Let O[C] be the optimal solution that can be obtained if we are forced to make the choice C. In several applications in constraint programming, it is useful to compute (or approximate) the values O[C1],...,O[Ck] where C1,...,Ck is a set of mutually exclusive choices. The main question in sub-optimality is: Will it help us to compute O first? The focus of the talk will not be on applications. I will discuss sub-optimality as a computational-theoretic concept and try to interest you in further research into it. I will partially answer the "main question" by showing that the answer is "sometimes". The talk is based on joint work with Russell Bent and Pascal van Hentenryck, which is to be presented at CP 2005 (http://www.iiia.csic.es/cp2005/).
The talk discusses three algorithmic problems (a weight balancing problem; a data management problem; a sequencing problem vaguely related to dartboards), some of their combinatorial properties, and one unifying theorem.
Given a digraph G=(V,E) with a set U of vertices marked ``interesting,'' we want to find a smaller digraph H = (V',E') with V' ⊇ U in such a way that the reachabilities amongst those interesting vertices in G and H are the same. So with respect to the reachability relations within U, the digraph H is a substitute for G. This problem has applications, for example, in network design and also as an important building block in the context of dynamic reachability. The complexity of finding such a reachability substitute of minimum size has so far been open. We prove that the problem is NP-hard. Our main contribution is about planar digraphs: While almost all digraphs do not allow reachability substitutes smaller than Ω(|U|^2/log |U|), every planar digraph has a reachability substitute of size O(|U| log^2 |U|). Our proof is constructive and offers an efficient algorithm to find a reachability substitute of this size. The talk describes joint work with Irit Katriel and Martin Skutella.
I will describe an efficient algorithm to compute the shortest path homotopic to a given path on a given combinatorial surface. The algorithm has two phases. In the preprocessing phase, we construct a set of simple cycles, each as short as possible in its homotopy class, that decompose the surface into topological disks each with eight edges. This decomposition allows us to approximate the given irregular metric with a standard hyperbolic metric, defined by a regular tiling of the hyperbolic plane by right-angled octagons. In the query phase, we exploit this hyperbolic structure and techniques in combinatorial group theory to reduce the shortest-path search to a small subset of the universal cover. Dijkstra's algorithm then finds the true shortest path quickly. This is joint work in progress with Eric Colin de Verdiere.
In this talk I will present efficient indexing techniques for answering range and nearest neighbor queries, which are essential operations for most practical applications dealing with very large spatio-temporal datasets. Generating and storing huge amounts of data is a typical characteristic of spatio-temporal applications, given the highly dynamic nature of such environments. Therefore, indexing spatio-temporal data for answering queries efficiently has become a very important issue. In this direction, I will present specialized algorithms for approximating and indexing spatio-temporal objects cost-effectively. These algorithms introduce a space-utilization vs. index-quality tradeoff that enables index performance tuning according to application requirements. The presented solutions are based on dynamic programming and greedy techniques for producing high quality object approximations. The main characteristic of our solutions is that they decrease the amount of empty volume introduced in the index, while keeping the complexity of the approximations low.
Consider a dynamic file where new keys are inserted with a probability that obeys an unknown smooth distribution μ and existing keys are deleted at random. We show that the probabilistic analyses of the known dynamic interpolation search data structures, for storing this file, retain their correctness only when the produced elements are pairwise distinct, otherwise they fail. Moreover, we present a new dynamic interpolation search data structure whose probabilistic analysis is always valid and which exhibits similar expected asymptotic performance as the aforementioned structures.
Quicksort was first introduced in 1961 by Hoare. Many variants have been developed, the best of which are among the fastest generic sorting algorithms available, as testified by the choice of Quicksort as the default sorting algorithm in most programming libraries. Some sorting algorithms are adaptive, i.e. they have a complexity analysis which is better for inputs which are nearly sorted, according to some specified measure of presortedness. Quicksort is not among these, as it uses Ω(n log n) comparisons even when the input is already sorted. However, in this paper we demonstrate empirically that the actual running time of Quicksort is adaptive with respect to the presortedness measure Inv. Differences close to a factor of two are observed between instances with low and high Inv value. We then show that for the randomized version of Quicksort, the number of element swaps performed is provably adaptive with respect to the measure Inv. More precisely, we prove that randomized Quicksort performs expected O(n(1+log (1+Inv/n))) element swaps, where Inv denotes the number of inversions in the input sequence. This result provides a theoretical explanation for the observed behavior, and gives new insights on the behavior of the Quicksort algorithm. We also give some empirical results on the adaptive behavior of Heapsort and Mergesort. Joint work with Gerth S. Brodal and Rolf Fagerberg.
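An empirical sketch in the spirit of the measurements described above, using a Hoare-style randomized partition (not necessarily the exact Quicksort variant studied in the paper); it counts element swaps on a nearly sorted input and on a random permutation:

```python
import random

def inversions(a):
    """Inv: number of pairs (i, j) with i < j and a[i] > a[j] (quadratic, for illustration)."""
    return sum(a[i] > a[j] for i in range(len(a)) for j in range(i + 1, len(a)))

def quicksort_swaps(data):
    """Randomized in-place quicksort (Hoare-style partition); returns the number of swaps."""
    a, swaps = list(data), 0

    def sort(lo, hi):               # sorts a[lo:hi]
        nonlocal swaps
        if hi - lo < 2:
            return
        pivot = a[random.randrange(lo, hi)]
        i, j = lo, hi - 1
        while i <= j:
            while a[i] < pivot:
                i += 1
            while a[j] > pivot:
                j -= 1
            if i <= j:
                a[i], a[j] = a[j], a[i]
                swaps += 1
                i, j = i + 1, j - 1
        sort(lo, j + 1)
        sort(i, hi)

    sort(0, len(a))
    assert a == sorted(data)
    return swaps

n = 2000
nearly_sorted = [i + random.randint(-3, 3) for i in range(n)]
shuffled = random.sample(range(n), n)
for seq in (nearly_sorted, shuffled):
    print(inversions(seq), quicksort_swaps(seq))   # fewer inversions -> noticeably fewer swaps
```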
Even though a large number of external memory graph algorithms have been developed in recent years, very little effort went into investigating the practical merits of theoretically developed I/O-efficient algorithms. In this talk, I will discuss the implementation and comparative study undertaken by us, between Munagala-Ranade's BFS algorithm and Mehlhorn-Meyer's randomized BFS algorithm on various graph classes. I will also describe the STXXL design framework for implementing external memory algorithms, with a particular reference to the concept of pipelining.
This talk considers integer sorting on a RAM. We show that adaptive sorting of a sequence with qn inversions is asymptotically equivalent to multisorting groups of at most q keys, and a total of n keys. Using the recent O(n √loglog n) expected time sorting of Han and Thorup on each set, we immediately get an adaptive expected sorting time of O(n √loglog q). Interestingly, for any positive constant ε, we show that multisorting and adaptive inversion sorting can be performed in linear time if q ≤ 2^((log n)^(1-ε)). We also show how to asymptotically improve the running time of any traditional sorting algorithm on a class of inputs much broader than those with few inversions. Joint work with Rasmus Pagh and Mikkel Thorup
Given a set A of n bit strings, we would like to encode the elements of A with about log |A| bits, rather than n bits. Information theoretically we can achieve this bound. We examine the question of whether this bound can still be achieved when we further demand that every element x ∈ A can be printed from its description in polynomial time, assuming the decompression function is given oracle access to the set A. It is not too hard to come up with sets A for which this is not possible for polynomial time decompression functions, even randomized ones. We show however, that when the decompression function is allowed to be nondeterministic and randomized we can achieve the information theoretic bound up to a factor of O(log^3 n). The proof of this result uses hardness vs. randomness tradeoffs based on the Nisan-Wigderson pseudorandom generator. Time permitting we will show an application of this result to prove a weak form of polynomial time symmetry of information. This reveals an interesting connection between polynomial time symmetry of information and a long-standing open problem in complexity theory.
Are many-one reductions different from Turing reductions? The talk concerns this question. Informally, with Turing reductions an instance of a problem can be solved by asking polynomially many queries about the instances of another problem. Many-one reductions require an instance of one problem to be mapped into an instance of the other in a decision preserving manner. Since a many-one reduction is also a Turing reduction, a problem that is NP-complete under many-one reductions is also NP-complete under Turing reductions. On the other hand, it appears that many-one reductions are much more restrictive than Turing reductions and a fundamental question is whether these completeness notions are indeed different. If P=NP then the answer is negative. Thus we need complexity-theoretic assumptions to answer this question. A major open problem on this topic is to show that these completeness notions differ using any reasonable average-case hardness assumption about problems in NP. In this talk we separate Turing NP-completeness from many-one NP-completeness using an (admittedly weak) average-case assumption which we call the partial biimmunity assumption. A problem L is partially biimmune if any deterministic machine that decides L correctly takes at least 2^(n^ε) time on all but 2^(log^l n) instances of length n, for some l. Under this assumption we construct a problem that is Turing NP-complete but not many-one NP-complete. This is joint work with John Hitchcock and A. Pavan, and will be presented at the 19th IEEE Conference on Computational Complexity, 2004 (CCC 04).
In this talk we present two novel generic schemes for approximation algorithms for optimization NP-hard graph problems constrained to partial k-trees. Our first scheme yields deterministic polynomial-time algorithms achieving typically an approximation factor of k/log^(1-ε) n for k = polylog(n). The second scheme yields randomized polynomial-time algorithms achieving an approximation factor of k/log n, where k is superlogarithmic. Both our approximation methods lead to the best known approximation guarantees for some basic optimization problems. In particular, we obtain the best known polynomial-time approximation guarantees for the classical maximum independent set problem in partial k-trees. Joint work with Artur Czumaj (New Jersey Institute of Technology) and Andrzej Lingas (Lund University).
In this talk we are concerned with algorithms for k-colouring and for finding the chromatic number of a graph. For that use, we will look at algorithms for enumerating all maximal independent sets of a graph. There exists a tight upper bound on the number of maximal independent sets in a graph, and algorithms for enumerating them all in time proportional to their number. Such algorithms have been used for both 3- and 4-colouring and for finding the chromatic number of a graph. To improve this, we look at maximal independent sets of size k. We give tight upper bounds for all k and algorithms for enumerating them all in time within a polynomial factor of these bounds. Using these algorithms, we can improve on the running time for 4-, 5- and 6-colouring, as well as the time for finding the chromatic number. For the latter, we also give algorithms for finding all maximal k-colourable subgraphs.
We present recent results on the hardness of approximating the longest path and the longest cycle in graphs and directed graphs. This includes an algorithm for Longest Path in graphs with performance guarantee n loglog n/log^2 n, and a hardness result for digraphs: that there is no polynomial time algorithm that always finds a path of length Ω(log^(2+ε) n) (under reasonable complexity assumptions). References: * Finding a path of superlogarithmic length. Andreas Björklund and Thore Husfeldt. SIAM J. Computing. Vol. 32 (2003), No. 6, pp. 1395-1402. * Approximating Longest Directed Path. Andreas Björklund, Thore Husfeldt, Sanjeev Khanna. ECCC TR03-032
We present a new algorithm for answering short path queries in planar graphs. For any fixed constant k and a given unweighted planar graph G=(V,E) one can build in O(|V|) time a data structure which allows one to check in O(1) time whether two given vertices are within distance k of each other in G and, if so, to return a shortest path between them. This significantly improves the previous result of D. Eppstein where, after a linear preprocessing, the queries are answered in O(log |V|) time. Our approach can be applied to compute the girth of a planar graph and a corresponding shortest cycle in O(|V|) time, provided that a constant bound on the girth is known. Our results can be easily generalized to other wide classes of graphs -- for instance graphs embeddable in a surface of bounded genus or graphs of bounded tree-width. Joint work with Lukasz Kowalik. Presented at the Proc. 35th Symp. Theory of Computing (STOC), 2003.
We present new algorithms for finding short cycles (of length at most 6) in planar graphs. Although there is an O(n) algorithm for finding any fixed subgraph H in a given n-vertex planar graph, the multiplicative constant hidden in the ``O'' notation (which depends on the size of H) is so high that it can hardly be used in practice, even when |V(H)| = 4. Our approach gives faster ``practical algorithms'' which are additionally much easier to implement. As a side-effect of our approach we show that the maximum number of k-cycles in an n-vertex planar graph is Θ(n^⌊k/2⌋). Work presented at the Proc. 28th Workshop Graph-Theoretic Concepts in Comp. Sci. (WG 2003).
In this talk, we present lower bounds for permuting and sorting in the cache-oblivious model. We prove that (1) I/O optimal cache-oblivious comparison based sorting is not possible without a tall cache assumption (2) there does not exist an I/O optimal cache-oblivious algorithm for permuting, not even in the presence of a tall cache assumption. These results demonstrate a separation between the I/O-model and the cache-oblivious model. In more detail, the result for sorting shows the existence of an inherent trade-off in the cache-oblivious model between the strength of the tall cache assumption and the overhead for the case M >> B, and shows that Funnelsort and recursive binary mergesort are both optimal algorithms in the sense that they attain this trade-off. This is joint work with Gerth Stølting Brodal, to be presented at STOC'03.
Many algorithms and data structures employing hashing have been analyzed under the uniform hashing assumption, i.e., the assumption that hash functions behave like truly random functions. Starting with the discovery of universal hash functions, many researchers have studied to what extent this theoretical ideal can be realized by hash functions that do not take up too much space and can be evaluated quickly. In this paper we present an almost ideal solution to this problem: A hash function that, on any set of n inputs, behaves like a truly random function with high probability, can be evaluated in constant time on a RAM, and can be stored in O(n) words, which is optimal. For many hashing schemes this is the first hash function that makes their uniform hashing analysis come true, with high probability, without incurring overhead in time or space.
One would like to take advantage of the fact that real-world access sequences often have patterns that intelligently designed systems can make use of, even though the exact distribution of operations is seldom known in advance. Still, there is often reason to believe that the stream of queries and updates arriving at such systems is far from random. We will present theory of what we call input-sensitive behavior of query-based data structures. The objective of input-sensitive data structures and input-sensitive analysis is to create data structures and algorithms whose runtime is expressed as a function of the patterns that occur in the input, with the goal of speeding up the performance relative to input sequences that exhibit no favorable patterns, or are completely random. We will present an overview of input-sensitive structures and techniques. We begin by reviewing the splay tree structure of Sleator and Tarjan and the pairing heaps of Fredman, Sedgewick, Sleator and Tarjan. These classic self-adjusting structures are able to run several (but not all) desirable classes of operation patterns quickly. We will then present the results of our recent work to develop new input-sensitive structures for dictionaries, heaps and geometric problems. This is joint work with Erik Demaine and Stefan Langerman.
As the memory systems of modern computers become more complex, it is increasingly important to design algorithms that are sensitive to the structure of memory. One of the essential features of modern memory systems is that they consist of a hierarchy of several levels of cache, main memory, and disk. While traditional theoretical computational models assumed a ``flat'' memory with uniform access time, the access times of different levels of memory can vary by several orders of magnitude in current machines. To amortize the large access time of memory levels far away from the processor, memory systems often transfer data between memory levels in large blocks. Thus it is becoming increasingly important to obtain high data locality in memory access patterns. While a lot of research has been done in two-level (external) memory models, relatively little work has been done in models for multilevel hierarchies, mainly because of the many parameters in such models. Recently however, a promising new line of research has aimed at developing memory-hierarchy-sensitive algorithms that avoid any memory-specific parameterization whatsoever. These "cache-oblivious" algorithms work optimally on a multilevel memory hierarchy if they work optimally on a two-level hierarchy. In this talk, we describe cache-oblivious dynamic data structures for multidimensional range searching. Joint work with Pankaj Agarwal, Andrew Danner, and Bryan Holland-Minkley to be presented at 19th ACM Symposium on Computational Geometry.
We present a new algorithm for the all-pairs shortest path problem on real-weighted graphs. Our algorithm improves on Dijkstra's classical shortest path algorithm -- the previous best -- and runs in time O(mn + n^2 loglog n), where m and n are the number of edges and vertices, respectively. The new ingredients in our algorithm are (1) a method for representing the dependencies between shortest paths emanating from nearby vertices and (2) a method for computing exact distances from approximate distances.
In this talk we study trade-offs between the update time and the query time for comparison based external memory dictionaries. We will present two lower bound trade-offs between the I/O complexity of member queries and insertions: If N > M insertions perform at most δ·N/B I/Os, then (1) there exists a query requiring N/(M·(M/B)^O(δ)) I/Os, and (2) there exists a query requiring Ω(log(N/M)/log(δ log^2 N)) I/Os when δ is O(B/log^3 N) and N is at least M^2. For both lower bounds data structures exist which give matching upper bounds for a wide range of parameters, thereby showing the lower bounds to be tight within these ranges. This is joint work with Rolf Fagerberg. Talk given at SODA'03.
The cache oblivious model of computation is a two-level memory model with the assumption that the parameters of the model are unknown to the algorithms. A consequence of this assumption is that an algorithm efficient in the cache oblivious model is automatically efficient in a multi-level memory model. Arge et al. recently presented the first optimal cache oblivious priority queue, and demonstrated the importance of this result by providing the first cache oblivious algorithms for graph problems. Their structure uses cache oblivious sorting and selection as subroutines. In this paper, we devise an alternative optimal cache oblivious priority queue based only on binary merging. We also show that our structure can be made adaptive to different usage profiles.
In this talk we consider the computational power of cylindrical circuits. We show that every polynomial size constant depth circuit with at most one layer of MOD gates and at most two layers of AND/OR-gates above the MOD gates can be simulated by polynomial size constant width cylindrical circuits. On the other hand, we show that every polynomial size constant width cylindrical circuit can be simulated in ACC^0. This is joint work with Kristoffer Arnsfelt Hansen and V. Vinay.
In a connected simple graph G the following random experiment is carried out: each node chooses one of its neighbors uniformly at random. We say a rendezvous occurs if there are adjacent nodes u and v such that u chooses v and v chooses u. Metivier et al. (2000) formulated the conjecture that the probability for a rendezvous to occur in G is at least as large as the probability of a rendezvous if the same experiment is carried out in the complete graph on the same number of nodes. In this talk we show that the conjecture is true.
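(Not part of the talk: a small Monte Carlo sketch of the experiment described above, useful for checking the conjectured comparison against the complete graph on small instances. The function name and the adjacency-dictionary input format are my own.)

```python
import random

def rendezvous_prob(adj, trials=100_000):
    """Estimate the probability that a rendezvous occurs when every node of the
    graph picks one of its neighbours uniformly at random.
    `adj` maps each node to the list of its neighbours (simple, connected graph)."""
    hits = 0
    for _ in range(trials):
        choice = {v: random.choice(adj[v]) for v in adj}
        # a rendezvous: some u picks v while v picks u
        if any(choice[choice[v]] == v for v in adj):
            hits += 1
    return hits / trials

# Example: a path on 4 nodes versus the complete graph K4
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
k4 = {v: [u for u in range(4) if u != v] for v in range(4)}
print(rendezvous_prob(path), rendezvous_prob(k4))
```
The conjecture states that the first estimate should be at least as large as the second.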
In this seminar we present the recent paper "PRIMES is in P" by Manindra Agrawal, Nitin Saxena and Neeraj Kayal (available at www.cse.iitk.ac.in/news/primality.html). Abstract: We present a deterministic polynomial-time algorithm that determines whether an input number n is prime or composite.
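(Illustration, not from the paper: the test builds on the classical fact that n > 1 is prime iff (x+a)^n ≡ x^n + a (mod n) as a polynomial identity, i.e. iff n divides every binomial coefficient C(n,k) for 0 < k < n. The sketch below checks this characterization directly, which takes exponential time; the actual algorithm achieves polynomial time by verifying the congruence modulo (x^r - 1, n) for a suitably small r and a small set of values a.)

```python
from math import comb

def is_prime_via_binomials(n: int) -> bool:
    """n > 1 is prime iff n divides C(n, k) for all 0 < k < n.
    Direct check for illustration only: exponential time, unlike AKS."""
    if n < 2:
        return False
    return all(comb(n, k) % n == 0 for k in range(1, n))

print([n for n in range(2, 40) if is_prime_via_binomials(n)])
```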
An error-correcting code is said to be locally decodable if a randomized algorithm can recover any single bit of the message by reading only a small number of symbols of a possibly corrupted encoding of the message. Katz and Trevisan (STOC 2000, pp. 80--86) showed that any such code C: {0,1}^n → Σ^m with a decoding algorithm that makes at most q probes must satisfy m = Ω((n/log |Σ|)^{q/(q-1)}). They assumed that the decoding algorithm was non-adaptive, and left open the question of proving similar bounds when the decoder is adaptive. We show lower bounds without assuming that the decoder is non-adaptive. Our argument is based on an analysis of a sampling algorithm, estimating its variance. (Joint work with Amit Deshpande, Rahul Jain, T. Kavitha and Satyanarayana V. Lokam; this work was presented at Complexity 2002)
We consider space efficient data structures for the off-line string matching problem: given a text string and a pattern string, to find an occurrence (or all occurrences) of the pattern in the text, after preprocessing the text string. We look at space efficient implementations of the well known data structure, suffix tree. We also look at some compressed suffix array representations and their use in space efficient suffix trees. Joint work with Venkatesh Raman and J. Ian Munro.
Motivated by the fact that competitive analysis yields too pessimistic results when applied to the paging problem, there has been considerable research interest in refining competitive analysis and in developing alternative models for studying online paging. The goal is to devise models in which theoretical results capture phenomena observed in practice. In the talk a new, simple model for studying paging with locality of reference is presented. The model is closely related to Denning's working set concept and directly reflects the amount of locality that request sequences exhibit. We use the page fault rate to evaluate the quality of paging algorithms, which is the performance measure used in practice. In our model LRU is optimal whereas FIFO and deterministic marking strategies are not optimal in general. This is joint work with Susanne Albers, University of Freiburg, and Oliver Giel, University of Dortmund.
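(A toy experiment, not from the talk: page fault rates of LRU and FIFO on a request sequence with artificial locality, using the fault-rate measure mentioned above. Cache size, sequence generator and function names are made up for illustration.)

```python
from collections import OrderedDict, deque
import random

def fault_rate_lru(requests, k):
    cache, faults = OrderedDict(), 0
    for p in requests:
        if p in cache:
            cache.move_to_end(p)             # p becomes most recently used
        else:
            faults += 1
            if len(cache) == k:
                cache.popitem(last=False)    # evict the least recently used page
            cache[p] = True
    return faults / len(requests)

def fault_rate_fifo(requests, k):
    cache, queue, faults = set(), deque(), 0
    for p in requests:
        if p not in cache:
            faults += 1
            if len(cache) == k:
                cache.discard(queue.popleft())   # evict the oldest page
            cache.add(p)
            queue.append(p)
    return faults / len(requests)

# A request sequence with locality: long runs over a small working set
random.seed(0)
reqs = []
for _ in range(200):
    base = random.randrange(0, 100, 10)
    reqs += [base + random.randrange(5) for _ in range(50)]
print(fault_rate_lru(reqs, 8), fault_rate_fifo(reqs, 8))
```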
It is well known that Santa Claus distributes gifts to billions of children around December 24. What may be less well known is the way in which he organizes the presents in order to be able to quickly get to the right present for a given child. (Sorting in advance is just too much work... And besides, people move around all the time.) To formalize Santa's problem, let n be the number of children, and u the number of possible children's identifiers. In the good old days when Santa was able to memorize O(log n + loglog u) bits, he used a scheme where he just had to read a single bit to find the right present. Now that his memory is worse, he has been in need of something else. Luckily, a recent scheme using expander graphs now allows him to find the right present by just looking at O(loglog u) bits, expected, without having to memorize anything. Thus, once again, Christmas appears to be saved! The talk will describe Santa's old and new gift-finding schemes.
We propose a version of cache oblivious search trees which is simpler than the previous proposal of Bender, Demaine and Farach-Colton and has the same complexity bounds. In particular, our data structure avoids the use of weight balanced B-trees, and can be implemented as just a single array of data elements, without the use of pointers. The structure also improves space utilization. For storing n elements, our proposal uses (1+ε)n times the element size of memory, and performs searches in worst case O(log_B n) memory transfers, updates in amortized O((log^2 n)/(εB)) memory transfers, and range queries in worst case O(log_B n + k/B) memory transfers, where k is the size of the output. The basic idea of our data structure is to maintain a dynamic binary tree of height log n + O(1) using existing methods, embed this tree in a static binary tree, which in turn is embedded in an array in a cache oblivious fashion, using the van Emde Boas layout of Prokop. We also investigate the practicality of cache obliviousness in the area of search trees, by providing an empirical comparison of different methods for laying out a search tree in memory. This is joint work with Gerth Stølting Brodal and Rolf Fagerberg. The talk is to be given at SODA'02.
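(Illustrative sketch, not the authors' code: the van Emde Boas layout mentioned above, computed here for a complete binary tree with heap-style node numbering where node i has children 2i and 2i+1. The way the height is split into a top and a bottom part is one of several possible choices.)

```python
def veb_order(root, height):
    """Return the nodes of the complete binary tree of the given height rooted at
    `root` (heap numbering) in van Emde Boas order: recursively lay out the top
    half of the tree, then each bottom subtree hanging below it."""
    if height == 1:
        return [root]
    top_h = height // 2
    bot_h = height - top_h
    order = veb_order(root, top_h)              # the top tree first
    top_leaves = [root]                         # leaves of the top tree, left to right
    for _ in range(top_h - 1):
        top_leaves = [c for v in top_leaves for c in (2 * v, 2 * v + 1)]
    for leaf in top_leaves:                     # each leaf has two bottom subtrees below it
        order += veb_order(2 * leaf, bot_h)
        order += veb_order(2 * leaf + 1, bot_h)
    return order

print(veb_order(1, 4))   # layout of the 15 nodes of a height-4 tree
```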
An undirected graph G is said to be d-distinguishable if there is a d-coloring of its vertex set V(G) such that no nontrivial automorphism of G preserves the coloring. In other words, the graph has a d-coloring that breaks all its symmetries, making the colored graph rigid. The distinguishing number of a graph G is the minimum d for which it is d-distinguishable. In this talk we will discuss the complexity of computing the distinguishing number of a graph. In particular, we will describe efficient algorithms for computing the distinguishing numbers of trees and planar graphs, and discuss an approach for bounded degree graphs.
In this talk we present the first provably I/O-efficient dynamic data structure for point location in a general planar subdivision. Our structure uses O(N/B) disk blocks to store a subdivision of size N, where B is the disk block size. Queries can be answered in O(log_B^2 N) I/Os in the worst case, and insertions and deletions can be performed in O(log_B^2 N) and O(log_B N) I/Os amortized, respectively. Previously, an I/O-efficient dynamic point location structure was only known for monotone subdivisions. Joint work with Jan Vahrenhold presented at SoCG 2000.
We study the fundamental problem of sorting n integers of w bits on a unit-cost RAM with word size w, and in particular consider the time-space trade-off (product of time and space in bits) for this problem. For comparison-based algorithms, the time-space complexity is known to be Θ(n^2). A result of Beame shows that the lower bound also holds for non-comparison-based algorithms, but no algorithm has met this for time below the comparison-based Ω(nlg n) lower bound. We show that if sorting within some time bound T' is possible, then time T ∈ O(T' + nlg* n) can be achieved with high probability using space S ∈ O(n^2/T + w), which is optimal. Given a deterministic priority queue using amortized time t(n) per operation and space n^{O(1)}, we provide a deterministic algorithm sorting in time T ∈ O(n(t(n) + lg* n)) with S ∈ O(n^2/T + w). Both results require that w ≤ n^{1-Ω(1)}. Using existing priority queues and sorting algorithms, this implies that we can deterministically sort time-space optimally in time Θ(T) for T ≥ n(lglg n)^2, and with high probability for T ≥ nlglg n. Our results imply that recent lower bounds for deciding element distinctness in o(nlg n) time are nearly tight. Joint work with Jakob Pagter.
Consider the problem of compactly representing a subset of size n of a universe of size m, so that membership queries can be answered efficiently. A bitvector is one solution to this problem. It uses m bits for representing the set (which is quite a lot if m is big, compared to n). On the other hand, one can answer a membership query by inspecting only a single bit of the data structure. A hash table is another solution. It uses only O(nlog m) bits for representing the set (which is optimal if n is not too close to m). On the other hand, to answer a membership query one must inspect O(log m) bits of the data structure. In this talk we present a solution combining the desirable aspects of both the above solutions: We devise a representation using only O(nlog m) bits to represent any set, so that any membership query can be answered by inspecting only a single bit of the data structure. This is joint work with Harry Buhrman, Jaikumar Radhakrishnan and Srinivasan Venkatesh, previously presented at STOC'00.
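(A small sketch, not from the paper, of the two classical solutions being compared: the bit vector probes a single bit but uses m bits of space, while a hash table uses O(n log m) bits but probes whole words. Class names are mine.)

```python
class BitVectorSet:
    """Membership in a universe {0, ..., m-1}: m bits of space, one bit probed per query."""
    def __init__(self, m, elements):
        self.bits = bytearray((m + 7) // 8)
        for x in elements:
            self.bits[x >> 3] |= 1 << (x & 7)
    def __contains__(self, x):
        return bool(self.bits[x >> 3] & (1 << (x & 7)))

class HashSet:
    """Membership via hashing: O(n log m) bits of space, but a query may probe several words."""
    def __init__(self, elements):
        self.table = set(elements)       # Python's built-in hash table
    def __contains__(self, x):
        return x in self.table

s = BitVectorSet(1_000_000, [3, 42, 999_999])
print(3 in s, 4 in s)
```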
In this paper we study the algorithmic problem of constructing rooted evolutionary trees in the so called experiment model. This model was first presented by Kannan, Lawler, and Warnow. We present new techniques for efficiently merging and updating partial evolutionary trees in this model. We show that two partial evolutionary trees for disjoint sets of species can be merged using experiments in time O(dn), where n is the total number of species in the resulting evolutionary tree and d is its maximum degree. We prove our upper time bound on merging evolutionary trees to be asymptotically optimal. We show also that after O(nlog n)-time preprocessing, a partial evolutionary tree can be maintained under a sequence of m species insertions or deletions in time O(dm log (n+m)). By applying our algorithm for merging evolutionary trees, or alternatively, our algorithm for updating evolutionary trees, we obtain an O(dnlog n)-time bound on the problem of constructing an evolutionary tree for n species and maximum degree d from experiments. The classic O(nlog n)-time bound on sorting in the comparison-based model can be seen as a very special case of this upper bound.
We present an optimal data structure for static range reporting in one dimension. For a set of n integers each of w bits, we construct a data structure that requires space O(n) words on a unit cost RAM with word size w. Given a query interval the data structure supports the reporting of all elements from the set contained within the interval in optimal time O(k), where k is the number of elements reported. An extension of the data structure supports approximate range counting queries in constant time. The data structures can be constructed in expected time O(n√w). Joint work with Stephen Alstrup and Theis Rauhe.
The dynamic maintenance of the convex hull of a set of points in the plane is one of the most important problems in computational geometry. We present a data structure supporting point insertions in amortized O(log n·logloglog n) time, point deletions in amortized O(log n·loglog n) time, and various queries about the convex hull in optimal O(log n) worst-case time. The data structure requires O(n) space. Applications of the new dynamic convex hull data structure are improved deterministic algorithms for the k-level problem and the red--blue segment intersection problem where all red and all blue segments are connected. Joint work with Gerth Stølting Brodal.
In this talk we will define external memory (or I/O) models which also capture space complexity and develop a general technique for deriving I/O-space trade-offs in these models from internal memory model time-space trade-offs. Using this technique we show strong I/O-space product lower bounds for Sorting. This talk is based on joint work with Lars Arge (Duke University) to be presented at SWAT'2000.
We consider dictionaries over the universe U = {0,1}^w on a unit-cost RAM with word size w and a standard instruction set, and present a linear space deterministic dictionary accommodating membership queries in time (log log n)^{O(1)} and updates in amortized time (log n)^{O(1)}, where n is the size of the set stored. Previous deterministic dictionaries either had query time (log n)^{Ω(1)} or update time 2^{ω(√log n)} in the worst case. To be presented at SWAT 2000.
This talk will give a history and philosophy of diagonalization as a tool to prove lower bounds in computational complexity. We will give several examples and discuss four possible approaches to using diagonalization to separate log-space from nondeterministic polynomial-time.
The Pfaffian of an oriented graph is closely linked to Perfect Matching. It is also naturally related to the determinant of an appropriately defined matrix. This relation between Pfaffian and determinant is usually exploited to give a fast algorithm for computing Pfaffians. We present the first completely combinatorial algorithm for computing the Pfaffian in polynomial time. Our algorithm works over arbitrary commutative rings. Over integers, we show that it can be computed in the complexity class GapLog; this result was not known before. Our proof techniques generalize the recent combinatorial characterization of determinant [Mahajan,Vinay, CJTCS 1997] in novel ways. As a corollary, we show that under reasonable encodings of a planar graph, Kasteleyn's algorithm for counting the number of perfect matchings in a planar graph is also in GapLog. The combinatorial characterization of Pfaffian also makes it possible to directly establish several algorithmic and complexity theoretic results on Perfect Matching which otherwise use determinants in a roundabout way. We also present hardness results for computing the Pfaffian of an integer skew-symmetric matrix. We show that this is hard for #Log and GapLog under logspace many-one reductions. (An extended abstract describing most of these results appeared in the Proceedings of the Fifth Annual International Computing and Combinatorics Conference COCOON 1999, in the Springer-Verlag Lecture Notes in Computer Science series Volume 1627, pp. 134--143. The full version is a DIMACS and ECCC Technical Report.) Joint work with P. R. Subramanya, and V. Vinay
It has been known for a long time now that the problem of counting the number of perfect matchings in a planar graph is in NC. This result is based on the notion of a pfaffian orientation of a graph. Recently, Galluccio and Loebl generalised this result to the case of graphs of bounded genus. However, it is not known if the corresponding search problem, that of finding one perfect matching in a planar graph, is in NC. This situation is intriguing as it seems to contradict our intuition that search should be easier than counting. For the case of planar bipartite graphs, Miller and Naor showed that a perfect matching can indeed be found using an NC algorithm. Meena Mahajan and Kasturi R. Varadarajan present a very different NC-algorithm for this problem. Unlike the Miller-Naor algorithm, their approach directly uses the fact that counting is in NC, and it also generalizes to the problem of finding a perfect matching in a bipartite bounded genus graph. It also rekindles the hope for an NC-algorithm to find a perfect matching in a non-bipartite planar graph. (To appear in STOC 2000)
In the L-phylogeny problem, one wishes to construct an evolutionary tree for a set of species represented by characters, in which each state of each character induces no more than L connected components. We consider the fixed-topology version of this problem for fixed-topologies of arbitrary degree. This version of the problem is known to be NP-complete when L is at least 3, even for degree-3 trees in which no state labels more than L+1 leaves (and therefore there is a trivial L + 1 phylogeny). We give a 2-approximation algorithm for all L for arbitrary input topologies. Dynamic programming techniques, which are typically used in fixed-topology problems, cannot be applied to L-phylogeny problems. Our 2-approximation algorithm is the first application of linear programming to approximation algorithms for phylogeny problems. We extend our results to a related problem in which characters are polymorphic. This was joint work with Leslie Ann Goldberg and Cynthia A. Phillips.
We prove that AM (and hence Graph Nonisomorphism) is in NP if for some ε>0, some language in NE intersect coNE requires nondeterministic circuits of size 2^{εn}. This improves recent results of Arvind and Kobler and of Klivans and Van Melkebeek who proved the same conclusion, but under stronger hardness assumptions. The previous results on derandomizing AM were based on pseudorandom generators. In contrast, our approach is based on a strengthening of Andreev, Clementi and Rolim's hitting set approach to derandomization. As a spin-off, we show that this approach is strong enough to give an easy (if the existence of explicit dispersers can be assumed known) proof of the following implication: For some ε>0, if there is a language in E which requires nondeterministic circuits of size 2^{εn}, then P=BPP. This differs from Impagliazzo and Wigderson's theorem ``only'' by replacing deterministic circuits with nondeterministic ones. Technically, our strengthening of the Andreev-Clementi-Rolim approach is in the form of the combination of this approach with the use of a certain error correcting code, based on the low degree extension. It is well known that the low degree extension plays an important role in the construction of pseudorandom generators, and so, it is not surprising that it would play a role here also. Interestingly however, the coding theoretic property of the low degree extension that is used in the construction of pseudorandom generators is local decodability. In our application, local decodability plays no role. We use a simpler property which will be explained in the talk. This is joint work with N.V. Vinodchandran, to appear at FOCS'99.
Let G be a fixed collection of digraphs. Given a digraph H, a G-packing of H is a collection of vertex disjoint subgraphs of H, each isomorphic to a member of G. A G-packing, P, is maximum if the number of vertices belonging to some member of P is maximum, over all G-packings. The analogous problem for undirected graphs has been extensively studied in the literature. We will concentrate on the cases when G is a family of paths. We will show that G-packing is NP-complete when (essentially) G is not one of the families {P_1} or {P_1, P_2}. When G = {P_1}, the G-packing problem is simply the matching problem. When G = {P_1, P_2}, the directed paths of length one and two, we will present a collection of augmenting configurations such that a packing is maximum if and only if it contains no augmenting configuration. We will also present a min-max condition which yields a concise certificate for maximality of packings. These results imply a polynomial time algorithm for finding a maximum {P_1, P_2}-packing in an arbitrary digraph.
The p-dispersion-sum problem is the problem of locating p facilities at some of n predefined locations, such that the overall distance sum is maximized. The problem has applications in telecommunication (where it is desirable to locate radio transmitters as far away from each other as possible in order to minimize interference problems), and in the location of shops/service-stations (where the mutual competition should be minimized). Simple upper bounds for the problem are presented, and it is shown how these bounds can be tightened through Lagrangian relaxation. A branch-and-bound algorithm is then described which in each iteration finds upper bounds in O(n) time. Finally some computational results with randomly generated and geometrical problems are presented. The related p-dispersion problem is the problem of locating p facilities such that the minimum distance between two facilities is as large as possible. Formulations and simple upper bounds will be presented, and it will be discussed whether a similar framework as for the p-dispersion-sum problem can be used to tighten bounds.
A tandem repeat (or square) is a string aa, where a is a non-empty string. We present an O(|S|)-time algorithm that operates on the suffix tree T(S) for a string S, finding and marking the endpoint in T(S) of every tandem repeat that occurs in S. This decorated suffix tree implicitly represents all occurrences of tandem repeats in S, and can be used to efficiently solve many questions concerning tandem repeats and tandem arrays in S. This improves and generalizes several prior efforts to efficiently capture large subsets of tandem repeats. Joint work with Dan Gusfield, UC Davis.
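(For orientation only: a brute-force enumeration of tandem repeat occurrences, cubic in |S|, in contrast to the linear-time suffix-tree marking described in the abstract. The function is a hypothetical illustration, not the authors' algorithm.)

```python
def tandem_repeat_occurrences(s: str):
    """Report every occurrence (i, l) with s[i:i+l] == s[i+l:i+2l], l >= 1.
    Brute force: O(|s|^3) time, versus the O(|S|)-time suffix-tree method."""
    n = len(s)
    return [(i, l)
            for l in range(1, n // 2 + 1)
            for i in range(n - 2 * l + 1)
            if s[i:i + l] == s[i + l:i + 2 * l]]

# includes (2, 1) for "ss" and (1, 3) for the tandem repeat "ississ"
print(tandem_repeat_occurrences("mississippi"))
```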
We define a game called ``Quantum Twenty Questions'' where Player 1 chooses a secret state s from a large set of possible states S and supplies copies of s, on request, to Player 2. It is the task of Player 2, with full knowledge of S, to design a sequence of ``quantum questions'', i.e. measurements, to determine the secret s with high probability in as few requests to Player 1 as possible. Sometimes the geometry of S, coupled with the idea of eliminating candidate states, suggests a natural measurement. If S is the set of states associated with the discrete logarithm problem or factoring, then it turns out that the natural measurement is the Fourier transform. This talk will present these ideas and show how viewing current quantum algorithms from the point of view of a search game may assist in the quest for new algorithms.
A combinatorial characteristic of a Boolean function is its sensitivity. Informally, the sensitivity of a Boolean function is the maximum, over all inputs, of the number of input variables such that changing the value of just one of them flips the value of the function. I will talk about an (almost tight) upper bound on the sensitivity of a multiple-output Boolean function in terms of the sensitivity of its coordinates and the size of the range of the function. We apply this theorem to establish a tight tradeoff between randomness and the number of rounds in private computation. A private protocol to compute a Boolean function allows a number of players, each possessing a single input bit, to compute the value of the function on their combined input, in a way that no single player learns any ``unnecessary'' information (in particular, the inputs of the other players). We give lower bounds on the number of rounds of such protocols in terms of the sensitivity of the function being computed, and the amount of randomness used in the computation.
A fundamental operation on representations of graphs is the adjacency query: given two nodes u and v, tell whether or not the edge (u,v) is present in the graph. Two standard textbook representations of graphs are adjacency matrices and adjacency lists. The first answers adjacency queries in O(1) time, but uses n^2 bits of space, which is superlinear for sparse graphs. The latter uses linear space, but adjacency queries now involve searching adjacency lists, which may have length n. The challenge therefore is to find representations which are both fast and compact - i.e. support adjacency queries in O(1) time and use linear space. Quite a number of such representations have been developed, but so far only for static graphs. In this talk, we study the dynamic version of the problem and give a representation which also allows insertions and deletions of edges in O(1) and O(log n) amortized time per update, respectively. The representation is a slight variation of the adjacency list representation, equipped with a very simple, almost canonical update algorithm. Our proof of complexity proceeds by first showing that this simple algorithm inherits the time complexity of any algorithm for maintaining such adjacency lists, and then giving a non-constructive proof of existence of an algorithm with the stated complexity. Our representation works for graphs of bounded arboricity, which is a large class of sparse graphs containing e.g. the planar graphs and graphs of bounded treewidth. This is joint work with Gerth Stølting Brodal.
Several recent algorithms for solving the all pairs shortest paths (APSP) problem in graphs will be discussed. All these algorithms use fast algorithms for algebraic matrix multiplication. Among the algorithms is an O(n^2.575) time algorithm for solving the APSP problem in unweighted directed graphs, and an O(n^ω) time algorithm for solving the problem in unweighted undirected graphs, where ω < 2.376 is the exponent of algebraic matrix multiplication.
Graphs have been used for electric network theory since Kirchhoff, and for statics since Cremona and Maxwell (middle of the 19th century). However, quite a few problems of electrical engineering and in statics seem to be intractable using graphs only. Such problems include
The problem of imperfect apparatus is an important open question in quantum cryptography. The main difficulty is that it is not enough to have an apparatus which is in principle secure in accordance with known experiments. For example, it may be well accepted that in the laboratory a given kind of source should emit more than 1 photon less than one percent of the time (ignoring pulses with no photon detected). For cryptography, we also need to be able to prove this kind of assertion while making as few assumptions as possible. We propose an approach based on violations of Bell inequalities which allows us to test an apparatus with very few assumptions, even on the testing apparatus itself. The connection with the Self-Testing gates of Wim van Dam, Michele Mosca, Frederic Magniez and Miklos Santha will be briefly discussed.
We isolate and generalize a technique implicit in many quantum algorithms, including Shor's algorithms for factoring and discrete log. In particular, we show that the distribution sampled after a Fourier transform over Zp can be efficiently approximated by transforming over Zq for any q in a large range. Our result places no restrictions on the superposition to be transformed, generalizing previous applications. In addition, our proof easily generalizes to multi-dimensional transforms for any constant number of dimensions. Joint work with Lisa Hales.
We study the k-round two-party communication complexity of the pointer chasing problem for fixed k. Damm, Jukna and Sgall showed an upper bound of O(nlg^{(k-1)} n) for this problem. We prove a matching lower bound for this problem improving the lower bound of Ω(n) by Nisan and Wigderson. This yields a corresponding improvement in the hierarchy results for bounded-depth monotone circuits. We also consider the bit version of the problem and show upper and lower bounds. This implies that there is an abrupt jump in complexity from linear to superlinear when the number of rounds is reduced below k/2.
The competitive ratio as a measure for the quality of on-line algorithms, has been criticized for giving bounds that are unrealistically pessimistic and for not being able to distinguish between algorithms with very different behavior in practical applications. A new measure, the accommodating function, for the quality of on-line algorithms is presented. The accommodating function is a generalization of both the competitive ratio and the accommodating ratio. As an example, we investigate the measure for a variant of bin-packing in which the goal is to maximize the number of objects put in n bins. In this talk, focus will be on a natural algorithm, First-Fit, which is almost worst possible in the competitive sense, but when analyzed in this setting, it turns out to be strictly better than some other algorithms. Joint work with Kim S. Larsen and Joan Boyar.
Recently Miklos Ajtai has shown that any algorithm (in a reasonable model of computation) that solves the Element Distinctness problem in linear time, also uses linear space. The result can be found in Miklos Ajtai: ``Determinism versus Non-Determinism for Linear Time RAMs with Memory Restrictions'', ECCC TR98-077. The paper shows a similar lower bound for the following problem. Given input x = x_1,...,x_n, where x_i is an m-bit string, do there exist i ≠ j such that the Hamming Distance between x_i and x_j is less than m/4? For this problem the result is that any algorithm using time O(n) must use space Ω(nlg n). In this talk I will prove the lower bound for the Hamming Distance problem, which contains all the main ingredients used in the Element Distinctness lower bound, but is less technical.
Poissonization is a general technique for simplifying average case analyses of algorithms. It has been used for hashing, digital tree, leader election, and probabilistic counting algorithms. It is applied in situations where the algorithm performs some action (for example, a hash table insertion) at discrete intervals. The poissonization analysis assumes instead that the actions are Poisson distributed. Special properties of the Poisson distribution then permit a simpler analysis. For example, in hashing algorithms, insertions at distinct locations become independent. From the analysis of the poissonized algorithm one obtains a solution to the original problem by a variety of analytic techniques. This is called depoissonization. We will give a general survey of poissonization and depoissonization and a number of examples where it is applied.
The j-State General Markov Model of evolution (due to Steel) is a stochastic model concerned with the evolution of strings over an alphabet of size j. In particular, the Two-State General Markov Model of evolution generalises a well-known model of evolution called the Cavender-Farris-Neyman model. I'll start my talk by describing these models. Previously Farach and Kannan showed how to PAC-learn Markov Evolutionary Trees in the Cavender-Farris-Neyman model provided that the target tree satisfies the additional restriction that all pairs of leaves have a sufficiently high probability of being the same. I will talk about how to remove this restriction (and the restriction to the Cavender-Farris-Neyman model) and thereby obtain the first polynomial-time PAC-learning algorithm (in the sense of Kearns et al.) for the general class of Two-State Markov Evolutionary Trees. This was joint research with Leslie Ann Goldberg and Paul Goldberg.
Finding space efficient simulations of space bounded randomized algorithms is addressed. The best known simulation to date is based on the ingenious construction of a pseudorandom generator by Noam Nisan. In this talk, we will concentrate upon Nisan's pseudorandom generator and prove its usefulness towards derandomizing space bounded algorithms. Also, some approaches towards getting improved simulations will be discussed.
Binary search of a sorted list is the standard method for determining the membership of an input value in a set of numbers or for finding its predecessor: the largest element in the set smaller than the input value. The information theory lower bound says that binary search is the fastest comparison based algorithm for these problems. Binary search is also optimal on the standard RAM model. I will present a survey of various deterministic searching algorithms, some recent and some less recent, that are better than binary search. Corresponding lower bounds will also be mentioned.
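(A baseline sketch of the predecessor search that the survey starts from; the talk's point is that integer-searching techniques, for example van Emde Boas trees and fusion trees, beat this comparison-based bound on a word RAM.)

```python
from bisect import bisect_left

def predecessor(sorted_values, x):
    """Largest element of sorted_values strictly smaller than x, or None.
    Classic comparison-based binary search: O(log n) probes, which is
    optimal among comparison-based algorithms."""
    i = bisect_left(sorted_values, x)
    return sorted_values[i - 1] if i > 0 else None

print(predecessor([2, 3, 5, 7, 11], 6))   # -> 5
```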
Let V(D) be the vertex set of a digraph D and let A(D) be the arc set of D. A complete digraph of order n is a digraph where the arcs xy and yx both lie in A(D), for all x,y ∈ V(D) (x ≠ y) and |V(D)|=n. A Hamilton cycle in a digraph D is a directed cycle containing all vertices exactly once. One of the best-known problems in combinatorial optimization and algorithmics is the Travelling Salesman Problem (TSP). The most general form of the TSP is the directed version, where we want to find a minimum cost Hamiltonian cycle in a complete digraph, where costs are assigned to the arcs. It is well-known that the TSP is NP-complete, which implies that we are unlikely to find a polynomial algorithm which can solve this problem exactly. Let D_n be a complete digraph of order n, with costs assigned to its arcs, and let H be any Hamilton cycle in D_n. The domination number of H is defined to be the number of distinct Hamilton cycles in D_n which have cost greater than or equal to the cost of H. We state several conjectures and results culminating in the fact that we in polynomial time can find a Hamilton cycle of any D_n, such that the domination number of our solution is at least (n-1)!/2. As there are only (n-1)! distinct Hamilton cycles in D_n, clearly (n-1)!/2 is half of all Hamilton cycles. Our result proves a conjecture by Glover and Punnen (1997). The above algorithm is not practical, but we will also state a practical algorithm, with domination number (n-2)!, which was the algorithm with the highest domination number up to a couple of weeks ago! The TSP is a special case of the more difficult Quadratic Assignment Problem (QAP). Our polynomial method also works for the QAP, but we can only prove the (n-2)! bound when n is a prime. For general n we can prove the bound Ω(n!/k^n) for any k>1. This is however the first method for the QAP which gives exponential domination numbers (for polynomial algorithms). Contrary to the TSP, no algorithms for the QAP with high domination numbers existed prior to the above algorithm.
The graph isomorphism problem asks whether two given graphs are isomorphic, i.e., whether there exists a one-to-one and onto mapping between their vertices that preserves adjacency. Graph isomorphism has both practical and theoretical relevance. Certain questions about chemical compounds translate into the isomorphism problem of two graphs. From a complexity theoretic point of view graph isomorphism forms one of the few candidate NP-problems believed to be neither in P nor NP-complete. One of the reasons why complexity theorists believe graph isomorphism not to be NP-complete is the structure of its complement: The graph nonisomorphism problem seems easier than the complement of any known NP-complete language. A crucial open question in this context is whether, for any pair of nonisomorphic graphs, one can give a short proof that there exists no isomorphism between them. The shortest known proofs for graph nonisomorphism are of exponential size. We will provide the first strong evidence that one can do better: Under a widely accepted complexity theoretic conjecture (namely that the polynomial-time hierarchy does not collapse) we will exhibit subexponential size proofs. Under a stronger assumption we will be able to reduce the proof size to polynomial. We obtain our results by derandomizing a randomized proof system for graph nonisomorphism known as an Arthur-Merlin game. Our derandomization technique works for any Arthur-Merlin game, as well as for several other randomized processes. In particular, it applies to constructing universal traversal sequences, nonadaptively finding witnesses for NP-problems, learning Boolean circuits, and building rigid matrices. This is joint work with Adam Klivans (MIT).
The constrained maximum flow problem is to send the maximum possible flow from a source node to a sink node in a directed network subject to the constraint that the total cost of the flow does not exceed a given budget. We present the first strongly polynomial time algorithm for the problem. Our algorithm is based on Megiddo's elegant parametric search technique and uses O(mlog n(loglog nlog n + log m)) minimum cost flow computations where n and m denote the number of nodes and arcs, respectively, in the network. Joint work with Sven O. Krumke, Konrad-Zuse-Zentrum für Informationstechnik Berlin, Department Optimization.
The behaviour of three methods for constructing a binary heap is studied. The methods considered are the original one proposed by Williams [1964] based on repeated bottom-up heapifications, the improvement by Floyd [1964] based on repeated top-down heapifications, and a level-wise construction method proposed, e.g., by Fadel et al. [1998]. Both the worst-case number of instructions and that of cache misses are analysed. It is well-known that Floyd's method has the best instruction count. However, our analysis shows that, under reasonable assumptions, the repeated bottom-up heapification method and the level-wise construction method both have the optimal O(n/B) cache behaviour, whereas the repeated top-down construction method, as programmed by Floyd, is not optimal but causes Ω((n log_2 B)/B) cache misses, where n denotes the size of the heap to be constructed and B the length of the cache lines. On the other hand, a re-engineered version of Floyd's program also reaches the optimal O(n/B) cache miss bound, making it the fastest of the programs considered in practice. Joint work with Jesper Bojesen and Maz Spork.
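(Illustration, not the code analysed in the talk: the two classical construction methods in compact form, for a min-heap stored in a Python list. Williams' method inserts elements one at a time and sifts each new element up; Floyd's method sifts the roots of subtrees down, processed in reverse level order.)

```python
def williams_build(a):
    """Williams [1964]: repeated insertion, sifting each new element up."""
    heap = []
    for x in a:
        heap.append(x)
        i = len(heap) - 1
        while i > 0 and heap[(i - 1) // 2] > heap[i]:
            heap[i], heap[(i - 1) // 2] = heap[(i - 1) // 2], heap[i]
            i = (i - 1) // 2
    return heap

def floyd_build(a):
    """Floyd [1964]: heapify subtrees by sifting their roots down, bottom level first."""
    heap = list(a)
    n = len(heap)
    def siftdown(i):
        while 2 * i + 1 < n:
            c = 2 * i + 1                       # smaller child
            if c + 1 < n and heap[c + 1] < heap[c]:
                c += 1
            if heap[i] <= heap[c]:
                break
            heap[i], heap[c] = heap[c], heap[i]
            i = c
    for i in range(n // 2 - 1, -1, -1):
        siftdown(i)
    return heap

print(williams_build([5, 3, 8, 1, 9, 2]), floyd_build([5, 3, 8, 1, 9, 2]))
```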
A classical and surprising result of Barrington states that the computational power of polynomial size, constant width branching programs is exactly NC^1. The same result holds for constant width contact schemes and constant width circuits. In this talk, we shall consider polynomial size, constant width planar branching programs, contact schemes and (AND,OR)-circuits and show that their computational power is exactly AC^0, i.e., their power is identical to polynomial size, constant depth, unbounded fan-in circuits. In contrast to Barrington's classical result, the proof for circuits is very different from the proofs for branching programs and contact schemes. As a somewhat surprising application of these pure complexity results, we shall derive a dynamic connectivity algorithm for constant width grid graphs with a time complexity of O(loglog n) per operation. This talk is based on joint work with Dave Barrington, Chi-Jen Lu, and Sven Skyum, appearing at STACS'98 and Computational Complexity '99.
In the talk I will present a very recent result: Any sequence ψ_n of tautologies which expresses the validity of a fixed combinatorial principle either is ``easy'', i.e. has polynomial size tree-like resolution refutations, or is ``difficult'', i.e. requires exponential size tree-like resolution refutations. I will also show that the same complexity gap applies to the Davis-Putnam procedure. The first part of the talk will give some background in propositional logic and in the resolution method. The talk will be kept on an elementary level.
In a recent breakthrough paper (Near-Optimal Extractors Using Pseudo-Random Generators, ECCC TR98-055), Trevisan points out, that rather surprisingly(!), the well-known Impagliazzo-Wigderson pseudorandom generator yields an extractor with better parameters than what was previously known for any explicit construction! Thus, the well known construction actually solves a well known open problem! However, quoting Trevisan: ``the previously stated property of the Impagliazzo-Wigderson generator was never observed, let alone proved, before, and even though such a property is ``implicitly proved'' in [IW97], an explicit proof (whether done by this author, or left to the reader) would be long and complicated''. He then proceeds to give an alternative construction, solving the open problem. In this talk we point out that the transformation of the proof in [IW97] needed is, in fact, a very well known transformation of proofs in complexity theory, very often left to the reader: One merely has to check that their proof relativizes. As is well known, in complexity theory, it is much more noteworthy when a proof does not relativize than when it does, and indeed, by inspection, we see that their proof does relativize. In fact, this was explicitly noted in recent work by Dieter van Melkebeek. The talk will proceed as follows: We sketch the construction of the IW-generator, discuss relativized computation, and why the IW-generator relativizes. Then we define and discuss extractors and state the problem about them that used to be open. This will take at least an hour. Then we spend five minutes showing why the IW-construction solves the problem. Then we spend quite a bit of time wondering why this extremely simple proof was not noted before.
The binary search tree is a fundamental data structure in computer science, due to its ability to maintain a set in sorted order during insertions and deletions. Almost any operation on a binary search tree takes time proportional to its height. The trivial lower bound on the height is ⌈log (n+1)⌉, and as is well known, numerous rebalancing schemes exist for keeping the height within a multiplicative constant of this bound, while using O(log n) time per update. A natural question to ask is: exactly how small a height can we maintain? Presumably, this depends on how much rebalancing we are willing to do for each update - for instance, with linear time per update, the trivial lower bound is clearly attainable by rebuilding the tree after each update. The question thus becomes: what is the best possible height maintainable with a given amount of rebalancing per update? Or, phrased another way, what is the intrinsic rebalancing complexity of binary search trees, as a function of the height maintained? In this talk, we answer the question with respect to amortized complexity by proving new upper and lower bounds on the height of binary search trees. Specifically, we show that the height ⌈log (n+1) + 1/f(n)⌉ is not maintainable with o(f(n)) amortized rebalancing work per update, and we give a new rebalancing scheme achieving that height with O(f(n)) amortized rebalancing work per update. Here, f is any function (monotonically increasing and not too pathological) between Θ(1) and Θ(n). We briefly mention some corresponding (but not completely matching) results for worst case complexity, and also consider the semi-dynamic case (i.e. insertions only).
In this talk I introduce a dynamic problem called the Marked Ancestor Problem. Consider a rooted tree whose nodes can be in two states: marked or unmarked. The marked ancestor problem is to maintain a data structure with the following operations: mark(v) marks node v; unmark(v) removes mark from node v; exists(v) returns true iff there is a marked node on the path from v to the root. It turns out that this problem underlies several important dynamic problems and data structures such as priority search trees, static tree union-find, and problems from computational geometry. The talk will focus on a proof of an Ω(log n/loglog n) lower bound for the worst-case time per operation on the unit cost RAM, and provide examples on how this result implies similar bounds of the above dynamic problems. Joint work with Thore Husfeldt and Stephen Alstrup.
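(A naive baseline for contrast, not from the talk: mark/unmark in O(1) and exists() by walking towards the root, i.e. O(depth) per query. The lower bound above says that no scheme can get all three operations down to o(log n/loglog n) on the unit-cost RAM. Class and field names are mine.)

```python
class NaiveMarkedAncestor:
    """Marked ancestor problem, naive solution: constant-time mark/unmark,
    exists() walks from the node to the root."""
    def __init__(self, parent):
        self.parent = parent          # parent[v] = parent of v, parent[root] = None
        self.marked = set()
    def mark(self, v):
        self.marked.add(v)
    def unmark(self, v):
        self.marked.discard(v)
    def exists(self, v):
        while v is not None:
            if v in self.marked:
                return True
            v = self.parent[v]
        return False

t = NaiveMarkedAncestor({0: None, 1: 0, 2: 0, 3: 1})
t.mark(1)
print(t.exists(3), t.exists(2))   # True False
```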
Consider a scenario with two mutually distrusting parties, who want to reach some common goal, such as flip a common coin, or jointly compute some function on inputs that must be kept as private as possible. A classical example is Yao's ``millionaire's problem'': two millionaires want to find out who is richer without revealing to each other how many millions they each own. It is well-known that if only error free, digital communication is available, then no non-trivial task of this type can be implemented without unproven computational assumptions. But if a noisy channel is available, where each bit sent is flipped with a certain probability, it has been known for a long time that any two-party computation can be done provided, however, that the error probability is known exactly. Unfortunately this is not a very realistic assumption: in practice one can usually only assume that the error level is in some interval [a..b], where 0 ≤ a, b < 1/2. We show that if b ≥ 2a(1-a), then no non-trivial two-party computation is possible. On the other hand, as soon as b < 2a(1-a), so-called bit commitments can be implemented with unconditional security, implying that zero-knowledge proofs for all IP problems can be given in this model. If b < F(a), where F is a complicated function, a < F(a) < 2a(1-a), then a stronger primitive called Oblivious Transfer can be built, implying that any joint two-party computation can be done with unconditional security. Finally, if quantum communication AND a noisy channel is available, Oblivious Transfer is possible already if b < 2a(1-a). Joint work with Joe Kilian and Louis Salvail.
We study the fundamental problem of sorting in a sequential model of computation and in particular consider the time-space trade-off (product of time and space) for this problem. Beame has shown a lower bound of Ω(n^2) for this product leaving a gap of a logarithmic factor up to the previously best known upper bound of O(n^2 log n) due to Frederickson. Since then, no progress has been made towards tightening this gap. Our main contribution is a comparison based sorting algorithm which closes the gap by meeting the lower bound of Beame. The time-space product O(n^2) upper bound holds for the full range of space bounds between log n and n/log n. Hence in this range our algorithm is optimal for comparison based models as well as for the very powerful general models considered by Beame. The algorithm to be presented improves on the algorithm presented at the ALCOM-seminar in February. However, the techniques used are somewhat different. Joint work with Theis Rauhe.
In the well-solved edge-connectivity augmentation problem we must find a minimum cardinality set F of edges to add to a given undirected graph to make it k-edge-connected. We solve the generalization where every edge of F must go between two different sets of a given partition of the vertex set. Our solution to the general partition-constrained problem gives a min-max formula for |F| and a strongly polynomial algorithm that solves our problem in time O(n(m+nlog n)log n). Here n and m denote the number of vertices and distinct edges of the given graph respectively. This bound is identical to the best-known time bound for the problem without partition constraints due to Nagamochi and Ibaraki. Our algorithm is based on the splitting off technique of Lovász, like several known efficient algorithms for the unconstrained problem. However unlike previous splitting algorithms, when k is odd our algorithm must handle ``obstacles'' that prevent all edges from being split off. A special case of this partition-constrained problem, previously unsolved, is increasing the edge-connectivity of a bipartite graph to k while preserving bipartiteness. Based on this special case we present an application of our results in statics. In this application a square grid framework is given and the goal is to find a smallest set of diagonal rods whose addition results in a framework which remains rigid even if (at most) k-1 diagonal rods fail. This talk is based on the following two papers:
The problem of storing a subset S of a finite universe U, is fundamental within data structures. A bit vector is an extremely simple solution to the problem, and offers constant lookup time, but in many cases the space usage of |U| bits is far too much. During the 70s researchers worked on space-efficient representations with O(1) lookup time, and achieved O(|S|) words* for some ratios |U|/|S|. The breakthrough came in the early 80s, when Fredman, Komlos and Szemeredi achieved O(|S|) words for any |U| and |S|, using simple hashing techniques. More recently, it has been studied how close one can get to the information theoretical lower bound (where the coding has just enough bits for each subset of a given size to be representable). It turns out that it is possible to get surprisingly close to this bound, while maintaining O(1) lookup time. In this talk a review of the history of the problem will be given, and the new results will be discussed. * Elements of U are assumed to fit into one machine word
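(A sketch of the two-level hashing idea of Fredman, Komlos and Szemeredi referred to above, assuming nonnegative integer keys; the simple multiplicative hash family used here stands in for a universal family, and all names are my own. A bucket with b keys gets a collision-free secondary table of size b^2, so the expected total space is O(|S|) words and a lookup probes O(1) cells.)

```python
import random

def build_fks(S):
    """Static two-level (FKS-style) dictionary sketch for integer keys < 2^61 - 1."""
    p = (1 << 61) - 1                    # a prime assumed larger than every key
    S = list(S)
    n = max(1, len(S))

    def make_hash(size):
        a = random.randrange(1, p)
        return lambda x, a=a, size=size: (a * x % p) % size

    while True:                          # first level: retry until buckets are small overall
        h = make_hash(n)
        buckets = [[] for _ in range(n)]
        for x in S:
            buckets[h(x)].append(x)
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:
            break

    tables = []
    for bucket in buckets:               # second level: collision-free table of size b^2
        if not bucket:
            tables.append((None, []))
            continue
        size = len(bucket) ** 2
        while True:
            g = make_hash(size)
            slots = [None] * size
            ok = True
            for x in bucket:
                if slots[g(x)] is not None:
                    ok = False
                    break
                slots[g(x)] = x
            if ok:
                tables.append((g, slots))
                break
    return h, tables

def lookup(struct, x):
    h, tables = struct
    g, slots = tables[h(x)]
    return g is not None and slots[g(x)] == x

d = build_fks({3, 14, 159, 2653})
print(lookup(d, 14), lookup(d, 15))      # True False
```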
In this talk I will talk about two functional data structures, Queues and Catenable Lists. The two data structures illustrate how one can benefit from a lazily evaluated functional language (e.g. Haskell) when developing efficient (functional) data structures. Both data structures are much simpler than the known data structures for strict functional languages which obtain comparable complexities. The functional queues support the operations Inject and Pop in amortized constant time, and the functional catenable lists support the operations Head, Tail and Catenate in amortized constant time. The talk is based on material contained in Chris Okasaki's book ``Purely Functional Data Structures'' (Cambridge University Press, 1998).
We present a simple extensible theoretical framework for devising polynomial time approximation schemes for problems represented using natural syntactic (algebraic) specifications endowed with natural graph theoretic restrictions on input instances. Direct application of our technique yields polynomial time approximation schemes for all the problems studied in [LT80,NC88,KM96,Ba83,DTS93,HM+94a,HM+94] as well as the first known approximation schemes for a number of additional combinatorial problems. One notable aspect of our work is that it provides insights into the structure of the syntactic specifications and the corresponding algorithms considered in [KM96,HM+94]. This understanding allows us to extend the class of syntactic specifications for which generic approximation schemes can be developed. The results can be shown to be tight in many cases, i.e. natural extensions of the specifications can be shown to yield non-approximable problems. As specific examples of applicability of our techniques we get that
Recently, Impagliazzo and Wigderson proved the following ``Gap'' theorem: Unless BPP is all of EXP, it can be simulated in deterministic subexponential time ``on the average''. This is the first general derandomization result using a uniform, non-cryptographic assumption. We present their proof which only uses 1990-1991 technology.
Andrzej Czygrinow will give a talk about the Regularity Lemma, a famous deep result by Szemeredi, and its algorithmic applications. The talk is divided into two parts. The first will be a tutorial-like introduction to the Lemma and its significance, and will end up with an overview of results from Andrzej's thesis. This part will be accessible to the widest audience. The second part, after the break, will be a more technical discussion of the thesis presenting algorithmic applications and extensions of the Lemma.
We present two techniques for proving lower bounds on dynamic algebraic problems. The first technique is essentially a counting argument that suffices to give a tight bound on matrix vector multiplication (and an almost tight bound on convolution) in a wide range of models. The second technique is both more involved and has a shorter range of application. It uses a recent lower bound result on the size of depth-2 super-concentrators by Radhakrishnan and Ta-Shma, and it leads to nontight yet nontrivial bounds for the Discrete Fourier Transform. (Joint work with Peter Bro Miltersen and Johan P. Hansen)
Dynamic algebraic problems have been considered in a rather large body of literature by Fredman and collaborators and in a more general setting by Reif and Tate. We give an overview of the area. A substantial part of the discussion will concern the models of computation used for showing upper and lower bounds, as this turns out to be a somewhat subtle and potentially confusing point.
We show that maximal matchings can be computed deterministically in polylogarithmically many rounds in the synchronous, message-passing model of computation. This is one of the very few cases known of a non-trivial graph structure (and the only ``classical'' one) which can be computed distributively in polylogarithmic time without recourse to randomization.
We consider Random Access Machines with additional instructions such as bitwise boolean operations and integer multiplication and division. We study reducibility among different instruction sets. As it turns out, all interesting models of register computation are captured by RAMs with purely arithmetical instruction sets. In particular, we study the complexity class PTIME(+,*,<) which seems to be intimately related to numerical computation and of which absolutely nothing is known. An example of a problem in this class is MANDELBROT: Given a complex number x + iy, does (x,y) belong to the Mandelbrot set?
Watson-Crick complementarity is one of the very central components of DNA computing, the other central component being the massive parallelism of DNA strands. While the latter component drastically reduces time complexity, the former component is the essential cause behind the (Turing) universality of many models of DNA computing presented so far. The lecture makes this cause explicit, after giving a brief introduction to DNA computing. Also some case studies, discussions about universality in terms of specific models, are presented. Time permitting, another aspect of complementarity, the operational one, will be discussed in terms of Lindenmayer systems. Most of the lecture assumes no previous knowledge about the subject matter.
End-to-end communication is the problem of sending a sequence of data items from a sender to a receiver, even if the network through which they communicate is unreliable. New and existing algorithms will be surveyed and some lower bounds will be presented.
In recent work, Cramer, Damgaard and Maurer have shown that monotone span programs can be used to implement fault tolerant multiparty computations. The possible faulty subsets of the n players can be described in a natural way by a monotone Boolean function on n input bits. The protocols by CDM are efficient in scenarios where the function describing the possible faulty subsets has a poly-size monotone span program. The performance of earlier protocols was connected in a similar way to the monotone formula complexity of these functions. We show that the well known separation between monotone span programs and monotone formulae carries over to the types of functions occurring here, which implies that monotone span programs give rise to strictly more efficient verifiable secret sharing protocols than what was known earlier. If the separation we show could be proved to hold also when the span program is required to have a special ``multiplication property'', this would imply that span programs can make general multiparty computations more efficient. This is open at the time of writing.
In this talk I will consider time-space tradeoffs for sorting and some related problems. I will present an Ω(n^2) lower bound for sorting, by way of a general lower bound technique for time-space tradeoffs (this is a result of Beame). After this I will present an algorithm improving the best known upper bound from O(n^2 lg n) to O((nlg* n)^2) (this part is joint work with Theis Rauhe).
After brief review of the current state of affairs in the quest for the fast winning strategies for Ramsey games (and ultimately better lower bounds on Ramsey numbers), a new winning strategy for the Ramsey Graph Game will be presented. I'll also state several open problems in the area.
The problem of text indexing is a classic area in data structures. Given a text T, preprocess it such that occurrences of different patterns can be reported efficiently. This talk is going to address the problem in a dynamic setting: A family F of texts and/or patterns S_1,...,S_k is modified dynamically by various update operations such as insertions/deletions of characters, or concatenation/splitting of strings in the family. In the talk I will present a data structure which supports such updates in polylogarithmic time together with a fast search operation; for any string S_i in the family F, all occurrences of S_i in F can be reported in polylogarithmic time per occurrence. The data structure is an extension of a data structure by Mehlhorn, Sundar and Uhrig which supports the above update operations, but only supports a query which tests for equality of a pair of strings in the family. In the first part of the talk I will sketch the data structure by Mehlhorn, Sundar and Uhrig, and in the second part I describe the extension enabling the fast search.
In this talk we introduce a general model for the evolutionary distance between two coding DNA sequences in which a nucleotide event is penalized by the change it induces on the DNA as well as the change it induces on the encoded protein. We will furthermore describe a quadratic time algorithm for computing the optimal alignment in a biologically reasonable restriction of the general model.
The Precoloring Extension problem is a decision problem whose input is an undirected graph G, an unchangeable coloring (``precoloring'') on a subset W of the vertex set, and a natural number k called the color bound. The question is whether the coloring of W can be extended to a proper k-coloring of the entire graph G, i.e., to an assignment with at most k colors such that no two adjacent vertices receive the same color. In a more general setting, for list colorings one has to select colors from sets of admissible colors specified for each vertex. These problems are practically motivated and also lead to many interesting theoretical questions. In the talk we mostly deal with problems and results related to algorithmic complexity.
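For concreteness, here is a minimal brute-force sketch of the decision problem (exponential-time backtracking over the uncolored vertices); the function name and input encoding are illustrative and not taken from the talk.

def extends_to_k_coloring(n, edges, precoloring, k):
    # Brute-force backtracking check for Precoloring Extension:
    # can the fixed colors on W be extended to a proper k-coloring of G?
    adj = [set() for _ in range(n)]
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    color = dict(precoloring)            # vertex -> fixed color in 0..k-1
    # the precoloring itself must already be proper and respect the color bound
    for v, c in color.items():
        if c >= k or any(color.get(u) == c for u in adj[v]):
            return False
    free = [v for v in range(n) if v not in color]

    def extend(i):
        if i == len(free):
            return True
        v = free[i]
        for c in range(k):
            if all(color.get(u) != c for u in adj[v]):
                color[v] = c
                if extend(i + 1):
                    return True
                del color[v]
        return False

    return extend(0)

# Path 0-1-2 with both endpoints precolored 0: two colors suffice.
print(extends_to_k_coloring(3, [(0, 1), (1, 2)], {0: 0, 2: 0}, 2))  # True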
Starting from a short summary of the basic definitions and facts, the goal of the talk is to present solutions of some graph optimization problems which deal with certain connectivity properties of a graph (or digraph) and to illustrate the application of some useful techniques of this area. The plan is to discuss basic splitting-off theorems (due to Lovász, Mader and others) and their extensions, constructions of 2k-edge-connected graphs and k-edge-connected digraphs, Edmonds' branching theorem, optimal connectivity augmentation of graphs, etc.
In Arora's original approximation scheme for Euclidean TSP, a (1 + 1/c)-approximation could be determined in polynomial time (n^{O(c)}) in 2D and quasipolynomial time (n^{O(c^{d-1} log^{d-2} n)}) for dimension d. This has now been improved to nearly linear time (n·polylog(n)) for any fixed dimension (where ``nearly'' hides a term of log^{40c} n just for the 2D case). The algorithm is randomised, giving a (1 + 1/c)-approximation with probability at least 1/2, but it can trivially be derandomised, increasing the running time by a factor of O(n^d). As I presented the original approximation scheme at an ALCOM seminar last year, I will focus on the new elements, especially the structure theorem. Lemmata and theorems surviving the improvement will merely be stated. (Two papers describing the new results can be obtained from Arora's homepage.)
We present zero-knowledge proofs and arguments for arithmetic circuits over finite prime fields, namely, given a circuit, showing in zero-knowledge that inputs can be selected leading to a given output. For a field GF(q), where q is an n-bit prime, a circuit of size O(n), and error probability 2^{-n}, our protocols require communication of O(n^2) bits. This is the same worst-case complexity as the trivial (non zero-knowledge) interactive proof where the prover just reveals the input values. If the circuit involves n multiplications, the best previously known methods would in general require communication of Ω(n^3 log n) bits. Variations of the technique behind these protocols lead to other interesting applications. For the Boolean Circuit Satisfiability problem we give zero-knowledge proofs and arguments for a circuit of size n and error probability 2^{-n} in which there is an interactive preprocessing phase requiring communication of O(n^2) bits. In this phase, the statement to be proved later need not be known. Later the prover can non-interactively prove any circuit he wants, i.e. by sending only one message of size O(n) bits. We also briefly show how related techniques plus secret sharing based on monotone span programs can be used to do efficient multiparty computations, both in the secure channel and the broadcast model. The paper by Ronald Cramer and Ivan Damgård is available in paper copies.
Not only are quantum computers seemingly strictly more powerful than classical computers, but their algorithms and analysis are also more beautiful. Classical computation is usually done on binary strings. In contrast, quantum mechanics allows quantum computation to use any vector in the space formed by all finite C-linear combinations of such strings. As our main example in this talk, we use a communication complexity problem where k parties want to compute some group-theoretical function. We give a surprisingly simple quantum protocol that improves on any classical protocol by at least a log(k) factor. We also briefly mention other problems considered for quantum computation. The richness and beauty of quantum computation are elaborated by using group algebras. All necessary background is given in the talk. (Paper by Wim van Dam, Peter Høyer, Alain Tapp available as quant-ph/9710054 from http://xxx.soton.ac.uk/archive/quant-ph)
We consider dictionaries of size n over the finite universe U = {0,1}^w and introduce a new technique for their implementation: error correcting codes. The use of such codes makes it possible to replace the use of strong forms of hashing, such as universal hashing, with much weaker forms, such as clustering. We use our approach to construct, for any ε>0, a deterministic solution to the dynamic dictionary problem using linear space, with worst case time O(n^ε) for insertions and deletions, and worst case time O(1) for lookups. This is the first deterministic solution to the dynamic dictionary problem with linear space, constant query time, and non-trivial update time. In particular, we get a solution to the static dictionary problem with O(n) space, worst case query time O(1), and deterministic initialization time O(n^{1+ε}). The best previous deterministic initialization time for such dictionaries, due to Andersson, is O(n^{2+ε}). The model of computation for these bounds is a unit cost RAM with word size w (i.e. matching the universe), and a standard instruction set. The constants in the big-O's are independent of w. The solutions are weakly non-uniform in w, i.e. the code of the algorithm contains word-sized constants, depending on w, which must be computed at compile-time, rather than at run-time, for the stated run-time bounds to hold. An ingredient of our proofs, which may be interesting in its own right, is the following observation: a good error correcting code for a bit vector fitting into a word can be computed in O(1) time on a RAM with unit cost multiplication. As another application of our technique in a different model of computation, we give a new construction of perfect hashing circuits, improving a construction by Goldreich and Wigderson. In particular, we show that for any set S ⊆ {0,1}^w of size n, there is a Boolean circuit C of size O(w log w) with w inputs and 2 log n outputs so that the function defined by C is 1-1 on S. The best previous bound on the size of such a circuit was O(w log w loglog w). (Tech report available on the web)
It is well known that a graph G admits a strongly connected orientation if and only if G is 2-edge-connected. In fact, such an orientation can be found in linear time. Things get more complicated when the goal is to find a strongly connected orientation that is optimal with respect to some natural measure of optimality (such as minimizing the maximal distance in the orientation, minimizing the (weighted) sum of distances taken over all pairs of vertices, ...). I will discuss the complexity of these problems, present optimal orientations for some classes of graphs that are important from a practical point of view, and propose guidelines for the design of efficient heuristic algorithms.
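To make the first statement concrete, the classical construction behind Robbins' theorem can be sketched as follows: run a depth-first search and orient every edge away from the endpoint that first examines it, so tree edges point away from the root and all remaining edges point from descendant to ancestor. For a 2-edge-connected input this gives a strongly connected orientation in linear time; the optimization variants discussed in the talk are of course much harder. The Python sketch below (recursive DFS, illustrative only) assumes the input graph is connected.

def strong_orientation(n, edges):
    # Orient the edges of a 2-edge-connected graph so the result is
    # strongly connected: tree edges away from the root, other edges
    # from descendant to ancestor (Robbins' theorem).
    adj = [[] for _ in range(n)]
    for idx, (u, v) in enumerate(edges):
        adj[u].append((v, idx))
        adj[v].append((u, idx))
    oriented = [None] * len(edges)       # edge index -> (tail, head)
    visited = [False] * n

    def dfs(u):
        visited[u] = True
        for v, idx in adj[u]:
            if oriented[idx] is None:
                oriented[idx] = (u, v)   # oriented away from u
                if not visited[v]:
                    dfs(v)

    dfs(0)
    return oriented

# A 4-cycle is 2-edge-connected; the orientation returned is the directed cycle.
print(strong_orientation(4, [(0, 1), (1, 2), (2, 3), (3, 0)]))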
The Graph Minor Theorem (GMT) promises that finite forbidden substructure characterizations, in the manner of Kuratowski's characterization of planarity, exist for many natural graph properties. The proof of the GMT is nonconstructive, at least to the extent that simply knowing how to recognize a graph ideal is not enough information to allow the set of obstructions to be effectively computed. The talk is about the big question: ``What kinds of information about an ideal allow the obstruction set to be computed?'' This is still far from being well understood. The talk will describe a recent extension of earlier results of Fellows and Langston on the computation of obstruction sets. The new methods allow obstruction sets to be computed, e.g., from monadic second order logical descriptions of an ideal, or from a Myhill-Nerode-type congruence for the ideal (these may sound exotic, but are usually available), without the necessity of a prior bound on maximum obstruction width. As an example, the new methods allow the intertwines of an arbitrary graph and a tree to be effectively computed, and can probably be extended to the intertwines of an arbitrary graph and a planar graph, and maybe to the intertwines of arbitrary graphs. Perhaps surprisingly, these methods are practical and have been at least partly implemented, and nontrivial, previously unknown obstruction sets have been mechanically computed. The computation of the obstruction set for torus embeddings, for example, now appears to be within reach. Despite these new positive results, there is the distinct possibility of much stronger noncomputability theorems as well. These will also be discussed. Altogether, this is an area in which there are many elegant open problems that will be highlighted. Joint work with: Kevin Cattell (Univ. Victoria), Michael Dinneen (Univ. Auckland), Rod Downey (Victoria Univ., Wellington) and Mike Langston (Univ. Tennessee).
A protein evolves slower than its coding DNA, making it more reliable to align (i.e. compare) the protein rather than the underlying DNA. However, aligning sequences only at the protein level, disregarding the fact that the protein corresponds to a coding sequence of DNA, is not a biologically sound solution. I will present a model for pairwise alignment in which the underlying coding DNA sequence is considered when aligning proteins. A simple O(n^4) time algorithm computing the optimal alignment in the model was presented by Hein in 1994. I will present an improved algorithm computing the optimal alignment in time O(n^2). This improvement was conjectured by Hein in 1994 and is achieved by application of dynamic programming.
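As background, here is a minimal sketch of the classical quadratic-time dynamic-programming alignment (the textbook Needleman-Wunsch recurrence with a linear gap cost). Hein's combined DNA/protein cost model is more involved, but the improved algorithm builds on the same dynamic-programming principle; the cost functions below are placeholders, not the model from the talk.

def alignment_cost(a, b, sub, gap):
    # D[i][j] = minimum cost of aligning a[:i] with b[:j]
    n, m = len(a), len(b)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        D[i][0] = i * gap
    for j in range(1, m + 1):
        D[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i][j] = min(D[i - 1][j - 1] + sub(a[i - 1], b[j - 1]),  # (mis)match
                          D[i - 1][j] + gap,                          # gap in b
                          D[i][j - 1] + gap)                          # gap in a
    return D[n][m]

# Unit mismatch and gap costs give ordinary edit distance.
print(alignment_cost("ACGT", "AGT", lambda x, y: 0 if x == y else 1, 1))  # 1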
We consider the ``standard'' problem in the theory of zero-knowledge protocols: showing in zero-knowledge that a Boolean formula of size n is satisfiable, with error probability 2^{-n}. We show that this can be done in two phases: an interactive pre-processing phase (which can be done in idle time, since it is independent of the formula to be handled later) with communication complexity O(n^2 log n) bits; and a proof phase, where the prover sends just one message of length O(n log n) bits. The best known interactive protocols have complexity O(n^2) bits, so our result shows that at a marginal cost in total complexity, the actual proof can be made non-interactive and much smaller than what was previously known. The constants involved are small enough to provide for realistic size formulas: a proof for a formula with 10,000 binary operators would be about 14 Kbyte. The key to our result is the use of Karchmer and Wigderson's span programs to represent the formula in question.
The presentation is divided into two parts. I: Faster deterministic sorting and searching in linear space. We present a significant improvement on linear space deterministic sorting and searching. On a unit-cost RAM with word size w, an ordered set of n w-bit keys (viewed as binary strings or integers) can be maintained in O(min(√(log n), log n/log w + loglog n, log w loglog n)) time per operation, including insert, delete, member search, and neighbour search. The cost for searching is worst-case while the cost for updates is amortized. As an application, n keys can be sorted in linear space at O(n √(log n)) worst-case cost. The best previous method for deterministic sorting and searching in linear space has been the fusion trees, which support updates and queries in O(log n/loglog n) amortized time and sorting in O(n log n/loglog n) worst-case time. We also make two minor observations on adapting our data structure to the input distribution and on the complexity of perfect hashing. II: Tight bounds for searching a sorted array of strings. Given n strings, each of k characters, arranged in alphabetical order (i.e., a string precedes another string if it has the smallest character in the first position in which the two strings differ), how many characters must we probe to determine whether a k-character query string is present? We assume that the strings are given in an array and that no extra information is available. If k is a constant, we can solve the problem using O(log n) probes by means of binary search, and this is optimal, but what happens for larger values of k? In the presence of preprocessing and extra storage, there are efficient methods, but what if we are just given the sorted strings? The question is a fundamental one; we are simply asking for the complexity of searching a dictionary for a string, where the common assumption that entire strings can be compared in constant time is replaced by the assumption that only single characters can be compared in constant time. For sufficiently long strings, the latter assumption seems more realistic. At first glance the problem may appear easy---some kind of generalized binary search should do the trick. However, closer acquaintance with the problem reveals a surprising intricacy. Here, we will present tight upper and lower bounds.
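For part II, the naive baseline is the obvious generalized binary search: compare the query against the probed string one character at a time, for O(k log n) character probes in the worst case. The Python sketch below only fixes the model of single-character comparisons; the tight bounds presented in the talk are more subtle.

def contains(arr, q):
    # Decide whether the string q occurs in the lexicographically sorted
    # array arr, probing one character at a time.
    lo, hi = 0, len(arr) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        s = arr[mid]
        i = 0
        while i < len(q) and i < len(s) and q[i] == s[i]:
            i += 1                      # probe characters until a mismatch
        if i == len(q) and i == len(s):
            return True                 # all characters matched
        if i == len(q) or (i < len(s) and q[i] < s[i]):
            hi = mid - 1                # q precedes arr[mid]
        else:
            lo = mid + 1                # q follows arr[mid]
    return False

print(contains(["aab", "abc", "baa", "bab"], "baa"))  # True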
This seminar will be based on the paper ``P=BPP unless E has sub-exponential circuits: Derandomizing the XOR-lemma'' by Russell Impagliazzo and Avi Wigderson (IW), to be presented at STOC'97. Abstract of the IW paper: Yao showed that the XOR of independent random instances of a somewhat hard Boolean function becomes almost completely unpredictable. In this paper we show that, in non-uniform settings, total independence is not necessary for this result to hold. We give a pseudo-random generator which produces n instances of the function for which the analog of the XOR lemma holds. This is the first derandomization of a ``direct product'' result. Our generator is a combination of two known ones - the random walks on expander graphs (Ajtai et al., 1987; Cohen and Wigderson, 1989; Impagliazzo and Zuckerman, 1989) and the nearly disjoint subsets generator (Nisan, 1991). The quality of the generator is proved via a new proof of the XOR lemma, which may be useful for other direct product results. Combining our generator with the approach of Nisan and Wigderson (1994), Babai et al. (1993) and the generator of Impagliazzo (1995) gives substantially improved results for hardness vs. randomness trade-offs. In particular, we show that if any problem in E = DTIME(2^{O(n)}) has circuit complexity 2^{Ω(n)}, then P=BPP.
I will present parts of the paper [1] Noam Nisan and Avi Wigderson: Hardness vs. randomness, JCSS 49 (1994), no. 2, 149--167. This lays the foundation for the recent paper [2] Russell Impagliazzo and Avi Wigderson: P=BPP unless E has sub-exponential circuits, STOC '97, which I will talk about the following week. Building on ideas developed by Blum and Micali, and Yao, [1] explains how the existence of a `really hard' function (somewhere around exponential time) implies P=BPP. From there, a handful of papers, which I won't cover (and don't understand), implied the statement P=BPP unless *bla*, where *bla* are increasingly unlikely statements, culminating in [2]. However, the intuition behind all these results, trading hardness for randomness, is the same as in [1].
Joint work with Theis Rauhe. We prove new lower bounds for dynamic algorithms from various fields. The results strengthen, generalise, and further simplify previous work with Søren Skyum that Theis Rauhe presented at the ALCOM seminar last year. (In the talk I assume no familiarity with this previous work.) The key result (again) is an extension of the time stamp method of Fredman and Saks, which generalises their result; the proof has been streamlined and is considerably more transparent than our previous work. However, the talk will focus on applications of this result, which is much more fun. These include very easy reductions that yield lower bound proofs for dynamic planar point location and graph connectivity. After this warm-up, we will consider the complexity of the dynamic prefix problem for arbitrary symmetric functions. We exhibit connections to the Boolean circuit complexity of these functions.
This seminar will be based mostly on the paper ``On Span Programs'' by Karchmer and Wigderson (KW), presented at Structure in Complexity Theory '93. We briefly review the concept of a span program (SP) and some of the complexity results for SPs proved by KW. We then look in some detail at monotone span programs and KW's construction of secret sharing schemes from SPs. We show how this solves a previously open problem in secret sharing: monotone functions do exist that have efficient secret sharing schemes but require superpolynomial monotone formula size. Thus monotone formula size and efficiency of secret sharing are not equivalent, as previously conjectured by some researchers.
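For intuition, here is a toy sketch of the classical formula-based secret sharing construction (AND gates split the secret into random summands, OR gates hand the same value to both subformulas, leaves give shares to players); monotone span programs strictly generalize this construction, which is exactly the point of the separation above. The modulus and the tuple encoding of formulas are illustrative choices.

import random

P = 2**31 - 1                          # a prime modulus (illustrative choice)

def share(formula, secret, shares):
    # formula is ("leaf", player), ("and", f1, ..., ft) or ("or", f1, ..., ft)
    op = formula[0]
    if op == "leaf":
        shares.setdefault(formula[1], []).append(secret)
    elif op == "or":
        for sub in formula[1:]:
            share(sub, secret, shares)   # every subformula can reconstruct alone
    elif op == "and":
        subs = formula[1:]
        parts = [random.randrange(P) for _ in subs[:-1]]
        parts.append((secret - sum(parts)) % P)   # summands add up to the secret
        for sub, part in zip(subs, parts):
            share(sub, part, shares)

# Access structure: (player 1 AND player 2) OR player 3.
f = ("or", ("and", ("leaf", 1), ("leaf", 2)), ("leaf", 3))
shares = {}
share(f, 42, shares)
print((shares[1][0] + shares[2][0]) % P)   # players 1 and 2 reconstruct 42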
A general overview of the Euclidean Steiner Tree Problem will be given during the first part of the talk. The problem of finding a suboptimal Steiner tree inside a simple polygon will be addressed in the second part of the talk. An approach based on the concatenation of Steiner trees with few terminals provides solutions of good quality. Efficient methods of determining optimal Steiner trees with three and four terminals inside a simple polygon will be outlined.
The fundamental problem of range searching has been the subject of much research, and many elegant (main memory) data structures have been developed for the problem and its special cases. Unfortunately most of these structures are not efficient when mapped to external memory. However, the practical need for I/O support has led to the development of a large number of external data structures, which have good average-case behavior for common problems but fail to be efficient in the worst case sense. Very recently some progress has been made on the construction of external range searching structures with good worst-case I/O performance. In this talk we give a short survey of these results and present our optimal solution to the special case of external two-dimensional range searching called dynamic interval management. This problem has important applications especially in object-oriented databases and constraint logic programming.
We state a new sampling lemma and use it to improve the running time of dynamic graph algorithms. For the dynamic connectivity problem the previously best randomized algorithm takes expected time O(log^3 n) per update, amortized over Ω(m) updates. Using the new sampling lemma, we improve its running time to O(log^2 n). Similarly improved running times are achieved for 2-edge connectivity, k-weight minimum spanning tree, and bipartiteness. Joint work with Monika Rauch Henzinger.
The aim of the course is to investigate how linear algebraic methods may be employed in combinatorics and algorithms, in particular for graph problems. Starting from the basics of algebraic graph theory, the focus will be on the Lovász theta function of a graph (a P-time computable function sandwiched between two NP-complete parameters of a graph) and on methods stemming from it, such as semi-definite programming and its applications in developing approximation algorithms for graph problems such as colouring. Topics:
The construction of priority queues is a classical topic in data structures. Recently, it has been considered how to implement priority queues on parallel machines. In this talk we focus on how to achieve optimal speedup for the individual priority queue operations known from the sequential setting. We present time and work optimal priority queues for the CREW PRAM. Our priority queues support FindMin in constant time with one processor, and MakeQueue, Insert, Meld, FindMin, ExtractMin, Delete and DecreaseKey in constant time with O(log n) processors. A priority queue can be built in time O(log n) with O(n/log n) processors. In parallel, k elements can be inserted into a priority queue in time O(log k) with O(log n + k/log k) processors. With a slowdown of O(loglog n) in time, the priority queues adapt to the EREW PRAM without increasing the work. A pipelined version of our priority queues can be implemented on a processor array of size O(log n), supporting the operations MakeQueue, Insert, Meld, FindMin, ExtractMin, Delete and DecreaseKey in constant time.
We address the issue of efficiently searching on external dynamic data structures for strings, introducing the following External Dynamic Substring Search problem. Consider a set of (external) text strings kept in secondary storage. The set can be dynamically changed by inserting or deleting strings, and on-line searched to find all the occurrences of an arbitrary pattern string as a substring of the text strings in the set. In the first part, we describe a text indexing data structure for secondary storage, called the SB-tree, which allows us to solve the External Dynamic Substring Search problem with provably good worst-case I/O bounds. It requires optimal space and makes the on-line search alphabet-independent also in main memory. In the second part, we show that SB-trees combine the best of B-trees and suffix arrays, overcoming the limitations of inverted files, suffix arrays, suffix trees, and prefix B-trees. The performance of SB-trees is evaluated in a practical setting, under a number of searching and updating experiments. Improved performance is obtained by a new space-efficient and alphabet-independent organization of the internal nodes of SB-trees, and a new batch insertion procedure that avoids thrashing.
Let S be a finite alphabet and let a1, a2, ..., ak be strings over S of lengths n1, n2, ..., nk. Let S' be the alphabet S extended with a special `space' character `-'. An alignment of the strings a1, a2, ..., ak is specified by a (k × m)-matrix A, where m ≥ max{n1, n2, ..., nk}. Each element of the matrix is a member of S' and each row i contains the characters of ai in order, interspersed with m - ni spaces. Let d : S' × S' → R be a pseudo-metric. The sum-of-pairs score (SP-score) of an alignment A of the strings a1, a2, ..., ak is defined as the sum of d over all pairs of rows and all columns, i.e. SP(A) is the sum of d(A[i,l], A[j,l]) over all 1 ≤ i < j ≤ k and 1 ≤ l ≤ m.
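Computed directly from this definition, the SP-score is straightforward to evaluate; the short Python sketch below uses an illustrative unit-cost pseudo-metric (0 for identical symbols, including two spaces, and 1 otherwise).

def sp_score(alignment, d):
    # alignment: list of k equal-length strings over the extended alphabet,
    # with '-' for spaces; d: pseudo-metric on pairs of symbols.
    k, m = len(alignment), len(alignment[0])
    total = 0
    for i in range(k):
        for j in range(i + 1, k):
            for col in range(m):
                total += d(alignment[i][col], alignment[j][col])
    return total

d = lambda x, y: 0 if x == y else 1
print(sp_score(["AC-T", "ACGT", "A-GT"], d))  # 4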
Consider a protocol in which a prover can convince a verifier that a given word x belongs to a language L. This is called a proof system for L. If the verifier gets only negligible information except for the fact that x is in L, regardless of his computing power, the protocol is said to be statistical zero-knowledge. What characterizes a language that has a statistical zero-knowledge proof system? This seems to be a very difficult question. The class of such languages, known as SZKIP, contains both languages in NP and languages believed not to be in NP. On the other hand, SZKIP does not contain NP unless the polynomial-time hierarchy collapses. In this work, we try to approach the nature of SZKIP by showing that many languages in SZKIP are in a certain sense stable under monotone operations. More precisely, we consider the following scenario: suppose we are given a language L in SZKIP. From this and a monotone Boolean function f on n inputs, we can construct in a natural way a new language f(L): suppose we are given a division of some word x into n sub-words x1, ..., xn; then we create an n-bit string B by saying that the i'th bit of B is 1 iff xi is in L. We define that x is in f(L) iff f(B)=1. If for example f is a threshold function with threshold k, we are simply asking whether at least k of the n sub-words are in L. We show that under certain conditions on L and f, both f(L) and f(L') (where L' is the complement of L) are in SZKIP. More specifically, it is sufficient that the proof system of L is a 3-move Arthur-Merlin game and that f has a polynomial size monotone formula. This in fact means that any language previously known to be in SZKIP immediately gives rise to an infinite family of languages in SZKIP. Joint work with Ronald Cramer, CWI.
I will present Haken's proof of an exponential lower bound on the size of monotone solutions to NP-complete problems (Armin Haken: Counting Bottlenecks to Show Monotone P≠NP, FOCS '95). The method of proving lower bounds by bottleneck counting is illustrated for monotone Boolean circuits. This paper gives another proof of the result of Razborov and Andreev that monotone Boolean circuits must have exponential size when solving a problem in NP. More specifically, the paper defines a graph recognition problem called BMS. Any monotone circuit that solves BMS must contain a number of gates that is exponential in the eighth root of the input size. The actual instances of the BMS problem used to prove the lower bound are easy to separate for non-monotone circuits. The proof is self-contained and uses only elementary combinatorics.