Worst-case Optimal Join Algorithms

  • Hung Q. Ngo
  • Ely Porat
  • Christopher Ré
  • Atri Rudra

Research output: Contribution to journal › Article › peer-review

117 Scopus citations

Abstract

Efficient join processing is one of the most fundamental and well-studied tasks in database research. In this work, we examine algorithms for natural join queries over many relations and describe a new algorithm to process these queries optimally in terms of worst-case data complexity. Our result builds on recent work by Atserias, Grohe, and Marx, who gave bounds on the size of a natural join query in terms of the sizes of the individual relations in the body of the query. These bounds, however, are not constructive: they rely on Shearer's entropy inequality, which is information-theoretic. Thus, the previous results leave open the question of whether there exist algorithms whose runtimes achieve these optimal bounds. An answer to this question may be interesting to database practice, as we show in this article that any project-join style plans, such as ones typically employed in a relational database management system, are asymptotically slower than the optimal for some queries. We present an algorithm whose runtime is worst-case optimal for all natural join queries. Our result may be of independent interest, as our algorithm also yields a constructive proof of the general fractional cover bound by Atserias, Grohe, and Marx without using Shearer's inequality. This bound implies two famous inequalities in geometry: the Loomis-Whitney inequality and its generalization, the Bollobás-Thomason inequality. Hence, our results algorithmically prove these inequalities as well. Finally, we discuss how our algorithm can be used to evaluate full conjunctive queries optimally, to compute a relaxed notion of joins and to optimally (in the worst-case) enumerate all induced copies of a fixed subgraph inside of a given large graph.
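
To make the fractional cover bound concrete, consider the triangle query R(A,B) ⋈ S(B,C) ⋈ T(A,C). Giving each relation weight 1/2 covers every attribute, so the bound limits the output to at most sqrt(|R|·|S|·|T|) tuples. The Python sketch below is an illustration only and is not taken from the article; the names `index_by_first` and `triangle_join` are assumptions made for this example. It binds one attribute at a time and scans the smaller candidate set when intersecting, which is a common way to respect that bound for this particular query. It does not reproduce the article's general algorithm.

```python
from collections import defaultdict


def index_by_first(pairs):
    """Map each first component to the set of second components paired with it."""
    idx = defaultdict(set)
    for x, y in pairs:
        idx[x].add(y)
    return idx


def triangle_join(R, S, T):
    """Return all (a, b, c) with (a, b) in R, (b, c) in S and (a, c) in T.

    Illustrative sketch (hypothetical helper, not the article's algorithm):
    attributes are bound in the order A, B, C, and the last step scans the
    smaller of the two candidate sets for C.
    """
    r_by_a = index_by_first(R)  # a -> {b : (a, b) in R}
    s_by_b = index_by_first(S)  # b -> {c : (b, c) in S}
    t_by_a = index_by_first(T)  # a -> {c : (a, c) in T}

    out = []
    for a, bs in r_by_a.items():
        if a not in t_by_a:
            continue  # a cannot be extended by any tuple of T
        for b in bs:
            if b not in s_by_b:
                continue  # b cannot be extended by any tuple of S
            # Intersect the candidates for c, iterating over the smaller set.
            small, large = sorted((s_by_b[b], t_by_a[a]), key=len)
            for c in small:
                if c in large:
                    out.append((a, b, c))
    return out


R = {(1, 2), (1, 3), (2, 3)}
S = {(2, 3), (3, 1)}
T = {(1, 3), (2, 1)}
result = triangle_join(R, S, T)
# The output size squared never exceeds |R| * |S| * |T| for this query.
assert len(result) ** 2 <= len(R) * len(S) * len(T)
```

A plan that first joins two of the relations pairwise and then filters with the third can materialize an intermediate result much larger than the final output, which is the kind of gap the abstract attributes to project-join style plans.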

Original language: English
Article number: 16
Journal: Journal of the ACM
Volume: 65
Issue number: 3
DOI: https://doi.org/10.1145/3180143
State: Published - 13 Mar 2018

Bibliographical note

Publisher Copyright:
© 2018 ACM.

Funding

A preliminary version of this article was presented at PODS’12 as Reference [62]. We thank Georg Gottlob for sending us a full version of his work [30]. We thank XuanLong Nguyen for introducing us to the Loomis-Whitney inequality. We thank Dung Nguyen for catching some errors in the earlier statement of our algorithm. We thank the anonymous PODS’12 and JACM referees for many helpful comments that have greatly improved the presentation of the article.

H.N.’s work is partly supported by NSF Grants No. CCF-1161196 and No. CCF-1319402. C.R. acknowledges the National Science Foundation (NSF) under CAREER Awards No. IIS-1353606 and No. CCF-1356918, the Office of Naval Research (ONR) under Awards No. N000141210041 and No. N000141310129, the Sloan Research Fellowship, the Moore Foundation Data Driven Investigator award, and gifts from American Family Insurance, Google, Lightspeed Ventures, and Toshiba. A.R.’s work on this project is supported by NSF Grants No. CCF-0844796 and No. CCF-1319402.

Authors’ addresses: H. Q. Ngo and A. Rudra, 338 Davis Hall, University at Buffalo, Buffalo, NY, 14214. USA; emails: {hungngo, atri}@buffalo.edu; E. Porat, Bar-Ilan University, Ramat-Gan, 5290002 Israel; email: [email protected]; C. Ré, Gates Computer Science Building, 353 Serra Mall, Stanford, CA 94305. USA; email: [email protected].

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].
© 2018 ACM 0004-5411/2018/03-ART16 $15.00 https://doi.org/10.1145/3180143

Funders (with funder numbers where listed)

  • American Family Insurance
  • Lightspeed Ventures: CCF-0844796
  • National Science Foundation: IIS-1353606, CCF-1356918
  • Office of Naval Research: N000141310129, N000141210041
  • Directorate for Computer and Information Science and Engineering: 1054009
  • Gordon and Betty Moore Foundation
  • Google

    Keywords

    • Bollobás-Thomason inequality
    • Join Algorithms
    • Loomis-Whitney inequality
    • fractional cover bound
