Abstract
Exploratory Data Analysis (EDA) is an essential yet highly demanding task. To get a head start before exploring a new dataset, data scientists often prefer to view existing EDA notebooks: illustrative, curated exploratory sessions on the same dataset, created by fellow data scientists who shared them online. Unfortunately, such notebooks are not always available (e.g., if the dataset is new or confidential). To address this, we present ATENA, a system that takes an input dataset and auto-generates a compelling exploratory session, presented in an EDA notebook. We formulate EDA as a control problem and devise a novel Deep Reinforcement Learning (DRL) architecture to effectively optimize the notebook generation. Though ATENA uses a limited set of EDA operations, our experiments show that it generates useful EDA notebooks, allowing users to gain actual insights.
| Original language | English |
|---|---|
| Title of host publication | SIGMOD 2020 - Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data |
| Publisher | Association for Computing Machinery |
| Pages | 1527-1537 |
| Number of pages | 11 |
| ISBN (Electronic) | 9781450367356 |
| State | Published - 14 Jun 2020 |
| Externally published | Yes |
| Event | 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020 - Portland, United States. Duration: 14 Jun 2020 → 19 Jun 2020 |
Publication series
| Name | Proceedings of the ACM SIGMOD International Conference on Management of Data |
|---|---|
| ISSN (Print) | 0730-8078 |
Conference
| Conference | 2020 ACM SIGMOD International Conference on Management of Data, SIGMOD 2020 |
|---|---|
| Country/Territory | United States |
| City | Portland |
| Period | 14/06/20 → 19/06/20 |
Bibliographical note
Publisher Copyright: © 2020 Association for Computing Machinery.
Funding
Acknowledgments. This work has been partially funded by the Israel Innovation Authority (MDM), the Israel Science Foundation, and the Binational US-Israel Science Foundation.
| Funders |
|---|
| Israel Innovation Authority |
| MDM |
| US-Israel Science Foundation |
| Israel Science Foundation |
Keywords
- EDA
- EDA notebooks
- auto EDA
- auto generated
- autogenerated
- data exploration
- interactive data analysis
- notebooks