Data Discovery in Data Lakes

Name: Data Discovery in Data Lakes
Brand: Springer
Price: 128.39 EUR
Availability: PreOrder

Ziawasch Abedjan, Mahdi Esmailoghli, Sainyam Galhotra (Authors)

Springer (Publisher)

Expected publication date: approx. 21 August 2026

Book

Hardback

978-3-032-30821-4 (ISBN)

€128.39 incl. 7% VAT

Not yet published

Persons

Ziawasch Abedjan is Chair of the Data Integration and Data Preparation group at TU Berlin and a research group lead at the Berlin Institute for the Foundations of Learning and Data (BIFOLD). His research focuses on developing generalizable and scalable techniques for data integration problems such as data cleaning, data science pipeline generation and understanding, and data discovery. Previously, he chaired the Database and Information Systems group at Leibniz University Hannover. He has held positions as Assistant Professor at TU Berlin, Postdoctoral Associate at MIT, Research Associate at QCRI, Senior Researcher at the German Research Center for Artificial Intelligence (DFKI), and Visiting Academic at Amazon Search. He received his PhD from the Hasso Plattner Institute in Potsdam and was awarded the University of Potsdam's Best Dissertation Prize in 2014. He has co-authored more than 80 peer-reviewed papers in leading database and software engineering venues and has received several academic awards.

His research is supported by the German Research Foundation (DFG) and the German Federal Ministry of Science and Education.

Mahdi Esmailoghli is a postdoctoral research associate at the David R. Cheriton School of Computer Science at the University of Waterloo. His research centers on the challenges of data discovery in large-scale data lakes. He investigates how data systems, including their index structures and algorithms, can help researchers and users navigate vast table repositories to identify relevant datasets and suggest optimized pipelines.

His work has been published in leading data management venues, including SIGMOD, VLDB, ICDE, EDBT, and CIDR. Mahdi earned his PhD in Computer Science with the highest honors from the Technical University of Berlin.

Sainyam Galhotra is an assistant professor in the Department of Computer Science at Cornell University. His research focuses on trustworthy artificial intelligence, data-centric AI, causal reasoning, and reliable agentic systems. He studies how AI systems can reason over complex multimodal data while remaining transparent, robust, and adaptive in dynamic environments. His work has been published in leading venues across machine learning, data management, and AI systems, including ICLR, NeurIPS, SIGMOD, VLDB, and ICSE.

He is the recipient of the Best Paper Award at FSE 2017, the Most Reproducible Paper Award at SIGMOD 2017 and 2018, and the Best Artifact Paper Honorable Mention Award at SIGMOD 2023. He has also been recognized as a Data Science Rising Star, a DAAD AInet Fellow, and the first recipient of the Krithi Ramamritham Award at the University of Massachusetts Amherst for contributions to database research.
