Hands-On Data Science with SQL Server 2017

Name: Hands-On Data Science with SQL Server 2017 | Perform end-to-end data analysis to gain efficient data insight
Brand: Packt Publishing
Price: 34.99 EUR
Availability: OnlineOnly

Perform end-to-end data analysis to gain efficient data insight

Chmel Marek Chmel(Author)

Packt Publishing

1st Edition

Published on 29. November 2018

506 pages

E-Book

ePUB with Adobe-DRM

System requirements

978-1-78899-643-3 (ISBN)

€34.99incl. 7% vat

System requirements

for ePUB with Adobe-DRM

E-Book Single Licence

Available for download

Description

Alles über E-Books | Antworten auf Fragen rund um E-Books, Kopierschutz und Dateiformate finden Sie in unserem Info- & Hilfebereich.

Alles über E-Books, Kopierschutz & Dateiformate finden Sie in unserem Info- & Hilfebereich.

Find, explore, and extract big data to transform into actionable insights

Key Features

Perform end-to-end data analysis-from exploration to visualization

Real-world examples, tasks, and interview queries to be a proficient data scientist

Understand how SQL is used for big data processing using HiveQL and SparkSQL

Book Description

SQL Server is a relational database management system that enables you to cover end-to-end data science processes using various inbuilt services and features.

Hands-On Data Science with SQL Server 2017 starts with an overview of data science with SQL to understand the core tasks in data science. You will learn intermediate-to-advanced level concepts to perform analytical tasks on data using SQL Server. The book has a unique approach, covering best practices, tasks, and challenges to test your abilities at the end of each chapter. You will explore the ins and outs of performing various key tasks such as data collection, cleaning, manipulation, aggregations, and filtering techniques. As you make your way through the chapters, you will turn raw data into actionable insights by wrangling and extracting data from databases using T-SQL. You will get to grips with preparing and presenting data in a meaningful way, using Power BI to reveal hidden patterns. In the concluding chapters, you will work with SQL Server integration services to transform data into a useful format and delve into advanced examples covering machine learning concepts such as predictive analytics using real-world examples.

By the end of this book, you will be in a position to handle the growing amounts of data and perform everyday activities that a data science professional performs.

What you will learn

Understand what data science is and how SQL Server is used for big data processing

Analyze incoming data with SQL queries and visualizations

Create, train, and evaluate predictive models

Make predictions using trained models and establish regular retraining courses

Incorporate data source querying into SQL Server

Enhance built-in T-SQL capabilities using SQLCLR

Visualize data with Reporting Services, Power View, and Power BI

Transform data with R, Python, and Azure

Who this book is for

Hands-On Data Science with SQL Server 2017 is intended for data scientists, data analysts, and big data professionals who want to master their skills learning SQL and its applications. This book will be helpful even for beginners who want to build their career as data science professionals using the power of SQL Server 2017. Basic familiarity with SQL language will aid with understanding the concepts covered in this book.

More details

Other editions

Content

Cover
Title Page
Copyright and Credits
About Packt
Contributors
Table of Contents
Preface
Chapter 1: Data Science Overview
Introducing data science
Data science project life cycle
Business understanding
Getting data
Modelling and analysis
Deployment and visualization
Final acceptance
Data science domains
Math and statistics
Visualizing the types of data
Statistics 101
Central tendency
Skewness
Variability
Machine learning
SQL Server and machine learning
Choosing the right algorithm
Big data
SQL Server and big data
Summary
Chapter 2: SQL Server 2017 as a Data Science Platform
Technical requirements
SQL Server evolution
What's available in the pack?
History of SQL Server
SQL Server in the cloud
Azure SQL Database
Azure SQL Data Warehouse
SQL Server Services and their use with data science
SQL Server Integration Services
SQL Server Analysis Services
Tabular Mode
Multidimensional mode
PowerPivot Mode
Querying languages
Reporting Services
Development tools for Reporting Services
Power BI Report Server
Machine Learning Services
Summary
Chapter 3: Data Sources for Analytics
Technical requirements
Getting data from databases
Importing data from SQL Server
Importing data from other database systems
Importing flat files
Working with XML data
Working with JSON
Retrieve data as JSON
Processing stored JSON data
External data with PolyBase
Installing and configuring
Summary
Chapter 4: Data Transforming and Cleaning with T-SQL
Technical requirements
The need for data transformation
Database architectures for data transformations
Direct source for data analysis
Staging-target scenario
Landing-staging-target scenario
Tools eligible for data movement
Distributed queries
SQL Server Integration Services
Why should we use SSIS?
What is needed to develop an SSIS solution?
Where should SSIS be used?
Is there an alternative to SSIS?
Transforming data
Full data load
Incremental data load
The MERGE statement
CHECKSUM
Temporal tables
Denormalizing data
Relational normalization
First normal form
Second normal form
Third normal form
Need for denormalization
Ways of denormalization
Computed columns
Denormalization using joins
Using views and stored procedures
Database applications
Using views
Using stored procedures
Performance considerations
Writing correct code
Using indexes
B-tree indexes
COLUMNSTORE INDEX
Summary
Questions
Chapter 5: Data Exploration and Statistics with T-SQL
Technical requirements
T-SQL aggregate queries
Common properties of aggregate functions
Aggregate functions
COUNT, COUNT(*), and COUNT_BIG
MIN and MAX
SUM
AVG
VAR and VARP
STDEV and STDEVP
Using groups
Using the HAVING clause
Ranking, framing, and windowing
Ranking functions
ROW_NUMBER
RANK
DENSE_RANK
NTILE
Running aggregates
Using aggregate functions in running aggregates
Using aggregate functions
Using the LEAD and LAG functions
Calculating with percentiles
The PERCENT_RANK and CUME_DIST functions
The PERCENTILE_CONT and PERCENTILE_DISC functions
Summary
Questions
Chapter 6: Custom Aggregations on SQL Server
Technical requirements
Overview of SQLCLR
Use cases of using SQLCLR
How to work with SQLCLR
Instance and database configurations to use with SQLCLR
Creating CLR aggregations
Example goal and assignment
Skeleton of CLR aggregation
Implementing methods
Implementing custom serialization
Implementing the Init method
Implementing the Accumulate method
Implementing the Merge method
Implementing the terminate method
Deployment and testing
Limitations and performance considerations
Development issues and risks
Maintenance issues and risks
Performance issues and risks
Summary
Questions
Chapter 7: Data Visualization
Technical requirements
Data visualization - preparation phase
Power BI Report Server
Starting with Power BI Desktop
Defining the data source
Adding visualizations to the Report
Visual interactions
Publishing reports
SQL Server Reporting Services
Adding charts to Reports
Using SQL Server Data Tools
Summary
Chapter 8: Data Transformations with Other Tools
Technical requirements
Categorization, missing values, and normalization
Categorization
Missing values
Normalization
Z-score
Feature-scaling
Using Integration Services for data transformation
Setting up a SSIS project
Categorizing the products
Using R for data transformation
Preparing client R environment
R Syntax first steps
Working example of Z-score computed in R
Using Data Factory for data transformation
Creating Azure Data Factory
Creating simple copy data with ADF
Summary
Questions
Chapter 9: Predictive Model Training and Evaluation
Technical requirements
Preparing SQL Server
Setting up and configuring ML services
Preparing to install our own R packages
Creating data structures
The concept of machine learning in databases
Creating physical data structures
Creating common objects
Creating objects using filestreams
Creating objects using temporal tables
Deploying, training, and evaluating a predictive model
Saving our machine learning model to filestreams
Saving a machine learning model to temporal tables
Summary
Questions
Chapter 10: Making Predictions
Technical requirements
Reading models from a database
Reading the model from a common table
Reading the model from a temporal table
Submitting values to an external script
Submitting values into the external script
Deserializing a predictive model
Making the prediction
Using the PREDICT keyword
Making the predictive model self-training
Re-calculating a predictive model regularly
Re-calculating a predictive model asynchronously
Creating a message type
Creating a contract
Creating queues and services
Sending a request to train a new model
Consuming requests and sending responses
Testing the asynchronous solution
Summary
Questions
Chapter 11: Getting It All Together - A Real-World Example
Technical requirements
Assignment and preparation
SQL Server
Data description
Data exploration
Exploring data using T-SQL
Exploring data using the SSIS Data Profiling Task
Exploring the SourceData.Actions table
Exploring data using R
Data transformation
Training and using predictive models for estimations
Preparing the schema for the model
Training the model
Using the rxLinMod function and finishing the model
Using the model in predictions
Summary
Questions
Chapter 12: Next Steps with Data Science and SQL
Data science next steps
Next steps with SQL Server
Big data clusters
Machine learning
Machine learning services on Linux
Machine learning high availability
Data science in the cloud
Summary
Other Books You May Enjoy
Index

System requirements

Save as PDF Copy link into clipboard

Schweitzer Fachinformationen

Hands-On Data Science with SQL Server 2017

Description

More details

Other editions

Additional editions

Content

System requirements