
Microsoft Big Data Solutions
Content
- Cover
- Title Page
- Copyright
- Contents
- Introduction
- Part I What Is Big Data?
- Chapter 1 Industry Needs and Solutions
- What's So Big About Big Data?
- A Brief History of Hadoop
- Nutch
- What Is Hadoop?
- Derivative Works and Distributions
- Hadoop Distributions
- Core Hadoop Ecosystem
- Important Apache Projects for Hadoop
- The Future for Hadoop
- Summary
- Chapter 2 Microsoft's Approach to Big Data
- A Story of "Better Together"
- Competition in the Ecosystem
- SQL on Hadoop Today
- Hortonworks and Stinger
- Cloudera and Impala
- Microsoft's Contribution to SQL in Hadoop
- Deploying Hadoop
- Deployment Factors
- Deployment Topologies
- Deployment Scorecard
- Summary
- Part II Setting Up for Big Data with Microsoft
- Chapter 3 Configuring Your First Big Data Environment
- Getting Started
- Getting the Install
- Running the Installation
- On-Premise Installation: Single-Node Installation
- HDInsight Service: Installing in the Cloud
- Windows Azure Storage Explorer Options
- Validating Your New Cluster
- Logging into HDInsight Service
- Verify HDP Functionality in the Logs
- Common Post-Setup Tasks
- Loading Your First Files
- Verifying Hive and Pig
- Summary
- Part III Storing and Managing Big Data
- Chapter 4 HDFS, Hive, HBase, and HCatalog
- Exploring the Hadoop Distributed File System
- Explaining the HDFS Architecture
- Interacting with HDFS
- Exploring Hive: The Hadoop Data Warehouse Platform
- Designing, Building, and Loading Tables
- Querying Data
- Configuring the Hive ODBC Driver
- Exploring HCatalog: HDFS Table and Metadata Management
- Exploring HBase: An HDFS Column-Oriented Database
- Columnar Databases
- Defining and Populating an HBase Table
- Using Query Operations
- Summary
- Chapter 5 Storing and Managing Data in HDFS
- Understanding the Fundamentals of HDFS
- HDFS Architecture
- NameNodes and DataNodes
- Data Replication
- Using Common Commands to Interact with HDFS
- Interfaces for Working with HDFS
- File Manipulation Commands
- Administrative Functions in HDFS
- Moving and Organizing Data in HDFS
- Moving Data in HDFS
- Implementing Data Structures for Easier Management
- Rebalancing Data
- Summary
- Chapter 6 Adding Structure with Hive
- Understanding Hive's Purpose and Role
- Providing Structure for Unstructured Data
- Enabling Data Access and Transformation
- Differentiating Hive from Traditional RDBMS Systems
- Working with Hive
- Creating and Querying Basic Tables
- Creating Databases
- Creating Tables
- Adding and Deleting Data
- Querying a Table
- Using Advanced Data Structures with Hive
- Setting Up Partitioned Tables
- Loading Partitioned Tables
- Using Views
- Creating Indexes for Tables
- Summary
- Chapter 7 Expanding Your Capability with HBase and HCatalog
- Using HBase
- Creating HBase Tables
- Loading Data into an HBase Table
- Performing a Fast Lookup
- Loading and Querying HBase
- Managing Data with HCatalog
- Working with HCatalog and Hive
- Defining Data Structures
- Creating Indexes
- Creating Partitions
- Integrating HCatalog with Pig and Hive
- Using HBase or Hive as a Data Warehouse
- Summary
- Part IV Working with Your Big Data
- Chapter 8 Effective Big Data ETL with SSIS, Pig, and Sqoop
- Combining Big Data and SQL Server Tools for Better Solutions
- Why Move the Data?
- Transferring Data Between Hadoop and SQL Server
- Working with SSIS and Hive
- Connecting to Hive
- Configuring Your Packages
- Loading Data into Hadoop
- Getting the Best Performance from SSIS
- Transferring Data with Sqoop
- Copying Data from SQL Server
- Copying Data to SQL Server
- Using Pig for Data Movement
- Transforming Data with Pig
- Using Pig and SSIS Together
- Choosing the Right Tool
- Use Cases for SSIS
- Use Cases for Pig
- Use Cases for Sqoop
- Summary
- Chapter 9 Data Research and Advanced Data Cleansing with Pig and Hive
- Getting to Know Pig
- When to Use Pig
- Taking Advantage of Built-in Functions
- Executing User-defined Functions
- Using UDFs
- Building Your Own UDFs for Pig
- Using Hive
- Data Analysis with Hive
- Types of Hive Functions
- Extending Hive with Map-reduce Scripts
- Creating a Custom Map-reduce Script
- Creating Your Own UDFs for Hive
- Summary
- Part V Big Data and SQL Server Together
- Chapter 10 Data Warehouses and Hadoop Integration
- State of the Union
- Challenges Faced by Traditional Data Warehouse Architectures
- Technical Constraints
- Business Challenges
- Hadoop's Impact on the Data Warehouse Market
- Keep Everything
- Code First (Schema Later)
- Model the Value
- Throw Compute at the Problem
- Introducing Parallel Data Warehouse (PDW)
- What Is PDW?
- Why Is PDW Important?
- How PDW Works
- Project Polybase
- Polybase Architecture
- Business Use Cases for Polybase Today
- Speculating on the Future for Polybase
- Summary
- Chapter 11 Visualizing Big Data with Microsoft BI
- An Ecosystem of Tools
- Excel
- PowerPivot
- Power View
- Power Map
- Reporting Services
- Self-service Big Data with PowerPivot
- Setting Up the ODBC Driver
- Loading Data
- Updating the Model
- Adding Measures
- Creating Pivot Tables
- Rapid Big Data Exploration with Power View
- Spatial Exploration with Power Map
- Summary
- Chapter 12 Big Data Analytics
- Data Science, Data Mining, and Predictive Analytics
- Data Mining
- Predictive Analytics
- Introduction to Mahout
- Building a Recommendation Engine
- Getting Started
- Running a User-to-user Recommendation Job
- Running an Item-to-item Recommendation Job
- Summary
- Chapter 13 Big Data and the Cloud
- Defining the Cloud
- Exploring Big Data Cloud Providers
- Amazon
- Microsoft
- Setting Up a Big Data Sandbox in the Cloud
- Getting Started with Amazon EMR
- Getting Started with HDInsight
- Storing Your Data in the Cloud
- Storing Data
- Uploading Your Data
- Exploring Big Data Storage Tools
- Integrating Cloud Data
- Other Cloud Data Sources
- Summary
- Chapter 14 Big Data in the Real World
- Common Industry Analytics
- Telco
- Energy
- Retail
- Data Services
- IT/Hosting Optimization
- Marketing Social Sentiment
- Operational Analytics
- Failing Fast
- A New Ecosystem of Technologies
- User Audiences
- Summary
- Part VI Moving Your Big Data Forward
- Chapter 15 Building and Executing Your Big Data Plan
- Gaining Sponsor and Stakeholder Buy-In
- Problem Definition
- Scope Management
- Stakeholder Expectations
- Defining the Criteria for Success
- Identifying Technical Challenges
- Environmental Challenges
- Challenges in Skillset
- Identifying Operational Challenges
- Planning for Setup/Configuration
- Planning for Ongoing Maintenance
- Going Forward
- The Hand-Off to Operations
- After Deployment
- Summary
- Chapter 16 Operational Big Data Management
- Hybrid Big Data Environments: Cloud and On-Premise Solutions Working Together
- Ongoing Data Integration with Cloud and On-Premise Solutions
- Integration Thoughts for Big Data
- Backups and High Availability in Your Big Data Environment
- High Availability
- Disaster Recovery
- Big Data Solution Governance
- Creating Operational Analytics
- System Center Operations Manager for HDP
- Installing the Ambari SCOM Management Pack
- Monitoring with the Ambari SCOM Management Pack
- Summary
- Index
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
System requirements:
- Computer (Windows, Mac OS X, Linux): Install the free Adobe Digital Editions reader before downloading (see eBook Help).
- Tablet/smartphone (Android, iOS): Install the free Adobe Digital Editions app or the PocketBook app before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF format displays every book page identically on any hardware, which makes it suitable for complex layouts such as those in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers and smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.