Learn the fundamental aspects of the business statistics, data mining, and machine learning techniques required to understand the huge amount of data generated by your organization. This book explains practical business analytics through examples, covers the steps involved in using it correctly, and shows you the context in which a particular technique does not make sense. Further, Practical Business Analytics using R helps you understand specific issues faced by organizations and how the solutions to these issues can be facilitated by business analytics.
This book will discuss and explore the following through examples and case studies:
- An introduction to R: data management and R functions
- The architecture, framework, and life cycle of a business analytics project
- Descriptive analytics using R: descriptive statistics and data cleaning
- Data mining: classification, association rules, and clustering
- Predictive analytics: simple regression, multiple regression, and logistic regression
This book includes case studies on important business analytics techniques, such as classification, association, clustering, and regression. The R language is the statistical tool used to demonstrate the concepts throughout the book.
What You Will Learn
Write R programs to handle data
Build analytical models and draw useful inferences from them
Discover the basic concepts of data mining and machine learning
Carry out predictive modeling
Define a business issue as an analytical problem
Who This Book Is For
This book is for beginners who want to understand and learn the fundamentals of analytics using R, including students, managers, executives, strategy and planning professionals, software professionals, and BI/DW professionals.
Umesh R. Hodeghatta, Ph.D.
Dr. Umesh Rao Hodeghatta is an acclaimed professional in the fields of machine learning, NLP, and business analytics. He holds a master's degree in EE from Oklahoma State University, USA, and a Ph.D. from the Indian Institute of Technology (IIT), Kharagpur, with a specialization in machine learning and NLP. Dr. Hodeghatta currently works as a data scientist in the United States, serving multiple clients. He has more than 20 years of work experience and has held technical and senior management positions at XIM-Bhubaneswar, McAfee, Cisco Systems, and AT&T Bell Laboratories, USA. He recently established the IBM Big Data Analytics Lab and the HP Research Lab at Xavier University. Dr. Hodeghatta has published many articles in international journals and conference proceedings; "Understanding Twitter as e-WOM", "Sentiment Analysis of Hollywood Movies on Twitter", and "PCI DSS - Penalty of not being Compliant" are a few of his well-known publications. In addition, he has authored a book titled "The InfoSec Handbook: An Introduction to Information Security", published by Springer Apress, USA. Dr. Hodeghatta has contributed his services to many professional organizations and regulatory bodies. He was an Executive Committee member of the IEEE Computer Society (India); an academic advisory member for the Information Systems Audit and Control Association (ISACA), USA; an IT advisor for the government of Odisha, India; a Technical Advisory Member of the International Neural Network Society (INNS) India; and an advisory member of the Task Force on Business Intelligence & Knowledge Management. Owing to these achievements, he has been listed in "World's Who's Who" for the years 2012, 2013, 2014, and 2015, published by Marquis Who's Who, USA. He is also a senior member of the IEEE, USA. Further details about Dr. Hodeghatta are available at http://www.mytechnospeak.com
Umesha Nayak is a director and principal consultant of MUSA Software Engineering Pvt. Ltd., which focuses on systems, process, and management consulting. He has 33 years of experience, including 12 years of consulting for IT, manufacturing, and other organizations across the globe. His qualifications include a Master of Science in Software Systems; a Master of Arts in Economics; CAIIB; Certified Information Systems Auditor (CISA) and Certified in Risk and Information Systems Control (CRISC) from ISACA, US; PGDFM; Certified Ethical Hacker from EC Council; Certified Lead Auditor for many standards; and Certified Coach, among others. He has worked extensively in banking, software development, product design and development, project management, program management, information technology audits, information application audits, quality assurance, coaching, product reliability, human resource management, and consultancy. He was Vice President and Corporate Executive Council member at Polaris Software Lab, Chennai, prior to his current assignment. He also held various roles at Polaris Software Lab, such as Head of Quality, Head of SEPG, and Head of Strategic Practice Unit - Risks & Treasury. He started his journey with computers in 1981 with ICL mainframes and continued with minis and PCs. He was one of the founding members of information systems auditing in the banking industry in India. He has effectively guided many organizations through successful ISO 9001, ISO 27001, CMMI, and other certifications, as well as process and product improvements. He has coauthored the book "The InfoSec Handbook: An Introduction to Information Security", published by Apress Open.
Table of Contents
Chapter 1: Introduction (page count 10)
Chapter Goal: Gives an overview of analytics. Starts with the basics of business analytics and some use cases to build a background for the upcoming chapters, and covers some of the most widely used analytical tools and techniques.
Chapter 2: Basics of R (page count 20)
Chapter Goal: This chapter introduces the R tool, the R environment, the workspace, variables, data types, and fundamental tool-related concepts. It provides enough of the basics to start programming in R for data analysis.
Chapter 3: R Datasets and Variables (page count 20)
Chapter Goal: This chapter introduces data types, variables, and data manipulation in R. It also explores various R packages and how they can be used for data analytics.
Chapter 4: Introduction to Descriptive Analytics (page count 20)
Chapter Goal: This chapter provides the basic statistics required for data analysis. It discusses fundamentals such as population and sample, descriptive statistics such as mean, median, and mode, and measures of dispersion.
Chapter 5: Business Analytics Process and Data Exploration (page count 30)
Chapter Goal: This chapter discusses the data exploration, validation, and cleaning required for data analysis, and documents some of the data-cleaning techniques used in the industry.
Chapter 6: Supervised Machine Learning - Classification (page count 30)
Chapter Goal: This chapter provides an overview of machine learning and data mining techniques, with a focus on classification. It discusses different classification techniques and the R packages available to perform classification tasks. Examples include classification using Naïve Bayes, classification using decision trees, and building decision trees using R.
Chapter 7: Unsupervised Machine Learning - Clustering and Association Rule (page count 20)
Chapter Goal: This chapter explains unsupervised machine learning techniques for data analysis, such as clustering and association rules.
Chapter 8: Simple Linear Regression (page count 20)
Chapter Goal: Introduces predictive analytics techniques. Covers simple linear regression, how to fit data to a linear model, and how to interpret the results, along with concepts such as correlation, the R-squared value, and regression assumptions.
Chapter 9: Multiple Linear Regression (page count 30)
Chapter Goal: We discuss multiple regression in this chapter, as well as concepts like multicollinearity and adjusted R-squared.
Chapter 10: Logistic Regression (page count 20)
Chapter Goal: Explains why logistic regression is a commonly used predictive modeling technique. In this chapter, we discuss model building using logistic regression, including what logistic regression is and how to validate the fitted logistic regression line.
Chapter 11: Big Data Analytics and Future Trends in Analytics
Chapter Goal: This final chapter covers the basics of the big data analytics ecosystem and the value of such a system in carrying out effective analysis. It introduces readers to the concept of big data analytics and the Hadoop ecosystem.