Practical Machine Learning in R

Fred Nwanganga, Mike Chapple (authors)

Wiley (publisher)

1st edition

Published April 10, 2020

464 pages

E-book

ePUB with Adobe DRM

978-1-119-59153-5 (ISBN)

€25.99 incl. 7% VAT

E-book single license

Available for download

Chapter 1
What Is Machine Learning?

Welcome to the world of machine learning! You're about to embark upon an exciting adventure discovering how data scientists use algorithms to uncover knowledge hidden within the troves of data that businesses, organizations, and individuals generate every day.

If you're like us, you often find yourself in situations where you are facing a mountain of data that you're certain contains important insights, but you just don't know how to extract that needle of knowledge from the proverbial haystack. That's where machine learning can help. This book is dedicated to providing you with the knowledge and skills you need to harness the power of machine learning algorithms. You'll learn about the different types of problems that are well-suited for machine learning solutions and the different categories of machine learning techniques that are most appropriate for tackling different types of problems.

Most importantly, we're going to approach this complex, technical field with a practical mind-set. In this book, our purpose is not to dwell on the intricate mathematical details of these algorithms. Instead, we'll focus on how you can put those algorithms to work for you immediately. We'll also introduce you to the R programming language, which we believe is particularly well-suited to approaching machine learning problems from a practical standpoint. But don't worry about programming or R for now. We'll get to that in Chapter 2. For now, let's dive in and get a better understanding of how machine learning works.

By the end of this chapter, you will have learned the following:

How machine learning allows the discovery of knowledge in data
How unsupervised learning, supervised learning, and reinforcement learning techniques differ from each other
How classification and regression problems differ from each other
How to measure the effectiveness of machine learning algorithms
How cross-validation improves the accuracy of machine learning models

DISCOVERING KNOWLEDGE IN DATA

Our goal in the world of machine learning is to use algorithms to discover knowledge in our datasets that we can then apply to help us make informed decisions about the future. That's true regardless of the subject area in which we're working, as machine learning has applications across a wide variety of fields. For example, here are some cases where machine learning commonly adds value:

Segmenting customers and determining the marketing messages that will appeal to different customer groups
Discovering anomalies in system and application logs that may be indicative of a cybersecurity incident
Forecasting product sales based on market and environmental conditions
Recommending the next movie that a customer might want to watch based on their past activity and the preferences of similar customers
Setting prices for hotel rooms far in advance based on forecasted demand

Of course, those are just a few examples. Machine learning can bring value to almost every field where discovering previously unknown knowledge is useful, and we challenge you to think of a field where knowledge doesn't offer an advantage!

Introducing Algorithms

As we proceed throughout this book, you'll see us continually referring to machine learning techniques as algorithms. This is a term from the world of computer science that comes up again and again in the world of data science, so it's important that you understand it. While the term sounds technically complex, the concept of an algorithm is actually straightforward, and we'd venture to guess that you use some form of an algorithm almost every day.

An algorithm is, quite simply, a set of steps that you follow when carrying out a process. Most commonly, we use the term when we're referring to the steps that a computer follows when it is carrying out a computational task, but we can think of many things that we do each day as algorithms. For example, when we are walking the streets of a large city and we reach an intersection, we follow an algorithm for crossing the street. Figure 1.1 shows an example of how this process might work.

Of course, in the world of computer science, our algorithms are more complex and are implemented by writing software, but we can think of them in this same way. An algorithm is simply a series of precise observations, decisions, and instructions that tell the computer how to carry out an action. We design machine learning algorithms to discover knowledge in our data. As we progress through this book, you'll learn about many different types of machine learning algorithms and how they work to achieve this goal in very different ways.

Figure 1.1 Algorithm for crossing the street
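To make the idea of "observations, decisions, and instructions" concrete, here is a minimal Python sketch of a street-crossing algorithm. The figure itself is not reproduced here, so the specific checks (a walk signal and approaching cars) and the function names are assumptions chosen for illustration, not the exact steps shown in Figure 1.1.

```python
def crossing_action(signal_says_walk, cars_approaching):
    """Return the next action for a pedestrian standing at an intersection."""
    # Decision 1: obey the pedestrian signal.
    if not signal_says_walk:
        return "wait"
    # Decision 2: even with a walk signal, check for traffic.
    if cars_approaching:
        return "wait"
    # Both checks passed, so proceed.
    return "cross"


def cross_street(observations):
    """Repeat observe -> decide until the decision is to cross.

    `observations` is an iterable of (signal_says_walk, cars_approaching)
    pairs, one per look at the intersection.
    """
    actions = []
    for signal_says_walk, cars_approaching in observations:
        action = crossing_action(signal_says_walk, cars_approaching)
        actions.append(action)
        if action == "cross":
            break
    return actions


# Three looks at the intersection: red signal, then traffic, then clear.
print(cross_street([(False, True), (True, True), (True, False)]))
```

Each pass through the loop is one observation followed by one decision, which is the same structure a computer follows when it carries out a computational task.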

Artificial Intelligence, Machine Learning, and Deep Learning

We hear the terms artificial intelligence, machine learning, and deep learning being used almost interchangeably to describe any sort of technique where computers are working with data. Now that you're entering the world of data science, it's important to have a more precise understanding of these terms.

Artificial intelligence (AI) includes any type of technique where we are attempting to get a computer system to imitate human behavior. As the name implies, we are trying to ask computer systems to artificially behave as if they were intelligent. Now, of course, it's not possible for a modern computer to function at the level of complex reasoning found in the human mind, but we can try to mimic some small portions of human behavior and judgment.
Machine learning (ML) is a subset of artificial intelligence techniques that attempt to apply statistics to data problems in an effort to discover new knowledge by generalizing from examples. Or, in other terms, machine learning techniques are artificial intelligence techniques designed to learn.
Deep learning is a further subdivision of machine learning that uses a set of complex techniques, known as neural networks, to discover knowledge in a particular way. It is a highly specialized subfield of machine learning that is most commonly used for image, video, and sound analysis.

Figure 1.2 shows the relationships between these fields. In this book, we focus on machine learning techniques. Specifically, we focus on the categories of machine learning that do not fit the definition of deep learning.

MACHINE LEARNING TECHNIQUES

The machine learning techniques that we discuss in this book fit into two major categories. Supervised learning algorithms learn patterns based on labeled examples of past data. Unsupervised learning algorithms seek to uncover patterns without the assistance of labeled data. Let's take a look at each of these techniques in more detail.
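As a rough illustration of what "uncovering patterns without labeled data" can look like, the Python sketch below splits a list of numbers into two groups using a bare-bones k-means loop with two centers. The choice of k-means, the function name `two_means`, and the sample spending figures are all assumptions made for this example; the text above does not prescribe a particular technique.

```python
def two_means(values, iterations=10):
    """Split 1-D values into two groups with a basic k-means loop (k = 2)."""
    # Start the two centers at the smallest and largest value.
    low, high = min(values), max(values)
    group_low, group_high = list(values), []
    for _ in range(iterations):
        # Assign every value to whichever center is closer.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        # Move each center to the mean of the values assigned to it.
        low = sum(group_low) / len(group_low)
        if group_high:
            high = sum(group_high) / len(group_high)
    return group_low, group_high


# Made-up monthly spending figures; note that no labels are supplied.
spending = [12, 15, 14, 95, 102, 99]
print(two_means(spending))
```

Nothing in the input says which customers are "low" or "high" spenders. The grouping emerges from the data alone, which is the key contrast with the supervised techniques described next.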

Figure 1.2 The relationship between artificial intelligence, machine learning, and deep learning

Supervised Learning

Supervised learning techniques are perhaps the most commonly used category of machine learning algorithms. The purpose of these techniques is to use an existing dataset to generate a model that then helps us make predictions about future, unlabeled data. More formally, we provide a supervised machine learning algorithm with a training dataset as input. The algorithm then uses that training data to develop a model as its output, as shown in Figure 1.3.

You can think of the model produced by a supervised machine learning algorithm as sort of a crystal ball: once we have it, we can use it to make predictions about our data. Figure 1.4 shows how this model functions. Once we have it, we can take any new data element that we encounter and use the model to make a prediction about that new element based on the knowledge it obtained from the training dataset.

The reason that we use the term supervised to describe these techniques is that we are using a training dataset to supervise the creation of our model. That training dataset contains labels that help us with our prediction task.
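The two stages described above (training data in, model out; new data in, prediction out) can be sketched in a few lines of Python. This is only an illustrative toy, not a method taken from the book: the "model" is simply the average value seen for each label, and the function names `train` and `predict` and the sample data are assumptions.

```python
def train(training_data):
    """Build a model from labeled examples.

    `training_data` is a list of (value, label) pairs. The model is a
    dictionary mapping each label to the average value seen for it.
    """
    totals, counts = {}, {}
    for value, label in training_data:
        totals[label] = totals.get(label, 0.0) + value
        counts[label] = counts.get(label, 0) + 1
    return {label: totals[label] / counts[label] for label in totals}


def predict(model, value):
    """Predict the label whose average training value is closest to `value`."""
    return min(model, key=lambda label: abs(model[label] - value))


# Made-up labeled training data: (measurement, label).
training_data = [(1.0, "a"), (2.0, "a"), (8.0, "b"), (9.0, "b")]

model = train(training_data)   # stage 1: training data -> model
print(model)
print(predict(model, 3.0))     # stage 2: new, unlabeled value -> prediction
print(predict(model, 7.0))
```

The labels in `training_data` are what "supervise" the process: without them, `train` would have nothing to average over and `predict` would have nothing to return.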

Let's reinforce that with a more concrete example. Consider a loan officer working at the car dealership shown in Figure 1.5. The salespeople at the dealership work with individual customers to sell them cars. The customers often don't have the necessary cash on hand to purchase a car outright, so they seek financing options. Our job is to match customers with the right loan product from three choices.

Subprime loans have the most expensive interest rates and are offered to customers who are likely to miss payment deadlines or default on their loans.
Top-shelf loans have the lowest interest rate and are offered to customers who are unlikely to miss payments and have an extremely high likelihood of repayment.
Standard loans are offered to customers who fall in the middle of these two groups and have an interest rate that falls in between those two values.
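One way to picture the loan-matching task as a supervised learning problem is the Python sketch below. Everything specific in it is hypothetical: the two features (`credit_score` and `missed_payments`), the past-customer records, and the choice of a k-nearest-neighbors vote are assumptions made for illustration and are not taken from the book.

```python
from collections import Counter
import math

# Hypothetical labeled history:
# (credit_score, missed_payments in the last year) -> loan product granted.
PAST_LOANS = [
    ((520, 6), "subprime"),
    ((560, 4), "subprime"),
    ((580, 5), "subprime"),
    ((650, 2), "standard"),
    ((670, 1), "standard"),
    ((690, 2), "standard"),
    ((760, 0), "top-shelf"),
    ((790, 0), "top-shelf"),
    ((810, 0), "top-shelf"),
]


def recommend_loan(applicant, past_loans=PAST_LOANS, k=3):
    """Recommend a loan product by majority vote of the k most similar past customers."""

    def distance(a, b):
        # Divide the credit score by 100 so it does not drown out missed payments.
        return math.hypot((a[0] - b[0]) / 100.0, a[1] - b[1])

    nearest = sorted(past_loans, key=lambda row: distance(applicant, row[0]))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


print(recommend_loan((800, 0)))
print(recommend_loan((660, 1)))
print(recommend_loan((540, 5)))
```

The labeled history plays the role of the training dataset, and each new applicant is the unlabeled data element for which we want a prediction.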

Figure 1.3 Generic supervised learning model

Figure 1.4 Making predictions with a supervised learning model

We receive loan applications from salespeople and must make a decision on the spot. If we don't act quickly, the customer may...
