
Machine Learning
Description
Machine Learning: Hands-On for Developers and Technical Professionals provides hands-on instruction and fully-coded working examples for the most common machine learning techniques used by developers and technical professionals. The book contains a breakdown of each ML variant, explaining how it works and how it is used within certain industries, allowing readers to incorporate the presented techniques into their own work as they follow along. A core tenet of machine learning is a strong focus on data preparation, and a full exploration of the various types of learning algorithms illustrates how the proper tools can help any developer extract information and insights from existing data. The book includes a full complement of Instructor's Materials to facilitate use in the classroom, making this resource useful for students and as a professional reference.
At its core, machine learning is a mathematical, algorithm-based technology that forms the basis of historical data mining and modern big data science. Scientific analysis of big data requires a working knowledge of machine learning, which forms predictions based on known properties learned from training data. Machine Learning is an accessible, comprehensive guide for the non-mathematician, providing clear guidance that allows readers to:
* Learn the languages of machine learning including Hadoop, Mahout, and Weka
* Understand decision trees, Bayesian networks, and artificial neural networks
* Implement Association Rule, Real Time, and Batch learning
* Develop a strategic plan for safe, effective, and efficient machine learning
By learning to construct a system that can learn from data, readers can increase their utility across industries. Machine learning sits at the core of deep dive data analysis and visualization, which is increasingly in demand as companies discover the goldmine hiding in their existing data. For the tech professional involved in data science, Machine Learning: Hands-On for Developers and Technical Professionals provides the skills and techniques required to dig deeper.
Content
Introduction
Well, times have changed since I wrote the first edition of this book. Since 2014 there has been more emphasis on data and what it can do for us, but also on how that power can be used against us. Hardware has gotten better, processing has gotten much faster, and the ability to classify, predict, and decide based on our data is extraordinary. At the same time, we've become much more aware of the risks of how data is used, the biases that can creep in, and the fact that a lot of black-box models don't always get things right.
Still, it's an exciting time to be involved. We still create more data than we can sensibly process. New ideas involving machine learning are being presented daily. The appetite for learning has grown rapidly, too.
Data mining and machine learning have been around a number of years already. When you look closely, the machine learning algorithms that are being applied aren't any different from what they were years ago; what is new is how they are applied at scale. When you look at the number of organizations that are creating the data, it's really, in my opinion, a minority. Google, Facebook, Twitter, Netflix, and a small handful of others are the ones getting the majority of mentions in the headlines with a mixture of algorithmic learning and tools that enable them to scale. So, the real question you should ask is, "How does all this apply to the rest of us?"
Large-scale data with near-instant processing has come to the fore. The emphasis has moved from batch systems like Hadoop to more streaming-based systems like Kafka. I admit there will be times in this book when I look at the Big Data side of machine learning (it's a subject I can't ignore), but it's only a small factor in the overall picture of how to get insight from the available data. It is important to remember that I am talking about tools, and the key is figuring out which tools are right for the job you are trying to complete.
Aims of This Book
This book is about machine learning and not about Big Data. It's about the various techniques used to gain insight from your data. By the end of the book, you will have seen how various methods of machine learning work, and you will also have had some practical explanations on how the code is put together, leaving you with a good idea of how you could apply the right machine learning techniques to your own problems.
There's no right or wrong way to use this book. You can start at the beginning and work your way through, or you can just dip in and out of the parts you need to know at the time you need to know them.
"Hands-On" Means Hands-On
Many books on the subject of machine learning that I've read in the past have been very heavy on theory. That's not a bad thing. If you're looking for in-depth theory with really complex-looking equations, I applaud your rigor. Me? I'm more hands-on with my approach to learning and to projects. My philosophy is quite simple.
- Start with a question in mind.
- Find the theory I need to learn.
- Find lots of examples I can learn from.
- Put them to work in my own projects.
As a software developer, I like to see lots of examples. As a teacher, I like to get as much hands-on development time as possible but also get the message across to students as simply as possible. There's something about fingers on keys, coding away on your IDE, and getting things to work that's rather appealing, and it's something that I want to convey in the book.
Everyone has their own learning style. I believe this book covers the most common methods, so everybody will benefit.
"What About the Math?"
Like arguing that your favorite football team is better than another or trying to figure out whether Jimmy Page is a better guitarist than Jeff Beck (I prefer Beck), there are some things that will be debated forever and a day. One such debate is how much math you need to know before you can start doing machine learning.
Doing machine learning and learning the theory of machine learning are two very different subjects. To learn the theory, a good grounding in math is required. This book discusses a hands-on approach to machine learning. With the number of machine learning tools available for developers now, the emphasis is not so much on how these tools work but on how you can make these tools work for you. The hard work has been done, and those who did it deserve credit and applause.
"But You Need a PhD!"
No, you don't!
The long-running debate rages on about the level of knowledge you need before you can start doing analysis on data or claim that you are a data scientist. I believe that if you'd like to take a few years completing a degree and then pursuing the likes of a master's degree and then a PhD, you should feel free to go that route. I'm a little more pragmatic about things and like to get reading and start doing.
Academia is great; and with the large number of online courses, papers, websites, and books on the subject of math, statistics, and data mining, there's enough to keep the most eager of minds occupied. I dip in and out of these resources a lot, and it's definitely a good way to keep up-to-date and investigate what's emerging.
For me, though, there's nothing like getting my hands dirty, grabbing some data, trying out some methods, and looking at the results. If you need to brush up on linear regression theory, then let me reassure you now, there's plenty out there to read, and I'll also cover that in this book.
Lastly, can one person ever be a data scientist? I think it's more likely for a team of people to bring the various skills needed for machine learning into an organization. I talk about this more in Chapter 2.
So, while others in the office are arguing whether to bring some PhD brains in on a project, you can be coding up a decision tree to see whether it's viable.
Over the last few years the job title data scientist has been joined by other titles like data engineer and machine learning engineer. All are valid and all focus on aspects of the data science pipeline. They all have their place.
What Will You Have Learned by the End?
Assuming that you're reading the book from start to finish, you'll learn the common uses for machine learning, different methods of machine learning, and how to apply real-time and batch processing.
There's also nothing wrong with going straight to a specific section that you want to learn. The chapters and examples were created in such a way that no chapter depends on another.
The aim is to cover the common machine learning concepts in a practical manner. Using the existing free tools and libraries that are available to you, there's little stopping you from starting to gain insight from the existing data that you have.
Balancing Theory and Hands-on Learning
There are many books on machine learning and data mining available, and finding the balance of theory and practical examples is hard. When planning this book, I stressed the importance of practical and easy-to-use examples, providing step-by-step instructions, so you can see how things are put together.
I'm not saying that the theory is light, because it's not. Understanding what you want to learn or, more importantly, how you want to learn will determine how you read this book.
You can think of the book as split into three distinct sections. The first section covers the question, "What is machine learning?" and concentrates on planning for projects, data acquisition, and cleaning. For those wanting a refresher on the math and stats side of things, I've included a new chapter; it also covers linear regression and standard deviation.
The next section takes a closer look at some of the building-block algorithms used in machine learning projects. The chapters on clustering, decision trees, support vector machines, association rules learning, and neural networks provide both a background to how they work and code examples for you to work with. It's important to get hands-on early.
Lastly, I focus on the real-world tools used in enterprise; these are tools like Spark, Kafka, and R. Knowing how these frameworks and tools are put together will give you a grounding to know what to use when.
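To give a flavor of the hands-on style described here, the following is a minimal sketch in Java (one of the book's example languages) of the two refresher topics mentioned above: standard deviation and a simple least-squares linear regression. It is an illustration only and is not taken from the book or its repository; the class name `StatsSketch`, its method names, and the sample data are all assumptions made for this example.

```java
/**
 * Illustrative statistics helpers. Hypothetical example code,
 * not part of the book's GitHub repository.
 */
final class StatsSketch {

    private StatsSketch() { }

    /** Arithmetic mean of the values. */
    static double mean(double[] values) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        double sum = 0.0;
        for (double v : values) {
            sum += v;
        }
        return sum / values.length;
    }

    /** Sample standard deviation (divides by n - 1). */
    static double standardDeviation(double[] values) {
        if (values.length < 2) {
            throw new IllegalArgumentException("need at least two values");
        }
        double m = mean(values);
        double sumOfSquares = 0.0;
        for (double v : values) {
            sumOfSquares += (v - m) * (v - m);
        }
        return Math.sqrt(sumOfSquares / (values.length - 1));
    }

    /**
     * Ordinary least-squares fit of y = slope * x + intercept.
     * Returns a two-element array: {slope, intercept}.
     */
    static double[] fitLine(double[] x, double[] y) {
        if (x.length != y.length || x.length < 2) {
            throw new IllegalArgumentException("x and y must have the same length (>= 2)");
        }
        double meanX = mean(x);
        double meanY = mean(y);
        double sxy = 0.0; // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0; // sum of (x - meanX)^2
        for (int i = 0; i < x.length; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        if (sxx == 0.0) {
            throw new IllegalArgumentException("x values must not all be equal");
        }
        double slope = sxy / sxx;
        return new double[] {slope, meanY - slope * meanX};
    }

    public static void main(String[] args) {
        // Made-up sample data in which y is exactly 2 * x.
        double[] x = {1, 2, 3, 4, 5};
        double[] y = {2, 4, 6, 8, 10};
        double[] line = fitLine(x, y);
        System.out.printf("sd(x)=%.4f slope=%.2f intercept=%.2f%n",
                standardDeviation(x), line[0], line[1]);
    }
}
```

The sketch uses the sample form of standard deviation (dividing by n - 1); dividing by n instead gives the population form. Which one is appropriate depends on whether your data is a sample or the whole population.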
Source Code for This Book
All the code that is explained in the chapters of the book has been saved in a GitHub repository for you to download and try. For this edition, I've also included the Maven dependency file so you can easily build the project you are working on.
The address for the repository is https://github.com/jasebell/mlbook2ndedition. You can also find it on the Wiley website at www.wiley.com/go/machinelearning2e.
The examples are in either Java, Clojure, or R. If you want to extend your knowledge into other languages, then a search around the GitHub site might lead you to some interesting examples.
Code has been separated by chapter; there's a folder in the repository for each of the chapters, and each has its own build file. The data is also within the repository in the data directory and has been split by each chapter.
Using Git
Git is a version control system that is widely used in business and the open source software community. If you...
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.