Python: Real-World Data Science
Packt Publishing
Published on 6 January 2016
Book
Paperback/Softback
1255 pages
978-1-78646-516-0 (ISBN)
Description
Unleash the power of Python and its robust data science capabilities
About This Book
* Unleash the power of Python 3 objects
* Learn to use powerful Python libraries for effective data processing and analysis
* Harness the power of Python to analyze data and create insightful predictive models
* Unlock deeper insights into machine learning with this vital guide to cutting-edge predictive analytics
Who This Book Is For
Entry-level analysts who want to enter the data science world will find this course very useful for getting acquainted with Python's data science capabilities for real-world data analysis.
What You Will Learn
* Install and set up Python
* Implement objects in Python by creating classes and defining methods
* Get acquainted with NumPy to use it with arrays and array-oriented computing in data analysis
* Create effective visualizations for presenting your data using Matplotlib
* Process and analyze data using the time series capabilities of pandas
* Interact with different kinds of database systems, such as file, disk format, Mongo, and Redis
* Apply data mining concepts to real-world problems
* Compute on big data, including real-time data from the Internet
* Explore how to use different machine learning models to ask different questions of your data
In Detail
The Python: Real-World Data Science course will take you on a journey to become an efficient data science practitioner by thoroughly understanding the key concepts of Python. This learning path is divided into four modules, and each module is a mini course in its own right. As you complete each one, you'll have gained key skills and be ready for the material in the next module.
The course begins with getting your Python fundamentals nailed down. Once you are familiar with Python's core concepts, it's time to dive into the field of data science. In the second module, you'll learn how to perform data analysis using Python in a practical and example-driven way. The third module will teach you how to design and develop data mining applications using a variety of datasets, starting with basic classification and affinity analysis and moving on to more complex data types including text, images, and graphs. Machine learning and predictive analytics have become the most important approaches to uncovering data gold mines. In the final module, we'll discuss the necessary details regarding machine learning concepts, offering intuitive yet informative explanations of how machine learning algorithms work, how to use them, and most importantly, how to avoid the common pitfalls.
Style and approach
This course includes all the resources that will help you jump into the data science field with Python and learn how to make sense of data. The aim is to create a smooth learning path that will teach you how to get started with powerful Python libraries and perform various data science techniques in depth.
More details
Language
English
Place of publication
Birmingham
United Kingdom
Dimensions
Height: 235 mm
Width: 191 mm
ISBN-13
978-1-78646-516-0 (9781786465160)
Copyright in bibliographic data is held by Nielsen Book Services Limited or its licensors: all rights reserved.
Persons
Dusty Phillips is a Canadian software developer and author currently living in Seattle, Washington. He has been active in the open source community for a decade and a half and programming in Python for nearly all of it. He cofounded the popular Puget Sound Programming Python meetup group; drop by and say hi if you're in the area.
Python 3 Object Oriented Programming, Packt Publishing, was the first of his books. He has also written Creating Apps In Kivy, O'Reilly, on the mobile Python library, and self-published Hacking Happy, a journey to mental wellness for the technically inclined. He was hospitalized for suicidal tendencies shortly after the first edition of this book was published and has been an outspoken proponent for positive mental health ever since.
Fabrizio Romano was born in Italy in 1975. He holds a master's degree in computer science engineering from the University of Padova. He is also a certified Scrum master.
Before Python, he worked with several other languages, such as C/C++, Java, PHP, and C#.
In 2011, he moved to London and started working as a Python developer for Glasses Direct, one of Europe's leading online prescription glasses retailers.
He then worked as a senior Python developer for TBG (now Sprinklr), one of the world's leading companies in social media advertising. At TBG, he and his team collaborated with Facebook and Twitter. They were the first in the world to get access to the Twitter advertising API. He wrote the code that published the first geo-narrowcasted promoted tweet in the world using the API.
He currently works as a senior platform developer at Student.com, a company that is revolutionizing the way international students find their perfect home all around the world.
He has delivered talks on Teaching Python and TDD with Python at the last two editions of EuroPython and at Skillsmatter in London.
Phuong Vo.T.H has an MSc degree in computer science related to machine learning. After graduation, she worked at several companies as a data scientist. She has experience in analyzing users' behavior and building recommendation systems based on users' web histories. She loves to read machine learning and mathematics algorithm books, as well as data analysis articles.
Martin Czygan studied German literature and computer science in Leipzig, Germany. He has been working as a software engineer for more than 10 years. For the past eight years, he has been diving into Python, and is still enjoying it. In recent years, he has been helping clients to build data processing pipelines and search and analytics systems. His consultancy can be found at http://www.xvfz.net.
Robert Layton has a PhD in computer science and has been an avid Python programmer for many years. He has worked closely with some of the largest companies in the world on data mining applications for real-world data and has also been published extensively in international journals and conferences. He has extensive experience in cybercrime and text-based data analytics, with a focus on behavioral modeling, authorship analysis, and automated open source intelligence. He has contributed code to a number of open source libraries, including the scikit-learn library used in this book, and was a Google Summer of Code mentor in 2014. Robert runs a data mining consultancy company called dataPipeline, providing data mining and analytics solutions to businesses in a variety of industries.
Sebastian Raschka is a PhD student at Michigan State University, who develops new computational methods in the field of computational biology. He has been ranked as the number one most influential data scientist on GitHub by Analytics Vidhya.
He has years of experience in Python programming and has conducted several seminars on the practical applications of data science and machine learning. Talking and writing about data science, machine learning, and Python really motivated Sebastian to write this book in order to help people develop data-driven solutions without necessarily needing to have a machine learning background.
He has also actively contributed to open source projects, and methods that he implemented are now successfully used in machine learning competitions such as Kaggle. In his free time, he works on models for sports predictions, and if he is not in front of the computer, he enjoys playing sports.