
Become a Python Data Analyst
Description
- Implement real-world datasets to perform predictive analytics with Python
- Access modern data analysis techniques and detailed code with scikit-learn and SciPy
Book Description
Python is one of the most popular languages among data analysts and statisticians for working with massive datasets and complex data visualizations. Become a Python Data Analyst introduces the essential Python tools and libraries for the data analysis process, from preparing data to performing simple statistical analyses and creating meaningful data visualizations. In this book, we will cover Python libraries such as NumPy, pandas, matplotlib, seaborn, SciPy, and scikit-learn, and apply them in practical data analysis and statistics examples. As you make your way through the chapters, you will learn to use the Jupyter Notebook efficiently to operate on and manipulate data with NumPy and the pandas library. In the concluding chapters, you will gain experience in building simple predictive models and carrying out statistical computation and analysis using rich Python tools and proven data analysis techniques. By the end of this book, you will have hands-on experience performing data analysis with Python.
What you will learn
- Explore important Python libraries and learn to install the Anaconda distribution
- Understand the basics of NumPy
- Produce informative and useful visualizations for analyzing data
- Perform common statistical calculations
- Build predictive models and understand the principles of predictive analytics
Who this book is for
Become a Python Data Analyst is for entry-level data analysts, data engineers, and BI professionals who want to make full use of Python tools for performing efficient data analysis. Prior knowledge of Python programming is necessary to understand the concepts covered in this book.

Content
- Cover
- Title Page
- Copyright and Credits
- Packt Upsell
- Contributor
- Table of Contents
- Preface
- Chapter 1: The Anaconda Distribution and Jupyter Notebook
- The Anaconda distribution
- Installing Anaconda
- Jupyter Notebook
- Creating your own Jupyter Notebook
- Notebook user interfaces
- Using the Jupyter Notebook
- Running code in a code cell
- Running markdown syntax in a text cell
- Styles and formats
- Lists
- Useful keyboard shortcuts
- Summary
- Chapter 2: Vectorizing Operations with NumPy
- Introduction to NumPy
- Problems and solutions
- NumPy arrays
- Creating arrays in NumPy
- Creating arrays from lists
- Creating arrays from built-in NumPy functions
- Attributes of arrays
- Basic math with arrays
- Common manipulations with arrays
- Indexing arrays
- Slicing arrays
- Reshaping arrays
- Using NumPy for simulations
- Coin flips
- Simulating stock returns
- Summary
- Chapter 3: Pandas - Everyone's Favorite Data Analysis Library
- Introduction to the pandas library
- Important objects in pandas
- Series
- Creating a pandas series
- DataFrames
- Creating a pandas DataFrame
- Anatomy of a DataFrame
- Operations and manipulations of pandas
- Inspection of data
- Selection, addition, and deletion of data
- Slicing DataFrames
- Selection by labels
- Answering simple questions about a dataset
- Total employees by department in the dataset
- Overall attrition rate
- Average hourly rate
- Average number of years
- Employees with the most number of years
- Overall employee satisfaction
- Answering further questions
- Employees with Low JobSatisfaction
- Employees with both Low JobSatisfaction and JobInvolvement
- Employee comparison
- Summary
- Chapter 4: Visualization and Exploratory Data Analysis
- Introducing Matplotlib
- Terminologies in Matplotlib
- Introduction to pyplot
- Object-oriented interface
- Common customizations
- Colors
- Colornames
- Setting axis limits
- Setting ticks and tick labels
- Legend
- Annotations
- Producing grids, horizontal, and vertical lines
- EDA with seaborn and pandas
- Understanding the seaborn library
- Performing exploratory data analysis
- Key objectives when performing data analysis
- Types of variable
- Analyzing variables individually
- Understanding the main variable
- Numerical variables
- Categorical variables
- Relationships between variables
- Scatter plot
- Box plot
- Complex conditional plots
- Summary
- Chapter 5: Statistical Computing with Python
- Introduction to SciPy
- Statistics subpackage
- Confidence intervals
- Probability calculations
- Hypothesis testing
- Performing statistical tests
- Summary
- Chapter 6: Introduction to Predictive Analytics Models
- Predictive analytics and machine learning
- Understanding the scikit-learn library
- scikit-learn
- Building a regression model using scikit-learn
- Regression model to predict house prices
- Summary
- Other Books You May Enjoy
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books – i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our eBook Help page.