
Python: Data Analytics and Visualization
Description
All about e-books | Answers to questions about e-books, copy protection, and file formats can be found in our info and help section.
More details
Persons
Czygan Martin:
Martin Czygan studied German literature and computer science in Leipzig, Germany. He has been working as a software engineer for more than 10 years. For the past eight years, he has been diving into Python, and he still enjoys it. In recent years, he has been helping clients build data processing pipelines and search and analytics systems.
Vo.T.H Phuong:
Phuong Vo.T.H has an MSc degree in computer science, with a focus on machine learning. After graduation, she worked as a data scientist at several companies. She has experience in analyzing users' behavior and building recommendation systems based on users' web histories. She loves to read books on machine learning and mathematical algorithms, as well as data analysis articles.
Kumar Ashish:
Ashish Kumar is a seasoned data science professional, a published author, and a thought leader in the field of data science and machine learning. An IIT Madras graduate and a Young India Fellow, he has around 7 years of experience in implementing and deploying data science and machine learning solutions for challenging industry problems in both hands-on and leadership roles. His core areas of expertise include natural language processing, IoT analytics, R Shiny product development, and ensemble ML methods. He is fluent in Python and R and teaches a popular ML course at Simplilearn. When not crunching data, Ashish sneaks off to the next hip beach around and enjoys the company of his Kindle. He also trains and mentors data science aspirants and fledgling start-ups.
Raman Kirthi:
Kirthi Raman is currently working as a lead data engineer with Neustar Inc, based in McLean, Virginia, USA. Kirthi has worked on data visualization, with a focus on JavaScript, Python, R, and Java, and is a distinguished engineer. Previously, he worked as a principal architect, data analyst, and information retrieval specialist at Quotient, Inc. Kirthi has also worked as a technical lead and manager for a start-up. He has taught discrete mathematics and computer science for several years. Kirthi has a graduate degree in mathematics and computer science from IIT Delhi and an MS in computer science from the University of Maryland. He has written several white papers on data analysis and big data.
Content
- Cover
- Copyright
- Credits
- Preface
- Table of Contents
- Module 1: Getting Started with Python Data Analysis
  - Chapter 1: Introducing Data Analysis and Libraries
    - Data analysis and processing
    - An overview of the libraries in data analysis
    - Python libraries in data analysis
    - Summary
  - Chapter 2: NumPy Arrays and Vectorized Computation
    - NumPy arrays
    - Array functions
    - Data processing using arrays
    - Linear algebra with NumPy
    - NumPy random numbers
    - Summary
  - Chapter 3: Data Analysis with Pandas
    - An overview of the Pandas package
    - The Pandas data structure
    - The essential basic functionality
    - Indexing and selecting data
    - Computational tools
    - Working with missing data
    - Advanced uses of Pandas for data analysis
    - Summary
  - Chapter 4: Data Visualization
    - The matplotlib API primer
    - Exploring plot types
    - Legends and annotations
    - Plotting functions with Pandas
    - Additional Python data visualization tools
    - Summary
  - Chapter 5: Time Series
    - Time series primer
    - Working with date and time objects
    - Resampling time series
    - Downsampling time series data
    - Upsampling time series data
    - Time zone handling
    - Timedeltas
    - Time series plotting
    - Summary
  - Chapter 6: Interacting with Databases
    - Interacting with data in text format
    - Interacting with data in binary format
    - Interacting with data in MongoDB
    - Interacting with data in Redis
    - Summary
  - Chapter 7: Data Analysis Application Examples
    - Data munging
    - Data aggregation
    - Grouping data
    - Summary
  - Chapter 8: Machine Learning Models with scikit-learn
    - An overview of machine learning models
    - The scikit-learn modules for different models
    - Data representation in scikit-learn
    - Supervised learning - classification and regression
    - Unsupervised learning - clustering and dimensionality reduction
    - Measuring prediction performance
    - Summary
- Module 2: Learning Predictive Analytics with Python
  - Chapter 1: Getting Started with Predictive Modelling
    - Introducing predictive modelling
    - Applications and examples of predictive modelling
    - Python and its packages - download and installation
    - Python and its packages for predictive modelling
    - IDEs for Python
    - Summary
  - Chapter 2: Data Cleaning
    - Reading the data - variations and examples
    - Various methods of importing data in Python
    - Basics - summary, dimensions, and structure
    - Handling missing values
    - Creating dummy variables
    - Visualizing a dataset by basic plotting
    - Summary
  - Chapter 3: Data Wrangling
    - Subsetting a dataset
    - Generating random numbers and their usage
    - Grouping the data - aggregation, filtering, and transformation
    - Random sampling - splitting a dataset in training and testing datasets
    - Concatenating and appending data
    - Merging/joining datasets
    - Summary
  - Chapter 4: Statistical Concepts for Predictive Modelling
    - Random sampling and the central limit theorem
    - Hypothesis testing
    - Chi-square tests
    - Correlation
    - Summary
  - Chapter 5: Linear Regression with Python
    - Understanding the maths behind linear regression
    - Making sense of result parameters
    - Implementing linear regression with Python
    - Model validation
    - Handling other issues in linear regression
    - Summary
  - Chapter 6: Logistic Regression with Python
    - Linear regression versus logistic regression
    - Understanding the math behind logistic regression
    - Implementing logistic regression with Python
    - Model validation and evaluation
    - Model validation
    - Summary
  - Chapter 7: Clustering with Python
    - Introduction to clustering - what, why, and how?
    - Mathematics behind clustering
    - Implementing clustering using Python
    - Fine-tuning the clustering
    - Summary
  - Chapter 8: Trees and Random Forests with Python
    - Introducing decision trees
    - Understanding the mathematics behind decision trees
    - Implementing a decision tree with scikit-learn
    - Understanding and implementing regression trees
    - Understanding and implementing random forests
    - Summary
  - Chapter 9: Best Practices for Predictive Modelling
    - Best practices for coding
    - Best practices for data handling
    - Best practices for algorithms
    - Best practices for statistics
    - Best practices for business contexts
    - Summary
  - A List of Links
- Module 3: Mastering Python Data Visualization
  - Chapter 1: A Conceptual Framework for Data Visualization
    - Data, information, knowledge, and insight
    - The transformation of data
    - Data visualization history
    - How does visualization help decision-making?
    - Visualization plots
    - Summary
  - Chapter 2: Data Analysis and Visualization
    - Why does visualization require planning?
    - The Ebola example
    - A sports example
    - Creating interesting stories with data
    - Perception and presentation methods
    - Some best practices for visualization
    - Visualization tools in Python
    - Interactive visualization
    - Summary
  - Chapter 3: Getting Started with the Python IDE
    - The IDE tools in Python
    - Visualization plots with Anaconda
    - Interactive visualization packages
    - Summary
  - Chapter 4: Numerical Computing and Interactive Plotting
    - NumPy, SciPy, and MKL functions
    - Scalar selection
    - Slicing
    - Array indexing
    - Other data structures
    - Visualization using matplotlib
    - The visualization example in sports
    - Summary
  - Chapter 5: Financial and Statistical Models
    - The deterministic model
    - The stochastic model
    - The threshold model
    - An overview of statistical and machine learning
    - Creating animated and interactive plots
    - Summary
  - Chapter 6: Statistical and Machine Learning
    - k-nearest neighbors
    - Logistic regression
    - Support vector machines
    - Principal component analysis
    - k-means clustering
    - Summary
  - Chapter 7: Bioinformatics, Genetics, and Network Models
    - Directed graphs and multigraphs
    - The clustering coefficient of graphs
    - Analysis of social networks
    - The planar graph test
    - The directed acyclic graph test
    - Maximum flow and minimum cut
    - A genetic programming example
    - Stochastic block models
    - Summary
  - Chapter 8: Advanced Visualization
    - Computer simulation
    - Summary
  - Appendix: Go Forth and Explore Visualization
- Bibliography
- Index
System requirements
File format: PDF
Copy protection: Adobe DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, PocketBook, Sony, Tolino, and many more (Kindle: limited support only).
The PDF file format always displays a book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.
This eBook uses Adobe DRM, a "hard" copy protection. If the necessary requirements are not met, you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.