
Java Data Analysis

About the author
John R. Hubbard has been doing computer-based data analysis for over 40 years at colleges and universities in Pennsylvania and Virginia. He holds an MSc in computer science from Penn State University and a PhD in mathematics from the University of Michigan. He is currently Professor Emeritus of mathematics and computer science at the University of Richmond, where he has taught data structures, database systems, numerical analysis, and big data. Dr. Hubbard has published many books and research papers, including six other books on computing. Some of these books have been translated into German, French, Chinese, and five other languages. He is also an amateur timpanist.
Content
- Cover
- Copyright
- Credits
- About the Author
- About the Reviewers
- www.PacktPub.com
- Customer Feedback
- Table of Contents
- Preface
- Chapter 1: Introduction to Data Analysis
  - Origins of data analysis
  - The scientific method
  - Actuarial science
  - Calculated by steam
  - A spectacular example
  - Herman Hollerith
  - ENIAC
  - VisiCalc
  - Data, information, and knowledge
  - Why Java?
  - Java Integrated Development Environments
  - Summary
- Chapter 2: Data Preprocessing
  - Data types
  - Variables
  - Data points and datasets
  - Null values
  - Relational database tables
  - Key fields
  - Key-value pairs
  - Hash tables
  - File formats
  - Microsoft Excel data
  - XML and JSON data
  - Generating test datasets
  - Metadata
  - Data cleaning
  - Data scaling
  - Data filtering
  - Sorting
  - Merging
  - Hashing
  - Summary
- Chapter 3: Data Visualization
  - Tables and graphs
  - Scatter plots
  - Line graphs
  - Bar charts
  - Histograms
  - Time series
  - Java implementation
  - Moving average
  - Data ranking
  - Frequency distributions
  - The normal distribution
  - A thought experiment
  - The exponential distribution
  - Java example
  - Summary
- Chapter 4: Statistics
  - Descriptive statistics
  - Random sampling
  - Random variables
  - Probability distributions
  - Cumulative distributions
  - The binomial distribution
  - Multivariate distributions
  - Conditional probability
  - The independence of probabilistic events
  - Contingency tables
  - Bayes' theorem
  - Covariance and correlation
  - The standard normal distribution
  - The central limit theorem
  - Confidence intervals
  - Hypothesis testing
  - Summary
- Chapter 5: Relational Databases
  - The relation data model
  - Relational databases
  - Foreign keys
  - Relational database design
  - Creating a database
  - SQL commands
  - Inserting data into the database
  - Database queries
  - SQL data types
  - JDBC
  - Using a JDBC PreparedStatement
  - Batch processing
  - Database views
  - Subqueries
  - Table indexes
  - Summary
- Chapter 6: Regression Analysis
  - Linear regression
  - Linear regression in Excel
  - Computing the regression coefficients
  - Variation statistics
  - Java implementation of linear regression
  - Anscombe's quartet
  - Polynomial regression
  - Multiple linear regression
  - The Apache Commons implementation
  - Curve fitting
  - Summary
- Chapter 7: Classification Analysis
  - Decision trees
  - What does entropy have to do with it?
  - The ID3 algorithm
  - Java Implementation of the ID3 algorithm
  - The Weka platform
  - The ARFF filetype for data
  - Java implementation with Weka
  - Bayesian classifiers
  - Java implementation with Weka
  - Support vector machine algorithms
  - Logistic regression
  - K-Nearest Neighbors
  - Fuzzy classification algorithms
  - Summary
- Chapter 8: Cluster Analysis
  - Measuring distances
  - The curse of dimensionality
  - Hierarchical clustering
  - Weka implementation
  - K-means clustering
  - K-medoids clustering
  - Affinity propagation clustering
  - Summary
- Chapter 9: Recommender Systems
  - Utility matrices
  - Similarity measures
  - Cosine similarity
  - A simple recommender system
  - Amazon's item-to-item collaborative filtering recommender
  - Implementing user ratings
  - Large sparse matrices
  - Using random access files
  - The Netflix prize
  - Summary
- Chapter 10: NoSQL Databases
  - The Map data structure
  - SQL versus NoSQL
  - The Mongo database system
  - The Library database
  - Java development with MongoDB
  - The MongoDB extension for geospatial databases
  - Indexing in MongoDB
  - Why NoSQL and why MongoDB?
  - Other NoSQL database systems
  - Summary
- Chapter 11: Big Data Analysis with Java
  - Scaling, data striping, and sharding
  - Google's PageRank algorithm
  - Google's MapReduce framework
  - Some examples of MapReduce applications
  - The WordCount example
  - Scalability
  - Matrix multiplication with MapReduce
  - MapReduce in MongoDB
  - Apache Hadoop
  - Hadoop MapReduce
  - Summary
- Appendix: Java Tools
  - The command line
  - Java
  - NetBeans
  - MySQL
  - MySQL Workbench
  - Accessing the MySQL database from NetBeans
  - The Apache Commons Math Library
  - The javax JSON Library
  - The Weka libraries
  - MongoDB
- Index
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The ePUB file format works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small display.
This eBook uses Adobe-DRM, a "hard" form of copy protection. If the necessary requirements are not met, you will not be able to open the eBook, so prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise any reading software with your personal Adobe ID after installing it.
For more information, see our eBook Help page.
File format: PDF
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (Kindle: limited support only).
The PDF file format displays each book page identically on any hardware. This makes PDF suitable for complex layouts such as those used in textbooks and reference books (images, tables, columns, footnotes). On the small screens of e-readers or smartphones, however, PDFs are awkward to read because they require a lot of scrolling.