This state-of-the-art data-mining software kit accompanies the book Predictive Data Mining: A Practical Guide. Information from the book is necessary to access the software. If you do not currently own the book, you may wish to look at the book/software package (ISBN 1-55860-478-2); the book alone (ISBN 1-55860-403-0) is also available. The software, which is delivered through a special web site, is a collection of routines for efficient mining of big data. It includes both classical prediction methods and the more computationally expensive state-of-the-art ones. Using a standard spreadsheet data format, the kit implements all of the data-mining tasks described in the book. The software is available for Windows 95/NT and Unix. For detailed information about the kit, please visit the Data-Miner site (http://www.data-miner.com).
Language
Place of publication
Publishing group
Elsevier Science & Technology
Target audience
For secondary school and higher education
For professional and research use
Dimensions
Height: 240 mm
Width: 104 mm
Weight
ISBN-13
978-1-55860-477-3 (9781558604773)
Copyright in bibliographic data is held by Nielsen Book Services Limited or its licensors: all rights reserved.
Schweitzer classification
By Sholom M. Weiss and Nitin Indurkhya
Contents: Predictive Data Mining: A Practical Guide, by Sholom M. Weiss and Nitin Indurkhya
Preface
1 What is Data Mining?; 1.1 Big Data; 1.1.1 The Data Warehouse; 1.1.2 Timelines; 1.2 Types of Data-Mining Problems; 1.3 The Pedigree of Data Mining; 1.3.1 Databases; 1.3.2 Statistics; 1.3.3 Machine Learning; 1.4 Is Big Better?; 1.4.1 Strong Statistical Evaluation; 1.4.2 More Intensive Search; 1.4.3 More Controlled Experiments; 1.4.4 Is Big Necessary; 1.5 The Tasks of Predictive Data Mining; 1.5.1 Data Preparation; 1.5.2 Data Reduction; 1.5.3 Data Modeling and Prediction; 1.5.4 Case and Solution Analyses; 1.6 Data Mining: Art or Science; 1.7 An Overview of the Book; 1.8 Bibliographic and Historical Remarks
2 Statistical Evaluation for Big Data; 2.1 The Idealized Model; 2.1.1 Classical Statistical Comparison and Evaluation; 2.2 It's Big but Is It Biased; 2.2.1 Objective Versus Survey Data; 2.2.2 Significance and Predictive Value; 2.2.2.1 Too Many Comparisons?; 2.3 Classical Types of Statistical Prediction; 2.3.1 Predicting True-or-False: Classification; 2.3.1.1 Error Rates; 2.3.2 Forecasting Numbers: Regression; 2.3.2.1 Distance Measures; 2.4 Measuring Predictive Performance; 2.4.1 Independent Testing; 2.4.1.1 Random Training and Testing; 2.4.1.2 How Accurate Is the Error Estimate?; 2.4.1.3 Comparing Results for Error Measures; 2.4.1.4 Ideal or Real-World Sampling?; 2.4.1.5 Training and Testing from Different Time Periods; 2.5 Too Much Searching and Testing?; 2.6 Why Are Errors Made?; 2.7 Bibliographic and Historical Remarks
3 Preparing the Data; 3.1 A Standard Form; 3.1.1 Standard Measurements; 3.1.2 Goals; 3.2 Data Transformations; 3.2.1 Normalizations; 3.2.2 Data Smoothing; 3.2.3 Differences and Ratios; 3.3 Missing Data; 3.4 Time-Dependent Data; 3.4.1 Time Series; 3.4.2 Composing Features from Time Series; 3.4.2.1 Current Values; 3.4.2.2 Moving Averages; 3.4.2.3 Trends; 3.4.2.4 Seasonal Adjustments; 3.5 Hybrid Time-Dependent Applications; 3.5.1 Multivariate Time Series; 3.5.2 Classification and Time Series; 3.5.3 Standard Cases and Time-Series Attributes; 3.6 Text Mining; 3.7 Bibliographic and Historical Remarks
4 Data Reduction; 4.1 Selecting the Best Features; 4.2 Feature Selection from Means and Variances; 4.2.1 Independent Features; 4.2.2 Distance-Based Optimal Feature Selection; 4.2.3 Heuristic Feature Selection; 4.3 Principal Constraints; 4.4 Feature Selection by Decision Trees; 4.5 How Many Measured Values; 4.5.1 Reducing and Smoothing Values; 4.5.1.1 Rounding; 4.5.1.2 K-Means Clustering; 4.5.1.3 Class Entropy; 4.6 How Many Cases?; 4.6.1 A Single Sample; 4.6.2 Incremental Samples; 4.6.3 Average Samples; 4.6.4 Specialized Case-Reduction Techniques; 4.6.4.1 Sequential Sampling over Time; 4.6.4.2 Strategic Sampling of Key Events; 4.6.4.3 Adjusting Prevalence; 4.7 Bibliographic and Historical Remarks
5 Looking for Solutions; 5.1 Overview; 5.2 Math Solutions; 5.2.1 Linear Scoring; 5.2.2 Nonlinear Scoring: Neural Nets; 5.2.3 Advanced Statistical Methods; 5.3 Distance Solutions; 5.4 Logic Solutions; 5.4.1 Decision Trees; 5.4.2 Decision Rules; 5.5 What Do the Answers Mean?; 5.5.1 Is It Safe to Edit Solutions?; 5.6 Which Solution is Preferable?; 5.7 Combining Different Answers; 5.7.1 Multiple Prediction Methods; 5.7.2 Multiple Samples; 5.8 Bibliographic and Historical Remarks
6 What's Best for Data Reduction and Mining?; 6.1 Let's Analyze Some Real Data; 6.2 The Experimental Methods; 6.3 The Empirical Results; 6.3.1 Significance Testing; 6.4 So What Did We Learn?; 6.4.1 Feature Selection; 6.4.2 Value Reduction; 6.4.3 Subsampling or All Cases; 6.5 Graphical Trend Analysis; 6.5.1 Incremental Case Analysis; 6.5.2 Incremental Complexity Analysis; 6.6 Maximum Data Reduction; 6.7 Are There Winners and Losers in Performance?; 6.8 Getting the Best Results; 6.9 Bibliographic and Historical Remarks
7 Art or Science? Case Studies in Data Mining; 7.1 Why These Case Studies?; 7.2 A Summary of Tasks for Predictive Data Mining; 7.2.1 A Checklist for Data Preparation; 7.2.2 A Checklist for Data Reduction; 7.2.3 A Checklist for Data Modeling and Prediction; 7.2.4 A Checklist for Case and Solution Analyses; 7.3 The Case Studies; 7.3.1 Transaction Processing; 7.3.2 Text Mining; 7.3.3 Outcomes Analysis; 7.3.4 Process Control; 7.3.5 Marketing and User Profiling; 7.3.6 Exploratory Analysis; 7.4 Looking Ahead; 7.5 Bibliographic and Historical Remarks
Appendix: Data-Miner Software Kit