
Machine Learning Algorithms and Applications
Description
The book discusses many methods drawn from different fields, including statistics, pattern recognition, neural networks, artificial intelligence, sentiment analysis, control, and data mining, to present a unified treatment of machine learning problems and solutions. All learning algorithms are explained so that readers can easily move from the equations in the book to a computer program.


Persons
Mettu Srinivas holds a PhD from the Indian Institute of Technology Hyderabad and is currently an assistant professor in the Department of Computer Science and Engineering, NIT Warangal, India.
G. Sucharitha holds a PhD from KL University, Vijayawada, and is currently an assistant professor in the Department of Electronics and Communication Engineering at ICFAI Foundation for Higher Education, Hyderabad.
Anjanna Matta holds a PhD from the Indian Institute of Technology Hyderabad and is currently an assistant professor in the Department of Mathematics at ICFAI Foundation for Higher Education, Hyderabad.
Prasenjit Chatterjee, PhD, is an associate professor in the Mechanical Engineering Department at MCKV Institute of Engineering, India.
Content
Acknowledgments
Preface
Part 1: Machine Learning for Industrial Applications
1 A Learning-Based Visualization Application for Air Quality Evaluation During COVID-19 Pandemic in Open Data Centric Services
Priyank Jain and Gagandeep Kaur
1.1 Introduction
1.1.1 Open Government Data Initiative
1.1.2 Air Quality
1.1.3 Impact of Lockdown on Air Quality
1.2 Literature Survey
1.3 Implementation Details
1.3.1 Proposed Methodology
1.3.2 System Specifications
1.3.3 Algorithms
1.3.4 Control Flow
1.4 Results and Discussions
1.5 Conclusion
References
2 Automatic Counting and Classification of Silkworm Eggs Using Deep Learning
Shreedhar Rangappa, Ajay A. and G. S. Rajanna
2.1 Introduction
2.2 Conventional Silkworm Egg Detection Approaches
2.3 Proposed Method
2.3.1 Model Architecture
2.3.2 Foreground-Background Segmentation
2.3.3 Egg Location Predictor
2.3.4 Predicting Egg Class
2.4 Dataset Generation
2.5 Results
2.6 Conclusion
Acknowledgment
References
3 A Wind Speed Prediction System Using Deep Neural Networks
Jaseena K. U. and Binsu C. Kovoor
3.1 Introduction
3.2 Methodology
3.2.1 Deep Neural Networks
3.2.2 The Proposed Method
3.2.2.1 Data Acquisition
3.2.2.2 Data Pre-Processing
3.2.2.3 Model Selection and Training
3.2.2.4 Performance Evaluation
3.2.2.5 Visualization
3.3 Results and Discussions
3.3.1 Selection of Parameters
3.3.2 Comparison of Models
3.4 Conclusion
References
4 Res-SE-Net: Boosting Performance of ResNets by Enhancing Bridge Connections
Varshaneya V., S. Balasubramanian and Darshan Gera
4.1 Introduction
4.2 Related Work
4.3 Preliminaries
4.3.1 ResNet
4.3.2 Squeeze-and-Excitation Block
4.4 Proposed Model
4.4.1 Effect of Bridge Connections in ResNet
4.4.2 Res-SE-Net: Proposed Architecture
4.5 Experiments
4.5.1 Datasets
4.5.2 Experimental Setup
4.6 Results
4.7 Conclusion
References
5 Hitting the Success Notes of Deep Learning
Sakshi Aggarwal, Navjot Singh and K.K. Mishra
5.1 Genesis
5.2 The Big Picture: Artificial Neural Network
5.3 Delineating the Cornerstones
5.3.1 Artificial Neural Network vs. Machine Learning
5.3.2 Machine Learning vs. Deep Learning
5.3.3 Artificial Neural Network vs. Deep Learning
5.4 Deep Learning Architectures
5.4.1 Unsupervised Pre-Trained Networks
5.4.2 Convolutional Neural Networks
5.4.3 Recurrent Neural Networks
5.4.4 Recursive Neural Network
5.5 Why is CNN Preferred for Computer Vision Applications?
5.5.1 Convolutional Layer
5.5.2 Nonlinear Layer
5.5.3 Pooling Layer
5.5.4 Fully Connected Layer
5.6 Unravel Deep Learning in Medical Diagnostic Systems
5.7 Challenges and Future Expectations
5.8 Conclusion
References
6 Two-Stage Credit Scoring Model Based on Evolutionary Feature Selection and Ensemble Neural Networks
Diwakar Tripathi, Damodar Reddy Edla, Annushree Bablani and Venkatanareshbabu Kuppili
6.1 Introduction
6.1.1 Motivation
6.2 Literature Survey
6.3 Proposed Model for Credit Scoring
6.3.1 Stage-1: Feature Selection
6.3.2 Proposed Criteria Function
6.3.3 Stage-2: Ensemble Classifier
6.4 Results and Discussion
6.4.1 Experimental Datasets and Performance Measures
6.4.2 Classification Results With Feature Selection
6.5 Conclusion
References
7 Enhanced Block-Based Feature Agglomeration Clustering for Video Summarization
Sreeja M. U. and Binsu C. Kovoor
7.1 Introduction
7.2 Related Works
7.3 Feature Agglomeration Clustering
7.4 Proposed Methodology
7.4.1 Pre-Processing
7.4.2 Modified Block Clustering Using Feature Agglomeration Technique
7.4.3 Post-Processing and Summary Generation
7.5 Results and Analysis
7.5.1 Experimental Setup and Data Sets Used
7.5.2 Evaluation Metrics
7.5.3 Evaluation
7.6 Conclusion
References
Part 2: Machine Learning for Healthcare Systems
8 Cardiac Arrhythmia Detection and Classification From ECG Signals Using XGBoost Classifier
Saroj Kumar Pandey, Rekh Ram Janghel and Vaibhav Gupta
8.1 Introduction
8.2 Materials and Methods
8.2.1 MIT-BIH Arrhythmia Database
8.2.2 Signal Pre-Processing
8.2.3 Feature Extraction
8.2.4 Classification
8.2.4.1 XGBoost Classifier
8.2.4.2 AdaBoost Classifier
8.3 Results and Discussion
8.4 Conclusion
References
9 GSA-Based Approach for Gene Selection from Microarray Gene Expression Data
Pintu Kumar Ram and Pratyay Kuila
9.1 Introduction
9.2 Related Works
9.3 An Overview of Gravitational Search Algorithm
9.4 Proposed Model
9.4.1 Pre-Processing
9.4.2 Proposed GSA-Based Feature Selection
9.5 Simulation Results
9.5.1 Biological Analysis
9.6 Conclusion
References
Part 3: Machine Learning for Security Systems
10 On Fusion of NIR and VW Information for Cross-Spectral Iris Matching
Ritesh Vyas, Tirupathiraju Kanumuri, Gyanendra Sheoran and Pawan Dubey
10.1 Introduction
10.1.1 Related Works
10.2 Preliminary Details
10.2.1 Fusion
10.3 Experiments and Results
10.3.1 Databases
10.3.2 Experimental Results
10.3.2.1 Same Spectral Matchings
10.3.2.2 Cross Spectral Matchings
10.3.3 Feature-Level Fusion
10.3.4 Score-Level Fusion
10.4 Conclusions
References
11 Fake Social Media Profile Detection
Umita Deepak Joshi, Vanshika, Ajay Pratap Singh, Tushar Rajesh Pahuja, Smita Naval and Gaurav Singal
11.1 Introduction
11.2 Related Work
11.3 Methodology
11.3.1 Dataset
11.3.2 Pre-Processing
11.3.3 Artificial Neural Network
11.3.4 Random Forest
11.3.5 Extreme Gradient Boost
11.3.6 Long Short-Term Memory
11.4 Experimental Results
11.5 Conclusion and Future Work
Acknowledgment
References
12 Extraction of the Features of Fingerprints Using Conventional Methods and Convolutional Neural Networks
E. M. V. Naga Karthik and Madan Gopal
12.1 Introduction
12.2 Related Work
12.3 Methods and Materials
12.3.1 Feature Extraction Using SURF
12.3.2 Feature Extraction Using Conventional Methods
12.3.2.1 Local Orientation Estimation
12.3.2.2 Singular Region Detection
12.3.3 Proposed CNN Architecture
12.3.4 Dataset
12.3.5 Computational Environment
12.4 Results
12.4.1 Feature Extraction and Visualization
12.5 Conclusion
Acknowledgements
References
13 Facial Expression Recognition Using Fusion of Deep Learning and Multiple Features
M. Srinivas, Sanjeev Saurav, Akshay Nayak and Murukessan A. P.
13.1 Introduction
13.2 Related Work
13.3 Proposed Method
13.3.1 Convolutional Neural Network
13.3.1.1 Convolution Layer
13.3.1.2 Pooling Layer
13.3.1.3 ReLU Layer
13.3.1.4 Fully Connected Layer
13.3.2 Histogram of Gradient
13.3.3 Facial Landmark Detection
13.3.4 Support Vector Machine
13.3.5 Model Merging and Learning
13.4 Experimental Results
13.4.1 Datasets
13.5 Conclusion
Acknowledgement
References
Part 4: Machine Learning for Classification and Information Retrieval Systems
14 AnimNet: An Animal Classification Network using Deep Learning
Kanak Manjari, Kriti Singhal, Madhushi Verma and Gaurav Singal
14.1 Introduction
14.1.1 Feature Extraction
14.1.2 Artificial Neural Network
14.1.3 Transfer Learning
14.2 Related Work
14.3 Proposed Methodology
14.3.1 Dataset Preparation
14.3.2 Training the Model
14.4 Results
14.4.1 Using Pre-Trained Networks
14.4.2 Using AnimNet
14.4.3 Test Analysis
14.5 Conclusion
References
15 A Hybrid Approach for Feature Extraction From Reviews to Perform Sentiment Analysis
Alok Kumar and Renu Jain
15.1 Introduction
15.2 Related Work
15.3 The Proposed System
15.3.1 Feedback Collector
15.3.2 Feedback Pre-Processor
15.3.3 Feature Selector
15.3.4 Feature Validator
15.3.4.1 Removal of Terms From Tentative List of Features on the Basis of Syntactic Knowledge
15.3.4.2 Removal of Least Significant Terms on the Basis of Contextual Knowledge
15.3.4.3 Removal of Less Significant Terms on the Basis of Association With Sentiment Words
15.3.4.4 Removal of Terms Having Similar Sense
15.3.4.5 Removal of Terms Having Same Root
15.3.4.6 Identification of Multi-Term Features
15.3.4.7 Identification of Less Frequent Feature
15.3.5 Feature Concluder
15.4 Result Analysis
15.5 Conclusion
References
16 Spark-Enhanced Deep Neural Network Framework for Medical Phrase Embedding
Amol P. Bhopale and Ashish Tiwari
16.1 Introduction
16.2 Related Work
16.3 Proposed Approach
16.3.1 Phrase Extraction
16.3.2 Corpus Annotation
16.3.3 Phrase Embedding
16.4 Experimental Setup
16.4.1 Dataset Preparation
16.4.2 Parameter Setting
16.5 Results
16.5.1 Phrase Extraction
16.5.2 Phrase Embedding
16.6 Conclusion
References
17 Image Anonymization Using Deep Convolutional Generative Adversarial Network
Ashish Undirwade and Sujit Das
17.1 Introduction
17.2 Background Information
17.2.1 Black Box and White Box Attacks
17.2.2 Model Inversion Attack
17.2.3 Differential Privacy
17.2.3.1 Definition
17.2.4 Generative Adversarial Network
17.2.5 Earth-Mover (EM) Distance/Wasserstein Metric
17.2.6 Wasserstein GAN
17.2.7 Improved Wasserstein GAN (WGAN-GP)
17.2.8 KL Divergence and JS Divergence
17.2.9 DCGAN
17.3 Image Anonymization to Prevent Model Inversion Attack
17.3.1 Algorithm
17.3.2 Training
17.3.3 Noise Amplifier
17.3.4 Dataset
17.3.5 Model Architecture
17.3.6 Working
17.3.7 Privacy Gain
17.4 Results and Analysis
17.5 Conclusion
References
Index
1
A Learning-Based Visualization Application for Air Quality Evaluation During COVID-19 Pandemic in Open Data Centric Services
Priyank Jain and Gagandeep Kaur
Dept. of CSE & IT, Jaypee Institute of Information Technology, Noida, Uttar Pradesh, India
Abstract
Air pollution has become a major concern in many developing countries. There are various factors that affect the quality of air. Some of them are Nitrogen Dioxide (NO2), Ozone (O3), Particulate Matter 10 (PM10), Particulate Matter 2.5 (PM2.5), Sulfur Dioxide (SO2), and Carbon Monoxide (CO). The Government of India under the Open Data Initiative provides data related to air pollution. Interpretation of this data requires analysis, visualization, and prediction. This study proposes machine learning and visualization techniques for air pollution. Both supervised and unsupervised learning techniques have been used for prediction and analysis of air quality at major places in India. The data used in this research contains the presence of six major air pollutants in a given area. The work has been extended to study the impact of lockdown on air pollution in Indian cities as well.
Keywords: Open Data, JSON API, OpenAQ, clustering, SVM, LSTM, prediction, Heat Map visualizations
1.1 Introduction
1.1.1 Open Government Data Initiative
Open Government Data (OGD) is gaining momentum as a means of sharing knowledge: governmental bodies make public data and information freely available to private citizens in system-processable formats so that it can be reused for mutual benefit. OGD is a global movement with roots in the Memorandum on Transparency and Open Government, issued by the US President in 2009, which promoted transparency in government projects and collaborations through the sharing of data by public administration and industry with private citizens. The Indian Government has also joined this initiative and provides free access to data for the development of applications, so that the information can be reused for the mutual growth of industry and government. Open Data is the raw data made available by governments, industry, NGOs, scientific institutions, and educational organizations, and as such is not any individual's property.
The growth of Open Data calls for new tools and techniques that can support it. Digital transformation requires companies to seek out new tools and techniques to meet the increasing need for faster delivery of services at large numbers of delivery points. Technologies such as SaaS, mobile, and the Internet of Things are gaining ground by increasing the number of endpoints, thus enabling the success of the Open Data Initiative.
1.1.2 Air Quality
A recently published report, State of Global Air 2017, by the Institute for Health Metrics [1] stated that, in 2015, there were 1,090,400 deaths in India alone due to an increase in PM2.5. High concentrations of PM2.5 in the air are mainly caused by the burning of petroleum fuels, household fuels, and wood fuels, by agricultural fires, and by industry-related pollutants and contaminants. In 2015, India and Bangladesh ranked just behind North African and Middle Eastern countries among places with high concentrations of PM2.5 in the air.
The report compares ambient concentrations with the air quality guidelines established by the WHO in 2005. According to the report by WHO, in 2015, 92% of the world's population and 86% of India's population lived in areas exceeding safe limits. There is therefore an urgent need for tools that give non-expert users better forecasts and an easy understanding of their surrounding environment at the lowest possible cost. The Air Quality Index (AQI) is commonly used by agencies to inform residents about the quality of air in their vicinity.
The irony of today's Internet world is that even though we are inundated with large quantities of data and information, we as humans still struggle to interpret it correctly. Extracting meaningful information from plain textual data in old tabular formats is a laborious task. It is under these circumstances that data visualizations play a vital role.
The objective of this work was to build a machine learning-based visualization app for air quality evaluation and air pollution assessment that examines the various parameters through which air becomes polluted. Existing approaches did not account for variations in parameter values at different locations. That is why we have trained different models for different locations to capture the trends better.
1.1.3 Impact of Lockdown on Air Quality
COVID-19 is a highly infectious disease caused by a newly discovered coronavirus that was first identified in Wuhan, Central China. As of 20 June 2020, it had taken more than 460,000 lives around the world. Due to this pandemic, a nationwide lockdown was imposed in India from 24 March 2020 and was extended for several weeks. It has been observed that lockdown could help reduce pollution levels to a certain extent. This study tries to capture the variations in air pollution levels with and without lockdown.
1.2 Literature Survey
Air pollution occurs when particulates (PM2.5 and PM10), biological molecules, and other harmful substances are introduced into Earth's atmosphere. Both natural processes and human activities can generate air pollution. Air pollution can be further classified into two kinds: visible air pollution and invisible air pollution.
Proactive monitoring and control of our natural and built environments is important in various application scenarios. Semantic Sensor Web technologies have been well researched and used in environmental monitoring applications to expose sensor data for analysis, so that responsive actions can be taken in situations of interest [2]. A sliding window approach that employs a Multilayer Perceptron model to predict short-term PM2.5 pollution situations has been integrated into such a proactive monitoring and control framework [2]. Time series data in practical applications always contain missing values due to sensor malfunction, network failure, outliers, etc. [3]. To handle this, a spatiotemporal prediction framework based on missing value processing algorithms and a deep recurrent neural network (DRNN) has been proposed [3]. A generic methodology for weather forecasting with the help of an incremental K-means clustering algorithm is proposed in [4]. Air pollution data are available to the public on web pages as numeric values of pollutant concentrations in the air [5, 6]. Such numeric information is not conducive to determining the air pollution level intuitively [6]. To address this problem, the study in [6] developed and implemented a program that visualizes the air pollution level for six pollutants by obtaining real-time air pollution data through an API and generating a Keyhole Markup Language (KML) file that displays the data intuitively on Google Earth. The visualization method in [7] is intuitive and reliable thanks to data quality checking and information sharing with multi-perspective air pollution graphs. This method allows the data to be easily understood by the public and can inspire or aid further studies in other fields [7].
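The sliding window idea mentioned above for short-term PM2.5 prediction [2] can be illustrated with a minimal sketch. It only shows how a univariate series might be turned into (input window, target) pairs that a model such as a Multilayer Perceptron could be trained on; the function name `sliding_windows`, the `window` and `horizon` parameters, and the sample readings are assumptions made for illustration and are not taken from the cited work.

```python
from typing import List, Tuple


def sliding_windows(
    series: List[float], window: int, horizon: int = 1
) -> List[Tuple[List[float], float]]:
    """Turn a series into (inputs, target) pairs.

    Each pair holds `window` consecutive readings as inputs and the reading
    `horizon` steps after the end of the window as the target.
    """
    if window < 1 or horizon < 1:
        raise ValueError("window and horizon must be positive")
    samples = []
    last_start = len(series) - window - horizon
    for start in range(last_start + 1):
        inputs = series[start:start + window]
        target = series[start + window + horizon - 1]
        samples.append((inputs, target))
    return samples


# Hypothetical hourly PM2.5 readings, invented for this example.
pm25 = [42.0, 45.5, 51.0, 48.2, 55.1, 60.3]
pairs = sliding_windows(pm25, window=3)
for inputs, target in pairs:
    print(inputs, "->", target)
```

A series shorter than `window + horizon` simply yields no pairs, which keeps the caller from training on incomplete windows.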
Because these tools are built on spatiotemporal visualization and visual analytics for geo-referenced time series of air quality and environmental data, they can be applied to other environmental monitoring data (temperature, precipitation, etc.) with some configuration [8].
According to a survey mentioned in [9], pollution levels in many cities across the country dropped drastically within only a few days of the lockdown being imposed. Also, as discussed in [10], lockdown could be an effective alternative measure for controlling air pollution.
The studies above suggest that these machine learning techniques can be used to predict and subsequently evaluate air pollution. Implementation details are described in the next section.
1.3 Implementation Details
Several paradigms can be used to classify the quality of air. The novelty of this application is that it predicts the future air quality of different places in detail, giving estimated values of the various parameters along with the resulting air quality class and AQI. The application also visualizes, in an efficient and descriptive way, data that is hard to analyze numerically in its raw form.
1.3.1 Proposed Methodology
The steps of the proposed methodology are as follows:
1. Fetch real-time air quality data through an Open Data API.
2. Cluster the air quality data based on AQI and assign air quality classes from good to severe.
3. Train a Support Vector Machine (SVM) model on the previously clustered data.
4. Train a separate time series Long Short-Term Memory (LSTM) model, a type of Recurrent Neural Network (RNN), for each place to predict the future air quality of that place based on the previous trend.
5. Assign an air quality class and AQI to the observed/predicted values of the parameters. AQI is assigned based on the worst 24-hour average of all the parameters.
6. Visualize the past data and future predictions using Heat Maps, Graphs, etc.
7. Compare variations in the different parameters contributing toward air pollution at different places.
8. Provide a user-friendly web app to predict and analyze air quality.
Figure 1.1 Workflow of...
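One possible reading of step 5 above is sketched here: average each pollutant over 24 hours, map each average to a sub-index by linear interpolation between breakpoints, and report the worst (highest) sub-index as the AQI. This is an illustrative sketch, not the authors' implementation. The breakpoint table, the category bands, and the names `BREAKPOINTS`, `CATEGORIES`, `sub_index`, `category`, and `daily_aqi` are all assumptions; real breakpoints should be taken from the relevant official air quality standard.

```python
from statistics import mean
from typing import Dict, List, Tuple

# (conc_low, conc_high, index_low, index_high) per band.
# PLACEHOLDER values for illustration only, not an official standard.
BREAKPOINTS: Dict[str, List[Tuple[float, float, float, float]]] = {
    "pm25": [(0, 30, 0, 50), (30, 60, 50, 100), (60, 90, 100, 200),
             (90, 120, 200, 300), (120, 250, 300, 400), (250, 500, 400, 500)],
    "pm10": [(0, 50, 0, 50), (50, 100, 50, 100), (100, 250, 100, 200),
             (250, 350, 200, 300), (350, 430, 300, 400), (430, 600, 400, 500)],
}

# Upper AQI bound of each class; anything above the last bound is "severe".
# The class names between "good" and "severe" are assumed.
CATEGORIES = [(50, "good"), (100, "satisfactory"), (200, "moderate"),
              (300, "poor"), (400, "very poor")]


def sub_index(pollutant: str, concentration: float) -> float:
    """Linearly interpolate a concentration into its sub-index."""
    if concentration < 0:
        raise ValueError("concentration must be non-negative")
    bands = BREAKPOINTS[pollutant]
    for c_lo, c_hi, i_lo, i_hi in bands:
        if c_lo <= concentration <= c_hi:
            return i_lo + (i_hi - i_lo) * (concentration - c_lo) / (c_hi - c_lo)
    return float(bands[-1][3])  # above the last band: cap at the top index


def category(aqi: float) -> str:
    """Map an AQI value to an air quality class."""
    for upper, label in CATEGORIES:
        if aqi <= upper:
            return label
    return "severe"


def daily_aqi(hourly: Dict[str, List[float]]) -> Tuple[int, str, str]:
    """Return (AQI, class, dominant pollutant) from hourly readings."""
    indices = {p: sub_index(p, mean(v)) for p, v in hourly.items() if v}
    if not indices:
        raise ValueError("no readings supplied")
    worst = max(indices, key=indices.get)  # pollutant with the highest sub-index
    aqi = round(indices[worst])
    return aqi, category(aqi), worst


# Hypothetical readings: 24 identical hourly values per pollutant.
readings = {"pm25": [45.0] * 24, "pm10": [80.0] * 24}
print(daily_aqi(readings))
```

Taking the maximum sub-index means a single badly polluted parameter determines the reported class, which matches the "worst 24-hour average" wording of step 5.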
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePub works well for novels and non-fiction books, i.e., "flowing" text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a "hard" copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our ebook Help page.