Network Science

Name: Network Science | Analysis and Optimization Algorithms for Real-World Applications
Brand: Wiley
Price: 78.99 EUR
Availability: OnlineOnly

Analysis and Optimization Algorithms for Real-World Applications

Carlos Andre Reis Pinheiro (Author)

Wiley (Publisher)

1st Edition

Published on 20 October 2022

352 pages

E-Book

ePUB with Adobe-DRM

978-1-119-89893-1 (ISBN)

€78.99 incl. 7% VAT

E-Book Single Licence

Available for download

Description

Network Science offers comprehensive insight into network analysis and network optimization algorithms, with simple step-by-step guides and examples throughout. A thorough introduction to the field and its history explains the key concepts and the type of data needed for network analysis, ensuring a smooth learning experience for readers. The book also includes a detailed introduction to multiple network optimization algorithms, including linear assignment, network flow, and routing problems.

The text comprises five chapters covering subgraphs, network analysis, and network optimization, and it includes a set of case studies. These cover influence factors in telecommunications, fraud detection among taxpayers, identifying the viral effect in purchasing, and finding optimal routes using public transportation systems, among many others. The book shows how to apply algorithms to solve complex problems in real-life scenarios and presents the math behind these algorithms, enabling readers to learn how to develop them and scrutinize the results.

Written by a highly qualified author with significant experience in the field, Network Science also includes information on:

* Sub-networks, covering connected components, bi-connected components, community detection, k-core decomposition, reach network, projection, node similarity and pattern matching

* Network centrality measures, covering degree, influence, clustering coefficient, closeness, betweenness, eigenvector, PageRank, hub and authority

* Network optimization, covering clique, cycle, linear assignment, minimum-cost network flow, maximum network flow problem, minimum cut, minimum spanning tree, path, shortest path, transitive closure, traveling salesman problem, vehicle routing problem and topological sort

With in-depth and authoritative coverage of the subject and many case studies to convey concepts clearly, Network Science is a helpful training resource for professionals and industry workers in telecommunications, insurance, retail, banking, healthcare, and the public sector, among others. It also serves as supplementary reading for an introductory network science course for undergraduate students.

Content

Preface

Acknowledgments

About the Author

About the Book

1 Concepts in Network Science

1.1 Introduction

1.2 The Connector

1.3 History

1.3.1 A History in Social Studies

1.4 Concepts

1.4.1 Characteristics of Networks

1.4.2 Properties of Networks

1.4.3 Small World

1.4.4 Random Graphs

1.5 Network Analytics

1.5.1 Data Structure for Network Analysis and Network Optimization

1.5.1.1 Multilink and Self-Link

1.5.1.2 Loading and Unloading the Graph

1.5.2 Options for Network Analysis and Network Optimization Procedures

1.5.3 Summary Statistics

1.5.3.1 Analyzing the Summary Statistics for the Les Misérables Network

1.6 Summary

2 Subnetwork Analysis

2.1 Introduction

2.1.1 Isomorphism

2.2 Connected Components

2.2.1 Finding the Connected Components

2.3 Biconnected Components

2.3.1 Finding the Biconnected Components

2.4 Community

2.4.1 Finding Communities

2.5 Core

2.5.1 Finding k-Cores

2.6 Reach Network

2.6.1 Finding the Reach Network

2.7 Network Projection

2.7.1 Finding the Network Projection

2.8 Node Similarity

2.8.1 Computing Node Similarity

2.9 Pattern Matching

2.9.1 Searching for Subgraphs Matches

2.10 Summary

3 Network Centralities

3.1 Introduction

3.2 Network Metrics of Power and Influence

3.3 Degree Centrality

3.3.1 Computing Degree Centrality

3.3.2 Visualizing a Network

3.4 Influence Centrality

3.4.1 Computing the Influence Centrality

3.5 Clustering Coefficient

3.5.1 Computing the Clustering Coefficient Centrality

3.6 Closeness Centrality

3.6.1 Computing the Closeness Centrality

3.7 Betweenness Centrality

3.7.1 Computing the Between Centrality

3.8 Eigenvector Centrality

3.8.1 Computing the Eigenvector Centrality

3.9 PageRank Centrality

3.9.1 Computing the PageRank Centrality

3.10 Hub and Authority

3.10.1 Computing the Hub and Authority Centralities

3.11 Network Centralities Calculation by Group

3.11.1 By Group Network Centralities

3.12 Summary

4 Network Optimization

4.1 Introduction

4.1.1 History

4.1.2 Network Optimization in SAS Viya

4.2 Clique

4.2.1 Finding Cliques

4.3 Cycle

4.3.1 Finding Cycles

4.4 Linear Assignment

4.4.1 Finding the Minimum Weight Matching in a Worker-Task Problem

4.5 Minimum-Cost Network Flow

4.5.1 Finding the Minimum-Cost Network Flow in a Demand-Supply Problem

4.6 Maximum Network Flow Problem

4.6.1 Finding the Maximum Network Flow in a Distribution Problem

4.7 Minimum Cut

4.7.1 Finding the Minimum Cuts

4.8 Minimum Spanning Tree

4.8.1 Finding the Minimum Spanning Tree

4.9 Path

4.9.1 Finding Paths

4.10 Shortest Path

4.10.1 Finding Shortest Paths

4.11 Transitive Closure

4.11.1 Finding the Transitive Closure

4.12 Traveling Salesman Problem

4.12.1 Finding the Optimal Tour

4.13 Vehicle Routing Problem

4.13.1 Finding the Optimal Vehicle Routes for a Delivery Problem

4.14 Topological Sort

4.14.1 Finding the Topological Sort in a Directed Graph

4.15 Summary

5 Real-World Applications in Network Science

5.1 Introduction

5.2 An Optimal Tour Considering a Multimodal Transportation System - The Traveling Salesman Problem Example in Paris

5.3 An Optimal Beer Kegs Distribution - The Vehicle Routing Problem Example in Asheville

5.4 Network Analysis and Supervised Machine Learning Models to Predict COVID-19 Outbreaks

5.5 Urban Mobility in Metropolitan Cities

5.6 Fraud Detection in Auto Insurance Based on Network Analysis

5.7 Customer Influence to Reduce Churn and Increase Product Adoption

5.8 Community Detection to Identify Fraud Events in Telecommunications

5.9 Summary

Index

Preface

I was first exposed to the term network science in 2008, when I moved to Dublin to conduct a post-doctoral research term at Dublin City University in partnership with Eircom. At that time, the terms network analysis and social network analysis were much more common. At first glance, I thought it was related to social media, such as Facebook, Twitter, LinkedIn, and Instagram, along with many others. Many people likely share this misunderstanding. When we mention social network analysis, most people think immediately of social media or the analysis of social interactions.

Network science involves several disciplines, such as social sciences, graph theory, mathematical modeling, statistical analysis, and optimization, to name a few. Assuming that everything is connected, the main goal is to solve complex problems or to understand a scenario from a unique perspective. When I say everything is connected, I mean that most real-world problems can be analyzed from a network perspective, where descriptive attributes are linked to constraints or restrictions, which are linked to possible outcomes or targets, which are linked to goals and solutions, which ultimately are linked to the problems. Even traditional approaches such as predictive modeling can be understood as a network, where input or independent variables are linked to output, target, or dependent variables. How strongly or weakly are they connected to each other? How strongly or weakly are they connected to the target? Surrogate variables can be connected to the original input variables. Network metrics or centralities can turn out to be the most important and relevant predictors in supervised modeling. It can definitely be a more complex approach, but it certainly gives us opportunities for a more comprehensive understanding of the problem and the possible optimal solutions.

For example, in association rules, items are correlated to each other based on some specific transactions. These correlations create rules, which are symmetric. Similar correlations can be visualized as a network, where items are connected to each other with a particular frequency, which defines distinct weights for the links among them. The weighted links between items define the importance of the relationships, similar to how confidence and support define the importance of the association rules. This is a very straightforward network analysis that can be conducted in a similar way to association rules. As in sequence analysis, if we have the time identifier, or the information about the sequence of the transactions, we can also define the direction of the links between the items, and then we can produce an analysis similar to sequence association rules, but again, from a network perspective. Something that we cannot see in association rules or in sequence analysis is the strength of the weak links or the missing links. Imagine that from the association rules, we find two strong rules, Coke associated with Lays, and Pepsi associated with Lays. There is no association between Coke and Pepsi. In network analysis, we would see a "triangle" with a missing link, with a link between Coke and Lays, a link between Pepsi and Lays, but no link between Coke and Pepsi. Perhaps this missing link indicates they are surrogate products. For Coke and Pepsi, it is easy to figure that out. But what about a grocery store with tens of thousands of items?
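The missing-link idea can be sketched in a few lines of Python. This is a minimal illustration and not code from the book: the basket data, the `min_count` threshold, and the function names `cooccurrence_links` and `open_triangles` are all hypothetical. The sketch builds a weighted co-occurrence network from transactions and then lists open triangles, that is, pairs of items that share a strongly linked neighbor but are not linked to each other.

```python
from collections import Counter
from itertools import combinations

# Hypothetical baskets, invented for illustration only.
transactions = [
    {"Coke", "Lays"},
    {"Coke", "Lays"},
    {"Pepsi", "Lays"},
    {"Pepsi", "Lays"},
    {"Coke", "Lays", "Salsa"},
]


def cooccurrence_links(baskets, min_count=2):
    """Return {(item_a, item_b): weight} for pairs bought together at least min_count times."""
    counts = Counter()
    for basket in baskets:
        # Sort so each undirected pair always appears in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: w for pair, w in counts.items() if w >= min_count}


def open_triangles(links):
    """Return (a, shared, b) triples where a-shared and shared-b are linked but a-b is not."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    result = []
    for shared, adjacent in neighbors.items():
        for a, b in combinations(sorted(adjacent), 2):
            if b not in neighbors[a]:
                result.append((a, shared, b))
    return sorted(result)


links = cooccurrence_links(transactions)
missing = open_triangles(links)
print(links)    # weighted links that pass the threshold
print(missing)  # candidate missing links, e.g. possible substitutes
```

With tens of thousands of items, the same scan would flag many candidate pairs at once, which is the point of the network view. Whether a flagged pair really is a pair of substitute products still needs domain judgment.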

In business problems there is always a question to be answered, an operational task to be improved, a challenge to be overcome, an insight to be produced. Are the customers willing to purchase this specific product? How likely are they to purchase it? How do subscribers consume this specific service? Are they willing to increase or decrease their usage? How likely? How can we improve a telecommunications network based on customers' usage? How can we improve our supply chain across all stores based on past purchases? Can we describe a specific economic scenario considering different countries, states, or cities? Can we explain cause and effect of complex events in politics, international trade relations, or immigration? Network science works nicely to describe social relationships. However, social here can mean almost anything. Social can be people, employees, companies, countries, equipment, products, services, governments, or a combination of them. Network science also works well as an exploratory analysis tool, clearly describing complex scenarios, particularly when relations between entities play a key role.

How do we answer business questions? We often answer questions, solve problems, or explain business scenarios by working on the data that describe that problem or scenario. Machine learning models adjust mathematical and statistical equations according to the data available, based on the data distribution, the types of the variables, their variation, and many other data aspects. Different models work better for specific types of data, but all models require data to create, or better, to find the right correlations between the problem and the solution. The nice thing about network science is that we can create different networks from the same set of data, which creates distinct exploratory models and then different outcomes. The way we translate the data into nodes and links, the way we use the data to define the relationships between them, the way we weight the nodes and links, everything changes the input network and therefore the results. For instance, based on call detail records (CDRs), we can define a network where the mobile phone is a node, a household is a node, or a switch is a node. The way these nodes are related also depends on the way we want to envision the network. The links can be calls and messages, physical connections of the telecommunications network, or people moving around and being handled by different cell towers. This flexibility in building different "inputs" and performing multiple exploratory models gives us more possibilities to describe and understand the problem, and for sure, more options to find viable solutions.
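As a rough sketch of how one data set can yield different networks, the Python snippet below derives two graphs from the same call records. Everything in it is an assumption made for illustration: the four-field record layout is not a real CDR schema, and the names `subscriber_network` and `tower_network` are hypothetical. The book itself works with SAS Viya, so this is not the author's code.

```python
from collections import Counter

# Hypothetical call detail records: (caller, callee, caller_tower, callee_tower).
# The field layout is an assumption for illustration, not a real CDR schema.
cdrs = [
    ("alice", "bob", "T1", "T2"),
    ("alice", "bob", "T1", "T2"),
    ("bob", "carol", "T2", "T2"),
    ("carol", "alice", "T3", "T1"),
]


def subscriber_network(records):
    """Directed links between phones, weighted by the number of calls."""
    return Counter((caller, callee) for caller, callee, _, _ in records)


def tower_network(records):
    """Undirected links between cell towers, weighted by calls carried between them."""
    weights = Counter()
    for _, _, tower_a, tower_b in records:
        if tower_a != tower_b:  # skip self-links for calls handled by a single tower
            weights[tuple(sorted((tower_a, tower_b)))] += 1
    return weights


phones = subscriber_network(cdrs)
towers = tower_network(cdrs)
print(dict(phones))  # who calls whom
print(dict(towers))  # which towers exchange traffic
```

The first network describes relationships between people, while the second describes load on the infrastructure. The two answer different questions even though they come from the same records.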

The first real problem I worked on in network science was in Dublin, at Dublin City University in partnership with Eircom. We had a wonderful challenge: to better understand the churn event at Eircom, a major telecommunications company in Ireland. We were asking ourselves whether the churn event could occur as a viral event. When subscribers decide to leave, do they influence other subscribers to leave as well? To answer this question, we needed to understand the subscribers' relationships. In the end, this is what communications providers do, right? They allow people to connect with each other. So we obtained the fundamental data from carriers: calls and text messages. This data describes in detail when and how one subscriber connects to another. Based on this data, we built the network, considering all subscribers and all relationships. In addition, we considered the churn event over time. By doing this, we could monitor what happened when a subscriber decided to leave. What happened with their friends, relatives, co-workers, and so on? Did they leave afterwards? We investigated a substantial amount of data, over a reasonable timeframe, to understand the overall viral effect some subscribers could exert over others. In addition to the traditional transactional and demographic data about the customers, we described them in terms of their network centralities (how they were valued as nodes within a network) and in terms of communities (how they were grouped together based on their relationships). In the end, a classification model was trained to estimate the likelihood of each subscriber behaving as an influencer in terms of the churn event. We also noticed that some subscribers can be influencers in one specific business event and not in another. That means the characteristics of the influencer subscribers differ from one business event to another. The patterns of influence in churn were different from those in purchasing, consuming, or product adoption.
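The question "did they leave afterwards?" can be illustrated with a small descriptive count. The Python sketch below is not the classification model described above, and its data, the `window` parameter, and the function name `followers_after_churn` are hypothetical. For each subscriber who churned, it counts the contacts who churned within a given number of months afterwards.

```python
# Hypothetical data: undirected contacts and the month in which a subscriber churned.
contacts = [("ann", "ben"), ("ann", "cat"), ("ben", "dan"), ("cat", "dan")]
churn_month = {"ann": 1, "ben": 2, "dan": 5}  # "cat" never churned


def followers_after_churn(links, churned, window=3):
    """For each churner, count contacts who churned within `window` months afterwards."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    counts = {}
    for node, month in churned.items():
        counts[node] = sum(
            1
            for other in neighbors.get(node, ())
            # Only later churn events inside the window count as "following".
            if other in churned and 0 < churned[other] - month <= window
        )
    return counts


followers = followers_after_churn(contacts, churn_month)
print(followers)
```

A count like this could serve as one input feature next to centralities and community membership. On its own it does not show that one subscriber caused another to leave.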

After that experience, I had the privilege and luck to work on many projects involving network science, looking at vastly different types of problems in a variety of industries. Even though each project is unique in terms of the particular problem, the best possible solution, or the data available at the time, most of them search for solutions in the same space. The solutions range from traditional business demands, such as avoiding churn and boosting product adoption in communications and retail, to detecting fraud in finance, insurance, communications, taxpayers, and consumer goods. Other projects set out to search for the optimal learning path based not on the content or subject of the courses but on the relationships between courses created by student enrollments, to evaluate and understand the players in economic trading among government agencies, to search for the main actors in illicit trade, to find the best delivery routing in wholesale from depot to stores, to optimize workforce scheduling in restaurants and hotels, to find optimal routes using public transportation systems, to understand the spread of a virus and predict new outbreaks, to foresee population movements and their impact on society, and to evaluate urban mobility and create solutions for specific big events and unexpected situations, among others. The more I work in the field of network science in different industries with distinct customers, the more I believe it is part of an overall solution for complex problems.

After many projects in the field of network...
