I was first exposed to the term network science in 2008, when I moved to Dublin for a post-doctoral research term at Dublin City University in partnership with Eircom. At that time, the more common terms were network analysis or social network analysis. At first glance, I thought it was related to social media, such as Facebook, Twitter, LinkedIn, and Instagram, among many others. Many people likely share this misunderstanding. When we mention social network analysis, most people immediately think of social media, or the analysis of social interactions.
Network science involves several disciplines, such as social sciences, graph theory, mathematical modeling, statistical analysis, and optimization, to name a few. Assuming that everything is connected, the main goal is to solve complex problems or to understand a scenario from a unique perspective. When I say everything is connected, I mean that most real-world problems can be analyzed from a network perspective, where descriptive attributes are linked to constraints or restrictions, which are linked to possible outcomes or targets, which are linked to goals and solutions, which ultimately are linked to the problems. Even traditional approaches such as predictive modeling can be understood as a network, where input or independent variables are linked to output, target, or dependent variables. How strongly or weakly are they connected to each other? How strongly or weakly are they connected to the target? Surrogate variables can be connected to the original input variables. Network metrics or centralities can turn out to be the most important and relevant predictors in supervised modeling. It can definitely be a more complex approach, but it certainly gives us opportunities for a more comprehensive understanding of the problem and the possible optimal solutions.
For example, in association rules, items are correlated to each other based on the transactions in which they appear together. These correlations create rules, and the underlying co-occurrence is symmetric. Similar correlations can be visualized as a network, where items are connected to each other with a particular frequency, which defines distinct weights for the links among them. The weighted links between items define the importance of the relationships, similar to how confidence and support define the importance of the association rules. This is a very straightforward network analysis that can be conducted in much the same way as association rules. As in sequence analysis, if we have a time identifier, or information about the sequence of the transactions, we can also define the direction of the links between the items, and then produce an analysis similar to sequence association rules, but again, from a network perspective. Something that we cannot see in association rules or in sequence analysis is the strength of the weak links or the missing links. Imagine that from the association rules we find two strong rules, Coke associated with Lays, and Pepsi associated with Lays. There is no association between Coke and Pepsi. In network analysis, we would see a "triangle" with a missing link: a link between Coke and Lays, a link between Pepsi and Lays, but no link between Coke and Pepsi. Perhaps this missing link indicates that they are surrogate (substitute) products. With Coke and Pepsi, that is easy to figure out. But what about a grocery store with tens of thousands of items?
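The Coke, Pepsi, and Lays case can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the transactions are made-up toy data, the helper names `cooccurrence_network` and `open_triangles` are hypothetical, and the `min_count` threshold is an arbitrary stand-in for a minimum support level.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy baskets (illustrative data, not from any real store).
transactions = [
    {"Coke", "Lays"},
    {"Coke", "Lays"},
    {"Coke", "Lays"},
    {"Pepsi", "Lays"},
    {"Pepsi", "Lays"},
    {"Pepsi", "Lays"},
]

def cooccurrence_network(baskets, min_count=1):
    """Return {(item_a, item_b): weight}, where weight is the number of
    baskets containing both items and item_a sorts before item_b."""
    counts = Counter()
    for basket in baskets:
        for a, b in combinations(sorted(basket), 2):
            counts[(a, b)] += 1
    return {pair: w for pair, w in counts.items() if w >= min_count}

def open_triangles(links):
    """Return (a, hub, b) triples where a-hub and hub-b are linked
    but the a-b link is missing."""
    neighbors = {}
    for a, b in links:
        neighbors.setdefault(a, set()).add(b)
        neighbors.setdefault(b, set()).add(a)
    found = []
    for hub, adjacent in neighbors.items():
        for a, b in combinations(sorted(adjacent), 2):
            if b not in neighbors[a]:
                found.append((a, hub, b))
    return sorted(found)

links = cooccurrence_network(transactions, min_count=2)
missing = open_triangles(links)
print(links)    # weighted links between items
print(missing)  # candidate pairs of surrogate products around a shared hub
```

On a real catalog, the open triangles are only candidates: a missing link may point to surrogate products, but it may also simply reflect two items that are rarely stocked or promoted together.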
In business problems there is always a question to be answered, an operational task to be improved, a challenge to be overcome, or an insight to be produced. Are the customers willing to purchase this specific product? How likely are they to purchase it? How do subscribers consume this specific service? Are they willing to increase or decrease their usage? How likely? How can we improve a telecommunications network based on the customers' usage? How can we improve our supply chain across all stores based on past purchases? Can we describe a specific economic scenario considering different countries, states, or cities? Can we explain the cause and effect of complex events in politics, international trade relations, or immigration? Network science works nicely to describe social relationships. However, social here can mean almost anything: people, employees, companies, countries, equipment, products, services, governments, or a combination of them. Network science also works well as an exploratory analysis tool, clearly describing complex scenarios, particularly when relations between entities play a key role.
How do we answer business questions? We often answer questions, solve problems, or explain business scenarios by working on the data that describe that problem or scenario. Machine learning models fit mathematical and statistical equations to the available data, based on the data distribution, the types of the variables, their variation, and many other aspects of the data. Different models work better for specific types of data, but all models require data to create, or better, to find the right correlations between the problem and the solution. The nice thing about network science is that we can create different networks from the same set of data, which yields distinct exploratory models and therefore different outcomes. The way we translate the data into nodes and links, the way we use the data to define the relationships between them, and the way we weight the nodes and links all change the input network and therefore the results. For instance, based on call detail records (CDRs), we can define a network where the mobile phone is a node, a household is a node, or a switch is a node. The way these nodes are related also depends on how we want to envision the network. The links can be calls and messages, physical connections of the telecommunications network, or people moving around and being handled by different cell towers. This flexibility in building different "inputs" and performing multiple exploratory models gives us more possibilities to describe and understand the problem and, in turn, more options to find viable solutions.
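To make the idea of building different networks from the same data concrete, here is a minimal sketch. The record layout `(timestamp, caller, callee, cell_tower)` and the function names are assumptions made for illustration; real CDRs carry many more fields and vary by carrier.

```python
from collections import Counter, defaultdict

# Hypothetical, heavily simplified call detail records:
# (timestamp, caller, callee, cell_tower handling the caller)
cdrs = [
    (1, "A", "B", "T1"),
    (2, "A", "C", "T1"),
    (3, "B", "A", "T2"),
    (4, "A", "B", "T3"),
    (5, "C", "A", "T2"),
]

def phone_network(records):
    """Phones as nodes: undirected links weighted by the number of calls."""
    links = Counter()
    for _, caller, callee, _ in records:
        links[tuple(sorted((caller, callee)))] += 1
    return dict(links)

def tower_network(records):
    """Cell towers as nodes: a directed link src -> dst is counted when the
    same caller's consecutive calls are handled by two different towers."""
    towers_by_caller = defaultdict(list)
    for _, caller, _, tower in sorted(records):  # sorted by timestamp
        towers_by_caller[caller].append(tower)
    links = Counter()
    for towers in towers_by_caller.values():
        for src, dst in zip(towers, towers[1:]):
            if src != dst:
                links[(src, dst)] += 1
    return dict(links)

print(phone_network(cdrs))  # who talks to whom, and how often
print(tower_network(cdrs))  # how callers move between towers
```

Both networks come from the same five records, yet one describes relationships between subscribers and the other describes movement across the infrastructure, so any metric computed on them answers a different question.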
The first real problem I experienced in terms of network science was working in Dublin at Dublin City University in partnership with Eircom. We had a wonderful challenge: to better understand the churn event at Eircom, a major telecommunications company in Ireland. We were asking ourselves whether the churn event could occur as a viral event. When subscribers decide to leave, would they influence other subscribers to leave as well? In order to answer this question, we needed to understand the subscribers' relationships. In the end, this is what communications providers do, right? They allow people to get connected to each other. So we obtained the fundamental data from carriers: calls and text messages. This data describes in detail when and how one subscriber gets connected to another. Based on this data, we built the network, considering all subscribers and all relationships. In addition, we considered the churn event over time. By doing this, we could monitor what happened when a subscriber decided to leave. What happened with their friends, relatives, co-workers, and so on? Did they leave afterwards? We investigated a substantial amount of data, over a reasonable timeframe, to understand the overall viral effect some subscribers could exert over the others. In addition to the traditional transactional and demographic data about the customers, we described them in terms of their network centralities (what gave them value as nodes within a network) and in terms of communities (how they were grouped together based on their relationships). In the end, a classification model was trained to estimate the likelihood of each subscriber behaving as an influencer in terms of the churn event. We also noticed that some subscribers can be influencers in one specific business event and not in another. That is, the characteristics of the influencer subscribers differ from one business event to another. The patterns of influence in churn were different from those in purchasing, consuming, or product adoption.
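The question "did they leave afterwards?" can be turned into a simple descriptive measure. The sketch below is not the classification model described above; it only assumes a hypothetical list of subscriber links and a hypothetical mapping from subscriber to churn day, and it computes, for each churner, the share of their neighbors who churned within a chosen window afterwards. The names `follow_rate` and `window` are illustrative.

```python
from collections import defaultdict

# Hypothetical inputs: undirected links between subscribers, and the day
# on which a subscriber churned (subscribers not listed never churned).
links = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("E", "F")]
churn_day = {"A": 10, "B": 25, "C": 70, "F": 5}

def build_adjacency(pairs):
    """Map each subscriber to the set of subscribers they are linked to."""
    adjacency = defaultdict(set)
    for a, b in pairs:
        adjacency[a].add(b)
        adjacency[b].add(a)
    return adjacency

def follow_rate(pairs, churned, window=30):
    """For each churner, return the fraction of neighbors who churned
    strictly after them and no later than `window` days afterwards."""
    adjacency = build_adjacency(pairs)
    rates = {}
    for node, day in churned.items():
        neighbors = adjacency.get(node, set())
        if not neighbors:
            continue  # isolated churners have no one to influence
        followers = [
            n for n in neighbors
            if n in churned and day < churned[n] <= day + window
        ]
        rates[node] = len(followers) / len(neighbors)
    return rates

rates = follow_rate(links, churn_day, window=30)
print(rates)
```

A high rate is only a hint of influence: neighbors may leave for a shared external reason, which is one motivation for combining such measures with centralities, communities, and customer attributes in a trained model.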
After that experience, I had the privilege and luck to work on many projects involving network science, looking at vastly different types of problems in a variety of industries. Even though each project is unique in terms of the particular problem, the best possible solution, or the data available at the time, most of them search for solutions in the same space. The solutions range from traditional business demands, such as avoiding churn and boosting product adoption in communications and retail, to detecting fraud in finance, insurance, communications, taxation, and consumer goods. They also include searching for the optimal learning path based not on the content or subject of the courses but on the relationships among courses created by student enrollments; evaluating and understanding the players in economic trading among government agencies; identifying the main actors in illicit trade; finding the best delivery routes in wholesale, from depot to stores; optimizing workforce scheduling in restaurants and hotels; finding optimal routes using public transportation systems; understanding the spread of a virus and predicting new outbreaks; foreseeing population movements and their impact on society; and evaluating urban mobility to create solutions for specific big events and unexpected situations, among others. The more I work in the field of network science in different industries with distinct customers, the more I believe it is part of an overall solution for complex problems.
After many projects in the field of network...