This book, presented in three volumes, examines environmental disciplines in relation to major players in contemporary science: Big Data, artificial intelligence and cloud computing.
Today, there is a real sense of urgency regarding the evolution of computer technology, the ever-increasing volume of data, threats to our climate and the sustainable development of our planet. As such, we need to reduce technology just as much as we need to bridge the global socio-economic gap between the North and South; between universal free access to data (open data) and free software (open source). In this book, we pay particular attention to certain environmental subjects, in order to enrich our understanding of cloud computing. These subjects are: erosion; urban air pollution and atmospheric pollution in Southeast Asia; melting permafrost (causing the accelerated release of soil organic carbon in the atmosphere); alert systems of environmental hazards (such as forest fires, prospective modeling of socio-spatial practices and land use); and web fountains of geographical data.
Finally, this book asks the question: in order to find a pattern in the data, how do we move from a traditional computing model-based world to pure mathematical research? After thorough examination of this topic, we conclude that this goal is both transdisciplinary and achievable.
Dominique Laffly is a Professor at the University of Toulouse, France. As a geographer, he is interested in the landscape and in the links between societies and their environment. Concerned with the issue of Big Data, he promotes multidisciplinary programs to bring IT closer to applied environmental disciplines.
1. Introduction to Environmental Management and Services, Thi Kim Oanh Nguyen, Quoc Tuan Le, Tongchai Kanabkaew, Sukhuma Chitaporpan and Truong Ngoc Han Le.
Part. Environmental Case Studies
2. Air Quality Monitoring with Focus on Wireless Sensor Application and Data Management, Tan Loi Huynh, Sathita Fakrapai and Thi Kim Oanh Nguyen.
3. Emission Inventories for Air Pollutants and Greenhouse Gases with Emphasis on Data Management in the Cloud, Thi Kim Oanh Nguyen, Nguyen Huy Lai, Didin Agustian Permadi, Nhat Ha Chi Nguyen, Kok Sothea, Sukhuma Chitporpan, Thongchai Kanabkaew, Jantira Rattanarat and Surusak Sichum.
4. Atmospheric Modeling with Focus on Management of Input/Output Data and Potential of Cloud Computing Applications, Thi Kim Oanh Nguyen, Nhat Ha Chi Nguyen, Nguyen Huy Lai and Didin Agustian Permadi.
5. Particulate Matter Concentration Mapping from Satellite Imagery, Thi Nhat Thanh Nguyen, Viet Hung Luu, Van Ha Pham, Quang Hung Bui and Thi Kim Oanh Nguyen.
6. Comparison and Assessment of Culturable Airborne Microorganism Levels and Related Environmental Factors in Ho Chi Minh City, Vietnam, Tri Quang Hung Nguyen, Minh Ky Nguyen and Ngoc Thu Huong Huynh.
7. Application of GIS and RS in Planning Environmental Protection Zones in Phu Loc District, Thua Thien Hue Province, Quoc Tuan Le, Trinh Minh Anh Nguyen, Huy Anh Nguyen and Truong Ngoc Han Le.
8. Forecasting the Water Quality and the Capacity of the Dong Nai River to Receive Waste water up to 2020, Quoc Tuan Le, Thi Kieu Diem Ngo and Truong Ngoc Han Le.
9. Water Resource Management, Imeshi Weerasinghe.
10. Assessing Impacts of Land Use Change and Climate Change on Water Resources in the La Vi Catchment, Binh Dinh Province, Kim Loi Nguyen, Le Tan Dat Nguyen, Hoang Tu Le, Duy Liem Nguyen, Ngoc Quynh Tram Vo, Van Phan Le, Duy Nang Nguyen, Thi Thanh Thuy Nguyen, Gia Diep Pham, Dang Nguyen Dong Phuong, Thi Hong Nguyen, Thong Nhat Tran, Margaret Shanafield and Okke Batelaan.
Preface: Why TORUS? Toward an Open Resource Using Services, or How to Bring Environmental Science Closer to Cloud Computing
Geography, Ecology, Urbanism, Geology and Climatology - in short, all environmental disciplines - are inspired by the great paradigms of Science: they were first descriptive before evolving toward systemic and complexity approaches. The methods followed the same evolution, from the inductive reasoning of the initial observations to the deductive reasoning of prediction models based on learning. For example, the Bayesian approach is the preferred one in this book (see Volume 1, Chapter 5), but random trees, neural networks, classifications and data reductions could all be developed. In the end, all the methods of artificial intelligence (AI) are ubiquitous today in the era of Big Data. We are not unaware, however, that the term artificial intelligence, coined at Dartmouth in 1956 by John McCarthy, Marvin Minsky, Nathaniel Rochester and Claude Shannon, is, after a long period of neglect, at the heart of the future issues of the exploitation of massive data (just like the functional and logical languages that accompanied the theory: LISP, 1958, PROLOG, 1972 and SCALA, today - see Chapter 8).
All the environmental disciplines are confronted with this reality of massive data, with the rule of the 3+2Vs: Volume, Velocity (in French, "Vitesse"), Variety, Veracity, Value. Every five days - or even less - and for the optical remote sensing data of the Sentinel-2A and 2B satellites alone, we have complete coverage of the Earth at a spatial resolution of 10 m for a dozen wavelengths. How do we integrate all this? How do we rethink the environmental disciplines when we must now consider, at the pixel scale (10 m), an overall analysis of 510 million km2, or more than 5 trillion pixels, of which 1.53 trillion are for land only? And, more importantly, how do we validate automatic processes and the accuracy of results?
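The order of magnitude of these pixel counts can be checked with simple arithmetic: at a 10 m resolution, one square kilometer holds 100 × 100 = 10,000 pixels. The short Python sketch below is only an illustration; the function name `pixel_count` is hypothetical, and the land area of 153 million km2 is an assumption (roughly 30% of the 510 million km2 quoted above), not a figure given in the text.

```python
# Back-of-the-envelope pixel count for one global coverage at 10 m resolution.

EARTH_SURFACE_KM2 = 510_000_000  # total surface quoted in the text
LAND_SURFACE_KM2 = 153_000_000   # assumption: about 30% of the total surface
PIXEL_SIZE_M = 10                # spatial resolution considered in the text


def pixel_count(area_km2: int, pixel_size_m: int) -> int:
    """Number of square pixels of side pixel_size_m needed to tile area_km2."""
    pixels_per_km = 1000 // pixel_size_m  # 100 pixels along one kilometer
    return area_km2 * pixels_per_km ** 2  # 10,000 pixels per square kilometer


total_pixels = pixel_count(EARTH_SURFACE_KM2, PIXEL_SIZE_M)
land_pixels = pixel_count(LAND_SURFACE_KM2, PIXEL_SIZE_M)

print(f"whole Earth: {total_pixels:.2e} pixels")
print(f"land only:   {land_pixels:.2e} pixels")
```

Under these assumptions, a single coverage represents about 5.1 × 10^12 pixels for the whole Earth and about 1.53 × 10^12 for land, per wavelength and per revisit.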
Figure P.1. At the beginning of AI, Dartmouth Summer Research Project, 1956. Source: http://www.oezratty.net/wordpress/2017/semantique-intelligence-artificielle/
Once social network data, Internet of Things (IoT) data and archive data are added, for many topics such as Smart Cities, it is not surprising that environmental disciplines are interested in cloud computing.
Before understanding the technique, consider the symbol (why this shape, why a cloud?): it would seem that, to represent a connection node of a network, we have for the last 50 years drawn a freehand potatoid which, once drawn, took the form of a cloud. Figure P.2 gives a perfect illustration on the left, while on the right we see that the cloud is now the norm (screenshot of the results returned by a search engine for the keywords: Internet and network).
What is cloud computing? Let us remember that, even before the term was coined, cloud computing was based on networks (see Chapter 4) and the Internet, and this has been so "since the 50s when users accessed, from their terminals, applications running on central systems" (Wikipedia). The cloud, as we understand it today, has evolved considerably since the 2000s; it consists of the pooling of remote computing resources to store data and to use services - that is, software - dynamically via browser interfaces.
Figure P.2. From freehand potatoid to the cloud icon. The first figure is a schematic illustration of a distributed SFPS switch. For a color version of this figure, see www.iste.co.uk/laffly/torus3.zip
This answers the needs of the environmental sciences, overwhelmed by massive data flows: everything is stored in the cloud, everything is processed in the cloud, and even the expected results are retrieved from it by end users according to their needs. It is no wonder that, one after the other, Google and NASA offered in December 2016 - the midpoint of TORUS! - cloud-based solutions for the management and processing of satellite data: Google Earth Engine and NASA Earth Exchange.
But how do you do it? Why is it preferable - or not - to HPC (High Performance Computing) and grids? How do we evaluate "Cloud & High Scalability Computing" versus "Grid & High-Performance Computing"? What are the costs? How do you transfer the applications commonly used by environmental science to the cloud? What is the added value for environmental sciences? In short, how does it work?
All these questions and more are at the heart of the TORUS program, developed so that we learn from each other, understand each other and communicate in a common, mastered language: between geoscience, computer science and information science; among the geosciences themselves; and between computer science and information science. TORUS is not a research program. It is an action that aims to bring together scientific communities that are (too often) remote from one another, in order to bridge the gap that now separates contemporary computing from most environmental disciplines. One evolves at speeds that cannot be followed by the others, one is greedy for data that the others provide, one can offer technical solutions to the scientific questions being developed by the others, and so on.
TORUS is also the result of multiple scientific collaborations initiated in 2008-2010: between the geographer and the computer scientist, between France and Vietnam with an increasing diversity of specialties involved (e.g. remote sensing and image processing, mathematics and statistics, optimization and modeling, erosion and geochemistry, temporal dynamics and social surveys) all within various scientific and university structures (universities, engineering schools, research institutes - IRD, SFRI and IAE Vietnam, central administrations: the Midi-Pyrénées region and Son La district, France-Vietnam partnership) and between research and higher education through national and international PhDs.
Naturally, I would say, the Erasmus+ capacity building program of the European Union appeared to be a solution well suited to our project:
"The objectives of the Capacity Building projects are: to support the modernization, accessibility and internationalization of higher education in partner countries; improve the quality, relevance and governance of higher education in partner countries; strengthen the capacity of higher education institutions in partner countries and in the EU, in terms of international cooperation and the process of permanent modernization in particular; and to help them open up to society at large and to the world of work in order to reinforce the interdisciplinary and transdisciplinary nature of higher education, to improve the employability of university graduates, to give the European higher education more visibility and attractiveness in the world, foster the reciprocal development of human resources, promote a better understanding between the peoples and cultures of the EU and partner countries."1
In 2015, TORUS - funded to the tune of 1 million euros for three years - was one of the projects selected from a pool of more than 575 applications, of which only 120 were retained. The partnership brings together (Figure P.3) the University of Toulouse 2 Jean Jaurès (coordinator - FR), the International School of Information Processing Sciences (EISTI - FR), the University of Ferrara in Italy, the Vrije Universiteit Brussel, the Vietnam National University in Hanoi, Nong Lam University in Ho Chi Minh City and two Thai institutions: the Asian Institute of Technology (AIT) in Pathum Thani and Walailak University in Nakhon Si Thammarat.
Figure P.3. The heart of TORUS, partnership between Asia and Europe. For a color version of this figure, see www.iste.co.uk/laffly/torus3.zip
With an equal share between Europe and Asia, 30 researchers, teacher-researchers and engineers are involved in learning from each other during these three years, which will be punctuated by eight workshops held across France, Vietnam, Italy, Thailand and Belgium. Finally, after the installation of the two servers in Asia (Asian Institute of Technology - Thailand; and Vietnam National University Hanoi - Vietnam), more than 400 cores will beat in unison with TORUS to bring cloud computing closer to environmental sciences. More than 400 computer hearts beat in unison for TORUS, as do those of Nathalie, Astrid, Eleonora, Ann, Imeshi, Thanh, Sukhuma, Janitra, Kim, Daniel, Yannick, Florent, Peio, Alex, Lucca, Stefano, Hichem, Hung(s), Thuy, Huy, Le Quoc, Kim Loi, Agustian, Hong, Sothea, Tongchai, Stephane, Simone, Marco, Mario, Trinh, Thiet, Massimiliano, Nikolaos, Minh Tu, Vincent and Dominique.
To all of you, a big thank you.
Structure of the book
This book is divided into three volumes.
Volume 1 raises the problem of voluminous data in geosciences before presenting the main methods of analysis and the computer solutions mobilized to address it.
Volume 2 presents remote sensing, geographic information systems (GIS) and spatial data infrastructures (SDI) that are central to all disciplines that deal with geographic space.
Volume 3 is a collection of thematic application cases that are representative of the specificities of the teams involved in TORUS and that motivated their needs in terms of cloud computing.