Mathematics of Deep Learning

Name: Mathematics of Deep Learning | An Introduction to Foundational Mathematics of Neural Nets
Brand: De Gruyter
Price: 64.95 EUR
Availability: OnlineOnly

An Introduction to Foundational Mathematics of Neural Nets

Leonid Berlyand, Pierre-Emmanuel Jabin (Authors)

De Gruyter (Publisher)

2nd Edition

Published on 2 February 2026

VIII, 150 pages

E-Book

PDF with digital watermarking

978-3-11-221821-1 (ISBN)

€64.95 incl. 7% VAT

E-Book Single Licence

Available for download

Description

This course aims to provide a mathematical perspective on some key elements of so-called deep neural networks (DNNs). Much of the interest in deep learning has focused on the implementation of DNN-based algorithms. Our hope is that this compact textbook will offer a complementary point of view that emphasizes the underlying mathematical ideas. We believe that a more foundational perspective will help to answer important questions that have so far received only empirical answers.

Our goal is to introduce basic concepts from deep learning in a rigorous mathematical fashion, for example by giving mathematical definitions of deep neural networks (DNNs), loss functions, the backpropagation algorithm, and so on.

We attempt to identify for each concept the simplest setting that minimizes technicalities but still contains the key mathematics.

The book focuses on deep learning techniques and introduces them almost immediately. Other techniques such as regression and support vector machines (SVMs) are briefly introduced and used as a stepping stone for explaining the basic ideas of deep learning.

Throughout these notes, the rigorous definitions and statements are supplemented by heuristic explanations and figures. The book is organized so that each chapter introduces a key concept. When teaching this course, some chapters could be presented as part of a single lecture, whereas others contain more material and would take several lectures.

Persons

Leonid Berlyand received his Ph.D. in 1985 from Kharkiv University (Ukraine). He joined the Pennsylvania State University (PSU) in 1991, and he is currently a Professor of Mathematics and a member of the Materials Research Institute at PSU. He is a founding co-director of the PSU Centers for Interdisciplinary Mathematics and for Mathematics of Living and Mimetic Matter. He is known for his work at the interface between mathematics and other disciplines such as physics, materials science, life sciences, and, most recently, computer science. He has co-authored three books and more than 100 publications. His interdisciplinary work has received research awards from leading research agencies in the USA, such as the NSF, the US Department of Energy, and the National Institutes of Health, as well as internationally (Bi-National Science Foundation and NATO). Most recently, his work was recognized with the Humboldt Research Award in 2021. His teaching excellence was recognized with the C.I. Noll Award for Excellence in Teaching from the Eberly College of Science at Penn State.

Pierre-Emmanuel Jabin has been a distinguished professor at the Pennsylvania State University since August 2020. He was a student at the École Normale Supérieure from 1995 to 1999; he earned his Ph.D. in 2000 and his HDR in 2003, both at Université Pierre et Marie Curie (Paris VI). Before that, he was a professor at the University of Maryland from 2011 to 2020, where he was also director of the Center for Scientific Computation and Mathematical Modeling from 2016 to 2020. Jabin's work in applied mathematics is internationally recognized, and he has made seminal contributions to the theory and applications of many-particle/multi-agent systems as well as advection and transport phenomena. Jabin was an invited speaker at the International Congress of Mathematicians in Rio de Janeiro in 2018.

Content

Intro
Contents
1 About this book
2 Introduction to machine learning: what and why?
2.1 Some motivation
2.2 What is machine learning?
3 Classification problem
4 The fundamentals of artificial neural networks (ANNs)
4.1 Basic definitions
4.2 ANN classifiers and the softmax function
4.3 The universal approximation theorem
4.4 Why is non-linearity in ANNs necessary?
4.4.1 0+0=8?
4.4.2 Non-linear activation functions are necessary in ANNs
4.5 Why do we need biases?
4.6 Exercises
5 Supervised, unsupervised, and semi-supervised learning
5.1 Basic definitions
5.2 Example of unsupervised learning: detecting bank fraud
5.3 Exercises
6 The regression problem
6.1 What is regression? How does it relate to ANNs?
6.2 Example: linear regression in dimension 1
6.3 Logistic regression as a single neuron ANN
6.3.1 1D example: studying for an exam
6.3.2 2D example of admittance to graduate school: separation of sets and decision boundary
6.3.3 Relation between ANNs and regression
6.3.4 Logistic regression vs. networks with many layers
6.4 Exercises
7 Support vector machine
7.1 Preliminaries: convex sets and their separation, geometric Hahn-Banach theorem
7.2 Support vector machine
7.3 Hard-margin SVM classifiers and support vectors
7.4 Soft margin SVM classifier
7.5 Exercises
8 Kernel methods
8.1 Kernels: what/why?
8.1.1 Recalling the basic principles of linear regression
8.1.2 How a linear kernel arises in linear regression
8.1.3 Linear regression in general dimension
8.1.4 Ridge regression
8.1.5 Solving the ridge regression by the dual problem
8.1.6 How linear kernel arises in ridge regression
8.1.7 Feature space and feature map in classification and regression: examples
8.2 Kernel: definitions and basic properties
8.2.1 Definitions
8.2.2 The "kernel trick": connecting feature maps and kernels
8.2.3 Choosing the right kernel: the example of the Gaussian kernel
8.3 Exercises
9 Gradient descent method in the training of DNNs
9.1 Deterministic gradient descent for the minimization of multivariable functions
9.2 Additive loss functions
9.3 What are SGD algorithms? When to use them?
9.4 Epochs in SGD
9.5 Weights
9.6 Choosing the batch size through a numerical example
9.7 Exercises
10 Backpropagation
10.1 Computational complexity
10.2 Chain rule review
10.3 Diagrammatic representation of the chain rule in simple examples
10.4 The case of a simple DNN with one neuron per layer
10.5 Backpropagation algorithm for general DNNs
10.6 Exercises
11 Convolutional neural networks (CNNs)
11.1 Convolution
11.1.1 Convolution of functions
11.1.2 Convolution of matrices
11.1.3 Hadamard product and feature detection
11.2 Convolutional layers
11.3 Padding layer
11.4 Pooling layer
11.5 Building CNNs
11.6 Equivariance and invariance
11.7 Summary of CNNs
11.8 Exercises
A Review of the chain rule
Bibliography
Index
