Chapter 2
Dhall Syntax and Semantics in Depth
Beneath Dhall's approachable surface lies a deeply consistent, mathematically precise language, engineered for expressive power and predictable behavior. This chapter explores the architecture of Dhall's syntax and semantics and the rigor behind every expression and transformation. From atomic data types to parameterized abstractions, rules for composition, and the guarantees of static typing, you will see how each construct interlocks to form a robust configuration calculus, one that lets you move beyond mere files to engineered, verifiable systems.
2.1 Primitive Data Types and Literals
Dhall defines a collection of primitive data types that form the cornerstone of expression and configuration authoring within its strongly typed functional system. These primitives (Natural, Integer, Double, Bool, and Text) are chosen to balance expressiveness, simplicity, and safety. Their literal syntax and explicit conversion rules underpin Dhall's guarantees of totality and referential transparency.
Natural and Integer
The Natural type corresponds to the set of non-negative integers (ℕ), including zero. A Natural literal is an unadorned decimal sequence, for example:
    0
    42
    100000
No sign or decimal point is permitted; a numeral with a leading sign is an Integer literal. The Integer type represents the full set of integers (ℤ), supporting both positive and negative values. Integer literals require an explicit leading sign, either + or -:
    +0
    -5
    +12345
Importantly, no literal is polymorphic between the two types: a bare 0 is always a Natural and +0 is always an Integer, so the type of a numeric literal is determined by its syntax alone and never needs an annotation to disambiguate it.
Despite their apparent similarity, Natural and Integer are distinct types with separate built-in functions. Natural is favored when modeling quantities that inherently cannot be negative, such as sizes, counts, or indices, so that the invariant is enforced by the type checker. A function expecting a Natural rejects every Integer argument at type-checking time, even a non-negative one such as +5; the value must first be converted explicitly (for example with Integer/clamp). Integer, by contrast, is appropriate for values that naturally span negative ranges, such as offsets, balances, or temperature measurements.
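As a minimal sketch of how these literals type-check in current Dhall (the binding names are illustrative, not taken from the text), each value below is annotated with the only type its literal syntax admits. Note that non-negative Integer literals carry an explicit + sign.

```dhall
-- Natural literals are unsigned; Integer literals carry an explicit sign.
let count : Natural = 42

let offset : Integer = -5

let positiveOffset : Integer = +7

-- The + and * operators are defined on Natural operands only.
let total : Natural = count * 2 + 1

in  { count = count
    , offset = offset
    , positiveOffset = positiveOffset
    , total = total
    }
```

Changing the annotation on count to Integer, or writing 7 instead of +7 for positiveOffset, should be rejected by the type checker, which is the behavior the paragraph above relies on.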
Double
The Double type models double-precision IEEE 754 floating-point values. A Double literal uses decimal notation with a fractional part, an exponent, or both, e.g.:
    0.0
    3.1415
    6.022e23
    -1.0e-10
A fractional part or an exponent is required to distinguish Double literals from integer ones; a numeral with neither is a Natural (if unsigned) or an Integer (if signed). Unlike many general-purpose languages, Dhall performs no implicit numeric widening: every numeric literal has exactly one type, which prevents accidental loss of precision or semantic ambiguity.
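The following sketch shows Double literals alongside the built-ins Integer/toDouble and Double/show; the binding names are illustrative. Dhall provides no arithmetic operators on Double, so the values are only stored, converted, and rendered here.

```dhall
let pi : Double = 3.1415

let avogadro : Double = 6.022e23

let tiny : Double = -1.0e-10

-- An Integer becomes a Double only through an explicit built-in call.
let two : Double = Integer/toDouble +2

in  { pi = pi
    , avogadro = avogadro
    , tiny = tiny
    , two = two
      -- Double/show renders a Double as Text.
    , shown = Double/show pi
    }
```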
Bool
The primitive Bool type defines the boolean domain {True, False} for logical expressions and branch control. Its literal forms are the keywords True and False (capitalized exactly as shown):
    True
    False
No aliases or other capitalizations are accepted. Bool values appear as the condition of if/then/else expressions and as operands of the logical operators, giving conditional configuration a clear and fully typed semantics.
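A small sketch of Bool values in use, with hypothetical field names chosen for illustration. It combines the logical operators with an if/then/else expression, whose condition must be a Bool and whose branches must share one type.

```dhall
let enabled : Bool = True

let verbose : Bool = False

-- Logical operators on Bool: && (and), || (or), == and !=.
let shouldLog : Bool = enabled && (verbose || True)

-- Both branches must have the same type, here Text.
let level : Text = if shouldLog then "debug" else "info"

in  { shouldLog = shouldLog, level = level }
```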
Text
The Text primitive denotes a sequence of Unicode characters; Dhall source files are encoded in UTF-8. Single-line literals are double-quoted and support escape sequences, while multi-line literals are delimited by pairs of single quotes:
    "hello, world"
    "line 1\nline 2"
    ''
    This is a multi-line text literal
    spanning several lines
    ''
The multi-line syntax preserves newlines as written and strips the leading indentation shared by all lines. Text values are immutable, and combining them with the ++ operator is a pure operation with no side effects, in keeping with Dhall's configuration semantics. Both literal forms support interpolation of Text expressions with ${...}, and double-quoted literals accept escapes such as \n, \t, \", \\, and \uXXXX for Unicode code points. Because interpolation and escapes are resolved during evaluation, the resulting text is predictable and platform-independent.
Dhall's primitive data types serve as the foundational building blocks for constructing expressions and defining configurations with strong static guarantees. These types (Natural, Integer, Double, Bool, and Text) provide precise semantic domains that prevent common errors and enhance expressiveness through type invariants and strict literal encoding.
Natural and Integer
The Natural type encapsulates the set of all non-negative integers ℕ = {0, 1, 2, 3, …}. Literals for Natural consist exclusively of an unsigned decimal numeral sequence:
    0
    7
    132
These literals admit neither a leading sign nor a decimal point. Because -5 is an Integer literal, using it where a Natural is expected triggers a static type error. Non-negativity guarantees enable safer reasoning about quantities that inherently cannot be negative, such as counts, sizes, or indices.
Conversely, Integer represents the signed integers ℤ, and its literals carry a mandatory leading sign:
    +0
    -42
    +100
    +7
The + sign is required on non-negative Integer literals, because an unsigned numeral is always a Natural. Integer literals support negative numbers, extending expressiveness to domains where signed values are essential (e.g., offsets, differences). Since Integer and Natural are distinct types, implicit coercions are disallowed to maintain type safety; conversion requires explicit built-in functions, e.g., Natural/toInteger or Integer/clamp.
The literal 0 is not polymorphic: 0 is a Natural and +0 is an Integer. Dhall assigns every numeric literal exactly one type from its syntax alone, which keeps literals ergonomic without sacrificing static guarantees.
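The explicit conversions between the two integer types can be sketched with Dhall's built-in functions; the binding names below are illustrative. Natural/toInteger always succeeds, while Integer/clamp maps negative inputs to 0.

```dhall
let n : Natural = 7

-- Natural -> Integer is total: every Natural has an Integer counterpart.
let asInteger : Integer = Natural/toInteger n

-- Integer -> Natural via Integer/clamp; negative inputs become 0.
let clamped : Natural = Integer/clamp -5

-- Integer/negate flips the sign of an Integer.
let negated : Integer = Integer/negate asInteger

in  { asInteger = asInteger, clamped = clamped, negated = negated }
```

This should normalize to a record with asInteger = +7, clamped = 0, and negated = -7.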
Double
Floating-point values are modeled through the Double type, mirroring the IEEE 754 double-precision standard. Literals must explicitly include a decimal...