Chapter 2
Object Model and Representation System
MoarVM is built around an expressive object system that makes features such as runtime metaprogramming, role composition, and type manipulation practical to implement efficiently in a virtual machine. This chapter examines the 6model architecture and its extensible representation protocol, the machinery that underpins these capabilities in Raku. It then works down through the layers to show how objects, roles, and native integration combine into a dynamic, high-performance runtime environment.
2.1 The 6model Architecture
The 6model architecture is organized around a unified, flexible meta-object protocol that provides a consistent, orthogonal foundation for object representation and behavior. It rests on four primary abstractions: objects, attributes, roles, and metaclasses. Together these form a versatile substrate for programming paradigms such as multiple dispatch and parametric polymorphism.
The fundamental unit in 6model is the object, modeled as a container of attributes, that is, slots associating names with values. Objects are instances of classes, and classes are themselves objects described by meta-objects, which yields a recursive structure known as the metamodel hierarchy. Classes describe the layout and semantics of their instances through attributes and behavioral composition, modeling not only data but also the mechanisms by which objects interact. The hierarchy is layered, yet it does not regress indefinitely: at its base sits a self-describing meta-object (KnowHOW in 6model) whose own meta-object is an instance of itself, which bootstraps the entire system in a self-consistent manner.
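The relationship between instances, classes, and a self-describing root can be sketched in a few lines of Python. This is a conceptual model only, not MoarVM's implementation: `MetaObject`, `Instance`, `root_meta`, and `point_class` are hypothetical names introduced for illustration.

```python
class MetaObject:
    """Hypothetical stand-in for a meta-object: an object that describes a type."""

    def __init__(self, name, meta=None):
        self.name = name
        # With no explicit meta, the meta-object describes itself (bootstrap case).
        self.meta = meta if meta is not None else self
        self.attribute_names = []

    def add_attribute(self, attr_name):
        self.attribute_names.append(attr_name)

    def new_instance(self, **values):
        unknown = set(values) - set(self.attribute_names)
        if unknown:
            raise AttributeError(f"{self.name} has no attribute(s): {sorted(unknown)}")
        return Instance(self, values)


class Instance:
    """An object: a set of named slots plus a link to the meta-object describing it."""

    def __init__(self, meta, values):
        self.meta = meta
        self.slots = {name: values.get(name) for name in meta.attribute_names}


# The root meta-object is described by itself, which ends the regress.
root_meta = MetaObject("RootMeta")

# An ordinary class is an object described by the root meta-object.
point_class = MetaObject("Point", meta=root_meta)
point_class.add_attribute("x")
point_class.add_attribute("y")

p = point_class.new_instance(x=1, y=2)
```

Following the `meta` links from `p` leads to `point_class`, then to `root_meta`, and then back to `root_meta` again, which is the self-reference that closes the hierarchy.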
Attributes in 6model are more than simple memory slots; they are first-class entities encapsulating detailed metadata, including type constraints, initialization semantics, and visibility rules. This detailed attribute metadata enables fine control over object layout and behavior without compromising uniformity in access patterns. The architecture supports named, positional, and computed attributes, accommodating static data as well as dynamically derived properties. Furthermore, attribute composition permits attributes to be shared or overridden along the inheritance chain, supporting complex reuse and extension patterns.
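To make the idea of attributes as first-class metadata concrete, the following Python sketch treats each attribute as a small descriptor object carrying a type constraint, an optional default builder, and a visibility flag. The `Attribute` class and `build_slots` helper are assumptions made for illustration and do not correspond to MoarVM data structures.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass(frozen=True)
class Attribute:
    """Hypothetical attribute meta-object: metadata about one slot."""
    name: str
    type_constraint: type = object
    build_default: Optional[Callable[[], Any]] = None
    private: bool = True

    def check(self, value):
        # Enforce the type constraint before a value reaches the slot.
        if not isinstance(value, self.type_constraint):
            raise TypeError(
                f"attribute {self.name!r} expects {self.type_constraint.__name__}, "
                f"got {type(value).__name__}"
            )
        return value

    def initial_value(self):
        # Initialization semantics: run the default builder if one was declared.
        return self.build_default() if self.build_default is not None else None


def build_slots(attributes, supplied):
    """Create slot storage for one object, driven entirely by attribute metadata."""
    slots = {}
    for attr in attributes:
        if attr.name in supplied:
            slots[attr.name] = attr.check(supplied[attr.name])
        else:
            slots[attr.name] = attr.initial_value()
    return slots


point_attrs = [
    Attribute("x", int, lambda: 0),
    Attribute("y", int, lambda: 0),
    Attribute("label", str),
]
slots = build_slots(point_attrs, {"x": 3, "label": "origin"})
```

Because construction consults only the attribute descriptors, the same `build_slots` routine works for any class, which mirrors the uniform access pattern described above.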
Roles represent an innovative, composable behavioral unit distinct from classical inheritance. A role encapsulates a set of attributes and methods, designed to be mixed into classes or objects to compose behavior modularly. In 6model, roles are parametric and can accept type parameters or constraints, enabling highly reusable and adaptable components. This parametric nature allows roles to serve as polymorphic templates, fostering code generalization without sacrificing type safety. Roles may also specify required methods and attributes, enforcing contracts that classes incorporating them must implement.
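A minimal sketch of role composition, written in Python under simplifying assumptions, shows the three rules described here: role methods are flattened into the class, a method defined by the class itself takes precedence, and unmet requirements or unresolved conflicts between roles are rejected. The `Role` class and `compose` function are hypothetical names, and objects are modeled as plain dictionaries.

```python
class Role:
    """Hypothetical role: a bundle of methods plus the names it requires."""

    def __init__(self, name, methods=None, requires=()):
        self.name = name
        self.methods = dict(methods or {})
        self.requires = tuple(requires)


def compose(class_name, own_methods, roles):
    """Flatten roles into one method table for the class."""
    table = {}
    provider = {}
    for role in roles:
        for meth_name, fn in role.methods.items():
            if meth_name in own_methods:
                continue  # the class's own method resolves any conflict
            if meth_name in table:
                raise TypeError(
                    f"{class_name}: method {meth_name!r} is provided by both "
                    f"{provider[meth_name]} and {role.name}"
                )
            table[meth_name] = fn
            provider[meth_name] = role.name
    table.update(own_methods)
    # Every required method must exist after flattening.
    for role in roles:
        missing = [req for req in role.requires if req not in table]
        if missing:
            raise TypeError(f"{class_name} must implement {missing} for role {role.name}")
    return table


def describe(obj):
    return f"{obj['kind']} #{obj['id']}"


def to_json(obj):
    return '{"id": %d}' % obj["id"]


describable = Role("Describable", {"describe": describe}, requires=("identity",))
serializable = Role("Serializable", {"serialize": to_json})

table = compose(
    "Widget",
    {"identity": lambda obj: obj["id"]},
    [describable, serializable],
)
```

Unlike inheritance, the result is a single flat table with no ordering among the roles, so a name clash has to be resolved explicitly by the class rather than silently by lookup order.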
The integration of roles within the metamodel supports multiple dispatch, a core facility that dispatches method invocations based on the runtime types of multiple arguments. Unlike single dispatch, which selects a method solely on the receiver's type, multiple dispatch evaluates the dynamic types of all operands. 6model realizes this by leveraging its uniform object representation and the hierarchical organization of classes and roles. Method dictionaries are stored in the metaobjects corresponding to roles and classes, allowing fine-grained control over method selection and resolution order in the presence of role composition.
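The difference from single dispatch is easiest to see in code. The Python sketch below selects a candidate by checking the runtime types of all arguments and then choosing the narrowest match. It is a deliberately simplified model: the `MultiDispatcher` class is hypothetical, and the narrowness rule shown here (one candidate must be at least as specific as every other match in every position) is an assumption rather than the exact algorithm used for Raku.

```python
class MultiDispatcher:
    """Hypothetical multi-dispatch routine holding typed candidates."""

    def __init__(self, name):
        self.name = name
        self.candidates = []  # list of (tuple of parameter types, function)

    def add(self, *types):
        def register(fn):
            self.candidates.append((types, fn))
            return fn
        return register

    def __call__(self, *args):
        # Step 1: keep candidates whose parameter types accept every argument.
        matching = [
            (sig, fn) for sig, fn in self.candidates
            if len(sig) == len(args)
            and all(isinstance(arg, typ) for arg, typ in zip(args, sig))
        ]
        if not matching:
            raise TypeError(f"no candidate of {self.name} accepts these arguments")
        # Step 2: keep candidates that are at least as narrow as all other matches.
        best = [
            (sig, fn) for sig, fn in matching
            if all(
                all(issubclass(mine, theirs) for mine, theirs in zip(sig, other))
                for other, _ in matching
            )
        ]
        if len(best) != 1:
            raise TypeError(f"ambiguous call to {self.name}")
        return best[0][1](*args)


class Shape: pass
class Circle(Shape): pass
class Square(Shape): pass


collide = MultiDispatcher("collide")


@collide.add(Shape, Shape)
def collide_generic(a, b):
    return "generic"


@collide.add(Circle, Shape)
def collide_circle_any(a, b):
    return "circle-any"


@collide.add(Circle, Square)
def collide_circle_square(a, b):
    return "circle-square"
```

Calling `collide(Circle(), Square())` picks the third candidate because both argument types contribute to the choice, whereas `collide(Square(), Circle())` falls back to the generic one.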
At the implementation level, method dispatch tables are constructed dynamically, utilizing the metamodel's introspective capabilities to gather applicable methods from all relevant roles and classes. The system employs pervasive caching and memoization techniques to offset the potential overhead of this dynamic resolution. The elegant confluence of roles and multiple dispatch affords a powerful mechanism for method specialization and behavioral composition, essential for building extensible frameworks and domain-specific languages.
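One way to picture the combination of dynamic table construction and caching is a meta-object that flattens methods from parents, roles, and the class itself on first lookup, and discards the result whenever a contributing class changes. The Python sketch below illustrates that idea under stated assumptions; `ClassMeta`, its `cache_builds` counter, and the invalidation strategy are hypothetical and are not taken from MoarVM's source.

```python
class ClassMeta:
    """Hypothetical class meta-object with a lazily built, cached method table."""

    def __init__(self, name, parents=(), roles=()):
        self.name = name
        self.parents = list(parents)
        self.subclasses = []
        self.role_methods = {}
        for role in roles:  # each role is modeled as a plain dict of methods
            self.role_methods.update(role)
        self.own_methods = {}
        self._cache = None
        self.cache_builds = 0  # counts rebuilds, for demonstration only
        for parent in self.parents:
            parent.subclasses.append(self)

    def add_method(self, name, fn):
        self.own_methods[name] = fn
        self._invalidate()

    def _invalidate(self):
        # A change here can affect every subclass, so drop their caches too.
        self._cache = None
        for sub in self.subclasses:
            sub._invalidate()

    def method_table(self):
        if self._cache is None:
            table = {}
            # Apply parents last-to-first so that the first parent wins.
            for parent in reversed(self.parents):
                table.update(parent.method_table())
            table.update(self.role_methods)  # roles override inherited methods
            table.update(self.own_methods)   # the class overrides everything
            self._cache = table
            self.cache_builds += 1
        return self._cache

    def find_method(self, name):
        return self.method_table().get(name)


animal = ClassMeta("Animal")
animal.add_method("speak", lambda self: "...")

walker = {"move": lambda self: "walks"}
dog = ClassMeta("Dog", parents=[animal], roles=[walker])
dog.add_method("speak", lambda self: "Woof")

speak = dog.find_method("speak")  # builds the table once
move = dog.find_method("move")    # served from the cache
```

The trade-off is the usual one for memoization: lookups after the first are a single dictionary access, at the cost of tracking which caches must be invalidated when a class or role is modified.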
The 6model architecture's support for parametric roles further elevates its expressiveness. Roles parameterized by types or other roles enable defining abstract behaviors that are later instantiated with concrete parameters, mirroring the functionality of generics in statically typed languages but within a dynamic, uniform framework. This parameterization manifests through specialized metaobjects that enforce constraints and resolve instantiations at runtime, maintaining the delicate balance between flexibility and correctness.
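Parametric roles can be modeled as a function from type arguments to a concrete role, with instantiations memoized so that the same arguments always yield the same role. The Python sketch below is a hedged illustration of that pattern; `ParametricRole`, `ConcreteRole`, and the `TypedStack` example are hypothetical names, and the memoization is an assumption about how such instantiations might be shared.

```python
class ConcreteRole:
    """Hypothetical result of instantiating a parametric role."""

    def __init__(self, name, methods):
        self.name = name
        self.methods = methods


class ParametricRole:
    """Hypothetical role template: a body that is run once per set of type arguments."""

    def __init__(self, name, body):
        self.name = name
        self.body = body      # callable: type arguments -> dict of methods
        self._instances = {}  # memoized instantiations keyed by type arguments

    def parameterize(self, *type_args):
        for arg in type_args:
            if not isinstance(arg, type):
                raise TypeError(f"{self.name} expects types as parameters, got {arg!r}")
        if type_args not in self._instances:
            label = ", ".join(t.__name__ for t in type_args)
            self._instances[type_args] = ConcreteRole(
                f"{self.name}[{label}]", self.body(*type_args)
            )
        return self._instances[type_args]


def typed_stack_body(element_type):
    # The type parameter is captured by the generated methods.
    def push(storage, item):
        if not isinstance(item, element_type):
            raise TypeError(f"expected {element_type.__name__}")
        storage.append(item)
        return storage

    def peek(storage):
        return storage[-1]

    return {"push": push, "peek": peek}


typed_stack = ParametricRole("TypedStack", typed_stack_body)
int_stack = typed_stack.parameterize(int)
str_stack = typed_stack.parameterize(str)
```

The constraint check happens at runtime when `push` is called, which matches the dynamic flavor of genericity described above rather than the compile-time checking of statically typed generics.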
Overall, the 6model object system presents a homogeneous yet extensible foundation where classes, roles, and objects interoperate seamlessly. The metamodel hierarchy enforces a principled stratification, while attribute metadata and role composition encourage modularity and reuse. Multiple dispatch and parametric roles introduce adaptability and genericity, empowering developers to define rich semantics atop a coherent, rigorously designed architecture. This uniformity and generality not only simplify the implementation of diverse programming models but also enable advanced features such as reflective introspection, behavioral adaptation, and dynamic reconfiguration within a single, consistent framework.
2.2 REPRs: Representation Protocol
MoarVM's Representation Protocol (REPR) constitutes a cornerstone of its flexible and performant object model, enabling a diverse range of data structures to be seamlessly integrated into the runtime environment. Each REPR encapsulates a strategy for how an object's data is stored, accessed, and manipulated, providing a polymorphic interface that supports the implementation of both built-in and user-defined data types. The REPR system operates transparently to the end-user but is meticulously structured to optimize memory layout, execution speed, and extensibility.
At its core, a REPR defines the internal representation of an object's payload along with a set of callback functions, referred to as hooks, that mediate interactions with the object. These hooks form a well-specified API enabling MoarVM to perform essential operations such as object instantiation, slot access, garbage collection marking, serialization, and type introspection. The extensibility of the REPR system lies in the ability to implement these hooks differently for each representation, thus tailoring behavior to the semantics and performance requirements of distinct data forms.
Built-in REPRs in MoarVM cover foundational types, including arrays, hashes (associative arrays), strings, and native integer or floating-point numbers. For example, the array representation employs a contiguous memory buffer for element storage, allowing constant-time indexed access and efficient iteration. Conversely, the hash REPR often uses an internal hash table structure with collision resolution mechanisms, favoring flexible key-value pair management at the cost of more complex memory access patterns. These built-in REPRs are carefully engineered with memory layout choices that balance speed and space efficiency. For instance, arrays avoid pointer indirection by embedding the elements directly when possible, whereas hashes maintain pointers to key-value entries to support dynamic resizing and sparse storage.
User-defined REPRs extend this paradigm, permitting language implementers or advanced users to craft custom storage models that integrate seamlessly into MoarVM's object system. To define a custom REPR, a developer implements a descriptor structure (MVMREPROps in MoarVM's C source) containing pointers to a standard set of hooks, such as:
- allocate: Allocates memory and constructs the initial object instance.
- copy_to: Handles copying or cloning of object contents.
- gc_mark: Marks references for the garbage collector to trace.
- serialize/deserialize: Facilitate persistent storage or network transmission.
- get_attribute/bind_attribute: Retrieve or update slot values through controlled access.
- compose: Allows the REPR to perform setup when a new type is constructed.
These hooks provide the meta-level control that REPRs make available, and language implementers can build powerful features on top of them. For instance, a custom REPR can implement lazy slot initialization, virtualized field storage, or compile-time optimizations of object layouts. By intercepting slot access within the attribute access hooks, a REPR can implement dynamic behavior such as computed properties or access control mechanisms without modifying the language syntax.
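The following Python sketch models the hook interface with two interchangeable representations: one stores each slot as a separate boxed value, and the other packs 64-bit integers into a single contiguous buffer. It is a conceptual illustration, not MoarVM's C API. The class names are hypothetical, the slot access hook is split into `get_attribute` and `bind_attribute`, the copy hook is called `copy_to`, and serialization is omitted for brevity.

```python
import struct


class OpaqueSketchRepr:
    """Hypothetical REPR: dict-backed storage, one boxed slot per attribute."""

    def compose(self, attribute_names):
        # Type construction time: record which slots instances will have.
        self.attribute_names = list(attribute_names)

    def allocate(self):
        return {name: None for name in self.attribute_names}

    def get_attribute(self, body, name):
        return body[name]

    def bind_attribute(self, body, name, value):
        body[name] = value

    def copy_to(self, body):
        return dict(body)

    def gc_mark(self, body):
        # Report every held reference so a collector could trace it.
        return [value for value in body.values() if value is not None]


class PackedIntSketchRepr:
    """Hypothetical REPR: every attribute is a 64-bit integer in one buffer."""

    def compose(self, attribute_names):
        # Layout is computed once, when the type is composed.
        self.offsets = {name: i * 8 for i, name in enumerate(attribute_names)}
        self.size = 8 * len(self.offsets)

    def allocate(self):
        return bytearray(self.size)

    def get_attribute(self, body, name):
        return struct.unpack_from("<q", body, self.offsets[name])[0]

    def bind_attribute(self, body, name, value):
        struct.pack_into("<q", body, self.offsets[name], value)

    def copy_to(self, body):
        return bytearray(body)

    def gc_mark(self, body):
        return []  # native integers hold no references


def make_point(repr_impl, x, y):
    """Client code sees only the hooks, never the storage strategy."""
    repr_impl.compose(["x", "y"])
    body = repr_impl.allocate()
    repr_impl.bind_attribute(body, "x", x)
    repr_impl.bind_attribute(body, "y", y)
    return body


opaque_repr = OpaqueSketchRepr()
packed_repr = PackedIntSketchRepr()
opaque_point = make_point(opaque_repr, 3, -4)
packed_point = make_point(packed_repr, 3, -4)
```

Both bodies answer `get_attribute` identically even though one is a dictionary and the other is 16 raw bytes, which is the separation between behavior and storage that the hook set is meant to provide.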
Performance considerations permeate the design and implementation of REPRs. Because object operations such as construction and field access ultimately funnel through the REPR interface, minimizing overhead in hook invocation is critical. MoarVM mitigates this by invoking hooks through plain C function pointers and designing hooks with minimal abstraction...