
High Performance with MongoDB
Description
- Optimize schema design, indexing, storage, and system resources for real-world workloads
- Scale confidently with in-depth coverage of replication, sharding, and cluster management techniques
- Purchase of the print or Kindle book includes a free PDF eBook
Book Description
With data as the new competitive edge, performance has become the need of the hour. As applications handle exponentially growing data and user demand for speed and reliability rises, three industry experts distill their decades of experience to offer you guidance on designing, building, and operating databases that deliver fast, scalable, and resilient experiences. MongoDB's document model and distributed architecture provide powerful tools for modern applications, but unlocking their full potential requires a deep understanding of architecture, operational patterns, and tuning best practices. This MongoDB book takes a hands-on approach to diagnosing common performance issues and applying proven optimization strategies from schema design and indexing to storage engine tuning and resource management. Whether you're optimizing a single replica set or scaling a sharded cluster, this book provides the tools to maximize deployment performance. Its modular chapters let you explore query optimization, connection management, and monitoring or follow a complete learning path to build a rock-solid performance foundation. With real-world case studies, code examples, and proven best practices, you'll be ready to troubleshoot bottlenecks, scale efficiently, and keep MongoDB running at peak performance in even the most demanding production environments.
What you will learn
- Diagnose and resolve common performance bottlenecks in deployments
- Design schemas and indexes that maximize throughput and efficiency
- Tune the WiredTiger storage engine and manage system resources for peak performance
- Leverage sharding and replication to scale and ensure uptime
- Monitor, debug, and maintain deployments proactively to prevent issues
- Improve application responsiveness through client driver configuration
Who this book is for
This book is for developers, database administrators, system architects, and DevOps engineers focused on performance optimization of MongoDB. Whether you're building high-throughput applications, managing deployments in production, or scaling distributed systems, you'll gain actionable insights. Basic knowledge of MongoDB is assumed, with chapters designed progressively to support learners at all levels.
Content
- Intro
- FM
- Acknowledgements
- Contributors
- Preface
- Free Benefits with Your Book
- Systems and MongoDB Architecture
- What are systems?
- Characteristics of systems
- Changing systems is a risky business
- A system with no delays is simple
- A system with delays can behave in unexpected ways
- Trying to fix oscillations
- Systems surprise us
- A typical software system
- Algorithmic efficiency (complexity)
- Avoid premature optimization
- Amdahl's law (limit of parallel speedup)
- Locality and caching
- Little's law (throughput versus latency)
- Understanding MongoDB architecture
- The document model: MongoDB's foundation
- Key architectural components of MongoDB
- The data services system
- Query engine
- Storage engine/WiredTiger
- Libraries
- Other system components that mongod uses
- Managing complexity in modern data platforms
- Flexible data model with rigorous capabilities
- Built-in redundancy and resilience
- Horizontal scaling with intelligent distribution
- Performance tools
- Finding bottlenecks
- An incremental process for optimization
- Summary
- References
- Schema Design for Performance
- Understanding the core principles of schema design
- There is no single right way
- Data collocation
- Read and write trade-offs
- Small versus large documents
- Common myths
- Key strengths of the MongoDB schema design
- One-to-many relationships
- Embedding weak entities
- Dynamic attributes
- Caches and snapshots
- Optimization for common use cases
- Schema evolution
- Schema validation
- Common schema design mistakes
- Overnormalizing
- Overembedding
- Other common anti-patterns
- Schema design patterns by benefit
- Patterns for read performance optimization
- Embedding pattern
- Extended reference pattern
- Subset pattern
- Patterns for write performance optimization
- Document versioning pattern
- Bucketing pattern
- Key prefixing pattern
- Patterns for query and analytics optimization
- Computed pattern
- Schema versioning pattern
- Polymorphic pattern
- Archive pattern for storage optimization
- Real-world application: The Socialite app
- Scenario 1: User profile and activity feed
- Scenario 2: Chat system
- Summary
- Indexes
- Introduction to indexes
- What is an index?
- Resource efficiency and trade-offs
- Resource usage
- Common misconceptions about indexes in MongoDB
- Types of indexes in MongoDB
- Single-field indexes
- Compound indexes
- Multikey indexes
- Sparse indexes
- Wildcard indexes
- Partial indexes
- Designing efficient indexes
- Cardinality and selectivity
- Constructing compound indexes
- Equality queries
- Sorts and range queries
- The ESR guideline
- Maximizing resources with partial indexes
- Covered queries: the performance holy grail
- Ascending versus descending index order
- Indexing and aggregation pipelines
- Summary
- Aggregations
- MongoDB's aggregation framework
- Core concepts of the aggregation pipeline
- Performance considerations
- Aggregation pipeline flow
- Optimizing aggregation pipelines
- Optimization techniques
- Filter data early
- Avoid unnecessary $unwind and $group
- Design efficient $group operations
- Avoid common $lookup performance issues
- Efficient use of $project and $addFields
- Working with large datasets
- Aggregation pipeline limits
- Managing memory constraints with allowDiskUse
- Aggregation in distributed environments
- Optimizing aggregation for sharded collections
- Understanding shard-local versus merged operations
- Monitoring and profiling aggregation performance
- Utilizing materialized views
- Summary
- Replication
- Understanding MongoDB replica sets
- Components of a replica set
- Replication and high availability
- Understanding the MongoDB election process
- Replica set configuration
- Chained replication
- Replica set tags and analytics nodes
- Replication internals and performance
- Flow control
- Replication and the oplog
- Managing replication lag
- Read and write strategies
- Read preference
- Write concern and durability
- Summary
- Sharding
- Understanding core sharding architecture
- Architectural components of a sharded cluster
- Sharding a collection and selecting a shard key
- Why scatter-gather is bad
- Strategic shard key selection
- Shard key for targeting operations
- Shard key with good granularity
- Avoid increasing or decreasing shard key values
- Types of sharding
- Range-based sharding
- Hashed sharding
- Zone-based sharding
- Advanced sharding administration
- Resharding: Whether, when, and how
- Balancer considerations
- Pre-splitting: Whether, when, and how
- Moving unsharded collections
- Colocating sharded collection chunks together
- Summary
- Storage Engines
- Exploring storage engines
- Overview of WiredTiger
- A lookup operation
- An update operation
- An insert operation
- Eviction, checkpointing, and recovery
- Compression and encryption
- Configuration for improving performance
- Changing the size of the WiredTiger cache
- Changing syncdelay
- Changing minSnapshotHistoryWindowInSeconds
- Changing how eviction works
- Switching to the in-memory storage engine
- Changing the max leaf page size
- Summary
- Change Streams
- Understanding change streams architecture
- How change streams work: From write operations to events
- Event structure and life cycle
- Implementing change streams effectively
- Choosing the right scope and filtering strategy
- Server-side filtering with aggregation pipelines
- Document lookup strategies and performance
- Building a price monitoring service
- Managing performance and durability
- Resource optimization strategies
- Handling high-volume event streams
- Special considerations for sharded deployments
- Advanced patterns and production readiness
- Transaction visibility and event batching
- Document size limitations and collection life cycle
- Monitoring and health checks
- Replica set considerations
- Performance-tuning recap
- Summary
- Transactions
- Understanding multi-document ACID transactions
- History and evolution of transactions in MongoDB
- Introduction to ACID properties in MongoDB
- Document-level atomicity versus multi-document transactions
- Document-level atomicity
- Multi-document transactions atomicity
- When to use multi-document transactions in MongoDB
- Transactions API and session management
- Core API
- Callback API
- Read/write concerns and transaction behavior
- Performance considerations with transactions
- Replica set versus sharded cluster transactions
- WiredTiger cache considerations
- Managing transaction runtime limits and errors
- Lock acquisition and contention management
- Optimizing transaction size and duration
- Common transaction anti-patterns and their performance costs
- Long-running transactions and their impact on system performance
- Unnecessary use of transactions where single-document atomicity would suffice
- Single-document transactions
- Transactions for read-only operations
- Misunderstanding transaction scope and atomicity
- Frequent small transactions on hot documents/collections
- Improper error handling and retry logic
- Insufficient monitoring of transaction metrics
- Summary
- Client Libraries
- What are drivers?
- How MongoDB drivers work
- Key features of MongoDB drivers
- Consistency and reliability through shared specifications
- Idiomatic experience
- Performance optimization
- What are object-document mappers (ODMs)?
- Understanding ODMs
- Key features of ODMs
- Schema enforcement and data validation
- Intuitive query APIs
- Relationship management
- Middleware and life cycle hooks
- Type safety and IDE integration
- Impact on developer productivity
- When to use ODMs
- What are application frameworks?
- The value of application frameworks
- Leveraging ODMs and ORMs in frameworks
- Popular MongoDB-compatible frameworks
- Best practices when using frameworks with MongoDB
- Beyond the basics
- Asynchronous and non-blocking patterns
- Surfacing and handling failure conditions
- Connection management
- Read/write concerns and read preferences
- Compression and network performance
- Summary
- Managing Connections and Network Performance
- Understanding connection fundamentals
- Latency
- Connection churn
- Network saturation
- Understanding the connection lifecycle
- Connection establishment
- Connection utilization and pooling
- Connection termination
- MongoDB connection architecture
- TCP/IP and the MongoDB Wire Protocol
- Driver connection pooling
- MongoDB server connection handling
- Monitoring and troubleshooting connections
- Connection monitoring best practices
- Optimizing connection management
- Connection pool optimization
- Connection timeout configuration
- Server-side optimization
- Operating system configuration
- Performance optimization leveraging network compression
- Benefits and trade-offs of network compression
- Available compression algorithms
- Implementing network compression
- Connection strategies for serverless environments
- Summary
- Advanced Query and Indexing Concepts
- Understanding query execution
- Plan stages, or "how indexes can be used"
- Using the explain command
- The queryPlanner section
- The executionStats section
- Analyzing log messages
- Identifying problematic patterns
- Query targeting ratio
- Waiting for disk or other resources
- How to influence query execution
- Using hint
- Using query settings
- Other options
- MQL semantics and indexes
- Challenges with arrays and multikey indexes
- Equality is not just equality
- $elemMatch
- Deduplication
- Challenges with null and $exists
- Additional best practices
- Updates
- findAndModify
- Aggregation and query
- $sample
- $facet
- $where, $function, $accumulator, and mapReduce
- $text
- $regex and indexes
- Aggregation versus match expressions
- $expr and indexes
- $or clause and indexes
- Special collections, index types, and features
- Time-series collections
- Geospatial indexes
- Atlas Search
- Atlas Vector Search
- Collations
- Summary
- Operating Systems and System Resources
- Technical requirements
- Managing resources for optimal performance
- CPU utilization
- Memory management
- Storage
- Network
- Configuring systems for MongoDB performance
- Understand the ideal ratio of simultaneous operations per CPU core
- Search for other CPU-intensive processes during performance dips
- Select the right filesystem for your application
- Extent file system (XFS)
- Fourth extended filesystem (ext4)
- Other file systems
- Filesystem settings
- Avoid RAID with parity
- Adjust readahead settings
- Change filesystem cache settings
- Check SSD health
- Avoid double encryption and compression
- Ensure resident memory usage stays below 80%
- Networking best practices
- Using auto-scaling for Atlas performance
- Summary
- Monitoring and Observability
- Key differences between monitoring and observability
- Core MongoDB metrics and signals
- Operational metrics
- Performance-specific metrics
- Monitoring with MongoDB Atlas
- Atlas UI features
- Real-Time Performance Panel (RTPP)
- Performance Advisor
- Query Insights
- Alerts
- Atlas Search issues
- Connection issues
- Query issues
- Oplog issues
- Self-managed monitoring tools
- mongostat: real-time activity snapshot
- mongotop: Collection-level read/write timings
- serverStatus: Comprehensive metrics via the shell
- Database profiler: Deep dive into slow operations
- Integration with external monitoring systems
- Prometheus + Grafana
- Atlas integration for Prometheus
- Visualizing with Grafana
- Datadog
- Application performance monitoring platforms
- OpenTelemetry
- Considerations for external tools
- Common performance patterns and what to monitor
- Disk I/O bottlenecks
- Cache pressure (WiredTiger cache utilization)
- Replication lag and oplog window drift
- Summary
- Debugging Performance Issues
- Identifying inherently slow operations
- Case study: Troubleshooting cluster performance with Atlas
- Diagnosing the cause
- Root cause identification
- Implementing the solution
- Results and learnings
- Case study: Unexpected admin query causing collection scan
- Diagnosing the cause
- Implementing the solution
- Results and learnings
- Managing blocked operations
- Case study: Cursor leak investigation
- Diagnosing the cause
- Root cause identification
- Implementing the solution
- Results and learnings
- Use case: Burst of poorly optimized queries
- Diagnosing the cause
- Implementing the solution
- Results and learnings
- Addressing hardware resource constraints
- Case study: Atlas cluster resource optimization
- Diagnosing the cause
- Implementing the solution
- Results and learnings
- Use case: Diagnosing insufficient IOPs in self-managed MongoDB
- Diagnosing the cause
- Implementing the solution
- Results and learnings
- Systematic approach to performance troubleshooting
- Summary
- Unlock Your Exclusive Benefits
- Afterword
- Index
- Other Books You May Enjoy
System requirements
File format: ePUB
Copy protection: Adobe-DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Install the free reader Adobe Digital Editions prior to download (see eBook Help).
- Tablet/smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook before downloading (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePUB works well for novels and non-fiction books – i.e., 'flowing' text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook uses Adobe-DRM, a 'hard' copy protection. If the necessary requirements are not met, unfortunately you will not be able to open the eBook. You will therefore need to prepare your reading hardware before downloading.
Please note: We strongly recommend that you authorise using your personal Adobe ID after installation of any reading software.
For more information, see our eBook Help page.
File format: ePUB
Copy protection: without DRM (Digital Rights Management)
System requirements:
- Computer (Windows; MacOS X; Linux): Use a reader that can handle the file format ePUB, such as Adobe Digital Editions or FBReader – both free (see eBook Help).
- Tablet/Smartphone (Android; iOS): Install the free app Adobe Digital Editions or the app PocketBook (see eBook Help).
- E-reader: Bookeen, Kobo, Pocketbook, Sony, Tolino and many more (not Kindle).
The file format ePUB works well for novels and non-fiction books – i.e., 'flowing' text without complex layout. On an e-reader or smartphone, line and page breaks automatically adjust to fit the small displays.
This eBook does not use copy protection or Digital Rights Management.
For more information, see our eBook Help page.