This is a personal note that I decided to share. It reflects my understanding of a subject and may contain errors and approximations. Feel free to contribute by contacting me here. Any help will be credited!
The MongoDB Cache System relies on two caches: the WiredTiger storage engine internal cache and the OS filesystem cache (which is not a MongoDB feature).
All read and write operations in MongoDB run through the cache.
OS Filesystem Cache
The OS filesystem cache depends on the underlying operating system and is fully controlled by it (not by MongoDB or WiredTiger). Its purpose is to reduce disk I/O by caching frequently accessed disk pages. Data in the filesystem cache has the same format as on-disk data (i.e. it is compressed).
WiredTiger Cache
WiredTiger Cache Purpose
The WiredTiger cache holds the most recently used data and indexes. We call the “working set” the indexes plus the frequently accessed data. When the working set fits in the cache, query performance is optimal.
“Ideally the cache should be configured to be large enough to hold an application’s working set.” (WiredTiger documentation)
WiredTiger Cache Data
Data loaded into the WiredTiger cache (indexes and collection data) has a different representation from the on-disk format:
- Collection data is uncompressed in the cache.
- Indexes still benefit from index prefix compression in the cache.
Data in the cache is stored in B-tree pages.
We can distinguish two types of data in the WiredTiger cache:
- clean data: data identical to the on-disk version.
- dirty data: data that has been modified and needs to be reconciled with the on-disk version.
The cache has no dedicated allocation for read or write operations, nor per database or collection.
WiredTiger Cache Size
The default WiredTiger cache size is max(256 MB, 0.5 * (RAM - 1 GB)).
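To make the formula concrete, here is a minimal sketch that evaluates it for a given amount of RAM. The function name is my own, and I assume binary units (1 GB = 1024^3 bytes); it only reproduces the formula above and is not how MongoDB computes the value internally.

```python
MB = 1024 ** 2  # assumption: binary units
GB = 1024 ** 3


def default_wiredtiger_cache_bytes(ram_bytes: int) -> int:
    """Return max(256 MB, 0.5 * (RAM - 1 GB)) in bytes (hypothetical helper)."""
    half_of_remaining = (ram_bytes - 1 * GB) // 2
    return max(256 * MB, half_of_remaining)
```

For example, `default_wiredtiger_cache_bytes(16 * GB) / GB` gives 7.5, while a machine with 1 GB of RAM falls back to the 256 MB floor.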
On MongoDB Atlas, the WiredTiger cache size is:
- 50% of the RAM on M40 and above.
- 25% of the RAM on M30 and smaller clusters.
WiredTiger Cache Management
WiredTiger actively monitors the cache content and evicts data when it runs low on space.
The eviction system is composed of:
- One eviction server (a thread that runs in the background).
- Three shared eviction queues (two classic and an urgent one).
- Zero or more eviction worker threads (they read pages in the eviction queues and evict them).
Evicting clean data simply removes the page from memory. Evicting dirty data first performs a reconciliation with the data store to persist the latest version of the data on disk.
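The difference between the two eviction paths can be illustrated with a toy model. This is only a sketch of the idea: `ToyCache` and its methods are names I made up, pages are plain dictionary entries, and real WiredTiger reconciliation is far more involved.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCache:
    """Toy model only: page id -> (value, dirty flag). Not WiredTiger's design."""
    disk: dict = field(default_factory=dict)
    pages: dict = field(default_factory=dict)

    def read(self, page_id):
        # On a cache miss, load a clean copy of the page from disk.
        if page_id not in self.pages:
            self.pages[page_id] = (self.disk[page_id], False)
        return self.pages[page_id][0]

    def write(self, page_id, value):
        # Only the in-memory copy changes, so the page becomes dirty.
        self.pages[page_id] = (value, True)

    def evict(self, page_id) -> bool:
        value, dirty = self.pages.pop(page_id)
        if dirty:
            # Dirty eviction: persist the latest version before dropping it.
            self.disk[page_id] = value
        # Clean eviction: nothing to write, the page is simply dropped.
        return dirty
```

In this model, evicting a page that was only read costs nothing beyond the removal, whereas evicting a written page implies a disk write, which is why a large amount of dirty data makes eviction more expensive.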
Some eviction parameters can be tuned: eviction_trigger (95% by default), eviction_dirty_target (5% by default), eviction_dirty_trigger (20% by default)…
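As a rough illustration of how these thresholds relate to the cache state, the sketch below reports which of the three defaults quoted above have been reached for a given cache fill ratio and dirty fill ratio. The function name is hypothetical, and my reading of the semantics (a "target" is the level at which background eviction starts working, a "trigger" is the level at which application threads get involved) is a simplification of what the eviction server actually does.

```python
# Default thresholds quoted in the text, as percentages of the cache size.
EVICTION_TRIGGER = 95
EVICTION_DIRTY_TARGET = 5
EVICTION_DIRTY_TRIGGER = 20


def crossed_thresholds(fill_pct: float, dirty_pct: float) -> list:
    """Return the names of the thresholds reached by the given ratios."""
    crossed = []
    if dirty_pct >= EVICTION_DIRTY_TARGET:
        crossed.append("eviction_dirty_target")
    if dirty_pct >= EVICTION_DIRTY_TRIGGER:
        crossed.append("eviction_dirty_trigger")
    if fill_pct >= EVICTION_TRIGGER:
        crossed.append("eviction_trigger")
    return crossed
```

A cache that is 50% full with 10% dirty data would only report `eviction_dirty_target`, which in this simplified view means background eviction of dirty pages is expected but application threads are not yet affected.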
Parameter tuning can be useful in specific cases such as bulk writes. If applications write a lot of data in a short period of time, the amount of dirty data grows quickly, which puts pressure on the cache and the eviction system. When the eviction threads cannot keep up, WiredTiger application threads can be requisitioned to evict content, which delays the operations they would otherwise be serving for clients.
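For reference, here is a sketch of how such a change could be prepared on a self-managed deployment. The helper `build_eviction_config` is my own and only builds a WiredTiger configuration string; the percentage-only validation is a simplification I chose. The commented-out call assumes pymongo and the `wiredTigerEngineRuntimeConfig` server parameter, and it should be tested carefully before being used anywhere that matters.

```python
ALLOWED = ("eviction_trigger", "eviction_dirty_target", "eviction_dirty_trigger")


def build_eviction_config(**settings: int) -> str:
    """Build a string such as 'eviction_dirty_target=5,eviction_dirty_trigger=20'."""
    unknown = sorted(set(settings) - set(ALLOWED))
    if unknown:
        raise ValueError(f"unsupported settings: {unknown}")
    for name, value in settings.items():
        # Simplification: this helper only accepts percentages.
        if not 0 < value <= 100:
            raise ValueError(f"{name} must be a percentage, got {value}")
    return ",".join(f"{name}={value}" for name, value in settings.items())


# Applying it (not executed here; requires a running server and privileges):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")  # assumed URI
#   config = build_eviction_config(eviction_dirty_target=5, eviction_dirty_trigger=20)
#   client.admin.command("setParameter", 1, wiredTigerEngineRuntimeConfig=config)
```

Keeping the string construction separate from the server call makes it easy to review the exact configuration before it is applied.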
MongoDB Atlas Cache Monitoring
In MongoDB Atlas, several cache metrics are available in the metrics dashboard. They only concern the WiredTiger cache.
- Cache Usage (total size, clean data size, dirty data size)
- Cache Activity (read/write bytes per second)
- Cache Ratio (the cache fill ratio, which is the percentage of the cache that is in use, and the dirty fill ratio, which is the percentage of the cache that holds dirty data)
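Outside Atlas, similar ratios can be derived from the `wiredTiger.cache` section of the `serverStatus` command output. The sketch below assumes that both ratios are expressed relative to the configured maximum cache size, which is my interpretation of the Atlas charts. The function name is hypothetical, and the commented-out lines assume pymongo.

```python
def cache_ratios(cache_stats: dict) -> dict:
    """Compute fill ratios (in percent) from a serverStatus wiredTiger.cache document."""
    max_bytes = cache_stats["maximum bytes configured"]
    used_bytes = cache_stats["bytes currently in the cache"]
    dirty_bytes = cache_stats["tracked dirty bytes in the cache"]
    return {
        "cache_fill_pct": 100 * used_bytes / max_bytes,
        "dirty_fill_pct": 100 * dirty_bytes / max_bytes,
    }


# Fetching the statistics (not executed here; requires a running server):
#   from pymongo import MongoClient
#   client = MongoClient("mongodb://localhost:27017")  # assumed URI
#   stats = client.admin.command("serverStatus")["wiredTiger"]["cache"]
#   print(cache_ratios(stats))
```

Comparing these two numbers with the eviction thresholds gives a quick idea of how close the cache is to putting pressure on application threads.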