In transactional databases, frequent snapshots can create performance bottlenecks that affect the entire system. But what if there were a way to boost snapshot performance without compromising consistency? Introducing Snapshot Optimization, a game-changer for database speed and efficiency.
The Problem:
In many transactional databases, a snapshot is taken with every transaction. Each snapshot comes at a cost: acquiring a database mutex, a lock that temporarily halts other operations. This constant locking becomes a performance drag, especially in scenarios with frequent read operations and multithreaded environments.
The Solution:
Snapshot Optimization breaks the lockstep between every transaction and a new snapshot. It relies on sequence numbers: the database advances its sequence number on every write, and each snapshot records the sequence number at which it was taken. If the sequence number has not changed since the last snapshot, meaning no modifications occurred, there's no need to create a new one. Instead, the database reuses the existing snapshot, bypassing the mutex acquisition and its performance penalty.
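To make the idea concrete, here is a minimal C++ sketch of sequence-number-based snapshot reuse. It is not Speedb's implementation: `ToyDb`, `Snapshot`, `Write`, and the counters are hypothetical names invented for illustration, and the cached pointer is published with the standard library's `std::atomic_load`/`std::atomic_store` overloads for `std::shared_ptr` (deprecated since C++20) rather than anything from Folly.

```cpp
#include <atomic>
#include <cstdint>
#include <memory>
#include <mutex>

// Illustrative only: ToyDb and Snapshot are hypothetical, not Speedb's API.
struct Snapshot {
  uint64_t sequence;  // sequence number at which the snapshot was taken
};

class ToyDb {
 public:
  // Every write advances the sequence number.
  void Write() { sequence_.fetch_add(1); }

  std::shared_ptr<const Snapshot> GetSnapshot() {
    // Fast path: nothing was written since the cached snapshot was taken,
    // so hand it out again without touching the database mutex.
    if (auto hit = TryReuse()) {
      reused_.fetch_add(1);
      return hit;
    }
    // Slow path: take the database mutex and publish a fresh snapshot.
    std::lock_guard<std::mutex> lock(db_mutex_);
    if (auto hit = TryReuse()) {  // another thread may have refreshed it
      reused_.fetch_add(1);
      return hit;
    }
    std::shared_ptr<const Snapshot> fresh =
        std::make_shared<Snapshot>(Snapshot{sequence_.load()});
    std::atomic_store(&cached_, fresh);
    created_.fetch_add(1);
    return fresh;
  }

  uint64_t created() const { return created_.load(); }
  uint64_t reused() const { return reused_.load(); }

 private:
  // Returns the cached snapshot if it is still current, otherwise nullptr.
  std::shared_ptr<const Snapshot> TryReuse() const {
    auto cached = std::atomic_load(&cached_);
    if (cached && cached->sequence == sequence_.load()) return cached;
    return nullptr;
  }

  std::mutex db_mutex_;
  std::atomic<uint64_t> sequence_{0};
  std::atomic<uint64_t> created_{0};
  std::atomic<uint64_t> reused_{0};
  std::shared_ptr<const Snapshot> cached_;
};
```

In this sketch, two consecutive `GetSnapshot()` calls with no `Write()` in between return the same object and take the mutex only once; after a `Write()`, the next call falls through to the slow path and creates a new snapshot.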
Benefits:
- Faster database operations: Reusing snapshots frees up valuable resources, leading to significant performance improvements, especially with read-heavy workloads.
- Reduced CPU usage: Optimized snapshotting minimizes resource demands, keeping your CPU cool and efficient.
- Smoother multithreaded experience: In multithreaded environments, frequent mutex acquisition can cause contention and slowdown. Snapshot Optimization keeps the threads flowing smoothly.
Who Benefits the Most:
This feature is most useful for workloads that take snapshots frequently and are read-heavy with low change rates.
Here are some examples:
- Data analytics: Querying large datasets without modifying them.
- Reporting: Generating reports based on historical data snapshots.
- Machine learning: Training models with read-only access to large datasets.
Benchmarking the Boost:
Tests show the impact is real. In a head-to-head comparison with RocksDB, Speedb with Snapshot Optimization delivered remarkable results:
- Completed 1 million snapshot/delete operations in under 30 seconds with 16 threads
- RocksDB, without the optimization, took nearly 2 minutes with the same workload
The user and system time graph shows that Speedb achieves this higher performance without consuming any extra CPU resources.
With the same workload finishing in under 30 seconds instead of nearly 2 minutes, that is roughly a fourfold reduction in run time: Snapshot Optimization offers a dramatic performance boost for databases relying heavily on read operations.
Ready to Snap Up Speed?
Snapshot Optimization is most effective for read-heavy workloads. So, assess your typical usage patterns and see if this feature aligns with your database needs.
Getting Started
The Snapshot Optimization feature requires the Folly library to be installed. This C++ library provides the building blocks for efficient snapshot caching and reuse. Once installed, simply enable the feature during compilation.
How to install Folly on Ubuntu:
    sudo apt install libssl-dev libfmt-dev
    git clone https://github.com/facebook/folly
    cd folly
    sudo ./build/fbcode_builder/getdeps.py install-system-deps --recursive
    mkdir build_
    cd build_
    cmake .. -DBUILD_SHARED_LIBS=ON
    make -j $(nproc) install
To compile Speedb with the Snapshot Optimization enhancement, build it using CMake:
    git clone https://github.com/speedb-io/speedb
    cd speedb
    mkdir build
    cd build
    cmake .. -DWITH_SNAP_OPTIMIZATION=ON -DBUILD_SHARED_LIBS=ON -DCMAKE_BUILD_TYPE=Release
    make -j $(nproc)
Follow us on GitHub to get all the latest and greatest improvements. Your contribution is always welcome!
Have any questions? Join our Discord server and talk with the community!