# Live Compaction
In order to use live compaction in Sirius, you must choose a SiriusLog implementation that supports it. Currently, SegmentedUberStore is the only SiriusLog that supports live compaction; to use it, set the config property SiriusConfiguration.LOG_VERSION_ID to "SEGMENTED_1.0".
A SegmentedUberStore is very similar to a traditional UberStore, but instead of a single large data/index pair, the backend is split into segments. Only one segment is ever open for writing (called the active segment); when it reaches the maximum number of events specified by SiriusConfiguration.LOG_EVENTS_PER_SEGMENT, a new segment is opened. A newly-opened segment has a numerical "name", which is the previous segment's name + 1, so these names monotonically increase.
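As an illustration only (this is not Sirius's actual internal code), the rollover rule can be thought of roughly like this, with Event standing in for the log's ordered events:

```scala
// Toy model of segment rollover, for illustration only.
final case class Event(key: String, body: Array[Byte])
final case class Segment(name: Long, events: Vector[Event])

// Writes always go to the single active segment; once it holds maxEventsPerSegment
// events (SiriusConfiguration.LOG_EVENTS_PER_SEGMENT), a new segment named name + 1
// is opened and the old one becomes inactive (read-only).
def append(active: Segment, event: Event, maxEventsPerSegment: Int): (Segment, Option[Segment]) =
  if (active.events.size < maxEventsPerSegment)
    (active.copy(events = active.events :+ event), None)
  else
    (Segment(active.name + 1, Vector(event)), Some(active))
```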
The key idea in live compaction is this: since events in the log are idempotent, an event in the WAL overrides any number of previous events with the same key. Thus, in a given segment, all keys necessarily override any matching keys from previous segments. To compact, we "apply" the keys of newer segments to older segments, removing the matching events from those older segments.
The list of segments is divided into two parts:
- inactive, which consists of the read-only files from closed segments
- active, which is where new writes are applied. There is only one active segment at a time.
Compaction consists of the following (a simplified sketch follows the list):
- for each inactive segment S, where S.keys-applied == false:
  - gather the keys(S)
  - for each inactive segment S', where S'.name < S.name:
    - build S'.compacting, which consists of all events in S' whose keys were not found in keys(S)
  - for each inactive segment S', where there exists S'.compacting:
    - replace S' with S'.compacting
  - mark S.keys-applied = true
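Here is a simplified, in-memory sketch of that loop. The real SegmentedUberStore works against on-disk segment files; Event and Segment are the same toy stand-ins as in the earlier sketch, with a keys-applied flag added:

```scala
final case class Event(key: String, body: Array[Byte])
final case class Segment(name: Long, events: Vector[Event], keysApplied: Boolean = false)

// Apply each not-yet-applied segment's keys to all older segments, walking oldest to newest.
def compact(inactive: List[Segment]): List[Segment] =
  inactive.sortBy(_.name).foldLeft(List.empty[Segment]) { (older, segment) =>
    if (segment.keysApplied) older :+ segment
    else {
      val keys = segment.events.map(_.key).toSet
      // drop events in older segments whose keys also appear in this newer segment
      val compactedOlder = older.map(s => s.copy(events = s.events.filterNot(e => keys(e.key))))
      compactedOlder :+ segment.copy(keysApplied = true)
    }
  }
```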
There is no internal compaction of a segment, so duplicate keys within a single segment are not removed; this may be a future enhancement.
#### Segment Merging
Merging is the process by which adjacent compacted segments are combined. When two adjacent segments have an aggregate size less than SiriusConfiguration.LOG_EVENTS_PER_SEGMENT, and both have had their keys applied in compaction, they can be merged into one. After a round of compaction has finished, a round of merges is triggered. The time to do this is included in the "compaction time" metric; it shouldn't add much in the general case.
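As a rough sketch of that rule (again, not Sirius's actual code), two adjacent inactive segments are merge candidates when:

```scala
final case class SegmentInfo(name: Long, eventCount: Long, keysApplied: Boolean)

// Adjacent segments can be merged once both have had their keys applied and their
// combined event count stays under SiriusConfiguration.LOG_EVENTS_PER_SEGMENT.
def canMerge(left: SegmentInfo, right: SegmentInfo, maxEventsPerSegment: Long): Boolean =
  left.keysApplied && right.keysApplied &&
    left.eventCount + right.eventCount < maxEventsPerSegment
```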
#### Enabling Live Compaction
Live compaction can be enabled by setting the SiriusConfiguration.COMPACTION_SCHEDULE_MINS parameter to a non-zero value. This parameter specifies the number of minutes between compaction runs; it is set to 0 by default, which effectively turns compaction off. You should also choose a segment size (see below) and configure it with the SiriusConfiguration.LOG_EVENTS_PER_SEGMENT parameter.
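Putting the settings together, a programmatic configuration might look like this sketch (the 30-minute schedule and 500,000-event segment size are example values only, and the setProp-style calls are an assumption, as above):

```scala
import com.comcast.xfinity.sirius.api.SiriusConfiguration

// Sketch: segmented log with compaction every 30 minutes and 500,000 events per segment.
val siriusConfig = new SiriusConfiguration
siriusConfig.setProp(SiriusConfiguration.LOG_VERSION_ID, "SEGMENTED_1.0")
siriusConfig.setProp(SiriusConfiguration.COMPACTION_SCHEDULE_MINS, 30)
siriusConfig.setProp(SiriusConfiguration.LOG_EVENTS_PER_SEGMENT, 500000L)
```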
#### Choosing a Segment Size
There are a number of factors that go into choosing a segment size, but first and foremost, you want your choice to satisfy the following statement:

The system should be able to apply the keys of a newly-closed segment in compaction FASTER than a newly-opened segment fills up with new events. In other words, you should choose a segment size such that compaction can keep up with updates.
This depends on a few factors: the ID space of your data, the throughput of updates, and the speed of processing events in compaction. Given the following definitions:
- S = max events per segment
- N = number of unique keys in the WAL
- r(w) = rate of writes into the system (events/sec)
- r(c) = rate of compaction in the system: roughly, the speed at which events can be read and re-written to disk (events/sec)

you must ensure that:

S > (N * r(w)) / r(c)
For example, consider an application with the following characteristics:
- N = 50 million unique keys
- r(w) = 250 events/sec (experimentally, this is near the top end of Sirius throughput we've seen for a 3-node system)
- r(c) = 35,000 events/sec (estimated from 2KB/event and 70MB/sec spinning-disk throughput; expect much higher from SSDs or server-grade hardware)

Then:

S > (50M * 250) / 35K =~ 360K

so the max segment size needs to be at least 360,000 or so.
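If it helps, the sizing rule can be expressed as a tiny helper function (the names here are illustrative only):

```scala
// Minimum segment size such that applying one segment's keys (~N / r(c) seconds)
// finishes before a new segment fills up (~S / r(w) seconds).
def minSegmentSize(uniqueKeys: Long, writeRate: Double, compactionRate: Double): Long =
  math.ceil(uniqueKeys * writeRate / compactionRate).toLong

// minSegmentSize(50000000L, 250, 35000) == 357143, i.e. roughly the 360K above
```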
Other factors in determining segment size:
- Heap usage of compaction increases with segment size. The exact amount will depend on your data set, but we've seen around 200MB per million events in a segment, so a segment size of 2M events will net you around 400MB of extra heap usage during compaction.
- The higher the segment size, the higher the possibility for duplicate keys within a segment. Since internal compaction of segments is not implemented, these will not be compacted away until a future segment applies this key in compaction. Depending on your update patterns, this may be worth noting.
- The higher the segment size, the quicker the original compaction step will be. This is a one-time cost, so it's probably not a good reason to choose any particular value, but it's worth knowing. The original compaction should take roughly O(N * N/S) time, which is quadratic in N for a fixed segment size, so expect it to take a while.
Other factors in determining compaction schedule:
- if a compaction is currently running and another one fires off, it will have no effect. It is safe.
- if a compaction fires off and no new segment has been closed since the last compaction run, it will have no effect. It is safe.
- compaction will use memory according to the segment size and is not very processor-intensive. It should be safe to run at any point.
- since compaction is generally very safe, a reasonable compaction schedule is probably on the order of 15 minutes rather than 6 hours.
Copyright 2013-2017 Comcast Cable Communications Management, LLC