feat: Modify optimized compaction to cover edge cases #25594

Merged · 32 commits merged into master-1.x from db/4201/compaction-bugs · Jan 14, 2025

Conversation

@devanbenz commented on Nov 27, 2024:

This PR changes the compaction algorithm to account for the following
cases that were not previously handled:

  • Many generations with a group size over 2 GB
  • A single generation with many files and a group size under 2 GB

where group size is the total size of the TSM files in the shard directory.
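As a rough sketch of those two checks (the FileStat shape follows the test excerpts later in this thread; the threshold constant, helper name, and generation-count cutoffs are illustrative assumptions, not the PR's actual code):

// Illustrative only: classify a shard's TSM layout into the two edge cases
// described above. Group size is the total size of the shard's TSM files.
package main

import "fmt"

type FileStat struct {
	Path string
	Size uint64
}

const maxTSMFileSize = 2048 * 1024 * 1024 // 2 GB

func optimizeReason(generations [][]FileStat) string {
	var groupSize uint64
	for _, gen := range generations {
		for _, f := range gen {
			groupSize += f.Size
		}
	}
	switch {
	case len(generations) > 1 && groupSize > maxTSMFileSize:
		return "many generations with a group size over 2 GB"
	case len(generations) == 1 && len(generations[0]) > 2 && groupSize < maxTSMFileSize:
		return "single generation with many files and a group size under 2 GB"
	default:
		return ""
	}
}

func main() {
	gen := []FileStat{
		{Path: "01-04.tsm1", Size: 700 * 1024 * 1024},
		{Path: "01-05.tsm1", Size: 500 * 1024 * 1024},
		{Path: "01-06.tsm1", Size: 300 * 1024 * 1024},
	}
	fmt.Println(optimizeReason([][]FileStat{gen})) // single-generation case
}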

@devanbenz force-pushed the db/4201/compaction-bugs branch from 6e9db1b to cab638c on December 13, 2024 17:20
@devanbenz force-pushed the db/4201/compaction-bugs branch from 9f5098b to 8c9d7e7 on December 16, 2024 18:49
@devanbenz changed the title from "feat(wip): WIP modifying compaction tests" to "feat: Modify optimized compaction to cover edge cases" on Dec 16, 2024
@devanbenz marked this pull request as ready for review on December 16, 2024 19:32
Review comment (resolved): tsdb/engine/tsm1/engine.go
@gwossum (Member) left a comment:

First pass, mostly about comments.

closes #25666
@devanbenz force-pushed the db/4201/compaction-bugs branch from 7de2cf2 to d631314 on December 16, 2024 23:32
@devanbenz self-assigned this on Dec 16, 2024
@davidby-influx (Contributor) left a comment:

I still need to review the tests more closely, as well.

Review comments (resolved): tsdb/engine/tsm1/compact.go, tsdb/engine/tsm1/compact_test.go
for shards that may have over a 2 GB group size but many fragmented files (under 2 GB and under the aggressive points per block count)
@davidby-influx (Contributor) left a comment:

Lots of good changes and more tests! Thanks for the effort.
I still have a few things that you may want to change. Happy to discuss things in a teleconference.

Review comments (resolved): tsdb/config.go, tsdb/engine/tsm1/compact.go, tsdb/engine/tsm1/compact_test.go
@davidby-influx (Contributor) left a comment:

Great changes, really improving this code. A few more in the endless cycle...

Review comments (resolved): tsdb/engine/tsm1/compact.go, tsdb/engine/tsm1/compact_test.go
@@ -77,6 +81,9 @@ const (
// partition snapshot compactions that can run at one time.
// A value of 0 results in runtime.GOMAXPROCS(0).
DefaultSeriesFileMaxConcurrentSnapshotCompactions = 0

// MaxTSMFileSize is the maximum size of TSM files.
Contributor:

Nice!

Review comments (resolved): tsdb/engine/tsm1/compact_test.go
@davidby-influx (Contributor) left a comment:

Code looks good, suggested more test scenarios.

Think about backfills, and the files they could generate.
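For instance, a backfill of older data could leave a shard looking something like this (file names and sizes are made up for illustration, using only the FileStat fields shown in these tests):

// Hypothetical shard layout after a backfill: one older, already-compacted
// generation plus several small later generations written by the backfill.
data := []tsm1.FileStat{
	{Path: "01-04.tsm1", Size: 2048 * 1024 * 1024}, // original, fully compacted generation
	{Path: "02-01.tsm1", Size: 64 * 1024 * 1024},   // small files written by the backfill
	{Path: "03-01.tsm1", Size: 48 * 1024 * 1024},
	{Path: "04-01.tsm1", Size: 32 * 1024 * 1024},
}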

Review comment (resolved): tsdb/engine/tsm1/compact_test.go
@davidby-influx previously approved these changes on Dec 26, 2024
@davidby-influx (Contributor) left a comment:

Looks good. Please don't merge until @gwossum has a chance to re-review, though

@davidby-influx (Contributor) left a comment:

Overall, I am confused by the conditionals in the planner that compare a metric from the first file in a generation and also compare aggregate metrics of the generation. Either there is some invariant (e.g., the first file is always the largest) which is neither tested nor documented, or these conditionals are heuristic, not reliable.

The comments seem to refer to single files, but the operations are on generations. Maybe I'm missing something. Let's discuss.

If there is an invariant (like the first file in a generation is always the largest), let's both document it and test for it.

@@ -454,7 +474,7 @@ func (c *DefaultPlanner) Plan(lastWrite time.Time) ([]CompactionGroup, int64) {
var skip bool

// Skip the file if it's over the max size and contains a full block and it does not have any tombstones
if len(generations) > 2 && group.size() > uint64(maxTSMFileSize) && c.FileStore.BlockCount(group.files[0].Path, 1) == tsdb.DefaultMaxPointsPerBlock && !group.hasTombstones() {
if len(generations) > 2 && group.size() > uint64(tsdb.MaxTSMFileSize) && c.FileStore.BlockCount(group.files[0].Path, 1) >= tsdb.DefaultMaxPointsPerBlock && !group.hasTombstones() {
Contributor:

I'm slightly confused whether this is still what we want to do. We skip a group (i.e., a generation) here if it is large (sum of all files is larger than the largest permissible single file), and the first file has the default maximum points per block and there are no tombstones.

This seems to be mixing metrics from the first file in the generation (points per block) with metrics from the whole generation (combined file size). Do we need to look at the points per block of all the files in the generation? Why are we skipping a generation if it is larger than a single file can be? What's the significance of that?

I understand the original code had this strange mix of conditionals, but do we understand why, and whether we should continue with them? At the very least, the comment Skip the file if... is misleading, because we are skipping a generation which may contain more than one file, are we not?
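Spelling that condition out (identifiers are taken from the diff above; the boolean assignment and the comments are just a paraphrase of each clause, not new logic):

// A group (one generation) is skipped when all of these hold:
skip := len(generations) > 2 && // more than two generations in the shard
	group.size() > uint64(tsdb.MaxTSMFileSize) && // the whole generation exceeds the max single-file size
	// points per block is probed only on the generation's first file
	c.FileStore.BlockCount(group.files[0].Path, 1) >= tsdb.DefaultMaxPointsPerBlock &&
	!group.hasTombstones() // nothing in the generation has deletions pending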

@devanbenz (Author) commented on Jan 9, 2025:

Yes, I think the comment is a bit misleading. I was mostly just keeping Plan and PlanLevel as is... I would have no problem with modifying the existing logic in them, though. Perhaps instead of checking individual file block counts and the entire group size against 2 GB, I could check all the files in the group and all the block sizes in the group? Some pseudocode:

if gens <= 1
	skip

filesUnderMaxSize = 0
filesUnderMaxBlocks = 0

for file in generation
	if file.size < maxSize
		filesUnderMaxSize++
	if file.blocks < maxBlocks
		filesUnderMaxBlocks++

if filesUnderMaxSize >= 2 || filesUnderMaxBlocks >= 2 || has tombstones
	plan

Contributor:

After consideration, I think you were right, @devanbenz, to change Plan and PlanLevel minimally. While their algorithms are obtuse, we shouldn't change them in this PR or at this time, to minimize the risks in what is already a large change to compaction.

Review comments (resolved): tsdb/engine/tsm1/compact.go
Size: 2048 * 1024 * 1024,
},
{
Path: "03-05.tsm1",
Contributor:

It would be good to also have this test with the small file as the first one in the generation, as well as the last.

data := []tsm1.FileStat{
{
Path: "01-04.tsm1",
Size: 251 * 1024 * 1024,
Path: "01-05.tsm1",
Contributor:

Reverse the order of the file sizes, as well, to make sure the various tests of the first file in the code behave correctly whichever case it encounters (smallest first, largest first).

// under 2 GB and a mix of aggressive max blocks and default max blocks
// it should be further compacted.
func TestDefaultPlanner_FullyCompacted_LargeSingleGenerationUnderAggressiveBlocks(t *testing.T) {
// > 2 GB total group size
Contributor:

Test multiple orderings of the file sizes, as discussed in other comments above

func TestDefaultPlanner_FullyCompacted_LargeSingleGenerationMaxAggressiveBlocks(t *testing.T) {
// > 2 GB total group size
// 100% of files are at aggressive max block size
data := []tsm1.FileStat{
Contributor:

Test the reverse file order, as well. As discussed above.

// > 2 GB total group size
// 100% of files are at aggressive max block size
data := []tsm1.FileStat{
{
Contributor:

Test reverse order, as well.

data := []tsm1.FileStat{
{
Path: "01-13.tsm1",
Contributor:

Test reverse order, as well.

{
Path: "03-04.tsm1",
Size: 600 * 1024 * 1024,
},
Contributor:

Test reverse file order, as well.

@davidby-influx (Contributor) left a comment:

For each test, extract the code from the creation of the DefaultPlanner to the final check of the PlanOptimize output into a named function or local lambda, taking the slice of tsm1.FileStat as an argument.

There may be commonalities between tests (the creation of the DefaultPlanner, the calling of Plan for each level, the calling of PlanLevel, and even the assertions, perhaps) which will reduce redundant code. You may be able to have one test function that does all the cases; I haven't checked.

Then, pass in the original slice of FileStats. Next, call slices.Reverse on it and test again. This gets two tests for each set of FileStats and gets rid of lots of code.

You could even have a [][]tsm1.FileStat over which you iterate, running two tests (forward and reverse) on each []tsm1.FileStat.

Look here for an idea how to do this. There are others in the code base that may even be more analogous.
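A minimal sketch of that shape (the test name, the scenario data, and the body of check are placeholders; only slices.Reverse and the tsm1.FileStat slices come from the discussion above, and the surrounding test file is assumed to provide the imports):

func TestDefaultPlanner_PlanOptimize_Orderings(t *testing.T) {
	scenarios := [][]tsm1.FileStat{
		{
			{Path: "01-04.tsm1", Size: 2048 * 1024 * 1024},
			{Path: "01-05.tsm1", Size: 600 * 1024 * 1024},
		},
		// ... further scenarios ...
	}

	check := func(t *testing.T, fs []tsm1.FileStat) {
		// build a DefaultPlanner over fs, call Plan / PlanLevel / PlanOptimize,
		// and assert on the planned groups, as the existing tests do
	}

	for _, fs := range scenarios {
		check(t, fs)       // original order
		slices.Reverse(fs) // flip the file order in place
		check(t, fs)       // reversed order
	}
}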

@devanbenz (Author) commented:

@davidby-influx I've gone ahead and moved all the cases where we plan files for compaction into a single test case. I'm going to do the same with the tests where we do not plan them, as well.

@davidby-influx (Contributor) left a comment:

Looks great. I think you may either need formatting strings in your require. functions to print the testName argument, or perhaps testing.T.Run would simplify things. If I am wrong on both of these points, LMK and I will approve as is.


expectedNotFullyCompacted := func(cp *tsm1.DefaultPlanner, reasonExp string, generationCountExp int64, testName string) {
compacted, reason := cp.FullyCompacted()
require.Equal(t, reason, reasonExp, "fullyCompacted reason", testName)
Contributor:

Do we need a %s in the message argument to print the testName?


}

for _, test := range furtherCompactedTests {
Contributor:

Would testing.T.Run be better here? Sometimes it can complicate things, and if it doesn't help, there's no need to switch.

Interestingly, you can see a minimal example of testing.T.Run in the testify/require documentation, or in various places in our code.

Contributor:

For example: in the middleware

I should have mentioned this in my last review. Not a critical change, particularly if it complicates your test structure. But it gets rid of all the testName arguments to your require. methods (which may need formatting?).
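Roughly what that refactor looks like (a fragment; it assumes ffs is built from test.fs as in the existing tests, and the assertion on compacted is an assumption about what the helper checks):

for _, test := range furtherCompactedTests {
	test := test // copy the range variable for the closure (pre-Go 1.22 idiom)
	t.Run(test.name, func(t *testing.T) {
		// build the fake file store (ffs) from test.fs, as the existing tests do
		cp := tsm1.NewDefaultPlanner(ffs, tsdb.DefaultCompactFullWriteColdDuration)
		compacted, reason := cp.FullyCompacted()
		// with named subtests, failures are attributed to test.name automatically,
		// so it no longer needs to be threaded through require's message arguments
		require.Equal(t, test.expectedFullyCompactedReasonExp, reason)
		require.False(t, compacted)
	})
}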

@devanbenz (Author) replied:

> Looks great. I think you may either need formatting strings in your require. functions to print the testName argument, or perhaps testing.T.Run would simplify things. If I am wrong on both of these points, LMK and I will approve as is.

Ah yes, the t.Run is much cleaner. I've refactored to do it that way.

@davidby-influx (Contributor) left a comment:

One question/change. Sorry for the infinite review cycle on this one.

cp = tsm1.NewDefaultPlanner(ffs, tsdb.DefaultCompactFullWriteColdDuration)
expectedFullyCompacted(cp, test.expectedFullyCompactedReasonExp, test.name)
// Reverse test files and re-run tests
slices.Reverse(test.fs)
Contributor:

Do we also need to reverse the block counts here?

@devanbenz (Author) replied:

@davidby-influx ah yes, I believe so.
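Something along these lines, mirroring the excerpt above (blockCounts is a guess at the name of the parallel per-file block-count field, and ffs would need to be rebuilt over the reversed data):

// Reverse the file stats and the parallel per-file block counts together,
// so each file keeps the block count it was defined with.
slices.Reverse(test.fs)
slices.Reverse(test.blockCounts)
cp = tsm1.NewDefaultPlanner(ffs, tsdb.DefaultCompactFullWriteColdDuration)
expectedFullyCompacted(cp, test.expectedFullyCompactedReasonExp, test.name)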

@davidby-influx (Contributor) left a comment:

LGTM

aggressivePointsPerBlockCount := 0
filesUnderMaxTsmSizeCount := 0
for _, tsmFile := range gens[0].files {
if c.FileStore.BlockCount(tsmFile.Path, 1) >= tsdb.AggressiveMaxPointsPerBlock {
Member:

I think I understand why we don't need to check the points in all blocks. Can you explain why we are checking the BlockCount for block 1 and not block 0?

Member:

NVM, figured it out. BlockIterator is a Java-style iterator, and the index is the number of times Next gets called on it, so 1 is actually the first block.
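A toy illustration of that indexing convention (a hypothetical iterator interface, not the actual tsm1 BlockIterator API):

// With a Java-style iterator, "block n" is whatever the iterator is on after
// n calls to Next, so n == 1 is the first block rather than the second.
type blockIter interface {
	Next() bool
	PointCount() int // hypothetical accessor for the current block's point count
}

func nthBlockPointCount(it blockIter, n int) int {
	for i := 0; i < n; i++ {
		if !it.Next() {
			return 0 // fewer than n blocks in the file
		}
	}
	return it.PointCount()
}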

expectedgenerationCount int64
}

furtherCompactedTests := []PlanOptimizeTests{
Member:

Table-driven testing FTW!

Comment on lines +2174 to +2176
for _, f := range level4Groups[0] {
e.logger.Info("TSM optimized compaction on single generation running, increasing total points per block to 100_000.", zap.String("path", f))
}
Member:

This will be nice for determining when this is helping out, or if it is causing us issues.

Comment on lines +94 to +96
func SingleGenerationReason() string {
return fmt.Sprintf("not fully compacted and not idle because single generation with more than 2 files under %d GB and more than 1 file(s) under aggressive compaction points per block count (%d points)", int(MaxTSMFileSize/1048576000), AggressiveMaxPointsPerBlock)
}
Member:

I won't block the PR for this, but this is still a superfluous function. The fmt.Sprintf could have been used directly with the string var.
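One reading of that suggestion, computing the string once at package init instead of wrapping it in a function (the variable name is illustrative):

// Formatted once, reused everywhere the reason string is needed.
var singleGenerationReason = fmt.Sprintf(
	"not fully compacted and not idle because single generation with more than 2 files under %d GB and more than 1 file(s) under aggressive compaction points per block count (%d points)",
	int(MaxTSMFileSize/1048576000), AggressiveMaxPointsPerBlock,
)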

@gwossum (Member) left a comment:

LGTM

@devanbenz merged commit f04105b into master-1.x on Jan 14, 2025
9 checks passed