[DLM] Introduce default rollover cluster setting & expose it via APIs #94240

Merged
merged 46 commits into from
Mar 7, 2023

Conversation

gmarouli (Contributor) commented Mar 1, 2023

In this PR we are introducing the following:

  • A cluster setting that configures the default rollover used by DLM.
  • A new query parameter, include_defaults, for GET /data_streams, GET /_index_template and GET /_component_template, which exposes the default configuration related to these endpoints.
  • The default rollover configuration currently in place is exposed by the above endpoints when the include_defaults parameter is true.
  • DLM now uses the rollover configuration from the cluster setting.
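
As a rough illustration of the include_defaults parameter, the sketch below appends include_defaults=true to the endpoint paths named above. It only builds request URIs and does not contact a cluster. The class name, the helper withIncludeDefaults, and the http://localhost:9200 base address are assumptions made for this example, and the paths are copied as written in this description.

```java
import java.net.URI;
import java.util.List;

public class IncludeDefaultsExample {
    // Hypothetical base address; adjust for a real cluster.
    static final String BASE = "http://localhost:9200";

    // Appends include_defaults=true, respecting any existing query string.
    static String withIncludeDefaults(String path) {
        String separator = path.contains("?") ? "&" : "?";
        return path + separator + "include_defaults=true";
    }

    public static void main(String[] args) {
        // Paths as named in the PR description.
        List<String> paths = List.of("/data_streams", "/_index_template", "/_component_template");
        for (String path : paths) {
            URI uri = URI.create(BASE + withIncludeDefaults(path));
            System.out.println("GET " + uri);
        }
    }
}
```

Without the parameter the endpoints would be called as before, so existing clients are not affected by the new defaults output.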

Part of: #93596

Docs preview:

@elasticsearchmachine added the Team:Data Management label and removed the needs:triage label Mar 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

@gmarouli added the >non-issue label and removed the Team:Data Management label Mar 6, 2023
@elasticsearchmachine added the Team:Data Management label Mar 6, 2023
Contributor

@andreidan andreidan left a comment

Thanks for working on this Mary

This looks great, I left some minor suggestions

@gmarouli gmarouli requested a review from andreidan March 6, 2023 13:31
@gmarouli
Contributor Author

gmarouli commented Mar 6, 2023

Thanks for the prompt review @andreidan. Along with implementing the feedback from your review, I added a test for the setting parser in DataLifecycle, which I accidentally removed during the development of this feature.

@elasticsearchmachine
Collaborator

Hi @gmarouli, I've created a changelog YAML for you.

@gmarouli
Contributor Author

gmarouli commented Mar 6, 2023

> Thanks for the prompt review @andreidan. Along with implementing the feedback from your review, I added a test for the setting parser in DataLifecycle, which I accidentally removed during the development of this feature.

Ignore the comment above ^^. The test was in a different class. :face_palm: Fixing it.

Contributor

@andreidan andreidan left a comment

Thanks for iterating on this Mary

Left a few minor suggestions, but this is almost ready to go 🚀

* We require the default rollover conditions to have min_docs set to a non-negative number to avoid empty indices
* and to have at least one MAX condition set to ensure that the rollover will be triggered.
*/
static class RolloverConditionsValidator implements Setting.Validator<RolloverConditions> {
Contributor

I'm in two minds about this validator.
The setting is meant to allow us to make changes; it won't be something users configure.

I'm tempted to say we shouldn't add this validation, and should only validate that the setting is not empty (which should be covered by RolloverConditions.parseSetting?).
What do you think?

Note: If we do keep it, I think it shouldn't be restricted to min_docs. There are other min_* conditions that could be valid to avoid rolling over empty indices.

Contributor Author

@gmarouli gmarouli Mar 7, 2023

> The setting is meant to allow us to make changes, it'll not be something the users will configure.

That is true, but I think even we can make a mistake, so avoiding the creation of empty indices is a safe bet. I see this validation as helping us, not blocking us.

> If we do keep it I think it shouldn't be restricted to min_docs. There are other min_* conditions that could be valid to avoid rolling over empty indices

That is true. I did consider it, but I had some concerns:

  • min_age doesn't protect us from empty indices: an index can have existed for 7 days and still be empty.
  • min_size & min_primary_shard_size concern me because even an empty index has a non-zero footprint, so if you set the threshold wrongly you might still roll over an empty index (if I am not mistaken).
  • min_primary_shard_docs is, I think, a good addition :).

I would suggest that we keep the validation but also accept min_primary_shard_docs as an option. What do you think? I can't think of a use case that would not want at least one of the two.

Contributor Author

Discussed offline. We will remove the checks for min_* conditions.
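
A minimal sketch of what the validation could look like once the min_* checks are dropped, keeping only the requirement from the Javadoc above that at least one MAX condition is set. The Conditions record is a hypothetical, simplified stand-in written for this example and is not the real RolloverConditions class; its field names and the error message are assumptions.

```java
import java.util.Objects;
import java.util.stream.Stream;

public class DefaultRolloverValidation {
    // Hypothetical, simplified stand-in for rollover conditions; null means "not set".
    record Conditions(String maxAge, String maxSize, String maxPrimaryShardSize,
                      Long maxDocs, Long maxPrimaryShardDocs, Long minDocs) {

        // True when at least one max_* condition is configured.
        boolean hasMaxCondition() {
            return Stream.of(maxAge, maxSize, maxPrimaryShardSize, maxDocs, maxPrimaryShardDocs)
                .anyMatch(Objects::nonNull);
        }
    }

    // Rejects configurations that could never trigger a rollover.
    static void validate(Conditions conditions) {
        if (!conditions.hasMaxCondition()) {
            throw new IllegalArgumentException(
                "The default rollover must have at least one max_* condition set");
        }
    }

    public static void main(String[] args) {
        // Accepted: max_age is set, so a rollover can eventually be triggered.
        validate(new Conditions("7d", null, null, null, null, 1L));
        try {
            // Rejected: only a min_* condition is present.
            validate(new Conditions(null, null, null, null, null, 1L));
        } catch (IllegalArgumentException e) {
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}
```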

Contributor

I think the whole validator could be removed, leaving RolloverConditions#parser to make sure there's at least one condition, regardless of what that is.
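
To make the alternative concrete, here is a rough sketch of a parser that enforces "at least one condition, whatever it is" on its own, so no separate validator is needed. It is not the actual RolloverConditions parser: the class name, the comma-separated key=value input format, and the error messages are all assumptions made for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class RolloverSettingParser {
    // Parses "key=value,key=value" into an ordered map and rejects an empty result.
    // The input format is an assumption for this sketch.
    static Map<String, String> parse(String setting) {
        Map<String, String> conditions = new LinkedHashMap<>();
        for (String part : setting.split(",")) {
            String trimmed = part.trim();
            if (trimmed.isEmpty()) {
                continue; // tolerate stray commas and blank input
            }
            String[] keyValue = trimmed.split("=", 2);
            if (keyValue.length != 2 || keyValue[0].isBlank() || keyValue[1].isBlank()) {
                throw new IllegalArgumentException("Invalid condition: '" + trimmed + "'");
            }
            conditions.put(keyValue[0].trim(), keyValue[1].trim());
        }
        // The only structural requirement: at least one condition of any kind.
        if (conditions.isEmpty()) {
            throw new IllegalArgumentException("At least one rollover condition is required");
        }
        return conditions;
    }

    public static void main(String[] args) {
        System.out.println(parse("max_age=7d,min_docs=1"));
    }
}
```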

Contributor Author

Opened #94447

@gmarouli
Contributor Author

gmarouli commented Mar 7, 2023

@elasticmachine run elasticsearch-ci/part-1. I believe the failure is related (#94281); I have contacted @HiDAl to verify the assumption and fix it.

@gmarouli gmarouli merged commit fe20d92 into elastic:main Mar 7, 2023
@gmarouli gmarouli deleted the dlm-rollover-cluster-setting branch December 10, 2024 07:25
Labels
>feature, Team:Data Management, v8.8.0
3 participants