Implicitly rollover data streams / aliases based on max_primary_shard_docs #94065

Merged

martijnvg
Member

@martijnvg martijnvg commented Feb 23, 2023

Implicitly roll over a data stream or alias if the primary shard doc count of the most recent backing index is at or above 200M (which seems like a good catch-all upper bound).
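
As a rough illustration of the rule described above, the check can be sketched as follows. This is a minimal, hypothetical sketch and not the code from this PR: the class name, the method name, and the use of a plain list of per-shard doc counts are all assumptions made for illustration. It uses `>=` to match the "at or above 200M" wording.

```java
import java.util.List;

public class ImplicitRolloverSketch {
    // 200M, the upper bound mentioned in the PR description.
    static final long MAX_PRIMARY_SHARD_DOCS = 200_000_000L;

    /**
     * Hypothetical helper: returns true if any primary shard of the most
     * recent backing index holds at least MAX_PRIMARY_SHARD_DOCS documents.
     */
    static boolean shouldRollover(List<Long> primaryShardDocCounts) {
        long max = primaryShardDocCounts.stream()
            .mapToLong(Long::longValue)
            .max()
            .orElse(0L); // an index with no shard stats never triggers a rollover here
        return max >= MAX_PRIMARY_SHARD_DOCS;
    }

    public static void main(String[] args) {
        // Largest shard is just under the limit.
        System.out.println(shouldRollover(List.of(150_000_000L, 199_999_999L))); // prints false
        // One shard has reached the limit.
        System.out.println(shouldRollover(List.of(50_000_000L, 200_000_000L)));  // prints true
    }
}
```

Only the largest primary shard matters in this sketch, so a single oversized shard is enough to trigger the rollover even if the others are small.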

Closes #87246

@martijnvg added the :Data Management/ILM+SLM, :StorageEngine/TSDB, and >enhancement labels Mar 6, 2023
@martijnvg martijnvg marked this pull request as ready for review March 6, 2023 15:09
@elasticsearchmachine added the Team:Data Management and Team:Analytics labels Mar 6, 2023
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytics-geo (Team:Analytics)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-data-management (Team:Data Management)

Contributor

@gmarouli gmarouli left a comment

LGTM :) just a minor suggestion.

        if (targetIsTsdb && currentMaxPrimaryShardDocs > MAX_PRIMARY_SHARD_DOCS_FOR_TSDB) {
            Map<String, Condition<?>> conditions = new HashMap<>(rolloverRequest.getConditions().getConditions());
            conditions.put(MaxPrimaryShardDocsCondition.NAME, new MaxPrimaryShardDocsCondition(MAX_PRIMARY_SHARD_DOCS_FOR_TSDB));
            rolloverRequest.setConditions(new RolloverConditions(conditions));
Contributor

Nit: we also have a builder that copies from an existing instance to simplify this code:

            rolloverRequest.setConditions(
                RolloverConditions.newBuilder(rolloverRequest.getConditions())
                    .addMaxPrimaryShardDocsCondition(MAX_PRIMARY_SHARD_DOCS_FOR_TSDB)
                    .build()
            );

Member Author

I think that addMaxPrimaryShardDocsCondition(...) fails if a max primary shard doc count condition has already been defined? This can happen if it has been defined with a value higher than MAX_PRIMARY_SHARD_DOCS_FOR_TSDB.

Contributor

Hm... ah damn, I didn't think of that. I will add a note to fix it; that was not my intention with this builder :p. Thanks @martijnvg
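
The concern raised in this thread can be illustrated with a small, self-contained sketch. Everything here is hypothetical: `StrictBuilder` is a stand-in that rejects an already-defined condition (the behavior the author suspects), and conditions are simplified to a `Map<String, Long>`. It is not the real `RolloverConditions` builder, whose exact behavior should be checked in the source.

```java
import java.util.HashMap;
import java.util.Map;

public class ConditionOverrideSketch {
    static final String MAX_PRIMARY_SHARD_DOCS = "max_primary_shard_docs";

    /** Hypothetical builder that refuses to redefine an existing condition. */
    static final class StrictBuilder {
        private final Map<String, Long> conditions;

        StrictBuilder(Map<String, Long> existing) {
            this.conditions = new HashMap<>(existing);
        }

        StrictBuilder addMaxPrimaryShardDocs(long value) {
            if (conditions.containsKey(MAX_PRIMARY_SHARD_DOCS)) {
                throw new IllegalArgumentException(MAX_PRIMARY_SHARD_DOCS + " condition is already set");
            }
            conditions.put(MAX_PRIMARY_SHARD_DOCS, value);
            return this;
        }

        Map<String, Long> build() {
            return Map.copyOf(conditions);
        }
    }

    /** Copy-and-put approach: silently replaces any existing value. */
    static Map<String, Long> withOverride(Map<String, Long> existing, long value) {
        Map<String, Long> copy = new HashMap<>(existing);
        copy.put(MAX_PRIMARY_SHARD_DOCS, value);
        return copy;
    }

    public static void main(String[] args) {
        // A user-defined condition above the 200M limit.
        Map<String, Long> existing = Map.of(MAX_PRIMARY_SHARD_DOCS, 500_000_000L);

        // The map-based approach lowers the limit to 200M.
        System.out.println(withOverride(existing, 200_000_000L).get(MAX_PRIMARY_SHARD_DOCS));

        // The strict builder throws instead of overriding.
        try {
            new StrictBuilder(existing).addMaxPrimaryShardDocs(200_000_000L).build();
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Under these assumptions, copying the map and calling `put` is the safer choice when the implicit limit must win over a higher user-defined value.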

@martijnvg changed the title from "Implicitly rollover tsdb data streams based on max_primary_shard_docs" to "Implicitly rollover data streams / aliases based on max_primary_shard_docs" Mar 8, 2023
@martijnvg
Member Author

@gmarouli I updated the PR to always implicitly roll over data streams (and aliases) if a shard of the most recent index has 200M or more documents.

@elasticsearchmachine
Collaborator

Hi @martijnvg, I've created a changelog YAML for you.

@gmarouli
Contributor

gmarouli commented Mar 8, 2023

> @gmarouli I updated the PR to always implicitly roll over data streams (and aliases) if a shard of the most recent index has 200M or more documents.

I think it makes sense. Passing this number is almost guaranteed to cause problems, right?

@martijnvg
Member Author

> Passing this number is almost guaranteed to cause problems, right?

Yes. I think this will cause issues at search time: shard-level searches get slower without a real way of speeding them up, and certain bitset cache entries get uncomfortably large.

@martijnvg martijnvg merged commit 1c35212 into elastic:main Mar 8, 2023
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 14, 2023
martijnvg added a commit that referenced this pull request Jul 14, 2023
This was introduced via #94065
Relates to #87246
elasticsearchmachine pushed a commit that referenced this pull request Jul 14, 2023
DaveCTurner added a commit to DaveCTurner/elasticsearch that referenced this pull request Jul 20, 2023
In elastic#87246 we describe some reasons why it's a good idea to limit the doc
count of a shard, and we started to do so in elastic#94065, so this commit
adjusts the sizing guidance docs to match.
DaveCTurner added a commit that referenced this pull request Jul 20, 2023
In #87246 we describe some reasons why it's a good idea to limit the doc
count of a shard, and we started to do so in #94065, so this commit
adjusts the sizing guidance docs to match.
elasticsearchmachine pushed a commit that referenced this pull request Jul 20, 2023
In #87246 we describe some reasons why it's a good idea to limit the doc
count of a shard, and we started to do so in #94065, so this commit
adjusts the sizing guidance docs to match.
Labels
:Data Management/ILM+SLM, >enhancement, :StorageEngine/TSDB, Team:Analytics, Team:Data Management, v8.8.0
Development

Successfully merging this pull request may close these issues.

Add a max_primary_shard_docs to default ILM policies
4 participants