[Enterprise Search][Behavioral Analytics] Events ingest API #95027

Merged
merged 70 commits into from
Apr 12, 2023

Conversation

afoucret
Contributor

@afoucret afoucret commented Apr 4, 2023

Description:

In order to populate our analytics data streams, we need an API to collect events.
This API validates each event and emits it to a bulk processor, which persists the events into the events data streams.

  • A new REST action is added to the code to receive events.
curl -H "Content-Type: application/json" -XPOST -u "elastic-admin:elastic-password" "localhost:9200/_application/analytics/bar/event/pageview" -d '

{
  "session": {
    "id": "dghdjdk"
  },
  "user": {
    "id": "bdjhdkdjdl"
  },
  "page": {
    "url": "http://elastic.co/blog"
  }
}'
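
For callers that prefer Java over curl, the same request can be sketched with the JDK's `java.net.http` API. This is only an illustration: the class `EventRequestSketch` and its `build` helper are hypothetical, while the host, credentials, collection name (`bar`), and event type (`pageview`) come from the curl example above. The request is built but not sent.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper, not part of this PR.
final class EventRequestSketch {

    // Builds the POST request for one analytics event without sending it.
    // The collection and event type are inserted into the path as-is (no URL escaping).
    static HttpRequest build(String baseUrl, String collection, String eventType,
                             String user, String password, String jsonBody) {
        // Basic authentication, equivalent to curl's -u "user:password".
        String credentials = Base64.getEncoder()
            .encodeToString((user + ":" + password).getBytes(StandardCharsets.UTF_8));
        URI uri = URI.create(baseUrl + "/_application/analytics/" + collection + "/event/" + eventType);
        return HttpRequest.newBuilder(uri)
            .header("Content-Type", "application/json")
            .header("Authorization", "Basic " + credentials)
            .POST(HttpRequest.BodyPublishers.ofString(jsonBody))
            .build();
    }
}
```

Passing the result to `HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())` would send it to a running cluster.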

Checklist

  • The event is sent to a log file where it can be consumed by Filebeat
  • Unit Tests
  • Licence check
  • Role for the intake API
  • YAML rest tests
  • API spec

@afoucret afoucret requested a review from jimczi April 11, 2023 17:11
Comment on lines +26 to +48
public static final Setting<TimeValue> FLUSH_DELAY_SETTING = Setting.timeSetting(
    Strings.format("%s.%s", SETTING_ROOT_PATH, "flush_delay"),
    TimeValue.timeValueSeconds(10),
    TimeValue.timeValueSeconds(1),
    TimeValue.timeValueSeconds(60),
    Setting.Property.NodeScope
);

public static final Setting<Integer> MAX_NUMBER_OF_EVENTS_PER_BULK_SETTING = Setting.intSetting(
    Strings.format("%s.%s", SETTING_ROOT_PATH, "max_events_per_bulk"),
    1000,
    1,
    10000,
    Setting.Property.NodeScope
);

public static final Setting<Integer> MAX_NUMBER_OF_RETRIES_SETTING = Setting.intSetting(
    Strings.format("%s.%s", SETTING_ROOT_PATH, "max_number_of_retries"),
    3,
    0,
    5,
    Setting.Property.NodeScope
);
Contributor Author

ℹ️ These settings make the bulk processor configurable.
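
To illustrate what the three numbers in each declaration mean (default, minimum, maximum), here is a minimal stand-alone sketch of a bounded integer setting. It does not use the Elasticsearch `Setting` class; `BoundedIntSetting` and the key `example.max_events_per_bulk` are made up for the example, since the value of `SETTING_ROOT_PATH` is not shown in this snippet.

```java
import java.util.Map;

// Hypothetical stand-in for a bounded integer setting; not the Elasticsearch Setting class.
final class BoundedIntSetting {
    private final String key;
    private final int defaultValue;
    private final int min;
    private final int max;

    BoundedIntSetting(String key, int defaultValue, int min, int max) {
        this.key = key;
        this.defaultValue = defaultValue;
        this.min = min;
        this.max = max;
    }

    // Returns the configured value, or the default when the key is absent.
    // Values outside [min, max] are rejected.
    int get(Map<String, String> settings) {
        String raw = settings.get(key);
        if (raw == null) {
            return defaultValue;
        }
        int value = Integer.parseInt(raw);
        if (value < min || value > max) {
            throw new IllegalArgumentException(
                "value " + value + " for [" + key + "] must be between " + min + " and " + max
            );
        }
        return value;
    }
}
```

With the bounds used for `max_events_per_bulk` above (default 1000, range 1 to 10000), a configured value of 20000 would be rejected rather than silently clamped in this sketch.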

@Override
protected void doClose() {
// Ensure the bulk processor is closed, so pending requests are flushed.
bulkProcessor.close();
Contributor Author

ℹ️ Ensure the bulk processor events are drained when closing.

Contributor

I think it should attempt to execute the remaining 'under construction' bulk request, but I agree it would be great to double-check this.
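
The behavior being discussed (pending events are sent when the component closes) can be sketched without Elasticsearch as below. This is not `BulkProcessor2`; `BufferingEmitter` is a hypothetical stand-in that batches strings and flushes whatever is still buffered in `close()`. Whether the real bulk processor does the same is exactly what needs double-checking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical stand-in for a bulk processor; batches events and flushes on close.
final class BufferingEmitter implements AutoCloseable {
    private final int maxEventsPerBatch;
    private final Consumer<List<String>> sink;
    private List<String> pending = new ArrayList<>();

    BufferingEmitter(int maxEventsPerBatch, Consumer<List<String>> sink) {
        this.maxEventsPerBatch = maxEventsPerBatch;
        this.sink = sink;
    }

    // Buffers the event and sends a batch as soon as it is full.
    synchronized void emit(String event) {
        pending.add(event);
        if (pending.size() >= maxEventsPerBatch) {
            flush();
        }
    }

    // Sends the current batch, if any, and starts a new one.
    private void flush() {
        if (pending.isEmpty()) {
            return;
        }
        List<String> batch = pending;
        pending = new ArrayList<>();
        sink.accept(batch);
    }

    // Drains the partially filled batch so no buffered event is lost on shutdown.
    @Override
    public synchronized void close() {
        flush();
    }
}
```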

Comment on lines +188 to +193
public List<Setting<?>> getSettings() {
    return List.of(
        BulkProcessorConfig.MAX_NUMBER_OF_EVENTS_PER_BULK_SETTING,
        BulkProcessorConfig.FLUSH_DELAY_SETTING,
        BulkProcessorConfig.MAX_NUMBER_OF_RETRIES_SETTING
    );
Contributor Author

ℹ️ Event ingest related settings.

Contributor Author

@afoucret afoucret Apr 11, 2023

ℹ️ The changes made to this class are syntactic sugar.

) {
this.eventCollectionName = eventCollectionName;
this.eventType = eventType;
this.debug = debug;
Contributor Author

@kderusso I kept debug instead of suggestions like verbose.

Indeed, it is intended to be used with the debug flag of the JS tag, which will output events processed by the server to the browser's dev console (a feature that I plan to implement later).

@afoucret afoucret requested a review from pgomulka April 11, 2023 18:44
Member

@kderusso kderusso left a comment

Nice work!

}

try {
if (eventType.toLowerCase(Locale.ROOT).equals(eventType) == false) throw new IllegalArgumentException();
Member

Nitpick: Curly brackets

Contributor Author

Fixed! Thank you.
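
For reference, the check with braces could look like the sketch below. The wrapper class `EventTypeCheck`, the method name, and the exception message are assumptions added for illustration; only the condition comes from the snippet above.

```java
import java.util.Locale;

// Hypothetical wrapper around the lowercase check from the reviewed snippet.
final class EventTypeCheck {

    // Returns the event type unchanged when it is already lowercase, otherwise throws.
    static String requireLowerCase(String eventType) {
        if (eventType.toLowerCase(Locale.ROOT).equals(eventType) == false) {
            // The message is an assumption; the original throws without one.
            throw new IllegalArgumentException("event type must be lowercase: " + eventType);
        }
        return eventType;
    }
}
```

Using `Locale.ROOT` keeps the comparison independent of the JVM's default locale.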

Member

@carlosdelest carlosdelest left a comment

Looks really good to me. I just have a question on privileges for the POST action - could they be index privileges instead of manage cluster type privileges?

Very good job on tests! 💯


// This number must be incremented when we make changes to built-in templates.
- public static final int REGISTRY_VERSION = 1;
+ protected static final int REGISTRY_VERSION = 1;
Member

Why protected? Are we expecting subclasses for AnalyticsTemplateRegistry?

Contributor Author

I have another PR with a class in the same package that uses this variable.
I will probably move these into a file that holds all these constants.

Member

> I have another PR with a class in the same package that uses this variable.

Maybe use default package visibility then? Sorry for nagging, I see protected as a strong signal that it is used in a subclass. Let's use default package visibility if that's the use case 👍

Contributor

@pgomulka pgomulka left a comment

LGTM,
I left questions, but feel free to merge anyway once CI is green

}

public BulkProcessor2 create(Client client) {
return BulkProcessor2.builder(client::bulk, new BulkProcessorListener(), client.threadPool())
Contributor

This looks good, but do we want to have control over whether indexing is enabled or not?
Let's say a user is sending a lot of requests, which results in analyticsEventEmitter continuing to create index requests. What if the cluster is struggling at the moment? Do we want to be able to turn that off (which would mean that data is lost)?

Contributor Author

I will probably implement it in the AnalyticsEventEmitter.
If you don't mind, I will do it in a follow-up PR so that this PR can be merged and unblock other work streams that rely on the feature.

Contributor

Yes, a follow-up is good. But I will leave it up to the Enterprise Search team to decide if this is even needed at all. I thought that it might be good as an emergency switch-off button.

We have that for deprecation logging, but we started with that feature turned off, so we needed the switch to enable it.

Contributor Author

This is a good call and I will add this feature. It will probably be part of the upcoming PR that adds metering to the event ingest API.
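
A minimal sketch of what such an emergency switch could look like, assuming a simple boolean flag that is checked before each event is handed over. `SwitchableEmitter` and its methods are hypothetical, and the in-memory list only stands in for the bulk processor; the actual design is left to the follow-up PR.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical emitter with an on/off switch; not the AnalyticsEventEmitter from this PR.
final class SwitchableEmitter {
    private final AtomicBoolean enabled = new AtomicBoolean(true);
    // Stand-in for the bulk processor: accepted events are simply recorded.
    private final List<String> accepted = new ArrayList<>();

    // Turns event indexing on or off at runtime.
    void setEnabled(boolean value) {
        enabled.set(value);
    }

    // Returns true when the event was accepted, false when it was dropped.
    synchronized boolean emit(String event) {
        if (enabled.get() == false) {
            // Dropping here means the event is lost, which is the trade-off raised above.
            return false;
        }
        accepted.add(event);
        return true;
    }

    // Returns a snapshot of the events accepted so far.
    synchronized List<String> accepted() {
        return List.copyOf(accepted);
    }
}
```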

@afoucret
Contributor Author

@elasticsearchmachine run elasticsearch-ci/part-2

@afoucret afoucret merged commit 675163b into elastic:main Apr 12, 2023
ywangd added a commit to ywangd/elasticsearch that referenced this pull request Apr 13, 2023
The cluster privileges are listed in alphabetic order on get-privileges
response. This PR ensures that is matched in doc.

Relates: elastic#95027
Resolves: elastic#95210
ywangd added a commit that referenced this pull request Apr 25, 2023
The cluster privileges are listed in alphabetic order on get-privileges
response. This PR ensures that is matched in doc.

Relates: #95027
Resolves: #95210
Labels
:EnterpriseSearch/Application (Enterprise Search); external-contributor (Pull request authored by a developer outside the Elasticsearch team); >feature; Team:Enterprise Search (Meta label for Enterprise Search team); v8.8.0
9 participants