Support for ECSLayout for elasticsearch-ahc and/or elasticsearch-jest in combination with data streams #84

Open
thaarbach opened this issue Feb 24, 2023 · 10 comments
@thaarbach
Contributor

thaarbach commented Feb 24, 2023

Description
Support ECSLayout with log4j2-elasticsearch-ahc and/or log4j2-elasticsearch-jest

Why: When the appender is used in a centralized logging setup together with elastic-apm in a clustered environment, ECSLayout would make the appender much easier to set up. Adding fields provided by elastic-apm (e.g. client.ip, trace.id, transaction.id, http.*, error.*) with VirtualProperties such as <VirtualProperty name="client.ip" value="$${ctx:client.ip}"/> doesn't work.

We also tried to set up the appender with JestHttp and data streams, which doesn't work either. Maybe it's a configuration issue.

Configuration ahc

<Appenders>
    <Elasticsearch name="elasticsearch">
	<JacksonJsonLayout>
		<JacksonMixIn targetClass="org.apache.logging.log4j.core.LogEvent" mixInClass="org.appenders.log4j2.elasticsearch.json.jackson.LogEventJacksonEcsJsonMixIn"/>
		<NonEmptyFilter/>
		<VirtualProperty name="host.name" value="$${sys:hostName}.example.de"/>
		<VirtualProperty name="service.version" value="$${sys:elastic.apm.service_version}"/>
		<VirtualProperty name="service.name" value="$${sys:elastic.apm.service_name}"/>
		<VirtualProperty name="data_stream.type" value="logs"/>
		<VirtualProperty name="data_stream.dataset" value="$${sys:elastic.apm.service_name}.example.de"/>
		<VirtualProperty name="data_stream.namespace" value="$${sys:elastic.apm.environment}.example.de"/>
		<VirtualProperty name="client.ip" value="$${ctx:client.ip}"/>
		<PooledItemSourceFactory poolName="itemPool"
									itemSizeInBytes="1024"
									maxItemSizeInBytes="8192"
									initialPoolSize="500"
									monitored="true"
									monitorTaskInterval="10000"
									resizeTimeout="500">
			<UnlimitedResizePolicy resizeFactor="0.6"/>
		</PooledItemSourceFactory>
	</JacksonJsonLayout>

	<AsyncBatchDelivery batchSize="500" deliveryInterval="5000">
		<IndexTemplate apiVersion="8" name="log4j2-${sys:elastic.apm.service_name}" path="classpath:composableIndexTemplate.json"/>
		<ILMPolicy name="logs" createBootstrapIndex="false">
			{}
		</ILMPolicy>
		<AHCHttp name="http-main"
					connTimeout="500"
					readTimeout="30000"
					gzipCompression="true"
					maxTotalConnections="8"
					serverUris="http://localhost:9200">
			<PooledItemSourceFactory poolName="batchPool"
										itemSizeInBytes="5120000"
										initialPoolSize="10"
										resizeTimeout="500">
				<UnlimitedResizePolicy resizeFactor="0.70"/>
			</PooledItemSourceFactory>
			<ElasticsearchDataStream />
			<BatchLimitBackoffPolicy maxBatchesInFlight="4"/>
			<ServiceDiscovery
									refreshInterval="5000"
									configPolicies="serverList">
			</ServiceDiscovery>
		</AHCHttp>
	</AsyncBatchDelivery>
    </Elasticsearch>
</Appenders>

Configuration JestHttp

<Appenders>
    <Elasticsearch name="elasticsearch">
	<ECSLayout serviceName="${sys:elastic.apm.service_name}" eventDataset="${sys:elastic.apm.service_name}.log">
		<KeyValuePair key="host.name" value="${sys:hostName}.example.de"/>
		<KeyValuePair key="service.version" value="${sys:elastic.apm.service_version}"/>
		<KeyValuePair key="data_stream.type" value="logs"/>
		<KeyValuePair key="data_stream.dataset" value="${sys:elastic.apm.service_name}"/>
		<KeyValuePair key="data_stream.namespace" value="${sys:elastic.apm.environment}"/>
	</ECSLayout>
	<IndexName indexName="log4j2-${sys:elastic.apm.service_name}"/>
	<ThresholdFilter level="INFO" onMatch="ACCEPT"/>

	<AsyncBatchDelivery deliveryInterval="5000" batchSize="500" shutdownDelayMillis="10000">
		<IndexTemplate apiVersion="8" name="log4j2-${sys:elastic.apm.service_name}" path="classpath:composableIndexTemplate.json"/>
		<ILMPolicy name="logs" createBootstrapIndex="false">
			{}
		</ILMPolicy>
		<JestHttp serverUris="http://localhost:9200" dataStreamsEnabled="true"/>
		<AppenderRefFailoverPolicy>
			<AppenderRef ref="stderr"/>
		</AppenderRefFailoverPolicy>
	</AsyncBatchDelivery>
    </Elasticsearch>
</Appenders>

Additional
ILMPolicy with createBootstrapIndex only works if an empty template is provided:

<ILMPolicy name="logs" createBootstrapIndex="false">
    {}
</ILMPolicy>
@thaarbach
Contributor Author

Got it working with JestHttp. Forgot to change the dependency 🙈

@rfoltyns
Owner

I'm glad you got it working 👏

Watch out for the double $ in dynamic VirtualProperties. They need the dynamic="true" flag to resolve correctly:

<VirtualProperty name="client.ip" value="$${ctx:client.ip}" dynamic="true"/>

.. and ctx doesn't work with AsyncLogger - that's due to Log4j2 not copying over thread context vars in async mode.

Also, I highly recommend AHCHttp. It's much more performant and has a much lower footprint than JestHttp. It should work with the netty-all jar. The AHC module has become the new "tip of the spear". It will be the main focus of further development and part of its code will become the backbone of the http layer in 2.0.

I hacked the HC example to use your configuration. After a few tweaks it worked like a charm (mappings are still missing, but I'm sure you'll figure it out)

Try

mvn clean install && java -jar -Delastic.apm.environment=apm-test -Delastic.apm.service_name=elasticsearch-ahc log4j2-elasticsearch-hc-springboot/target/log4j2-elasticsearch-hc-springboot-0.0.1-SNAPSHOT.jar

and then

curl -XPOST -H 'Content-Type: application/json' -d '{}' http://localhost:9200/log4j2-elasticsearch-ahc/_search | jq

on this branch

@thaarbach
Contributor Author

Hey Rafal,

Ah, I thought the double $ was needed for escaping. I got it from https://github.com/rfoltyns/log4j2-elasticsearch/tree/master/log4j2-elasticsearch-core#virtual-properties

> .. and ctx doesn't work with AsyncLogger - that's due to Log4j2 not copying over thread context vars in async mode.

If I understand the Log4j 2.x system property log4j2.isThreadContextMapInheritable correctly, child threads inherit the Thread Context Map. This is explained here: https://logging.apache.org/log4j/2.x/manual/thread-context.html#configuration. But maybe I am wrong.

> Also, I highly recommend AHCHttp. It's much more performant and has a much lower footprint than JestHttp.

That's why I tried AHCHttp first ;-).

@rfoltyns
Owner

It works as an escape char, yes. The first $ is replaced somewhere around loading the xml file - Log4j2 goodness - so only one remains and, if VirtualProperty.dynamic=false, it will be replaced while building the serialiser at this line in JacksonSerializer.

> child threads inherit the Thread Context Map

I never got ctx working with StrSubstitutor. If values are copied, they're not copied to where VirtualProperty would expect them.

@rfoltyns
Owner

rfoltyns commented Mar 1, 2023

@thaarbach Is there anything else we can address regarding the original issue? Seems like the code - at least in this repository :) - works as advertised

@thaarbach
Contributor Author

@rfoltyns I'm using AsyncLoggers and set some information, e.g. the current user.id, in the ThreadContext (in our case MDC, because we use SLF4J), and it is indexed when log4j2.isThreadContextMapInheritable=true. Maybe there is some magic done by SLF4J.

> Is there anything else we can address regarding the original issue? Seems like the code - at least in this repository :) - works as advertised

It would be nice if ECSLayout also worked with elasticsearch-ahc.

@rfoltyns
Owner

rfoltyns commented Mar 2, 2023

Is the log4j2-elasticsearch-examples branch I mentioned above not working as you expect?

@thaarbach
Contributor Author

thaarbach commented Mar 2, 2023

> Is the log4j2-elasticsearch-examples branch I mentioned above not working as you expect?

I haven't tried it yet. At the moment I'm fighting with ingest pipelines, because they aren't executed on index requests.

Are ingest pipelines supported per request?

@rfoltyns
Owner

rfoltyns commented Mar 2, 2023

I never played around with these, tbh. The Elasticsearch docs mention the index.default_pipeline setting. See what happens once you provide it in the index template or component template settings. With the data stream and index template setup provided by this plugin, it should work nicely..?
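
For illustration, a minimal sketch of what providing index.default_pipeline through a composable index template could look like with plain curl. The cluster URL, the template name and pattern (log4j2-elasticsearch-ahc, borrowed from the example run earlier in the thread) and the pipeline name my_pipeline are assumptions, not something this plugin prescribes. The same template.settings entry could presumably live in the composableIndexTemplate.json that IndexTemplate loads from the classpath.

```shell
#!/usr/bin/env bash
# Sketch only: ES_URL, TEMPLATE and PIPELINE are assumed values, adjust to your setup.
ES_URL="${ES_URL:-http://localhost:9200}"
TEMPLATE="${TEMPLATE:-log4j2-elasticsearch-ahc}"
PIPELINE="${PIPELINE:-my_pipeline}"

# Composable index template body: a data stream template whose backing
# indices run $PIPELINE by default on every index request.
template_body() {
  cat <<JSON
{
  "index_patterns": ["${TEMPLATE}*"],
  "data_stream": {},
  "template": {
    "settings": {
      "index.default_pipeline": "${PIPELINE}"
    }
  }
}
JSON
}

# PUT the template (nothing is sent until this function is called).
# The pipeline should exist before documents are indexed.
put_template() {
  template_body | curl -sS -XPUT -H 'Content-Type: application/json' \
    --data-binary @- "${ES_URL}/_index_template/${TEMPLATE}"
}
```

Calling put_template replaces any existing template of the same name, so mappings and other settings would have to be included in the body as well.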

I'll look into Pipeline API and see if there's a possibility to properly support it in 1.7

@thaarbach
Contributor Author

Got it working. The trick is to set the pipeline as index.final_pipeline and force a reindex or rollover.

Old data can be updated with POST my_data_stream/_update_by_query?pipeline=my_pipeline

Now I'm able to enrich the log entries with geo data and decode URLs :-)
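
The steps described in this comment can be sketched with curl roughly as follows. This assumes the index template already sets index.final_pipeline to the pipeline name. my_data_stream and my_pipeline are the placeholder names used above, the cluster URL is assumed, and the processor fields (client.ip, url.original) are guesses at what a geo and URL-decoding pipeline might use, not the actual pipeline from this setup.

```shell
#!/usr/bin/env bash
# Sketch only: names and fields are placeholders/assumptions.
ES_URL="${ES_URL:-http://localhost:9200}"
DATA_STREAM="${DATA_STREAM:-my_data_stream}"
PIPELINE="${PIPELINE:-my_pipeline}"

# Hypothetical pipeline: GeoIP lookup on client.ip and URL-decoding of url.original.
pipeline_body() {
  cat <<'JSON'
{
  "description": "Enrich log entries with geo data and decode URLs",
  "processors": [
    { "geoip":     { "field": "client.ip", "target_field": "client.geo", "ignore_missing": true } },
    { "urldecode": { "field": "url.original", "ignore_missing": true } }
  ]
}
JSON
}

# 1. Create or update the ingest pipeline.
put_pipeline() {
  pipeline_body | curl -sS -XPUT -H 'Content-Type: application/json' \
    --data-binary @- "${ES_URL}/_ingest/pipeline/${PIPELINE}"
}

# 2. Roll over so a new backing index is created from the current template settings.
rollover() {
  curl -sS -XPOST "${ES_URL}/${DATA_STREAM}/_rollover"
}

# 3. Re-run the pipeline over documents indexed before the change.
update_old_data() {
  curl -sS -XPOST "${ES_URL}/${DATA_STREAM}/_update_by_query?pipeline=${PIPELINE}"
}
```

Nothing is sent until the functions are called, e.g. put_pipeline && rollover && update_old_data, ideally against a test cluster first.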

@rfoltyns rfoltyns added the howto label May 12, 2023