KFlow requires at least Java 1.8 to run. It is known to work with Apache Kafka 2.3.0, but any recent version should work. Linux is required to achieve high performance; macOS can be used for testing.
KFlow is distributed as a self-contained JAR (also known as a fat JAR). Simply download `kflow-VERSION-all.jar`
from the GitHub releases page and run the following
(assuming `java` is on your `PATH`):

```shell
java -jar kflow-VERSION-all.jar
```
You can also build the project yourself (see Building section).
By default KFlow listens for IPFIX on port 4739/udp and pushes decoded flows to the local Kafka broker `localhost:9092`, topic `ipfix`. It also exposes Prometheus metrics on port 8080/tcp. This behavior can be customized by overriding the default configuration.
Configuration values are resolved from multiple sources with the following precedence, from highest to lowest:

- Java system properties prefixed with `app.`, usually passed as one or more `-Dapp.name=value` command-line arguments,
- a custom `.properties` file specified with the `-c` or `--config` command-line argument,
- default configuration values.
| Property | Description |
|---|---|
| `server.port` | Port number to listen for IPFIX flows on |
| `server.threads` | Number of processing threads |
| `server.buffer.size` | `SO_RCVBUF` buffer size in bytes (allocated per processing thread) |
| `kafka.topic` | Kafka topic to write decoded flows to |
| `kafka.producers` | Number of Kafka producers (see Performance notes) |
| `kafka.props.bootstrap.servers` | Kafka brokers to set up the initial connection with |
| `kafka.props.*` | Other Kafka producer configuration properties (see producer docs) |
| `metrics.port` | Port number to expose Prometheus metrics on |
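For example, the defaults could be overridden with a custom properties file passed via `--config`. The property names below come from the table above; the values themselves are purely illustrative:

```properties
# kflow.properties -- illustrative values only
server.port=2055
server.threads=4
kafka.topic=ipfix
kafka.producers=2
kafka.props.bootstrap.servers=kafka1:9092,kafka2:9092
metrics.port=8080
```

Any individual value can still be overridden on the command line (e.g. `-Dapp.server.port=2055`), since system properties take precedence over the configuration file.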
KFlow produces JSON records with the following fields:
| Field | Type | Description |
|---|---|---|
| `dvc` | string (IPv4) | IPv4 address of the host that sent the IPFIX packet |
| `src` | string (IPv4) | sourceIPv4Address (8) |
| `srcp` | number | sourceTransportPort (7) |
| `dst` | string (IPv4) | destinationIPv4Address (12) |
| `dstp` | number | destinationTransportPort (11) |
| `proto` | number | protocolIdentifier (4) |
| `flags` | number | tcpControlBits (6) |
| `bytes` | number | octetDeltaCount (1) |
| `pkts` | number | packetDeltaCount (2) |
| `time` | number | Export time in milliseconds since the Unix epoch |
For descriptions of fields other than `dvc` and `time`, please refer to the IPFIX entities list.
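Put together, a single decoded flow record might look like this (all values are made up for illustration):

```json
{
  "dvc": "192.0.2.1",
  "src": "10.0.0.5",
  "srcp": 51234,
  "dst": "10.0.0.9",
  "dstp": 443,
  "proto": 6,
  "flags": 27,
  "bytes": 4096,
  "pkts": 12,
  "time": 1561000000000
}
```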
- Because of the way Netty handles UDP channels, KFlow requires the `epoll()` system call to be available to achieve high performance. It can still run without `epoll()` support, but all processing will then be done in a single thread regardless of the `server.threads` setting.
- At very high volumes (about 500,000 flows per second in our setup) a shared lock inside the Kafka producer code becomes a bottleneck. To overcome this limitation, KFlow can create multiple producers (as specified by the `kafka.producers` config value) and distribute load between them.
- Monitoring the receive queue and detecting packet drops is crucial for handling large volumes of traffic in production. KFlow exposes the `server_packet_drops` and `server_receive_queue` Prometheus metrics. Both are parsed from `/proc/net/udp` and are therefore available only on Linux.
- When the exposed metrics are not enough for performance troubleshooting, we recommend profiling tools such as the excellent async-profiler.
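To illustrate where those two metrics come from: each line of `/proc/net/udp` carries a hex-encoded `tx_queue:rx_queue` column and a trailing decimal drop counter. The sketch below parses a sample line in that format; it is an illustration of the kernel's file layout, not KFlow's actual parsing code.

```java
import java.util.Arrays;

public class UdpQueueStats {
    // Parses one non-header line of /proc/net/udp and returns
    // { rxQueueBytes, drops }.
    static long[] parse(String line) {
        String[] cols = line.trim().split("\\s+");
        // Column 4 is "tx_queue:rx_queue", both hex-encoded.
        String rxHex = cols[4].split(":")[1];
        long rxQueue = Long.parseLong(rxHex, 16);
        // The last column is the cumulative drop counter (decimal).
        long drops = Long.parseLong(cols[cols.length - 1]);
        return new long[] { rxQueue, drops };
    }

    public static void main(String[] args) {
        // Sample line for a socket bound to port 0x1283 (4739, the IPFIX default),
        // with 0x1F4 = 500 bytes queued and 42 packets dropped.
        String line = " 6640: 00000000:1283 00000000:0000 07"
                + " 00000000:000001F4 00:00000000 00000000   101"
                + "        0 21294 2 0000000000000000 42";
        System.out.println(Arrays.toString(parse(line))); // [500, 42]
    }
}
```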
KFlow uses Gradle as its build system. `./gradlew build` builds the project, and `./gradlew run` runs it.
The fat JAR is produced at `build/libs/kflow-VERSION-all.jar`. The build also assembles redistributable application
archives in the `build/distributions` folder.
KFlow was built with extensibility in mind.
Support for a new flow export format, such as NetFlow or sFlow, can be added by implementing the `PacketDecoder`
interface and passing it as a constructor parameter to the `Server` class. Handling multiple formats in a single
application is possible by creating multiple `Server` instances with different decoders and ports.
Similarly, the Kafka output can be replaced by a custom implementation of the `Sink` interface, which is likewise
passed as a constructor parameter to the `Server` class.
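A decoder for a new format might look roughly like the sketch below. The actual `PacketDecoder` signature is not shown in this document, so the interface shape here (a single method from a raw UDP payload to decoded records) is an assumption, and `NetflowDecoder` is a toy example:

```java
import java.nio.ByteBuffer;
import java.util.Collections;
import java.util.List;

// Assumed shape of KFlow's PacketDecoder interface -- hypothetical,
// check the real interface in the source before implementing.
interface PacketDecoder {
    List<String> decode(ByteBuffer payload);
}

// Toy NetFlow-style decoder showing where format-specific parsing
// would live; it only extracts the version from the packet header.
class NetflowDecoder implements PacketDecoder {
    @Override
    public List<String> decode(ByteBuffer payload) {
        if (payload.remaining() < 2) {
            return Collections.emptyList(); // too short to carry a version
        }
        int version = payload.getShort() & 0xFFFF; // first two bytes: version
        return Collections.singletonList("{\"version\":" + version + "}");
    }
}
```

Such a decoder would then be passed to its own `Server` instance on a dedicated port, alongside a second `Server` running the existing IPFIX decoder.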
- KFlow supports IPv4 addresses only. There are no plans to add IPv6 support.
- The exported field set is limited to basic fields for performance reasons.
  If you need a more feature-rich solution and don't care about performance as much, you should probably
  take a look at Cloudflare's goflow. In our setup a single goflow node was able to handle about 250,000 flows per second.