Apache Flink Docker image with Apache Iceberg support for Linux (i.e., non-Mac M1, M2, and M3 chips).

j3-signalroom/linux_flink_with_iceberg

Apache Flink with Apache Iceberg support for Linux machines

This repository hosts the Dockerfile that defines an Apache Flink Docker image with Hadoop and Apache Iceberg for Linux machines. The official Apache Flink image on Docker Hub does not include the Hadoop and Apache Iceberg dependencies. This image extends the official image by adding the Flink-provided Hadoop Uber JAR and Apache Iceberg. You can use it to deploy a Flink cluster on Docker in Session Mode or Application Mode and to read from and write to Apache Iceberg tables.

1.0 Let's get started!

To run the Docker container locally:

  1. Clone the repo:

    git clone https://github.com/j3-signalroom/linux_flink_with_iceberg.git
  2. Log in to Docker Desktop.

  3. We'll run the Dockerfile via docker-compose, so please follow these steps:

    a. Navigate to the root folder of the linux_flink_with_iceberg/ repository that you cloned.

    b. Open a terminal in this directory.

    c. Execute the following script:

    scripts/run-flink-locally.sh --profile=<AWS_SSO_PROFILE_NAME> [--aws_s3_bucket=<AWS_S3_BUCKET_NAME>]

    Argument placeholder      Replace with
    <AWS_SSO_PROFILE_NAME>    Your AWS SSO profile name for the AWS infrastructure that hosts your AWS Secrets Manager.
    <AWS_S3_BUCKET_NAME>      [Optional] The name of the AWS S3 bucket used to store Apache Iceberg files.
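
As an illustration of step 3c, the following minimal sketch fills in the two placeholders and adds the optional bucket argument only when a bucket name is set. The values `my-sso-profile` and `my-iceberg-bucket` are hypothetical and must be replaced with your own; the sketch assumes it is run from the root of the cloned repository and only relies on the two arguments documented above.

```shell
# Hypothetical values: substitute your own AWS SSO profile and S3 bucket names.
AWS_SSO_PROFILE_NAME="my-sso-profile"
AWS_S3_BUCKET_NAME="my-iceberg-bucket"   # set to "" to omit the optional argument

# Build the argument list; the bucket argument is appended only when a name is set.
set -- "--profile=${AWS_SSO_PROFILE_NAME}"
if [ -n "${AWS_S3_BUCKET_NAME}" ]; then
  set -- "$@" "--aws_s3_bucket=${AWS_S3_BUCKET_NAME}"
fi

# Print the command, then run it only if the script exists in this directory.
echo "scripts/run-flink-locally.sh $*"
if [ -x scripts/run-flink-locally.sh ]; then
  scripts/run-flink-locally.sh "$@"
else
  echo "scripts/run-flink-locally.sh not found; run this from the repository root" >&2
fi
```

Keeping the names in variables makes it easy to switch profiles or drop the bucket argument without retyping the command.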

2.0 Resources

Apache Flink 1.20.0

Apache Iceberg 1.6.1