This repository hosts the Dockerfile that defines an Apache Flink Docker image with Hadoop and Apache Iceberg for Linux machines. The official Apache Flink image on Docker Hub does not contain the Hadoop and Apache Iceberg dependencies. This image extends the official image by adding the Flink-provided Hadoop Uber JAR and Apache Iceberg. You can use it to deploy a Flink cluster on Docker in Session Mode or Application Mode and to read from and write to Apache Iceberg tables.
To run the Docker container locally:
- Clone the repo:
    git clone https://github.com/j3-signalroom/linux_flink_with_iceberg.git
- Log in to Docker Desktop.
- We'll run the Dockerfile via docker-compose, so please follow these steps:
    a. Navigate to the root folder of the linux_flink_with_iceberg/ repository that you cloned.
    b. Open a terminal in this directory.
    c. Execute the following script:
        scripts/run-flink-locally.sh --profile=<AWS_SSO_PROFILE_NAME> [--aws_s3_bucket=<AWS_S3_BUCKET_NAME>]

| Argument placeholder | Replace with |
| --- | --- |
| <AWS_SSO_PROFILE_NAME> | your AWS SSO profile name for the AWS infrastructure that hosts your AWS Secrets Manager. |
| <AWS_S3_BUCKET_NAME> | [Optional] the name of the AWS S3 bucket used to store Apache Iceberg files. |
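
To make the argument format concrete, here is a minimal sketch of how a script could parse the `--profile=` and `--aws_s3_bucket=` arguments described above, treating the first as required and the second as optional. It is not the repository's actual `run-flink-locally.sh`: the function name `parse_args`, the variable names, and the example values `my-sso-profile` and `my-iceberg-bucket` are all assumptions made for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical sketch: NOT the repository's run-flink-locally.sh.
# It only illustrates parsing --profile=<value> and --aws_s3_bucket=<value>.

parse_args() {
    AWS_PROFILE_NAME=""
    AWS_S3_BUCKET=""
    local arg
    for arg in "$@"; do
        case "$arg" in
            # Strip everything up to and including the first '=' to get the value.
            --profile=*)       AWS_PROFILE_NAME="${arg#*=}" ;;
            --aws_s3_bucket=*) AWS_S3_BUCKET="${arg#*=}" ;;
            *) echo "Unknown argument: $arg" >&2; return 1 ;;
        esac
    done
    # --profile is required; --aws_s3_bucket is optional.
    if [ -z "$AWS_PROFILE_NAME" ]; then
        echo "Usage: --profile=<AWS_SSO_PROFILE_NAME> [--aws_s3_bucket=<AWS_S3_BUCKET_NAME>]" >&2
        return 1
    fi
}

# Example call with placeholder values (both names are made up).
parse_args --profile=my-sso-profile --aws_s3_bucket=my-iceberg-bucket
echo "profile=$AWS_PROFILE_NAME bucket=${AWS_S3_BUCKET:-<none>}"
```

With the real script, the equivalent invocation would replace the placeholders in step c with your own profile and bucket names, and the bucket argument can be left off entirely.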