feat: azure blob storage support #883

Merged
merged 53 commits into from
Sep 9, 2024

@khyurri khyurri commented Jul 18, 2024

This PR integrates Azure Blob Storage into BadgerDoc and eliminates the use of boto3, minio, and aioboto3 across all microservices. Additionally, it standardizes the storage configuration.

Migration from previous version

Backward Compatibility

  1. In the previous version, the signed URL passed from BadgerDoc to Airflow/Databricks was named s3_signed_url. The current version names it signed_url by default. To maintain backward compatibility, set JOBS_SIGNER_URL_KEY_NAME to override the key under which the signed URL is passed in the job arguments.
    For example, setting JOBS_SIGNER_URL_KEY_NAME=s3_signed_url retains the original parameter name.
  2. The parameter S3_PRE_SIGNED_EXPIRES_HOURS has been renamed to JOBS_SIGNED_URL_TTL, and its value is now specified in minutes rather than hours.
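To illustrate the first point, here is a minimal sketch of how a configurable key name could be applied when building job arguments. The function name build_job_args and its env parameter are hypothetical and not taken from the BadgerDoc code base; only the JOBS_SIGNER_URL_KEY_NAME variable and the signed_url default come from the description above.

```python
import os


def build_job_args(signed_url, env=None):
    """Return the job arguments carrying the signed URL.

    Hypothetical helper: the key defaults to "signed_url" and can be
    overridden through JOBS_SIGNER_URL_KEY_NAME, as described in the PR.
    """
    env = os.environ if env is None else env
    key = env.get("JOBS_SIGNER_URL_KEY_NAME", "signed_url")
    return {key: signed_url}


# New default behaviour: the key is "signed_url".
default_args = build_job_args("https://example.invalid/blob", env={})

# Backward-compatible behaviour: the old key name is restored.
legacy_args = build_job_args(
    "https://example.invalid/blob",
    env={"JOBS_SIGNER_URL_KEY_NAME": "s3_signed_url"},
)
```

Under this sketch, existing Airflow/Databricks pipelines that read s3_signed_url would keep working once the variable is set, while new pipelines can rely on the default name.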

.env migration

  1. Rename S3_PRE_SIGNED_EXPIRES_HOURS -> JOBS_SIGNED_URL_TTL (and convert the value from hours to minutes)
  2. Rename JOBS_RUN_PIPELINES_WITH_SIGNED_URL -> JOBS_SIGNED_URL_ENABLED
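The two renames could be applied to an existing .env file with a small script along these lines. This is a sketch under stated assumptions, not a tool shipped with this PR: the function name migrate_env_lines is hypothetical, the file is assumed to use plain KEY=VALUE lines, and the old TTL value is assumed to be an unquoted integer number of hours.

```python
# Old name -> new name, as listed in the ".env migration" steps.
RENAMES = {
    "S3_PRE_SIGNED_EXPIRES_HOURS": "JOBS_SIGNED_URL_TTL",
    "JOBS_RUN_PIPELINES_WITH_SIGNED_URL": "JOBS_SIGNED_URL_ENABLED",
}


def migrate_env_lines(lines):
    """Rename the old keys; lines are given without trailing newlines."""
    migrated = []
    for line in lines:
        stripped = line.strip()
        # Leave blank lines, comments and malformed lines untouched.
        if not stripped or stripped.startswith("#") or "=" not in stripped:
            migrated.append(line)
            continue
        key, value = stripped.split("=", 1)
        key = key.strip()
        if key not in RENAMES:
            migrated.append(line)
            continue
        if key == "S3_PRE_SIGNED_EXPIRES_HOURS":
            # The new setting is expressed in minutes rather than hours.
            value = str(int(value.strip()) * 60)
        migrated.append(f"{RENAMES[key]}={value}")
    return migrated


example = migrate_env_lines([
    "# signed URL settings",
    "S3_PRE_SIGNED_EXPIRES_HOURS=48",
    "JOBS_RUN_PIPELINES_WITH_SIGNED_URL=true",
    "OTHER_SETTING=1",
])
```

To run it against a real file, read the file with splitlines(), pass the result through migrate_env_lines, and write the joined lines back; keeping a backup of the original .env is advisable.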

Files migration from Minio / S3 to Azure Blob Storage

-TBC-

Removed microservices

  • Remove convert microservice

Removed or skipped tests

-TBC-

Known issues

  • Python3.12 base image can't be built
  • Clear build dir after building Python3.12 image
  • Azure and Minio installations should automatically create containers or buckets upon admin login
  • For Amazon S3, the credentials must have permission to create new buckets, or the bucket must already exist (AWS S3 Support doesn't work after migration into badgerdoc_storage #964)
  • Rename CI to use service/lib naming for workflows

@khyurri khyurri force-pushed the feature/add_blob_storage branch 3 times, most recently from bb89a52 to 584b2e2 on July 25, 2024 16:17
@khyurri khyurri changed the title from "wip: migrate all microservices from direct minio/boto3 usage to badgerdoc_storage" to "feat: azure blob storage support" on Jul 25, 2024
khyurri and others added 24 commits July 29, 2024 13:31
Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Filip Negojevic <filip_negojevic@epam.com>
Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Filip Negojevic <filip_negojevic@epam.com>
Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Uros Stefanovic <uros_stefanovic@epam.com>
Co-authored-by: Denis Rybakov <minefrs@gmail.com>
…_is_needed" (#898)

Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Djordje Vukovic <djordje_vukovic@epam.com>
Co-authored-by: Filip Negojevic <filip_negojevic@epam.com>
Co-authored-by: Uros Stefanovic <uros_stefanovic@epam.com>
Co-authored-by: Uros Stefanovic <uros_stefanovic@epam.com>
Co-authored-by: djo753 <145800147+djo753@users.noreply.github.com>
Co-authored-by: Filip Negojevic <filip_negojevic@epam.com>
@khyurri khyurri merged commit 8342a50 into main Sep 9, 2024
22 checks passed
@khyurri khyurri deleted the feature/add_blob_storage branch September 9, 2024 09:16

9 participants