This repository has been archived by the owner on Jun 25, 2023. It is now read-only.

feat: DF solution #24

Open
wants to merge 3 commits into
base: main

Reversaidx

No description provided.

@Reversaidx changed the title from "feat: init app" to "feat: DF solution" on May 22, 2023
Collaborator

@rsolovev left a comment

Hey @Reversaidx, the solution launched with errors. There are no logs from the container itself, but here is the error state from the pod manifest:

lastState:
  terminated:
    exitCode: 126
    reason: ContainerCannotRun
    message: >-
      failed to create shim task: OCI runtime create failed: runc create
      failed: unable to start container process: exec: "./start.sh":
      permission denied: unknown
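
Exit code 126 together with "permission denied" on `exec` usually means the entrypoint file exists but lacks the execute bit. The thread does not show how the author fixed it, so the following is only a minimal sketch of one way to check and repair the mode before building the image. The helper name `ensure_executable` is hypothetical; `start.sh` is the path from the error message.

```python
import os
import stat


def ensure_executable(path: str) -> bool:
    """Add execute bits to `path` if any are missing.

    Returns True if the mode was changed, False if it was already executable
    for user, group, and others.
    """
    mode = os.stat(path).st_mode
    exec_bits = stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH
    if mode & exec_bits == exec_bits:
        return False
    # Keep the existing permission bits and add the execute bits on top.
    os.chmod(path, mode | exec_bits)
    return True


if __name__ == "__main__":
    # Assumption: the script sits in the current directory, as in the error message.
    if os.path.exists("start.sh"):
        print("mode changed:", ensure_executable("start.sh"))
```

The same effect is commonly achieved with `chmod +x start.sh` before committing, or with a `RUN chmod +x` step in the Dockerfile. Which of these the author used is not stated here.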

@Reversaidx
Author

Fixed, thank you.

@rsolovev
Collaborator

The second iteration launched, but CUDA reported an out-of-memory (OOM) error on start. Full logs are attached:
inca-smc-mlops-challenge-solution-7f76c796f7-v28w7.log (see the machine/GPU specs here)

@Reversaidx
Author

Could you run it again? I decreased the number of workers.
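
Why fewer workers can cure a CUDA OOM: if each worker process loads its own copy of the model onto the GPU (an assumption, since the thread does not show the solution's code), memory use grows roughly linearly with the worker count. A rough sketch of the arithmetic follows; the function name and all figures are hypothetical, not taken from the attached logs or the machine specs.

```python
def max_workers(total_gpu_mib: int, per_worker_mib: int, reserve_mib: int = 1024) -> int:
    """Largest worker count whose combined footprint fits in GPU memory.

    reserve_mib is headroom left for the CUDA context and allocation spikes.
    """
    if per_worker_mib <= 0:
        raise ValueError("per_worker_mib must be positive")
    usable = total_gpu_mib - reserve_mib
    # Floor division; clamp to 0 when even the reserve does not fit.
    return max(usable // per_worker_mib, 0)


# Hypothetical numbers: a 16 GiB GPU, about 3.5 GiB per worker, 1 GiB kept free.
print(max_workers(16 * 1024, 3584))  # prints 4
```

In practice the per-worker footprint would be measured on the target GPU rather than guessed, and the count lowered until startup succeeds.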

Collaborator

@rsolovev left a comment

Thank you, @Reversaidx, for this great solution. Decreasing the number of workers helped. Here are our test results for your latest commit.

If you would like to work on your solution further, you can continue optimizing/improving it and re-request our review once done. Any contribution during the challenge period will be taken into account while choosing a winner. Many thanks!
