
Max number of steps #4635

Closed Answered by fmbenhassine
pkernevez asked this question in Q&A

The only limit to the number of steps is the amount of resources you allocate to the JVM. In my experience, the step_execution table becomes a bottleneck because of the frequent updates to the step executions and their contexts (especially when steps run in parallel, as with partitioning).

I don't think you need the dynamic number of steps approach (where you end up with 1500 steps in your job). Have you thought about partitioning the input data set and setting up a reasonable, fixed number of parallel steps? Partitioning works well in most cases, and it is restartable: after a failure, only the failed partitions are reprocessed. This should solve your problem as each worker step would have its …
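
To make the partitioning idea concrete, here is a minimal plain-Java sketch of how a data set could be split into a fixed number of contiguous ranges, one per worker step. It deliberately does not use the Spring Batch API; in a real job, logic like this would typically live in an implementation of the `Partitioner` interface, whose `partition(int gridSize)` method returns one `ExecutionContext` per partition. The class name `RangePartitionSketch`, the `Range` record, and the choice of 1500 items with a grid size of 8 are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.List;

public class RangePartitionSketch {

    /** Inclusive range of item indexes assigned to one worker step (hypothetical type). */
    public record Range(String name, long min, long max) {}

    /**
     * Splits itemCount items (indexed from 0) into at most gridSize
     * contiguous ranges whose sizes differ by at most one item.
     */
    public static List<Range> partition(long itemCount, int gridSize) {
        if (itemCount < 0 || gridSize <= 0) {
            throw new IllegalArgumentException("itemCount must be >= 0 and gridSize > 0");
        }
        List<Range> ranges = new ArrayList<>();
        long base = itemCount / gridSize;      // minimum size of each range
        long remainder = itemCount % gridSize; // the first 'remainder' ranges get one extra item
        long start = 0;
        // Stop early when there are fewer items than partitions, so no range is empty.
        for (int i = 0; i < gridSize && start < itemCount; i++) {
            long size = base + (i < remainder ? 1 : 0);
            ranges.add(new Range("partition" + i, start, start + size - 1));
            start += size;
        }
        return ranges;
    }

    public static void main(String[] args) {
        // Assumed figures: 1500 units of work spread over 8 worker steps.
        for (Range range : partition(1500, 8)) {
            System.out.println(range);
        }
    }
}
```

With this shape, the job keeps a small, fixed number of step executions no matter how large the input grows, and each worker only needs its own `min` and `max` values to know which slice of the data to read.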

Replies: 2 comments

Answer selected by pkernevez