(PUP-12061) Allow splay and splaylimit to be modified in puppet.conf #9484

joshcooper · 2024-09-26T00:00:25Z

If an agent was already running periodically and splay or splaylimit was modified in puppet.conf, then the agent didn't recalculate its splay mode (enabled/disabled) or its splay offset. A full service restart was required for those changes to take effect.

Now the agent will behave as follows when the agent is running periodically:

if splay is changed from disabled to enabled, recalculate splay offset based on the current splaylimit
If splay is changed from enabled to disabled, set splay offset to 0.
If splaylimit is modified, then recalculate splay based on the new splaylimit
If runinterval is modified and splaylimit inherits from runinterval, then recalculate splay offset based on the new runinterval
If runinterval is modified, but splaylimit does not inherit, then keep splay offset as-is.

The agent also now logs an info message about splay at information level, see PUP-10732

$ bundle exec puppet agent --no-daemonize --verbose 
Notice: Starting Puppet client version 8.10.0
Info: Running agent every 10 seconds with splay 7 of 10 seconds

This behavior change affects the puppet agent service on all platforms except windows because of the way our windows service runs puppet agent --onetime, see PUP-5823

Confusingly, the changes to the Puppet::Daemon class affect all periodic runs, even when using --no-daemonize.

Running puppet agent --onetime (such as is implied by --test) is unaffected, because we don't start the scheduler.

lib/puppet/daemon.rb

…/main/splay" This reverts commit a1a84cb.

Memoize the jobs so it's clearer which jobs 0 and 1 are.

selinux.pp

If puppet.conf was modified, but splaylimit wasn't, then we recalculated the splay offset. This goes against the desired behavior of only splaying when the agent is initially started and running every runinterval seconds from that point forward or until runinterval/splaylimit are changed in the future. Now the SplayJob remembers its last splaylimit so that it only recalculates splay if the limit changes. It's important to recalculate splay so that the agent chooses a slot over the new range, either smaller or bigger than before.

Provide more context when the agent runs so it's clearer whether splay is enabled and what the limits are. We don't want to do this for the reparse and signal jobs, because they run every 15 and 5 seconds, respectively. The info message is of the form: Running agent every 20 seconds with splay 5 of 10 seconds This message is logged whenever the agent is running periodically, not for onetime runs. Whether we're daemonized doesn't affect the behavior. This is because we always call Puppet::Daemon#start when running periodically to run the event loop.

Previously, enabling or disabling splay in puppet.conf did not work if the agent was already running periodically. If the agent was started with splay disabled, then enabling it would cause a NoMethodError, when trying to call Puppet::Scheduler::Job#splay_limit If the agent was started with splay enabled, then disabling it would have no effect, since we never recalculated the splay limit. To handle these cases, always create a SplayJob for the agent_run job and set its splay_limit to either the limit or 0, depending on whether splay is enabled or not. Setting a splay_limit to 0 causes the splay to also be set to 0 because rand(1) always returns 0.

Since splaylimit defaults to runinterval, changes to runinterval affect splay. Add tests covering cases where runinterval is increased and decreased. It's important to recalculate splay to reduce thundering herds. For example, if runinterval is changed from 30 mins to 60 mins, we want the compute a new slot in the 60 min interval to decrease server load.

joshcooper · 2024-09-27T14:32:03Z

Jenkins, please test this

joshcooper · 2024-09-27T18:05:12Z

jenkins please test this

joshcooper force-pushed the splaylimit_daemon branch from 5e53f75 to 7b420be Compare September 26, 2024 00:11

joshcooper commented Sep 26, 2024

View reviewed changes

lib/puppet/daemon.rb Show resolved Hide resolved

joshcooper added 2 commits September 26, 2024 16:09

Reapply "Merge pull request puppetlabs#9345 from mhashizume/PUP-11728…

7ac64a8

…/main/splay" This reverts commit a1a84cb.

(PUP-12061) Create example group for splay tests

578261d

Memoize the jobs so it's clearer which jobs 0 and 1 are.

joshcooper force-pushed the splaylimit_daemon branch from 7b420be to f1c2fb0 Compare September 26, 2024 23:41

joshcooper commented Sep 27, 2024

View reviewed changes

selinux.pp Outdated Show resolved Hide resolved

This comment was marked as outdated.

Sign in to view

joshcooper added 4 commits September 26, 2024 21:54

joshcooper force-pushed the splaylimit_daemon branch from f1c2fb0 to 15bc22b Compare September 27, 2024 04:54

joshcooper marked this pull request as ready for review September 27, 2024 19:37

joshcooper requested a review from a team as a code owner September 27, 2024 19:37

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

(PUP-12061) Allow splay and splaylimit to be modified in puppet.conf #9484

(PUP-12061) Allow splay and splaylimit to be modified in puppet.conf #9484

joshcooper commented Sep 26, 2024 •

edited

Loading

This comment was marked as outdated.

joshcooper commented Sep 27, 2024

joshcooper commented Sep 27, 2024

(PUP-12061) Allow splay and splaylimit to be modified in puppet.conf #9484

Are you sure you want to change the base?

(PUP-12061) Allow splay and splaylimit to be modified in puppet.conf #9484

Conversation

joshcooper commented Sep 26, 2024 • edited Loading

This comment was marked as outdated.

joshcooper commented Sep 27, 2024

joshcooper commented Sep 27, 2024

joshcooper commented Sep 26, 2024 •

edited

Loading