Sequential system execution of job queue


Viewing 8 posts - 1 through 8 (of 8 total)
  • Author
    Posts
  • #8926
    Colleen Schmidt
    Participant

    Hi, I’m trying out the “Wait for host to be detected online” command in preparation for having 4 servers patched in order (server1 will patch, once it’s back online server2 will patch, etc.).

    My test job is simple:

    Get last boot time

    Get c: disk space

    Reboot (force always)

    Wait for host to be detected online

    send email notification

    The issue that I’m having is that the second test system is starting its job before the first system has come back online. Any reason why this is happening?

    Obviously this is just a test job since I’m not actually patching the systems. Once the “Wait for host to be detected online” command works as intended, I plan on putting it into the job that I’ve created for patching, which is scheduled to run at a certain time and day.

    Thanks.

    -Colleen

    #10360
    doug
    Moderator

    Colleen – You didn’t mention it, but I assume you are using the ‘Advanced Multi-Row Queue Sequence’ or the ‘Basic Multi-Row Queue Sequence’ right?

    From what you have shown me, it looks like the problem is that you have “Wait for host to be detected online” immediately following your “Reboot (force always)” command. So what’s happening is the reboot command is initiated and then a split second later the ‘Wait for host to be detected online’ checks to see if the host is online. In this case your host simply does not have enough time to go offline for the reboot. To rectify the situation you might do something like this instead:


    Get last boot time

    Get c: disk space

    Reboot (force always)

    Wait 3 minutes

    Wait for host to be detected online

    send email notification

    OR

    Get last boot time

    Get c: disk space

    Reboot (force always)

    Wait for host to go offline and come back online

    send email notification


    Either one of these should work in most cases, but neither is a 100% guarantee.

    The potential issue with ‘Wait 3 minutes’ is that a host can sometimes take longer than 3 minutes to begin its shutdown sequence. In those rare cases the 3 minutes pass while the host still has not shut down, so ‘Wait for host to be detected online’ finds the host online even though it never rebooted. You could set the wait time to 5 minutes or 10 minutes (or even just 1 minute), but it’s always a balancing act: you don’t want your process to take forever, and you don’t want ‘Wait for host to be detected online’ to start before the host has had a chance to go offline. 1 minute would probably be sufficient in most cases, but 3 minutes is probably safer. Or you could use two back-to-back 1-minute waits to create a 2-minute wait period.
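    To make the timing trade-off concrete, here is a rough Python sketch. It is not BatchPatch code and does not reflect how BP is implemented internally; the function names, the 5-second poll interval, and the timeline values (in seconds after the reboot command) are all hypothetical. It models a fixed wait followed by an "is the host online?" poll, and reports whether the queue would continue only after a real reboot.

```python
def first_online_time(start, down_at, up_at, poll=5, timeout=1800):
    """Return the first polled time >= start at which the host answers.

    Hypothetical model: the host is offline only during [down_at, up_at),
    measured in seconds after the reboot command was sent.
    """
    t = start
    while t <= start + timeout:
        online = not (down_at <= t < up_at)
        if online:
            return t
        t += poll
    return None  # gave up waiting


def rebooted_before_continue(fixed_wait, down_at, up_at):
    """True if the online check succeeds only after the reboot completed."""
    detected = first_online_time(fixed_wait, down_at, up_at)
    return detected is not None and detected >= up_at


# No wait at all: the host has not gone down yet, so it is "online" at once.
print(rebooted_before_continue(fixed_wait=0, down_at=20, up_at=120))
# 3-minute wait, normal reboot: the check runs after the host is back.
print(rebooted_before_continue(fixed_wait=180, down_at=20, up_at=120))
# 3-minute wait, but the host takes 4 minutes to start shutting down.
print(rebooted_before_continue(fixed_wait=180, down_at=240, up_at=400))
```

    In this model the first and third cases continue too early, which is the same race described above: the online check sees a host that simply has not gone down yet.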

    The potential issue with ‘Wait for host to go offline and come back online’ is that BP cannot accurately determine “offline-ness” 100% of the time. It is generally very good at it with the default setting under ‘Tools > Settings > Grid Preferences > Hosts are considered offline after 3 ping timeouts’, but there are cases, particularly with virtual machines, where a host can reboot in just a few seconds. In such cases the host might go offline and come back online without BP ever officially detecting it as offline. When that happens, your queue hangs until the timeout is reached (the timeout value and options are configured in the job queue window under ‘Special items’). You could reduce the likelihood of this by setting the value to ‘Hosts are considered offline after 2 ping timeouts’, but on the flip side a network blip of a few seconds on a host that is still online could then trigger BP to think that the host was offline.
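    The ping-timeout threshold can be illustrated with a small Python sketch as well. Again, this is only a simplified model under my own assumptions, not BP's actual detection logic: the function name is hypothetical, each list item stands for one ping result (True = reply, False = timeout), and running out of results stands in for the queue timeout.

```python
from typing import Iterable


def wait_offline_then_online(pings: Iterable[bool], offline_after: int = 3) -> str:
    """Model 'go offline and come back online' over a series of ping results.

    The host counts as offline only after `offline_after` consecutive
    timeouts; a reply before that resets the counter.
    """
    consecutive_timeouts = 0
    seen_offline = False
    for reply in pings:
        if reply:
            if seen_offline:
                return "back online"
            consecutive_timeouts = 0
        else:
            consecutive_timeouts += 1
            if consecutive_timeouts >= offline_after:
                seen_offline = True
    return "timed out"  # results exhausted: stands in for the queue timeout


# Ordinary reboot: several timeouts in a row, then a reply.
print(wait_offline_then_online([True, False, False, False, False, True]))
# Very fast VM reboot: only one ping is missed, so "offline" is never seen.
print(wait_offline_then_online([True, False, True, True, True]))
# Threshold of 2 plus a short network blip on a host that never rebooted.
print(wait_offline_then_online([True, False, False, True], offline_after=2))
```

    The second case is the hang described above, and the third shows the false positive that a lower threshold can introduce.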

    I hope this helps! The bottom line is that these options are not 100% infallible. They are designed to help streamline processes, but they have to be used with the understanding that there are edge cases where they might not behave as desired.

    -Doug

    #10361
    Colleen Schmidt
    Participant

    Yes, I was using the “Advanced Multi-Row Queue Sequence” option. Thank you for the insight. I’m testing with physical systems, but 2 of the production systems are VMs and 2 are physical. So I will have a large difference in reboot time, as the VMs are lightning fast compared to the physicals.

    I have tried it with the two 1-minute delays along with the “Wait for host to go offline and come back online” and it worked perfectly. Thank you!

    Do you know if there’s a way that I could schedule this type of job queue? Since they are dependent on each other, I have only discovered the manual way of kicking this type of job queue off, but scheduling it would be very advantageous.

    Thanks again for your help!

    -Colleen

    #10362
    doug
    Moderator

    Yes, you can schedule the ‘advanced multi-row queue sequence’ by using that option in the Task Scheduler. In the Task Scheduler’s drop-down menu you’ll see an item called ‘Execute advanced multi-row queue sequence’.

    -Doug

    #10356
    Colleen Schmidt
    Participant

    Wonderful, thanks for your help!

    -Colleen

    #12292
    Jack.Roberts
    Participant

    In order to schedule the ‘advanced multi-row queue sequence’ by using the option in the Task Scheduler, do I select the Sequence ExecutionRow for the scheduled task?

    #12294
    doug
    Moderator

    @Jack.Roberts – yes.

    #12300
    Jack.Roberts
    Participant

    Perfect!

    Thanks for the confirmation.

    Jack
