Decentralized workload automation plays a key role in successful digital transformation. Decentralization is an important factor in a company’s dynamism and ability to grow. Moreover, it has a positive effect on the agility and motivation of teams and managers and on their willingness to take responsibility. The trend towards decentralization in modern corporate cultures is taking place in parallel with the trend towards decentralization in IT: monolithic mainframes have long been the exception rather than the rule and are being replaced by more efficient alternatives, including IT operations in the cloud. Is it still appropriate today to design automation monolithically? We think: No!
Sovereignty and Best of Breed
Part of a company’s technological sovereignty is being able to use different, modern technologies freely without being tied to a single provider. The advantage of this strategy is not only the avoidance of vendor lock-in, the dreaded situation of depending, for better or worse, on the support, pricing and future viability of a single vendor. The agility of groups and departments also requires that teams can solve their tasks autonomously and confidently with the tools best suited to them. Because many modern tools include their own effective automation functions, the respective higher-level automation unit must be able to hand over responsibility. Best of breed is only possible if the available automation technologies can also be used effectively.
Agile teams and satisfied employees
Deployments increasingly include complex schedules that teams want to use in order to work optimally. Specialists want to work efficiently and without hurdles with the tools best suited to them. Recruiting staff also becomes increasingly difficult when companies cannot offer their employees up-to-date tools or confront them with bureaucratic working conditions. At the same time, the danger of losing control must be countered. Decentralization must be designed so that its advantages can be used without undesirable side effects, such as data islands or duplicate provisioning of the same resources. This can be achieved if central structures are not decentralized abruptly, but the change is treated as a process and the decentralized structures remain linked to the central elements. Ideally, the best features of the two models, decentralization and centralization, should be combined to suit one’s needs.
Current developments
Although most industries and business models have long relied on heterogeneous IT landscapes, mainframes are still used by companies in the financial sector, the insurance industry and the public sector. But even in those sectors it is becoming increasingly visible that heterogeneous networks and cloud solutions are more efficient, require less maintenance and are fundamentally more cost-effective in the long term. In addition, the complexity of subsystems such as EDWH, Big Data, Analytics, DevOps, HPC and Cloud is continuously increasing, and life cycles are becoming shorter thanks to agile development. As a result, it is becoming increasingly difficult to use centralized scheduling to orchestrate an increasingly decentralized environment. Coordination problems can massively affect the cohesion and functionality of decentralized environments. The challenge: not only the workload itself, but also the scheduling must be decentralized.
Decentralized scheduling
Different subsystems, such as Jenkins, Cloud, EDWH, Big Data and HPC, require or include their own best-of-breed job controls. This makes it increasingly difficult to force all job scheduling tasks into a central enterprise scheduling system without productivity suffering. But how can the transformation to decentralized orchestration succeed? To ensure stability, availability and security at all times, the transition must be carefully planned and executed in stages. Modernization and migration to more flexibility through decentralization are not one-time events, but ongoing transformation processes. First, it is important to be aware of the difference between synchronous and asynchronous processing from the user’s point of view and from the scheduling system’s point of view.
Synchronous and asynchronous processing from the user’s point of view
From the user’s point of view, job chains are synchronous processing, while job nets are asynchronous processing.
Synchronous processing from the scheduling system’s point of view
From the point of view of the scheduling system, processing in the agent is synchronous if the system waits for the agent’s exit code before executing the next process.
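The rule above, waiting for the exit code before starting the next process, can be illustrated with a short Python sketch. It is only an illustration under simple assumptions: the function name `run_chain` and the example commands are hypothetical, and a local child process stands in for the agent.

```python
import subprocess
import sys


def run_chain(commands):
    """Run commands one after another; stop at the first non-zero exit code."""
    exit_codes = []
    for cmd in commands:
        # subprocess.run blocks until the child process has exited,
        # so the next step never starts before this exit code is known.
        result = subprocess.run(cmd)
        exit_codes.append(result.returncode)
        if result.returncode != 0:
            break  # a failed step prevents the following steps from running
    return exit_codes


# Three hypothetical steps: the second one fails with exit code 3.
chain = [
    [sys.executable, "-c", "raise SystemExit(0)"],
    [sys.executable, "-c", "raise SystemExit(3)"],
    [sys.executable, "-c", "raise SystemExit(0)"],
]
codes = run_chain(chain)
```

Because the second step reports a non-zero exit code, the third step is never started.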
Execution of a process in an external (scheduling) system
Synchronous processing (polling)
A commonly used method to enable processing in external systems is “polling”.
With polling, the external process is handled like a synchronous internal job. The scheduling system starts a job that starts the process in the external system and then actively waits for the external process to complete. To do this, the job queries the external system at regular intervals for the status of the process and takes over any status information that is available. After the external process has completed, the job ends with an exit code that corresponds to the result of the external process. In some cases polling is the only way to integrate processing in an external system, but the method has consequences of which one should be aware:
- If polling is performed with high frequency, a non-negligible overhead is created in both the higher- and lower-level systems.
- If this is avoided by polling at a low frequency, unnecessarily long waiting times occur, which can delay the overall process.
- Complex technical problems have to be solved. These include restarting polling (e.g. after network and server problems or downtimes) and distinguishing errors in the polling process from errors in the external process.
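The polling loop and its trade-offs can be sketched as follows. This is a minimal in-memory illustration, not the interface of any real scheduler: `FakeExternalSystem`, `run_with_polling` and the exit code `POLLING_ERROR` are assumptions made for the example.

```python
import time

POLLING_ERROR = 255  # assumed convention: polling gave up, the process result is unknown


class FakeExternalSystem:
    """Hypothetical stand-in for an external system.

    Reports RUNNING for a fixed number of status queries, then FINISHED.
    """

    def __init__(self, polls_until_done, final_exit_code):
        self._remaining = polls_until_done
        self._final_exit_code = final_exit_code
        self.status_requests = 0  # counts the overhead caused by polling

    def start_process(self):
        return "proc-1"  # handle used for later status queries

    def get_status(self, handle):
        self.status_requests += 1
        if self._remaining > 0:
            self._remaining -= 1
            return ("RUNNING", None)
        return ("FINISHED", self._final_exit_code)


def run_with_polling(system, interval_seconds, max_polls):
    """Start the external process and actively wait for its result."""
    handle = system.start_process()
    for _ in range(max_polls):
        try:
            state, exit_code = system.get_status(handle)
        except ConnectionError:
            # An error in the polling itself, not in the external process:
            # wait and ask again instead of failing the job.
            time.sleep(interval_seconds)
            continue
        if state == "FINISHED":
            return exit_code  # the job ends with the external result
        time.sleep(interval_seconds)  # short interval: more load; long: more delay
    return POLLING_ERROR


system = FakeExternalSystem(polls_until_done=3, final_exit_code=7)
result = run_with_polling(system, interval_seconds=0.01, max_polls=10)
```

Even in this small sketch the job needs four status requests to learn one result, and the separate `POLLING_ERROR` code hints at the effort required to distinguish polling failures from failures of the external process.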
Asynchronous processing
In asynchronous processing, the scheduling system starts a job that starts the process in the external system and transfers the job’s credentials to the external system. In this variant, the job does not wait for processing in the external system to finish, but terminates with an exit code that signals the successful start in the external system.
The job enters a pending state, which is treated by the scheduling system like a running job, without this job being active. It is now the responsibility of the external system to actively report status information and the exit status of the processing to the higher-level scheduling system at the end of the process using the transferred job credentials. Optionally, status information (e.g. process progress) can also be transferred to the higher-level scheduling system during the process. This form of processing offers the following advantages:
- There is no polling overhead due to active waiting.
- Unnecessary waiting times are avoided.
- Implementation is less complicated and less prone to errors.
However, the remote system must meet some requirements:
- Suitable interfaces (API)
- Trigger concept for reporting errors
- Integration of interface processes (error messages, feedback at the end and status information during the process)
- Ideally, restart capability
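The asynchronous hand-off described above can be sketched in a few lines of Python. This is a minimal in-memory illustration and not the BICsuite or schedulix API: the class `Scheduler`, its methods, the state names and the token used as job credentials are all hypothetical and chosen only for this example.

```python
import secrets


class Scheduler:
    """Hypothetical higher-level scheduler that tracks asynchronously running jobs."""

    def __init__(self):
        self._jobs = {}

    def submit_async(self, job_id, start_external):
        # Create credentials for this job and hand them to the external system.
        token = secrets.token_hex(16)
        self._jobs[job_id] = {
            "state": "PENDING",  # treated like a running job, but nothing is active here
            "token": token,
            "progress": None,
            "exit_code": None,
        }
        start_external(job_id, token)  # returns as soon as the start has succeeded

    def _authorized(self, job_id, token):
        job = self._jobs.get(job_id)
        return job is not None and secrets.compare_digest(job["token"], token)

    def report_progress(self, job_id, token, progress):
        """Optional status information sent by the external system during the process."""
        if not self._authorized(job_id, token) or self._jobs[job_id]["state"] != "PENDING":
            return False
        self._jobs[job_id]["progress"] = progress
        return True

    def report_exit(self, job_id, token, exit_code):
        """Final exit status, actively reported by the external system."""
        if not self._authorized(job_id, token) or self._jobs[job_id]["state"] != "PENDING":
            return False
        job = self._jobs[job_id]
        job["exit_code"] = exit_code
        job["state"] = "FINISHED" if exit_code == 0 else "FAILED"
        return True

    def state(self, job_id):
        return self._jobs[job_id]["state"]


# The external system is simulated by a function that only remembers the credentials.
received = {}

def start_external(job_id, token):
    received["job_id"] = job_id
    received["token"] = token

scheduler = Scheduler()
scheduler.submit_async("load_dwh", start_external)
state_after_start = scheduler.state("load_dwh")

scheduler.report_progress("load_dwh", received["token"], "50% done")
rejected = scheduler.report_exit("load_dwh", "wrong-token", 0)
accepted = scheduler.report_exit("load_dwh", received["token"], 0)
```

The scheduler never asks for the status; it only reacts to reports that carry the correct credentials. This is why the external system needs a suitable interface and a way to report errors on its own initiative.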
Decentralized implementation with BICsuite and schedulix
Provided that the conditions described above are met, BICsuite and schedulix are able to provide true asynchronous processing with a remote system. BICsuite and schedulix can of course be used as a central scheduling system. In some cases (for example, during the transformation from on-premises to the cloud, in large structures with clearly delineated decentralized units, or when connecting previously separate large units into a higher-level unit) it may make sense to work with several hierarchically arranged BICsuite or schedulix Scheduling Servers. However, BICsuite and schedulix can also be used as a master system for existing scheduling solutions or be subordinated to an existing system in order to decentralize individual company divisions. An intermediary function is also conceivable: within an existing central scheduling system, BICsuite or schedulix is used as a decentralization level and to control subsystems. Please contact us so that we can jointly develop a strategy for decentralized automation in your company.