
KNIME Hub is a central part of the KNIME ecosystem, giving professionals a platform to share and deploy their KNIME workflows, components, and extensions. One of its notable features is the ability to automate workflows and streamline data processing through its trigger and scheduling capabilities.

 

Democratizing Hub computation

 

Triggers and schedules have never been more relevant than now, since KNIME introduced its new service: Community Hub Teams. Previously, these functions were only available on KNIME Server and Business Hub. Given the licence costs of Server and Business Hub, they were mostly the privilege of larger enterprises. With Community Hub Teams, KNIME allows smaller business units, small businesses, and single entrepreneurs to use these services for a reasonable price. Unlike Server and Business Hub, the Community Hub instance is managed by KNIME rather than by the organization. The pricing follows a “Pay As You Use” model.

The monthly cost of the service starts at €99, which includes 3 team members, and you pay €0.10 for each started minute. More information about Community Hub Teams is available here, and you can find pricing details here.

Let’s say you have 4 workflows that you can orchestrate while keeping the computation time under a few minutes. Running them every 3 hours, you can expect a monthly cost of around €150, which is a reasonable price. Scheduling your workflows makes your service costs more predictable; if you rely only on triggers, the costs may vary depending on how often you fire them.
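The estimate can be reproduced with a few lines of arithmetic. The sketch below is only an illustration built on the prices quoted above (€99 base fee, €0.10 per started minute). The scenario values are assumptions, not KNIME figures: the four workflows are orchestrated into a single run of about 100 seconds, and the month has 30 days. The function name is hypothetical.

```python
import math

BASE_FEE_CENTS = 9900        # EUR 99 monthly base price quoted in the article
RATE_CENTS_PER_MINUTE = 10   # EUR 0.10 per started minute quoted in the article


def estimate_monthly_cost(run_seconds, runs_per_day, days=30):
    """Return the estimated monthly cost in euros for one scheduled run.

    Assumption: every started minute is billed in full, so the run time
    is rounded up to whole minutes before pricing.
    """
    billed_minutes = math.ceil(run_seconds / 60)
    usage_cents = billed_minutes * runs_per_day * days * RATE_CENTS_PER_MINUTE
    return (BASE_FEE_CENTS + usage_cents) / 100


# Assumed scenario: one orchestrated run of ~100 seconds (2 started minutes),
# executed every 3 hours, which is 8 runs per day.
print(estimate_monthly_cost(run_seconds=100, runs_per_day=8))  # 147.0
```

Working in whole cents avoids floating-point rounding surprises. Under these assumptions the usage part is 2 × 8 × 30 × €0.10 = €48, which together with the base fee lands close to the €150 mentioned above.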

About Triggers in KNIME Hub

 

Triggers on KNIME Hub

Triggers in the KNIME Hub are mechanisms that automatically start workflows based on specific events. These events may include a variety of actions, such as changes in a file, database updates, or receiving data from webhooks. Triggers play a vital role in dynamic data environments, enabling immediate processing as soon as new data becomes available.

File- and workflow-based triggers activate workflows when a specific file is created, modified, or deleted in a monitored directory. This is especially helpful where data files are frequently updated or new files are regularly added to a system, such as log files or batch data uploads. These types of triggers can be set up directly on KNIME Hub. You can also set up triggers based on workflow changes; these options are especially useful for automated testing and deployment.
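To make the idea of file events concrete, here is a minimal, generic sketch of how created, modified, and deleted files can be detected by comparing two snapshots of a directory. It is not how KNIME Hub implements its triggers and it does not use any KNIME API; the function names `snapshot` and `detect_events` are hypothetical and exist only for this illustration.

```python
from pathlib import Path


def snapshot(directory):
    """Map each file name in the directory to its last-modified time (ns)."""
    return {
        p.name: p.stat().st_mtime_ns
        for p in Path(directory).iterdir()
        if p.is_file()
    }


def detect_events(before, after):
    """Compare two snapshots and return a list of (event, file name) pairs."""
    events = []
    # Files present only in the newer snapshot were created.
    for name in sorted(after.keys() - before.keys()):
        events.append(("created", name))
    # Files present only in the older snapshot were deleted.
    for name in sorted(before.keys() - after.keys()):
        events.append(("deleted", name))
    # Files in both snapshots with a different timestamp were modified.
    for name in sorted(before.keys() & after.keys()):
        if before[name] != after[name]:
            events.append(("modified", name))
    return events
```

A monitoring loop would take a snapshot at regular intervals, call `detect_events` with the previous and the current snapshot, and start the relevant workflow for each event returned.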

Advantages of Triggers

Real-time processing: Triggers allow immediate action upon data changes, so workflows operate with the most up-to-date information.
Automation: Triggers improve efficiency and minimise the chances of human error by reducing the need for manual intervention.
Integration: Triggers simplify integration with other systems and data sources, allowing KNIME to fit into diverse IT environments.

Workflow Scheduling on KNIME Hub

By utilising the scheduling feature on KNIME Hub, professionals can automate the execution of workflows at specific times or intervals, eliminating the need for manual initiation. This feature is crucial for routine, repetitive tasks that require consistent execution, such as data extraction, transformation, and loading (ETL) processes, reporting, and maintenance tasks.

Setting Up Schedules

 

Scheduling on KNIME Hub

 

Daily schedules: Workflows can be scheduled to run every day at a designated time. This is helpful for tasks that need to be performed regularly, like generating daily sales reports or updating daily dashboards.

Weekly schedules: Workflows can be scheduled to run on specific days of the week. For example, a workflow can be set up to run on Mondays to gather performance metrics for the week.

Monthly schedules: These suit tasks that occur once a month, like month-end financial reconciliations or generating monthly summary reports.

Custom schedules: Workflows can be configured to run at specific intervals or on designated dates and times. This is helpful for workflows that do not fit a daily, weekly, or monthly rhythm.
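The schedule types above all come down to computing the next run time from the current time. The sketch below shows that calculation for daily, weekly, and interval-based schedules using only Python's standard library. It is an illustration under simplifying assumptions (naive datetimes, no time zones or daylight saving), not a description of how KNIME Hub computes its schedules; all function names are hypothetical.

```python
from datetime import datetime, time, timedelta


def next_daily_run(now, at):
    """Return the next datetime strictly after `now` at wall-clock time `at`."""
    candidate = datetime.combine(now.date(), at)
    if candidate <= now:
        candidate += timedelta(days=1)
    return candidate


def next_weekly_run(now, weekday, at):
    """Return the next run on `weekday` (Monday is 0) at time `at`."""
    candidate = next_daily_run(now, at)
    while candidate.weekday() != weekday:
        candidate += timedelta(days=1)
    return candidate


def next_interval_run(last_run, every):
    """Return the next run of a custom schedule repeating every `every`."""
    return last_run + every


now = datetime(2024, 6, 5, 10, 30)                    # a Wednesday
print(next_daily_run(now, time(6, 0)))                # 2024-06-06 06:00:00
print(next_weekly_run(now, 0, time(6, 0)))            # 2024-06-10 06:00:00
print(next_interval_run(now, timedelta(hours=3)))     # 2024-06-05 13:30:00
```

A monthly schedule follows the same pattern with a day-of-month check in place of the weekday check.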

 

 

The Advantages of Scheduling

 

Consistency: Scheduling ensures that tasks are performed reliably and punctually, which is vital for regular reporting and maintenance activities.
Efficiency: By automating repetitive tasks through scheduling, professionals can save valuable time and redirect their efforts towards more strategic activities.
Reliability: Scheduled workflows help minimise the chances of tasks being overlooked, so essential processes are carried out without the need for manual supervision.

 

Exploring the synergy between triggers and scheduling

 

When triggers and scheduling are used together, the full potential of KNIME Hub’s automation capabilities is realised. For example, a workflow can be configured to begin processing incoming data immediately upon arrival (trigger) and also to perform thorough data validation checks at the end of each day (schedule). This combination helps keep data processing both fast and accurate, improving overall data quality and the reliability of workflows.

Example Use Case

Consider a retail company that handles sales data from numerous stores. The company can set up file-based triggers to start workflows whenever new sales data files are uploaded to a shared directory. In addition, it can schedule nightly workflows that collect the daily sales data and produce reports for the management team. By using triggers for real-time processing and schedules for routine reporting, the company keeps its data processing both prompt and effective.

In summary

Triggers and scheduling on KNIME Hub are essential tools for automating workflows and enhancing data processing efficiency. Triggers start workflows in real time based on specific events, while scheduling ensures the consistent and timely performance of repetitive tasks. By utilising these features, data scientists and analysts can reduce manual intervention, minimise errors, and prioritise more strategic tasks.

 

Author:

Gabor Zombory, Head of Services in Data & Analytics, Datraction
