Introducing Data Flows: Revolutionizing Data Transformation
Unleash the Power of Data with Seamless, Visual Workflow Management
We are excited to introduce Data Flows, a new module that transforms the way you process and manage data. With an intuitive, drag-and-drop interface, Data Flows offer you total control over your data transformation process, enabling faster decision-making, enhanced transparency, and streamlined workflows.
This means that all data cleaning operations and calculations take place in a single visual interface where each step can be easily documented and understood by all users. From the input table to the output tables, you create a continuous sequence of interconnected nodes that transform your data, step by step, with simple drag-and-drop.
Let’s dive into the features, benefits, and how you can get started with Data Flows.
Why Data Flows Matter: Solving the Data Management Challenges
Data management can be complex, especially when dealing with large and diverse datasets. Traditional methods require manual intervention, scattered scripts, and often lack the visibility necessary to troubleshoot or scale. Data Flows solve these problems by providing a fully visual, centralized solution for data transformation.
Visual & Intuitive Interface: Every transformation step is laid out in an easy-to-understand, drag-and-drop interface, making it simple to clean, aggregate, and structure your data with minimal effort.
Increased Transparency: With clear visual representation, users can track each transformation step, ensuring the process is transparent and easily understood.
Efficient Data Handling: By centralizing data workflows, Data Flows simplify the entire process, reducing manual effort and increasing operational efficiency.
Key Benefits of Data Flows
Enhanced Visibility & Transparency: Easily document and understand each transformation step.
Optimized Data Processing: Perform complex operations like merges, aggregations, and calculations within a single interface, minimizing manual work.
Faster Decision-Making: Quickly aggregate data from multiple sources for immediate, actionable insights.
Greater Flexibility: Choose whether to execute transformations on demand or on a scheduled basis.
Scalability & Efficiency: Handle large-scale data operations by processing multiple outputs in parallel, reducing processing time and effort.
Higher Performance: A Data Flow takes data from one or more physical tables, processes it, and writes the results to another physical table. Unlike Views, Merges and Fusions, which need to be cached, Data Flows always build physical tables that do not need to be recalculated or reprocessed each time they are used in other tables in Data Engine or Analytics. You no longer need to worry about constraints and tasks that were previously required, such as cache management and dependency checking.
Logical Flow Sequences: You can organize flows one after the other to build logical processing groups.
Seamless Collaboration: Because all node operations are contained in a single Data Flow, it is easy to share flows with other users, or to copy an entire process and simply change the input and/or output tables. Each node can be given a specific name and description, which keeps the data processing easy to understand with little documentation effort.
Example: You can have an initial flow that cleans the data and standardizes the output schema.
Here we can see how data is prepared for further processing (using a combination of nodes to join, clean and aggregate the data). This flow has been run manually only once, as it relates to historical data that will not change anymore. Note that this is a single Data Flow with multiple paths. Each flow starts with an Input Table, and data gets pushed to an Output Table.
After that, a second flow can use those Output Tables to calculate standard metrics for various reports and dashboards, as in the example below. Here the output tables from the first flow are joined with data generated from another path.
This keeps data processing clean and modular while making it easier to identify potential areas for performance improvement.
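To make the two-stage idea concrete, here is a minimal sketch in plain Python. It is not the Data Engine API: the table names (raw_sales, sales_clean, sales_by_region), the functions, and the in-memory dictionary standing in for physical tables are all assumptions made for illustration. It only shows the pattern described above, where a first flow cleans and standardizes the data and a second flow reads that output table to compute metrics.

```python
# Illustrative only: plain-Python stand-ins, not the Data Engine API.
from collections import defaultdict

# Hypothetical "physical tables", keyed by table name.
tables = {
    "raw_sales": [
        {"region": " north ", "amount": "120.5"},
        {"region": "South", "amount": "80"},
        {"region": "north", "amount": None},  # incomplete row
        {"region": "SOUTH", "amount": "20"},
    ]
}


def clean_flow(source, target):
    """Flow 1: drop incomplete rows and standardize the output schema."""
    tables[target] = [
        {"region": row["region"].strip().lower(), "amount": float(row["amount"])}
        for row in tables[source]
        if row["amount"] is not None
    ]


def metrics_flow(source, target):
    """Flow 2: aggregate the cleaned table into a total per region."""
    totals = defaultdict(float)
    for row in tables[source]:
        totals[row["region"]] += row["amount"]
    tables[target] = [
        {"region": region, "total": total}
        for region, total in sorted(totals.items())
    ]


# Run the cleaning flow once, then build metrics from its output table.
clean_flow("raw_sales", "sales_clean")
metrics_flow("sales_clean", "sales_by_region")
```

Because the cleaned table is written once and stored, the metrics step can be re-run on its own without repeating the cleaning step, which is the modularity the example describes.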
Data Flow Execution: Control Your Data on Your Terms
One of the standout features of Data Flows is the ability to control when and how your data gets processed. Unlike traditional Views and Merges that refresh automatically, Data Flows give you the flexibility to schedule or trigger executions as needed, improving efficiency and performance.
On-Demand or Scheduled Execution: You decide when to run transformations, reducing unnecessary computations and ensuring data is only processed when necessary.
Parallel Processing: Handle multiple data transformations simultaneously, optimizing performance and reducing delays.
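As a rough illustration of on-demand execution with parallel outputs, the sketch below uses Python's standard concurrent.futures module. The function names (run_flow, total_by_region, row_count) and the sample rows are hypothetical and do not reflect how Data Engine implements execution internally. The point is only that nothing is computed until the flow is explicitly run, and that independent outputs can be computed side by side.

```python
# Illustrative only: a conceptual model, not the Data Engine API.
from concurrent.futures import ThreadPoolExecutor


def total_by_region(rows):
    """Hypothetical transformation: sum amounts per region."""
    totals = {}
    for row in rows:
        totals[row["region"]] = totals.get(row["region"], 0) + row["amount"]
    return totals


def row_count(rows):
    """Hypothetical transformation: count the input rows."""
    return len(rows)


def run_flow(rows, transformations):
    """Run each independent transformation in parallel, only when called."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(fn, rows) for name, fn in transformations.items()
        }
        return {name: future.result() for name, future in futures.items()}


input_rows = [
    {"region": "north", "amount": 10},
    {"region": "south", "amount": 5},
    {"region": "north", "amount": 2},
]

# Nothing is computed until this explicit, on-demand call.
outputs = run_flow(input_rows, {"totals": total_by_region, "count": row_count})
```

In this model, a scheduled run would simply be the same run_flow call made by a scheduler at fixed times instead of by a user.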
A Complete Suite for Endless Data Processing Possibilities
Data Flows offer endless flexibility with the ability to create multiple output tables from a variety of input sources. Whether you execute transformations in parallel or in sequence, the tool ensures your data is always up-to-date and ready for analysis.
No Secondary Management Tasks: Forget about manually handling merges or naming conventions. Data Flows streamline the entire process from start to finish.
Run Transformations Efficiently: Whether you need a single output table or multiple, Data Flows allow you to process everything in parallel or sequentially to meet your specific needs.
How to Get Started with Data Flows
Getting started with Data Flows is simple and straightforward. Here’s how you can start using this powerful feature in Data Engine:
Log into your Data Engine account.
Navigate to the ‘Data’ tab and select ‘Data Flows.’
Create a new Data Flow by choosing your input tables, setting up transformations, and defining your output tables.
Schedule or execute your Data Flow based on your specific needs.
For detailed instructions, check out our [user guide].
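Conceptually, the flow you build in these steps is a graph of nodes running from input tables to output tables. The sketch below models that idea with Python's standard graphlib module (Python 3.9 or later). The node names are invented for illustration, and this is not how flows are defined in Data Engine, where you build them visually. It shows how a set of interconnected nodes determines a valid execution order.

```python
# Illustrative only: a conceptual model of a flow, not the Data Engine API.
from graphlib import TopologicalSorter

# Each hypothetical node maps to the set of nodes it depends on.
flow = {
    "input_orders": set(),
    "input_customers": set(),
    "clean_orders": {"input_orders"},
    "join": {"clean_orders", "input_customers"},
    "aggregate": {"join"},
    "output_report": {"aggregate"},
}

# A node can only run after every node it depends on has finished.
execution_order = list(TopologicalSorter(flow).static_order())

for step, node in enumerate(execution_order, start=1):
    print(f"{step}. {node}")
```

The two input nodes have no dependencies, so their relative order may vary, but every node always appears after the nodes that feed it.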
Subscription information
Data Flows will be made available to all Data Engine customers as part of their existing offering, with no action required on their part.
Conclusion: Time to Embrace the Power of Data Flows
We’re thrilled to bring Data Flows to your Data Engine experience. This module is designed to simplify your data transformations, improve performance, and provide complete control over when and how your data is processed.
It’s worth noting that all the functionality of Views, Merges and Fusions is available as nodes within a Data Flow. However, rest assured that the existing Views, Merges and Fusions features will remain available to customers.
Data Flows will become more powerful with every release of Data Engine. Start using Data Flows today and revolutionize how you manage your data workflows.
Related Resources
For more in-depth resources, check out the following: