Covering Disruptive Technology Powering Business in The Digital Age

Syncsort’s Continued Innovation Simplifies Big Data Integration with Hadoop and Spark 2.0


Syncsort, a global leader in Big Iron to Big Data solutions, today announced new advancements in its industry-leading Big Data integration solution, DMX-h, that enable organizations to accelerate business objectives by speeding development, adapting to evolving data management requirements and leveraging rapid innovation in Big Data technology. New, unmatched Integrated Workflow capabilities and Spark 2.0 integration dramatically simplify Hadoop and Spark application development, enabling organizations to extract maximum value from all their enterprise data assets.

“As Hadoop implementations continue to grow, with more diverse and complex use cases, and a constantly evolving Big Data technology stack, organizations require an increasingly efficient and flexible application development environment,” said Tendü Yoğurtçu, General Manager of Syncsort’s Big Data business. “By enhancing our single software environment with our new integrated workflow capability, we give customers an even simpler, more flexible way to create and manage their data pipelines. We also extend our design-once, deploy-anywhere architecture with support for Apache Spark 2.0, and make it easy for customers to take advantage of the benefits of Spark 2.0 and integrated workflow without spending time and resources redeveloping their jobs.”

Integrated Workflow Delivers Unparalleled Flexibility and Simplicity, Accelerates Time to Insight

Building an end-to-end data pipeline can be time-consuming and complicated, with various workloads executed on multiple compute frameworks, all of which need to be orchestrated and kept up to date. For example, an organization might need to access a data warehouse or mainframe, run batch integration for large historical reference data in Hadoop MapReduce, and run streaming analytics and machine learning workflows with Apache Spark. Delays in development prevent business users from getting the insights they need for decision-making.

With Integrated Workflow, organizations can now manage various workloads, such as batch ETL on very large repositories of historical data and referencing business rules during data ingest, in a single workflow.

The new feature greatly simplifies and speeds development of the entire data pipeline, from accessing critical enterprise data, to transforming that data, and ultimately analyzing it for business insights.

Built into Syncsort DMX-h’s design-once, deploy-anywhere architecture, Integrated Workflow empowers developers to:

  • Dramatically reduce development time and resources by writing jobs in one environment, such as a laptop, and running them anywhere, including MapReduce, Spark 1.x or Spark 2.0, on-premises or in the cloud.
  • Adopt new technologies at the pace that best fits their business, with the ability to run each workload on the compute framework best suited to it.
  • Leverage existing skill sets and reduce learning curves by using a graphical interface to easily create and combine sophisticated workflows into a single job, even when they run on different compute frameworks.

With Integrated Workflow, developers gain unparalleled simplicity and flexibility to adapt to changing workloads, delivering faster time to insight while minimizing development and opportunity costs.

Spark 2.0 Support

Syncsort introduced Spark support in its last major release of DMX-h, allowing customers to take the same jobs initially designed for MapReduce and run them natively in Spark. With the new release, developers can now leverage the same capability to seamlessly take advantage of the enhancements made in Spark 2.0. They can visually design data transformations once and run the jobs in MapReduce, Spark 1.x or Spark 2.0, simply by changing the compute framework. No rewriting, reconfiguring or recompiling is required.
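To make the design-once, deploy-anywhere idea concrete, here is a minimal generic sketch of the pattern: business logic is defined a single time, and the execution engine is selected by configuration at run time. This is an illustration only — the function and engine names are hypothetical and are not part of the DMX-h product or its API.

```python
# Hypothetical sketch of "design once, deploy anywhere" (NOT the DMX-h API):
# the transformation is written once; the compute framework is a config choice.

def transform(record):
    """Framework-agnostic business logic, defined a single time."""
    return {**record, "amount": round(record["amount"] * 1.1, 2)}

def run_local(records, fn):
    """Stand-in for a single-node engine (e.g. a test run on a laptop)."""
    return [fn(r) for r in records]

def run_distributed(records, fn):
    """Stand-in for a cluster engine (e.g. MapReduce or Spark);
    simulated here with identical semantics."""
    return list(map(fn, records))

# Swapping frameworks is a configuration change, not a rewrite.
ENGINES = {"local": run_local, "distributed": run_distributed}

def run_job(records, fn, framework="local"):
    return ENGINES[framework](records, fn)

data = [{"id": 1, "amount": 100.0}, {"id": 2, "amount": 200.0}]
local_out = run_job(data, transform, framework="local")
cluster_out = run_job(data, transform, framework="distributed")
assert local_out == cluster_out  # same job, same results, different engine
```

The point of the pattern is that the transformation logic never changes when the target framework does, which is the behavior the press release attributes to DMX-h jobs moving between MapReduce, Spark 1.x and Spark 2.0.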

This article was originally published on and can be viewed in full