IBM this week announced a cloud-centered platform for data and data analytics. Suitable for use in everything from simple on-prem private clouds to hybrid clouds spanning multiple public clouds, the platform takes a single-pane-of-glass approach and treats all data points as if they were on a single server, no matter where they reside. The platform is called IBM Cloud Private for Data (ICP Data), and Big Blue claims its users can use it “to quickly uncover previously unobtainable insights from their data.”
“Basically, IBM Cloud Private for Data is bringing the public cloud to your data, regardless of where that is,” Rob Thomas, GM of IBM Analytics, told Data Center Knowledge. “We’re delivering a platform that can see all of your data, whether it’s in Oracle, Teradata, an IBM repository, or on any public cloud. We give you visibility, the ability to train machine learning models across all of your data — and that’s very unique.”
In other words, the platform is more than just a way to warehouse data. It brings an assortment of IBM analytics tools to the mix and, according to Thomas, is easy enough to use out of the box with little if any hand-holding from IBM. There’s also a fully capable community edition that customers can use in a non-production environment while they learn the ropes.
Since news of the platform was made public at IBM’s Think event in February, there have been several additions. Support for MongoDB and PostgreSQL databases has been added through partnerships with MongoDB and EnterpriseDB, a commercial Postgres vendor. IBM has also formed a partnership with Red Hat, as a result of which the platform can run natively on OpenShift, Red Hat’s container orchestration platform. In addition, support for compliance with the GDPR, the EU’s new data privacy regulation, has been built into the platform, with master data management implemented as microservices and IBM Data Risk Management included as an application on the platform.
“All of this really hits at these major needs we see in the world: modernize your data, do it in a compliant way, and make your data more accessible,” Thomas said. “That’s what we’re doing.”
Although the buzzwords around the platform are containers and Kubernetes, what’s most impressive is the speed at which IBM claims it can analyze data.
“We have an ingestion engine that’s built inside what’s called Db2 Event Store,” Thomas said. “It’s based on a Spark engine and uses a Parquet file format. This is the highest performance engine for bringing data in and doing analytics on the fly. What we’re doing with this event engine is enabling real-time analytics like never before.”
Thomas said the platform can ingest 250 billion events a day, which is about the number of transactions the global credit card industry processes in a year.
“Basically, we ingest up to a million rows a second, to be that specific,” he said. “As you’re ingesting that data, you could’ve built an analytical model on top of that and be doing analytics as you’re bringing that data in. It’s incredibly fast. It’s no longer a two-step process of bring data in and do analytics. We can do it at once and do it at incredible speed.”
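The two throughput figures Thomas cites can be put on the same scale with simple arithmetic. The sketch below only does the unit conversion; the function and variable names are illustrative, and the article does not say whether both figures describe the same configuration.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def events_per_day(rows_per_second: float) -> float:
    """Convert a sustained per-second ingest rate to a daily total."""
    return rows_per_second * SECONDS_PER_DAY


def rows_per_second(daily_events: float) -> float:
    """Convert a daily total to the average per-second rate it implies."""
    return daily_events / SECONDS_PER_DAY


QUOTED_RATE = 1_000_000          # rows per second, as quoted by Thomas
QUOTED_DAILY = 250_000_000_000   # events per day, as quoted by Thomas

daily_at_quoted_rate = events_per_day(QUOTED_RATE)
rate_for_quoted_daily = rows_per_second(QUOTED_DAILY)

print(f"{daily_at_quoted_rate:,.0f} events/day at 1M rows/s")
print(f"{rate_for_quoted_daily:,.0f} rows/s needed for 250B events/day")
```

A sustained million rows a second works out to 86.4 billion events a day, so 250 billion a day implies an average of roughly 2.9 million rows a second. The gap may come down to cluster size or to peak versus per-node rates, but that is a guess; the source does not explain it.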
Big Blue is also pushing the platform’s usefulness at the edge, which matters because more and more data is generated by consumers using mobile devices and by sensors on connected industrial devices. Thomas indicated that this aspect of the platform is something of a work in progress and hinted that more edge capabilities will arrive over time.
“We have some work going on right now with what we call Wide Angle SQL Query, to be able to query SQL all the way out to an IoT device, and this is just the start of that,” he said. “Even having this ingestion engine enables us to integrate data from a lot of different IoT type sources, so it’s a good step in that direction.”
One of the biggest selling points ICP Data brings to the table, according to Thomas, is the choice it gives users in how they handle their data.
“I think one of the challenges in IT has always been that companies released great analytical tools and they say, ‘Step one: move all your data into our tool.’ That costs a lot of time and costs a lot of money. Our philosophy is that you can leave your data where it is, if you want, and we can access it there, or you can bring the data in.
“If you bring the data in, you’ve got all types of choices for how you manage that data. It can sit in Hadoop, it can sit in MongoDB, it can sit in Postgres, it can sit in Db2, or it can sit in that event engine that I described. We provide a variety of different heterogeneous offerings or microservices for how you manage your data, how you store it, but we don’t require it. I think that’s pretty unique.”
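The “leave your data where it is” idea can be illustrated on a small scale with a query that joins tables living in separate stores without first copying them into one. The sketch below is only an analogy: it uses SQLite’s `ATTACH DATABASE` from Python’s standard library, not ICP Data, and the file names, table names and sample rows are all made up. The article does not describe how IBM implements this.

```python
import os
import sqlite3
import tempfile


def make_db(path, ddl, insert_sql, rows):
    """Create a standalone SQLite file, standing in for one data source."""
    conn = sqlite3.connect(path)
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    conn.commit()
    conn.close()


def federated_totals(orders_path, customers_path):
    """Join two separate database files in place, without merging them."""
    conn = sqlite3.connect(":memory:")
    conn.execute("ATTACH DATABASE ? AS sales", (orders_path,))
    conn.execute("ATTACH DATABASE ? AS crm", (customers_path,))
    rows = conn.execute(
        """
        SELECT c.name, SUM(o.amount)
        FROM sales.orders AS o
        JOIN crm.customers AS c ON c.id = o.customer_id
        GROUP BY c.name
        ORDER BY c.name
        """
    ).fetchall()
    conn.close()
    return rows


with tempfile.TemporaryDirectory() as tmp:
    orders_db = os.path.join(tmp, "orders.db")        # hypothetical source 1
    customers_db = os.path.join(tmp, "customers.db")  # hypothetical source 2
    make_db(
        orders_db,
        "CREATE TABLE orders (customer_id INTEGER, amount REAL)",
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 20.0), (1, 5.0), (2, 12.5)],
    )
    make_db(
        customers_db,
        "CREATE TABLE customers (id INTEGER, name TEXT)",
        "INSERT INTO customers VALUES (?, ?)",
        [(1, "Acme"), (2, "Globex")],
    )
    totals = federated_totals(orders_db, customers_db)

print(totals)  # [('Acme', 25.0), ('Globex', 12.5)]
```

Each file stays where it was created; only the query engine reaches out to both. A platform spanning Oracle, Teradata and public clouds faces far harder problems, such as network latency, differing SQL dialects and security, which this toy example does not attempt to address.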