Quickly scale and deploy a complete private cloud for your enterprise data and AI architecture
Highlights
For years, companies have been accumulating data at an impressive pace. Some of this data is being duplicated across various clouds and repositories at astounding levels. This duplication leads to data silos, high costs, delayed projects and increased security risks.
Data users assigned to an analytics initiative spend as much as 60 percent of their overall time trying to locate the data they need. Organizations have existing processes to store data in multiple repositories, clouds and back-end systems and have invested resources to move data to a central location where it can be managed, controlled and made accessible. However, most centralized big data projects still fail to deliver on the promise of easy and controlled data access. Turning that failure into success would allow data scientists to access data across their organization wherever it resides, greatly reducing the amount of time spent searching for data.
To address this, IBM developed Cloud Pak for Data System, a hyper-converged system that combines software, storage, compute and networking. Built on a governed data and AI platform, Cloud Pak for Data System simplifies and unifies the management, governance and analysis of data. It allows you to provision and deploy data services flexibly and rapidly, tailored to specific needs. Users can automatically add discovered hardware nodes and add software through an app-like experience to expand their system.
The Cloud Pak for Data System installation is designed to be as simple as using the software itself. Hyper-converged technology makes it possible for users to scale and evolve their infrastructures simply and economically as application loads change. Cloud Pak for Data System includes a flexible microservices software architecture and field-programmable gate arrays (FPGAs).
Customers are aware of the value and speed cloud provides to their applications and services and want the same efficiency in their own data centers. Cloud Pak for Data System offers accelerated time to value, allowing an entire private cloud system to be stood up for a data and AI architecture in hours. That's why we also call it a “cloud in a box.”
Hyper-convergence enables a software-defined data center to provide cost-effective agility, scalability and security. Customers can quickly develop applications and analyze their data in a security-rich environment within their own firewalls and expand as they need rather than resort to expensive capital outlays. Most importantly, developers don’t need to change how they deploy applications, whether they’re working in the cloud or their own data center.
When your needs outgrow your existing installation, Cloud Pak for Data System simplifies expansion. Buy small increments of computing or storage capacity as needed, rather than spending excessively on capital. Based on application demand and delivery, expansions can be budgeted quarterly or on an ad hoc basis. When the hardware arrives, it’s pre-loaded and configured with software and licenses. Once the hardware is installed in the data center, the existing cluster automatically discovers the new hardware and licenses, and the new nodes inherit all user configurations and are ready to use in a few hours. The platform evolves based on each enterprise's needs and scalability, and it’s all managed and highly available to help ensure maximum performance and productivity.
Natively built with Red Hat OpenShift Container Platform, Cloud Pak for Data System can support seamless migration of containers.
With OpenShift, users can easily write and deploy applications knowing that they’ll run on a platform optimized for Red Hat OpenShift. When choosing to deploy a private cloud on premises, Cloud Pak for Data System provides optimized hardware to increase the container performance of the Red Hat cluster while speeding the time to value of data workloads.
The Cloud Pak for Data platform delivers on six primary use cases:
Cloud Pak for Data System delivers on the use cases above with accelerated deployment, optimized performance, and reduced footprints, all behind a firewall on premises in your data center.
Beyond the benefits of the platform, the hyper-converged infrastructure offers proven performance, scalability, intelligence and simplicity for business needs. It helps companies with:
On top of the benefits outlined above, another key value-add for Cloud Pak for Data is Netezza Performance Server.
Netezza Performance Server for Cloud Pak for Data allows customers to augment the market-leading in-database capabilities of legacy IBM PureData Systems with the full spectrum of analytics and AI solutions modern businesses require. The upgrade to Netezza Performance Server on Cloud Pak for Data and Cloud Pak for Data System is 100% compatible with existing Netezza appliances. Upgrading to the new version is a seamless lift and shift, not a migration. It’s as simple as “nz_migrate.” This saves the time, effort and ultimately cost that would otherwise have gone toward migrations.
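To give a sense of what a single “nz_migrate” step can look like, here is a minimal Python sketch that only assembles and prints a command line for review; it does not run anything. The flag names (-sdb, -tdb, -thost), the database name and the host name are assumptions for illustration and are not taken from this brief; check them against the script's built-in help on your own system.

```python
import shlex


def build_nz_migrate_command(source_db, target_db, target_host):
    """Return the argument list for an illustrative nz_migrate invocation.

    The flag names used here (-sdb, -tdb, -thost) are assumptions; verify
    them against the script's own help output before relying on them.
    """
    return [
        "nz_migrate",
        "-sdb", source_db,      # source database (assumed flag)
        "-tdb", target_db,      # target database (assumed flag)
        "-thost", target_host,  # target host (assumed flag)
    ]


if __name__ == "__main__":
    # Hypothetical database and host names, for illustration only.
    cmd = build_nz_migrate_command("SALES", "SALES", "nps-target.example.com")
    # Print the command so an administrator can review it; nothing is executed.
    print(shlex.join(cmd))
```

Building the command as a list rather than a single string keeps each argument separate, which avoids quoting problems if the command is later passed to a process launcher.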
Designed for deep analysis of complex and diverse data volumes scaling into the petabytes, Netezza Performance Server delivers actionable business insight with industry-leading speed and cost of ownership. Customers benefit from in-database analytics and hardware-accelerated machine learning while leveraging Cloud Pak for Data System’s powerful AI capabilities such as IBM InfoSphere DataStage, IBM Cognos Analytics and IBM Watson Knowledge Studio. This combination allows customers to see hidden trends in customer behavior, uncover subtle changes in the market for competitive advantage and model the impact of changes made based on this intelligence.
Netezza Performance Server requires minimal administration and tuning, both for initial deployment and for ongoing maintenance. It allows data scientists to spend a majority of their time analyzing results and providing insight rather than preparing data. Netezza Performance Server also delivers a distinct performance advantage over other analytic options. Its asymmetric massively parallel processing (AMPP)™ architecture combines Red Hat OpenShift and containers running on blade servers and NVMe disk storage with hardware-accelerated data filtering using field-programmable gate arrays (FPGAs). This combination delivers fast query performance on complex analytic workloads, providing sophisticated analytics to drive business insight.
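The general idea behind this kind of architecture, filtering data in parallel close to storage and sending only small partial results to a coordinating host, can be illustrated with a short conceptual sketch. This is not Netezza's implementation: the table, the slice layout and the function names are hypothetical, and ordinary Python threads stand in for the parallel hardware.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical table, already distributed across four "data slices".
SLICES = [
    [{"region": "east", "amount": 120}, {"region": "west", "amount": 75}],
    [{"region": "east", "amount": 40}, {"region": "north", "amount": 310}],
    [{"region": "west", "amount": 15}, {"region": "east", "amount": 90}],
    [{"region": "north", "amount": 60}, {"region": "east", "amount": 5}],
]


def scan_slice(rows, region, min_amount):
    """Filter rows next to the storage and return a small partial aggregate.

    Stands in for the early filtering step the text attributes to FPGAs:
    only a (sum, count) pair leaves the slice, not the raw rows.
    """
    kept = [
        r["amount"]
        for r in rows
        if r["region"] == region and r["amount"] >= min_amount
    ]
    return sum(kept), len(kept)


def run_query(slices, region, min_amount):
    """Host-side step: fan the scan out to every slice and merge the partials."""
    with ThreadPoolExecutor(max_workers=len(slices)) as pool:
        partials = list(
            pool.map(lambda s: scan_slice(s, region, min_amount), slices)
        )
    total = sum(p[0] for p in partials)
    count = sum(p[1] for p in partials)
    return total, count


if __name__ == "__main__":
    total, count = run_query(SLICES, region="east", min_amount=10)
    print(f"matching rows: {count}, total amount: {total}")
```

Because each slice returns only a pair of numbers, the amount of data reaching the host stays small no matter how many rows each slice scans.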
Data scientists can build their models using all enterprise data. Netezza Performance Server consolidates all analytics into a single platform where the data resides. Linear scalability with little to no impact on performance allows data scientists and quantitative teams to operate inside the appliance without having to off-load massive data sets to separate infrastructure.
Hardware acceleration allows data scientists to iterate and fine-tune analytical models faster to arrive at the best solution. Once the model is developed, it is seamlessly executed against the relevant data. Prediction and scoring can also be done where the data resides. Users get their predictive scores in near real-time, operationalizing advanced analytics by making it available throughout the enterprise.
Netezza Performance Server supports an extensive array of built-in analytical tools. It is delivered with a library of more than 200 pre-built, scalable, in-database analytic functions that execute analytics in parallel while abstracting away the complexity of parallel programming from developers, users and DBAs. Netezza Performance Server leverages native Cloud Pak for Data System support for Jupyter or Zeppelin notebooks, RStudio and a wide variety of other machine learning environments including SPSS and the Watson Catalog.
The analytics functionality extends to in-database geospatial analytics that are compatible with industry-standard ESRI GIS formats. Integration with existing geospatial analytic environments is simplified as a result.
Because data lives in a variety of places and formats, IBM’s Fluid Query allows users to query data in multiple data stores: third-party, open source, Hadoop, cloud and remote data sources. Users familiar with Fluid Query will find 100% compatibility in Performance Server.
Companies have accumulated and analyzed data in PureData System for Analytics for years – realizing the value of data-driven business decisions. With Cloud Pak for Data System and Netezza Performance Server, users benefit from a hyper-converged, containerized, microservices "cloud in a box" architecture – bringing the scalability and economics of the cloud to their own data centers. Users can enjoy flexibility with even faster and easier to manage analytics capabilities – a key component to every customer’s journey to AI.
With Netezza Performance Server on Cloud, users can now deploy Netezza Performance Server on private clouds and public clouds—such as IBM Cloud, Amazon Web Services (AWS) or Azure—through Cloud Pak for Data. Netezza is also available on the cloud as a service on Microsoft Azure.
For further information, see our blog, IBM's Integrated Cloud Data Platform, and our Netezza Performance Server webpage.
To hear more about how Netezza integrates with Cloud Pak for Data, contact us here, or to download the brochure, click here.