Palo Alto, CA, USA
May 13, 2021   |  By Bill Zhang
Since my last blog, What you need to know to begin your journey to CDP, we have received many requests for a tool from Cloudera to analyze workloads and help upgrade or migrate to Cloudera Data Platform (CDP). The good news is that Cloudera has a tried and tested tool, Workload Manager (WM), that meets your needs. WM saves time and reduces risk during upgrades or migrations.
May 13, 2021   |  By Dinesh Chandrasekhar
Are you using the right stream processing engine for the job at hand? You might think you are—and you very well might be!—but have you really examined the stream processing engines out there in a side-by-side comparison to make sure? Our Choose the Right Stream Processing Engine for Your Data Needs whitepaper makes those comparisons for you, so you can quickly and confidently determine which engine best meets your key business requirements.
May 10, 2021   |  By Tristan Stevens
The introduction of CDP Public Cloud has dramatically reduced the time in which you can be up and running with Cloudera’s latest technologies, be it with containerised Data Warehouse, Machine Learning, Operational Database or Data Engineering experiences or the multi-purpose VM-based Data Hub style of deployment.
May 10, 2021   |  By Bill Havanki
Cloudera Data Platform (CDP) provides an API that enables you to access CDP functionality from a script, or to integrate CDP features with an application. In practice you can use the CDP API to script repetitive tasks, manage CDP resources, or even create custom applications. You can learn more about the API in its official documentation. There are multiple ways to access the API, including through a dedicated CLI, through a Java SDK, and through a low-level tool called cdpcurl.
May 6, 2021   |  By Andreas Skouloudis
In this article, I will focus on the contribution that a multi-cloud strategy makes towards these value drivers, and address a question that I regularly get from clients: Is there a quantifiable benefit to a multi-cloud deployment? That question is typically asked when I explain the ability to leverage container technology that offers a consistent deployment environment across multiple clouds and form factors (public, private, or hybrid cloud).
May 5, 2021   |  By WeiWei Yang
Apache YuniKorn (Incubating) has just released 0.10.0 (release announcement). As part of this release, a new feature called Gang Scheduling has become available. By leveraging the Gang Scheduling feature, scheduling Spark jobs on Kubernetes becomes more efficient.
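To make the idea concrete, here is a minimal sketch of the all-or-nothing principle behind gang scheduling. It is not YuniKorn's implementation: the `Job` class, the `gang_schedule` function, and the single pooled CPU budget are assumptions made purely for illustration, and real schedulers place pods on individual nodes rather than against one aggregate number.

```python
from dataclasses import dataclass


@dataclass
class Job:
    """Hypothetical job description, not a YuniKorn or Kubernetes type."""
    name: str
    pods: int          # number of pods the job needs at the same time
    cpu_per_pod: int   # CPU units requested by each pod


def gang_schedule(jobs, free_cpu):
    """Admit a job only if all of its pods fit; otherwise admit none of them."""
    admitted = []
    for job in jobs:
        needed = job.pods * job.cpu_per_pod
        if needed <= free_cpu:
            free_cpu -= needed
            admitted.append(job.name)
        # Otherwise the whole gang waits: no partial allocation is made,
        # so a half-started job never holds resources it cannot use.
    return admitted, free_cpu


jobs = [
    Job("spark-a", pods=4, cpu_per_pod=2),
    Job("spark-b", pods=3, cpu_per_pod=2),
    Job("spark-c", pods=1, cpu_per_pod=2),
]
admitted, remaining = gang_schedule(jobs, free_cpu=10)
print(admitted, remaining)
```

In this toy run, `spark-b` is skipped as a unit because only part of it would fit, which leaves room for the smaller `spark-c` instead of stranding capacity on a job that cannot start.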
May 4, 2021   |  By Patrick Angeles
Speed matters in financial markets. Whether the goal is to maximize alpha or minimize exposure, financial technologists invest heavily in having the most up-to-date insights on the state of the market and where it is going. Event-driven and streaming architectures enable complex processing on market events as they happen, making them a natural fit for financial market applications.
May 3, 2021   |  By David LeGrand
Last year presented business and organizational challenges that hadn’t been seen in a century and the troubling fact is that the challenges applied pains and gains unequally across industry segments. While brick-and-mortar retail was crushed a year ago with mandated store closures, digital commerce retailers realized ten years of digital sales penetration in only three months.
Apr 30, 2021   |  By Vijay Karthikeyan
Apache Spark is now widely used in many enterprises for building high-performance ETL and Machine Learning pipelines. For users already familiar with Python, PySpark provides a Python API for using Apache Spark. When users work with PySpark, they often use existing and/or custom Python packages in their programs to extend and complement Apache Spark’s functionality. Apache Spark provides several options to manage these dependencies.
Apr 29, 2021   |  By Pierre Villard
Cloudera released a lot of things around Apache NiFi recently! We just released Cloudera Flow Management (CFM) 2.1.1 that provides Apache NiFi on top of Cloudera Data Platform (CDP) 7.1.6. This major release provides the latest and greatest of Apache NiFi as it includes Apache NiFi 1.13.2 and additional improvements, bug fixes, components, etc. Cloudera also released CDP 7.2.9 on all three major cloud platforms, and it also brings Flow Management on DataHub with Apache NiFi 1.13.2 and more.
May 13, 2021   |  By Cloudera
Kafka Summit Europe 2021 takes place May 11-12. What will be the major takeaways and interesting points from the event? Join us, live, to discuss what we think are the most important things to know. Take advantage of live Q&A with some of Cloudera's event streaming experts.
May 6, 2021   |  By Cloudera
Continuous SQL uses Structured Query Language (SQL) to create computations against unbounded streams of data and to store the results in persistent storage. The stored results can be connected to other applications for analytical visualization of your data. Compared to traditional SQL, in Continuous SQL the data has a start, but no end. This means that queries continuously emit results to a sink or other target types. When you define your job in SQL, the SQL statement is interpreted and validated against a schema. After the statement is executed, the results that match the criteria are continuously returned.
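The difference from a traditional query can be sketched in plain Python. This is only an illustration of the concept, not how Cloudera's product is implemented: `sensor_stream`, `continuous_query`, the `temp` field, and the in-memory `sink` list are all hypothetical stand-ins, and the run is capped at five results only so that the example terminates.

```python
import itertools
import random


def sensor_stream(seed=0):
    """Hypothetical unbounded source: has a start, but never ends."""
    rng = random.Random(seed)
    for event_id in itertools.count():
        yield {"id": event_id, "temp": rng.randint(0, 100)}


def continuous_query(stream, threshold):
    """Roughly: SELECT id, temp FROM stream WHERE temp > threshold."""
    for row in stream:
        if row["temp"] > threshold:
            # Matching rows are emitted as they arrive, not at the end.
            yield row


sink = []  # stands in for persistent storage
# islice caps the otherwise endless query so this example can finish.
for row in itertools.islice(continuous_query(sensor_stream(), 80), 5):
    sink.append(row)

print(sink)
```

The query itself never finishes; only the consumer decides when to stop reading, which is why results are written to a sink incrementally instead of being returned as one final result set.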
Apr 29, 2021   |  By Cloudera
In this meetup, we’re going to once again put ourselves in the shoes of an electric car manufacturer that is deploying a recently developed electric motor out into their new cars. We’re going to show how to explore some data that has been previously collected through various different sources and stored into Apache Hive within a data warehouse, with the goal of tracking down a specific set of potentially defective parts. We’ll then take the results of this data exploration and create an interactive dashboard that presents our results in a visually appealing way using a BI tool that’s integrated right into the same data warehouse.
Apr 29, 2021   |  By Cloudera
Join us for this month's Machine Learning research discussion with Cloudera Fast Forward Labs. We will discuss few-shot text classification - including a live demo and Q&A. This is an applied research report by Cloudera Fast Forward. We write reports about emerging technologies. Accompanying each report are working prototypes or code that exhibits the capabilities of the algorithm and offer detailed technical advice on its practical application.
Apr 23, 2021   |  By Cloudera
Learn how Cloudera and Red Hat help enterprise companies securely manage the complete data lifecycle, putting data to work faster and reducing time to value. Cloudera Data Platform (CDP) Private Cloud on Red Hat® OpenShift® aggregates and visualizes data to derive actionable insights in a secure, hybrid, and open-source environment.
Apr 21, 2021   |  By Cloudera
You asked for it, and we are delivering the third in our “Hello:” series of introductory “Big Data” topics. Our next meetup covers using Apache NiFi. Lots of people want to be a data scientist... but what good is machine learning, artificial intelligence or advanced analytics if you don’t have data? Getting data is incredibly important, but getting data in real time or near real time helps you give near real time insight.
Apr 8, 2021   |  By Cloudera
In this video, we'll walk through an example of how you can use Cloudera Machine Learning to run some Python code that creates specific Machine Learning models. We’ll then go through some features within Cloudera Machine Learning such as job scheduling and model deployments to see how you can do some more advanced machine development operations!
Apr 8, 2021   |  By Cloudera
Join the CDP Public Cloud team for a live chat about what's new in CDP Public Cloud - we'll chat about some of our favorite new features, including our recent Google Cloud launch.
Apr 7, 2021   |  By Cloudera
A complete overview of Cloudera Machine Learning (CML) on Cloudera Data Platform. This video covers all CML features for data science workflows.
Apr 7, 2021   |  By Cloudera
The kubectl tool provides direct administrative access to the Kubernetes cluster underlying a CDE service, which is useful for troubleshooting, among other things. This video will demonstrate how to set up kubectl access. To enable kubectl, we will need a couple of prerequisites. We will need the kubeconfig file from the CDE service. We will need to get and authorize the IAM user, and then make sure that everything is set up correctly, both for kubectl and some other tools like k9s.
Jun 28, 2018   |  By Cloudera
Enterprises require fast, cost-efficient solutions to the familiar challenges of engaging customers, reducing risk, and improving operational excellence to stay competitive. The cloud is playing a key role in accelerating time to benefit from new insights. Managed cloud services that automate provisioning, operation, and patching will be critical for enterprises to leverage the full promise of the cloud when it comes to time to value and agility.
Jun 26, 2018   |  By Cloudera
The adoption of cloud computing in the financial services sector has grown substantially in the past three years on a global basis. Diversification of risk is always a key concern for financial institutions and the seeming safety of having a single cloud provider is not being properly measured from a systemic risk and operational risk perspective.
Jun 12, 2018   |  By Cloudera
This white paper provides a reference architecture for running Enterprise Data Hub on Oracle Cloud Infrastructure. Topics include installation automation, automated configuration and tuning, and best practices for deployment and topology to support security and high availability.
May 17, 2018   |  By Cloudera
A cloud-based analytics platform needs to be easy, unified, and enterprise-grade to meet the demands of your business. This white paper covers how Cloudera's machine learning and analytics platform complements popular cloud services like Amazon Web Services (AWS) and Microsoft Azure, and enables customers to organize, process, analyze, and store data at large scale...anywhere.
May 15, 2018   |  By Cloudera
The Modern Platform for Machine Learning and Analytics Optimized for Cloud.
Mar 25, 2018   |  By Cloudera
In the wake of the global financial crisis, the world has become much more interconnected and immensely more complex. As a result, you can no longer simply look at the past as an indicator of future trends. The financial services industry needs real-time insights into numerous interacting variables to make informed decisions.

Cloudera delivers the modern platform for machine learning and analytics optimized for the cloud. Imagine having access to all your data in one platform. The opportunities are endless. We enable you to transform vast amounts of complex data into clear and actionable insights to enhance your business and exceed your expectations.

The right products for the job:

  • Enterprise Data Hub: Operate with confidence—thanks to comprehensive security and governance—while at the same time enabling unrivaled self-service performance at extreme scale. All in an enterprise-grade solution that lets you run anywhere, on-premises or in hybrid- and multi-cloud environments.
  • Data Science Workbench: Accelerate machine learning from research to production with the secure, self-service enterprise data science platform built for the enterprise.
  • Data Warehouse: A modern data warehouse that delivers an enterprise-grade, hybrid cloud solution designed for self-service analytics.
  • Data Science & Engineering: Cloudera Data Science provides better access to Apache Hadoop data with familiar and performant tools that address all aspects of modern predictive analytics.
  • Altus Cloud: The industry’s first machine learning and analytics cloud platform built with a shared data experience.

The world’s leading organizations choose Cloudera to grow their businesses, improve lives, and advance human achievement.