Grow and innovate your business with powerful analytics using Cassandra and Spark (part 1)



Organizations are adopting increasingly sophisticated analytics strategies to create opportunities for growth, innovation and competitive advantage. The analytics trend is about much more than just data. It is in fact a new way of doing business – one that relies on a data-driven decision process. By leveraging its analytics capabilities, an organization can turn large volumes of heterogeneous data (from both internal and external sources) into near-instant feedback. Insight and intelligence derived from fast-moving datasets can help management make “split-second” strategy decisions, spur innovation, develop new products, enhance customer relationships, detect fraud, increase operational efficiency and build a competitive advantage for your organization.

“Highly data-driven companies are three times more likely to report significant improvement in decision making, but only 1 in 3 executives say their organization is highly data driven” – PwC’s Global Data & Analytics Survey

Cassandra’s Place in the Analytics Ecosystem

“Apache Cassandra is a distributed database for managing large amounts of structured data across many commodity servers, while providing highly available service and no single point of failure” – DataStax

One of Cassandra’s selling points is its high write throughput on commodity hardware, which allows companies to scale infrastructure quickly and achieve a high degree of flexibility at a low cost. Cassandra scales linearly: adding a node to a cluster and letting the cluster rebalance itself increases capacity proportionally. This lets adopters ramp up their data collection cost-effectively, which translates into a competitive business advantage and real-world ROI.
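To make the rebalancing idea concrete, here is a minimal sketch of a token ring with virtual nodes. It is a simplified illustration, not Cassandra's implementation: MD5 stands in for the Murmur3 hash used by Cassandra's default partitioner, and the node names, key names and the figure of 64 virtual nodes per node are assumptions made for the example.

```python
import bisect
import hashlib

VNODES_PER_NODE = 64  # assumption for this sketch, not a recommended setting


def token(value: str) -> int:
    """Map a string to a ring position (MD5 here; Cassandra's default partitioner uses Murmur3)."""
    return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Toy token ring: each node owns the ranges ending at its tokens."""

    def __init__(self, nodes):
        self._tokens = []  # sorted ring positions
        self._owners = {}  # ring position -> node name
        for node in nodes:
            self.add_node(node)

    def add_node(self, node: str) -> None:
        # Each node claims several positions (virtual nodes) on the ring.
        for i in range(VNODES_PER_NODE):
            t = token(f"{node}#{i}")
            self._owners[t] = node
            bisect.insort(self._tokens, t)

    def owner(self, key: str) -> str:
        # First token at or after the key's position, wrapping around the ring.
        idx = bisect.bisect_left(self._tokens, token(key)) % len(self._tokens)
        return self._owners[self._tokens[idx]]


def simulate_scale_out(n_keys: int = 10_000):
    """Grow a 4-node ring to 5 nodes; return (fraction of keys moved, where they went)."""
    keys = [f"sensor-{i}" for i in range(n_keys)]
    ring = Ring(["node1", "node2", "node3", "node4"])
    before = {k: ring.owner(k) for k in keys}
    ring.add_node("node5")
    after = {k: ring.owner(k) for k in keys}
    moved = [k for k in keys if before[k] != after[k]]
    return len(moved) / n_keys, {after[k] for k in moved}


if __name__ == "__main__":
    fraction, destinations = simulate_scale_out()
    print(f"{fraction:.1%} of keys moved, all to {destinations}")
```

In this sketch only the keys that land in the new node's ranges change owner, roughly one fifth of them, and everything else stays where it was. That is the property that makes adding a node comparatively cheap.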

Additionally, Cassandra integrates well with other technologies such as Apache Spark, a fast, in-memory data processing engine that can efficiently execute streaming, machine learning or SQL workloads that would be impossible or inefficient in a Cassandra-only ecosystem. Combining the two enables many interesting use cases that help you get more from the data you own. Below are just a few examples of what can be accomplished using this synergy.

Use Cases


Healthcare

The ever-growing challenge for healthcare providers (hospitals, clinics, etc.) is to treat patients more efficiently as the pressure to reduce costs increases, while also improving service quality and risk KPIs. In care coordination and home-care services, machines and wearable devices are being leveraged to track and optimize treatment, improve patient flow and increase equipment uptime.

Cassandra is well suited to storing large volumes of time-series patient data coming from an ever-growing number of devices, sensors and similar sources, which can then be analyzed further using Apache Spark.
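As a rough illustration of how such device data is commonly laid out, the sketch below buckets readings into one partition per device per day, with rows ordered newest first. The table in the comment, its column names and the sample readings are all hypothetical; the Python code only mimics the grouping in memory and does not talk to a cluster.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Hypothetical table this sketch mirrors (all names are illustrative):
#
#   CREATE TABLE patient_readings (
#       device_id  text,
#       day        date,
#       ts         timestamp,
#       heart_rate int,
#       PRIMARY KEY ((device_id, day), ts)
#   ) WITH CLUSTERING ORDER BY (ts DESC);


def partition_key(device_id: str, ts: datetime) -> tuple:
    """One partition per device per UTC day keeps each partition bounded in size."""
    return (device_id, ts.astimezone(timezone.utc).date().isoformat())


def bucket_readings(readings):
    """Group (device_id, ts, value) readings the way the table above would store them."""
    partitions = defaultdict(list)
    for device_id, ts, value in readings:
        partitions[partition_key(device_id, ts)].append((ts, value))
    for rows in partitions.values():
        rows.sort(key=lambda row: row[0], reverse=True)  # newest first, like ts DESC
    return dict(partitions)


def utc(*args) -> datetime:
    """Small helper to build timezone-aware UTC timestamps."""
    return datetime(*args, tzinfo=timezone.utc)


if __name__ == "__main__":
    sample = [
        ("monitor-1", utc(2024, 1, 1, 23, 59), 71),
        ("monitor-1", utc(2024, 1, 2, 0, 1), 74),
        ("monitor-1", utc(2024, 1, 1, 8, 0), 65),
    ]
    for key, rows in bucket_readings(sample).items():
        print(key, [value for _, value in rows])
```

Bucketing by day is only one possible choice. The right bucket size depends on how many readings each device produces and on how the data will be queried.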



Financial Services

Facing increasing competition, financial institutions are seeking new ways to leverage technology to gain efficiency. The sector has widely adopted analytics to support better investment decisions and thus higher and more consistent returns. The adoption of analytics is inexorably transforming the sector’s landscape.

The increasing volume of market data poses a big challenge for financial institutions, and this is where Cassandra can stretch its legs: it is a good fit for ingesting high-volume time-series data. Adding Spark opens up many new uses for the collected data by bringing analytics to near real time, such as credit risk analysis, credit scoring or fraud detection.
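A hedged sketch of what this pairing could look like in PySpark follows. It assumes the open-source Spark Cassandra Connector is available to Spark, and it invents a keyspace (`bank`), a table (`transactions`) and columns (`account_id`, `amount`) purely for illustration. The rule it applies, flagging amounts far from an account's own average, is a deliberately crude stand-in for a real fraud model.

```python
def cassandra_options(keyspace: str, table: str) -> dict:
    """Options passed to the Spark Cassandra Connector's DataFrame source."""
    return {"keyspace": keyspace, "table": table}


def flag_unusual_transactions(spark, keyspace="bank", table="transactions", sigmas=3.0):
    """Return transactions whose amount is far from the account's own average.

    Assumes a hypothetical table with at least account_id and amount columns.
    """
    # Imported lazily so this module can be loaded without PySpark installed.
    from pyspark.sql import Window
    from pyspark.sql import functions as F

    # Read the Cassandra table as a Spark DataFrame through the connector.
    tx = (
        spark.read.format("org.apache.spark.sql.cassandra")
        .options(**cassandra_options(keyspace, table))
        .load()
    )

    # Compute each account's mean and standard deviation alongside every row.
    per_account = Window.partitionBy("account_id")
    scored = (
        tx.withColumn("avg_amount", F.avg("amount").over(per_account))
        .withColumn("std_amount", F.stddev("amount").over(per_account))
    )

    # Keep rows more than `sigmas` standard deviations from the account average.
    return scored.where(
        F.abs(F.col("amount") - F.col("avg_amount")) > sigmas * F.col("std_amount")
    )
```

Running it would require a SparkSession started with the connector on its classpath and `spark.cassandra.connection.host` pointing at the cluster. The code shows the shape of the integration rather than a production pipeline.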


Telecom Companies

Now that customers frequently connect to their networks through voice, text and other smartphone interactions, telecom companies have access to huge quantities of data.

Due to the nature of their business, uptime and performance are paramount for these companies. As a result, Apache Cassandra can be a good fit for storing the data that will later enable operators to run analytics programs to boost network efficiency, segment customers and drive profitability.


Energy and Utilities

With the advent of smart grids, more and more utility companies are leveraging the power of analytics for energy management, building automation and energy distribution.

Again, Cassandra’s ability to handle high-velocity time-series data makes it possible to integrate millions of data points on network performance. Engineers can then use analytics to monitor the network, send automated predictions and chart energy usage patterns, thereby reducing service outages and maintenance costs.


About the Author

Pedro's eight years of work history include consulting and Developmental Business Unit Management for multinational IT consultancy firms. This provided him with extensive experience in leading development teams as well as designing and developing software. Pedro's MBA in management gave him the business knowledge required when interacting with clients from various global multinational companies. He is dedicated to building and designing solutions that meet client requirements regardless of the underlying technology.
