Apache Cassandra promises linear scalability and workload distribution, among many other features—and rightly so. However, as with many good things in life, these benefits come with a set of upfront conditions. When the use case aligns with the architectural limitations,…
Apache Kafka and Apache Flink are popular platforms for data streaming applications. However, provisioning and managing your own clusters can be challenging and incur operational overhead. Amazon Web Services (AWS) provides a fully managed, highly available version of these platforms…
Apache Kafka and Apache Flink are popular data streaming application platforms. However, provisioning and managing your own clusters can be challenging and incur operational overhead. Amazon Web Services (AWS) provides a fully managed, highly available version of these platforms that…
Episode 57 Shownotes: Welcome to another episode of the Datascape Podcast. On today’s show, Big Data expert and Microsoft data platform MVP Luan Moreno tunes in from Brazil. Luan talks about the start of his career as a DBA, his…
Here at Pythian, we love our data. Our code is no exception (pun sort of intended), so I’ll be covering dataclasses in Python today. The problem: As a Python developer, you’ve almost certainly run into code that looks like the…
I recently encountered the above issue, which prompted me to write this blog post so I can easily reference the solution whenever I need it. However, I also hope it might help anyone out there who hits a similar issue….
Apache Beam is an SDK (software development kit) available for Java, Python, and Go that allows for a streamlined ETL programming experience for both batch and streaming jobs. It’s the SDK that GCP Dataflow jobs use and it comes with…
“We have many disparate data sources and we’re having a hard time getting a global view of all our data across our organization.” “Our data is currently all in <enter data warehouse name here> and we want to migrate it…
As a solutions architect at Pythian, I often get questions from clients about the many solutions available to them to address their big data needs. Between Hadoop, cloud-based, and hybrid solutions, finding the best option for their unique needs can…
A Twitter user recently turned to the platform to issue an appeal to her bank: “Please don’t send me emails asking if I’m ready to buy a house ten minutes after emailing me an overdraft notice.” That one tweet neatly…