Author: Manoj Kukreja

Configure high availability – load balancing for Hiveserver2

Manoj Kukreja, Pythian Big Data Consultant, walks through the steps to keep your Hive system smooth and available under increased workloads.


Benchmarking Google Cloud SQL instances

Google Cloud SQL is a fully managed database service that makes it easy to set up, maintain, manage, and administer your relational MySQL databases in the cloud. Cloud SQL allows you to focus on your applications rather than administering your databases…


Step-by-step upgrades to Cloudera Manager and CDH

Lately, several of our security-conscious clients have expressed a desire to install or upgrade their Hadoop distribution on cluster nodes that do not have access to the internet. In such cases the installation needs to be performed using local…


Ingest a single table from Microsoft SQL Server Data into Hadoop

Introduction: This blog describes the best-practice approach to ingesting data from SQL Server into Hadoop. The scenario is as follows: single-table ingestion (no joins); no partitioning; complete data ingestion (trash old and replace new)…
