Using Ansible to Secure Cloudera Manager Installation on a Hadoop Cluster

Building a secure Hadoop cluster requires protecting the many services that make up the Hadoop infrastructure. If you are using the CDH distribution, then Cloudera Manager (CM) is one of the components that needs to be secured. There is a good step-by-step guide in the CM documentation, and it’s easy to follow for one server, but what do you do when you have hundreds of them? There are different approaches to managing server configuration at scale, but I’d like to focus on Ansible, which is a neat framework for parallel command execution and complex rollouts.

Big Data is the Commercial Supercomputing in the Age of Datafication

Modern commercial supercomputing in the age of Datafication is what we today call Big Data. I think a better term for it would be Data Supercomputing, but the industry has already spoken, so Big Data it is. The architecture has shifted from environments that required massively parallel, compute-intensive number crunching to massively parallel, data-volume-intensive processing.

HDFS Authentication Puzzle

The HDFS authentication model changed in recent releases, but the documentation is stale, which can lead people to think that HDFS uses very primitive authentication.

Structured vs Unstructured Big Data Architecture

A Petabyte of Data Is a Terrible Thing to Waste

Inspired by a T-shirt he got from Splunk, Alex shares his thoughts on Big Data, data storage and the changing landscape with emerging technologies like Hadoop.

New Year, New Big Data Appliance

Shortly before we all went on break for the holiday, Oracle announced the new BDA X3-2. Now I have time to properly sit down with a glass of fine scotch and dig into the details of what is included in the release. Turns out that there are quite a few changes packed in. We are getting new hardware, new Hadoop, new Connectors and new NoSQL. Tons of awesome features are included. Let’s get into it.

IOUG Big Data SIG — Kick-Off Meeting at OOW12

We have the SIG meeting at Oracle OpenWorld. Everyone is welcome. Gwen Shapira is the SIG leader, so expect lots of great things in that space. The SIG is also looking for volunteers. This is going to be a hot space, so if you want to engage early, come and let us know.

A First Foray Into Hadoop Territory

Before I dig into the mechanics under the hood of the Hadoop beastie (which is the part, I assume, that is going to be heady as hell), I thought it would be a good idea to play a little bit with some of its applications to give me a feel for the lay of the land.

Roll Your Own Big Data Appliance

The chances of getting my hands on 18 servers, each with 12 cores, 48 GB of RAM and 84 TB of storage, all connected by InfiniBand, are not that great. But I can play with the software, and so can you. Unlike Oracle’s Exadata, almost every software component that is available on the Big Data Appliance is also available for download. So, let’s roll our own Big Data appliance!

Looking for BigData, NoSQL or Exadata slides?

It was fun presenting today in Portland, and I’m looking forward to continuing my user group marathon in Denver tomorrow and on Thursday. Since many people asked me where they can find my slides, and I predict that a few more will keep asking about them over the next few days, I uploaded my Big Data and NoSQL presentations to SlideShare. You can find them here:
