Oracle Big Data Appliance (BDA) is being announced at the Oracle OpenWorld keynote as I’m posting this. It will take some time for it to be actually available for shipment and some details will likely change but here is what we have so far about Oracle Big Data Appliance.
A rack with InfiniBand, full of 2U servers similar to Exadata Storage. No flash storage is needed, so a couple of sockets and a dozen disks will do. Maybe more RAM than the Exadata storage cells themselves. I suspect you could have as many servers as you want in a configuration, but since Hadoop clusters usually run to dozens of nodes or more, a full rack with about 20 Hadoop compute nodes seems a reasonable starting point. Real deployments should easily grow into multiple racks linked together.
Low-latency, high-bandwidth communication is critical for fast data loading and later data processing with Hadoop, so InfiniBand will be there, on the same kind of platform as Exadata and Exalogic.
Oracle should also have its own NoSQL engine: Oracle NoSQL Database. If you know existing Oracle products, Berkeley DB seems to be a reasonable foundation to power Oracle’s new NoSQL engine.
The results of Hadoop processing are meant to be fed into Oracle Database (Exadata, of course), so a Hadoop Loader tool is in order. To orchestrate all of this and create data processing programs for this environment, Oracle Data Integrator (ODI) is already out there and just needs some improvements to spit out scripts runnable on BDA.
Have you heard about the R language? Does it fit here? It absolutely does, and in two places: on the appliance itself, working on top of Hadoop and Oracle NoSQL, and as a front end for data analysts working with data loaded into Oracle Database from BDA. I think data scientists will be happy.
This is not a brand new stack — many customers use Hadoop/MapReduce technologies for initial big data analysis and then structure the data for loading into Oracle Database for analysis with more traditional BI tools. Some of the Pythian customers do that type of data processing all the time.
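To make that pattern concrete, here is a minimal sketch of the map, shuffle and reduce steps in plain Python, ending with a CSV that a database bulk loader could pick up. It is only an illustration of the idea: it does not use Hadoop or any Oracle tool, and the log format (timestamp, user, page) and the function names are my own assumptions.

```python
import csv
import io
from itertools import groupby
from operator import itemgetter


def mapper(line):
    """Emit (page, 1) for each well-formed log line; skip anything else."""
    parts = line.split()
    if len(parts) == 3:  # assumed format: timestamp user page
        yield parts[2], 1


def reducer(key, values):
    """Sum the counts collected for one key."""
    return key, sum(values)


def map_reduce(lines):
    # Map phase: turn raw, unstructured lines into key/value pairs.
    pairs = [kv for line in lines for kv in mapper(line)]
    # Shuffle/sort phase: bring identical keys next to each other.
    pairs.sort(key=itemgetter(0))
    # Reduce phase: collapse each group of values into one structured row.
    return [
        reducer(key, (value for _, value in group))
        for key, group in groupby(pairs, key=itemgetter(0))
    ]


def to_csv(rows):
    """Render the reduced rows as CSV text, ready for a bulk load."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(["page", "hits"])
    writer.writerows(rows)
    return buf.getvalue()


raw_logs = [
    "2011-10-03T10:00:01 alice /home",
    "2011-10-03T10:00:02 bob /pricing",
    "garbage line",
    "2011-10-03T10:00:05 alice /pricing",
]

print(to_csv(map_reduce(raw_logs)))
```

On a real cluster the map and reduce functions would run in parallel across nodes and the output would land in HDFS rather than a string, but the shape of the work is the same: messy input in, small structured table out.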
Now Oracle can provide the whole stack, with Big Data Appliance in front of Exadata. And of course Exalytics adds the last bit: the end-user BI interface.
If you think of BDA as a big ETL machine, you won’t be wrong, but it can do more than that.
Oh, almost forgot… Of course, there needs to be something to manage it all so something in Oracle Enterprise Manager must be introduced.
This seems to be how Big Data Appliance is designed right now, but things might change before it reaches customers’ hands, so take it with a grain of salt.
I don’t know about you, but I’m quite excited to get my hands on it. I’m sure Gwen Shapira would sell her soul to get her hands dirty with this Big Data Appliance. ;-)