Diving Into Hadoop
At the recent Publisher Forum in Palm Springs, Disney's Karl Reece led an amazing breakout session on a topic I've been quite curious about for the last few months: Hadoop. If you attended the PubForum, you can find the session here: http://www.admonsters.com/session/member-breakout-session-big-data-syste... You've probably heard of the big data crunching software (maybe you've even seen the famed yellow elephant), but you might not be sure exactly what it does or how it could benefit the data operations within your company.
If that's where you're at, you might want to start with these articles Karl suggested.
From the first piece, GigaOm's Derrick Harris explains: "Hadoop is, at its core, an Apache Software Foundation project consisting of two primary subprojects — Hadoop MapReduce and the Hadoop Distributed File System. MapReduce is the parallel-processing engine that allows Hadoop to churn through large data sets in relatively short order. HDFS is the distributed file system that lets Hadoop scale across commodity servers and, importantly, store data on the compute nodes in order to boost performance (and potentially save money)."
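To make the MapReduce half of that description a little more concrete, here is a minimal single-machine sketch of the map, shuffle and reduce steps in plain Python. It is not Hadoop's API and it doesn't touch HDFS; the log format ("ad_id,event") and the function names are assumptions made up for illustration. On a real cluster, the map and reduce steps would run in parallel across many nodes, close to where the data is stored.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit (ad_id, 1) for each impression.

    Assumes a hypothetical log line of the form 'ad_id,event'.
    """
    ad_id, event = line.strip().split(",")
    if event == "impression":
        yield ad_id, 1


def shuffle(pairs):
    """Shuffle step: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values for one key into a total."""
    return key, sum(values)


def run_job(lines):
    """Run map, shuffle and reduce in sequence on one machine."""
    pairs = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(pairs)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    # Made-up sample data, standing in for an ad server log.
    sample_log = [
        "ad_1,impression",
        "ad_2,click",
        "ad_1,impression",
        "ad_2,impression",
    ]
    print(run_job(sample_log))
```

The point of splitting the work this way is that each log line can be mapped independently and each key can be reduced independently, which is what lets a framework like Hadoop spread the job over commodity servers.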
Following up on the breakout session, we wanted to start a forum thread on Hadoop and big data processing tools in general. We'd like to find out who is working with them, what benefits they've seen, how they've experimented, where they hope to push next, and what advice they'd give others heading down this road.
In addition, feel free to submit any questions about Hadoop, big data, and other relevant topics. Big data is still a brave new world for media companies, and we can all help each other decipher how the tools and processes fit into ad operations.