From Mar 27, 2012 I recently wrote a job to identify the counties with the most houses for sale (according to the 2010 Census). To do so, I ingested the data from the Census Bureau, and wrote a MapReduce job. My goal was to have all counties and the total houses for sale in ascending […]
Tag: cloud
Getting Started with Hydra
AddThis recently released Hydra on the open source community, and with my proximity to this technology, I thought it would be a wonderful winter adventure to peruse the code base. After cloning the project from GitHub, I required updating/adding Java 7 to my Eclipse environment. It was a bit more tedious than I expected. I […]
Simple Hadoop Overview
As per the Hadoop website: Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, […]