Tag: hadoop

County Housing Search

From Mar 27, 2012

I recently wrote a job to identify the counties with the most houses for sale (according to the 2010 Census). To do so, I ingested the data from the Census Bureau and wrote a MapReduce job. My goal was to have all counties and the total houses for sale in ascending […]
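
For readers unfamiliar with the pattern, here is a minimal sketch of that kind of job, written in plain Python rather than against Hadoop's Java API. The record layout, county names, and counts are hypothetical placeholders, not Census data, and the ascending sort by total is an assumption, since the excerpt is cut off before it says what the ordering applies to.

```python
from collections import defaultdict

# Hypothetical input records: (county, houses_for_sale).
# Placeholder values for illustration only, not real Census figures.
records = [
    ("County A", 120),
    ("County B", 45),
    ("County A", 30),
    ("County C", 80),
    ("County B", 5),
]


def map_phase(record):
    """Emit one (key, value) pair per record, keyed by county."""
    county, for_sale = record
    yield county, for_sale


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(county, values):
    """Sum the for-sale counts collected for a single county."""
    return county, sum(values)


def run_job(input_records):
    """Run map, shuffle, and reduce, then sort by total (ascending, assumed)."""
    pairs = (pair for record in input_records for pair in map_phase(record))
    grouped = shuffle(pairs)
    totals = [reduce_phase(county, values) for county, values in grouped.items()]
    return sorted(totals, key=lambda item: item[1])


totals = run_job(records)
for county, total in totals:
    print(f"{county}\t{total}")
```

In a real Hadoop job the shuffle step is handled by the framework, and the final ordering by total would typically need a second pass or a single reducer, because reducer output is ordered by key rather than by value.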

Simple Hadoop Overview

As per the Hadoop website: Hadoop is a framework that allows for the distributed processing of large data sets across clusters of computers using a simple programming model. It is designed to scale up from single servers to thousands of machines, each offering local computation and storage. Rather than rely on hardware to deliver high-availability, […]