The beta release of Cloudera Impala, the first (and open source) real-time query engine for Apache Hadoop, has been out in the wild (in binary as well as VM forms) for over a month now, and users have had time to get up-close and hands-on. Consequently, we’re beginning to see some fascinating self-published observations and guides.

Here are just a few examples; you may know of more that we’ve missed:

  • How I Came to Love Big Data (or at least acknowledge its existence), by 37signals data analyst Noah Lorang
    Highlight: “We set up a couple of machines in a cluster, pulled together a few sample datasets, and ran a few benchmarks comparing Impala, Hive, and MySQL, and the results were encouraging for Impala.”
  • Cloudera Impala – Closing the Near Real Time Gap Working with Big Data, by Six3 Systems developer Wayne Wheeles
    Highlight: “This is as advertised; easy to use, easy to implement on, very fast, very flexible and more than capable of running real time analytics.”
  • BigData: Cloudera Impala and ArcPy, by ESRI architect Mansour Raad
    Highlight: “(Impala) was pretty fast and cool…This combination of Big Data in HDFS converted instantly into ‘SmallData’ for rendering and further processing in ArcMap is a great marriage.”
  • Cloudera’s Impala, by IBM researcher Sandeep Tata
    Highlight: “I’m excited that Impala ups the game for structured data processing on Hadoop…”
  • Cloudera Impala – Fast, Interactive Queries with Hadoop, by Vodafone UK architect Istvan Szegedi
    Highlight: “Cloudera Impala is certainly an exciting solution that is utilizing the same concept as Google BigQuery but promises to support a wider range of input formats, and by making it available as an open source technology it can attract external developers to improve the software and take it to the next stage.”
  • From Zero to Impala in Minutes, by AMPLab developer Matt Massie
    Highlight: “Use Apache Whirr to bring up a Cloudera Impala multi-node cluster on EC2 in minutes. When the installation script finishes, you’ll be able to immediately query the sample data in Impala without any more setup needed.”

Many thanks to these users for their feedback and interest. If you know of other examples for this list, let us know in the comments.

By Vijay Nachimuthu

Vijay Nachimuthu is a Managing Principal of AltaFlux. His blog posts focus mainly on the latest cloud technology trends and their impact on enterprises.