ENTRIES TAGGED "hive"
Some organizations create their own real-time analysis tools, while others turn to specialized solutions. In a previous post, I highlighted SQL-based real-time analytic tools that can handle large amounts of data. I noted that other big data management systems such as MPP databases and MapReduce/Hadoop were too batch-oriented to deliver analysis in near real-time. At least for MapReduce/Hadoop systems things may have changed slightly. A group of researchers from UC Berkeley and Yahoo recently modified MapReduce to allow for pipelining between operators.
The growing need to manage and make sense of Big Data, has led to a surge in demand for analytic databases, which many companies are attempting to fill. As an alternative to current shared-nothing analytic databases, HadoopDB is a hybrid that combines parallel databases with scalable and fault-tolerant Hadoop/MapReduce systems.