Key Features of Hive
The main utility of Hive is that it lets analysts query data with SQL-like commands. You write queries in HiveQL, Hive's SQL-like language, and Hive translates them into MapReduce jobs.
Beyond this, Hive's other significant features include the kinds of queries it can run, its built-in functions, and the types of data it can process.
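To make the translation from a SQL-like query to MapReduce more concrete, here is a minimal Python sketch of how a query such as SELECT status, COUNT(*) FROM requests GROUP BY status could be broken into map, shuffle and reduce steps. The table name, its columns and the sample rows are hypothetical, and this is only an illustration of the idea, not the plan Hive actually generates.

```python
from collections import defaultdict

# Hypothetical rows of a table requests(page, status); not real data.
ROWS = [
    ("/home", 200),
    ("/cart", 500),
    ("/home", 200),
    ("/login", 404),
]


def map_phase(rows):
    """Emit one (key, 1) pair per row, keyed by the GROUP BY column."""
    for _page, status in rows:
        yield status, 1


def shuffle(pairs):
    """Collect all values emitted for the same key, as happens between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(grouped):
    """Sum the values for each key, which plays the role of COUNT(*)."""
    return {key: sum(values) for key, values in grouped.items()}


def count_by_status(rows):
    """Run the three steps in order, mimicking one GROUP BY query."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    print(count_by_status(ROWS))
```

The point of the sketch is that the analyst only writes the one-line query; the map and reduce functions are what would otherwise have to be written by hand.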
Processing Unstructured Data
One of the most common problems encountered while dealing with big data or the Hadoop ecosystem is processing large volumes of unstructured logs.
Conventional RDBMSs and SQL fail on such logs because of their tremendous volume and lack of structure. Hive, by contrast, is capable of processing unstructured (or rather, semi-structured) logs.
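As a rough illustration of what processing semi-structured logs involves, the Python sketch below uses a regular expression to split raw log lines into named columns, so that they can then be queried like table rows. The log format, the column names and the sample lines are all assumptions made for this example. Returning empty (None) columns for lines that do not match is a design choice of the sketch, loosely modelled on how regex-based parsing in Hive is commonly described.

```python
import re

# Assumed log format: "<ip> [<timestamp>] <LEVEL> <free-text message>"
LOG_PATTERN = re.compile(r"^(\S+) \[([^\]]+)\] (\w+) (.*)$")
COLUMNS = ("ip", "ts", "level", "message")

# Hypothetical sample lines, including one that does not fit the format.
SAMPLE = [
    "10.0.0.1 [2020-01-01 10:00:00] ERROR disk full on /dev/sda1",
    "10.0.0.2 [2020-01-01 10:00:05] INFO user logged in",
    "garbage line",
]


def parse_line(line):
    """Turn one raw log line into a dict of columns (all None if no match)."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return dict.fromkeys(COLUMNS)
    return dict(zip(COLUMNS, match.groups()))


def parse_lines(lines):
    """Parse every line, keeping one output row per input line."""
    return [parse_line(line) for line in lines]


if __name__ == "__main__":
    for row in parse_lines(SAMPLE):
        print(row)
```

Once every line has the same set of columns, filters and aggregations (for example, counting ERROR rows per ip) become straightforward.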
To summarise, the main features of Hive are:
- An SQL-like interface for writing queries on large datasets.
- Support for processing all variants of data, i.e. structured, semi-structured and unstructured.
- A variety of built-in functions for working with dates, strings, etc.
- Easy ETL (extraction, transformation, and loading) of data.