Should I use Presto or Hive?

Presto is the better choice for most scenarios, but Hive should not be discounted: some workloads are too demanding for Presto.

Presto limits the amount of memory each task can use, so a query that needs more than that limit fails. This error handling (or lack thereof) is acceptable for interactive queries, where the user can simply adjust and rerun the query, but it is not suitable for daily or weekly reports that must run reliably. Hive may be a better alternative for such tasks.
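One way to live with this failure mode is to try Presto first and rerun the query on Hive only when Presto gives up. The sketch below is a minimal illustration under stated assumptions, not a real client API: `QueryMemoryError`, `run_with_fallback`, and the two runner callables are hypothetical names, and in practice the memory failure would have to be detected from whatever error your Presto client actually reports.

```python
from typing import Callable, List, Tuple


class QueryMemoryError(RuntimeError):
    """Hypothetical error: the engine rejected the query for exceeding its memory limit."""


def run_with_fallback(
    sql: str,
    run_on_presto: Callable[[str], List[tuple]],
    run_on_hive: Callable[[str], List[tuple]],
) -> Tuple[str, List[tuple]]:
    """Run `sql` on Presto; if it fails on memory, rerun it on Hive.

    Both runners are caller-supplied functions (assumptions for this sketch)
    that take a SQL string and return rows. Returns the name of the engine
    that produced the result together with the rows.
    """
    try:
        return "presto", run_on_presto(sql)
    except QueryMemoryError:
        # Presto does not recover from this on its own, so hand the query to Hive.
        return "hive", run_on_hive(sql)
```

Any other exception is deliberately left to propagate, since only the memory failure described above is a reason to switch engines. The fallback also assumes the same SQL text is valid on both engines, which may not hold for every query.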

| Hive | Presto |
| --- | --- |
| Optimized for batch processing of large ETL jobs and batch SQL queries on huge data sets. | Used for running interactive analytic queries against data sources of all sizes, from gigabytes to petabytes. |
| Mature SQL (ANSI SQL). | Less mature SQL (still ANSI compliant). |
| Easily extensible. | Some extensibility, but limited compared to Hive. |
| Optimized for query throughput. | Optimized for latency. |
| Needs more resources per query. | Resource-efficient. |
| Suitable for large fact-to-fact joins. | Optimized for star schema joins (one large fact table and many smaller dimension tables). |
| Suitable for large data aggregations. | Suitable for interactive queries and quick data exploration. |
| Rich ecosystem (plenty of resources online). | Smaller ecosystem, though it is growing with large users such as Facebook and Netflix. |
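The comparison above can be condensed into a simple routing rule. The following is a rough sketch, assuming a workload can be described by a few boolean traits; the `Workload` fields and the `choose_engine` function are illustrative names invented here, and the rule is a simplification of the comparison, not a definitive policy.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workload:
    """Hypothetical description of a query workload."""
    name: str
    scheduled_report: bool = False   # daily/weekly job that must run reliably
    fact_to_fact_join: bool = False  # joins two large fact tables
    large_aggregation: bool = False  # aggregates over a very large data set


def choose_engine(workload: Workload) -> str:
    """Return "hive" for heavy or must-not-fail work, otherwise "presto"."""
    needs_hive = (
        workload.scheduled_report
        or workload.fact_to_fact_join
        or workload.large_aggregation
    )
    return "hive" if needs_hive else "presto"
```

For example, `choose_engine(Workload("ad hoc exploration"))` returns `"presto"`, while `choose_engine(Workload("weekly revenue report", scheduled_report=True))` returns `"hive"`. Defaulting to Presto mirrors the point that it is the better choice for most scenarios.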