Thursday, November 07, 2013

Presto: Interacting with petabytes of data at Facebook [Facebook Engineering]

Check here for Presto details and documentation; for another perspective, see "Facebook open sources its SQL-on-Hadoop engine, and the web rejoices" (Gigaom).
"In Fall 2012, a small team in the Facebook Data Infrastructure group set out to solve this problem for our warehouse users. We evaluated a few external projects, but they were either too nascent or did not meet our requirements for flexibility and scale. So we decided to build Presto, a new interactive query system that could operate fast at petabyte scale.
In this post, we will briefly describe the architecture of Presto, its current status, and future roadmap.
Architecture
Presto is a distributed SQL query engine optimized for ad-hoc analysis at interactive speed. It supports standard ANSI SQL, including complex queries, aggregations, joins, and window functions."
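
To make the "aggregations ... and window functions" part of the excerpt concrete, here is a minimal sketch of that kind of ad-hoc query. It does not use Presto: it runs against an in-memory SQLite database through Python's standard sqlite3 module, purely as a stand-in, and assumes a SQLite build with window-function support (3.25 or later). The page_views table, its columns, and its rows are all made up for illustration.

```python
import sqlite3

# In-memory SQLite database standing in for a warehouse table.
# Table name, columns, and rows are hypothetical illustration data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE page_views (country TEXT, day TEXT, views INTEGER)")
conn.executemany(
    "INSERT INTO page_views VALUES (?, ?, ?)",
    [
        ("US", "2013-11-01", 120),
        ("US", "2013-11-01", 30),
        ("US", "2013-11-02", 80),
        ("BR", "2013-11-01", 50),
        ("BR", "2013-11-02", 70),
    ],
)

# Inner query: aggregate views per country and day.
# Outer query: rank the days within each country with a window function.
query = """
SELECT country,
       day,
       total_views,
       RANK() OVER (PARTITION BY country ORDER BY total_views DESC) AS day_rank
FROM (
    SELECT country, day, SUM(views) AS total_views
    FROM page_views
    GROUP BY country, day
)
ORDER BY country, day_rank
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

The SQL text itself is plain ANSI-style SQL, so a query of this shape is the sort of thing the excerpt describes; whether it runs unchanged on a given Presto version is not something this sketch verifies.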
