“SQL on all the things” is the tagline related to Presto, the question engine that was initially developed by Fb to quickly analyze huge quantities of information — notably knowledge that lay scattered throughout a number of codecs and sources. Since its launch as an open supply challenge in 2013, Presto has been adopted broadly throughout a whole bunch of enterprises. At the moment, a powerful worldwide neighborhood contributes to its ongoing growth.
A decade or so beforehand, the normal strategy for an organization to deal with its knowledge processing wants was to arrange an information heart, inventory it with CPUs and arduous drives, and purchase the entire related software program to tame, retailer, and analyze the information. This additionally required an funding in a number of software program licenses and related service contracts. These knowledge providers tended for use in bursts — i.e., the start of the week and finish of the quarter dealt with much more site visitors than different occasions. However since these sources had been statically allotted, they needed to be provisioned for peak utilization and left under-utilized the remainder of the time. Moreover, firms would wish to workers a workforce of engineers to maintain this setup operational, guarantee excessive availability, and troubleshoot varied use instances.
Elastic cloud economics is the tectonic shift on this business that now permits enterprises to pay just for the sources they use. They will faucet low-cost knowledge storage providers supplied within the cloud, resembling Amazon S3, and dynamically provision knowledge processing workhorses within the type of digital servers that carefully match the scale of the various workload.
This decoupling of storage and compute permits customers to seamlessly resize their compute sources. Question engines like Presto work nicely on this auto-scaling context, and they’re seeing increased adoption as extra enterprises transfer knowledge to the cloud. Presto has an extensible, federated design that enables it to learn and course of knowledge seamlessly from disparate knowledge sources and file codecs.
Whereas Presto’s federated structure is kind of helpful in having the ability to course of knowledge in place, it engenders vital complexity in producing an optimum execution plan for a question. The remainder of this text will clarify why producing an optimum question execution plan is a tough drawback for Presto and categorical a view on the way in which ahead.
The evolution of the question optimizer
First, let me take a step again and describe the generic drawback and a number of the options which were developed over the previous a number of a long time. Question optimizers are accountable for changing SQL, expressed declaratively, to an environment friendly sequence of operations which may be carried out by the engine on the underlying knowledge. As such, question optimizers are a essential element of databases.
The enter to a question optimizer is a “logical plan,” which itself is the results of parsing the enter SQL and changing it to a high-level assortment of the operations required to execute the question. The optimizer then works on changing the logical plan into an environment friendly execution technique relying on the operators obtainable to the question engine and the traits of the information structure.
For a typical SQL question, there exists one logical plan however many methods for implementing and executing that logical plan to provide the specified outcomes. The optimizer is accountable for selecting the “finest” execution plan amongst these candidates. Actually, the pool of potential candidates (the search area) that the optimizer should sift via is so giant that figuring out the optimum question plan amongst them is an “NP-hard drawback.” That is one other approach of claiming that computer systems can not clear up this drawback in any cheap timeframe.
Most question optimizers use a set of heuristic guidelines to curtail the search area. The objectives embrace minimizing the information learn from disk (predicate and restrict pushdown, column pruning, and so on.) and lowering community switch of information (reshuffle), with the final word intention of quick question execution and fewer sources used. In its early model, Presto’s question optimizer was a algorithm that will function on, and mutate, the logical plan till a hard and fast level is reached.
Allow us to attempt to perceive what this appears like utilizing a concrete instance. The instance under, picked from a set of advert hoc queries, is a part of the generally used determination assist benchmark, TPC-H. TPC-H Q3, the Delivery Precedence Question, is a question that retrieves the 10 unshipped orders with the very best worth.
SELECT
l_orderkey,
SUM(l_extendedprice * ( 1 - l_discount )) AS income,
o_orderdate,
o_shippriority
FROM
buyer,
orders,
lineitem
WHERE
c_mktsegment = 'AUTOMOBILE'
AND c_custkey = o_custkey
AND l_orderkey = o_orderkey
AND o_orderdate < DATE '1995-03-01'
AND l_shipdate > DATE '1995-03-01'
GROUP BY l_orderkey,
o_orderdate,
o_shippriority
ORDER BY
income DESC,
o_orderdate;
This question performs a three-way be a part of between the information tables buyer
, orders
, and lineitem
(be a part of keys custkey
and orderkey
) and narrows the outcomes set by making use of a set of filters (c_mktsegment
, o_orderdate
, l_shipdate
). The question then calculates an combination SUM
by grouping on every distinct mixture of l_orderkey
, o_orderdate
, and o_shippriority
and orders the consequence set by descending order of the computed column (income
) and orderdate
.
The naive strategy
The naive strategy to optimizing this question would carry out a full cross be a part of on the three tables (a Cartesian product), remove from this set all of the tuples that don’t fulfill the filters within the WHERE
clause, then carry out the aggregation by figuring out every distinctive mixture of l_orderkey
, o_orderdate
, and o_shippriority
and calculate the SUM(l_extendedprice * ( 1 - l_discount ))
, and eventually order the consequence set on o_orderdate
. This sequence of operations, whereas assured to provide correct outcomes, won’t work for even a average dimension dataset in most . The Cartesian product would produce an enormous intermediate consequence set that’s past the principle reminiscence limits of most servers. Additionally it is inefficient to learn all the information from disk for all three tables whereas the question is just excited by particular tuples that fulfill the constraints described within the predicates.
The rule-based optimizer (RBO)
This framework mitigates a number of the issues within the naive strategy. For instance, it could possibly generate a plan by which the predicates are utilized whereas the information is learn for every desk. Due to this fact, whereas materializing tuples for desk buyer
, solely the information that match c_mktsegment = 'AUTOMOBILE'
could be realized. Likewise solely information satisfying o_orderdate < DATE '1995-03-01'
for desk orders
and information satisfying l_shipdate > DATE '1995-03-01'
for desk lineitem
could be learn from disk. This already reduces the scale of the intermediate consequence set by a number of orders of magnitude.
The RBO would additionally by no means counsel a Cartesian product of all three tables for the intermediate consequence on this case. It might as a substitute first carry out a be a part of between two tables, e.g. buyer
and orders
, and solely retain the tuples that match the predicate c_custkey = o_custkey
, after which carry out one other be a part of between this intermediate consequence set and the lineitem
desk.
There are two benefits to the RBO methodology. The primary benefit is the tremendously diminished reminiscence required to compute this be a part of because it aggressively applies filters to prune out tuples that aren’t of curiosity. The second benefit is the enabling of environment friendly algorithms to course of this be a part of, such because the generally used hash be a part of. Briefly, that is an algorithm by which a hash desk could be constructed out of the be a part of keys of the smaller desk.
For instance, whereas becoming a member of buyer
with orders
, a hash desk is constructed on buyer.c_custkey
, after which for the information in orders
, solely the information the place orders.o_custkey
exists within the hash desk are learn. The hash desk is constructed from the smaller enter to the be a part of as a result of this has a better probability of becoming in reminiscence, and materializes solely the tuples which might be mandatory for every be a part of. (Pushing the aggregation under the be a part of is one other superior optimization approach, however is past the scope of this text.)
The associated fee-based optimizer (CBO)
The subsequent step within the evolution of question optimizers was the appearance of cost-based optimization. If one knew some traits of the information within the tables, resembling minimal and most values of the columns, variety of distinct values within the columns, variety of nulls, histograms depicting distribution of column knowledge, and so on., these might have a significant impression in a number of the selections the optimizer would make. Allow us to stroll via this with the TPC-H Q3 benchmark question mentioned above.
The CBO continues from the fastened level reached by the RBO. To enhance on the RBO, it will be helpful to find out the scale of the inputs to the joins with a view to resolve which enter needs to be used to construct the hash desk. If it was a be a part of between two tables on be a part of keys, with no further predicates, the RBO sometimes has information of the row counts of the tables and might select the hash enter appropriately. Nevertheless, generally, there are further predicates that decide the scale of the be a part of enter.
For instance, within the Q3 question there’s buyer.c_mktsegment = 'AUTOMOBILE'
on buyer
and orders.o_orderdate < DATE '1995-03-01'
on orders
. The CBO depends on the engine to offer it with the selectivities of those predicates on the tables, after which makes use of these to estimate the scale of the be a part of inputs. Due to this fact, despite the fact that the buyer
desk could also be smaller than the orders
desk, as soon as the filters are utilized, the variety of information flowing into the be a part of from the orders
desk may very well be fewer. A CBO may also propagate via the plan the estimated cardinality of sure operations resembling joins or aggregations and use these to make clever selections in different elements of the question plan.
Question optimization in Presto vs. conventional databases
Price-based optimization is just not straightforward to understand in Presto. Conventional databases don’t decouple storage and compute, and sometimes don’t interface with any sources of information aside from these which have been ingested. As such, these databases can analyze and retailer all of the related statistics about their datasets. In distinction, Presto operates on knowledge lakes the place the information could possibly be ingested from disparate processes and the size of the information is a number of orders of magnitude bigger.
Maintaining knowledge lake statistics correct and up to date requires an enormous dedication of sources and may be very arduous to take care of — as many enterprises have found. Moreover, as an information federation engine, Presto interfaces, oftentimes concurrently, with a number of datastores which have very totally different traits. These knowledge sources might vary from a simple filesystem (HDFS) to steady streaming knowledge methods (Pinot, Druid) to mature database methods (Postgres, MySQL). And extra of those connectors are being added often.
To create a unified value mannequin that covers all these knowledge sources and permits the optimizer to quantitatively motive about tradeoffs in entry prices throughout these sources is an extremely arduous drawback. The shortage of dependable statistics and the shortage of a unified value mannequin have induced main enterprises to fully ignore cost-based optimization in Presto, despite the fact that the neighborhood has developed restricted assist for some CBO options like be a part of reordering.
A wiser question optimization plan for Presto
It’s my opinion that as a substitute of expending effort to recreate what labored for conventional RDBMSs, particularly painstakingly reconstructing a federated value mannequin for Presto, it will be extra helpful to discover different approaches to fixing this drawback. Outdated-school approaches to question optimization have sometimes thought of the issue in a static context — i.e., the optimizer works independently to give you an optimum plan that’s then handed off to the scheduler or execution engine of the DBMS.
Adaptive question execution is a paradigm that removes the architectural distinction between question planning and question execution. It’s a framework by which the execution plan is adaptive: It might probably monitor the information being processed and might change form relying on the traits of the information flowing via the operators within the plan.
Adaptive question execution sometimes takes as enter a question plan that’s produced as the results of heuristic/rule-based optimization, after which reorders operators or replans subqueries primarily based on run-time efficiency. Such an strategy would partition a question plan into subplans which may be independently altered. Throughout question execution, the system displays and detects a suboptimal subplan of a question primarily based on key efficiency indicators and dynamically replans that fragment. The engine might additionally probably reuse the partial outcomes that had been produced by the previous subplan or operators and keep away from redoing the work. These suboptimal plan fragments could also be reoptimized a number of occasions all through question execution, with a cautious tradeoff between opportunistic re-optimization and the danger of manufacturing a fair much less optimum plan.