Is Partition By more performant than Group By?

In the past I’ve tried to search over internet and read some tutorials and the message I’m getting (also from github comments on sql related project repositories, like this from linq2db) is that new, modern window functions are more efficient than old, traditional grouping.

WindowFunctions

By the way, regarding linq2db and similar patterns (sqlalchemy in python and so on): notice how it enables you to use the new powerful features and so abstraction doesn’t hide complexity, when digging into details is your job.

Traditional, plain old way:grouping

For example this could be the first part of the simplified query.

SELECT distinct
csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num,
parcel.price as parcel_price
FROM comm_schedule_header csh
INNER JOIN ab_tran ab ON  ab.ins_num=csh.ins_num
inner join (select deal_tracking_num, max(ins_num) as ins_num
from ab_tran group by  deal_tracking_num) gab
on  gab.ins_num=csh.ins_num

The effort to avoid the distinct and to optimize the query is not worth it (not even sure it would be much better than the original, eventually).

In the past I’ve tried to search over internet and read some tutorials and the message I’m getting (also from github comments on sql related project repositories, like this from linq2db) is that new, modern window functions are more efficient than old, traditional grouping.

New, modern and more performant approach.

The modernized, completed and reduced SQL, after deleting lines not relevant to the topic, is:

select cargo_id ,delivery_id , parcel_id , deal_tracking_num ,
parcel_price
from
(SELECT csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num,
parcel.price as parcel_price,
row_number() over(
    partition by ab.deal_tracking_num, csdc.cargo_id, csh.delivery_id, csh.parcel_id
    order by ab.ins_num desc) as ab_rn
FROM comm_schedule_header csh
INNER JOIN ab_tran ab ON  ab.ins_num=csh.ins_num
INNER JOIN comm_sched_delivery_cmotion csdc ON csh.delivery_id=csdc.delivery_id
left join (
    select parcel_id, ins.price, row_number()
           over(partition by fee.fee_id, fee.parcel_id order by fee.ins_num desc) as RN
    from INS_FEE_PARAM fee
    inner join ins_price ins on ins.ins_num = fee.ins_num
    and ins.param_seq_num=fee.param_seq_num and ins.pricing_source_id=fee.fee_seq_num
    and ins.pricing_vol>0 and ins.price>0
    where fee.fee_def_id=2
    ) parcel on parcel.RN=1 and parcel.parcel_id=csh.parcel_id
WHERE csh.bav_flag=1
AND csh.volume_type = 8
AND csh.last_update>=DATE '2020-03-01' -- example of parameter selection
AND csh.delivery_id0
ORDER BY  csdc.cargo_id, csh.delivery_id, csh.parcel_id, ab.deal_tracking_num) dash
where dash.ab_rn = 1;

When I time it, I see, for example, 3731 results in about less 10 seconds from Oracle SQL Developer version 18.1 as a client, for a certain date parameter selection (as written above, DATE '2020-03-01'), but I consider this timing not reliable, also because I have to click CTRL + END to fetch all rows (among other reasons, e.g the variance of statistics, possible caching effects, etc…).

My DB is Oracle Database 12c Enterprise Edition – 64 bit, but of course this topic applies much more broadly.

Scientific method, query plan comparison

A more scientific approach, as suggested by DBA experts, is comparing the query plans. Given the permissions I have, I compared the explain plans of the two queries and I roughly looked at the total cost of the select statement. The cost of the window function version, 12793, as per the html rendered here, is the minimum cost of the compared explain plans, so again this confirms that the sql in the question is, up till now, the best one, so the window functions appear more performant than just grouping.

This is the explain plan, exported as html.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s