トップページに戻る    次のSQLパズルへ    前のSQLパズルへ

2-3-31 標準SQLでメジアン(中央値)を取得

SQLパズル

MedianTable2
Val
---
 10
 10
 50
 50
 50
 50
100
100
150
150
150
200

Valのメジアン(中央値)を取得する。

出力結果
MedianVal
---------
       75


データ作成スクリプト

create table MedianTable2(Val number(3));

insert into MedianTable2 values( 10);
insert into MedianTable2 values( 10);
insert into MedianTable2 values( 50);
insert into MedianTable2 values( 50);
insert into MedianTable2 values( 50);
insert into MedianTable2 values( 50);
insert into MedianTable2 values(100);
insert into MedianTable2 values(100);
insert into MedianTable2 values(150);
insert into MedianTable2 values(150);
insert into MedianTable2 values(150);
insert into MedianTable2 values(200);
commit;


SQL

--■■■group byとhavingを使わない方法■■■
select avg(distinct Val) as MedianVal
  from (select Val,
        (select count(*)+1 from MedianTable2 bb where bb.Val>aa.Val) as Rank,
        (select count(*)   from MedianTable2 bb where bb.Val=aa.Val) as EqualCount
          from MedianTable2 aa) a,
        (select count(*) as RecordCount from MedianTable2) b
 where mod(RecordCount,2) = 0
   and (RecordCount/2  between Rank and Rank+EqualCount-1
    or RecordCount/2+1 between Rank and Rank+EqualCount-1)
    or mod(RecordCount,2) = 1
   and ceil(RecordCount/2) between Rank and Rank+EqualCount-1;

--■■■group byとhavingを使う方法■■■
select avg(Val) as MedianVal
from (select Val
        from (select Val,
              (select count(*)+1 from MedianTable2 bb where bb.Val>aa.Val) as Rank
                from MedianTable2 aa) a,
              (select count(*) as RecordCount from MedianTable2) b
      group by Val,Rank,RecordCount
      having mod(RecordCount,2) = 0
         and (RecordCount/2  between Rank and Rank+count(*)-1
          or RecordCount/2+1 between Rank and Rank+count(*)-1)
          or mod(RecordCount,2) = 1
         and ceil(RecordCount/2) between Rank and Rank+count(*)-1);


解説

以下の結果から分かるように、

SQL> select Val,Rank,EqualCount,RowN,
  2  Rank as MinRowN,
  3  Rank+EqualCount-1 as MaxRowN
  4  from (select Val,
  5        Rank() over(order by Val) as Rank,
  6        count(*) over(partition by Val) as EqualCount,
  7        Row_Number() over(order by Val) as RowN
  8          from MedianTable2)
  9        order by Val;

VAL  Rank  EqualCount  RowN  MinRowN  MaxRowN
---  ----  ----------  ----  -------  -------
 10     1           2     1        1        2
 10     1           2     2        1        2
 50     3           4     3        3        6
 50     3           4     4        3        6
 50     3           4     5        3        6
 50     3           4     6        3        6
100     7           2     7        7        8
100     7           2     8        7        8
150     9           3     9        9       11
150     9           3    10        9       11
150     9           3    11        9       11
200    12           1    12       12       12

MinRowN = Rank
MaxRowN = Rank+EqualCount-1
が成り立ちます。

いいかえれば、
Valをソートキーとすると、
Valごとでの、Row_Number() over(order by Val)の最小値 = Rank() over(order by Val)
Valごとでの、Row_Number() over(order by Val)の最大値 = Rank() over(order by Val)
                                                     +count(*) over(partition by Val)-1
ということです。

これらのことをふまえて、
between述語で、Row_Numberが対象範囲にはいっているかを調べつつ、
標準SQLでメジアンを求めてます。

2-3-6 メジアン(中央値)を取得

2007年8月1日 MySQLでメジアンを求める方法