トップページに戻る    次のSQLパズルへ    前のSQLパズルへ

10-329 連続した範囲をまとめ、最も優先される期間を出力

SQLパズル

my_data2テーブル
  ID  staD        endD
----  ----------  ----------
 108  2009-12-28  2010-01-22
 108  2010-03-29  2010-04-11
 108  2010-05-01  2010-05-31
 111  2011-07-01  2011-07-16
 222  2010-04-25  2010-05-02
 222  2011-07-01  2011-07-16
 222  2011-07-10  2011-07-20
 333  2011-07-01  2011-07-10
 333  2011-07-11  2011-07-15
 333  2011-07-16  2011-07-30
 444  2011-07-01  2011-07-10
 444  2011-07-16  2011-07-30
 445  2010-01-01  2010-04-30
 445  2010-01-01  2010-05-31
 445  2010-05-17  2010-08-06
 658  2010-01-01  2010-04-30
 658  2010-01-01  2010-05-31
 658  2010-01-01  2010-05-31
 777  2010-01-01  2010-01-10
 777  2010-01-01  2010-01-16
 777  2010-01-16  2010-01-17
 777  2010-01-16  2010-01-20
 777  2010-01-21  2010-01-30
 999  2010-03-01  2010-03-14
 999  2010-03-01  2010-04-24
 999  2010-04-25  2010-05-02
1778  2010-01-04  2010-01-17
1778  2010-01-18  2010-01-31
1778  2010-02-01  2010-02-28
1778  2010-04-04  2010-05-02
1778  2010-05-03  2010-05-30
1778  2010-05-31  2010-06-27
1778  2010-07-19  2010-08-01
2535  2010-03-01  2010-03-14
2535  2010-03-15  2010-03-28
2535  2010-04-05  2010-05-02
2710  2010-05-01  2010-08-31
2710  2010-09-01  2010-12-31
2710  2011-01-01  2011-04-30
2710  2011-05-01  2011-08-31

IDごとで、StaDとendDが連続した範囲をまとめる。
そして、以下の優先順位において、最も優先される期間を出力する。

1st - return the current summary range if exists
else 2nd - return the min future summary range if exists
else 3rd - return the max past summary range if exists

日本語化すると、
優先順位1 sysdateを含む期間
優先順位2 未来の期間の中で最小のもの
優先順位3 過去の期間の中で最大のもの

出力結果
  ID  staD      endD
----  --------  --------
 108  10-05-01  10-05-31
 111  11-07-01  11-07-16
 222  11-07-01  11-07-20
 333  11-07-01  11-07-30
 444  11-07-01  11-07-10
 445  10-01-01  10-08-06
 658  10-01-01  10-05-31
 777  10-01-01  10-01-30
 999  10-03-01  10-05-02
1778  10-04-04  10-06-27
2535  10-04-05  10-05-02
2710  10-05-01  11-08-31

こちらを参考にさせていただきました(英語)


データ作成スクリプト

create table my_data2(ID,staD,endD) as
select  108,date '2009-12-28',date '2010-01-22' from dual union all
select  108,date '2010-03-29',date '2010-04-11' from dual union all
select  108,date '2010-05-01',date '2010-05-31' from dual union all
select  111,date '2011-07-01',date '2011-07-16' from dual union all
select  222,date '2010-04-25',date '2010-05-02' from dual union all
select  222,date '2011-07-01',date '2011-07-16' from dual union all
select  222,date '2011-07-10',date '2011-07-20' from dual union all
select  333,date '2011-07-01',date '2011-07-10' from dual union all
select  333,date '2011-07-11',date '2011-07-15' from dual union all
select  333,date '2011-07-16',date '2011-07-30' from dual union all
select  444,date '2011-07-01',date '2011-07-10' from dual union all
select  444,date '2011-07-16',date '2011-07-30' from dual union all
select  445,date '2010-01-01',date '2010-04-30' from dual union all
select  445,date '2010-01-01',date '2010-05-31' from dual union all
select  445,date '2010-05-17',date '2010-08-06' from dual union all
select  658,date '2010-01-01',date '2010-04-30' from dual union all
select  658,date '2010-01-01',date '2010-05-31' from dual union all
select  658,date '2010-01-01',date '2010-05-31' from dual union all
select  777,date '2010-01-01',date '2010-01-10' from dual union all
select  777,date '2010-01-01',date '2010-01-16' from dual union all
select  777,date '2010-01-16',date '2010-01-17' from dual union all
select  777,date '2010-01-16',date '2010-01-20' from dual union all
select  777,date '2010-01-21',date '2010-01-30' from dual union all
select  999,date '2010-03-01',date '2010-03-14' from dual union all
select  999,date '2010-03-01',date '2010-04-24' from dual union all
select  999,date '2010-04-25',date '2010-05-02' from dual union all
select 1778,date '2010-01-04',date '2010-01-17' from dual union all
select 1778,date '2010-01-18',date '2010-01-31' from dual union all
select 1778,date '2010-02-01',date '2010-02-28' from dual union all
select 1778,date '2010-04-04',date '2010-05-02' from dual union all
select 1778,date '2010-05-03',date '2010-05-30' from dual union all
select 1778,date '2010-05-31',date '2010-06-27' from dual union all
select 1778,date '2010-07-19',date '2010-08-01' from dual union all
select 2535,date '2010-03-01',date '2010-03-14' from dual union all
select 2535,date '2010-03-15',date '2010-03-28' from dual union all
select 2535,date '2010-04-05',date '2010-05-02' from dual union all
select 2710,date '2010-05-01',date '2010-08-31' from dual union all
select 2710,date '2010-09-01',date '2010-12-31' from dual union all
select 2710,date '2011-01-01',date '2011-04-30' from dual union all
select 2710,date '2011-05-01',date '2011-08-31' from dual;


SQL

--■■■分析関数を使う方法■■■
select ID,staD,endD
from (select ID,min(staD) as staD,
      max(endD) as endD,
      Row_Number() over(partition by ID order by
                        case when sysdate between min(staD)
                                              and max(endD)
                             then 1 -- 1st return the current summary range
                             when min(staD) > sysdate
                             then 2 -- 2nd return the min future summary range
                             else 3 end, -- 3rd return the max past summary range
                        case when min(staD) > sysdate
                             then GID else -GID end) as rn
      from (select ID,staD,endD,
            sum(willSum) over(partition by ID
                              order by staD) as GID
            from (select ID,staD,
                  max(endD) as endD,
                  case when staD-1
                     <= max(max(endD))
                        over(partition by ID
                             order by staD
                             range between unbounded preceding
                                       and 1 preceding)
                       then 0 else 1 end as willSum
                    from my_data2
                  group by ID,staD))
      group by ID,GID)
where rn = 1;

--■■■階層問い合わせを使う方法(10g以降)■■■
select ID,staD,endD
from (select ID,staD,endD,
      Row_Number() over(partition by ID
      order by case when sysdate between staD
                     and endD
                    then 1 -- 1st return the current summary range
                    when staD > sysdate
                    then 2 -- 2nd return the min future summary range
                    else 3 end, -- 3rd return the max past summary range
               case when staD > sysdate
                    then  to_number(to_char(staD,'yyyymmdd'))
                    else -to_number(to_char(staD,'yyyymmdd')) end) as rn
      from (select ID,min(connect_by_root staD) as staD,endD
              from my_data2
             where connect_by_IsLeaf = 1
            connect by nocycle prior ID=ID
                           and prior endD+1 between staD and endD
            group by ID,endD))
where rn = 1;


解説

期間をまとめるロジックでは、検索case式と分析関数のrange指定を組み合わせてます。
OracleSQLパズル 9-3 開始日と終了日をまとめる

階層問い合わせを使う方法もありますが、経路を列挙する際にメモリを大量に使いそうですね。

■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■■
Row_Number関数の第1ソートキーでは、検索case式で場合分けを行ってます。
なぜ場合分けが必要か

case when sysdate between min(staD)
                      and max(EndD)
     then 1 -- 1st return the current summary range
     when min(staD) > sysdate
     then 2 -- 2nd return the min future summary range
     else 3 end

min(staD)   max(EndD)   resultOfCase
----------  ----------  ------------
2010-02-02  2010-02-05  3
2010-02-12  2010-02-25  3
2010-03-22  2010-04-05  3
2010-06-01  2010-08-01  1
2010-09-11  2010-09-21  2
2010-10-15  2010-10-21  2

*************************************************************************
Row_Number関数の第2ソートキーでは、下記の不等式の法則を意識してます。

a < b ⇔ -a > -b

case when min(staD) > sysdate
     then GID else -GID end

min(staD)   max(EndD)   GID  resultOfCase
----------  ----------  ---  ------------
2010-02-02  2010-02-05    1  -1
2010-02-12  2010-02-25    2  -2
2010-03-22  2010-04-05    3  -3
2010-06-01  2010-08-01    4  -4
2010-09-11  2010-09-21    5   5
2010-10-15  2010-10-21    6   6