トップページに戻る    次のSQLパズルへ    前のSQLパズルへ

10-229 重複を除いた累計

SQLパズル

custTable
yyyymm  cust
------  ----
200711    A
200711    B
200712    A
200712    C
200801    A
200801    B
200802    A
200802    C
200803    A
200803    C
200803    A
200804    D
200805    X
200805    X
200805    X

yyyymmごとの、custの重複を除いたcustの数と
yyyyごとで、そのyyyymmまでの、custの重複を除いたcustの数
を求める。

YYYYMM  cust_count             cust_count_YTD
------  ----------             --------------
200711           2 (AとB)で2                2  (AとB)で2
200712           2 (AとC)で2                3  (AとBとC)で3
200801           2 (AとB)で2                2  (AとB)で2
200802           2 (AとC)で2                3  (AとBとC)で3
200803           2 (AとC)で2                3  (AとBとC)で3
200804           1 (AとD)で2                4  (AとBとCとD)で4
200805           1 (Xのみ)で1               5  (AとBとCとDとX)で5

出力結果
YYYYMM  cust_count  cust_count_YTD
------  ----------  --------------
200711           2               2
200712           2               3
200801           2               2
200802           2               3
200803           2               3
200804           1               4
200805           1               5

こちらを参考にさせていただきました(英語)


データ作成スクリプト

create table custTable(yyyymm,cust) as
select '200711','A' from dual union all
select '200711','B' from dual union all
select '200712','A' from dual union all
select '200712','C' from dual union all
select '200801','A' from dual union all
select '200801','B' from dual union all
select '200802','A' from dual union all
select '200802','C' from dual union all
select '200803','A' from dual union all
select '200803','C' from dual union all
select '200803','A' from dual union all
select '200804','D' from dual union all
select '200805','X' from dual union all
select '200805','X' from dual union all
select '200805','X' from dual;


SQL

--■■■グループ化しない方法■■■
select distinct yyyymm,cust_count,
sum(WillSum) over(partition by substr(yyyymm,1,4) order by yyyymm) as cust_count_YTD
from (select yyyymm,count(distinct cust) over(partition by yyyymm) as cust_count,
      case Row_Number() over(partition by substr(yyyymm,1,4),cust order by yyyymm)
      when 1 then 1 else 0 end as WillSum
        from custTable)
order by yyyymm;

--■■■グループ化する方法■■■
select yyyymm,count(distinct cust) as cust_count,
sum(sum(WillSum)) over(partition by substr(yyyymm,1,4) order by yyyymm) as cust_count_YTD
from (select yyyymm,cust,
      case Row_Number() over(partition by substr(yyyymm,1,4),cust order by yyyymm)
      when 1 then 1 else 0 end as WillSum
        from custTable)
group by yyyymm
order by yyyymm;

--■■■model句を使う方法(10g以降)■■■
select distinct YYYYMM,cust_count,cust_count_YTD
  from custTable
 model
 dimension by (yyyymm,RowNum as soeji)
 measures(cust,0 as cust_count,0 as cust_count_YTD)
 rules(cust_count[any,any]     = count(distinct cust)[CV(),any],
       cust_count_YTD[any,any] = count(distinct cust)[CV() >= yyyymm
                                                 and trunc(CV(),-2) = trunc(yyyymm,-2),any]);


解説

数学で、
X+0 = X
であることをふまえてます。

言いかえると、
0が加算の単位元であることをふまえてます。

10-190 distinctオプションとorder by指定の分析関数を代用
10-214 重複を除いた訪問者数
10-259 重複を除いた集計の累計