I have data with the following schema:

| user_id | date   | score |
| ------- | ------ | ----- |
| 1       | 201901 | 1     |
| 1       | 201902 | 2     |
| 1       | 201903 | 3     |
| 2       | 201901 | 1     |
| 2       | 201902 | -1    |
| 2       | 201903 | 2     |
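To make the sample reproducible, here is a minimal setup sketch. The table name `scores` is taken from the queries in this thread; the integer column types are an assumption, since the thread does not state them.

```sql
-- Hypothetical setup: the column types are assumed, not given in the thread.
CREATE TABLE scores (
    user_id integer NOT NULL,
    date    integer NOT NULL,  -- YYYYMM stored as a number, as in the sample
    score   integer NOT NULL
);

-- The six sample rows from the table above.
INSERT INTO scores (user_id, date, score) VALUES
    (1, 201901,  1),
    (1, 201902,  2),
    (1, 201903,  3),
    (2, 201901,  1),
    (2, 201902, -1),
    (2, 201903,  2);
```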

I need to produce the following result:

| user_id | one_score | two_score | three_score | max_score | min_score |
| ------- | --------- | --------- | ----------- | --------- | --------- |
| 1       | 1         | 3         | 6           | 3         | 1         |
| 2       | 1         | 0         | 2           | 2         | -1        |

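Note that one_score is the sum of the first result, two_score is the sum of the first two results, and three_score is the sum of the first three results associated with the user_id, with results ordered by date.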

So far, the overall layout of the query is:

    SELECT
      MAX(score),
      MIN(score)
    FROM scores
    GROUP BY user_id

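I am not sure what the best way to compute one_score, two_score and three_score is. One possible approach is to write a custom aggregate function for each case that takes the whole column as input: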

    SELECT
      MAX(score),
      MIN(score),
      one_score(score),
      two_score(score),
      three_score(score)
    FROM scores
    GROUP BY user_id

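I am wondering whether there is a better way than this, one that involves window functions. It seems that the only thing that should change from column to column is the number of rows the sum is applied to, rather than writing a separate function for each case. How can I write window functions for the rolling sums one_score, two_score and three_score?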

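Note: this is a simplified case modeled on the "real" case, with two differences:

  1. Instead of a sum function, it will be a mathematical expression.
  2. The ranges will not be 1, 2, 3 but will vary widely (last 10, last 30, last 50, etc.).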

===============>>#1 Votes: 2

You can use the row_number() window function to number each user's rows, and then use those numbers in a FILTER clause on sum().

    SELECT x.user_id,
           sum(x.score) FILTER (WHERE x.rn <= 1) one_score,
           sum(x.score) FILTER (WHERE x.rn <= 2) two_score,
           sum(x.score) FILTER (WHERE x.rn <= 3) three_score,
           max(x.score) max_score,
           min(x.score) min_score
           FROM (SELECT s.user_id,
                        s.score,
                        row_number() OVER (PARTITION BY s.user_id
                                           ORDER BY s.date) rn
                        FROM scores s) x
           GROUP BY x.user_id;
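The same pattern could be adapted to the "last 10, last 30, last 50" ranges mentioned in the question. The sketch below is an illustration rather than part of the original answer: it assumes "last" means the most recent rows by date, so the rows are numbered in descending date order, and the output column names are hypothetical.

```sql
-- Sketch: sums over the N most recent rows per user.
-- Assumption: "last N" means newest by date, hence ORDER BY s.date DESC.
SELECT x.user_id,
       sum(x.score) FILTER (WHERE x.rn <= 10) AS last_10_score,  -- hypothetical name
       sum(x.score) FILTER (WHERE x.rn <= 30) AS last_30_score,  -- hypothetical name
       sum(x.score) FILTER (WHERE x.rn <= 50) AS last_50_score   -- hypothetical name
FROM (SELECT s.user_id,
             s.score,
             row_number() OVER (PARTITION BY s.user_id
                                ORDER BY s.date DESC) AS rn
      FROM scores s) x
GROUP BY x.user_id;
```

Only the threshold in each FILTER clause changes between columns, and sum() can be swapped for another aggregate that accepts a FILTER clause.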


  Asked by pgdba123; translated from Stack Overflow

===============>>#2 Votes: 1

I like the OP's idea of a custom aggregate:

    create or replace function limited_sum_state(int[], int, int)
    returns int[] language plpgsql as $$
    begin
        if $1[1] < $2 then
            $1[1] := $1[1] + 1;
            $1[2] := $1[2] + $3;
        end if;
        return $1;
    end $$;

    create or replace function limited_sum_final(int[])
    returns int language sql as $$
        select $1[2]
    $$;

    create aggregate sum_of_first_elements(int, int) (
        sfunc = limited_sum_state,
        stype = int[],
        finalfunc = limited_sum_final,
        initcond = '{0, 0}');

Now we can write the query in an elegant way:

    select
        user_id,
        sum_of_first_elements(1, score order by date) as one_score,
        sum_of_first_elements(2, score order by date) as two_score,
        sum_of_first_elements(3, score order by date) as three_score,
        max(score) as max_score,
        min(score) as min_score
    from scores
    group by user_id;


===============>>#3 Votes: 1

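For most DBMSs, including Postgres, you can use the window analytic functions sum(..) over ( partition by ... order by ... ), max(..) over ( partition by ... ) and min(..) over ( partition by ... ) for your case. However, this gives you unpivoted results, which then have to be pivoted. To pivot them, we need another value that provides the ordinal position of each running score. So a rank() or row_number() function is needed in the subquery, and the values it produces are used in the main query. Therefore, consider: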

    select user_id,
           max(case when rnk = 1 then score end) as score_one,
           max(case when rnk = 2 then score end) as score_two,
           max(case when rnk = 3 then score end) as score_three,
           max(max_score) as max_score,
           min(min_score) as min_score
      from
      (
       select user_id,
              rank() over ( partition by user_id order by date ) as rnk,
              sum(score) over ( partition by user_id order by date ) as score,
              max(score) over ( partition by user_id ) as max_score,
              min(score) over ( partition by user_id ) as min_score
         from scores
       ) q
      group by user_id
