当前位置:网站首页>电商项目常见连续登录,消费,日期等问题
电商项目常见连续登录,消费,日期等问题
2022-08-02 14:05:00 【大学生爱编程】
一.数据,建表语句
id datestr amount
1,2019-02-08,6214.23
1,2019-02-08,6247.32
1,2019-02-09,85.63
1,2019-02-09,967.36
1,2019-02-10,85.69
1,2019-02-12,769.85
1,2019-02-13,943.86
1,2019-02-14,538.42
1,2019-02-15,369.76
1,2019-02-16,369.76
1,2019-02-18,795.15
1,2019-02-19,715.65
1,2019-02-21,537.71
2,2019-02-08,6214.23
2,2019-02-08,6247.32
2,2019-02-09,85.63
2,2019-02-09,967.36
2,2019-02-10,85.69
2,2019-02-12,769.85
2,2019-02-13,943.86
2,2019-02-14,943.18
2,2019-02-15,369.76
2,2019-02-18,795.15
2,2019-02-19,715.65
2,2019-02-21,537.71
3,2019-02-08,6214.23
3,2019-02-08,6247.32
3,2019-02-09,85.63
3,2019-02-09,967.36
3,2019-02-10,85.69
3,2019-02-12,769.85
3,2019-02-13,943.86
3,2019-02-14,276.81
3,2019-02-15,369.76
3,2019-02-16,369.76
3,2019-02-18,795.15
3,2019-02-19,715.65
3,2019-02-21,537.71
create table deal_tb(
id string
,datestr string
,amount string
)row format delimited fields terminated by ',';
二.计算逻辑
1.按照用户和日期进行分组,求每用户每天的消费
select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr;
2.根据用户id进行开窗按照日期进行排序,如果日期减去序号数是相同的说明这几行日期是连续的
对日期进行开窗排序
select t1.id as id,t1.datestr as datestr,t1.sum_amount as sum_amount,row_number() over(partition by id order by datestr) as rn from (select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr) t1;
上表起名为tt1,注意用到日期函数date_sub(tt1.datestr,tt1.rn)
日期减去序号再得出一列
select tt1.id,tt1.datestr,tt1.sum_amount,date_sub(tt1.datestr,tt1.rn) as grep from (select t1.id as id,t1.datestr as datestr,t1.sum_amount as sum_amount,row_number() over(partition by id order by datestr) as rn from (select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr) t1)tt1;
临时结果
1 2019-02-08 12461.55 2019-02-07
1 2019-02-09 1052.99 2019-02-07
1 2019-02-10 85.69 2019-02-07
1 2019-02-12 769.85 2019-02-08
1 2019-02-13 943.86 2019-02-08
1 2019-02-14 538.42 2019-02-08
1 2019-02-15 369.76 2019-02-08
1 2019-02-16 369.76 2019-02-08
1 2019-02-18 795.15 2019-02-09
1 2019-02-19 715.65 2019-02-09
1 2019-02-21 537.71 2019-02-10
2 2019-02-08 12461.55 2019-02-07
2 2019-02-09 1052.99 2019-02-07
2 2019-02-10 85.69 2019-02-07
2 2019-02-12 769.85 2019-02-08
2 2019-02-13 943.86 2019-02-08
2 2019-02-14 943.18 2019-02-08
2 2019-02-15 369.76 2019-02-08
2 2019-02-18 795.15 2019-02-10
2 2019-02-19 715.65 2019-02-10
2 2019-02-21 537.71 2019-02-11
3 2019-02-08 12461.55 2019-02-07
3 2019-02-09 1052.99 2019-02-07
3 2019-02-10 85.69 2019-02-07
3 2019-02-12 769.85 2019-02-08
3 2019-02-13 943.86 2019-02-08
3 2019-02-14 276.81 2019-02-08
3 2019-02-15 369.76 2019-02-08
3 2019-02-16 369.76 2019-02-08
3 2019-02-18 795.15 2019-02-09
3 2019-02-19 715.65 2019-02-09
3 2019-02-21 537.71 2019-02-10
3.统计用户连续交易总额,连续登录天数连续登录开始时间个结束时间以及间隔天数
连续交易总额:
以id和grep进行分组,gerp相同的表示是连续交易的
select ttt1.id,ttt1.grep,sum(ttt1.sum_amount) from
(select tt1.id as id,tt1.datestr as datestr,tt1.sum_amount as sum_amount,date_sub(tt1.datestr,tt1.rn) as grep from (select t1.id as id,t1.datestr as datestr,t1.sum_amount as sum_amount,row_number() over(partition by id order by datestr) as rn from (select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr) t1)tt1)ttt1 group by id,grep;
临时结果:(小数列可以用round函数处理一下:round(sum(ttt1.sum_amount)))
1 2019-02-07 13600.23
1 2019-02-08 2991.6500000000005
1 2019-02-09 1510.8
1 2019-02-10 537.71
2 2019-02-07 13600.23
2 2019-02-08 3026.6499999999996
2 2019-02-10 1510.8
2 2019-02-11 537.71
3 2019-02-07 13600.23
3 2019-02-08 2730.04
3 2019-02-09 1510.8
3 2019-02-10 537.71
根据id,grep分组后查询列增加count(1)即可得到连续登录天数(代码省略)
根据id,grep分组后的datestr列,最小的即为连续登录的开始时间,最大的即为连续登录的结束时间
多出连续登录的开始时间和结束时间两列:
select ttt1.id,ttt1.grep,sum(ttt1.sum_amount),min(datestr),max(datestr),count(1) from (select tt1.id as id,tt1.datestr as datestr,tt1.sum_amount as sum_amount,date_sub(tt1.datestr,tt1.rn) as grep from (select t1.id as id,t1.datestr as datestr,t1.sum_amount as sum_amount,row_number() over(partition by id order by datestr) as rn from (select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr) t1)tt1)ttt1 group by id,grep;
4.间隔时间:指的是每一次连续登录时间的间隔
根据id进行聚合开窗,grep相减得到的值就是所求
举例:排序开窗后的结果
datestr rn grep
2.8 1 2.7
2.9 2 2.7 2.9-2.7=2 间隔两天
2.10 3 2.7
2.13 4 2.9 2.10与2.13之间也是间隔两天
2.14 5 2.9
2.15 6 2.9
grep相同表示是连续登录的,根据grep分组后,相减得到的就是连续登录的间隔时间
聚合开窗函数:lag(字段,n):获取当前行的前几行的那一行数据
select ttt1.id,ttt1.grep,sum(ttt1.sum_amount),min(datestr),max(datestr),count(1),datediff(grep,lag(grep,1) over(partition by ttt1.id order by grep)) from (select tt1.id as id,tt1.datestr as datestr,tt1.sum_amount as sum_amount,date_sub(tt1.datestr,tt1.rn) as grep from (select t1.id as id,t1.datestr as datestr,t1.sum_amount as sum_amount,row_number() over(partition by id order by datestr) as rn from (select id,datestr,sum(amount) as sum_amount from deal_tb group by id,datestr) t1)tt1)ttt1 group by id,grep;
边栏推荐
- Briefly write about the use and experience of PPOCRLabel
- [VCU] Detailed S19 file (S-record)
- Web Design (Beginners) [easy to understand]
- Introduction and use of Haystack
- uniapp小程序禁止遮罩弹窗下的页面滚动的完美解决办法
- Flask request application context source code analysis
- 宝塔面板搭建小说CMS管理系统源码实测 - ThinkPHP6.0
- uni-app页面、组件视图数据无法刷新问题的解决办法
- 标签加id 和 加号 两个文本框 和一个var 赋值
- C语言——断言assert的使用
猜你喜欢
随机推荐
Briefly write about the use and experience of PPOCRLabel
C语言初级—数组元素的增删改查
什么是 Web 3.0:面向未来的去中心化互联网
ToF相机从Camera2 API中获取DEPTH16格式深度图
绕过正则实现SQL注入
无人驾驶综述:摘要
C语言一维数组练习——将m个元素移动到数组尾部
yolov5 improvement (1) Add attention focus mechanism
C语言初级—水仙花数
C语言sizeof和strlen的区别
C语言待解决
猜数字游戏,猜错10次关机(srand、rand、time)随机数生成三板斧(详细讲解!不懂问我!)
宝塔搭建PESCMS-Ticket开源客服工单系统源码实测
Eslint规则大全
binlog与iptables防nmap扫描
Unit 11 Serializers
MySQL知识总结 (一) 数据类型
drf source code analysis and global catch exception
2022最新交规记忆重点
主存储器(一)