Quick Review for Interviews (II): Why Is Cross Entropy Useful?
2022-07-19 16:44:00 【Zinc a】
Cross entropy (CrossEntropy)
The multi-class cross-entropy loss is defined as:

$$J = -\frac{1}{m}\sum_{i=1}^m\sum_{k=1}^K y_k^{(i)}\log\big(p_k^{(i)}\big)$$
where $m$ is the number of samples, $K$ is the number of classes, $y_k^{(i)}$ is the $k$-th component of the one-hot label of sample $i$ (it is 1 when sample $i$ belongs to class $k$, and 0 otherwise), and $p_k^{(i)}$ is the predicted probability that sample $i$ belongs to class $k$.
Because the labels are one-hot, $y_k^{(i)} = 0$ for every class except the true one, so only the true-class term survives in $\sum_{k=1}^K y_k^{(i)}\log(p_k^{(i)})$. For example, if the true class of sample $i$ is class 1, then $\sum_{k=1}^K y_k^{(i)}\log(p_k^{(i)}) = y_1^{(i)}\log(p_1^{(i)}) = \log(p_1^{(i)})$.
So for a single sample, all that matters is the relationship between the label value $1$ and the predicted probability $p$ of the true class.
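This one-hot reduction is easy to verify numerically. Below is a minimal sketch (the `cross_entropy` helper and the example numbers are my own illustration, not from the original post): the full double sum gives exactly the same value as averaging $-\log(p_{\text{true}})$ over the samples.

```python
import numpy as np

def cross_entropy(y_onehot, p):
    """J = -(1/m) * sum_i sum_k y_k^(i) * log(p_k^(i))."""
    m = y_onehot.shape[0]
    return -np.sum(y_onehot * np.log(p)) / m

# Two samples, three classes; the true classes are 0 and 2.
y = np.array([[1, 0, 0],
              [0, 0, 1]], dtype=float)
p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.3, 0.6]])

full = cross_entropy(y, p)
# Because y is one-hot, only the true-class term survives:
shortcut = -(np.log(0.7) + np.log(0.6)) / 2
print(full, shortcut)  # the two values agree
```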
(Figure: the graph of $y = \log(x)$.)
From the graph: as $p$ approaches 0, $\log(p_1^{(i)})$ becomes very negative, so $-\log(p_1^{(i)})$ becomes very large (the minus sign comes from the $-$ in front of $J$, which makes the loss positive); the closer $p$ is to 1, the smaller the loss.
For example, when $p = 0.03$, $\log(p) \approx -3.51$, so $loss = -\log(p) \approx 3.51$; when $p = 0.5$, $\log(p) \approx -0.69$, so $loss = -\log(p) \approx 0.69$.
In this way, the multi-class task is reduced to measuring, on the $\log$ curve, how far the model's predicted probability for the true class is from 1: the closer to 1, the lower the loss; the closer to 0, the higher the loss.
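The per-sample numbers above can be reproduced directly with the natural logarithm (a small sketch of my own, using $loss = -\log(p_{\text{true}})$):

```python
import math

# Per-sample cross-entropy loss for the true class: loss = -log(p_true).
for p in (0.03, 0.5, 0.99):
    print(f"p = {p:.2f} -> loss = {-math.log(p):.2f}")
# p = 0.03 -> loss = 3.51  (low confidence in the true class, high loss)
# p = 0.50 -> loss = 0.69
# p = 0.99 -> loss = 0.01  (near-certain correct prediction, low loss)
```

Note that the loss grows without bound as $p \to 0$, which is exactly what penalizes confident wrong predictions.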