Enterprise Digital Neural Network System.doc
6.3 A Stick-breaking Approach to IBP

1 Introduction to the Dirichlet Distribution

An example of a pmf is an ordinary six-sided die - to sample the pmf you roll the die and produce a number from one to six. But real dice are not exactly uniformly weighted, due to the laws of physics and the reality of manufacturing. A bag of 100 real dice is an example of a random pmf - to sample this random pmf you put your hand in the bag and draw out a die, that is, you draw a pmf. A bag of dice manufactured using a crude process 100 years ago will likely have probabilities that deviate wildly from the uniform pmf, whereas a bag of state-of-the-art dice used by Las Vegas casinos may have barely perceptible imperfections. We can model the randomness of pmfs with the Dirichlet distribution.

One application area where the Dirichlet has proved to be particularly useful is in modeling the distribution of words in text documents [9]. If we have a dictionary containing $k$ possible words, then a particular document can be represented by a pmf of length $k$ produced by normalizing the empirical frequency of its words. A group of documents produces a collection of pmfs, and we can fit a Dirichlet distribution to capture the variability of these pmfs. Different Dirichlet distributions can be used to model documents by different authors or documents on different topics.

In this section, we describe the Dirichlet distribution and some of its properties. In Sections 1.2 and 1.4, we illustrate common modeling scenarios in which the Dirichlet is frequently used: first, as a conjugate prior for the multinomial distribution in Bayesian statistics, and second, in the context of the compound Dirichlet (a.k.a. Pólya distribution), which finds extensive use in machine learning and natural language processing. Then, in Section 2, we discuss how to generate realizations from the Dirichlet using three methods: urn-drawing, stick-breaking, and transforming Gamma random variables. In Sections 3 and 6, we delve into Bayesian non-parametric statistics, introducing the Dirichlet process, the Chinese restaurant process, and the Indian buffet process.

1.1 Definition of the Dirichlet Distribution

A pmf with $k$ components lies on the $(k-1)$-dimensional probability simplex, which is a surface in $\mathbb{R}^k$ denoted by $\Delta_k$ and defined to be the set of vectors whose $k$ components are non-negative and sum to 1, that is

$$\Delta_k = \Big\{ q \in \mathbb{R}^k \;\Big|\; \sum_{i=1}^{k} q_i = 1,\ q_i \ge 0 \text{ for } i = 1, 2, \ldots, k \Big\}.$$

While the set $\Delta_k$ lies in a $k$-dimensional space, $\Delta_k$ is itself a $(k-1)$-dimensional object. As an example, Fig. 1 shows the two-dimensional probability simplex for $k = 3$ events lying in three-dimensional Euclidean space. Each point $q$ in the simplex can be thought of as a probability mass function in its own right. This is because each component of $q$ is non-negative, and the components sum to 1. The Dirichlet distribution can be thought of as a probability distribution over the $(k-1)$-dimensional probability simplex $\Delta_k$; that is, as a distribution over pmfs of length $k$.

Dirichlet distribution: Let $Q = [Q_1, Q_2, \ldots, Q_k]$ be a random pmf, that is $Q_i \ge 0$ for $i = 1, 2, \ldots, k$ and $\sum_{i=1}^{k} Q_i = 1$. In addition, suppose that $\alpha = [\alpha_1, \alpha_2, \ldots, \alpha_k]$, with $\alpha_i > 0$ for each $i$, and let $\alpha_0 = \sum_{i=1}^{k} \alpha_i$. Then, $Q$ is said to have a Dirichlet distribution with parameter $\alpha$, which we denote by $Q \sim \mathrm{Dir}(\alpha)$, if it has [1] $f(q; \alpha) = 0$ if $q$ is not a pmf, and if $q$ is a pmf then

$$f(q; \alpha) = \frac{\Gamma(\alpha_0)}{\prod_{i=1}^{k} \Gamma(\alpha_i)} \prod_{i=1}^{k} q_i^{\alpha_i - 1}. \tag{1}$$

[1] The density of the Dirichlet is positive only on the simplex, which as noted previously, is a $(k-1)$-dimensional object living in a $k$-dimensional space. Because the density must satisfy $P(Q \in A) = \int_A f(q; \alpha)\, d\mu(q)$ for some measure $\mu$, we must restrict the measure to being over a $(k-1)$-dimensional space; otherwise, integrating over a $(k-1)$-dimensional subset of a $k$-dimensional space will always give an integral of 0. Furthermore, to have a …
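
As a rough illustration of the density in Eq. (1) and of the bag-of-dice picture from the introduction, the sketch below evaluates $f(q; \alpha)$ and draws random pmfs by normalizing independent Gamma variables, one of the three generation methods the text lists for Section 2. The function names `dirichlet_pdf` and `sample_dirichlet`, the tolerance `tol`, and the concentration values 1000 and 2 are assumptions chosen for the example and do not come from the original text.

```python
import math
import random


def dirichlet_pdf(q, alpha, tol=1e-9):
    """Evaluate the Dirichlet density of Eq. (1) at q.

    Returns 0.0 when q is not a pmf (a negative entry, or a sum that
    differs from 1 by more than the illustrative tolerance `tol`).
    """
    if len(q) != len(alpha):
        raise ValueError("q and alpha must have the same length")
    if any(a <= 0 for a in alpha):
        raise ValueError("every alpha_i must be strictly positive")
    if any(qi < 0 for qi in q) or abs(sum(q) - 1.0) > tol:
        return 0.0

    alpha0 = sum(alpha)
    # Work in log space: log Gamma(alpha_0) - sum_i log Gamma(alpha_i).
    log_norm = math.lgamma(alpha0) - sum(math.lgamma(a) for a in alpha)

    log_kernel = 0.0
    for qi, a in zip(q, alpha):
        if qi == 0.0:
            # q_i^(alpha_i - 1) at q_i = 0: diverges, vanishes, or equals 1.
            if a < 1.0:
                return math.inf
            if a > 1.0:
                return 0.0
            continue
        log_kernel += (a - 1.0) * math.log(qi)
    return math.exp(log_norm + log_kernel)


def sample_dirichlet(alpha, rng):
    """Draw one pmf from Dir(alpha) by normalizing Gamma(alpha_i, 1) draws."""
    gammas = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(gammas)
    return [g / total for g in gammas]


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical parameters: a large alpha_0 concentrates draws near the
    # uniform pmf (well-made dice); a small alpha_0 lets them vary widely.
    casino_die = sample_dirichlet([1000.0] * 6, rng)
    crude_die = sample_dirichlet([2.0] * 6, rng)
    print("casino die:", [round(p, 3) for p in casino_die])
    print("crude die: ", [round(p, 3) for p in crude_die])
    # With alpha = [1, 1, 1] the density is constant on the simplex and
    # equals Gamma(3) = 2 (up to floating-point rounding).
    print(dirichlet_pdf([1 / 3, 1 / 3, 1 / 3], [1.0, 1.0, 1.0]))
```

Each call to `sample_dirichlet` plays the role of reaching into the bag and drawing one die. The log-space evaluation is a design choice to avoid overflow in the Gamma functions when the $\alpha_i$ are large.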
- Keywords:
- Enterprise, digital, neural network, system