Maximum Entropy (Maxent) Models
- a maxent model is the solution of a Mathematical Programming/Optimization problem in which the objective function is the entropy formula 𝐻𝐏(𝐏), maximized subject to any additional constraints
Maxent Model - Example
let's consider a discrete random variable 𝐶 with 2 outcomes: ℎ and 𝑡
- 𝐏(𝐶=ℎ) = probability of seeing heads
- 𝐏(𝐶=𝑡) = probability of seeing tails
below is the formula for univariate entropy; we want to maximize 𝐻𝐏(𝐏) subject to the constraints of the model
- 𝐻𝐏(𝐏) = 𝛴𝑥∊{ℎ,𝑡} [ - 𝐏(𝐶=𝑥) 𝑙𝑛 𝐏(𝐶=𝑥) ]
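as a minimal sketch of the formula above, the entropy of the coin can be computed directly. the function name `entropy` and the convention that a term with probability 0 contributes 0 are assumptions made for illustration:

```python
import math

def entropy(probs):
    """Return the sum of -p * ln(p) over the given values, in nats.

    Terms with p == 0 are skipped (the usual 0 * ln(0) = 0 convention).
    """
    return sum(-p * math.log(p) for p in probs if p > 0)

# fair coin: P(C=h) = P(C=t) = 0.5
print(entropy([0.5, 0.5]))   # equals ln 2, about 0.693

# biased coin: P(C=h) = 0.3, P(C=t) = 0.7
print(entropy([0.3, 0.7]))   # about 0.611, lower than the fair coin
```

the function does not check that its input sums to 1, so it can also be evaluated on un-normalized values.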
below are 3 different models
Model With No Constraints | Model With 1 Constraint | Model With 2 Constraints |
---|---|---|
NONE; here 𝐏(𝐶) is allowed to be an un-normalized distribution, i.e. 𝐏(𝐶) does not have to be a probability distribution | 𝐏(𝐶=ℎ) + 𝐏(𝐶=𝑡) = 1; this constrains 𝐏(𝐶) to be a normalized distribution, i.e. 𝐏(𝐶) is a probability distribution | 𝐏(𝐶=ℎ) + 𝐏(𝐶=𝑡) = 1 and 𝐏(𝐶=ℎ) = 0.3 |
thus there is a 2D plane of possible candidates | thus there is a 1D line of possible candidates | thus there is a single point as the only possible candidate |
𝐻𝐏(𝐏) is maximized when: 𝐏(𝐶=ℎ) = 1/𝑒, 𝐏(𝐶=𝑡) = 1/𝑒; this is because each term -𝐏(𝐶=𝑥)𝑙𝑛𝐏(𝐶=𝑥) reaches its max of 1/𝑒 at 𝐏(𝐶=𝑥) = 1/𝑒 | 𝐻𝐏(𝐏) is maximized when: 𝐏(𝐶=ℎ) = 1/2, 𝐏(𝐶=𝑡) = 1/2 | 𝐻𝐏(𝐏) is maximized when: 𝐏(𝐶=ℎ) = 0.3, 𝐏(𝐶=𝑡) = 0.7, which is the only candidate point |
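the three maximizers can be checked numerically. the sketch below uses a brute-force grid search rather than calculus; the grid over [0, 1] with step 0.001 and the helper names `term` and `H` are assumptions made for illustration:

```python
import math

def term(p):
    """One entropy term -p * ln(p), with the convention 0 * ln(0) = 0."""
    return -p * math.log(p) if p > 0 else 0.0

def H(ph, pt):
    """Entropy of the coin for P(C=h) = ph and P(C=t) = pt."""
    return term(ph) + term(pt)

grid = [i / 1000 for i in range(1001)]  # candidate values 0.000 .. 1.000

# no constraints: P(h) and P(t) vary independently, and H is a sum of
# one term per outcome, so each term can be maximized on its own
best_free = max(grid, key=term)

# one constraint: P(h) + P(t) = 1 leaves only P(h) free (a 1D line)
best_norm = max(grid, key=lambda ph: H(ph, 1 - ph))

# two constraints: P(h) = 0.3 forces P(t) = 0.7, a single candidate
only = (0.3, 0.7)

print(best_free, 1 / math.e)   # grid maximizer lies next to 1/e
print(best_norm)               # maximizer on the normalized line
print(H(best_free, best_free), H(best_norm, 1 - best_norm), H(*only))
```

the printed entropies shrink from left to right: each added constraint can only shrink the set of candidates, so the maximum cannot go up.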
Why Find Maximum Entropy Model?
maximizing entropy in effect helps us find an estimated distribution model 𝐏ˆ that:
- minimizes commitment, i.e. assumes nothing beyond the constraints (which is another way of saying maximizes entropy)
- resembles a reference distribution that stands in for the true population distribution (in practice, the empirical distribution)
these are the two properties we want in the estimated distribution model 𝐏ˆ
Solution
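the solution is to maximize entropy 𝐻, subject to feature-based constraints, one per feature 𝑓𝑖:
- 𝐄𝐏[𝑓𝑖] = 𝐄𝐏ˆ[𝑓𝑖] ↔ 𝛴𝑥∊𝑓𝑖 𝐏𝑥 = 𝐶𝑖
- i.e. the expectation of 𝑓𝑖 must be the same under the model and under the empirical distribution; for an indicator feature, the model probabilities 𝐏𝑥 of the outcomes 𝑥 on which 𝑓𝑖 fires must sum to the empirical value 𝐶𝑖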
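for a single indicator feature the constrained problem has a simple closed form: the mass 𝐶𝑖 is spread uniformly over the outcomes where the feature fires, and the remaining mass uniformly over the rest. the sketch below illustrates this under assumptions that are not in the notes: a hypothetical outcome space `a, b, c, d`, a hypothetical feature firing on `a` and `b`, and 𝐶𝑖 = 0.6. the helper `maxent_one_indicator` is illustrative and assumes both groups are non-empty:

```python
import math
import random

def entropy(probs):
    """Sum of -p * ln(p) over the given values, skipping zeros."""
    return sum(-p * math.log(p) for p in probs if p > 0)

def maxent_one_indicator(outcomes, fires, c):
    """Put total mass c on `fires` and 1 - c elsewhere, uniform within each group.

    Assumes both groups are non-empty.
    """
    inside = [x for x in outcomes if x in fires]
    outside = [x for x in outcomes if x not in fires]
    dist = {x: c / len(inside) for x in inside}
    dist.update({x: (1 - c) / len(outside) for x in outside})
    return dist

outcomes = ["a", "b", "c", "d"]   # hypothetical outcome space
fires = {"a", "b"}                # outcomes on which the feature equals 1
dist = maxent_one_indicator(outcomes, fires, c=0.6)
print(dist)                       # a and b share 0.6, c and d share 0.4

# sanity check: random distributions satisfying the same constraint
# (mass 0.6 on {a, b}, mass 0.4 on {c, d}) never beat its entropy
random.seed(0)
best = entropy(dist.values())
for _ in range(1000):
    u, v = random.random(), random.random()
    candidate = [0.6 * u, 0.6 * (1 - u), 0.4 * v, 0.4 * (1 - v)]
    assert entropy(candidate) <= best + 1e-12
```

with several overlapping features there is in general no such closed form, and the maximizer is usually found by numerical optimization.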
adding constraints/features:
- lowers maximum entropy
- raises the maximum likelihood of data
- brings the distribution model further from the uniform distribution
- brings the distribution model closer to the empirical distribution
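the four effects listed above can be seen on a toy example. everything concrete here is an assumption for illustration: the counts are hypothetical, the two indicator features (one firing on `a` and `b`, one firing on `a` only) are hypothetical, and the three distributions are written out by hand as the maxent solutions for those constraints (uniform within each group the constraints cannot tell apart). KL divergence is used as the measure of distance:

```python
import math

def entropy(p):
    """Entropy of a distribution given as {outcome: probability}."""
    return sum(-v * math.log(v) for v in p.values() if v > 0)

def log_likelihood(counts, p):
    """Log-likelihood of the observed counts under distribution p."""
    return sum(n * math.log(p[x]) for x, n in counts.items())

def kl(p, q):
    """KL divergence KL(p || q) in nats."""
    return sum(v * math.log(v / q[x]) for x, v in p.items() if v > 0)

counts = {"a": 5, "b": 3, "c": 1, "d": 1}   # hypothetical observations
total = sum(counts.values())
empirical = {x: n / total for x, n in counts.items()}
uniform = {x: 1 / len(counts) for x in counts}

# maxent solutions as constraints are added, with targets taken from
# the empirical distribution: P(a) + P(b) = 0.8, then also P(a) = 0.5
models = [
    ("no features", uniform),
    ("P(a)+P(b)=0.8", {"a": 0.4, "b": 0.4, "c": 0.1, "d": 0.1}),
    ("plus P(a)=0.5", {"a": 0.5, "b": 0.3, "c": 0.1, "d": 0.1}),
]

rows = [
    (name, entropy(p), log_likelihood(counts, p), kl(p, uniform), kl(empirical, p))
    for name, p in models
]
for name, h, ll, from_uniform, to_empirical in rows:
    print(f"{name:15s} H={h:.4f} logL={ll:.4f} "
          f"KL(model||uniform)={from_uniform:.4f} KL(empirical||model)={to_empirical:.4f}")
```

reading the printed rows top to bottom, the entropy falls, the log-likelihood rises, the divergence from uniform grows, and the divergence from the empirical distribution shrinks.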
Maxent - Properties
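the constrained entropy maximization is a convex problem (entropy is concave and the constraints are linear); its dual, the Maximum Likelihood Estimation (MLE) formulation for exponential models, is also convex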