DocumentCode :
13503
Title :
A Difference of Convex Functions Approach to Large-Scale Log-Linear Model Estimation
Author :
Tsiligkaridis, Theodoros ; Marcheret, E. ; Goel, Vikas
Author_Institution :
Dept. of Electr. Eng. & Comput. Sci., Univ. of Michigan, Ann Arbor, MI, USA
Volume :
21
Issue :
11
fYear :
2013
fDate :
Nov. 2013
Firstpage :
2255
Lastpage :
2266
Abstract :
We introduce a new class of parameter estimation methods for log-linear models. Our approach relies on the fact that minimizing a rational function of mixtures of exponentials is equivalent to minimizing a difference of convex functions. This allows us to construct convex auxiliary functions by applying the concave-convex procedure (CCCP). We consider a modification of CCCP in which a proximal term is added (ProxCCCP), and extend it further by introducing an ℓ1 penalty. To solve the "convex + ℓ1" auxiliary problem, we propose an approach called SeqGPSR, which is based on sequential application of the GPSR procedure. We present a convergence analysis of the algorithms, including sufficient conditions for convergence to a critical point of the objective function. We propose an adaptive procedure for varying the strength of the proximal regularization term in each ProxCCCP iteration, and show that this procedure (AProxCCCP) is effective in practice and stable under some mild conditions. The CCCP procedure and the proposed variants are applied to the task of optimizing the cross-entropy objective function for an audio frame classification problem. Class posteriors are modeled using log-linear models consisting of approximately 6 million parameters. Our results show that the CCCP variants achieve a much better cross-entropy objective value than direct optimization of the objective function by a first-order gradient-based approach, stochastic gradient descent, or the L-BFGS procedure.
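The proximal CCCP iteration named in the abstract can be illustrated on a toy problem. The sketch below is not the authors' implementation: the one-dimensional objective f(x) = x^4 - 2x^2, its split into the convex parts u(x) = x^4 and v(x) = 2x^2, the proximal weight `c`, and the function names are all assumptions chosen for illustration. The ℓ1 penalty, SeqGPSR, and the adaptive rule of AProxCCCP are not covered. Each step linearizes v at the current iterate, adds a proximal term (c/2)(x - x_k)^2, and minimizes the resulting convex auxiliary function.

```python
def f(x):
    """Toy difference-of-convex objective: u(x) - v(x) with u = x^4, v = 2x^2."""
    return x**4 - 2.0 * x**2


def prox_cccp_step(x_k, c):
    """One proximal CCCP step on the toy objective (illustrative only).

    Minimizes the convex auxiliary function
        x^4 - v'(x_k) * x + (c / 2) * (x - x_k)^2,   v'(x_k) = 4 * x_k.
    Its derivative g is strictly increasing for c >= 0, so the unique
    root of g is found by bisection.
    """
    def g(x):
        return 4.0 * x**3 + c * (x - x_k) - 4.0 * x_k

    # g(lo) < 0 < g(hi) holds for this bracket whenever c >= 0.
    lo, hi = -abs(x_k) - 2.0, abs(x_k) + 2.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if g(mid) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


def run_prox_cccp(x0, c, iters):
    """Return the list of iterates, starting from x0."""
    xs = [x0]
    for _ in range(iters):
        xs.append(prox_cccp_step(xs[-1], c))
    return xs


if __name__ == "__main__":
    xs = run_prox_cccp(x0=0.3, c=1.0, iters=100)
    print("final iterate:", xs[-1], "objective:", f(xs[-1]))
```

Setting `c = 0` recovers the plain CCCP step for this toy objective. In both cases the objective value does not increase from one iterate to the next, which is the descent property that makes the auxiliary-function construction useful.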
Keywords :
audio signal processing; convex programming; entropy; gradient methods; iterative methods; parameter estimation; signal classification; stochastic programming; ℓ1 penalty; ℓ1 auxiliary problem; AProxCCCP; CCCP modification; CCCP procedure; CCCP variants; L-BFGS procedure; ProxCCCP iteration; SeqGPSR; audio frame classification problem; class posteriors; concave-convex procedure; convex auxiliary functions; convex functions approach; cross-entropy objective function; cross-entropy objective value; first order gradient based approach; large-scale log-linear model estimation; log-linear models; parameter estimation methods; stochastic gradient descent; Large scale systems; Linear systems; Optimization; Parameter estimation; Pattern recognition; Concave-convex procedure; adaptive regularization; audio frame classification; proximal regularization; sparse optimization;
fLanguage :
English
Journal_Title :
Audio, Speech, and Language Processing, IEEE Transactions on
Publisher :
IEEE
ISSN :
1558-7916
Type :
jour
DOI :
10.1109/TASL.2013.2271592
Filename :
6548025