Please add comments here for "A new perspective of statistical modeling with PRISM" by Taisuke Sato and Neng-Fu Zhou.

From James Cussens:

I enjoyed reading this paper, and am certain that many of the ideas (particularly re tabulation) can be applied to my own work on SLPs.

I know that PRISM "explanation graphs" are a compact way of representing all proofs, but I still suspect that in some cases they will be very big. For example consider an HMM with many states (such as those used in bioinformatics). For a given output there are exponentially many possible state trajectories - its basically a very over-general stochastic regular grammar. Will there then be a problem with the size of the explanation graphs? I see the big win with your only-parse-once-and-then-store-the-result in cases where parsing really helps you, ie when the underlying grammar is not too general, and thus it is worth concentrating attention on the relatively small number of valid parses (relative, that is, to all potential parses). In short, is there any case where storing the results of parsing hurts you?

As you might expect, I feel the restriction to failure-free logic programs is too great. It is also, I believe, unnecessary. One need merely add "unobserved failures" to the missing data in your EM algorithm. This is what I did in the FAM algorithm for SLPs. No doubt there will be cases where this will lead to intractability, but not always.

Concerning the "loss of probability mass to infinite generation process" do you know about these papers?

@Article{chi98:_estim_probab_contex_free_gramm,

author = {Zhiyi Chi and Stuart Geman}, title = {Estimation of Probabilistic

Context-Free Grammars},

journal = {Computational Linguistics}, year = 1998, volume = 24, number = 2, pages = {299-305}

}

@Article{chi99:_statis_proper_probab_contex_free_gramm,

author = {Zhiyi Chi}, title = {Statistical Properties of Probabilistic

Context-Free Grammars},

journal = {Computational Linguistics}, year = 1999, volume = 25, number = 1, pages = {299-305}

}

They discuss this issue. I wanted to mention them in my MLJ paper, but the editors asked me to cut the paper down. Grrr.

From Taisuke Sato :

Thanks for your interesting comment. As you suspect, explanation graphs can grow out of the main memory. We do not take any specific measures to combat this problem in the current implemenatin of Prism1.6. However there are ways to avoid this memory-overflow by (1) sophisticated programming requiring less memory if possible, (2) compressing graphs, or (3) sharing subproofs among different goals, etc.(2) and (3) are future implementation issues.

Prism programs assume no loss of probabability mass, and this is a constraint on Prism programming sometimes causing my headache. Mayebe it is possible to count in failures as your FAM algorithm at the cost of incresing computation time. Loss to inifinity on the other hand seems to have no way out. So we were relieved when we found Chi's papers that assure no loss of infinity with learned parameters in PCFG cases:-)


Last edited on Friday, July 4, 2003 12:43:16 am.