
In this chapter, two partial solutions for the Training of CHnMMs were developed and tested: the Training algorithm based on the expectation-maximization paradigm updates an existing, completely specified model based on a trace of observations in order to make the model better explain the observations. The second Training algorithm, based on the maximum likelihood approach, uses a trace of observations to find the most likely values of unknown parameters in an incomplete model.
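
Conceptually, the maximum likelihood approach can be realized as an optimization of the Evaluation probability over the unknown parameters. The following Python sketch illustrates this idea for a single unknown parameter; it is an illustration only, not the algorithm developed in this chapter. The names build_model (which completes the incomplete model with a candidate parameter value) and evaluate (a stand-in for the CHnMM Evaluation algorithm, which computes the probability of the trace given a model) are hypothetical.

    # Minimal sketch of maximum likelihood Training for one unknown
    # parameter. "build_model" and "evaluate" are hypothetical stand-ins;
    # neither is part of an existing library.
    import math
    from scipy.optimize import minimize_scalar

    def ml_train_parameter(build_model, evaluate, trace, bounds):
        """Return the parameter value under which the trace is most likely."""
        def neg_log_likelihood(theta):
            p = evaluate(build_model(theta), trace)  # P(trace | model(theta))
            return math.inf if p <= 0.0 else -math.log(p)

        result = minimize_scalar(neg_log_likelihood, bounds=bounds,
                                 method='bounded')
        return result.x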

The practical feasibility of the EM-based approach is identical to that of the CHnMM Smoothing algorithm (cf. Chapter 6), since the algorithm has been shown to be implementable as an evaluation function to the recursive (or even to the ordinary iterative) CHnMM Smoothing algorithm, and this particular evaluation function barely increases the memory consumption or computation time.
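
The structure of this plug-in mechanism can be sketched as follows. The sketch is hypothetical in its interface details: forward_backward_step stands in for the actual CHnMM forward and backward computations of Chapter 6, and the accumulator shown here only hints at the statistics a real EM-based Training step would gather.

    # Sketch of the evaluation-function hook described above: the
    # Smoothing driver calls a user-supplied function once per
    # observation; the EM-based Training step is one such function.
    def smooth_with_hook(model, trace, forward_backward_step, evaluation_fn):
        """Skeleton of a Smoothing run with an evaluation-function hook."""
        for t in range(len(trace)):
            smoothed = forward_backward_step(model, trace, t)
            evaluation_fn(t, smoothed)  # adds little extra time or memory

    class EMAccumulator:
        """One possible evaluation function: accumulate per-state weights
        that an EM re-estimation pass could use (illustrative only)."""
        def __init__(self, n_states):
            self.state_weights = [0.0] * n_states

        def __call__(self, t, smoothed_probs):
            for state, prob in enumerate(smoothed_probs):
                self.state_weights[state] += prob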

Both algorithms can be used to train initial state and symbol emission probabilities. The EM-based algorithm can additionally train the parameters of many well-known probability distributions. However, while the similar Baum-Welch algorithm used to train HMMs has been proven to never result in a worse-fitting model, this behavior cannot be guaranteed for the EM-based CHnMM Training algorithm. Yet, it has been argued that the sometimes slightly suboptimal results of the EM-based approach will usually be acceptable in practice. Overall, the goal of this chapter, to provide a CHnMM Training algorithm with the same power as the HMM Baum-Welch Training algorithm, has largely been reached.
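
In practice, the missing monotonicity guarantee can be worked around by guarding each EM iteration with a likelihood comparison, so that a worse-fitting candidate model is simply rejected. The following sketch shows this safeguard; em_step and evaluate are hypothetical stand-ins for the EM-based Training step of this chapter and the Evaluation algorithm, respectively.

    # Hedged sketch of a likelihood guard around the EM-based Training
    # step. Unlike Baum-Welch for HMMs, the raw CHnMM EM step may yield
    # a worse-fitting model, so each candidate is checked before use.
    def guarded_training(model, trace, em_step, evaluate, max_iter=50):
        """Iterate EM updates, never accepting a worse-fitting model."""
        best_p = evaluate(model, trace)
        for _ in range(max_iter):
            candidate = em_step(model, trace)
            p = evaluate(candidate, trace)
            if p <= best_p:      # no improvement: stop, keep current model
                break
            model, best_p = candidate, p
        return model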

Still, no Training algorithm currently exists to determine the type of parametric probability distribution used to specify activity durations, or to train arbitrary distribution functions. And the EM-based CHnMM Training approach is not guaranteed to result in a model that is more likely to explain the trace of observations than the input model. Thus, while the newly developed algorithms enable the Training of certain aspects of CHnMMs for the very first time, the overall problem of completely training CHnMMs is not yet solved.

In this and the previous three chapters, algorithms for all four basic behavior reconstruction tasks derived from HMMs have been developed for the more expressive model class of CHnMMs. The next chapter concludes this work by assessing the success or failure in completing its goals. Furthermore, the possible impact of its findings on other research areas and potential applications will be discussed, and an outlook on possible further research will be given.

Chapter 8

Conclusion

In this work, Conversive Hidden non-Markovian Models, an extensive class of partially observable discrete stochastic models, have been introduced and defined. CHnMMs are more expressive than the previous state of the art in PODS models, and are in fact a superset of it (cf. Table 2.1 on page 16). The main benefit of CHnMMs over existing approaches is that they can model systems that are continuous in time with arbitrarily distributed activity durations and at the same time contain concurrent activities. They can thus be used to accurately model more real-life systems than was previously possible.

PODS models are usually used to solve one of four well-known behavior reconstruction tasks: Evaluation, Decoding, Smoothing and Training. For each of the four tasks, algorithms have been developed in this work that solve it for CHnMMs. Where applicable (i.e. for Evaluation, Decoding and Smoothing), the algorithms solve the tasks exactly, without having to resort to approximations. And for each task, at least one developed algorithm has been shown experimentally to be efficient enough to be practically feasible in real-life applications.

CHnMMs thus extend the state of the art in behavior reconstruction of partially observable discrete stochastic systems by expanding the set of practical problems whose behavior can be reconstructed.

8.1 Assessment of Goal Completion

The goal for this work (cf. Section 1.3) was to enable practitioners to use the four behavior reconstruction tasks Evaluation, Decoding, Smoothing and Training in partially observable discrete stochastic systems that are continuous in time and contain concurrent activities. Those systems can be modelled as CHnMMs, which have been introduced in Chapter 3, and algorithms for the four basic tasks have been developed in Chapters 4, 5, 6 and 7. Only the CHnMM Training algorithms have been shown to be somewhat limited in their effectiveness. Thus, the goals of this work have generally been reached.

Furthermore, two success criteria on those goals were defined that should ensure that the developed algorithms are not only of theoretical interest, but are actually practically applicable to real-life problems. To that end, the first criterion demanded that, where applicable, the algorithms must compute exact results and not approximations. For Evaluation, Decoding and Smoothing, algorithms have been developed that do not resort to any kind of approximation and are thus exact, fulfilling this success criterion. For the Training task, the exactness requirement is not applicable, since the task is only to find a better-matching model, but does not require any particular model to be returned. Thus, this mandatory success criterion has been met for the Training task as well.

The second success criterion is a soft criterion that demands that the developed algorithms be as efficient as possible with respect to memory consumption and computation time. This criterion was established in order for the algorithms to be computationally feasible for practical problems on commodity computer hardware, so that the cost savings realized through behavior reconstruction with CHnMMs are not eaten up by the additional cost of extensive computing hardware. To that end, algorithms for all four tasks have been shown to be practically feasible with respect to memory consumption and computation time.

This feasibility has been shown for a range of models that differ in the number of concurrent activities and the number as well as the connectivity of discrete states, and for long traces of observations (e.g. sequences of 10000 observations).

Furthermore, for each task the memory and time complexity of at least one developed algorithm has been shown to be at most O(n log(n)) in the length n of the observation sequence whose behavior is reconstructed. Thus, with increasing computing power, the length of an observation sequence whose behavior can be reconstructed in a given amount of time increases by almost the same factor.
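
One way to make this scaling argument precise (a short calculation, not taken from the preceding chapters): if the computing budget doubles, the new feasible trace length n' satisfies

\[
  n' \log n' = 2\, n \log n
  \;\Longrightarrow\;
  2n \cdot \frac{\log n}{\log 2n} \;<\; n' \;<\; 2n.
\]

The upper bound holds because n' > n implies \log n' > \log n; the lower bound then follows from n' < 2n, i.e. \log n' < \log 2n. Since \log 2n = \log n + \log 2, the factor \log n / \log 2n approaches 1 for large n, so n' indeed approaches 2n.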

So, the soft criterion of allowing efficient behavior reconstruction on CHnMMs has been met for a wide range of models and observation data.

Overall, the goals of this work have been reached and all success criteria have been adhered to. The research was thus successful.
