
Package      1-CP   n-CP   SSGD   BCFW   L-SSVM   Perceptron   ML

PyStruct      ✓      ✓      ✓      ✓       ✓          ✓        ✗
SVMstruct     ✓      ✓      ✗      ✗       ✓          ✗        ✗
Dlib          ✓      ✗      ✗      ✗       ✗          ✓        ✗
CRFsuite      ✗      ✗      ✗      ✗       ✗          ✓        ✓

Table 4.1: Comparison of learning algorithms implemented in popular structured prediction software packages. 1-CP stands for 1-slack cutting plane, n-CP for n-slack cutting plane, SSGD for stochastic subgradient descent learning of SSVMs, BCFW is as described in Section 4.2.4, L-SSVM stands for latent variable SSVMs, and ML for maximum likelihood learning.

Package      Multi-Class   Multi-Label   Chain   Graph   LSVM   LDCRF

PyStruct          ✓             ✓           ✓       ✓      ✓       ✓
SVMstruct         ✓             ✓           ✗       ✗      ✗       ✗
Dlib              ✓             ✗           ✓       ✓      ✗       ✗
CRFsuite          ✗             ✗           ✓       ✗      ✗       ✗

Table 4.2: Comparison of models implemented in popular structured prediction software packages. LSVM stands for the latent multi-class SVM, LDCRF for latent dynamic conditional random fields.

variable SSVMs. See Section 4.2 for a detailed description of the algorithms. The cutting plane implementation uses the cvxopt package [Dahl and Vandenberghe, 2006] for quadratic optimization.

The structure of the problem is encoded using model classes, which compute the joint feature function Φ. PyStruct implements models for many common cases, such as multi-class and multi-label classification, conditional random fields with constant or data-dependent pairwise potentials, and several latent variable models. The maximization for finding y in Equation 4.2 is carried out using highly optimized implementations from external libraries. PyStruct includes support for using OpenGM [Kappes et al., 2013], LibDAI [Mooij, 2010], fusion moves [Rother et al., 2007, Lempitsky et al., 2010], and AD3 [Martins et al., 2011]. It also includes an interface to a general purpose linear programming solver from cvxopt [Dahl and Vandenberghe, 2006].
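To make the role of a model class concrete, the following minimal sketch shows the kind of interface such a class exposes for multi-class classification. The method names are modeled on PyStruct's conventions (joint_feature, inference, loss), but the class and signatures here are simplified illustrations rather than PyStruct code.

import numpy as np

# Minimal sketch of a model class for multi-class classification.
class MultiClassModel:
    def __init__(self, n_features, n_classes):
        self.n_features, self.n_classes = n_features, n_classes
        self.size_joint_feature = n_features * n_classes

    def joint_feature(self, x, y):
        # Phi(x, y): place the input features in the block belonging to class y.
        phi = np.zeros((self.n_classes, self.n_features))
        phi[y] = x
        return phi.ravel()

    def inference(self, x, w):
        # argmax_y <w, Phi(x, y)>; trivial for multi-class classification.
        scores = w.reshape(self.n_classes, self.n_features) @ x
        return int(np.argmax(scores))

    def loss(self, y_true, y_pred):
        # 0/1 loss; structured models would use a Hamming-type loss instead.
        return int(y_true != y_pred)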

Table 4.1 and Table 4.2 list the learning algorithms and models that are implemented in PyStruct and compare them to other publicly available structured prediction libraries.


4.4.2 Project Goals

Modularity PyStruct separates the algorithms for parameter estimation and inference from the task-dependent formulation of Φ. This allows practitioners, for example in computer vision or natural language processing, to improve their model without changing any optimization code. On the other hand, researchers working on better inference or parameter learning can easily benchmark their improvements on a wide array of applications.
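As a brief sketch of this separation (assuming PyStruct's model and learner names), the same model object can be handed to different learners without touching the model code:

from pystruct.models import GraphCRF
from pystruct.learners import NSlackSSVM, SubgradientSSVM, FrankWolfeSSVM

# The model encodes the task; the learner only estimates parameters.
model = GraphCRF(inference_method='max-product')

# Any of these learners accepts the same model object unchanged.
learner = NSlackSSVM(model, C=0.1)
# learner = SubgradientSSVM(model, max_iter=100)
# learner = FrankWolfeSSVM(model, C=0.1)   # BCFW, see Section 4.2.4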

Completeness PyStruct aims at providing complete predictors that can be used directly in applications. It contains model formulations for many typical scenarios. This is in contrast to SVMstruct, which provides no models at all, requiring the user to develop significant amounts of code even for simple tasks.

Efficiency While PyStruct focuses on usability, providing efficient and competitive implementations is important to allow fast prototyping and scaling to large datasets. PyStruct achieves the same runtime performance as the popular SVMstruct package for cutting plane algorithms, and provides implementations of the BCFW and subgradient methods that scale to large datasets.

Documentation and Examples PyStruct provides full documentation of all classes and functions. It also provides examples for many important applications, such as sequence tagging, multi-label classification, and image segmentation.

Furthermore, standard benchmarks are included as examples, which allows easy comparison with the literature.

Integration To improve usability, PyStruct is interoperable with other numeric and scientific Python projects, such as scikit-learn [Pedregosa et al., 2011], mahotas [Coelho, 2013], gensim [Řehůřek and Sojka, 2010], and scikit-image. This allows users to build powerful applications with little effort. In particular, most of the model-selection methods of scikit-learn can be used directly with PyStruct.
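For example, a cross-validated grid search over the regularization parameter C can be written directly with scikit-learn's tools. The following sketch assumes that the learner exposes the usual fit/score estimator interface; the parameter grid is illustrative:

from sklearn.model_selection import GridSearchCV
from pystruct.models import MultiClassClf
from pystruct.learners import NSlackSSVM

# Tune C by cross-validation using scikit-learn's model selection utilities.
search = GridSearchCV(NSlackSSVM(MultiClassClf()),
                      param_grid={'C': [0.01, 0.1, 1.0]}, cv=3)
# search.fit(X_train, y_train)
# print(search.best_params_)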

Testing PyStruct contains a test suite with 80% line coverage. It also employs continuous integration to ensure stability and a seamless user experience.

Listing 1: Example of defining and learning a CRF model.

1  model = crfs.EdgeFeatureGraphCRF(
2      class_weight=inverse_frequency,
3      symmetric_edge_features=[0, 1],
4      antisymmetric_edge_features=[2],
5      inference_method='qpbo')
6
7  ssvm = learners.NSlackSSVM(model, C=0.01, n_jobs=-1)
8  ssvm.fit(X, Y)

4.4.3 Usage Example: Semantic Image Segmentation

We demonstrate the use of PyStruct on the task of semantic image segmentation, the main focus of this work. The example shows how to learn an n-slack structured support vector machine on a superpixel-based CRF on the Pascal VOC dataset. Details of the experiment can be found in Section 4.3. Each sample (corresponding to one entry of the list X) is represented as a tuple consisting of input features and a graph representation.
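A minimal sketch of one plausible data layout is shown below; it assumes the (node_features, edges, edge_features) convention of the EdgeFeatureGraphCRF model used in Listing 1, and the array sizes are purely illustrative:

import numpy as np

node_features = np.random.rand(50, 21)      # 50 superpixels, 21 unary features (illustrative)
edges = np.array([[0, 1], [1, 2]])          # adjacency between superpixels
edge_features = np.random.rand(2, 3)        # constant, color similarity, vertical position
x = (node_features, edges, edge_features)   # one entry of the list X
y = np.zeros(50, dtype=np.int64)            # one ground-truth label per superpixel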

The source code is shown in Listing 1. Lines 1–5 declare a model using parametric edge potentials for arbitrary graphs. Here class_weight re-weights the Hamming loss according to inverse class frequencies. The parametric pairwise interactions have three features: a constant feature, color similarity, and relative vertical position. The first two are declared to be symmetric with respect to the direction of an edge, the last is antisymmetric. We use fusion moves for inference.

Line 7 creates a learner object that will learn the parameters for the given model using the n-slack cutting plane method, and line 8 performs the actual learning.
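After fitting, predictions follow the same list-of-samples convention. The brief sketch below assumes PyStruct's predict method; the superpixel-level accuracy computed here is only illustrative and is not the Jaccard index reported next:

import numpy as np

# Each entry of Y_pred is an array with one predicted label per superpixel.
Y_pred = ssvm.predict(X)

# Illustrative superpixel-level accuracy (not the VOC Jaccard index).
accuracy = np.mean(np.hstack(Y_pred) == np.hstack(Y))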

Using this simple setup, we achieve an accuracy (Jaccard index) of 30.3 on the validation set following the protocol of Krähenbühl and Koltun [2012], who report 30.2 using a more complex approach. Training the structured model takes approximately 30 minutes using a single i7 core.

4.4.4 Experiments

While PyStruct focuses on usability and covers a wide range of applications, it is also important that the implemented learning algorithms run in acceptable time. In this section, we compare our implementation of the 1-slack cutting plane


[Figure 4.2: learning time in seconds (left) and accuracy (right) as a function of the regularization parameter C for SVMstruct and PyStruct on MNIST.]

Figure 4.2: Runtime comparison of PyStruct and SVMstruct for multi-class classification.

algorithm with the implementation in SVMstruct. We compare the performance of the Crammer-Singer multi-class SVM with respect to learning time and accuracy on the MNIST dataset of handwritten digits. While multi-class classification is not very interesting from a structured prediction point of view, this problem is well suited to benchmark the cutting plane solvers, as loss-augmented prediction is trivial.
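A sketch of such a benchmark setup is given below. The data loading via fetch_openml and the exact learner settings are assumptions made for illustration and do not reproduce the precise experimental configuration:

from sklearn.datasets import fetch_openml
from pystruct.models import MultiClassClf
from pystruct.learners import OneSlackSSVM

# Load MNIST (60,000 training / 10,000 test examples, 784 features, 10 classes).
X, y = fetch_openml('mnist_784', return_X_y=True, as_frame=False)
y = y.astype(int)
X_train, y_train, X_test, y_test = X[:60000], y[:60000], X[60000:], y[60000:]

# Crammer-Singer multi-class SVM trained with the 1-slack cutting plane algorithm.
model = MultiClassClf(n_features=X.shape[1], n_classes=10)
ssvm = OneSlackSSVM(model, C=0.01)
# ssvm.fit(X_train, y_train)
# print(ssvm.score(X_test, y_test))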

Results are shown in Figure 4.2. We report learning times and accuracy for varying regularization parameter C. The MNIST dataset has 60,000 training examples, 784 features, and 10 classes. The setup of the experiment is the same as in Chapter 5. The figure indicates that PyStruct has competitive performance, while using a high-level interface in a dynamic programming language.