

instance proposed in [1], would be possible, too. And, as in the case of baufix® assemblies, the positions of mating features can guide the parsing process performed by the semantic network.

Again, the Ernest knowledge base contains attributes which represent context dependent, domain specific knowledge like, for example, the states of virtual mating features. And in analogy to the baufix® domain, there are attributes which govern the instantiation process, since they register examined objects or estimate how to continue parsing.

Given sets of 3D object models as input data, the semantic network searches them for table scenes, similar to the method introduced in Section 3.2.2. A table is an obligatory part of any table scene. Thus, after instantiating concepts for all objects in the input data, a table serves as the starting point of a first parsing attempt. It is instantiated as a TABLE-PART which in turn is annotated as a part of a modified concept of a TABLE-SCENE. Then, all other objects with unused virtual mating features within a certain range of an unused feature of the initial table are considered as possible further components of the TABLE-SCENE. The nearest one that complies with the syntactic model is instantiated correspondingly, e.g. as a CHAIR-PART, and the TABLE-SCENE becomes an instance as well.

The features responsible for this instantiation are henceforth considered used, and the TABLE-SCENE is conjoined with earlier found ones if necessary. Afterwards, parsing restarts at another table, if there is one, or continues with unexamined objects near already instantiated ones. This process iterates until all objects in the input data have been examined. Figure 3.28(b) exemplifies how detected structures are visualized. It depicts a result obtained from the scene in Fig. 3.28(a). In this example, the chair upon the table is not close enough to any of the table’s virtual features to be understood as part of a conventional table scene. Consequently, it was not instantiated as such and thus appears as a wireframe model in the output.
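The control flow just described can be sketched in a few lines of Python. This is an illustration only, not the actual Ernest implementation: the data structures (`SceneObject`, its feature lists), the distance threshold `RANGE` and the example scene are assumptions, and the checks against the syntactic model as well as the conjoining of overlapping scenes are omitted.

```python
from dataclasses import dataclass, field
from math import dist

RANGE = 0.8  # assumed maximum distance between mating features (arbitrary units)

@dataclass(eq=False)
class SceneObject:
    name: str            # e.g. "TABLE1" or "CHAIR2"
    kind: str            # "TABLE-PART", "CHAIR-PART", ...
    features: list       # 3D positions of the object's virtual mating features
    used: set = field(default_factory=set)   # indices of features already used

def unused_features(obj):
    return [(i, p) for i, p in enumerate(obj.features) if i not in obj.used]

def parse_table_scenes(objects):
    """Greedy sketch of the parsing strategy: start at a table and repeatedly
    attach the nearest remaining object whose unused mating feature lies within
    RANGE of an unused feature of an already instantiated scene member."""
    scenes = []
    for table in (o for o in objects if o.kind == "TABLE-PART"):
        scene = [table]
        while True:
            best = None   # (distance, scene member, candidate, feature indices)
            for member in scene:
                for cand in objects:
                    if cand in scene or cand.kind == "TABLE-PART":
                        continue
                    for i, p in unused_features(member):
                        for j, q in unused_features(cand):
                            d = dist(p, q)
                            if d <= RANGE and (best is None or d < best[0]):
                                best = (d, member, cand, i, j)
            if best is None:
                break                      # no further component in range
            _, member, cand, i, j = best
            member.used.add(i)             # the features responsible for this
            cand.used.add(j)               # instantiation are henceforth used
            scene.append(cand)
        scenes.append(scene)
    return scenes

# Hypothetical input: one table and two chairs, one of them out of range.
table = SceneObject("TABLE1", "TABLE-PART", [(0, 0, 0), (1, 0, 0)])
near = SceneObject("CHAIR1", "CHAIR-PART", [(0.5, 0.3, 0)])
far = SceneObject("CHAIR2", "CHAIR-PART", [(5, 5, 0)])
for scene in parse_table_scenes([table, near, far]):
    print([o.name for o in scene])        # ['TABLE1', 'CHAIR1']
```

The out-of-range chair stays unattached, just like the chair on top of the table in Fig. 3.28, which would then be rendered as a wireframe model.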

The syntactic structure which the semantic network produced for this table scene is shown in Fig. 3.28(c). Since there is just one table to initialize parsing, no conjoin operations were necessary and the resulting structure is linear (see page 48). Two of the three chairs included in the structure are of type CHAIR1. The one at the right front of the table is closest to a virtual mating feature of the table and thus is represented on the lowest level of the parsing tree. The CHAIR2 type chair at the left front is found on the next level and, even though the perspective may be misleading, the chair in the back is farthest from the table and thus is integrated on the highest hierarchical level.

corresponding parser, formal grammars allow complex patterns to be classified and simultaneously yield structural descriptions.

Grammatical formalisms have been introduced to computer-aided manufacturing as well. They were used for workpiece or machine tool design [19, 95] or for modeling in assembly sequence planning [121]. However, syntax-based contributions to reasoning about assemblies make use of rather complex formalisms like shape or graph grammars; simple grammatical approaches have not yet been reported. Moreover, to the best of our knowledge, they have not yet been applied to assembly recognition.

3.5. Summary

This chapter dealt with mechanical assemblies from a pattern recognition point of view. Faced with the requirement of visual assembly detection in a dynamic and unpredictable environment, we introduced a simple but flexible method of syntactic assembly modeling. We demonstrated that concepts from the early days of formal language theory are sufficient to model the component structure of individual mechanical assemblies as well as of whole classes of composite objects. The fundamental idea was to bring together concepts first introduced by Homem de Mello and Sanderson [53, 54, 55] and a discovery by Hall [49].

Dealing with assembly sequence planning, Homem de Mello and Sanderson proposed the use of AND/OR graphs to represent all feasible decompositions of mechanical artifacts. Hall, however, pointed out that AND/OR graphs and context free grammars as introduced by Chomsky [27] are equivalent formalisms since the nodes of an AND/OR graph can be mapped one-to-one onto the variables and terminals of a CFG while its hyperlinks can be identified with context free productions. This proves that it is possible to represent assembly structures using simple grammatical formalisms.
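Hall's observation is easy to make concrete in code. The sketch below assumes a minimal dictionary encoding of an AND/OR graph (the concrete assembly and its decompositions are invented for illustration) and maps inner nodes to variables, leaves to terminals and each hyperlink to one context free production.

```python
# Minimal illustration of the AND/OR graph <-> CFG correspondence:
# an AND/OR graph is given as {node: [decomposition, ...]}, where each
# decomposition lists the subassemblies of one feasible way to split the node.
and_or_graph = {
    "BOLT-BAR-CUBE": [["BOLT-BAR", "CUBE"], ["BOLT", "BAR-CUBE"]],
    "BOLT-BAR":      [["BOLT", "BAR"]],
    "BAR-CUBE":      [["BAR", "CUBE"]],
}

def to_cfg(graph):
    """Return (variables, terminals, productions) of the equivalent CFG."""
    variables = set(graph)                                   # inner nodes
    mentioned = {n for alts in graph.values() for alt in alts for n in alt}
    terminals = mentioned - variables                        # leaf nodes (parts)
    productions = [(lhs, tuple(alt))                         # one per hyperlink
                   for lhs, alts in graph.items() for alt in alts]
    return variables, terminals, productions

if __name__ == "__main__":
    variables, terminals, productions = to_cfg(and_or_graph)
    for lhs, rhs in productions:
        print(f"{lhs} -> {' '.join(rhs)}")
```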

Furthermore, this chapter showed that considering the mechanical function of subassemblies leads to particularly compact representations of assembly knowledge. Understanding composite objects to consist of functional units results in recursive models of entire classes of assemblies. These general models can derive structural descriptions of all feasible assemblies of a certain class and may actually be learned from examples.
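The following toy grammar hints at what such a recursive, function-oriented model looks like; the functional roles and elementary parts are simplified stand-ins for the models developed in this chapter, not the exact rules. A small generator enumerates some of the structural descriptions such a recursive model derives.

```python
import itertools

# A toy recursive grammar for a class of bolted assemblies: an ASSEMBLY has a
# bolt-part, optional misc-parts and a nut-part; a complete ASSEMBLY may in
# turn act as a misc-part or nut-part of a larger one (the recursive case).
GRAMMAR = {
    "ASSEMBLY":  [["BOLT-PART", "MISC-PART", "NUT-PART"], ["BOLT-PART", "NUT-PART"]],
    "BOLT-PART": [["bolt"]],
    "MISC-PART": [["bar"], ["ring"], ["ASSEMBLY"]],
    "NUT-PART":  [["cube"], ["rhomb-nut"], ["ASSEMBLY"]],
}

def derive(symbol, depth):
    """Yield bracketed structural descriptions derivable from `symbol`,
    limiting recursion to `depth` so that the output stays finite."""
    if symbol not in GRAMMAR:                      # terminal (elementary object)
        yield symbol
        return
    if depth == 0:
        return
    for alternative in GRAMMAR[symbol]:
        for parts in itertools.product(*(derive(s, depth - 1) for s in alternative)):
            yield f"({symbol} {' '.join(parts)})"

if __name__ == "__main__":
    for structure in itertools.islice(derive("ASSEMBLY", 3), 5):
        print(structure)
```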

However, context free grammatical models of composite objects are rather coarse and do not represent topological or geometric relations. They cannot model even simple mechanical restrictions and thus may derive structures which are mechanically infeasible.

Thus, more sophisticated syntax-based techniques like attributed grammars [30], graph grammars [87], coordinate grammars [84], or plex grammars [35] seem more appropriate since they could cope with context sensitive and geometric constraints. But we pointed out that CFGs are structurally equivalent to a certain type of semantic network. We demonstrated that CFGs can be implemented using the semantic network language Ernest and that Ernest knowledge bases can be employed in assembly detection from vision. By assigning attributes to Ernest concepts, domain dependent knowledge of a non context free nature can be encoded. Hence, suitable attributes will prevent the generation of infeasible assembly structures from image analysis.
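As a rough analogy to such attributes, the sketch below attaches simple test functions to the productions of a toy grammar; a derivation step is only admitted if the attribute test succeeds, so mechanically infeasible structures are never built. The constraint shown (a bolt can only carry as many parts as its shaft is long) is a made-up stand-in for the real, domain specific attribute computations of the Ernest knowledge base.

```python
from dataclasses import dataclass
from typing import Callable, List

@dataclass
class Part:
    name: str
    thickness: int = 1     # how much of a bolt's shaft the part occupies
    shaft_length: int = 0  # only meaningful for bolts

@dataclass
class Production:
    lhs: str
    rhs: List[str]
    feasible: Callable[[List[Part]], bool]   # attribute test attached to the rule

def bolt_capacity_ok(parts: List[Part]) -> bool:
    """A bolt can only carry parts whose total thickness fits its shaft."""
    bolt = parts[0]
    return sum(p.thickness for p in parts[1:]) <= bolt.shaft_length

# One toy rule, ASSEMBLY -> BOLT MISC MISC NUT, guarded by the attribute test.
rule = Production("ASSEMBLY", ["BOLT", "MISC", "MISC", "NUT"], bolt_capacity_ok)

def instantiate(production: Production, parts: List[Part]):
    """Return a structural description, or None if the attributes veto the rule."""
    if len(parts) != len(production.rhs) or not production.feasible(parts):
        return None   # an infeasible structure is never generated
    return (production.lhs, [(sym, p.name) for sym, p in zip(production.rhs, parts)])

if __name__ == "__main__":
    short_bolt = Part("bolt-20mm", shaft_length=2)
    bar, ring, cube = Part("bar", 2), Part("ring", 1), Part("cube", 1)
    print(instantiate(rule, [short_bolt, bar, ring, cube]))   # None: shaft too short
    long_bolt = Part("bolt-50mm", shaft_length=5)
    print(instantiate(rule, [long_bolt, bar, ring, cube]))    # feasible structure
```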

Discussions with researchers working on speech processing revealed a similarity between discourse and assembly structures. And ideas originally developed for discourse parsing led to a robust algorithm for assembly detection from higher-dimensional data.

The portability of concepts from discourse theory to assembly detection was corroborated by means of two exemplary implementations. One dealt with the originally considered cooperative construction scenario and the other transferred our methods to the detection of certain arrangements of pieces of furniture.

For the baufix® scenario we also presented a couple of extensions of our syntactic methodology. Observing that bolted assemblies can be understood as collections of partial orders led to a heuristic that recognizes relations among mating features. Syntactic image analysis can thus also provide connection details, i.e. topological information that allows high-level assembly sequence plans to be generated.
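The connection between such partial orders and sequence plans can be hinted at with a few lines of code. The relation data below are invented for illustration; assuming that image parsing has yielded, for each part on a bolt, the set of parts that must already sit on the bolt before it, a topological sort produces one admissible high-level assembly sequence.

```python
from graphlib import TopologicalSorter   # Python 3.9+

# Hypothetical relations among mating features recovered by syntactic image
# analysis: each part is mapped to the parts that must precede it on the bolt.
precedes = {
    "bolt": set(),
    "bar":  {"bolt"},                   # the bar is put onto the bolt first ...
    "ring": {"bolt", "bar"},            # ... then the ring ...
    "cube": {"bolt", "bar", "ring"},    # ... and the cube is screwed on last.
}

# graphlib expects predecessor sets, which is exactly what `precedes` encodes.
plan = list(TopologicalSorter(precedes).static_order())
print(" -> ".join(plan))                # bolt -> bar -> ring -> cube
```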

Moreover, in assembly detection from image data, perspective occlusions are a rather frequent problem. They may cause erroneous structures or prevent assembly detection altogether. We therefore proposed to incorporate process knowledge in order to generate reliable assembly structures even if some elementary components were not correctly recognized. Finally, the syntactic constraints inherent to our model can be used to support the recognition of elementary objects. To this end, syntactic assembly detection is integrated into a feedback loop along with a module for elementary object recognition. Provided with incomplete data from the object recognition module, the assembly detection unit hypothesizes labels for previously unrecognized image areas by considering the syntactic context in which they appear. Provided with competing data from elementary object recognition, assembly detection can rate the competitors by analyzing which of the alternatives yields the most likely assembly structures.
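These last two ideas can be caricatured in a few lines. In the toy setup below (the grammar rules, candidate labels and observed sequence are all invented), an unrecognized region is assigned the candidate label that lets the observation complete a valid right-hand side of an assembly rule, which mimics how syntactic context constrains object hypotheses.

```python
# Toy illustration of hypothesis generation and rating in the feedback loop:
# the grammar rules, candidate labels and observation are invented examples.
ASSEMBLY_RULES = [["bolt", "bar", "cube"], ["bolt", "cube"], ["bolt", "ring", "cube"]]
CANDIDATES = ["cube", "bar", "ring", "bolt"]

def rate_hypotheses(observed):
    """Score each candidate label for the single unrecognized position (None)
    by the number of assembly rules the completed sequence satisfies."""
    gap = observed.index(None)
    scores = {}
    for label in CANDIDATES:
        completed = observed[:gap] + [label] + observed[gap + 1:]
        scores[label] = sum(completed == rule for rule in ASSEMBLY_RULES)
    return scores

print(rate_hypotheses(["bolt", "bar", None]))
# {'cube': 1, 'bar': 0, 'ring': 0, 'bolt': 0} -> 'cube' is the best hypothesis
```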

In the previous chapter, we saw that context free grammars provide a compact and flexible way to represent assembly knowledge. They can be implemented as semantic networks, and instantiation strategies inspired by discourse parsing accomplish the detection of assembly structures in image data. However, the scenario of the SFB 360 necessitates more than just detection. The machine that assembles baufix® parts must be able to resolve references made by its instructor. For example, if the instructor wants a tailplane-fin to be manipulated, it is not enough to detect that there are several assemblies in the surroundings; the machine rather has to recognize which of them was meant. It must determine which of the detected structures corresponds to something known as a tailplane-fin.

Certain hierarchical descriptions or assembly plans found by image parsing could of course be labeled and stored in a database. Other plans yielded by scene analysis could then be matched against this set of prototypes in order to realize recognition on a syntactic level. Yet, in general, this will be error-prone and time-consuming. As we discussed in Section 3.1.2, mechanical assemblies might, on the one hand, have numerous syntactic structures. On the other hand, our parsing algorithm cannot be guaranteed to always yield the same structure for a complex assembly. Depending on its location and orientation in the image, object cluster parsing will start at different subparts of a cluster, so that different syntactic structures are likely to result. We would thus need many representative descriptions for syntactic assembly recognition. However, the more complex an assembly is, the more syntactic structures it will generally have. Experiments by Homem de Mello and Sanderson [55] and Wolter [136] even revealed that the number of different structures grows exponentially with the number of interconnected parts.
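A back-of-the-envelope illustration of this growth (not the exact figures reported in [55] or [136]): if an assembly is a purely linear chain of n interconnected parts and every contiguous split is admissible, each syntactic structure corresponds to a binary tree over the part sequence, so their number is the (n-1)-th Catalan number and grows roughly like 4^n.

```python
from math import comb

def catalan(k):
    """k-th Catalan number: the number of binary trees with k internal nodes."""
    return comb(2 * k, k) // (k + 1)

# Number of distinct binary decomposition trees of a linear chain of n parts,
# assuming every contiguous split is admissible (an idealized illustration).
for n in range(2, 11):
    print(f"{n:2d} parts: {catalan(n - 1):5d} possible structures")
```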

In this chapter, we shall first verify these observations for the assemblies of the baufix® domain. Then we will present an approach to assembly recognition that extends the syntactic methodology and allows assemblies to be recognized without unreasonable effort. We will introduce a graph representation of relations among mating features of elementary assembly parts. Such descriptions can be understood as the semantics of hierarchical assembly sequence plans. They do not have to be provided manually but can be generated automatically. We shall explain this in more detail and will then discuss how this representation scheme applies to different recognition tasks.
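As a rough foretaste of that representation (the part names, feature names and relation labels below are invented and do not anticipate the exact scheme of the next chapter), relations among mating features can be stored as a labeled graph:

```python
from collections import defaultdict

class MatingFeatureGraph:
    """Nodes are (part, feature) pairs; labeled edges record how two mating
    features are related, e.g. that a bolt's thread is screwed into a hole."""

    def __init__(self):
        self.edges = defaultdict(list)

    def connect(self, feature_a, feature_b, relation):
        self.edges[feature_a].append((feature_b, relation))
        self.edges[feature_b].append((feature_a, relation))

    def neighbours(self, feature):
        return self.edges[feature]

# Hypothetical bolt-bar-cube assembly expressed feature by feature.
g = MatingFeatureGraph()
g.connect(("bolt1", "shaft"), ("bar1", "hole2"), "passes-through")
g.connect(("bolt1", "thread"), ("cube1", "hole1"), "screwed-into")

print(g.neighbours(("bolt1", "shaft")))
```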