

From the document Identifying dependencies among delays (pages 150-153)

About the Robustness

6.3 Final consideration

On average we observed an improvement of 10%-17% (with respect to all three considered criteria) when the Benjamini-Hochberg multistatistical correction is applied.

Avoiding multistatistical corrections delivers a slightly larger improvement (15%-20% for all the considered criteria).

Concerning the linear regression, it is not easy to state which of the suggested models (A, B or D) is better: their average values are very close to each other.

Alternative B seems to be slightly stronger with respect to the first two criteria. This predominance may be due to the large number of simulations executed (133 scenarios can be seen as four and a half months of daily registrations), as already observed in Chapter 5.

Alternative D also performs quite well, which we ascribe to the definition of very small slack times. The considered stations are very close to each other, hence the traveling times seldom exceed fifteen minutes, especially thanks to the intermediate delay measurements (events marked as special).

Even if the results do not show a clear predominance of our suggested model (no multistatistical correction and regression A), this set of “virtual” constraints provides a good improvement in the robustness of the solutions. Choosing a 10% quantile does not bring an appreciable improvement over a 5% quantile.

To summarize, we suggest transforming the dependencies corresponding to the Tri-graph output without multistatistical correction and with a 5% quantile into “virtual” constraints of the model, by applying linear regression A.

7 Conclusion

Ich verstehe nur Bahnhof.

German idiomatic expression (literally “I only understand train station”, meaning “I do not understand a thing”)

In this thesis we have been concerned with the analysis of the capacity constraints in the Delay Management problem for the railway system, from both a theoretical and a practical point of view. As the title suggests, our aim was to find a new formulation for the constraints that could be inserted in the corresponding uncapacitated Macromodel of the problem, increasing the robustness of the solution without requiring detailed knowledge of the system infrastructure.

The Delay Management problem has been described as a directed graph (Event-Activity-Graph) and it has been proved to be NP-complete. Our formulation of the problem has been compared to the ones of other research groups to highlight similarities, advantages and drawbacks. Altogether it appears easily readable and adaptable to represent all the aspects of the problem.

The core of the work was the investigation of the delay dataset through a stochastic analysis. The measured arrival and departure delays have been considered as observations of random variables, to which different stochastic algorithms (in particular three graphical models) have been applied in order to determine the dependencies among the disturbances. Three methods have been used for the first time on the railway problem: two of them are classical procedures (Full Conditional Independence Graph and Covariance Graph), while the third one (Tri-graph) was introduced only in 2004 by Wille and Bühlmann (ETH Zürich). This thesis contains one of the first applications of this method outside the field of genetics.

The graphical method procedures for the identification of the constraints have been implemented in R and validated through applications to datasets reporting train delays in the Harz area. These data have been made available to the Optimization Group (Prof. A. Schöbel) of the NAM Institute in the context of the DisKon project (Disposition und Konfliktlösungsmanagement für die beste Bahn). Different samples have been tested to evaluate the influence of the number of variables and observations on the methods. The results obtained either with a multistatistical correction (Bonferroni, Benjamini-Hochberg) or without correction have been compared for different values of the quantile (1%, 5% and 10%). The edges pointed out by the Tri-graph have been subdivided into four groups corresponding to waiting, driving, virtual activities and errors. Due to the lack of information about the timetable on which the data are based, it has not been possible to make any statement on the changing activities. Further comparisons have been carried out only on the identified “virtual” activities.
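The difference between the three testing regimes can be illustrated with a small sketch. It assumes that the quantile plays the role of the significance level of the edge tests, which is our reading of the text; the p-values and the function names are hypothetical and are not taken from the R implementation of the thesis.

```python
from typing import List


def bonferroni(p_values: List[float], alpha: float) -> List[bool]:
    """Reject a hypothesis when its p-value is at most alpha / m (m = number of tests)."""
    m = len(p_values)
    return [p <= alpha / m for p in p_values]


def benjamini_hochberg(p_values: List[float], alpha: float) -> List[bool]:
    """Step-up procedure: find the largest rank k with p_(k) <= k * alpha / m
    and reject the hypotheses with the k smallest p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank * alpha / m:
            largest_k = rank
    rejected = [False] * m
    for idx in order[:largest_k]:
        rejected[idx] = True
    return rejected


# Hypothetical p-values of six candidate edges (illustration only).
p_values = [0.001, 0.012, 0.02, 0.04, 0.2, 0.6]
alpha = 0.05  # assumed to correspond to the 5% quantile

uncorrected = [p <= alpha for p in p_values]
print("edges kept without correction:", sum(uncorrected))
print("edges kept with Benjamini-Hochberg:", sum(benjamini_hochberg(p_values, alpha)))
print("edges kept with Bonferroni:", sum(bonferroni(p_values, alpha)))
```

On these invented values the uncorrected test keeps the most edges, Bonferroni the fewest, and Benjamini-Hochberg lies in between, which is the usual ordering of the three regimes.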

The Full Conditional Independence Graph method could be applied only to small samples, due to the singularity of the covariance matrix. The Covariance Graph is penalized by the transitivity property of the covariance, which largely increases the cardinality of the output. The Tri-graph method, although not completely free from the transitivity property, is able to confine its effects. When applied to a “large” data sample the Tri-graph points out up to 90% of the original waiting activities and nearly 80% of the driving activities. Even if the size of the sample is reduced to one fifth of the original, the Tri-graph is still able to capture 40% of the “virtual” activities.
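The transitivity effect can be made concrete with a toy example. The sketch below is not the procedure implemented in the thesis: it assumes Gaussian data, a Fisher z-test, and a simplified low-order rule in the spirit of the Tri-graph (an edge is kept only if the marginal correlation and every partial correlation given one further variable are significant). All function names and the simulated chain x -> y -> z are hypothetical.

```python
import math
import random
from itertools import combinations
from typing import List, Set, Tuple


def corr(x: List[float], y: List[float]) -> float:
    """Sample Pearson correlation."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_corr(r_ij: float, r_ik: float, r_jk: float) -> float:
    """First-order partial correlation of i and j given k."""
    return (r_ij - r_ik * r_jk) / math.sqrt((1 - r_ik ** 2) * (1 - r_jk ** 2))


def is_significant(r: float, n: int, n_cond: int, alpha: float) -> bool:
    """Two-sided Fisher z-test of the hypothesis that the correlation is zero."""
    z = math.atanh(r) * math.sqrt(n - n_cond - 3)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return p_value <= alpha


def covariance_graph(data: List[List[float]], alpha: float) -> Set[Tuple[int, int]]:
    """Edge (i, j) whenever the marginal correlation is significant."""
    n = len(data[0])
    return {
        (i, j)
        for i, j in combinations(range(len(data)), 2)
        if is_significant(corr(data[i], data[j]), n, 0, alpha)
    }


def low_order_graph(data: List[List[float]], alpha: float) -> Set[Tuple[int, int]]:
    """Edge (i, j) only if the marginal correlation and all first-order
    partial correlations given a third variable are significant."""
    n, p = len(data[0]), len(data)
    r = [[1.0 if a == b else corr(data[a], data[b]) for b in range(p)] for a in range(p)]
    edges = set()
    for i, j in combinations(range(p), 2):
        if not is_significant(r[i][j], n, 0, alpha):
            continue
        others = [k for k in range(p) if k not in (i, j)]
        if all(is_significant(partial_corr(r[i][j], r[i][k], r[j][k]), n, 1, alpha)
               for k in others):
            edges.add((i, j))
    return edges


# Simulated delay chain: x influences y, y influences z, no direct x-z link.
rng = random.Random(0)
n_obs = 2000
x = [rng.gauss(0, 1) for _ in range(n_obs)]
y = [a + rng.gauss(0, 1) for a in x]
z = [b + rng.gauss(0, 1) for b in y]
data = [x, y, z]

print("covariance graph:", sorted(covariance_graph(data, 0.05)))
print("low-order graph: ", sorted(low_order_graph(data, 0.05)))
```

The covariance graph reports the indirect pair (0, 2) because the correlation between x and z is inherited through y. The partial correlation of x and z given y is close to zero, so the low-order rule typically drops that pair, although at a 5% level an occasional false edge remains possible.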

As is common in mathematics, we had to face some challenging problems along the way. The first concerns the assumption on the distribution of the delays. The analysis we carried out could not prove that the variables follow the normal distribution required by the graphical methods, but up to now it has not been possible to find any distribution that fits the delay data in every possible context. We trace this problem back to the lack of a standard procedure for measuring the delays. Further research in this direction is recommended, in order to analyze the behavior of the railway system properly.

The second concerns the transformation of the edges pointed out by the Tri-graph method into time constraints of the uncapacitated model. The solution we considered is a linear regression on the pair of variables corresponding to each edge. This choice is quite satisfactory, since it is coherent with the theoretical definition of the delay, and the slope of the fitted line turns out to be very close to one in the case of waiting activities.
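As an illustration of the regression step, the following sketch fits an ordinary least squares line to the delays of the two events joined by an edge. It is plain OLS and is not claimed to coincide with any of the regression variants A, B or D; the delay values and the function name are hypothetical, and the way the fitted line is turned into a constraint of the model is not reproduced here.

```python
from typing import List, Tuple


def fit_line(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Ordinary least squares fit of y = intercept + slope * x.
    Returns (slope, intercept)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    return slope, intercept


# Hypothetical delays in seconds at the two events of one edge
# (for example an arrival and the following departure).
first_event_delay = [0.0, 60.0, 120.0, 180.0, 240.0]
second_event_delay = [0.0, 30.0, 90.0, 150.0, 210.0]

slope, intercept = fit_line(first_event_delay, second_event_delay)
print(f"slope = {slope:.3f}, intercept = {intercept:.1f} s")
```

A slope near one means that a delay at the first event is passed on almost unchanged to the second one, while a negative intercept can be read as the part of the delay absorbed between the two events.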

In general, the influence of the slack times in these constraints deserves careful consideration, especially in the case of driving and “virtual” activities.

Additional investigations on the robustness of the new formulation should be conducted. A first step in this direction has already been taken in the previous chapter of this thesis, where an analytical comparison between the Macromodel and our Mesomodel (i.e. Macromodel plus “virtual” constraints) has been performed. The concept of robustness is however too broad to consider our analysis exhaustive. Nevertheless, our model improves the robustness of the uncapacitated solution by 18% according to the three considered criteria (number of violated headways, cost in seconds of the violations and price in seconds to correct the violations).
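The first two criteria can be sketched as follows. The definitions used here are assumptions made for illustration: a headway is violated when two events are closer than a minimum separation, and the cost of a violation is the missing separation in seconds. The third criterion is not modeled. Event names, times and headways are invented and bear no relation to the 18% figure reported above.

```python
from typing import Dict, List, Tuple

# (first event, second event, minimum separation in seconds); hypothetical format
Headway = Tuple[str, str, float]


def headway_violations(times: Dict[str, float], headways: List[Headway]) -> Tuple[int, float]:
    """Return (number of violated headways, total missing separation in seconds)."""
    count = 0
    shortfall = 0.0
    for first, second, min_gap in headways:
        gap = times[second] - times[first]
        if gap < min_gap:
            count += 1
            shortfall += min_gap - gap
    return count, shortfall


def relative_improvement(baseline: float, improved: float) -> float:
    """Fraction by which the improved value undercuts the baseline."""
    return (baseline - improved) / baseline


# Invented event times (seconds) of two solutions for the same four events.
headways = [("a", "b", 180.0), ("b", "c", 180.0), ("c", "d", 180.0)]
macro_times = {"a": 0.0, "b": 100.0, "c": 200.0, "d": 260.0}
meso_times = {"a": 0.0, "b": 180.0, "c": 300.0, "d": 480.0}

macro_count, macro_cost = headway_violations(macro_times, headways)
meso_count, meso_cost = headway_violations(meso_times, headways)
print("violated headways:", macro_count, "->", meso_count)
print("missing separation (s):", macro_cost, "->", meso_cost)
print("relative improvement in count:", relative_improvement(macro_count, meso_count))
```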

Although the literature on capacitated problems and delay propagation is abundant, we did not find any other research group that applied stochastic approaches to identify the “important” capacity constraints of the model (whatever “important” means). Elsewhere, the stochastic analysis was limited to identifying the distribution of the delays, so that this information could be used in the simulation of the delay propagation. This makes it difficult to judge the results we obtained so far, since they are the first of their kind.

But, as a Chinese proverb says, the journey of a thousand miles must begin with a single step.

Figure 7.1: ...just an “ironic” conclusion...

A Maximum Likelihood
