
Algorithmic exploration of the entorhinal-hippocampal loop

7.2 A scale-space model for spatial navigation

7.2.2 Simplified model and results

The theoretical results obtained above were simplified further to demonstrate proof-of-principle. The spatial samplers were pre-defined to be arranged hexagonally such that they form Voronoi cells; the cluster region of each sampler is therefore also hexagonal. Furthermore, the spatial sampler closest to the current location is assumed to represent the input space best, following a winner-take-all mechanism. Hence, the space in which the agent navigates is already discretized according to the hexagonal arrangement.
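As a minimal sketch of this discretization, assuming a plain Euclidean metric and hypothetical helper names (hex_centers, winner_take_all), the winner-take-all step could look as follows; this illustrates the principle only and is not the implementation used here.

```python
import numpy as np

def hex_centers(rows, cols, spacing):
    """Hypothetical helper: centers of a hexagonal sampler lattice.

    Every second row is shifted by half a spacing so that each center's
    Voronoi cell is a hexagon."""
    centers = []
    for r in range(rows):
        for c in range(cols):
            x = c * spacing + (spacing / 2.0 if r % 2 else 0.0)
            y = r * spacing * np.sqrt(3.0) / 2.0
            centers.append((x, y))
    return np.asarray(centers)

def winner_take_all(location, centers):
    """Index of the sampling center closest to the current location."""
    d = np.linalg.norm(centers - np.asarray(location), axis=1)
    return int(np.argmin(d))
```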

Due to the absence of additional afferents, the spatial sampling process was used both to detect spatial as well as temporal transitions and to recruit place cells. In contrast to the temporal transition system M of Section 7.1, the recruitment of place cells depended on the activation of a spatial sampler instead of directly measuring distances within the abstract place cells. In other words, as soon as a spatial sampler was activated, the best matching unit in Σ was determined.

If its activity was below a certain threshold, a novel neuron was recruited. Although, for simplicity of the model, the activation of the samplers themselves was computed as a Euclidean distance and the threshold was set to a distance of 0.05 m, the place cells thereby remained independent of the underlying metric. Hence, the model could be altered by replacing the Euclidean distance with a neural activation, for instance given by boundary vector input.
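One possible reading of the recruitment rule is sketched below, using the 0.05 m distance threshold from the text; the PlaceCellLayer container and its method names are hypothetical, and the distance-based activity could be swapped for a neural activation such as boundary vector input.

```python
import numpy as np

RECRUIT_THRESHOLD = 0.05  # metres, the distance threshold from the text

class PlaceCellLayer:
    """Hypothetical container for recruited place cells (centers only)."""

    def __init__(self):
        self.centers = []  # one stored center per recruited place cell

    def best_matching_unit(self, location):
        """Return (index, distance) of the closest place cell, or (None, inf)."""
        if not self.centers:
            return None, np.inf
        d = np.linalg.norm(np.asarray(self.centers) - location, axis=1)
        i = int(np.argmin(d))
        return i, float(d[i])

    def observe(self, location):
        """Recruit a novel place cell if no existing one is close enough."""
        location = np.asarray(location, dtype=float)
        i, dist = self.best_matching_unit(location)
        if dist > RECRUIT_THRESHOLD:
            self.centers.append(location)
            i = len(self.centers) - 1
        return i  # index of the active (possibly newly recruited) place cell
```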

The spatial sampling centers were used to form grid cells at multiple scales. The scale increment was set to √2 according to the theoretical results described above. Note that the discretization and pre-computation of grid fields introduces several numerical issues. To examine their impact, the grid fields were arranged such that parts of the S-shaped trajectory, already used in the temporal transition system M (see Figure 7.3), fell on the apex of intermediate sampling clusters. This meant that the agent's perceived location, represented by the discrete sampling process, was prone to jump vertically up and down according to the sampling centers while the agent actually moved only horizontally. Note that these effects are the result of the simplifications introduced here, primarily the discretization. An elaborate model involving non-discretized sampling fields is unlikely to produce similar artifacts and will be the subject of future studies. The discretized variant is believed to suffice to demonstrate proof-of-principle, though.
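The √2 increment and the snapping artifact can be illustrated with a short, purely illustrative sketch. It reuses the hypothetical hex_centers helper from the sketch above; the base spacing and the choice of the walk's y-coordinate are arbitrary assumptions made only to expose the effect.

```python
import numpy as np

BASE_SPACING = 0.2   # assumed smallest sampler spacing in metres (illustrative)
N_SCALES = 4

# Grid spacings grow by a factor of sqrt(2) per scale.
spacings = [BASE_SPACING * np.sqrt(2) ** k for k in range(N_SCALES)]

def snap(location, centers):
    """Discretize a location to its winner-take-all sampling center."""
    d = np.linalg.norm(centers - np.asarray(location), axis=1)
    return centers[int(np.argmin(d))]

# hex_centers is the hypothetical helper from the earlier sketch.
centers = hex_centers(rows=8, cols=12, spacing=spacings[1])

# Horizontal walk with y chosen close to the zigzag boundary between two rows
# of hexagonal cells, so the winning center alternates between the rows.
walk = [(x, 0.63) for x in np.linspace(0.0, 2.0, 80)]
snapped_y = [snap(p, centers)[1] for p in walk]
# snapped_y jumps up and down although the agent only moves horizontally.
```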

Figure 7.7 illustrates the learning algorithm of the scale-space model during exploration. Note that the acquisition of place cells and temporal transitions is similar to the algorithm used in M, but is presented here in a way which allows multiple simultaneously active place cells p_t ∈ P_t. It is further extended such that place cell activity is buffered in a temporal buffer structure. Buffering is required during the acquisition of multiple scales of spatial transitions because transitions are only learned if their corresponding spatial afferents are within a suitable integration window.
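One minimal way to realize such buffering is sketched below, under the assumption that the integration window grows by one buffered step per scale as argued further down; names and structure are illustrative only.

```python
from collections import deque

class TemporalBuffer:
    """Buffer of recently active place cell sets, one entry per time step.

    During acquisition on scale k, candidate pre-synaptic place cells are
    taken from the last (k + 1) buffered steps; the window grows by one
    entry per scale (see the learning process below)."""

    def __init__(self, max_scales):
        self.history = deque(maxlen=max_scales + 1)

    def push(self, active_place_cells):
        self.history.append(frozenset(active_place_cells))

    def window(self, scale):
        """Union of place cells inside the integration window of `scale`."""
        recent = list(self.history)[-(scale + 1):]
        return set().union(*recent) if recent else set()
```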

Consider a single grid cell. It is selected as the best matching unit during the winner-take-all selection process according to the sampling center c_t with the highest activity, in combination with pre-synaptically active place cells (if any). Inspired by descriptions of synaptic circuits [323], the co-activation used here is formed by a simple logical and operation. Recall, though, that place cells are recruited by the spatial sampling process on the smallest scale. Thus, a grid cell on larger scales expresses on-center regions which may cover multiple place fields.
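A sketch of this selection step under a simplified reading: the best matching unit is chosen by the activity of its sampling center c_t, and co-activation with pre-synaptically active place cells is reduced to a plain set intersection (logical and). All names are hypothetical.

```python
def select_grid_cell(center_activity, presynaptic_place_cells, active_place_cells):
    """Winner-take-all selection of a grid cell on one scale.

    center_activity: dict grid_cell_id -> activity of its sampling center c_t
    presynaptic_place_cells: dict grid_cell_id -> place cells wired to it
    active_place_cells: currently active place cells p_t (possibly empty)
    """
    # Best matching unit: the grid cell whose sampling center is most active.
    g_t = max(center_activity, key=center_activity.get)
    # Co-activation as a plain logical and: intersection of the pre-synaptic
    # place cells of g_t with the currently active place cells.
    coactive = presynaptic_place_cells.get(g_t, set()) & set(active_place_cells)
    return g_t, coactive
```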


Figure 7.7: Algorithm for learning transitions in a spatio-temporal transition system P.

Learning is therefore modelled according to the following process. Any place cell p_t ∈ P_t which is co-active with g_t is correlated with g_t. In contrast, any place cell p_{t−n} which was in a suitable previous temporal window and co-active with a previous grid cell g_prev will be decorrelated from g_t. Note that in the presented simplified model, the correlation and decorrelation mechanism reduces to a binary operation of tagging a place cell as co-active with either one grid cell or another. Spatial transitions are stored as a logical and connectivity between any previously active place cell p_{t−n} and the previously active grid cell g_prev alongside any currently active place cell p_t. The suitable temporal window to select p_{t−n} for the acquisition of spatial transitions depends on the scale. On the smallest scale, only immediate temporal neighbors are considered. Due to the spatial scale increment of √2, the number of place cells in the suitable time window increases by one for every next scale. The learning procedure is drastically simplified in order to omit explicit modelling of the potentially non-linear temporal dynamics of grid cells and their temporal integration windows. Nevertheless, it is believed to capture the algorithmic effects of temporal integration sufficiently well. A model involving STDP and spiking neurons to examine biologically accurate transition learning is currently in development but is left for future work.
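Put as a sketch, with hypothetical names and the binary either/or tagging described above, one learning step per scale could be written as follows; TemporalBuffer refers to the sketch given earlier.

```python
def learn_step(scale, g_t, g_prev, p_t, buffer, tags, spatial_transitions):
    """One acquisition step of the spatio-temporal system P on a single scale.

    tags: dict place_cell -> grid cell it is tagged to (binary either/or tag)
    spatial_transitions: set of (p_prev, g_prev, p_now) triples, i.e. the
        logical and connectivity described in the text
    buffer: TemporalBuffer instance from the earlier sketch
    """
    # Correlation: every currently co-active place cell is tagged to g_t.
    for p in p_t:
        tags[p] = g_t
    # Decorrelation is implicit in the binary tagging: place cells from the
    # previous integration window keep their tag to g_prev and are therefore
    # not associated with g_t.
    window = buffer.window(scale) - set(p_t)
    # Spatial transitions: and-connectivity between the previously active
    # place/grid pair and every currently active place cell.
    for p_prev in window:
        if tags.get(p_prev) == g_prev:
            for p_now in p_t:
                spatial_transitions.add((p_prev, g_prev, p_now))
```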

Retrieval of trajectories is performed as visualized in Figure 7.8. Note the difference to learning: spatial transitions are only learned when the location lies within a spatial cluster and the visit falls into the temporal integration window of the corresponding grid cell. Thereby, spatially adjacent locations will not automatically learn a spatial transition if there is no temporal correspondence. This is another severe simplification of a learning rule which depends on spike timings, but it suffices for demonstration purposes.

Figure 7.8: Algorithm for retrieval of transitions in a spatio-temporal transition system P.

Furthermore, learning associates place cell activity with a sensory representation according to co-activation learning, which is expressible as a logical and operation. During retrieval, a spatial transition from one place cell to another requires the co-activation of the previous place as well as the activation of the grid cell which is associated with the previous place. However, activation of grid cells happens for any previously active place cell, which corresponds to a logical or operation. In other words, place cell activity can drive grid cells without sensory inputs during retrieval. If not stated otherwise, learning of transitions continues during retrieval. The reason is that retrieval may generate novel sequences of spatially adjacent locations which were previously restricted by temporal progression.
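The and/or asymmetry of retrieval can be expressed in a few lines; the data structures mirror the hypothetical learning sketch above and are not the actual implementation.

```python
def retrieve_step(active_place_cells, tags, spatial_transitions):
    """Expand one retrieval step without access to true sensory states.

    Returns the place cells reachable via one stored spatial transition."""
    # Logical or: any active place cell drives the grid cell it is tagged to.
    active_grid_cells = {tags[p] for p in active_place_cells if tags.get(p)}
    # Logical and: a transition fires only if its previous place cell is
    # active and the grid cell associated with that place cell is active too.
    next_places = set()
    for p_prev, g_prev, p_next in spatial_transitions:
        if p_prev in active_place_cells and g_prev in active_grid_cells:
            next_places.add(p_next)
    return next_places
```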

The following protocol was used to observe the impact of the scale-space representation and of continued learning during replay. The S-shaped trajectory was presented to P once to initially learn places and transitions. Afterwards, the system was queried to reconstruct the entire trajectory while learning was turned off.

Then, the trajectory was replayed. During replay, future places were allowed to be pre-fetched and stored in the temporal buffer. In contrast to exploration, replay was based only on spatial and temporal transitions, i.e. on the successive activation of spatial symbols without access to true sensory states. Subsequently, the system was queried again to record whether replay led to the detection of novel transitions, and thereby potential short-cuts. In addition to these two settings, the impact of the temporal buffering was examined: the learning and retrieval strategies were repeated, but the temporal integration window was ignored during learning.
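The four configurations amount to a 2×2 design over continued learning during replay and use of the temporal integration window; a sketch with hypothetical flag names:

```python
from itertools import product

# Four configurations: continued learning during replay switched on or off,
# crossed with using or ignoring the temporal integration window during
# learning (flag names are hypothetical).
configurations = [
    {"learn_during_replay": learn, "use_temporal_window": window}
    for learn, window in product([True, False], repeat=2)
]
# Each configuration would be handed to a driver that presents the S-shaped
# trajectory once, queries the system, replays, and queries again.
```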

The results of the protocol, which comprises a total of four configurations, are depicted in Figure 7.9. The network learned 38 places on the S-shaped trajectory from start to goal, which resulted in 37 iterations of the entire system until the