https://doi.org/10.1007/s12145-021-00634-1 RESEARCH ARTICLE
Unsupervised automatic classification of all‑sky auroral images using deep clustering technology
Qiuju Yang1 · Chang Liu1 · Jimin Liang2
Received: 20 November 2020 / Accepted: 19 May 2021
© The Author(s), under exclusive licence to Springer-Verlag GmbH Germany, part of Springer Nature 2021
Abstract
Reasonable classification of auroras is of great significance to the study of the generation mechanism of auroras and the dynamic processes of the magnetosphere boundary layer. Previous aurora classification studies, both manual and automatic, rely on experts' visual inspection and manual labeling of part or all of the data, yet there is currently no consensus on aurora classification schemes. In this paper, an auroral image clustering network (AICNet) is proposed for the unsupervised classification of all-sky images by grouping observations according to their morphological similarities. AICNet is fully automatic and requires no human supervision to specify a classification scheme or manually label samples. In the experiments, 4000 dayside all-sky auroral images captured at the Chinese Yellow River Station during 2003–2008 were considered. The images were clustered into two classes, and the auroral morphology in the two clusters exhibits high intra-cluster similarity and low inter-cluster similarity. The temporal occurrence distributions illustrate that one cluster exhibits a double-peak distribution and mostly occurs in the afternoon, while the other cluster mostly occurs before and at noon. Experimental results demonstrate that AICNet can discover the internal structures of auroras and would greatly improve the efficiency of auroral morphology classification.
Keywords Auroral image · Aurora classification · Deep learning · Unsupervised clustering
Introduction
An aurora is a natural light display caused by collisions between solar wind charged particles and the neutral particles in the polar atmosphere. It is one of the most impressive manifestations of solar wind-magnetosphere coupling.
Since an aurora is formed through processes occurring in near-Earth space, the morphology of auroral forms as observed from the ground is integral to our understanding of magnetospheric dynamics (Clausen and Nickisch 2018).
The ground-based optical all-sky imager (ASI) is widely used to capture the two-dimensional morphology of auroral
displays and continuously observe auroras at high spatial and temporal resolutions (as shown in Fig. 1).
Auroral displays present a variety of forms when observed from the ground (as shown in Fig. 1a-c). These forms have been given distinct terminologies, from small diffuse patches to arcs, rays, bands, curtains, coronas, etc., and these descriptions have been used to classify auroral displays. Since certain physical processes occurring in the magnetosphere and ionosphere are responsible for the auroral forms, aurora classification is important for studying the relationships between auroras and magnetospheric dynamics. However, due to the transient dynamic processes of the magnetosphere boundary layer and the high convergence of the magnetic field lines in the polar regions, the motion and intensity of the auroral morphology change significantly, and there is currently no consensus on the classification of auroral forms.
In 1955, Störmer (1955) classified auroras into forms with and without ray structures, and flaming auroras, which pioneered aurora classification and was widely accepted (e.g., in Baranoski et al. (2003)). Auroral forms without a ray structure were further subdivided into arcs, bands, and diffuse patches, while auroral forms with ray structures
Communicated by: H. Babaie
* Qiuju Yang
yangqiuju@snnu.edu.cn
1 School of Physics and Information Technology, Shaanxi Normal University, Xi’an 710119, China
2 School of Electronic Engineering, Xidian University, Xi’an 710071, China
Published online: 31 May 2021
included rayed arcs, rayed bands, rays, coronas, and drapery. Akasofu (1964) divided the nightside auroras into onset, poleward expanding bulge, westward traveling surges, and equatorward recovery, which not only helped to explain the acceleration mechanism of night auroras but also provided a basis for the study of auroral substorms. Based on
the all-sky images and simultaneous Defense Meteorological Satellite Program (DMSP) particle data, Ayukawa et al. (1996) studied two characteristic polar cap auroras (polar arc and polar coronas) in the pre-noon sector. Based on the data obtained by the auroral large imaging system (ALIS), Steen et al. (1997) classified auroras into auroral arcs, fragmental
Fig. 1 Auroras and ASIs. (a)-(c) Auroras observed from the ground. (d) The ASIs at Yellow River Station. (e)-(g) ASI images of 557.7, 630.0 and 427.8 nm at 0856 UT on December 26, 2003
auroral structures, diffuse auroras, and unidentified auroras.
Based on the ground observations from Svalbard, Sandholt et al. (1998) classified the dayside auroral forms into six types: poleward moving auroral forms; multiple, discrete auroral arcs/bands or single arcs; rayed bands; and diffuse auroras. Simmons (1998) classified auroras into 14 types based on their geophysical characteristics, including nightside and dayside events, discrete and diffuse events, and lower-latitude auroras. Based on the statistical characteristics of the auroras observed at Antarctic Zhongshan Station, Hu et al. (1999) divided auroras into four types: coronas, bands, active surges, and sun-aligned arcs. Subsequently, based on the synoptic observations of auroras along the post-noon oval at Zhongshan Station, Yang et al. (2000) divided the dayside auroras into arc and corona auroras. Based on the observations acquired from three-wavelength (427.8, 557.7, and 630.0 nm) ASIs at Yellow River Station (as shown in Fig. 1e-g), Hu et al. (2009) partitioned the dayside oval into four auroral active regions and further subdivided the corona auroras into radial coronas, drapery coronas, and hot-spot coronas according to the typical auroral structures in each region. Han et al. (2015) classified the auroras observed on the ground into two broad categories, discrete and diffuse auroras, where discrete auroras are those with structured forms like distinct arcs, bands, curls, and rays, while diffuse auroras represent regions of relatively homogeneous luminosity.
The above aurora classification studies all relied on manual description and recognition by visual inspection combined with empirical judgment. Over the past few decades, faced with the growing volume of auroral images, computer vision technology has become popular for automatic auroral image classification. Syrjäsuo and Donovan (2004) introduced computer vision into auroral image classification and classified auroral displays into arcs, patchy auroras, omega bands, and north–south structures according to their shape information. Biradar and Pratiksha (2012) classified ASI images into three distinct categories: arcs, patchy auroras, and omega bands. In recent years, a number of automatic aurora classification studies have been carried out based on the ASI images captured at Yellow River Station (Yang et al. 2012; Niu et al. 2018; Zhang et al. 2019; Zhong et al. 2018; Yang and Zhou 2020), in which the dayside discrete auroras were classified into arcs, drapery coronas, radial coronas, and hot-spots. However, existing automatic classification methods design classifiers according to a pre-given classification scheme and a manually labeled dataset; that is, they perform supervised classification. The key to successful supervised classification is that the manual labels must be correct, because these labels guide the algorithm to learn important features and are used to evaluate the subsequent classification results.
However, the lack of consensus on the aurora classification scheme makes accurate labeling difficult. Furthermore, labeling massive amounts of data for supervised classification is time-consuming and prone to human errors from factors such as loss of attention and fatigue. It would therefore be highly desirable to develop an aurora classification method driven by the auroral observations themselves, independent of empirical conclusions or manual labeling.
With this in mind, this study proposes an auroral image clustering network (AICNet) to classify auroral morphology using unsupervised deep clustering, based on the all-sky auroral images observed at the Arctic Yellow River Station. Using the high-level semantic features extracted by a deep convolutional auto-encoder (DCAE), AICNet separates auroral images without prior classification schemes or manual labels. While supervised classification algorithms learn to assign given labels to auroral images, the unsupervised clustering method examines the inherent similarities between auroral images and groups them accordingly, assigning its own new label to each group. Clustering auroral images is a new attempt in aurora research, which aims to discover the underlying structures in auroral observations.
Methodology
Unsupervised classification is an important task in the field of machine learning. Whereas supervised classification is trained against a ground truth (usually manual labels), unsupervised classification has no labeled data and must infer the inherent structure of the data without explicit labels. Since this inherent structure is represented by features, an unsupervised algorithm attempts to discover the latent features that describe it.
Figure 2 shows the proposed AICNet for auroral image clustering. AICNet consists of a feature extraction network and a clustering network. The feature extraction network, composed of the encoder and hidden layer of DCAE_VGG, obtains the morphological features of the auroral images; DCAE_VGG is a deep convolutional auto-encoder (Rumelhart et al. 1986) whose encoder utilizes the feature extraction layers of VGGNet16 (Visual Geometry Group Network 16, Simonyan and Zisserman 2015). The clustering network clusters the extracted features with SpectralNet (Shaham et al. 2018) through affinity learning, spectral map learning, and k-means clustering.
Feature extraction
As shown in Fig. 2, DCAE_VGG is an auto-encoder. An auto-encoder consists of an encoder, a hidden layer, and a decoder: it encodes a representation of the input into the hidden layer and then decodes it into the output. An auto-encoder aims to reconstruct its input, learning low-dimensional features from the input without manual labels. It is trained by minimizing the difference between the input and the output so that the model (hidden layer) better characterizes the input.
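The reconstruction objective can be illustrated with a toy linear auto-encoder. This is only a minimal sketch of the principle; the paper's DCAE_VGG is a deep convolutional model, and the data, dimensions, and learning rate below are arbitrary illustrative choices:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 200 samples in 20-D that actually lie on a 3-D subspace.
latent = rng.normal(size=(200, 3))
basis = rng.normal(size=(3, 20)) / np.sqrt(3)
X = latent @ basis

# One-layer linear auto-encoder: encode 20-D -> 3-D, decode 3-D -> 20-D.
W_enc = rng.normal(scale=0.1, size=(20, 3))
W_dec = rng.normal(scale=0.1, size=(3, 20))

lr = 0.05
for _ in range(1000):
    H = X @ W_enc                      # hidden (low-dimensional) features
    X_hat = H @ W_dec                  # reconstruction of the input
    err = X_hat - X                    # gradient of the squared error w.r.t. X_hat
    grad_dec = H.T @ err / len(X)
    grad_enc = X.T @ (err @ W_dec.T) / len(X)
    W_dec -= lr * grad_dec             # gradient descent on the
    W_enc -= lr * grad_enc             # reconstruction error

mse = np.mean((X @ W_enc @ W_dec - X) ** 2)
```

Because the toy data truly live on a 3-D subspace, the 3-D hidden representation suffices to reconstruct the input almost exactly; in DCAE_VGG the same objective drives the 10-D hidden layer to characterize the auroral images.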
Directly clustering the full-size images would introduce redundant information into the network and degrade model performance, so DCAE_VGG is used to reduce the dimension of the input data before clustering. When a deep encoder is used for clustering, from the early deep embedded clustering (DEC) (Xie et al. 2016) to SpectralNet (Shaham et al. 2018), the encoder network is typically composed of several fully connected (FC) layers. To improve the representation ability of the auto-encoder, the feature extraction layers of VGGNet16 are used as the encoder of DCAE_VGG in this study, while, for simplicity, the decoder uses only one de-convolutional layer for each kernel size.
As shown in Fig. 3a, VGGNet16 includes 13 convolutional layers, 5 max-pooling layers, and 3 FC layers. The lower convolutional neural network (CNN) layers provide detailed local features (e.g., color, edge, and texture) of an image; as the depth of the convolutional layers increases, more abstract, high-level semantic features are captured.
Therefore, the proposed DCAE_VGG takes advantage of both the dimension reduction ability of the auto-encoder and the image feature extraction ability of VGGNet16. When designing the encoder of DCAE_VGG, as shown in Fig. 3b, the first five convolutional layers and all max-pooling layers of VGGNet16 are retained, and the three FC layers and the final Softmax layer of VGGNet16 are replaced by a 512-dimensional FC layer. The hidden layer of DCAE_VGG is a 10-dimensional FC layer, and the decoder is composed of one 512-dimensional FC layer and four de-convolutional layers. Instead of the original images, the low-dimensional feature vector produced by the hidden layer is fed into the subsequent clustering network.
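Because each of VGGNet16's five max-pooling layers halves the spatial resolution while the 3 × 3 "same" convolutions preserve it, the encoder's feature-map size can be traced in a few lines. The 224 × 224 input below is VGG16's standard size, assumed here only for illustration; the paper does not state the resolution to which the ASI images were resized:

```python
def vgg16_stage_sizes(h, w):
    """Spatial size after each VGG16 stage: 3x3 'same' convolutions keep
    height and width, and each of the five 2x2 max-pools halves them."""
    sizes = []
    for _ in range(5):           # five stages, each ending in a max-pool
        h, w = h // 2, w // 2
        sizes.append((h, w))
    return sizes

# For a 224x224 input the map shrinks 224 -> 112 -> 56 -> 28 -> 14 -> 7;
# the final map is then flattened into the 512-D FC layer and 10-D hidden layer.
sizes = vgg16_stage_sizes(224, 224)
```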
Spectral clustering
A clustering technique in unsupervised data analysis divides data into several groups, called clusters. Let X = {x1, …, xn} ⊆ ℝᵈ denote a set of unlabeled objects.
Given the number of target clusters k and a distance measure between objects, clustering learns a similarity measure between the objects in X and uses it to learn a map that assigns each object to one of k possible clusters, so that similar objects tend to be grouped in the same cluster. Spectral clustering is a popular technique with the advantages of simple implementation and strong adaptability to the data distribution, although it is limited in scalability and generalization. The key step of spectral clustering is the spectral decomposition of the similarity matrix, which yields a representation in which the intra-cluster similarity and the inter-cluster difference of the objects are both large. The objects are first represented by the eigenvectors obtained from the spectral decomposition, and these eigenvectors are then clustered by a traditional method such as k-means clustering.
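The classical pipeline described above can be sketched on synthetic data. This is a minimal illustration, not the paper's implementation; the two Gaussian blobs, the kernel scale σ = 1, and the dense affinity matrix are all arbitrary choices:

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Two well-separated blobs as stand-ins for two morphology groups.
X = np.vstack([rng.normal(0, 0.3, size=(50, 2)),
               rng.normal(4, 0.3, size=(50, 2))])

# 1. Gaussian affinity matrix from pairwise squared distances.
d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)
W = np.exp(-d2 / (2 * 1.0 ** 2))

# 2. Symmetrically normalized graph Laplacian L = I - D^{-1/2} W D^{-1/2}.
deg = W.sum(1)
L = np.eye(len(X)) - W / np.sqrt(np.outer(deg, deg))

# 3. Represent each object by the eigenvectors of the k smallest eigenvalues.
vals, vecs = np.linalg.eigh(L)
embedding = vecs[:, :2]

# 4. Cluster the spectral embedding with k-means.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(embedding)
```

On these well-separated blobs the two smallest eigenvalues are near zero and their eigenvectors act as cluster indicators, so k-means on the embedding recovers the two groups exactly.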
Spectral map learning with SpectralNet
SpectralNet, a neural network approach for spectral clustering, was proposed to address the scalability and out-of-sample-extension problems of spectral clustering (Shaham et al. 2018). Once trained, SpectralNet computes
Fig. 2 Framework of the proposed AICNet for auroral image clustering
a map Fθ: ℝᵈ → ℝᵏ and a cluster assignment function c: ℝᵏ → {1, …, k}. It maps each input x to an output y = Fθ(x) and provides its cluster assignment c(y). The spectral map Fθ is implemented as a neural network, and the parameter vector θ denotes the network weights. The training of SpectralNet consists of three steps: firstly, a Siamese network
Fig. 3 Network structures of VGGNet16 (a) and DCAE_VGG (b)
(Hadsell et al. 2006) is used to conduct unsupervised learning of the affinity matrix W given the input distance measure; secondly, the map Fθ is learned without supervision by optimizing a spectral clustering objective while enforcing orthogonality; thirdly, k-means clustering is used to learn the cluster assignments in the eigenspace of the associated graph Laplacian matrix (Shaham et al. 2018). The loss function of SpectralNet is

LSpectralNet(θ) = E[wij ‖yi − yj‖²],

where wij expresses the similarity between xi and xj, and yi = Fθ(xi). The expectation E is taken over pairs of objects (xi, xj); the more similar xi and xj are, the larger wij, so minimizing the loss pulls their embeddings yi and yj together.
Affinities learning with Siamese network
The affinity measure plays an important role in spectral clustering. Shaham et al. (2018) showed that computing the affinities with a Siamese network dramatically improves the clustering quality over using raw Euclidean distances. A Siamese network consists of twin networks that share the same network parameters (Chopra et al. 2005) and is typically trained on a set of similar (positive) and dissimilar (negative) pairs of objects. Based on the Euclidean distances between objects, positive pairs are constructed from the nearest neighbors of each object, while negative pairs are constructed from objects far away from it. A Siamese network maps each object xi to zi = Gθsiamese(xi) and is trained to minimize the loss

Lsiamese(θsiamese; xi, xj) = ‖zi − zj‖², if (xi, xj) is a positive pair,
Lsiamese(θsiamese; xi, xj) = max(c − ‖zi − zj‖, 0)², if (xi, xj) is a negative pair,

where c is a margin (typically set to 1). Once the Siamese network is well trained, it is used to define the affinity matrix W of SpectralNet:

Wij = exp(−‖zi − zj‖² / (2σ²)), if zj is among the nearest neighbors of zi,
Wij = 0, otherwise,

where σ is the scale of a Gaussian kernel, and W is symmetrized by setting Wij = (Wij + Wji)/2.
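The three ingredients of this section, the contrastive Siamese loss, the Gaussian affinity, and the SpectralNet objective, can be evaluated numerically as follows. This is a sketch of the formulas only, not of the network training:

```python
import numpy as np

def contrastive_loss(z_i, z_j, positive, c=1.0):
    """Siamese training loss: pull positive pairs together, push negative
    pairs at least a margin c apart (Hadsell et al. 2006)."""
    d = np.linalg.norm(z_i - z_j)
    return d ** 2 if positive else max(c - d, 0.0) ** 2

def gaussian_affinity(z_i, z_j, sigma=1.0):
    """Unsymmetrized affinity entry W_ij for neighboring points."""
    return np.exp(-np.sum((z_i - z_j) ** 2) / (2 * sigma ** 2))

def spectralnet_loss(Y, W):
    """Empirical version of E[w_ij * ||y_i - y_j||^2] over all pairs."""
    d2 = ((Y[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return (W * d2).mean()

a, b = np.zeros(2), np.array([2.0, 0.0])
loss_pos = contrastive_loss(a, a, positive=True)    # identical positive pair
loss_neg = contrastive_loss(a, b, positive=False)   # already beyond margin c
w_self = gaussian_affinity(a, a)                    # maximal affinity
```

An identical positive pair contributes zero loss; a negative pair farther apart than the margin also contributes zero, so training only acts on pairs that violate these conditions.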
k‑means clustering
After obtaining the spectral map y = Fθ (x), k-means clustering is used to learn the cluster assignment c(y). k-means clustering (Hartigan and Wong 1979) is an iterative technique used to cluster data samples into k groups. The outputs are a set of “labels” assigning each image to one of the k groups.
Specifically, k initial centroids are chosen; each object is then assigned to the cluster of its nearest centroid (usually measured by Euclidean distance); finally, each centroid is updated to the mean of the objects assigned to it. The last two steps are repeated until the centroids stop moving significantly between iterations (i.e., until the algorithm converges). k-means clustering is simple, easy to implement, and its clustering results are easy to interpret.
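The procedure can be sketched as follows (a from-scratch illustration of Lloyd's algorithm on synthetic 2-D blobs; production code would typically use a library implementation):

```python
import numpy as np

def kmeans(X, k, n_iter=100, seed=0):
    """Plain Lloyd's algorithm: assign each point to its nearest centroid,
    then move each centroid to the mean of its cluster, until convergence."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(n_iter):
        # Assignment step: nearest centroid by squared Euclidean distance.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(-1)
        labels = d2.argmin(axis=1)
        # Update step: move each centroid to the mean of its cluster.
        new = np.array([X[labels == j].mean(axis=0) if np.any(labels == j)
                        else centroids[j] for j in range(k)])
        if np.allclose(new, centroids):   # centroids stopped moving
            break
        centroids = new
    return labels, centroids

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(0, 0.2, size=(30, 2)),   # two well-separated blobs
               rng.normal(5, 0.2, size=(30, 2))])
labels, cents = kmeans(X, 2)
```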
Clustering and result analysis
k-means clustering needs the cluster number k to be assigned in advance. In this section, based on feature visualization and cluster validity index analysis, we clustered the auroral images into 2 classes. Feature visualization and unsupervised classification experiments were conducted to evaluate the effectiveness of AICNet. Examples and the temporal occurrence distributions of the auroras in each cluster are presented to analyze the clustering results.
Data and implementation details
The auroral data explored in this paper were captured by the ASIs at Yellow River Station (YRS), Ny-Ålesund, Svalbard, between December 2003 and January 2008. Only dayside (0300–1500 Universal Time / 0600–1800 Magnetic Local Time) auroras at 557.7 nm were considered. Altogether, 4000 images containing discrete auroras were used. Since auroras vary continuously and temporally adjacent ASI images tend to be similar and thus likely to fall into the same cluster, the 4000 images were randomly selected with relatively long time intervals between them to ensure the typicality and diversity of the auroral morphology.
DCAE_VGG was trained with stochastic gradient descent (SGD) for 250 epochs with a batch size of 256, where the mean squared error (MSE) was used as the loss function.
After that, the model structure and parameters were saved for the subsequent clustering. During the training of the clustering network, the network size of both the Siamese network and SpectralNet was 512 × 512 × 4. We trained the Siamese network for 100 epochs with a batch size of 64, an initial learning rate of 0.001, and a weight decay of 0.1, and SpectralNet for 100 epochs with a batch size of 256, an initial learning rate of 0.00001, and a weight decay of 0.001. All code presented in this paper is available at https://github.com/pennywei/AICNet-clustering.
Feature visualization by t‑SNE
t-Distributed Stochastic Neighbor Embedding (t-SNE) is a widely used technique for visualizing high-dimensional data in the field of machine learning (Maaten and Hinton 2008). It visualizes data with hundreds or even thousands of dimensions by giving each data point a location in a two- or three-dimensional map. We apply t-SNE here to visualize the auroral features extracted by DCAE_VGG and then analyze their effectiveness and separability.
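A minimal usage sketch with scikit-learn's TSNE, using random features as a stand-in for the 10-D hidden-layer features (the group means and the perplexity are arbitrary illustrative choices):

```python
import numpy as np
from sklearn.manifold import TSNE

rng = np.random.default_rng(0)
# Stand-in for 10-D hidden-layer features of two auroral groups.
feats = np.vstack([rng.normal(0, 1, size=(100, 10)),
                   rng.normal(6, 1, size=(100, 10))])

# Embed the 10-D features into a 2-D map for visual inspection.
emb = TSNE(n_components=2, perplexity=30, init="pca",
           random_state=0).fit_transform(feats)
```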
Figure 4 shows the visualization results. The features of the 4000 auroral images are grouped into two clusters in the two-dimensional map, indicating that the auroral images are partitioned into two classes in the high-dimensional feature space. In addition, the good separation of the two groups indicates that DCAE_VGG has extracted a well-separated representation of the auroral images.
Cluster validity indices for determining the best cluster number
In k-means clustering, the value of k must be determined in advance. The optimal k is the number of clusters coinciding with the ground truth. In practice, one will select the most
appropriate k by understanding the dataset or by calculating the cluster validity indices with different k. Cluster validity indices describe objective measures to evaluate clustering quality.
In our experiment, we examined three popular validity indices, the Silhouette Coefficient (SC), the Calinski-Harabasz Index (CH), and the Davies-Bouldin Index (DB) (Sun and Yu 2018), as shown in Table 1. These indices evaluate a given partition of the data by measuring cluster compactness and separation without knowing anything about the true partitioning. Ideally, images within a cluster should be similar to each other, while images in different clusters should be as different as possible. Specifically, SC works based on the distances between each point within and between the clusters, and CH measures the quality of a clustering solution based on the ratio of between-cluster to within-cluster dispersion (Ünlü and Xanthopoulos 2019); the larger the value of SC or CH, the better the clustering result. DB works based on the average similarity between each cluster and its most similar one, and a smaller DB value reflects a better clustering result.
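All three indices are available in scikit-learn, so the model-selection step can be sketched as follows (synthetic features stand in for the DCAE_VGG features, and the range of k is truncated for speed):

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import (silhouette_score, calinski_harabasz_score,
                             davies_bouldin_score)

rng = np.random.default_rng(0)
# Synthetic 10-D features with two clearly separated groups.
X = np.vstack([rng.normal(0, 1, size=(200, 10)),
               rng.normal(8, 1, size=(200, 10))])

scores = {}
for k in range(2, 6):
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(X)
    scores[k] = {"SC": silhouette_score(X, labels),          # larger = better
                 "CH": calinski_harabasz_score(X, labels),   # larger = better
                 "DB": davies_bouldin_score(X, labels)}      # smaller = better

best_k = max(scores, key=lambda k: scores[k]["SC"])
```

On this two-group toy data the Silhouette Coefficient peaks at k = 2, mirroring the selection procedure used for Table 1.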
Fig. 4 Visualization of the feature vectors extracted from DCAE_VGG
Table 1 Cluster validity indices with different values of k. Bold fonts highlight the best results
Validity Index k = 2 k = 3 k = 4 k = 5 k = 6 k = 7 k = 8 k = 9 k = 10
SC 0.671 0.601 0.527 0.535 0.530 0.532 0.546 0.528 0.445
CH 12,869 7939 11,010 9490 9975 10,934 7963 7506 3868
DB 0.454 0.501 0.626 0.606 0.607 0.555 0.469 0.532 0.631
Based on our understanding of the auroral data, we tested k = 2 to k = 10 and compared the three cluster validity indices to obtain the optimal cluster number for our dataset. As Table 1 shows, all three indices attain their best values at k = 2, so the optimal cluster number is 2. Therefore, this paper gives a detailed analysis of the clustering results with a cluster number of 2.
Clustering results
To give a visual inspection of the clustering results with cluster number = 2, Fig. 5 illustrates 21 examples from each cluster. All clustering results are available at https://github.com/pennywei/AICNet-clustering-results. The auroral displays in the first cluster (cluster1) have distinct shape structures and are mainly composed of arc, patch, and spot auroras; the second cluster (cluster2) mainly contains corona auroras with rich texture details. The overall intensity of the auroral displays in cluster1 is significantly stronger than that in cluster2. The high inter-class difference and high intra-class similarity of the auroral morphologies in the two clusters demonstrate the clustering effectiveness of the proposed AICNet.
By investigating the auroral luminosity distribution and the dynamics of discrete auroral forms, Feldstein et al. (2014) concluded that the changes of discrete auroral forms versus local time exhibit a fixed pattern with respect to the sun: rays near noon, homogeneous arcs during the evening, and rayed arcs and bands during the night and in the morning. To further interrogate the clustering results, we next examined the temporal occurrence distributions of the images in each cluster, as shown in Fig. 6. The temporal axis was divided into 72 bins of 10-min duration, and the images falling into each bin were counted for each cluster. The occurrence distributions were obtained by normalizing the count in each bin by the total number of images. The global distributions show that the two clusters approximately dominate different magnetic local time (MLT) regions. Figure 6a shows the occurrence distributions of the clustering results, where the red and green curves are the distributions of cluster1 and cluster2, respectively. The occurrence of cluster1 has a double-peak distribution and mostly falls in the afternoon, whereas cluster2 predominantly occurs before and at noon.
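The binning step can be sketched as follows (`occurrence_distribution` is a hypothetical helper; normalizing each bin count by the total image count is one plausible reading of the text):

```python
import numpy as np

def occurrence_distribution(times_ut, n_bins=72, start=3.0, end=15.0):
    """Bin image times (UT, in hours) into 10-min bins over 0300-1500 UT and
    normalize by the total image count."""
    counts, edges = np.histogram(times_ut, bins=n_bins, range=(start, end))
    return counts / counts.sum(), edges

rng = np.random.default_rng(0)
times = rng.uniform(9.0, 15.0, size=1000)   # hypothetical afternoon-heavy set
dist, edges = occurrence_distribution(times)
```

Applying this to each cluster's timestamps gives the per-cluster curves plotted in Fig. 6a.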
Comparison with existing classification mechanism

Hu et al. (2009) manually classified the dayside auroras of YRS into arc, drapery corona, radial corona, and hot-spot auroras; examples of the four types are given in Fig. 7.
From Fig. 7 we can see that both arc and hot-spot auroras usually have obvious shape structures with arc-shaped or irregular brightening spots, while drapery and radial corona auroras are weak in intensity and are dominated by textures. In order to compare the clustering results with
Fig. 5 Examples of the clustering results for cluster1 and cluster2
the classification mechanism proposed by Hu et al. (2009), we labeled the clustered 4000 auroral images twice. First, following Hu et al. (2009), the 4000 auroral images were manually labeled into four classes, i.e., auroral arc, drapery corona, radial corona, and hot-spot aurora. Then, the four types of auroral images were merged into two categories according to the morphological characteristics of each auroral type: the arc and hot-spot auroras were merged into one category, labeled arc + hotspot; the drapery corona and radial corona were merged into the other, labeled drapery + radial. Figure 6b shows the temporal occurrence distributions of the two merged categories. For convenience of comparison, similar occurrence distributions are
Fig. 6 Temporal occurrence distributions. (a) shows the temporal occurrence distributions of the clustering results, and (b) shows the temporal occurrence distributions obtained by merging images from two categories that were manually labeled according to the classification mechanism proposed by Hu et al. (2009)
Fig. 7 Examples of the four categories of auroral images according to the classification scheme proposed by Hu et al. (2009)
drawn with the same color in Fig. 6a and b (the label assigned to a cluster is arbitrary; we only need to give all the images in a group the same label). Clearly, the occurrence time distributions in Fig. 6a and b are roughly the same.
The main difference between our clustering results and the classification mechanism proposed by Hu et al. (2009) lies in the so-called hot-spot auroras. Hu et al. (2009) classified hot-spots as a subclass of corona auroras, while in our clustering results, as shown in Fig. 5, hot-spot auroras appear in both cluster1 and cluster2, depending on whether they are shape-dominated or texture-dominated. As can be seen from Fig. 7, many complex auroral structures, such as rayed structures, rayed bundles, spots, and irregular patches, may appear in hot-spot displays. Hot-spot auroras were also misclassified as arc auroras or radial coronas in many previous classification studies (e.g., Wang et al. 2010; Yang et al. 2012; Yang and Zhou 2020). Whether it is reasonable to classify hot-spots as a subclass of corona auroras therefore remains open to discussion.
Conclusions
As an impressive natural phenomenon with a wide diversity of forms and rapid variations in luminosity, auroras have yet to be fully characterized. There is no consensus on the aurora classification scheme, and the number of aurora categories and the characteristics of each category are not fully defined. In this paper, deep learning was applied to auroral morphology clustering, and an auroral image clustering network (AICNet) was proposed to explore the optimal classification of the dayside auroras at Yellow River Station.
Clustering is an unsupervised machine learning method.
In our method, the cluster number must be determined in advance. Guided by the visualization of the features extracted by DCAE_VGG and by the cluster validity indices, we clustered the auroral images into two groups. Both the intra-cluster similarity and the inter-cluster difference of the auroral morphology (shape, texture, brightness, etc.) in the two clusters are high, and the temporal occurrence distributions of the auroras in the two clusters also differ markedly: one cluster has a double-peak distribution and mainly occurs in the afternoon, while the other mostly occurs before and near noon. The proposed AICNet is knowledge-free and requires no manual labels, so human bias is avoided and the efficiency of auroral morphology classification is greatly improved.
This study represents a preliminary effort towards applying unsupervised clustering to automatic auroral image recognition, and several issues require further investigation. First, we randomly selected only 4000 images from the auroral observations at Yellow River Station in 2003–2008 to validate the method; more auroral data need to be considered in future work. Second, only auroral morphology was characterized in this study. Since an aurora is essentially a dynamically evolving process, motion features should be added so that auroras can be classified by both spatial structure and evolution. Third, more fine-grained features may be taken into account to further subdivide the two clusters, such as dividing arcs into single and multiple arcs.
Acknowledgements This work was supported by the National Natural Science Foundation of China (Grants 41504122 and 61571353), the National Science Basic Research Plan in Shaanxi Province of China (Grant 2020JM-272), and the Fundamental Research Funds for the Central Universities (Grant GK202103020). Auroral observations at Yellow River Station (YRS) are supported by CHINARE and provided by the Polar Research Institute of China (http://www.chinare.org.cn:8000/uap/database).
References
Akasofu SI (1964) The development of the auroral substorm. Planet Space Sci 12(4):273–282
Ayukawa M, Makita K, Yamagishi H, Ejiri M, Sakanoi T (1996) Characteristics of polar cap aurora. J Atmos Terr Phys 58:1885–1894
Baranoski GVG, Rokne JG, Shirley P, Trondsen TS, Rui B (2003) Simulating the aurora. J Vis Comput Animat 14:43–59
Biradar C, Pratiksha SB (2012) An Innovative Approach for Aurora Recognition. International Journal of Engineering Research and Technology 1(7):1–5
Chopra S, Hadsell R, LeCun Y (2005) Learning a similarity metric discriminatively, with application to face verification. In Conference on Computer Vision and Pattern Recognition (CVPR'05), San Diego, CA, 539–546
Clausen LBN, Nickisch H (2018) Automatic classification of auroral images from the Oslo Auroral THEMIS (OATH) dataset using machine learning. J Geophys Res Space Physics 123:5640–5647
Feldstein YI, Vorobjev VG, Zverev VL, Forster M (2014) Investigations of the auroral luminosity distribution and the dynamics of discrete auroral forms in a historical retrospective. Hist Geo Space Sci 5(1):81–135
Han D, Chen X, Liu J et al (2015) An extensive survey of dayside diffuse aurora based on optical observations at Yellow River Station. J Geophys Res Space Physics 120:7447–7465
Hadsell R, Chopra S, LeCun Y (2006) Dimensionality reduction by learning an invariant mapping. IEEE Conference on Computer Vision and Pattern Recognition 2:1735–1742
Hartigan JA, Wong MA (1979) Algorithm AS 136: A K-Means Clustering Algorithm. J R Stat Soc Ser C 28(1):100–108
Hu H, Liu R, Wang J et al (1999) Statistic characteristics of the aurora observed at Zhongshan Station, Antarctica. Chinese Journal of Polar Research 11:8–18
Hu ZJ, Yang H, Huang D et al (2009) Synoptic distribution of dayside aurora: multiple-wavelength all-sky observation at Yellow River Station in Ny-Ålesund, Svalbard. J Atmos Solar Terr Phys 71(8):794–804
Maaten L, Hinton G (2008) Visualizing data using t-SNE. J Mach Learn Res 9:2579–2605
Niu C, Zhang J, Wang Q, Liang J (2018) Weakly supervised semantic segmentation for joint key local structure localization and classification of aurora image. IEEE Trans Geosci Remote Sens 99:1–14
Rumelhart DE, Hinton GE, Williams RJ (1986) Learning representations by back-propagating errors. Nature 323:533–536
Sandholt PE, Farrugia CJ, Moen J et al (1998) A classification of dayside auroral forms and activities as a function of interplanetary magnetic field orientation. J Geophys Res Space Physics 103(A10):23325–23345
Shaham U, Stanton K, Li H (2018) SpectralNet: spectral clustering using deep neural networks. In International Conference on Learning Representations (ICLR2018), Vancouver, Canada, 1–20
Simmons D (1998) A classification of auroral types. J Br Astron Assoc 108:247–257
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. The 3rd International Conference on Learning Representations (ICLR2015), San Diego, CA, 1–14
Steen Å, Brändström U, Gustavsson B, Aso T (1997) ALIS - a multi-station imaging system at high latitudes with multi-disciplinary scientific objectives. European Rocket & Balloon Programmes & Related Research, 261–266
Störmer C (1955) The Polar Aurora. Clarendon Press, Oxford
Sun N, Yu H (2018) A method to determine the number of clusters based on multi-validity index. International Joint Conference on Rough Sets. Springer, Cham
Syrjäsuo MT, Donovan EF (2004) Diurnal auroral occurrence statistics obtained via machine vision. Ann Geophys 22:1103–2113
Ünlü R, Xanthopoulos P (2019) Estimating the number of clusters in a dataset via consensus clustering. Expert Syst Appl 125:33–39
Wang Q, Liang J, Hu ZJ et al (2010) Spatial texture based automatic classification of dayside aurora in all-sky images. J Atmos Sol Terr Phys 72(5):498–508
Xie J, Girshick R, Farhadi A (2016) Unsupervised deep embedding for clustering analysis. Proceedings of the 33rd International Conference on Machine Learning
Yang H, Sato N, Makita K et al (2000) Synoptic observations of auroras along the postnoon oval: a survey with all-sky TV observations at Zhongshan, Antarctica. J Atmos Solar Terr Phys 62(9):787–797
Yang Q, Liang J, Hu Z, Zhao H (2012) Auroral sequence representation and classification using hidden Markov models. IEEE Trans Geosci Remote Sens 50(12):5049–5060
Yang Q, Zhou P (2020) Representation and classification of auroral images based on convolutional neural networks. IEEE Journal of Selected Topics in Applied Earth Observations and Remote Sensing 13:523–534
Zhang J, Liu M, Lu K, Gao Y (2019) Group-wise learning for aurora image classification with multiple representations. IEEE Transactions on Cybernetics 99:1–13
Zhong Y, Huang R, Zhao J, Zhao B, Liu T (2018) Aurora image classification based on multi-feature latent dirichlet allocation. Remote Sensing 10(2):233–249