Solar Flare Final-4
COS7046-B
Table of Contents
Table of Figures
Introduction
Background
Methodology
Data Integration
Visualization Techniques
Visual Analysis
Correlation Analysis
Challenges Faced
Recommendations
Conclusion
References
Table of Figures
Figure 1 Sunspot Numbers and Solar Flare Intensity Over Time
Figure 2 Interactive plot of Sunspot Numbers and Flare Intensity
Figure 3 Scattered Plot Correlation between Sunspot Number and Flare Intensity
Figure 4 Heatmap of Monthly Averages
Abstract
Using data visualization and machine learning, this study examines the correlation between
sunspot numbers and solar flare intensities. Key findings include a strong positive
correlation (approximately 0.72) and moderate predictive accuracy from a linear regression
model. To help safeguard critical infrastructure, it is recommended that future work improve
the feature set and use more advanced AI models to predict solar storms.
Introduction
Space weather, a product of solar activity, has significant impacts on Earth’s technological
infrastructure (Fleishman et al., 2022). Solar flares combined with coronal mass ejections
(CMEs) can disrupt communications, damage satellites and sometimes cause power blackouts
(NOAA, 2024). These events, which arise from changes in the Sun’s magnetic field, are major
threats to modern society, so being able to predict them is important for the safety of
global systems (Gopalswamy, 2022).
Space weather prediction includes tracking sunspots, which are relatively cool regions on
the surface of the Sun caused by solar magnetic activity (Mancini and Nastasi, 2020). Solar
flares are sudden, sometimes violent releases of energy that occur mainly in magnetically
active areas of the Sun’s atmosphere, especially around complex sunspot regions (Georgoulis
et al., 2024). Sunspots and solar flares are central to the study and prediction of solar
storms; however, the area presents many challenging problems and is not yet well understood
(Ren et al., 2021).
This project seeks to improve understanding of the relationship between sunspot numbers and
solar flares, and so to provide a basis for better prediction of future solar storms.
Background
Space weather may be defined as the conditions on the Sun and in the solar wind that affect
technologies on and around Earth (Moldwin, 2022). Sunspots are cooler, magnetically active
areas on the Sun’s photosphere and are associated with solar flares: short, intense bursts
of energy that produce electromagnetic radiation (NOAA, 2024; Belhadi et al., 2020). Flares
may interfere with communication, GPS and power supply, so their study is important. Solar
activity follows a cycle of about 11 years, and the risk is higher during solar maximum
(Brantley et al., 2024; Fletcher, 2024). Continuous recording of sunspots and flares is
helpful in forecasting.
Space weather data are used in both scientific investigations and practical applications,
so data visualization is vital; simple visual tools, such as heatmaps, scatter plots, and
time series, make it easier to find patterns and relationships (Jovanovic et al., 2024).
This project uses SIDC daily sunspot numbers and NGDC flare statistics that include counts,
flare intensity, and time. Cleaning covers the handling of missing values and formatting,
and is carried out before trend analysis of solar activity.
Methodology
Data Investigation and Processing
The sunspot data for this study were obtained from SIDC and the solar flare data from NGDC.
Both datasets needed pre-processing, as they contained inconsistent values. The raw sunspot
data contained missing values, negative observations and dates that were not stored as
DateTime, which required row deletion, normalization and data type conversion. The solar
flare data also needed filtering of fields and conversion of dates to a standard format, and
its column formats were normalized as well. These operations were performed in pandas by
converting selected columns to the correct types and deleting any rows with invalid
entries.
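The cleaning steps described above can be sketched roughly as follows. This is a minimal illustration rather than the project's actual code: the column names (year, month, day, SunspotNumber), the sample rows and the helper clean_sunspots are all assumptions made for the example.

```python
import pandas as pd

# Hypothetical raw sunspot records; column names and values are illustrative only.
raw = pd.DataFrame({
    "year": [2020, 2020, 2020, 2020],
    "month": [1, 1, 1, 1],
    "day": [1, 2, 3, 4],
    "SunspotNumber": [12.0, -1.0, None, 30.0],
})

def clean_sunspots(df: pd.DataFrame) -> pd.DataFrame:
    """Build a Date column and drop rows with missing or negative counts."""
    out = df.copy()
    # Assemble a proper datetime from the separate year/month/day fields.
    out["Date"] = pd.to_datetime(out[["year", "month", "day"]])
    # Coerce to numeric so stray text becomes NaN instead of raising an error.
    out["SunspotNumber"] = pd.to_numeric(out["SunspotNumber"], errors="coerce")
    # Delete rows with missing values, then rows with negative observations.
    out = out.dropna(subset=["SunspotNumber"])
    out = out[out["SunspotNumber"] >= 0]
    return out[["Date", "SunspotNumber"]].reset_index(drop=True)

clean = clean_sunspots(raw)
```

Here the second and third rows are removed (a negative value and a missing value), leaving two valid daily observations with a DateTime column.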
Data Integration
The sunspot and solar flare datasets were merged on the Date field to align daily sunspot
numbers with solar flare intensities. Missing values in either dataset were replaced with
zeros using pandas, so that the integrated data contained no gaps. This integration enabled
further analysis, particularly the examination of correlations and trends over time.
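A merge of this kind might look like the sketch below. The join type is not stated in this report, so the outer join is an assumption chosen so that days present in either dataset are kept. The column names and sample values are likewise illustrative.

```python
import pandas as pd

# Illustrative inputs; not the study's data.
sunspots = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-01", "2020-01-02", "2020-01-03"]),
    "SunspotNumber": [12.0, 18.0, 30.0],
})
flares = pd.DataFrame({
    "Date": pd.to_datetime(["2020-01-02", "2020-01-03", "2020-01-04"]),
    "FlareIntensity": [1.5, 4.0, 2.0],
})

# An outer join keeps days present in either dataset; gaps appear as NaN.
merged = pd.merge(sunspots, flares, on="Date", how="outer").sort_values("Date")

# Replace the gaps with zeros, as described in the text.
merged = merged.fillna(0).reset_index(drop=True)
```

One design point worth noting is that zero-filling treats a missing record as a day with no activity, which can pull averages and correlations toward zero when the value was simply not recorded.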
Visualization Techniques
Various visualization tools were employed, including matplotlib, seaborn, and plotly.express,
to provide both static and interactive visualizations. The visualizations included:
Time Series Plots: Displaying trends in sunspot numbers and solar flare intensities
over time.
Scatter Plots: Highlighting the relationship between sunspot numbers and flare
intensities.
Heatmaps: Showing monthly averages of sunspot numbers and flare intensities to
identify cyclical patterns.
Interactive Dashboards: Allowing users to explore correlations dynamically through
scatter plots.
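As an illustration of the heatmap preparation listed above, monthly averages can be arranged into a year-by-month table before plotting. The sketch below uses a synthetic daily series in place of the merged dataset, and the helper name monthly_average_table is hypothetical.

```python
import numpy as np
import pandas as pd

# Synthetic daily series standing in for the merged dataset.
dates = pd.date_range("2020-01-01", "2021-12-31", freq="D")
daily = pd.DataFrame({
    "Date": dates,
    "SunspotNumber": np.arange(len(dates), dtype=float),
})

def monthly_average_table(df: pd.DataFrame, value: str) -> pd.DataFrame:
    """Return a year-by-month table of mean values, ready for a heatmap."""
    tmp = df.assign(Year=df["Date"].dt.year, Month=df["Date"].dt.month)
    return tmp.pivot_table(index="Year", columns="Month", values=value, aggfunc="mean")

table = monthly_average_table(daily, "SunspotNumber")
```

The resulting table has one row per year and one column per month, and could then be passed to a plotting call such as seaborn.heatmap(table).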
For data modeling and training, the tensorflow library was used to construct and train a
basic linear regression model. The model was trained with the Adam optimizer and a mean
squared error loss function, and was used to estimate flare intensities from sunspot
numbers. This provided a basis for predictive analysis and showed that there is a
relationship between sunspot activity and flare occurrences.
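The training setup described above (a linear model, a mean squared error loss, the Adam optimizer) can be sketched without any deep-learning dependency. The following is not the project's tensorflow code. It is a minimal NumPy illustration of the same optimization, run on synthetic data, and the function name fit_linear_adam, the hyperparameters and the generated values are all assumptions made for the example.

```python
import numpy as np

def fit_linear_adam(x, y, lr=0.05, epochs=2000, beta1=0.9, beta2=0.999, eps=1e-8):
    """Fit y ~ w*x + b by minimising mean squared error with Adam updates."""
    params = np.zeros(2)              # [w, b]
    m = np.zeros(2)                   # first-moment (mean) estimate of the gradient
    v = np.zeros(2)                   # second-moment estimate of the gradient
    for t in range(1, epochs + 1):
        err = params[0] * x + params[1] - y
        # Gradient of mean(err**2) with respect to w and b.
        grad = np.array([2.0 * np.mean(err * x), 2.0 * np.mean(err)])
        m = beta1 * m + (1 - beta1) * grad
        v = beta2 * v + (1 - beta2) * grad ** 2
        m_hat = m / (1 - beta1 ** t)  # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        params -= lr * m_hat / (np.sqrt(v_hat) + eps)
    return params[0], params[1]

# Synthetic data with a known linear trend plus noise (illustrative only).
rng = np.random.default_rng(0)
sunspots = rng.uniform(0, 200, size=300)
flare = 0.02 * sunspots + 1.0 + rng.normal(0, 0.3, size=300)

# Standardise the input so a single learning rate suits both parameters.
x = (sunspots - sunspots.mean()) / sunspots.std()
w, b = fit_linear_adam(x, flare)
mse = np.mean((w * x + b - flare) ** 2)
```

Because the model is linear and the loss is quadratic, the fitted parameters should end up close to the ordinary least-squares solution, which gives a convenient sanity check on any such training loop.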
Figure 1 Sunspot Numbers and Solar Flare Intensity Over Time
Correlation Analysis
The correlation between sunspot numbers and solar flare intensities was found to be
approximately 0.72, a strong positive relationship. This result indicates that flare
intensities tend to increase with increasing sunspot numbers. The correlation matters
because it supports the development of improved predictive models for solar flares based on
sunspot data. However, there was some variability, which suggests that additional factors,
such as sunspot classification or location, may influence flare intensity and should be
explored further.
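For reference, a correlation coefficient of this kind can be computed as in the sketch below, assuming the reported value is a Pearson coefficient (the report does not name the method). The data values and the helper pearson_r are illustrative only and do not reproduce the 0.72 result.

```python
import numpy as np
import pandas as pd

def pearson_r(x, y) -> float:
    """Pearson correlation: covariance divided by the product of standard deviations."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    xc, yc = x - x.mean(), y - y.mean()
    return float(np.sum(xc * yc) / np.sqrt(np.sum(xc ** 2) * np.sum(yc ** 2)))

# Illustrative values only; not the study's data.
df = pd.DataFrame({
    "SunspotNumber": [10, 25, 40, 60, 80, 120],
    "FlareIntensity": [1.0, 1.8, 1.5, 3.2, 2.9, 4.5],
})
r_manual = pearson_r(df["SunspotNumber"], df["FlareIntensity"])
r_pandas = df["SunspotNumber"].corr(df["FlareIntensity"])  # Pearson by default
```

As a rough guide to interpretation, a coefficient of 0.72 corresponds to a squared value of about 0.52, so roughly half of the variance in flare intensity would be accounted for by a linear function of sunspot number, which is consistent with the variability noted above.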
Figure 2 Interactive plot of Sunspot Numbers and Flare Intensity
Scatter Plot Analysis
The scatter plot of sunspot numbers versus flare intensities shows a general upward trend,
but there are also obvious outliers. On these outlying days, flare intensities were
unusually high despite moderate sunspot numbers, indicating that factors other than sunspot
quantity play a role in flare activity. The scatter plot also suggested a non-linear
relationship, especially in high-activity periods, so more advanced modelling techniques
may be needed to predict flare intensity accurately.
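One possible way to identify such outlying days numerically, rather than by eye, is to flag points whose residual from a straight-line fit is unusually large. This is a suggested sketch and not a method used in the analysis above. The threshold of 3, the use of the median absolute deviation as a robust scale, the function name flag_outliers and the synthetic data are all assumptions.

```python
import numpy as np

def flag_outliers(x, y, threshold=3.0):
    """Flag points whose residual from a straight-line fit is unusually large."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    residuals = y - (slope * x + intercept)
    centred = residuals - np.median(residuals)
    # Robust scale: median absolute deviation, rescaled so that it matches the
    # standard deviation when residuals are normally distributed.
    scale = 1.4826 * np.median(np.abs(centred))
    return np.abs(centred) > threshold * scale

# Synthetic data with one injected extreme day (illustrative only).
rng = np.random.default_rng(1)
sunspots = rng.uniform(0, 150, size=200)
flare = 0.02 * sunspots + 1.0 + rng.normal(0, 0.2, size=200)
flare[5] += 5.0   # one unusually intense day at an arbitrary index
mask = flag_outliers(sunspots, flare)
```

The flagged dates could then be cross-checked against other records, such as sunspot classification, to look for the additional factors mentioned above.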
Figure 3 Scattered Plot Correlation between Sunspot Number and Flare Intensity
Figure 4 Heatmap of Monthly Averages
Machine Learning Results
The linear regression model achieved a Mean Squared Error (MSE) of approximately 0 on the
test dataset, indicating good predictive capability. As shown in the visualization of actual
versus predicted intensities, most data points had predicted flare intensities that closely
matched the actual values. However, the model was not robust to extreme outliers or to days
of unusually low or high flare activity.
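To make the metric concrete, the sketch below computes the MSE on a small set of made-up values and shows how much a single poorly predicted extreme day contributes. The numbers are purely illustrative and are not the model's outputs.

```python
import numpy as np

def mean_squared_error(actual, predicted) -> float:
    """Average of squared differences between actual and predicted values."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    return float(np.mean((actual - predicted) ** 2))

# Illustrative numbers only: nine well-predicted days and one extreme day.
actual = np.array([1.0] * 9 + [11.0])
predicted = np.array([1.1] * 9 + [1.0])

mse_typical = mean_squared_error(actual[:9], predicted[:9])   # about 0.01
mse_all = mean_squared_error(actual, predicted)               # about 10.009
```

Because errors are squared, the one extreme day accounts for almost all of the overall error in this example. MSE is also expressed in squared units of the target, so its size should be read relative to the scale of the flare intensity values.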
Such limitations imply that further improvement may be gained by including additional
features in the model, for instance sunspot location, magnetic complexity, or flare
classification. Predictions may also be improved with advanced machine learning approaches
such as decision trees or neural networks, which are more capable of capturing non-linear
relationships.
In general, the results showed a strong correlation between sunspot numbers and solar
flares, and identified areas for improving visualization and prediction methods in the
future.
Discussion and Recommendations
Challenges Faced
Some of the problems encountered during the analysis concerned data quality and structure.
The sunspot and solar flare datasets both contained many missing values and format
inconsistencies that required filtering and validation (Georgoulis et al., 2024). The
analysis also could not establish a robust dependence of solar flare activity on finer
properties of sunspots, such as location or magnetic class (Nandy, 2021). In addition, the
flare dataset had outliers and non-standardized measurements, which made the task of
predictive modelling even more difficult.
Recommendations
Subsequent studies should incorporate factors such as sunspot location, sunspot type and
magnetic complexity in order to get a more complete picture of how sunspots and flares
relate to each other. Introducing these attributes may enhance both visualization and
prediction outcomes (Temmer, 2021). Non-linear relationships should be investigated
experimentally, and AI methods such as neural networks or time-series models should be
incorporated to improve flare intensity prediction. For example, recurrent neural networks
(RNNs) could be used to capture the temporal patterns inherent in solar activity (Brantley
et al., 2024). In addition, more detailed observations, for example hourly sunspot data, may
uncover short-term trends that cannot be observed in daily data.
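As a first step toward the time-series models recommended above, a daily series is usually reshaped into fixed-length windows, each paired with the value that follows it. The sketch below shows one way this could be done. The window length, the placeholder series and the helper make_windows are assumptions for illustration.

```python
import numpy as np

def make_windows(series, window: int):
    """Split a 1-D series into (samples, window, 1) inputs and next-step targets."""
    series = np.asarray(series, dtype=float)
    n = len(series) - window
    if n <= 0:
        raise ValueError("series must be longer than the window")
    # Each input row holds `window` consecutive values.
    X = np.stack([series[i:i + window] for i in range(n)])
    # Each target is the value immediately after its window.
    y = series[window:]
    return X[..., np.newaxis], y

daily = np.arange(10, dtype=float)       # placeholder for a daily sunspot series
X, y = make_windows(daily, window=3)
```

The trailing axis of size one gives the (samples, timesteps, features) layout that recurrent layers in libraries such as Keras expect, so further features (for example flare counts) could later be added along that axis.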
Conclusion
This project has demonstrated that sunspot activity and solar flares are related and that
sunspot numbers can be used to help predict solar flares. The 11-year solar cycle, also
noted by other researchers, was particularly well defined across an array of visualization
methods such as line plots, scatter plots, and heatmaps. These results indicate that sunspot
data are an effective predictor for space weather, but also that there are many other
predictors of flare emission. Further practical application of AI and machine learning may
in future reveal currently unknown patterns and thus enhance the capability to predict and
manage space weather.
References
Angryk, R.A. et al. (2020) “Multivariate time series dataset for space weather data analytics,”
Scientific data, 7(1), pp. 1–13. Available at: https://doi.org/10.1038/s41597-020-0548-x.
Belhadi, A. et al. (2020) “Space–time series clustering: Algorithms, taxonomy, and case
study on urban smart cities,” Engineering applications of artificial intelligence, 95(103857),
p. 103857. Available at: https://doi.org/10.1016/j.engappai.2020.103857.
Chapman, S.C. et al. (2020) “Quantifying the solar cycle modulation of extreme space
weather,” Geophysical research letters, 47(11). Available at:
https://doi.org/10.1029/2020gl087795.
Curto, J.J. (2020) “Geomagnetic solar flare effects: a review,” Journal of space weather and
space climate, 10, p. 27. Available at: https://doi.org/10.1051/swsc/2020027.
Dueben, P.D. et al. (2022) “Challenges and benchmark datasets for machine learning in the
atmospheric sciences: Definition, status, and outlook,” Artificial Intelligence for the Earth
Systems, 1(3). Available at: https://doi.org/10.1175/aies-d-21-0002.1.
Fathi, M. et al. (2022) “Big data analytics in weather forecasting: A systematic review,”
Archives of Computational Methods in Engineering. State of the Art Reviews, 29(2), pp.
1247–1275. Available at: https://doi.org/10.1007/s11831-021-09616-4.
Fleishman, G.D. et al. (2022) “Solar flare accelerates nearly all electrons in a large coronal
volume,” Nature, 606(7915), pp. 674–677. Available at:
https://doi.org/10.1038/s41586-022-04728-8.
Fletcher, L. (2024) “Solar flare spectroscopy,” Annual review of astronomy and astrophysics,
62(1), pp. 437–474. Available at: https://doi.org/10.1146/annurev-astro-052920-010547.
Georgoulis, M.K. et al. (2024) “Prediction of solar energetic events impacting space weather
conditions,” Advances in space research: the official journal of the Committee on Space
Research (COSPAR) [Preprint]. Available at: https://doi.org/10.1016/j.asr.2024.02.030.
Gopalswamy, N. (2022) “The Sun and space weather,” Atmosphere, 13(11), p. 1781.
Available at: https://doi.org/10.3390/atmos13111781.
Hendricks, M.D. and Van Zandt, S. (2021) “Unequal protection revisited: Planning for
environmental justice, hazard vulnerability, and critical infrastructure in communities of
color,” Environmental justice, 14(2), pp. 87–97. Available at:
https://doi.org/10.1089/env.2020.0054.
Ji, A. and Aydin, B. (2023) “Interpretable solar flare prediction with sliding window
multivariate time series forests,” in 2023 IEEE International Conference on Big Data
(BigData). IEEE, pp. 1519–1524.
Jovanovic, L. et al. (2024) “Optimizing machine learning for space weather forecasting and
event classification using modified metaheuristics,” Soft computing, 28(7–8), pp. 6383–6402.
Available at: https://doi.org/10.1007/s00500-023-09496-9.
Kumar, N. et al. (2021) “A novel framework for risk assessment and resilience of critical
infrastructure towards climate change,” Technological forecasting and social change,
165(120532), p. 120532. Available at: https://doi.org/10.1016/j.techfore.2020.120532.
Mancini, F. and Nastasi, B. (2020) “Solar energy data analytics: PV deployment and land
use,” Energies, 13(2), p. 417. Available at: https://doi.org/10.3390/en13020417.
Martin, S.F. (2024) “Observations key to understanding solar cycles: a review,” Frontiers in
astronomy and space sciences, 10. Available at: https://doi.org/10.3389/fspas.2023.1177097.
Mohammad Reza, Eskandari Nasab, Shah Muhammad Hamdi, Soukaina Filali Boubrahimi
(2024) “Impacts of Data Preprocessing and Sampling Techniques on Solar Flare Prediction
from Multivariate Time Series Data of Photospheric Magnetic Field Parameters,” Iop.org.
Available at: https://iopscience.iop.org/article/10.3847/1538-4365/ad7c4a/meta (Accessed:
December 20, 2024).
Nandy, D. (2021) “Progress in solar cycle predictions: Sunspot cycles 24–25 in perspective:
Invited review,” Solar physics, 296(3). Available at:
https://doi.org/10.1007/s11207-021-01797-2.
“NOAA national centers for Environmental Information (NCEI)” (2012). Available at:
https://www.ngdc.noaa.gov/ (Accessed: December 20, 2024).
Novikov, V. et al. (2020) “Space weather and earthquakes: possible triggering of seismic
activity by strong solar flares,” Annals of geophysics, 63(5), pp. PA554–PA554. Available at:
https://doi.org/10.4401/ag-7975.
Platts, J. et al. (2022) “Solar flare prediction with recurrent neural networks,” Journal of
Astronautical Sciences, 69(5), pp. 1421–1440. Available at:
https://doi.org/10.1007/s40295-022-00340-0.
Ren, X. et al. (2021) “Deep learning-based weather prediction: A survey,” Big data research,
23(100178), p. 100178. Available at: https://doi.org/10.1016/j.bdr.2020.100178.
Ribeiro, F. and Gradvohl, A.L.S. (2021) “Machine learning techniques applied to solar flares
forecasting,” Astronomy and computing, 35(100468), p. 100468. Available at:
https://doi.org/10.1016/j.ascom.2021.100468.
Rouillard, A.P. et al. (2020) “Models and data analysis tools for the Solar Orbiter mission,”
Astronomy and astrophysics, 642, p. A2. Available at:
https://doi.org/10.1051/0004-6361/201935305.
Saini, K. et al. (2024) “Classification of major Solar Flares from extremely imbalanced
multivariate time series data using MINImally RandOm Convolutional KErnel Transform,”
Universe, 10(6), p. 234. Available at: https://doi.org/10.3390/universe10060234.
Schultz, M.G. et al. (2021) “Can deep learning beat numerical weather prediction?,”
Philosophical transactions. Series A, Mathematical, physical, and engineering sciences,
379(2194), p. 20200097. Available at: https://doi.org/10.1098/rsta.2020.0097.
Singh, A.K. et al. (2021) “Physics of space weather phenomena: A review,” Geosciences,
11(7), p. 286. Available at: https://doi.org/10.3390/geosciences11070286.
Temmer, M. (2021) “Space weather: the solar perspective: An update to Schwenn (2006),”
Living reviews in solar physics, 18(1). Available at:
https://doi.org/10.1007/s41116-021-00030-3.
Valio, A. et al. (2020) “Correlations of sunspot physical characteristics during solar cycle
23,” Solar physics, 295(9). Available at: https://doi.org/10.1007/s11207-020-01691-3.
Zeyu Sun, Monica Bobra, Xiantong Wang, Yu Wang, Hu Sun, Tamas Gombosi, Yang Chen,
Alfred Hero (2022) “Predicting Solar Flares Using CNN and LSTM on Two Solar Cycles of
Active Region Data,” Iop.org. Available at:
https://iopscience.iop.org/article/10.3847/1538-4357/ac64a6/meta (Accessed: December 20, 2024).