Simulation of a Virtual CPU Executing Mathematical Functions in Python
Simulation of a Virtual CPU Executing Mathematical Functions in Python
Executing Mathematical
Functions in Python
ChatGPT-4¹, Ninaad Das²
Abstract
In the field of computer science and computational modelling, an understanding of CPU behaviour
and workload distribution remains a crucial topic (Tanenbaum & Bos, 2014). In this study, a Python-
based virtual CPU environment has been developed to simulate multi-threaded execution of
mathematical functions and visualize the underlying computational behaviour (Silberschatz, Galvin,
& Gagne, 2018). The implementation utilizes multi-threading, real-time data visualization, and
interactive UI elements to simulate the execution of multiple mathematical functions across CPU
cores (McCool, Reinders, & Robison, 2012).
The study also investigates an unexpected phenomenon where the graphical representation of CPU
execution exhibits erratic distortions when the number of cores is set to a value different from four.
Through a detailed analysis of thread execution timing, aliasing effects, and function sampling
artifacts, it has been determined that these irregular patterns are not mere errors but reflections of
real-world computational challenges seen in parallel processing, load balancing, and numerical
stability in floating-point operations (Goldberg, 1991; Overton, 2001).
In addition to providing a detailed breakdown of CPU execution and visualization techniques, the
findings of this study hold significant educational and practical value. The visualization serves as a
teaching tool for multi-threaded execution, enabling students and researchers to gain insights into
the challenges of real-world CPU scheduling and numerical precision errors (Hennessy & Patterson,
2017). Furthermore, potential applications in debugging parallel computing algorithms, detecting
race conditions, and workload distribution analysis are discussed (McCool, Reinders, & Robison,
2012; LeCun, Bengio, & Hinton, 2015).
1. Introduction
The execution of mathematical operations virtual CPU environment using Python,
within a CPU is a fundamental aspect of enabling real-time observation of how multi-
computational systems. At the core of modern threaded workloads execute mathematical
computing lies the concept of parallel functions across multiple cores (Silberschatz,
execution, where multiple CPU cores handle Galvin, & Gagne, 2018).
separate tasks simultaneously (Tanenbaum &
Computational systems often rely on precise
Bos, 2014). This experiment aims to simulate a
function execution, yet real-world CPUs
introduce several factors that affect the core counts beyond four, leading to
accuracy, efficiency, and consistency of phase misalignments and aliasing
numerical computations. These factors errors (Oppenheim & Schafer, 2009).
include thread synchronization, floating-point
Through this experimental CPU simulation,
precision errors, aliasing effects, and load
the underlying computational principles
balancing challenges (Goldberg, 1991;
governing multi-core execution are explored.
Overton, 2001). While modern processors
Moreover, the observed graphical anomalies
incorporate optimizations to handle such
provide a unique opportunity to analyse
inefficiencies, an experimental simulation can
computational artifacts that frequently occur
reveal these behaviours explicitly (Hennessy
in scientific computing, signal processing,
& Patterson, 2017).
numerical modelling, and high-performance
This study presents a Python-based computing (Proakis & Manolakis, 2007).
simulation that:
This paper begins with an overview of Python
• Emulates CPU core behaviour by and the libraries used in the simulation,
assigning computational tasks to followed by an in-depth discussion of CPU
multiple threads (McCool, Reinders, & execution principles and a technical
Robison, 2012). breakdown of the implementation. The study
further investigates unexpected graphical
• Uses various mathematical functions
irregularities, offering insights into their
to model different types of CPU
computational significance and proposing
workloads (Hunter, 2007).
solutions to mitigate them. Finally, the
• Provides real-time graphical implications of this work for education,
visualization of CPU usage and debugging, and scientific computation are
execution timelines (Tufte, 2001). discussed (Hennessy & Patterson, 2017;
LeCun, Bengio, & Hinton, 2015).
• Observes unexpected computational
artifacts, particularly when increasing
• Handling floating-point arithmetic with improved performance over Python’s built-in lists.
The primary advantage of using NumPy over standard Python lists lies in its vectorized operations,
which allow batch processing of numerical data without explicit looping. This significantly improves
execution speed, particularly when updating CPU core usage values in real time.
For example, NumPy’s np.zeros(NUM_CORES) is used to initialize core usage values efficiently:
One of the most critical components is the real-time animation of CPU execution, which is
accomplished using:
This continuously calls the update() function, allowing graphs to refresh dynamically as CPU
workload values change. Without this, static plots would not reflect real-time changes in CPU
behaviour.
Another essential visualization method is polar plotting, used to depict mathematical function
execution across CPU cores:
This plot provides insight into how different CPU cores handle workloads asynchronously, with each
function tracing a unique execution pattern.
• Frame-based updates: The update() function is called every 50 milliseconds, ensuring that
graphs remain synchronized with CPU activity.
• Live function tracing: Execution of CPU workload functions is displayed progressively over
time, similar to how real CPU cycles process tasks in small increments.
The following example illustrates how the heatmap is refreshed in real time:
This results in a heatmap visualization, where high CPU usage regions appear bright, while low
usage areas remain dark, mimicking real-world thermal maps of CPU workloads.
Python’s Global Interpreter Lock (GIL) generally restricts parallel execution within a single Python
process, but since the mathematical function computations are lightweight, multi-threading remains
an effective approach for simulation.
The user interface ensures flexibility in configuring the simulation, making it an interactive and
educational tool for understanding CPU workload distribution.
• A task execution framework that ensures each core is assigned a function to evaluate.
• A graphical visualization component that tracks the execution process in real time.
In this section, the data structures, execution model, and CPU thread simulation techniques are
explored in depth.
Efficient data structures are crucial to managing the state of the virtual CPU. Several key variables
and arrays are used to store CPU workload distribution, function results, and execution timelines.
4.1 CPU Core Representation
Each CPU core is represented using an index-based approach, where the number of available cores is
user-defined and dynamically allocated. The core workload is stored in a NumPy array, which allows
efficient numerical computation and real-time updates:
Additionally, function execution data is stored for each core as a list of tuples:
This ensures that the simulation maintains a history of execution values, allowing dynamic
visualization of function behaviour over time.
Each core fetches a task from its queue, evaluates the assigned function, and updates the stored
values accordingly.
A thread is spawned for each CPU core, ensuring concurrent execution of tasks:
Each thread runs the cpu_task() function, which evaluates a mathematical function assigned to that
core. The daemon thread setting ensures that all threads terminate when the main program exits,
preventing orphan processes.
Each core executes its assigned function in a loop, continuously updating CPU usage values. The
function execution process is defined as:
• A data lock (data_lock) is used to prevent race conditions, ensuring that multiple threads
do not overwrite shared data simultaneously.
This mechanism mimics real-world CPU workloads, where each core processes a task independently,
updating execution results in a shared memory structure.
4.4 Mathematical Functions Simulated in CPU Execution
To provide a varied computational workload, multiple mathematical functions have been
incorporated into the simulation. Each function represents a distinct type of CPU workload,
mimicking different real-world processing tasks.
1. Rotary Flower:
2. Exponential Decay:
3. Chaotic Wave:
o Simulates real-world applications like noise filtering and chaotic system modeling.
4. Oscillating Spiral:
5. Random Spikes:
Each function is dynamically assigned to a core, allowing users to experiment with different CPU
workload distributions in real time.
The execution of mathematical functions in a real CPU follows these key stages:
1. Instruction Fetching: The CPU retrieves the mathematical instruction (e.g., sin(x), cos(x), or
exp(x)) from memory.
2. Instruction Decoding: The retrieved instruction is translated into machine code, which
determines how the CPU processes the computation.
3. Operand Fetching: The required data (input variables such as x) is loaded from registers,
cache, or RAM into the arithmetic execution unit.
4. Computation Execution: The CPU performs the mathematical operation using its floating-
point arithmetic unit (FPU) or vectorized SIMD (Single Instruction, Multiple Data)
instructions.
5. Write-Back Stage: The computed result is stored in registers or memory for further use.
Most mathematical functions require floating-point precision due to the nature of real numbers. The
IEEE 754 standard defines how floating-point numbers are represented and computed in a CPU.
where:
For example, the computation of sinusoidal functions like sin(x) follows the Taylor Series Expansion:
The CPU evaluates this using a series of floating-point multiplications, divisions, and summations,
optimized through instruction pipelining and hardware acceleration.
For a function like cos(x), the CPU executes it using the CORDIC (COordinate Rotation DIgital
Computer) algorithm or LUT (Look-Up Tables) for faster retrieval. The LUT approach precomputes
cosine values for common angles, reducing execution time.
Each virtual CPU core executes a user-selected mathematical function. The function evaluations
occur as follows:
• Erratic behavior arises when phase shifts are improperly handled (discussed in
Observations).
Each function is continuously evaluated and updated by CPU threads in the simulation loop.
Real CPUs use hardware-level floating-point optimizations, whereas Python relies on software-
based floating-point arithmetic using the IEEE 754 standard.
• The simulation relies on Python’s GIL (Global Interpreter Lock), which prevents true
parallelism.
These effects contribute to the erratic visual behavior observed when selecting different core
counts, which will be analyzed in the Observations section.
• Adjust the number of CPU cores, dynamically altering the simulation complexity.
• Initiate the real-time visualization of CPU workload distribution and function execution.
The Tkinter library has been used to construct the GUI, as it provides a lightweight yet flexible
interface for handling user input. Unlike complex GUI frameworks (such as PyQt or Kivy), Tkinter is
sufficiently fast for scientific visualization and does not introduce unnecessary overhead.
o Users can select one of the predefined mathematical functions for execution.
These components are arranged in a compact, user-friendly layout, ensuring efficient interaction
without excessive complexity.
The use of event-driven updates (bind method) ensures that changes in function selection or core
count immediately trigger updates in the underlying simulation.
Four types of graphs are used to depict different aspects of CPU execution:
o Utilizes a hot color map, where higher loads appear brighter and idle cores appear
darker.
o Displays the real-time CPU usage for each core using bar heights.
The update() function dynamically refreshes all graphs to reflect changes in CPU workload.
By continuously calling update(), the system ensures that all graphs remain synchronized with the
underlying CPU workload data.
6.3 Challenges in UI and Visualization Development
6.3.1 Managing Thread Safety in GUI Updates
A key challenge in GUI-based simulations is ensuring thread safety when updating shared data
structures. Python’s Tkinter is not thread-safe, meaning that concurrent access to UI elements from
multiple threads can lead to inconsistent state updates.
• Global variables are modified only within locked code blocks, ensuring exclusive access.
Since CPU workloads change every 50ms, visualization updates must remain smooth and non-
blocking. Strategies used include:
Under normal conditions, the polar plot should display smooth and periodic execution patterns
based on the selected function. Some examples include:
• Rotary Flower Function (sin(6θ) + 0.5 sin(3θ)) forming a symmetric six-petal structure.
These expected patterns are observed when the number of CPU cores remains at 4 (the default
setting). However, when the core count is changed to any other value, the graph becomes erratic
and loses its expected shape.
Whenever the number of cores was modified, the following distortions were observed:
These anomalies suggest a systematic mathematical inconsistency introduced by changing the core
count, which requires further analysis.
Each core executes its function with a fixed phase offset, given by:
where i represents the core index. This offset ensures that cores execute slightly different versions of
the function to prevent complete overlap.
Let us analyse the Rotary Flower function for different core values. The function is defined as:
Here, all cores maintain a stable and smooth function execution, allowing the expected polar plot to
appear correctly.
This is equivalent to destructive wave interference, where periodic functions partially cancel each
other out, disrupting the original function shape.
Python stores numbers using the IEEE 754 floating-point standard, which introduces rounding
errors. These errors accumulate as more cores contribute to the simulation.
The additional precision loss causes slight numerical inconsistencies, altering the computed function
values in unpredictable ways.
The Nyquist-Shannon theorem states that a function must be sampled at at least twice its highest
frequency to avoid aliasing.
the highest frequency component is 6. Thus, we need a sampling rate of at least 12 samples per
cycle.
With 4 cores, the effective sampling rate remains stable. However, changing the core count
introduces uneven sampling, leading to aliasing distortions.
1. Normalizing Phase Offsets Dynamically to maintain uniform function execution across cores.
This resulted in non-uniform phase shifts when the number of cores was changed, creating
destructive interference in the function outputs.
To maintain uniform function execution across any core count, the offset should be harmonically
distributed in a 2π range:
This ensures that each core remains equidistantly spaced around the unit circle, preventing
interference patterns.
Python Implementation
Python’s default floating-point representation (float64) introduces rounding errors over successive
computations. These errors accumulate when multiple cores execute trigonometric and exponential
functions, leading to erratic graph distortions.
This seemingly small deviation compounds over thousands of iterations, altering the execution
pattern unpredictably.
Python Implementation
The Nyquist-Shannon theorem states that a function must be sampled at at least twice its highest
frequency to prevent aliasing artifacts.
For the Rotary Flower function:
the highest frequency component is 6, requiring a sampling rate of at least 12 samples per cycle.
However, when core counts were changed, the sampling rate was inconsistent, leading to
distortions.
By increasing the sampling rate (evaluating multiple points per step) and averaging results, aliasing
can be mitigated.
Python Implementation
1. Parallel Computing:
3. Scientific Simulations:
Modern multi-core processors distribute computational tasks across multiple cores to optimize
performance. The behavior of this workload distribution depends on:
• Thread scheduling algorithms (e.g., Round Robin, FIFO, Dynamic Load Balancing).
This simulation provides a visual representation of such CPU workload balancing, helping engineers
and researchers analyze how different workloads are processed in parallel environments.
Using the heatmap and bar graph visualizations, the simulation allows users to:
This knowledge can be used for fine-tuning performance optimization strategies in real-world CPU-
intensive applications.
Many students struggle with understanding multi-threading, synchronization, and load balancing.
By providing real-time visual feedback, this simulation enables:
This approach aligns with active learning methodologies, which are more effective than traditional
textbook-based explanations.
Numerical precision errors in floating-point arithmetic are a critical concern in fields such as
scientific computing and machine learning. This simulation can be used to:
• Teach methods to mitigate these issues, such as increasing precision (float128) or using
compensation algorithms.
These concepts are crucial for students and researchers working with high-performance computing,
numerical modelling, and simulations.
The erratic behaviour observed in the polar function execution graph is directly related to real-world
multi-threaded computation challenges, such as:
By using this simulation, developers can analyse and debug these issues visually before deploying
multi-threaded applications in production environments.
Scientific computing applications, such as fluid dynamics simulations, molecular modelling, and AI
training, require efficient load balancing across CPU cores.
This is particularly useful in supercomputing environments, where optimizing CPU utilization can
lead to significant performance improvements.
By studying how function execution patterns change with different sampling rates (core counts),
engineers can apply these concepts to real-world DSP optimizations.
Currently, the simulation is CPU-based, but GPU acceleration using CUDA or OpenCL could be
implemented to:
The visualization techniques used in this experiment could be adapted for real-time system
monitoring, allowing:
10. Conclusion
10.1 Insights Gained from the Simulation
This study successfully developed a Python-based virtual CPU simulation, demonstrating the
principles of multi-threaded execution, numerical precision, and real-time workload visualization.
The key findings from the experiment include:
o The simulation provided an interactive way to observe how CPU cores process
mathematical functions in parallel.
o Real-time graphs (heatmaps, bar graphs, and polar plots) highlighted workload
distribution across CPU cores.
o This was traced to phase misalignment, aliasing, and numerical precision errors.
o The anomalies observed closely resemble real-world parallel computing issues, such
as:
These insights confirm that simulated CPU workload distributions can provide valuable
computational lessons in both theoretical and practical domains.
While initially perceived as an anomaly, the irregular graph distortions revealed deeper
computational insights applicable to various fields:
• Future researchers can extend this model to analyze race conditions and deadlocks in multi-
threaded systems.
• The effects of floating-point errors observed in this study parallel real-world computational
physics and numerical modeling issues.
• GPU-based deep learning models face similar precision loss when training on large datasets.
• This experiment highlights the need for phase-aware execution models to prevent
numerical drift in AI computations.
Thus, the findings have broader implications for real-world applications, beyond just CPU
simulations.
The graphical visualization of multi-core CPU workloads makes this simulation an ideal educational
tool for teaching:
1. Computer Science Students
• Extending the model to GPU-based parallel execution using CUDA or OpenCL would allow
massively parallel processing.
• This could be used to analyze AI workloads, fluid dynamics simulations, and cryptographic
computations.
• The heatmap and bar graph visualizations could be adapted for live CPU performance
tracking in server farms and cloud computing platforms.
• Integration with real-time telemetry systems could provide advanced workload diagnostics.
• Implementing adaptive precision scaling (e.g., switching between float64 and float128
based on error thresholds) could enhance computational accuracy.
These future expansions would significantly enhance the practical utility of the current virtual CPU
simulation.
• Floating-point precision and phase alignment are crucial for ensuring stable function
execution.
• Graphical representations of CPU execution serve as a powerful tool for education and
computational research.
By addressing both practical and theoretical aspects of CPU workload simulation, this study bridges
the gap between scientific computing, engineering applications, and educational tools.
11. References
The following references include scientific literature, official documentation, and research papers
relevant to the topics covered in this study. They provide supporting evidence for the computational
phenomena observed in the experiment and extend the discussion on multi-threaded execution,
numerical precision, and real-time visualization techniques.
2. Silberschatz, A., Galvin, P. B., & Gagne, G. (2018). Operating System Concepts (10th Edition).
Wiley.
4. McCool, M., Reinders, J., & Robison, A. (2012). Structured Parallel Programming: Patterns for
Efficient Computation. Elsevier.
o Introduces strategies for optimizing parallel workloads in CPU and GPU computing.
6. Overton, M. L. (2001). Numerical Computing with IEEE Floating Point Arithmetic. SIAM.
o Discusses numerical accuracy, precision loss, and error propagation in scientific
computing.
7. Kahan, W. (1996). The Baleful Effects of Computer Benchmarks upon Applied Mathematics,
Physics, and Chemistry. University of California, Berkeley.
9. Proakis, J. G., & Manolakis, D. G. (2007). Digital Signal Processing: Principles, Algorithms, and
Applications (4th Edition). Pearson.
• Explores the use of Matplotlib for dynamic visualization and function plotting.
11. Tufte, E. R. (2001). The Visual Display of Quantitative Information. Graphics Press.
12. Van Rossum, G., & Drake, F. L. (2009). The Python Language Reference Manual. Python
Software Foundation.
14. Dean, J., & Ghemawat, S. (2008). MapReduce: Simplified Data Processing on Large Clusters.
Communications of the ACM, 51(1), 107-113.
A
• Aliasing
• Aliasing is a phenomenon in signal processing and numerical computation where
an insufficient sampling rate leads to distorted or misleading data representations.
It occurs when a signal is sampled at a rate lower than twice its highest frequency
component, violating the Nyquist-Shannon sampling theorem.
• In this experiment, aliasing was responsible for the erratic distortions observed in
the polar function execution graph when changing the number of CPU cores.
• Anti-Aliasing
• A set of techniques used to reduce aliasing artifacts in computational systems. In
this experiment, oversampling and averaging were proposed to mitigate aliasing
effects in function execution graphs.
B
• Bottleneck (Computational Bottleneck)
• A bottleneck occurs when one part of a system limits overall performance due to
resource constraints (e.g., CPU, memory, or I/O speed). In parallel computing,
bottlenecks can arise from uneven workload distribution, thread synchronization
delays, or memory bandwidth limitations.
• The heatmap and bar graph in the experiment visually indicated potential CPU
bottlenecks caused by uneven load distribution.
C
• Cache (CPU Cache)
• A small, high-speed memory storage inside a CPU used to store frequently accessed
data and instructions. Cache memory significantly reduces access latency compared
to retrieving data from RAM.
• Although the experiment does not simulate CPU caching mechanisms, real-world
CPUs rely heavily on caches to optimize function execution speed.
• Catastrophic Cancellation
• A numerical error that occurs when subtracting two nearly equal numbers, leading to
a significant loss of precision in floating-point arithmetic.
• This error was indirectly referenced in the study when discussing floating-point
precision issues that contributed to erratic function plots.
D
• Daemon Thread
• A thread that runs in the background and automatically terminates when the main
program exits.
• In this experiment, each CPU core was simulated as a daemon thread, ensuring that
all core processes terminated properly when the simulation ended.
• Deadlock
• A situation in multi-threaded programming where two or more threads become stuck
in a waiting state, each waiting for a resource held by the other.
• While not observed in this experiment, deadlocks are a significant concern in real-
world parallel computing.
E
• Error Propagation
• The accumulation of small numerical errors over multiple iterative calculations,
leading to significant inaccuracies in results.
• This issue was observed when floating-point precision errors distorted function
execution in the polar graph.
F
• Fourier Transform
• A mathematical technique used to decompose complex waveforms into their
fundamental frequency components. The distortions observed in function execution
resemble Fourier analysis artifacts, commonly seen in signal processing.
• Frequency Components
• The individual sine or cosine waves that, when combined, form a more complex
waveform. In this experiment, the function sin(6θ) + 0.5 sin(3θ) contained
multiple frequency components, which interacted with different core counts, leading
to distortions.
G
• Gantt Chart
• A type of bar chart used to visualize task execution timelines in project management
and computing. In this study, a Gantt-like representation was used to show CPU
task execution timelines.
• Global Interpreter Lock (GIL)
• A mutex (mutual exclusion lock) in Python that prevents multiple threads from
executing Python bytecode simultaneously, limiting true parallelism in multi-
threaded Python programs.
• Although Python’s GIL prevents true multi-core execution, the experiment
successfully simulated multi-core behavior using threading and data locks.
• H
• Hyper-Threading
• A CPU technology that enables a single physical CPU core to execute multiple
threads concurrently, improving efficiency.
• Although not directly implemented in the simulation, hyper-threading is an important
real-world technique for multi-threaded execution optimization.
I
• IEEE 754 Standard
• A widely used floating-point arithmetic standard that defines how numbers are
stored and calculated in digital computers. The rounding errors observed in
function execution were caused by limitations in IEEE 754 precision.
• Instruction Pipeline
• A technique used in modern CPUs where multiple instructions are fetched, decoded,
and executed in parallel stages, improving computational speed.
• The experiment does not simulate hardware-level pipelining, but similar effects
were observed when multiple cores executed function evaluations concurrently.
L
• Load Balancing
• A method used in parallel computing to distribute workloads evenly across multiple
CPU cores to maximize efficiency.
• In the experiment, the bar graph visualization demonstrated how function execution
was distributed across cores, revealing potential imbalances.
N
• Nyquist-Shannon Sampling Theorem
• A fundamental theorem in signal processing stating that a signal must be sampled at
at least twice its highest frequency to be accurately reconstructed.
• The aliasing artifacts observed in the experiment were a direct result of sampling
errors violating this theorem.
O
• Oversampling
• A technique where a signal (or function) is sampled at a higher rate than necessary
to improve accuracy and reduce aliasing effects.
• Oversampling was implemented in the polar function execution plot to stabilize
distorted function curves.
P
• Phase Offset
• A shift in the starting angle of a periodic function. In this experiment, modifying the
core count changed the phase offsets, leading to function misalignment and
distortions.
• Precision Scaling (Adaptive Precision)
• A technique where a system dynamically adjusts numerical precision based on error
thresholds.
• Future extensions of this study could implement adaptive precision scaling to
minimize computational errors dynamically.
R
• Race Condition
• A programming issue where multiple threads access shared data concurrently,
leading to unpredictable behavior.
• In the simulation, thread locks (data_lock) were used to prevent race conditions
when updating core workload values.
S
• Sampling Rate
• The number of data points collected per unit time when evaluating a function. If too
low, it results in aliasing; if too high, it increases computational overhead.
• The experiment demonstrated how improper sampling altered the function plots
when changing core counts.
T
• Taylor Series Expansion
• A mathematical technique used to approximate functions using polynomials. Many
trigonometric functions in CPUs (such as sin(x), cos(x)) are computed using
Taylor series.
• Errors in the expansion process contribute to floating-point inaccuracies in real-
world CPU computations.
• Thread Synchronization
• The process of ensuring that multiple threads access shared resources safely. The
use of thread locks (data_lock) in this experiment prevented conflicting data
updates.