A case study on using AI with benchmarking-driven methodology to optimize GPU-accelerated space charge calculations
Published on July 30, 2025
In accelerator physics, the particles approach the speed of light, but the pace of software innovation is often slow. Despite being a field at the forefront of physics research, we often rely on incredibly old software, some of it written before I was born.
Meanwhile, outside our bubble, the AI world has been moving at the speed of light. AI-assisted coding is rapidly changing how software developers work, but in physics its adoption faces much resistance. Many dismiss it as unreliable “vibe coding” that generates code that looks correct but is subtly broken.
I wanted to know whether these LLMs could do more than vibe coding. Could they actually help build the kind of high-precision, high-performance physics solvers we can rely on? So I pointed Cursor and Claude at our space-charge problem, forced them through a measurement-driven loop, and ended the night with a package that runs over a thousand times faster.
In accelerator physics, “space charge” refers to the collective electromagnetic field generated by a beam of charged particles moving together. As the particles travel, they repel each other and push the beam apart, potentially degrading beam quality or even causing instabilities. Accurately computing and tracking these space charge effects is critical for designing and operating high-intensity accelerators.
My goal was to create a high-performance Julia package for space charge calculations, SpaceCharge.jl. The starting point was an established Fortran code, OpenSpaceCharge, which is used for space charge tracking in Bmad. I wanted to port it to Julia to leverage modern features like automatic differentiation and, most importantly, easy GPU acceleration.
I started by asking AI to translate the Fortran code to Julia. To my surprise, it got the bulk of the algorithm right immediately. It understood the physics concept and the mathematical operations, correctly translating the core logic of Green’s function integration and FFT convolutions.
However, the initial translation was far from perfect. The AI missed many Julia-specific quirks.
I had to manually intervene to fix these issues and get a working baseline. Part of it was on me: I asked for a translation without further specifications. Part of it was on Julia being a niche language. And part of it was on the limitations of the foundation model.
The AI correctly translated the core logic but missed performance nuances. For example, Fortran stores multidimensional arrays in column-major order, which matches Julia. However, the AI often introduced implicit copies:
```julia
# AI-generated code (inefficient copy)
rho = particles[:, 1]        # creates a copy!

# Corrected efficient code (view)
rho = @view particles[:, 1]  # zero-cost view
```
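You can measure the difference in plain Julia with no extra packages; this is a small standalone check, not part of SpaceCharge.jl:

```julia
# Standalone check that slicing copies while @view does not.
# @allocated reports the heap bytes allocated by an expression.
function slice_vs_view(particles)
    copy_bytes = @allocated particles[:, 1]        # slice: allocates a new vector
    view_bytes = @allocated @view particles[:, 1]  # view: (almost) nothing
    return copy_bytes, view_bytes
end

particles = rand(100_000, 6)
slice_vs_view(particles)                  # warm-up call to exclude compilation
copy_bytes, view_bytes = slice_vs_view(particles)
println((copy_bytes, view_bytes))         # copy is ~800 kB, view only a few bytes
```

In a hot loop called thousands of times per step, those hidden copies add up to gigabytes of allocation and constant garbage-collector pressure.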
Now I had working code, but it was way too slow. Even slower than the original Fortran code!
Space charge calculations are computationally expensive. In a typical simulation, we might track a million particles over thousands of steps. If the space charge calculation takes even a few seconds, the full simulation becomes impossibly slow. In practice, space charge calculations can take as much as 90% of the total simulation time.
My initial Julia implementation was working, but the performance was disappointing: the GPU version was only about 7-8x faster than the unoptimized CPU code. This was nowhere near good enough. For practical particle tracking, we need sub-second performance.
I needed a way to optimize this drastically. But instead of spending a week manually tweaking the code, I assigned AI to the job with a systematic approach.
I was inspired by the methodology from AlphaEvolve, which combines LLMs with genetic algorithms. Traditionally, a genetic algorithm works on problems with clear, quantifiable parameters: a population of candidate solutions is iteratively improved through genetic operations such as mutation (small, random changes) and crossover (mixing parts of two solutions). At each step, a well-defined objective function determines which solutions are better, enabling the algorithm to select and breed the fittest candidates.
AlphaEvolve expands on this by introducing LLMs into the genetic algorithm. It uses LLMs to propose code mutations and as “judges” for objectives that are hard to capture with simple metrics, such as code readability. Crucially, the method extends genetic algorithms beyond merely tuning parameters: the optimization happens in “program space”. The entities being searched over are computer programs themselves, or even programs that generate solutions or provide heuristics for the problem we want to solve.
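As a toy illustration of the loop (purely schematic: a numeric stand-in replaces the LLM mutation step, and there is no crossover or LLM judging), the skeleton looks like this:

```julia
# Schematic genetic loop. In AlphaEvolve-style systems, `mutate` would be
# an LLM proposing a code edit and `score` a benchmark of the mutated
# program; here both are numeric stand-ins to keep the sketch runnable.
mutate(x) = x + 0.1 * randn()

function evolve(population, score; generations=50)
    for _ in 1:generations
        children = map(mutate, population)       # propose mutations
        pool = vcat(population, children)        # parents compete with children
        sort!(pool; by=score)                    # lower score = fitter
        population = pool[1:length(population)]  # select survivors
    end
    return population
end

best = first(evolve([5.0, -3.0], abs))  # toy objective: minimize |x|
```

Because parents stay in the selection pool, the best score can never get worse from one generation to the next; that same monotonicity is what the benchmark-gated loop below enforces for code changes.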
My methodology had four key components:
Before optimizing anything, I wrote comprehensive benchmarking scripts and tests. I measured runtime and accuracy. All tests must pass for changes to be adopted.
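Concretely, the gate every change had to pass looked roughly like this (illustrative names and a toy workload, not the actual SpaceCharge.jl test suite):

```julia
# A candidate implementation is adopted only if it (a) matches a trusted
# reference to tight tolerance and (b) beats the current best time.
function accept_change(candidate, reference, best_time; rtol=1e-10)
    x = rand(10_000)
    # accuracy gate: all tests must pass before speed even matters
    isapprox(candidate(x), reference(x); rtol=rtol) || return false
    candidate(x)                          # warm up (exclude compilation)
    return (@elapsed candidate(x)) < best_time
end

reference(x) = sum(abs2, x)
fast(x)      = sum(abs2, x)   # stand-in "optimized" version
wrong(x)     = sum(x)         # fast but inaccurate - must be rejected

accept_change(fast, reference, 1.0)   # accurate and fast enough -> accepted
accept_change(wrong, reference, 1.0)  # fails the accuracy gate -> rejected
```

The point of the ordering is that the agent can never trade correctness for speed: a faster-but-wrong mutation is rejected before it is ever timed.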
I used files as an external memory for the AI. I asked the AI to analyze the code and generate an optimization plan in a markdown file (which became Issue #1). This document tracked the candidate optimizations and their status.
This gave the AI persistent context. Instead of hallucinating new solutions every time, it could reference the plan. The AI was also allowed to edit the plan as it discovered new strategies during the process.
I asked the coding agent to implement one optimization at a time, document every change it made, run the tests until they passed, and record the benchmarking results. The AI agent had access to all the tools it needed and performed these actions autonomously. No human participation was needed; I was away from the keyboard.
Key Message: This methodology turns AI from a “code generator” into a “systematic optimizer.”
Let’s look at two concrete examples of how this process played out.
### `deposit!` Function Optimization

The `deposit!` function maps particle charges onto the grid. It was a major bottleneck (5.3 s on CPU).
The Process (from Issue #3): the AI parallelized the deposition loop, discovered a race condition where multiple threads update the same grid cell, and resolved it with atomic operations (`Atomix.@atomic`). Using atomic operations on the GPU fixed the race condition without destroying performance:
```julia
# Before (race condition: concurrent threads update the same cell)
grid[idx] += charge

# After (atomic update - thread safe)
Atomix.@atomic grid[idx] += charge
```
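On the CPU side, atomics are one option; another common pattern, sketched here with illustrative names (this is not the actual SpaceCharge.jl code, and it uses nearest-grid-point deposition rather than a real scheme that spreads each charge over neighboring cells), is to give each thread a private buffer and reduce afterwards, trading memory for contention-free writes:

```julia
using Base.Threads

# Race-free parallel charge deposition via per-thread buffers.
# idxs[k] is the grid cell of particle k, charges[k] its charge.
function deposit_threaded!(grid, idxs, charges)
    buffers = [zero(grid) for _ in 1:nthreads()]    # one private grid per thread
    @threads :static for k in eachindex(idxs)
        buffers[threadid()][idxs[k]] += charges[k]  # no other thread touches this buffer
    end
    for b in buffers                                # serial reduction into the grid
        grid .+= b
    end
    return grid
end
```

The `:static` schedule keeps each iteration pinned to one thread, which makes indexing `buffers` by `threadid()` safe; the reduction cost is paid once per call instead of per particle.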
### `free_space.jl` Optimization

The solver in this file computes the free-space Poisson solution using FFTs. The bottleneck was massive memory allocation (~13 GB per call).
The Process (from Issue #1): the AI traced the allocations to temporary arrays created on every call and replaced them with pre-allocated, reusable buffers. Result: memory usage dropped to near zero, and speed skyrocketed.
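The pattern itself is simple: allocate every work array once, then mutate in place on each call. A schematic sketch (illustrative types with the FFT steps elided as comments; not the actual SpaceCharge.jl implementation):

```julia
# Pre-allocated workspace for an FFT-based Poisson solve: the padded
# charge array and the transformed Green's function are allocated once.
struct PoissonWorkspace{T}
    rho_padded::Array{Complex{T},3}   # zero-padded charge density
    green_hat::Array{Complex{T},3}    # Green's function in Fourier space
end

PoissonWorkspace{T}(n::Int) where {T} =
    PoissonWorkspace{T}(zeros(Complex{T}, 2n, 2n, 2n),
                        zeros(Complex{T}, 2n, 2n, 2n))

function solve!(phi, ws::PoissonWorkspace, rho)
    fill!(ws.rho_padded, 0)              # reuse the buffer, never reallocate
    ws.rho_padded[axes(rho)...] .= rho   # copy charge into the padded grid
    # In a real solver: in-place forward FFT of ws.rho_padded (with a
    # cached plan), pointwise multiply by ws.green_hat, in-place inverse FFT.
    phi .= real.(view(ws.rho_padded, axes(phi)...))
    return phi
end
```

With the FFT steps elided, `solve!` just round-trips the density through the padded buffer, but the structure is the point: after the workspace is built, the hot path performs no allocations at all.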
The final numbers were staggering. By rigorously following the analyze-plan-implement-benchmark loop, we achieved improvements that I honestly didn’t think were possible without weeks of manual work.
Final Performance (from Issue #4):
Baseline: 5.3s for 100k particles on CPU, 0.7-5.4s on GPU
Optimized full step (deposit + solve + interpolate field):
| Grid Size | Particles | CPU (ms) | GPU (ms) | Speedup |
|---|---|---|---|---|
| 32³ | 10k | 29.35 | 1.35 | 21.8x |
| 32³ | 100k | 31.56 | 1.42 | 22.2x |
| 64³ | 100k | 270.0 | 11.26 | 24.0x |
| 64³ | 1M | 299.21 | 11.9 | 25.1x |
| 128³ | 100k | 3186.89 | 91.94 | 34.7x |
| 128³ | 1M | 3458.98 | 92.66 | 37.3x |
Key Achievements: every configuration now completes a full step in under 100 ms on the GPU, a 22-37x speedup over the already-optimized CPU code and orders of magnitude faster than the 5.3 s unoptimized baseline.
Here are some things that worked out for me in this project:
Context is King: Write a design document or Product Requirements Document (PRD) before you jump into coding. It serves as long-term memory for the AI agent, preventing it from getting lost in the weeds. It also forces you to think about high-level system design, which is increasingly valuable in the age of AI.
Tests and benchmarks are critical: they give AI agents immediate feedback to self-correct. Once the guardrails were in place, the AI could be much more autonomous and required less human intervention.
Methodology matters as much as model quality. Good context-engineering practices can improve performance significantly. This case study shows that scientific methodology applies to AI-assisted development too.
If you’re a scientist skeptical of AI coding, I encourage you to try this approach. Don’t just chat with the bot. Build a test harness, demand a plan, and verify everything. You might be surprised at how fast you can travel.