diff --git a/_blog.yml b/_blog.yml
index 6609323e60..e3e7d6f185 100644
--- a/_blog.yml
+++ b/_blog.yml
@@ -6361,3 +6361,15 @@
     - community
     - research
     - open-source-collab
+
+- local: shaping-laser-pulses
+  title: "Shaping Laser Pulses with Reinforcement Learning"
+  author: fracapuano
+  thumbnail: https://huggingface.co/datasets/fracapuano/rlaser-assets/resolve/main/assets/Figure1.png
+  date: July 16, 2025
+  tags:
+    - reinforcement-learning
+    - physics
+    - ml-for-science
+    - research
+    - open-source
diff --git a/shaping-laser-pulses.md b/shaping-laser-pulses.md
new file mode 100644
index 0000000000..00ce48e906
--- /dev/null
+++ b/shaping-laser-pulses.md
@@ -0,0 +1,116 @@
+---
+title: Shaping Laser Pulses with Reinforcement Learning
+thumbnail: https://huggingface.co/datasets/fracapuano/rlaser-assets/resolve/main/assets/Figure1.png
+authors:
+- user: fracapuano
+---
+
+# Table of Contents
+- [TL;DR](#tl-dr)
+- [Shaping Laser Pulses](#shaping-laser-pulses)
+- [Automated approaches](#automated-approaches)
+- [BO's limitations](#bos-limitations)
+- [RL to the rescue](#rl-to-the-rescue)
+
+
+## TL; DR:
+We train a Reinforcement Learning agent to **optimally shape laser pulses** for intensity maximization, directly from readily available diagnostic images and across a range of dynamics parameters.
+Our method **(1) completely bypasses imprecise reconstructions** of ultra-fast laser pulses, **(2) can learn to be robust to varying dynamics** and **(3) prevents erratic behavior** at test time by training in coarse simulation only.
+
+
+ Phase changes animation +

(A) Schematic representation of the RL pipeline for pulse shaping in HPL systems. (B) Illustration of the process of linear and non-linear phase accumulation taking place along the pump-chain of laser systems.

+
+
+By suitably controlling the phase imposed at the stretcher, one can benefit from both energy and duration gains, for maximal peak intensity.
+
+---
+
+## Shaping Laser Pulses
+
+Ultra-fast light-matter interactions, such as those in laser-plasma physics and nonlinear optics, require precise shaping of the temporal pulse profile.
+Optimizing such profiles is one of the most critical tasks in establishing control over these interactions.
+Typically, the highest intensities are achieved by compressing a pulse to its transform-limited (TL) shape, while some interactions may require arbitrary temporal shapes different from the TL profile (mainly to protect the system from potential damage).
+
+
+
+ Phase changes animation +

Changes in the spectral phase applied on the input spectrum (left) have a direct impact on the temporal profile (right).

+
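The link between spectral phase and temporal profile illustrated above can be sketched numerically. The snippet below is a minimal illustration, not the simulator's code: it assumes a Gaussian spectrum, writes the spectral phase as the usual Taylor expansion whose second-, third- and fourth-order coefficients are the GDD, TOD and FOD, and obtains the temporal intensity with an inverse FFT. The function names, grid and numerical values are all hypothetical and chosen only for the example.

```python
import numpy as np


def spectral_phase(detuning, gdd, tod, fod):
    """Spectral phase as a Taylor expansion in the detuning (omega - omega0).

    gdd, tod and fod are the 2nd-, 3rd- and 4th-order coefficients
    (units of ps^2, ps^3, ps^4 if the detuning is in rad/ps).
    """
    return (gdd / 2.0) * detuning**2 + (tod / 6.0) * detuning**3 + (fod / 24.0) * detuning**4


def temporal_intensity(spectral_amplitude, phase):
    """Temporal intensity |E(t)|^2 from spectral amplitude and phase."""
    field_w = spectral_amplitude * np.exp(1j * phase)
    field_t = np.fft.fftshift(np.fft.ifft(np.fft.ifftshift(field_w)))
    return np.abs(field_t) ** 2


# Illustrative (assumed) setup: Gaussian spectrum on a symmetric detuning grid.
detuning = np.linspace(-40.0, 40.0, 2048)        # rad/ps, hypothetical grid
amplitude = np.exp(-detuning**2 / (2 * 2.0**2))  # hypothetical spectral width

# Flat phase gives the transform-limited pulse; a non-zero GDD stretches it.
tl_pulse = temporal_intensity(amplitude, spectral_phase(detuning, 0.0, 0.0, 0.0))
chirped_pulse = temporal_intensity(amplitude, spectral_phase(detuning, 1.0, 0.0, 0.0))

print("peak ratio (chirped / TL):", chirped_pulse.max() / tl_pulse.max())
```

Because the phase only redistributes energy in time, both pulses carry the same total energy while the chirped one has a lower peak. This is why tuning the phase coefficients is a handle on peak intensity.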
+
+In this work, we shape laser pulses by varying the group delay dispersion (GDD), third-order dispersion (TOD) and fourth-order dispersion (FOD) coefficients, effectively tuning the applied spectral phase to minimize the temporal pulse duration.
+
+
+
+## Automated approaches
+
+The most common automated approaches to laser pulse shape optimization employ black-box algorithms, such as Bayesian Optimization (BO) and Evolutionary Strategies (ES). These algorithms are typically used in a closed feedback loop between the pulse shaper and various measurement devices.
+
+For pulse duration minimization, numerical methods including BO and ES require precise reconstruction of the temporal shape, either to measure the loss against a target temporal profile or to obtain derived metrics such as the full width at half maximum (FWHM) duration or the peak intensity.
+
+Recently, approaches based on BO have gained popularity because of their broad applicability and sample efficiency over ES, often requiring a fraction of the function evaluations to obtain comparable performance.
+Indeed, in automated pulse shaping, each function evaluation requires one (or more) real-world laser bursts. Therefore, methods that directly optimize real-world operational hardware are evaluated on the number of interactions they require.
+
+### BO's limitations
+
+While effective, BO suffers from limitations related to (1) the need for precise pulse reconstruction, (2) transferability across changing dynamics, and (3) machine safety. To a large extent, these limitations are even more severe for other methods such as ES.
+
+#### 1. Imprecise pulse reconstruction
+BO requires accurate measurements of the current pulse shape to guide optimization. However, real-world pulse reconstruction techniques can be **noisy or imprecise**, leading to poor state estimation and a higher risk of applying suboptimal controls.
+
+
+ Phase changes animation +

Temporal profiles with temporal-domain reconstructed phase (top) versus diagnostic measures of the burst status (bottom), in the form of FROG traces. Image source: Zahavy et al., 2018.

+
+
+#### 2. Dependency on the dynamics
+BO typically optimizes for specific system parameters and **doesn't generalize well when laser dynamics change**. Each new experimental setup or parameter regime may require re-optimizing the process from scratch!
+
+This is because standard BO optimizes a (typically scalar) loss function under stationarity assumptions, which can prove problematic for pulse shaping. Day-to-day changes in the experimental setup can quite reasonably result in non-stationarity: **the same control, when applied in different experimental conditions, can yield significantly different results**.
+
+
+ Phase changes animation +

Impact of experimental conditions only, in this case a non-linearity parameter known as "B-integral", on the end-result of applying the same control.

+
+
+#### 3. Erratic exploration
+
+BO can endanger the system by applying **abrupt controls at initialization**. Controls are applied as temperature gradients on a gated-optical fiber, so successive controls cannot differ significantly: the temperature gradient cannot change arbitrarily from one step to the next.
+
+
+
+ BO temporal profile +
+
+ BO exploration +
+
+

BO: (left) temporal profile obtained by probing points from the parameter space and (right) evolution of the probed points as the parameter space is explored.

+
+## RL to the rescue
+
+In this work, we address all these limitations by **(1) learning policies directly from readily available images**, capable of **(2) working across varying dynamics**, and **(3) trained in coarse simulation to prevent erratic behavior** at test time.
+
+First, (1) we train our RL agent directly on readily available diagnostic measurements in the form of 64x64 images. This means we can **entirely bypass the reconstruction noise** arising from numerical methods for temporal pulse-shape reconstruction, learning straight from single-channel images.
+
+
+ +

Control is applied directly from images, thus learning to adjust to unmodeled changes in the environment.

+
+
+Further, (2) by training on diverse scenarios, RL can develop **safe and general control strategies** that adapt to a range of different dynamics. In turn, this allows running control policies, and updating them live, across experimental conditions.
+
+ +

We retain a high level of performance (>70%) even for larger levels of non-linearity in the system (above 5, a fictional regime). This shows that performance can be retained by applying a proper randomization technique.

+
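The randomization mentioned above can be sketched as drawing a fresh set of dynamics parameters at the start of every training episode, so that the policy never sees a single fixed system. The snippet below is a minimal sketch under stated assumptions: the function name, the dictionary key and the sampling range are hypothetical placeholders, not the values or the API used in the paper or in the simulator.

```python
import numpy as np


def sample_dynamics(rng, b_integral_range=(0.0, 5.0)):
    """Draw the dynamics parameters for one training episode.

    The range is a placeholder chosen for illustration only.
    """
    low, high = b_integral_range
    return {"b_integral": float(rng.uniform(low, high))}


# Hypothetical training loop skeleton: one draw per episode.
rng = np.random.default_rng(seed=0)
episode_dynamics = [sample_dynamics(rng) for _ in range(5)]

for episode, dynamics in enumerate(episode_dynamics):
    # A simulator would be reset with these parameters here
    # before the agent starts interacting with it.
    print(f"episode {episode}: B-integral = {dynamics['b_integral']:.2f}")
```

Sampling once per episode, rather than once per training run, is what exposes the agent to many different dynamics while keeping each individual episode internally consistent.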
+
+Lastly, (3) by learning in a coarse simulation, we can **drastically limit the number of interactions at test time**, preventing erratic behavior that would endanger the system's safety.
+
+
+ +

Controls applied (BO vs RL). As it samples from an iteratively-refined surrogate model of the objective function, BO explores much more erratically than RL.

+
+
+In conclusion, we demonstrate that deep reinforcement learning can master laser pulse shaping by learning **robust policies from raw diagnostics**, paving the way towards **autonomous control of complex physical systems**.
+
+If you're interested in learning more, check out [our latest paper](https://huggingface.co/papers/2503.00499), our [simulator's code](https://github.com/fracapuano/gym-laser), and try out the [live demo](https://huggingface.co/spaces/fracapuano/RLaser).
\ No newline at end of file