Theodoros Panagiotakopoulos

Experience

Modeling Product Engineer, ASML, Silicon Valley

At ASML, I built a data processing and visualization framework to evaluate machine learning optics models against rigorous simulations at scale. In the production workflow, the optimizer does not evaluate on validation data during training because doing so would significantly increase runtime and memory usage. That constraint creates a real risk. A model can appear to improve during training while silently overfitting, and the training loop will not reveal it. I designed an independent evaluation pipeline that loads intermediate checkpoints, aligns them with rigorous reference outputs, applies scenario specific EUV and DUV sanity filters to reject unphysical values, and then computes physically meaningful error metrics that directly reflect optical fidelity. The core metric I emphasized was the aerial image RMSE, because it is both physically interpretable and sensitive to generalization failure. A model that memorizes training conditions tends to break when evaluated across the true sources of variation in the dataset, including slits, groups, and queueTags. My pipeline explicitly separates training and validation populations, computes aerial image RMSE across these axes, and aggregates results into compact summary tables that make failure modes obvious rather than hidden in averages. This turned model evaluation into an engineering diagnostic instead of a one number report. I engineered the system to handle the reality of the workload. The pipeline had two fundamentally different bottlenecks depending on the stage. The ingestion phase was dominated by I/O overhead because it required reading hundreds of intermediate checkpoint artifacts and merging them with rigorous reference datasets. I parallelized that stage using ThreadPoolExecutor to overlap many small file reads and merge operations without serial waiting, which substantially accelerated end to end throughput along a simulation path. After ingestion, the bottleneck shifted to computation. 
Computing aerial image RMSE and related metrics across slits, groups, and queueTags is CPU heavy, and the largest runs could exceed single machine memory limits when aggregating high resolution data across many conditions. I used Pandas as the default engine because it integrates cleanly with plotting, CSV export, and downstream analysis, but I treated it as the baseline rather than the ceiling. When DataFrame scale became the limiter, I introduced a layered execution strategy. I used ProcessPoolExecutor for compute parallelism where it provided the best speedup, and for the largest multi hundred gigabyte aggregations I switched to Dask so computations could be chunked and distributed across cores without memory thrashing. This preserved the pandas style workflow while removing the single node constraints. The final result was a terabyte scale CNN analysis pipeline that was fast, stable, and operationally usable by others. It reduced analysis time by roughly 80%, enabled routine large scale validation that previously could not be run often enough, and exposed critical overfitting that was invisible inside the training loop. My team adopted the framework as a standard tool for large scale ML and simulation experiments because it made model quality measurable, comparable, and difficult to misinterpret.
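The ingestion and per-axis RMSE pattern described above can be sketched as follows. This is a minimal illustration, not the production code: the JSON artifact layout, the field names (split, slit, predicted, reference), and the helper names are all hypothetical, and the data is synthetic.

```python
import json
import math
import tempfile
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def load_checkpoint(path):
    """Read one checkpoint artifact (hypothetical JSON layout) from disk."""
    with open(path) as fh:
        return json.load(fh)


def load_all(paths, max_workers=8):
    """Overlap many small file reads; threads suit I/O-bound work."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(load_checkpoint, paths))


def rmse_by_group(records, key):
    """RMSE of (predicted - reference), keyed by (split, value of `key`)."""
    sq_sum = defaultdict(float)
    count = defaultdict(int)
    for rec in records:
        group = (rec["split"], rec[key])
        for pred, ref in zip(rec["predicted"], rec["reference"]):
            sq_sum[group] += (pred - ref) ** 2
            count[group] += 1
    return {g: math.sqrt(sq_sum[g] / count[g]) for g in sq_sum}


def demo():
    """Write synthetic artifacts to a temp dir, then load and summarize."""
    with tempfile.TemporaryDirectory() as tmp:
        paths = []
        for i in range(4):
            rec = {
                "split": "train" if i < 2 else "validation",
                "slit": i % 2,
                "reference": [0.0, 1.0],
                # Synthetic: validation predictions are offset more than training.
                "predicted": [0.1, 1.1] if i < 2 else [0.3, 1.3],
            }
            path = Path(tmp) / f"ckpt_{i}.json"
            path.write_text(json.dumps(rec))
            paths.append(path)
        return rmse_by_group(load_all(paths), key="slit")
```

Keeping the split in the group key is what makes a train versus validation gap visible per slit instead of being averaged away.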

I developed a deep learning framework to measure similarity between lithography simulation images using triplet learning. The goal was to replace brittle pixel level comparisons with a representation that captures pattern structure in a way that aligns with how lithography quality is judged in practice. I trained a ResNet18 convolutional model to project images into a compact embedding space where visually and structurally similar patterns are mapped close together and dissimilar patterns are pushed apart. I built the pipeline end to end, including preprocessing, data organization, training controls, and exportable analysis outputs. I applied consistent input transformations including resizing, grayscale conversion, normalization, and tensor conversion so the network sees a stable representation across simulation sources. The framework automatically partitions data into training and validation sets and uses early stopping so convergence is reliable rather than driven by overtraining. A key technical choice was how I constructed training examples. I generated hard triplets to force the model to learn the subtle features that actually distinguish lithography patterns. The anchor and positive samples come from the same pattern group, while the negative sample is intentionally selected from a different group that is visually similar. This design prevents the network from learning trivial shortcuts and instead drives it to separate patterns based on fine scale structure that matters for fidelity. In the embedding space, I used cosine similarity as the similarity measure because it captures structural alignment while being less sensitive to global intensity scaling. That is important in lithography, where two images can have comparable structure even when absolute brightness differs due to exposure and normalization effects.
After training, the framework extracts embeddings for each pattern group, computes pairwise and group level cosine similarities, and exports the results to structured CSV files that integrate cleanly with downstream analysis and reporting. The resulting embeddings are not only useful for similarity scoring. They provide a reusable representation that supports clustering, anomaly detection, and model validation, which extends the framework into a general tool for organizing simulation outputs and diagnosing model behavior within lithography workflows.
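Hard triplet selection with cosine similarity can be sketched as below, operating on embeddings that have already been computed. The function names, the margin value, and the choice of the least similar same-group sample as the positive are assumptions made for illustration; the text only states that the negative is a visually similar sample from a different group.

```python
import numpy as np


def cosine_similarity_matrix(emb):
    """Pairwise cosine similarity between the rows of an (n, d) array."""
    unit = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    return unit @ unit.T


def hard_triplets(emb, groups):
    """Return (anchor, positive, negative) index triples.

    Positive: least similar sample in the anchor's group (assumption).
    Negative: most similar sample from any other group.
    """
    groups = np.asarray(groups)
    sim = cosine_similarity_matrix(emb)
    triplets = []
    for a in range(len(groups)):
        same = groups == groups[a]
        same[a] = False  # an anchor cannot be its own positive
        diff = groups != groups[a]
        if not same.any() or not diff.any():
            continue
        pos = np.flatnonzero(same)[np.argmin(sim[a, same])]
        neg = np.flatnonzero(diff)[np.argmax(sim[a, diff])]
        triplets.append((a, int(pos), int(neg)))
    return triplets


def triplet_loss(emb, triplets, margin=0.2):
    """Mean hinge loss on cosine similarity: max(0, s(a,n) - s(a,p) + margin)."""
    sim = cosine_similarity_matrix(emb)
    losses = [max(0.0, sim[a, n] - sim[a, p] + margin) for a, p, n in triplets]
    return float(np.mean(losses))
```

Because each embedding is normalized before the dot product, multiplying all embeddings by a constant leaves the similarity matrix unchanged, which is the insensitivity to global scaling mentioned above.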

To address class imbalance and improve generalization, I augmented the ResNet18 training dataset using a U-Net denoising diffusion model to synthesize additional minority class lithography simulation images. Several pattern groups were underrepresented compared with the dominant classes, which can bias metric learning by shaping the embedding space around what the model sees most often and reducing sensitivity to rare but important geometries. I trained the diffusion model to generate class consistent samples that preserve the structural signatures of each minority group while introducing realistic within class variation that mirrors what appears in lithography simulations. This included small edge shifts, subtle linewidth changes, localized smoothing, and process-like noise that affect appearance without changing the underlying pattern identity. I then integrated these synthesized samples into the same preprocessing and triplet construction pipeline so they were treated as first class training examples rather than separate artifacts. Increasing minority class coverage improved the quality of hard triplets and reduced the chance that the model learned an embedding dominated by majority patterns. The result was improved validation stability and more reliable similarity rankings across pattern groups, with reduced bias toward overrepresented classes and better robustness to plausible structural variation.

I built a Physics Informed Neural Network for electromagnetics to solve the two dimensional Helmholtz equation. The network does not learn from labeled simulation targets. It learns by satisfying the governing physics, meaning its predictions are driven by the PDE and boundary conditions rather than by a dataset of precomputed fields. I implemented the solver with a SIREN sinusoidal representation network so it can accurately represent the high frequency oscillations that are intrinsic to wave propagation. The model takes spatial coordinates as input and outputs the real and imaginary components of the complex electric field. I chose SIREN because wave solutions are highly oscillatory and standard activation functions tend to smooth or underfit them. Sine activations with frequency scaled initialization make these oscillations learnable with a compact network, so the solver achieves high fidelity field structure without relying on brute force depth. Training is physics driven with two loss terms. In the interior of the domain, I minimize the Helmholtz residual computed through automatic differentiation, including the Laplacian operator and the material response through a spatially varying permittivity field. On the boundary, I enforce the Sommerfeld radiation condition to guarantee outward propagating waves and suppress nonphysical reflections. I weight the boundary term to keep the radiation constraint tight because correct far field behavior determines whether the solution is physically usable. A major stability and generalization choice is how collocation points are sampled. I resample interior and boundary points every epoch so the network cannot memorize a fixed grid and must satisfy the physics everywhere. Interior points are drawn partly uniformly across the domain and partly from a Gaussian distribution centered at the source, which forces the network to learn both the global wave structure and the steep local gradients near excitation. 
Boundary points are sampled evenly along all edges and paired with outward normal vectors so the radiation condition is applied correctly at every location. Optimization is carried out in two deliberate stages. I train first with Adam using cosine annealed learning rate scheduling and gradient clipping to stabilize updates under noisy physics residuals. Once the model is close to a solution, I switch to L-BFGS for curvature guided refinement. This second stage consistently reduces remaining residuals faster than first order methods alone and produces a visibly cleaner field solution. After training, I evaluate the learned field on a dense uniform grid to visualize the permittivity distribution, the source profile, and the real and imaginary field components. I also compute the root mean square PDE residual across the domain as a quantitative measure of physical consistency. I tested both free space and a dielectric inclusion case, showing the solver learns the correct governing behavior and boundary physics rather than overfitting to a fixed set of points. Overall, this project delivers a practical neural PDE solver built around the requirements that actually matter for wave problems. It uses a representation suited to oscillatory fields, enforces physically correct radiation behavior, and applies an optimization strategy that converges reliably. The result is a flexible foundation that can extend to more complex geometries, material layouts, and source configurations while keeping the solution anchored in first principles physics.
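The collocation sampling scheme can be sketched as below, assuming a square domain of half width 1 centered at the origin. The function names, the 50/50 uniform to Gaussian split, and the clipping of Gaussian samples to the domain are illustrative assumptions rather than the settings used in the project.

```python
import numpy as np


def sample_interior(n, source, sigma, rng, uniform_frac=0.5, half_width=1.0):
    """Mix uniform points with Gaussian points centered on the source."""
    n_uniform = int(n * uniform_frac)
    uniform = rng.uniform(-half_width, half_width, size=(n_uniform, 2))
    gauss = rng.normal(loc=source, scale=sigma, size=(n - n_uniform, 2))
    # Simple choice (assumption): clip stray samples back into the domain.
    gauss = np.clip(gauss, -half_width, half_width)
    return np.vstack([uniform, gauss])


def sample_boundary(n_per_edge, half_width=1.0):
    """Evenly spaced points on each edge, paired with outward unit normals."""
    t = np.linspace(-half_width, half_width, n_per_edge)
    w = np.full_like(t, half_width)
    points = np.vstack([
        np.column_stack([t, -w]),  # bottom edge
        np.column_stack([t, w]),   # top edge
        np.column_stack([-w, t]),  # left edge
        np.column_stack([w, t]),   # right edge
    ])
    normals = np.repeat(
        np.array([[0.0, -1.0], [0.0, 1.0], [-1.0, 0.0], [1.0, 0.0]]),
        n_per_edge,
        axis=0,
    )
    return points, normals
```

Calling sample_interior with a fresh draw from the same generator at the start of every epoch gives the per-epoch resampling described above, while the boundary normals supply the direction needed for the radiation condition.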

At ASML, I designed and built scalable data-processing libraries and distributed ETL pipelines using PySpark to standardize the cleaning, validation, and preprocessing of more than 400 GB of simulation metadata used by product engineering teams. The architecture was based on Apache Spark’s distributed computing framework, where raw simulation outputs (CSV, Parquet, JSON, and proprietary log formats) were ingested into a centralized data lake and processed using PySpark DataFrames and Spark SQL. I implemented schema enforcement, metadata harmonization, null handling, normalization, deduplication, and rule-based validation logic at scale to ensure consistency across heterogeneous simulation sources. The solution followed a layered data architecture (raw, standardized, curated), ensuring reproducibility and traceability of transformations. I developed modular, reusable Python libraries on top of PySpark to abstract common preprocessing logic, enforce data contracts, and allow engineering teams to onboard new simulation formats with minimal changes. Performance optimization techniques such as partitioning strategies, broadcast joins, caching, and predicate pushdown were applied to efficiently process large datasets. In addition to PySpark, I used supporting Python libraries such as Pandas and NumPy for local prototyping and validation logic, PyArrow for efficient columnar data handling, and structured logging frameworks for monitoring and debugging distributed jobs. The resulting framework reduced preprocessing time, improved data quality, and provided a scalable and standardized foundation for downstream analytics and model validation workflows across multiple engineering teams.

I worked with the Tachyon software stack and led a focused investigation of geometrical corner rounding with the goal of improving both accuracy and production efficiency. I personally designed and executed FEM plus simulation studies and quantified performance across the dimensions that matter in deployment, including total runtime, memory usage, and grid dependence. I then performed a detailed runtime breakdown to isolate the true cost drivers, separating corner rounding time, render area and render edge time, and EM3D time, which made optimization decisions defensible and measurable. For contour to contour validation, I designed LMC plus simulations and used the MPC layer as the golden reference for comparison. I translated these results into clear engineering conclusions, aligned on the interpretation with R&D, and drove the change through to production. The update was incorporated into the latest Tachyon release and delivered to one of our largest customers, turning analysis into customer facing impact. I also established that using an updated geometrical corner rounding value enables a significant reduction in computational runtime and memory usage while maintaining the required modeling fidelity. To operationalize the work, I wrote production grade Python and C++ tooling to extract and process large simulation datasets, packaged the workflows into reusable internal libraries, and shared them across teams. These tools were adopted by the product engineering group, improving throughput and consistency beyond my own projects. In parallel, I identified a critical defect in Tachyon that produced asymmetric jog artifacts after applying the corner rounding algorithm. I isolated the root cause, proposed a concrete fix, and worked with R&D to integrate the solution into the codebase. The result was a more robust corner rounding workflow with improved reliability under real production conditions.

I led a full transition cross coefficient optimization effort designed to cut compute cost without sacrificing optical accuracy. I built a rigorous simulation campaign in Tachyon using M3D workflows with FEM plus, and I structured the study to identify the smallest TCC basis that still reproduces baseline behavior within tight manufacturing tolerances. To make the result robust and transferable, I tested a broad matrix of imaging conditions and mask types. I evaluated high numerical aperture and low numerical aperture regimes, paired with both low refractive index masks and binary masks with strong refractive index contrast. I also diversified the geometric coverage of the test suite using one dimensional and two dimensional patterns, plus circular, elliptical, and polygonal structures. This ensured the optimized TCC setting was not tuned to a narrow corner case but validated across realistic pattern classes. I quantified performance using the metric that matters for imaging fidelity, the aerial image critical dimension difference between the baseline model and reduced TCC models. I identified an optimal TCC number that delivered a significant reduction in runtime and memory usage while keeping accuracy inside strict specifications, including aerial image agreement within 0.1 nanometer and positional shift in x and y below 0.01 nanometer. The outcome was a computational setting that improves throughput while preserving the integrity of the image formation model. I translated the analysis into a production change by aligning the interpretation with research and development and driving the update into the next Tachyon software release. The optimized setting was also adopted by the customer, demonstrating that the work delivered both internal product value and external impact. To support scale and repeatability, I wrote custom Python and C++ tooling to extract, filter, and analyze simulation output at volume.
I implemented a workflow distinct from my prior project, built around automation and structured validation, so large datasets could be processed consistently with minimal manual effort. The result was an analysis pipeline that enables rapid iteration, defensible comparisons, and reliable decision making for future modeling and optimization studies.

To support my projects, I engineered my analysis code into an object oriented Python framework built for automated large scale dataset processing. I designed the architecture for reuse and extension, applying core OOP principles such as inheritance and polymorphism to keep new workflows consistent without duplicating logic. I implemented a modular design with optimized input output handling for high volume simulation data, and I wrote clear documentation so the framework could be adopted and maintained beyond my own use. I integrated the work into the company’s GitLab repository using clean structure and reproducible execution patterns, enabling team level collaboration and long term support. The framework was later adopted by ASML product engineering, demonstrating that the work delivered lasting value as shared infrastructure, not just a one off solution for a single project.
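The inheritance and polymorphism pattern mentioned above can be illustrated with a small sketch. The class names, the row fields (runtime_s, memory_mb), and the summaries are hypothetical and chosen only to show the structure; they do not describe the internal framework.

```python
from abc import ABC, abstractmethod


class DatasetProcessor(ABC):
    """Shared filter -> transform -> summarize flow; subclasses fill in steps."""

    def run(self, rows):
        cleaned = [self.transform(r) for r in rows if self.is_valid(r)]
        return self.summarize(cleaned)

    def is_valid(self, row):
        """Default validation: drop missing rows. Subclasses may tighten it."""
        return row is not None

    @abstractmethod
    def transform(self, row):
        """Convert one raw row into the value this processor analyzes."""

    @abstractmethod
    def summarize(self, values):
        """Reduce the transformed values to a summary dictionary."""


class RuntimeProcessor(DatasetProcessor):
    def transform(self, row):
        return float(row["runtime_s"])

    def summarize(self, values):
        return {"n": len(values), "mean_runtime_s": sum(values) / len(values)}


class MemoryProcessor(DatasetProcessor):
    def is_valid(self, row):
        # Extend the inherited check instead of duplicating it.
        return super().is_valid(row) and row.get("memory_mb", 0) > 0

    def transform(self, row):
        return row["memory_mb"] / 1024

    def summarize(self, values):
        return {"n": len(values), "peak_memory_gb": max(values)}
```

Calling code only ever invokes run, so a new workflow is added by writing one subclass rather than copying the processing loop.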

I enhanced the API library used to build and run Tachyon simulations, aligning the interface and behavior with ASML standards. I audited the codebase end to end, identified multiple reliability issues, and delivered targeted fixes that eliminated failure modes and reduced debugging overhead for the team. Beyond bug fixes, I expanded the library with new functionality that streamlines simulation setup and accelerates result generation. The updates improve day to day productivity by reducing boilerplate, enforcing consistent configuration, and enabling faster iteration when exploring parameter space. The impact was especially strong for FEM plus workflows, where the improvements increased performance and strengthened runtime stability under demanding workloads. The outcome is a more robust, efficient, and maintainable API layer that makes high quality simulations easier to build, easier to reproduce, and faster to deliver. It raised the baseline for the entire workflow, strengthening the team’s ability to execute complex studies with confidence and speed.

Research Assistant, University of Central Florida

Artificial Intelligence, DOE - NSF Grant

This project focuses on a fundamental problem in semiconductor fabrication. Understanding and predicting how metals deposit on semiconductor surfaces is critical because deposition controls contact formation, interface stability, and device reliability. I studied epitaxial Pb growth on the Ge(111) surface, where experiments show an initially uniform wetting layer followed by an abrupt transition to island formation at a critical coverage that is difficult to determine precisely from microscopy alone. I established the physical baseline using first principles modeling. I performed Density Functional Theory calculations for Pb on Ge(111) and used DFT+U+J to correct known limitations of standard approximations for germanium. By calibrating the Hubbard U and exchange J parameters, I reproduced the experimental Ge band gap near 0.67 eV, ensuring that the electronic structure governing bonding and charge redistribution at the interface was accurately captured. With these calibrated energetics, I computed the Pb chemical potential as a function of coverage from 1.0 to 1.7 monolayers. Chemical potential represents the energetic cost of adding one more Pb atom, allowing a direct comparison between wetting layer stability and bulk like clustering. The DFT results show a clear transition at about 1.33 monolayers, identifying the onset of nucleation. The next challenge is scale. While DFT provides accuracy, it is too computationally expensive for large surfaces and many local configurations. I therefore developed a machine learning surrogate explicitly designed to preserve chemical potential behavior. I first trained a geometry only energy model and evaluated it through the most sensitive metric, the chemical potential computed from energy differences. Although the average energy error appeared small, the resulting chemical potential was noisy and failed to reproduce the nucleation threshold.
I analyzed the residuals in detail and identified a systematic overprediction that drifted with coverage. This is not random error but a signature of missing physics. In Pb on Ge(111), adsorption induces charge transfer and interfacial dipoles, and the electrostatic stabilization evolves with coverage. A geometry only model cannot capture this effect, leading to a coverage dependent bias that distorts chemical potential. This insight drove the final model design. I built a two stage, physics informed machine learning pipeline that predicts atomic charges first and energies second, using the same charge input during inference as during training. For Model 1, I trained a neural network to predict atomic charges from local atomic environments extracted from DFT relaxed slab configurations spanning coverages from 1.0 to 1.7 monolayers. Each training example is a fixed local window containing 50 Ge atoms and 10 Pb atoms, encoded using a characteristic matrix built from ordered neighbor environments. I generated 10,000 local windows and used DFT derived charges as labels. The final charge model achieves a validation error on the order of 0.03 e and exhibits a stable error distribution with no heavy tail, which is essential for reliable aggregation across overlapping windows. For Model 2, I trained a second neural network to predict system energies using atomic positions together with predicted charges. The two models are deployed on large surfaces using a sliding window approach with overlap aware averaging, ensuring that each atom and each region of the surface contributes consistently to the reconstructed global charge maps and total energies. This pipeline resolves the failure mode that matters most. The two stage model produces energy residuals tightly centered near zero that do not drift with coverage, which is exactly what is required for a stable chemical potential. 
When I computed chemical potential using the machine learning energies, the curve was smooth and it predicted nucleation at 1.33 monolayers, in direct agreement with the DFT reference. The model therefore reproduces the thermodynamic switch between wetting and island formation while enabling large scale simulations that are impractical with DFT alone. Overall, I developed an end to end framework that combines calibrated first principles energetics with scalable machine learning for metal on semiconductor growth. The results and the computational capability they enabled played a pivotal role in securing NSF funding.
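The sensitivity of chemical potential to a coverage dependent energy bias follows directly from its definition as an energy difference. The sketch below assumes a simple finite-difference form, mu = (E2 - E1) / (N2 - N1) between consecutive coverages; the function name and this discretization are assumptions for illustration, not necessarily the procedure used in the study.

```python
def chemical_potential(n_atoms, energies):
    """Finite-difference chemical potential between consecutive coverages.

    n_atoms: increasing Pb atom counts, one per configuration.
    energies: total energy of each configuration, in the same order.
    """
    pairs = list(zip(n_atoms, energies))
    return [(e2 - e1) / (n2 - n1) for (n1, e1), (n2, e2) in zip(pairs, pairs[1:])]
```

A constant energy offset cancels in every difference, so it leaves the chemical potential untouched. An error that grows linearly with atom count, for example 0.01 per atom, shifts every chemical potential value by that same 0.01, which is how an energy model with a small average error can still move or blur a transition.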

In this project, I delivered an end to end capability to predict the final deposition morphology of Pb on Ge(111), producing relaxed atomic patterns that normally require expensive simulation. I translated the physics of atomic relaxation into a deployable machine learning workflow that is accurate on held out structures and scalable to large surfaces. I built the solution by first exposing the real failure mode. I implemented a 2D CNN baseline as a fast diagnostic model and evaluated it against the reference morphology. It captured the broad pattern but failed in the regions that determine reliability. The largest errors concentrated at island edges and steps, and the error distribution developed a long right tail, meaning a small fraction of atoms produced large mistakes that dominate worst case behavior. I treated that signal as an engineering constraint and redesigned the representation to remove the root cause rather than tuning around it. I then implemented a 3D voxel CNN that preserves depth information and learns true three dimensional local structure. For each reference atom, I voxelized its neighborhood into a 10 × 10 × 10 grid with 6 feature channels, forming an input tensor of size 10 × 10 × 10 × 6. The network predicts a physically meaningful target, the local displacement of the center atom as Δx, Δy, Δz, which makes the output directly actionable for reconstructing the relaxed structure. I drove model selection through disciplined experimentation. I explored convolution depth, filter counts, pooling strategy, dense layer width, dropout, learning rate schedules, and batch size, and I selected the final architecture using an 80/20 train/validation split, mean squared error loss, early stopping, and learning rate scheduling. The objective was reliability, not just a good average. The design target was a model that reduces edge driven outliers because those are the cases that determine whether a morphology predictor is usable at scale.
I trained using the Adam optimizer with a learning rate near 10⁻³, with early stopping and scheduling to maximize generalization. To scale the method to large systems, I engineered a sliding window inference pipeline. Each atom appears in multiple overlapping voxel neighborhoods, so it receives multiple displacement predictions from different local contexts. I combine these using overlap aware averaging to produce one final displacement per atom, then reconstruct the relaxed structure by adding the predicted displacement to the initial position. This converts a fixed input CNN into a large surface morphology engine while maintaining consistency across the full system. On held out test structures, the 3D voxel CNN reproduces the reference morphology with a displacement mean absolute error near 0.04 angstrom and a strongly reduced large error tail relative to the 2D projection baseline. The remaining hardest atoms are still near edges where coordination changes rapidly, but the hotspots are far weaker and far less widespread. The result is a predictor that remains stable in the exact regimes where simpler models break down. To support high throughput training and reliable generalization, I built a distributed PySpark data pipeline to convert more than 10K atomic configurations into more than 600K voxel tensors, eliminating ingestion bottlenecks for 3D model training. To address imbalance in local environments and deposition conditions, especially the underrepresented edge and step neighborhoods that drive the long tail, I trained a 3D U-Net denoising diffusion model to synthesize additional voxelized neighborhoods for augmentation. The diffusion generated tensors increased coverage of rare structural regimes and exposed the displacement regressor to a wider range of physically plausible local morphologies, improving validation stability and reducing bias toward the most common interior environments.
I also implemented a 3D Vision Transformer to evaluate whether self attention could better capture broader context within voxelized atomic neighborhoods. Each 10 × 10 × 10 × 6 input volume was partitioned into fixed size 3D patches, embedded with positional information, and processed by a stack of Transformer encoder layers before a regression head predicted Δx, Δy, Δz. While the Transformer captured coarse structural context, it proved less reliable in localized edge dominated regimes and incurred substantially higher inference cost, which reinforced the choice of convolutional architectures for scalable deployment. The outcome is a morphology workflow that is fast, scalable, and designed for real decision making. Once trained, it produces relaxed deposition patterns in minutes rather than days, enabling large scale nucleation and morphology studies that would otherwise be computationally impractical. This capability, combined with the energy and chemical potential modeling, played a pivotal role in securing NSF funding.
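The overlap aware averaging step of the sliding window inference can be sketched as below. The function names and the input layout (one array of atom indices and one array of predicted displacements per window) are assumptions for illustration; the model that produces the per-window predictions is not shown.

```python
import numpy as np


def average_overlapping(n_atoms, window_atom_ids, window_displacements):
    """Average per-atom displacement predictions over all windows.

    window_atom_ids: list of 1-D integer arrays, the atoms in each window.
    window_displacements: list of (len(ids), 3) arrays of dx, dy, dz.
    Returns the (n_atoms, 3) mean displacement and a coverage mask.
    """
    total = np.zeros((n_atoms, 3))
    count = np.zeros(n_atoms)
    for ids, disp in zip(window_atom_ids, window_displacements):
        np.add.at(total, ids, disp)  # unbuffered add handles repeated ids
        np.add.at(count, ids, 1)
    covered = count > 0
    total[covered] /= count[covered, None]
    return total, covered


def reconstruct(initial_positions, displacements):
    """Relaxed positions = initial positions + averaged displacements."""
    return initial_positions + displacements
```

Atoms that no window covers keep a zero displacement and are flagged in the returned mask, so gaps in window coverage are visible rather than silently treated as predictions.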

I developed a segmentation and analysis pipeline to extract physically reliable structural information from noisy STM images of Pb deposition on Ge. STM was the primary experimental probe, so the problem was not cosmetic image cleaning. The objective was to isolate physically meaningful atomic islands in a way that remains stable across large datasets where background variation, tip artifacts, and closely spaced islands make conventional image processing unreliable. Raw STM scans were first passed through standard statistical preprocessing. I applied Gaussian denoising to suppress high-frequency noise, background flattening to remove scan-dependent height gradients, and threshold-based suppression to eliminate obvious tip artifacts. These steps improved image quality, but they did not solve the core problem. Simple thresholding and connected-component filtering worked on clean scans but failed when background contrast varied or when islands touched or partially overlapped. The method was fragile and required manual tuning, which made it unsuitable for scalable analysis. To address that limitation, I transitioned to a deep learning–based segmentation approach. I first trained a Faster R-CNN model with a ResNet-50 backbone to localize islands using bounding boxes. The detector was stable under moderate background variation and performed well for island counting and coarse spatial localization. However, bounding boxes are fundamentally too coarse for growth analysis. They cannot capture island edge structure, detailed shape, or height distributions, which are required for physically meaningful morphology metrics. I extended the pipeline to Mask R-CNN using the same backbone and feature pyramid network, adding a pixel-level mask head. The mask head consisted of stacked 3×3 convolutional layers with ReLU activations followed by a 1×1 convolution that produced per-pixel instance masks. 
Training combined classification loss, bounding-box regression loss, and binary cross-entropy mask loss. To stabilize optimization, I applied weight decay, gradient clipping, early stopping, and light data augmentation including small rotations and intensity shifts. Mask R-CNN resolved the failure modes that broke classical methods. It produced clean, instance-separated island masks even in crowded regions and in the presence of STM-specific artifacts. Adjacent islands were reliably separated, residual tip artifacts were rejected, and performance remained stable under background contrast variation without manual retuning. These segmentation masks defined the physically meaningful regions of each STM scan and became the foundation for all downstream analysis. Using the instance-resolved masks, I extracted morphological features in a consistent and reproducible way. These included island area, perimeter, height histograms, and edge-versus-interior classifications derived directly from the STM height signal. By restricting feature extraction strictly to segmented island regions, background noise and scan artifacts were excluded from measurement. This ensured that reported statistics reflect true surface morphology rather than imaging conditions. I packaged the full segmentation and feature-extraction workflow into a reusable script that outputs per-island masks and structured summary tables. The system can be applied to new STM datasets without manual intervention, enabling scalable and consistent morphological analysis. The result is a stable analysis pipeline that converts noisy experimental STM scans into quantitatively reliable morphology data suitable for modeling thin-film growth and deposition dynamics.
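Feature extraction restricted to one segmented island can be sketched as below, given a height map and a boolean instance mask. The function name, the four-neighbour definition of an edge pixel, and the edge-pixel count used as a perimeter estimate are simplifying assumptions; the sketch also assumes a non-empty mask.

```python
import numpy as np


def island_features(height, mask, pixel_size=1.0, bins=8):
    """Area, perimeter estimate, and height statistics for one island mask."""
    mask = mask.astype(bool)
    padded = np.pad(mask, 1, constant_values=False)
    # A pixel is interior when all four direct neighbours are in the mask.
    interior = (
        padded[1:-1, 1:-1]
        & padded[:-2, 1:-1] & padded[2:, 1:-1]
        & padded[1:-1, :-2] & padded[1:-1, 2:]
    )
    edge = mask & ~interior
    hist, bin_edges = np.histogram(height[mask], bins=bins)
    mean_interior = float(height[interior].mean()) if interior.any() else float("nan")
    return {
        "area": float(mask.sum()) * pixel_size ** 2,
        "perimeter": float(edge.sum()) * pixel_size,  # crude: counts edge pixels
        "mean_height_edge": float(height[edge].mean()),
        "mean_height_interior": mean_interior,
        "height_hist": hist,
        "height_bin_edges": bin_edges,
    }
```

Only pixels inside the mask enter the histogram and the means, so background height variation outside the island cannot affect the reported statistics.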

I built a deep learning framework to model electrochemical time series data with the goal of predicting transient current response under controlled potential protocols and turning sequence modeling into a physically meaningful diagnostic tool. The data came from chronoamperometry and cyclic voltammetry experiments and consisted of ordered sequences of applied potential, time increment, and normalized current density. The challenge was not simply fitting curves. Electrochemical transients contain memory effects, multi timescale relaxation, and path dependent hysteresis that must be preserved if the model is to remain physically credible. I designed a preprocessing pipeline that enforces physical consistency before any model sees the data. Raw signals were denoised using Savitzky Golay filtering, baseline corrected to remove drift, and normalized by electrochemically active surface area so experiments are directly comparable. This prevents the network from learning artifacts caused by measurement noise or geometric scaling and ensures it focuses on transport and kinetic behavior rather than instrument variability. To model the temporal dynamics, I implemented recurrent neural network (RNN) architectures. The primary model was an LSTM, a gated form of RNN designed to retain information over extended sequences. The architecture used one to two stacked recurrent layers with 32 to 64 hidden units, with dropout between layers at 0.2 to 0.3 to control overfitting. A fully connected regression head mapped the hidden representation to time resolved current density predictions. Inputs were fixed length windows of prior potential and time history, and outputs were the predicted current evolution over subsequent time steps. I trained the models using mean squared error (MSE) loss and evaluated them on strictly held out experimental sequences. I also trained GRU models with comparable parameter counts under identical conditions. 
The LSTM-based RNN consistently achieved lower prediction error and more stable long-horizon forecasts. The reason was structural. Electrochemical signals contain fast double-layer charging, slow diffusion-driven relaxation, and hysteresis between forward and reverse potential sweeps. The gated memory cells in an LSTM regulate information flow with explicit input, forget, and output gates, which preserves long-term dependencies without losing short-term detail. The GRU architecture, while efficient, provides less control over long-memory retention and degraded when predicting extended transients. To test whether global attention mechanisms could improve performance, I implemented Transformer-based models using self-attention. I constructed a causal Transformer encoder with two to four attention layers, model dimensions of 64 to 128, and four attention heads per layer. Each block contained multi-head attention, a feedforward network with 128 to 256 neurons, residual connections, layer normalization, and dropout between 0.1 and 0.2. Because Transformers do not inherently encode temporal order, I added sinusoidal positional encodings after projecting the input features. The output of the Transformer was passed through a fully connected regression head to predict time-resolved current density, again trained using MSE. The Transformer models learned short-range correlations and some global structure within fixed windows, but they did not outperform the LSTM-based RNN. There were clear reasons. First, the dataset size was moderate, and attention-based models typically need more data to learn stable patterns without overfitting. Second, electrochemical transients have strong local continuity governed by physical processes, and recurrent updates align naturally with this sequential causality. Third, the quadratic cost of self-attention limited efficient modeling of long trajectories, while RNN models maintain a consistent memory update regardless of sequence length. 
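For reference, the sinusoidal positional encoding mentioned above follows the standard sine/cosine scheme. The sketch below is a plain-Python illustration of that scheme under the assumption of an even model dimension; the function names are hypothetical and the actual models would have used a deep learning framework.

```python
import math
from typing import List


def sinusoidal_encoding(length: int, d_model: int) -> List[List[float]]:
    """Standard sine/cosine table: one row per time step, d_model columns."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even")
    table = []
    for pos in range(length):
        row: List[float] = []
        for i in range(0, d_model, 2):
            # Each sin/cos pair shares one frequency; frequencies fall geometrically.
            angle = pos / (10000 ** (i / d_model))
            row.extend((math.sin(angle), math.cos(angle)))
        table.append(row)
    return table


def add_positions(projected: List[List[float]], table: List[List[float]]) -> List[List[float]]:
    """Add the encoding to already-projected input features, step by step."""
    return [[x + p for x, p in zip(feat, pos)] for feat, pos in zip(projected, table)]
```

Adding the table after the input projection gives every time step a distinct signature, so the attention layers can tell earlier samples from later ones.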
Beyond prediction, I treated the trained LSTM model as an analysis tool. I examined hidden-state dynamics and prediction residuals to identify transitions between kinetically controlled and diffusion-limited regimes. These transitions aligned with independently observed changes in Tafel slope and diffusion-related signatures, supporting the physical relevance of the learned temporal features. The final architecture selected for use was the LSTM-based RNN. It provided the best balance of predictive accuracy, long-horizon stability, and physical interpretability, and it consistently captured the multi-timescale memory behavior that defines electrochemical transients.

I initiated an interdisciplinary collaboration between the Department of Physics and the Department of Statistics and led the development of a machine learning methodology for node classification on incomplete graph datasets, where missing nodes can destabilize message passing and degrade accuracy. I designed the study to measure how different node-removal patterns affect performance and to build a strategy that remains reliable under sparse observations. I selected Simplified Graph Convolutional Networks (SGC) for their efficiency and interpretability and implemented the full pipeline from scratch in Julia using Flux, including preprocessing, operator construction, training, and evaluation. I introduced three reduction protocols: truncation of the last n nodes, random removal of a fixed fraction of nodes, and removal of a contiguous node block. After each reduction, I recomputed the normalized adjacency operator with self-loops and degree normalization and applied third-order propagation to capture higher-order structure with minimal overhead. Training followed the SGC workflow, using precomputed propagated features and a linear classifier optimized with cross-entropy loss and the Adam optimizer under a 60% training and 40% testing split. I evaluated performance on reduced graphs and tested transfer back to the original full graph. By introducing a parameter-averaging strategy across models trained on slightly different reduced graphs, I improved robustness under structured missingness. On the Cora benchmark, accuracy increased from about 91% to as high as 94% when applying the aggregated model back to the full dataset. I validated scaling behavior on synthetic networks generated using the Barabási-Albert model to confirm generalization beyond a single dataset. I extended this structured reduction framework to density functional theory based materials modeling by representing atomic systems as graphs, where atoms are nodes and interatomic interactions define edges. 
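The reduction, propagation, and averaging steps above can be sketched in a few lines. The original pipeline was written in Julia with Flux; the plain-Python version below is only an illustration on dense list-of-lists matrices, and every function name in it is hypothetical.

```python
import math
import random
from typing import List

Matrix = List[List[float]]


def truncate_last(n_nodes: int, n: int) -> List[int]:
    return list(range(n_nodes - n, n_nodes))


def random_fraction(n_nodes: int, fraction: float, seed: int = 0) -> List[int]:
    rng = random.Random(seed)
    return sorted(rng.sample(range(n_nodes), int(fraction * n_nodes)))


def contiguous_block(start: int, size: int) -> List[int]:
    return list(range(start, start + size))


def remove_nodes(adj: Matrix, features: Matrix, drop: List[int]):
    """Delete the dropped nodes' rows and columns, and their feature rows."""
    dropped = set(drop)
    keep = [i for i in range(len(adj)) if i not in dropped]
    return [[adj[i][j] for j in keep] for i in keep], [features[i] for i in keep]


def normalized_adjacency(adj: Matrix) -> Matrix:
    """D^-1/2 (A + I) D^-1/2, recomputed after each reduction."""
    n = len(adj)
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)] for i in range(n)]
    inv_sqrt_deg = [1.0 / math.sqrt(sum(row)) for row in a_tilde]  # self-loops keep degree >= 1
    return [[inv_sqrt_deg[i] * a_tilde[i][j] * inv_sqrt_deg[j] for j in range(n)] for i in range(n)]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    return [[sum(a[i][k] * b[k][j] for k in range(len(b))) for j in range(len(b[0]))]
            for i in range(len(a))]


def propagate(adj: Matrix, features: Matrix, order: int = 3) -> Matrix:
    """Precompute S^order X, the input to the linear classifier."""
    s = normalized_adjacency(adj)
    out = features
    for _ in range(order):
        out = matmul(s, out)
    return out


def average_parameters(weight_sets: List[Matrix]) -> Matrix:
    """Element-wise mean of classifier weights from several reduced-graph models."""
    count = len(weight_sets)
    rows, cols = len(weight_sets[0]), len(weight_sets[0][0])
    return [[sum(w[r][c] for w in weight_sets) / count for c in range(cols)] for r in range(rows)]
```

Averaging is well defined here because the linear classifier's weight shape depends only on the feature and class counts, not on how many nodes each reduced graph retains.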
I applied physically consistent reduction patterns, such as clustered removals and surface-like truncations, to mimic vacancies and finite-size effects instead of deleting atoms arbitrarily. I then trained a graph neural network on DFT-derived quantities, including local energy contributions and atomic forces, using these reduced atomic graphs as inputs. The model learned how local electronic environments contribute to system-level properties even when coordination is partially degraded. Within the DFT workflow, the reduced-structure GNN served as a screening model. Instead of performing full self-consistent Kohn-Sham calculations for every large or defected configuration, I first evaluated reduced representations to estimate local energy trends and identify promising candidates. Only selected structures were passed to high-fidelity DFT evaluation. Because structured reduction preserved the most predictive structural information, the surrogate maintained strong agreement with full-scale simulations while reducing the total number of expensive DFT calculations required. This accelerated the study of defected and truncated materials systems while maintaining physical consistency.

Data Modeling and Simulations, DOE Grant

I introduced a novel method to model the electrolyte at the electrochemical interface using ab initio simulations of charge transfer processes at surfaces. I presented a simple capacitor model of the interface that addresses key challenges, including: i) the requirement for large cell heights to achieve convergence, which incurs significant computational costs, and ii) the costly iterative calculations of reaction energetics needed to tune the surface charge to the desired potential. I derived a correction to the energy for finite cell heights, enabling the calculation of large cell energies without additional computational expense. Furthermore, I demonstrated that reaction energetics determined at constant charge can be easily mapped to those at constant potential, eliminating the need for iterative schemes to tune the system to a constant potential. This work provided an efficient and accurate approach to modeling electrochemical interfaces.

In my recent research, I led a joint experimental and computational effort on the Hydrogen Evolution Reaction (HER), a central process for water splitting and clean energy conversion, with a deliberate focus on neutral electrolyte conditions where the Volmer step is rate-limiting and therefore sets the performance ceiling. We observed that non-metallic cations, especially ammonium and methylammonium, produce a pronounced enhancement in HER activity on Au(111) relative to sodium, and I translated that observation into a mechanistic question that could be answered with first-principles modeling. I implemented Grand Canonical DFT using the TPOT workflow with VASP and VASPsol, allowing the electron count to adjust self-consistently to maintain a constant electrode potential, and I built a realistic interfacial model using a 4 × 4 × 5 Au(111) slab with 40 water molecules to capture solvent structure and cation hydration. I then computed activation barriers with a slow-growth approach and resolved the hydrogen adsorption landscape into two baseline pathways: adsorption driven by water dissociation inside the cation hydration shell, and adsorption driven by bulk water outside it. Critically, in the presence of ammonium and methylammonium I identified two additional routes, direct cation splitting and proton shuttling, which are not available for conventional metal cations and which lower the barrier for hydrogen adsorption, directly accelerating the rate-limiting step. The outcome is a clear, actionable mechanistic conclusion: cation identity controls both pathway availability and barrier height, and non-metallic cations can unlock proton transfer channels that materially improve HER kinetics. This provides a concrete direction for designing electrolyte environments and interfaces that outperform traditional metal cation systems.

This project targets a central constraint in sustainable electrochemistry: how electrolyte composition, and specifically cation identity, controls the electrochemical reduction of carbon dioxide (CO₂RR). I focused on Bi(111), a highly promising platform for CO₂ conversion, and I built a mechanistic picture of how non-metallic cations reshape CO₂ adsorption, reaction energetics, and product selectivity for CO and formate. I designed and executed high-fidelity simulations using grand canonical density functional theory with TPOT coupled to VASP and VASPsol, which allowed me to hold electrode potential fixed by dynamically optimizing electron count, isolating potential-dependent effects in a controlled and reproducible way. I systematically compared NH₄⁺, CH₃NH₃⁺, and (CH₃)₄N⁺ against Na⁺, and I quantified CO₂ binding energetics, finding that non-metallic cations can enhance CO₂ adsorption by up to 0.2 eV relative to sodium through stronger electrostatic stabilization and more favorable interfacial charge redistribution. I then mapped the reaction landscape from adsorption through the key intermediates leading to CO and HCOO⁻, and resolved two distinct formate routes: direct hydrogenation of solvated CO₂, and surface adsorption followed by hydrogenation. The solvated route emerged as energetically preferred and comparatively insensitive to cation identity, while CO formation showed strong sensitivity to cation hydration structure and interfacial proton delivery. To quantify hydrogen availability and transfer, I performed nudged elastic band calculations for water dissociation and hydrogen shuttling and obtained activation barriers near 1.10 to 1.12 eV for Na⁺ and NH₄⁺, establishing the kinetic cost of generating reactive hydrogen at the interface. 
Within that framework, I identified a specific advantage of NH₄⁺: it stabilizes key intermediates through directional hydrogen bonding, improving intermediate persistence and lowering desorption barriers in the CO pathway, which directly supports enhanced CO₂RR activity. The outcome is a coherent mechanistic model connecting cation chemistry to adsorption strength, barrier structure, and pathway selectivity, providing practical guidance for electrolyte and interface design aimed at carbon-neutral fuel synthesis. This work supported experimental collaborations and contributed to Department of Energy funding, while establishing a foundation for future data-driven modeling of electrochemical interfaces.
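To put barrier values of this size in kinetic terms, a simple Arrhenius estimate is a useful rule of thumb. The sketch below is an illustration only, not part of the reported analysis: it assumes identical prefactors for the two processes and room temperature, and the function name `rate_ratio` is hypothetical.

```python
import math

BOLTZMANN_EV_PER_K = 8.617333262e-5  # Boltzmann constant in eV/K


def rate_ratio(barrier_a_ev: float, barrier_b_ev: float, temperature_k: float = 298.15) -> float:
    """Return k_a / k_b for Arrhenius rates that share the same prefactor (assumption)."""
    thermal_energy = BOLTZMANN_EV_PER_K * temperature_k
    return math.exp((barrier_b_ev - barrier_a_ev) / thermal_energy)


if __name__ == "__main__":
    # A 0.02 eV spread, like the 1.10 to 1.12 eV range quoted above.
    print(f"{rate_ratio(1.10, 1.12):.2f}")
```

Under these assumptions a 0.02 eV difference in barrier corresponds to roughly a factor of two in rate at room temperature, which is why barriers this close together are usually treated as kinetically comparable.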

I studied and reported the effect of creating an interface between a semiconducting polyaniline polymer or a polar poly-D-lysine molecular film and one of two valence tautomeric complexes, i.e., [CoIII(SQ)(Cat)(4-CN-py)2] ↔ [CoII(SQ)2(4-CN-py)2] and [CoIII(SQ)(Cat)(3-tpp)2] ↔ [CoII(SQ)2(3-tpp)2]. I identified the electronic transitions and orbitals using X-ray photoemission, X-ray absorption, inverse photoemission, and optical absorption spectroscopy measurements, guided by density functional theory. My findings revealed that, except for slightly modified binding energies and shifted orbital levels, the choice of the underlying substrate layer had little effect on the electronic structure. Additionally, I observed a prominent unoccupied ligand-to-metal charge transfer state in [CoIII(SQ)(Cat)(3-tpp)2] ↔ [CoII(SQ)2(3-tpp)2], which remained virtually insensitive to the interface between the polymer and tautomeric complexes in the Co(II) high-spin state. This research led to the publication of a paper.

I developed additional batches for codes used in computational materials science, including VASP, VASPsol, and Quantum ESPRESSO, and I debugged and optimized those codes. I ran, analyzed, and summarized validation simulations on high-performance computing (HPC) systems running Linux. I worked extensively on the creation, refinement, and advancement of application software and methods for analyzing and interpreting data in the physical sciences. Over four years, I contributed to the development and optimization of codes used in HPC environments, ensuring robust performance and accuracy for computational simulations.

Teaching - Consulting, UCF Funded

I recently started coaching and supervising new graduate students to help them acclimate within the group and execute their research projects. I provide computational support, assist with class selection, and engage in discussions with the advisor and senior group scientists to plan the future steps of their research. I stay up-to-date with new developments in Computational Modeling and proactively introduce these advancements to new graduate students. Additionally, I support senior scientists with grant proposals by contributing sections that describe the interplay between their research and high-end computing resources.

I have served as a specialized physics lab instructor, focusing on Machine Learning and Data Science. I lead physics labs by incorporating comprehensive lessons on analyzing and applying simple machine learning models. I design and utilize artificial data derived from simulations I created, as well as real datasets from our laboratory, to provide engaging learning experiences. I demonstrate an effective integration of physics and data science, ensuring students gain a deeper understanding of how these disciplines intersect in modern research and applications.

I guided undergraduates in mastering the complexities of statistical data analysis, emphasizing the critical importance of data preparation for the effective application of machine learning algorithms. I implemented feature engineering techniques, including meticulous data cleaning and transformation, to enhance the quality and relevance of datasets, ensuring robust and meaningful analysis.

Research Assistant, National and Kapodistrian University of Athens

NKUA Funded

Developed a machine learning approach for dark matter particle identification, addressing the challenges posed by extremely low temperatures. The model predicts the origin of dark matter from the LSP (lightest supersymmetric particle).

Conducted physics labs for undergraduates covering statistical data analysis and the data preparation required to apply machine learning algorithms. Simulations developed by our simulation group provided a large pool of artificial data for cleaning and training purposes.

Education

University of Central Florida

National and Kapodistrian University of Athens

National and Kapodistrian University of Athens

Material Science Skills

Data Science Skills

Coding Languages

Commonly used Libraries

Management and Communication Skills

Awards and Fellowships

Conferences

Interests

Selected Publications