Automation
Automated Microscopy and Spectroscopy Data Processing Pipeline
A workflow developed to automate microscopy and spectroscopy analysis, including particle recognition tasks in TEM and SEM imaging. It reduces repetitive manual handling and makes large experimental datasets easier to process, compare, and interpret.
Overview
This project emerged directly from the data pressure of my excitation transfer work. As dataset sizes grew, one-off manual analysis became inefficient, error-prone, and difficult to standardize across experiments. I initially used MATLAB for batch processing, but over time I pushed that approach toward a more organized, automated pipeline that could support larger workflows rather than a collection of isolated scripts.
The main goal was not only speed. I wanted a system that could improve consistency across experiments, reduce repetitive manual steps, preserve the experimental hierarchy of the data, and make it easier to compare outputs across samples, runs, and conditions. In practice, that meant thinking about the workflow as infrastructure for scientific reasoning, not just a convenience layer on top of the experiment.
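Preserving the experimental hierarchy can be as simple as treating the directory layout itself as metadata. A minimal Python sketch of that idea; the sample/run/condition layout and the `.csv` extension are hypothetical stand-ins, not the project's actual structure:

```python
from pathlib import Path

def index_dataset(root):
    """Walk a hypothetical sample/run/condition directory tree and
    return one metadata record per data file, keeping the hierarchy
    attached to every measurement instead of losing it at load time."""
    records = []
    for path in sorted(Path(root).glob("*/*/*/*.csv")):
        # The three directory levels above each file encode the experiment.
        sample, run, condition = path.parts[-4:-1]
        records.append({
            "sample": sample,
            "run": run,
            "condition": condition,
            "path": str(path),
        })
    return records
```

With records structured this way, comparisons across samples, runs, and conditions become simple filters over a table rather than ad hoc file bookkeeping.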
An important part of this work also involved microscopy image analysis for TEM and SEM datasets, where the challenge was not only organizing files but extracting useful particle-level information. That included workflows for particle recognition, quantitative counting, size measurement, and morphology-aware analysis, all of which are difficult to do consistently by hand at scale.
What I Worked On
- Built batch-oriented processing for microscopy and spectroscopy datasets that were too large and repetitive to handle manually.
- Developed routines for file traversal, frame extraction, aggregation, structured outputs, and downstream-ready data organization.
- Built analysis workflows for TEM and SEM image datasets to support particle recognition, particle counting, size measurement, and morphology-related statistics.
- Reduced repeated manual operations and made analysis more reproducible across experiments, samples, and acquisition conditions.
- Used coding not only to accelerate analysis, but to formalize workflow logic and make decisions traceable and repeatable.
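The batch steps above (traverse, extract, aggregate, emit structured output) can be sketched as a two-stage pipeline. This is an illustrative assumption, not the project's actual code: the two-column spectrum format and the summary fields are made up for the example:

```python
import csv
from pathlib import Path
from statistics import mean

def summarize_spectrum(path):
    """Hypothetical per-file step: read a two-column (wavelength,
    intensity) CSV and reduce it to a few summary values."""
    wavelengths, intensities = [], []
    with open(path) as f:
        for row in csv.reader(f):
            wavelengths.append(float(row[0]))
            intensities.append(float(row[1]))
    peak = max(range(len(intensities)), key=intensities.__getitem__)
    return {
        "file": Path(path).name,
        "n_points": len(intensities),
        "peak_wavelength": wavelengths[peak],
        "mean_intensity": mean(intensities),
    }

def batch_summarize(root, out_csv):
    """Traverse every spectrum file under root, apply the same
    per-file step, and write one structured summary table."""
    rows = [summarize_spectrum(p) for p in sorted(Path(root).rglob("*.csv"))]
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=rows[0].keys())
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

The point of the structure is that every file goes through the identical reduction, so the resulting table is comparable by construction rather than by discipline.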
TEM and SEM Particle Recognition
One of the more practically important aspects of this project was image analysis for electron microscopy. In TEM and SEM datasets, useful conclusions often depend on extracting quantitative particle-level information rather than simply inspecting representative images. I was interested in building workflows that could move from image collections to structured statistics, including particle counts, size distributions, and morphological descriptors.
This kind of work sits naturally between classical image analysis and computer vision. Even when the workflow is not framed as a full machine learning problem, it still requires the same mindset: define the signal, separate it from background and artifacts, and produce measurements that are consistent enough to support comparison across datasets. That experience also helped shape my later interest in more explicit computer vision and image recognition tasks.
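The classical pipeline described here is threshold, label, measure. A dependency-free Python sketch of that logic follows; in practice a library such as scikit-image handles each step far more robustly, and the fixed threshold and bounding-box descriptor below are illustrative simplifications:

```python
from collections import deque

def threshold(image, level):
    """Binarize a grayscale image (list of rows): True where the
    pixel value exceeds the (illustrative, fixed) threshold level."""
    return [[px > level for px in row] for row in image]

def label_particles(mask):
    """4-connected flood fill: return one list of (row, col) pixel
    coordinates per connected foreground region (particle)."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    particles = []
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                pixels, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:
                    y, x = queue.popleft()
                    pixels.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                particles.append(pixels)
    return particles

def particle_stats(particles):
    """Per-particle area and bounding-box aspect ratio, a crude
    stand-in for richer morphology descriptors."""
    stats = []
    for pixels in particles:
        ys = [p[0] for p in pixels]
        xs = [p[1] for p in pixels]
        height = max(ys) - min(ys) + 1
        width = max(xs) - min(xs) + 1
        stats.append({
            "area": len(pixels),
            "aspect_ratio": max(height, width) / min(height, width),
        })
    return stats
```

Counts, size distributions, and morphology statistics then fall out of the labeled regions, which is exactly the move from image collections to structured, comparable measurements.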
How I Think About This Work
What I value most about this project is that it changed how I think about coding in scientific settings. At first, the motivation was practical: there was simply too much data to process by hand. But over time, I realized that automation was doing more than saving time. It was forcing the analysis to become explicit, structured, and reproducible, which in turn made the science easier to trust and easier to extend.
That shift is a big part of why I became more interested in computation-heavy work. Once a workflow is well structured, it becomes a foundation for better quantitative analysis, cleaner comparisons across experiments, and eventually more advanced modeling. In that sense, this project sits upstream of many of the machine learning directions I later became interested in.
Operational and Technical Value
- Improved reproducibility by ensuring that repeated analyses followed the same logic across datasets.
- Made large experimental studies easier to compare by preserving structure and reducing ad hoc analysis choices.
- Made particle-level measurements from TEM and SEM images more systematic and scalable.
- Created a more scalable foundation for future modeling, visualization, and machine learning.
- Reflects the kind of work I find highly relevant to R&D environments, where throughput, consistency, and data quality all matter.
Why It Matters
This project marks the point where coding became central to how I work. It taught me that computation is valuable not only for modeling, but also for making experimental science more reliable, scalable, and easier to learn from. It is also one of the clearest examples of how I approach problems more broadly: define the workflow carefully, reduce friction in the process, and build systems that make later analysis stronger rather than more complicated.