Pulse: An Outlier-Sensitive Downsampling Algorithm for Timeseries Data
Purpose-built for preserving distinct features in large datasets
Overview
Pulse is a downsampling algorithm designed for timeseries data that contain brief but important outliers. Many datasets become so large that visualization tools and web browsers cannot render them efficiently. Standard downsampling algorithms reduce data volume, but they often fail to preserve distinct features or transitions. Pulse was created to address this gap.
The algorithm was originally developed to downsample galvanostatic electrolysis stack test data at the Idaho National Laboratory. These datasets contained approximately four million records, most of which were extremely uniform. However, during short periods when the stack test changed state, such as powering on or adding load, the data produced sparse, asymptote-like spikes. Existing downsampling algorithms were unable to preserve these features. Pulse instead retains these important outliers while aggressively reducing the uniform sections of the dataset.
This same approach makes Pulse useful in other fields where sparse, meaningful anomalies matter, including seismology and astronomy.
Why Pulse Exists
Large timeseries datasets overwhelm plotting tools, leading analysts to rely on downsampling. Most algorithms aim to preserve general trends, which works for smooth signals but fails when datasets rely on distinct events or transitions. Pulse was created specifically to preserve these critical anomalies while reducing the rest.
How Pulse Works
Pulse evaluates timeseries data by identifying sparse but important deviations produced during state changes or other significant events. These outliers are kept intact, while the algorithm removes large volumes of uniform data. This allows users to view and analyze very large datasets without losing scientifically meaningful information.
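The description above does not spell out Pulse's detection rule, so the following is only an illustrative sketch of the general idea, not the Pulse implementation. It assumes a simple neighbour-difference test: a point is treated as an outlier if it differs from the adjacent sample by more than a threshold, and uniform stretches are thinned to every Nth point. The names downsample_sketch, threshold, and stride are hypothetical and chosen for this example only.

```python
def downsample_sketch(values, threshold, stride):
    """Return sorted indices of the points to keep.

    Illustrative assumption, not the Pulse algorithm itself:
    - a jump larger than `threshold` between neighbours marks an outlier,
    - uniform stretches are thinned to every `stride`-th point.
    """
    if stride < 1:
        raise ValueError("stride must be at least 1")
    n = len(values)
    if n == 0:
        return []

    keep = {0, n - 1}                  # always keep the endpoints
    keep.update(range(0, n, stride))   # sparse samples from uniform sections

    for i in range(1, n):
        if abs(values[i] - values[i - 1]) > threshold:
            # Keep both sides of a sharp change so the feature's shape survives.
            keep.update((i - 1, i))

    return sorted(keep)


# Synthetic signal: flat at 1.0 with one brief spike at index 500.
signal = [1.0] * 1000
signal[500] = 9.0

kept = downsample_sketch(signal, threshold=0.5, stride=100)
reduced = [(i, signal[i]) for i in kept]
```

On this synthetic signal the sketch keeps 13 of the 1000 points, including the spike at index 500 and its two neighbours. A fixed threshold is the simplest possible choice; a real implementation would likely need a rule that adapts to the noise level of the data.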
Benefits
Pulse is aimed at anyone working with large, anomaly-focused datasets. Developers contributing to the SciPy Python ecosystem may be interested in an algorithm that fills a specific gap in timeseries visualization. Researchers in electrolysis testing, seismology, astronomy, or any other domain with rare but important anomalies can use Pulse to reduce data volume without losing critical information.
Why It Matters
By preserving distinct features while reducing overall data density, Pulse enables more accurate visualization and interpretation of large timeseries datasets. It supports research workflows that depend on capturing brief but meaningful events without overwhelming visualization tools.