NHR3 NGA-CycleCount Dataset

EQ Ground-Motion Cycle Count Database:
NGA-West2 + NGA-Subduction Datasets
R250310

March 2025

Brian Carlton[1], Silvia Mazzoni[2], Yousef Bozorgnia[3]

Four datasets have been published on Zenodo:
NGA-West2:
NGA-West2 PGA-Bins Dataset: 12 files, 53.9 MB (format: .hdf5)
NGA-West2 Raw-Data Dataset: 12 files, 6.6 GB (format: .hdf5)
NGA-Subduction:
NGA-Subduction PGA-Bins Dataset: 12 files, 186.4 MB (format: .hdf5)
NGA-Subduction Raw-Data Dataset: 84 files, 18.5 GB (format: .hdf5)
How to Cite These Datasets:
NGA-West2 PGA-Bins Dataset:
Carlton, B., Mazzoni, S., & Bozorgnia, Y. (2025). EQ Ground-Motion Cycle Count Database: NGA-West2 PGA-Bins Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15015061
NGA-West2 Raw-Data Dataset:
Carlton, B., Mazzoni, S., & Bozorgnia, Y. (2025). EQ Ground-Motion Cycle Count Database: NGA-West2 Raw-Data Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15003087
NGA-Subduction PGA-Bins Dataset:
Carlton, B., Mazzoni, S., & Bozorgnia, Y. (2025). EQ Ground-Motion Cycle Count Database: NGA-Subduction PGA-Bins Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.15015213
NGA-Subduction Raw-Data Dataset:
Carlton, B., Mazzoni, S., & Bozorgnia, Y. (2025). EQ Ground-Motion Cycle Count Database: NGA-Subduction Raw-Data Dataset [Data set]. Zenodo. https://doi.org/10.5281/zenodo.14967664

[1] Senior Geotechnical Engineer at Norwegian Geotechnical Institute (NGI), Oslo, Norway. Email: Brian.Carlton@ngi.no
[2] Project Scientist, UCLA, Civil and Environ. Eng., Los Angeles, CA, 90095. Email: smazzoni@g.ucla.edu
[3] Professor, UCLA, Civil and Environ. Eng., Los Angeles, CA, 90095. Email: yousefbozorgnia@g.ucla.edu
Introduction
Cyclic shearing caused by earthquake shaking can cause pore pressure build-up and liquefaction in cohesionless soils, strength and stiffness degradation in cohesive soils, and fatigue damage in structures. Seed et al. (1975) proposed a method to convert an earthquake time series to an equivalent number of uniform cycles (neq). The main concept is that neq uniform cycles at a given cyclic stress ratio (CSR = applied shear stress, τ, divided by the vertical effective stress, σ'v) predict the same amount of damage as the actual ground motion. The first step of the method is to convert an acceleration time series to a series of uniform cycles at different amplitudes using a cycle counting method (CCM). The second step is to combine the uniform cycles at different amplitudes into an equivalent number of uniform cycles at a single amplitude using a weighting factor curve (WFC). Stelzer et al. (2020) showed that different CCMs applied with the same WFC can predict neq values with average differences of up to 35%. The choice of CCM used to convert an acceleration time series to a series of uniform cycles therefore carries a large amount of epistemic uncertainty.
Due to this uncertainty, and to make the database applicable to as many types of analyses as possible, we use four different cycle counting methods: peak counting, mean-crossing, level-crossing, and rainflow counting. This allows practitioners to pick the CCM that is most appropriate for their application and site, or to use several CCMs to include epistemic uncertainty in their analyses. In addition, we use three different duration filters, resulting in 12 different measures of uniform cycles per acceleration time series. Finally, we provide both the raw data and the aggregated number of cycles per amplitude bin, where we have chosen 10 amplitude bins evenly spaced between 0 and the PGA of the acceleration time series.
Methodology
Duration filtering
There is a significant amount of processing necessary before acceleration time series can be used in engineering analyses (Boore and Bommer, 2005). Part of the processing procedure for the NGA databases is to taper the beginning and end of the acceleration time series (Ancheta et al., 2014). Tapering reduces the acceleration amplitude and shifts cycles from larger amplitude bins to smaller amplitude bins. In addition, the beginning and end portions of acceleration time series in the NGA databases can include long sections of background noise. As a result, counting the cycles of the entire acceleration time series as provided may lead to inflated values for small amplitude cycles. Therefore, we created three separate databases by truncating the acceleration time series to use either 1) the entire time series as given in the NGA database (no truncation), 2) only the portion between the times at which 5% and 95% of the total Arias intensity have accumulated, or 3) only the portion between the times at which 5% and 75% of the total Arias intensity have accumulated, where Arias intensity is a cumulative measure over time proportional to earthquake energy (Arias, 1970).
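The truncation described above can be illustrated with a short Python sketch. It is not the code used to build the database: the function name arias_window, the rectangle-rule integration, and the synthetic record are assumptions made for illustration only, and the acceleration is assumed to be given in units of g.

```python
import numpy as np

def arias_window(acc_g, dt, lower=0.05, upper=0.95, g=9.81):
    """Return (i_start, i_end): sample indices at which the normalized
    cumulative Arias intensity first reaches `lower` and `upper`."""
    acc = np.asarray(acc_g, dtype=float) * g           # g -> m/s^2
    # Ia(t) = pi / (2 g) * integral of a(t)^2 dt, approximated by a running sum
    ia = np.pi / (2.0 * g) * np.cumsum(acc ** 2) * dt
    build_up = ia / ia[-1]                              # rises from 0 to 1
    i_start = int(np.searchsorted(build_up, lower))
    i_end = int(np.searchsorted(build_up, upper))
    return i_start, i_end

# Synthetic record (hypothetical): 10 s of strong shaking inside 40 s of weak noise
dt = 0.01
t = np.arange(0, 40, dt)
rng = np.random.default_rng(0)
acc = 0.001 * rng.standard_normal(t.size)
strong = (t > 10) & (t < 20)
acc[strong] += 0.3 * np.sin(2 * np.pi * 2.0 * t[strong])

i5, i95 = arias_window(acc, dt, 0.05, 0.95)
_, i75 = arias_window(acc, dt, 0.05, 0.75)
acc_5_95 = acc[i5:i95 + 1]    # duration filter 2
acc_5_75 = acc[i5:i75 + 1]    # duration filter 3
print(t[i5], t[i75], t[i95])
```

In this example both truncated windows fall inside the strong-shaking interval, so the low-level noise before and after it no longer contributes small-amplitude cycles.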
Cycle counting methods
For each of the three duration filters we applied four different cycle counting methods. We used peak counting, mean-crossing, level-crossing and rainflow counting methods as defined by the American Society for Testing and Materials standard E 1049 – 85 (ASTM, 2017). Each of these methods can be used to convert a continuous time series of irregular loads, such as an earthquake, into a discrete number of uniform load cycles with various amplitudes. To calculate the number of cycles based on the rainflow method, we used the program developed by Niesłony (2009), which is written in C and compiled into a MATLAB MEX function. The following sections give a brief description of each of the cycle counting methods as applied in this study.
Peak counting method
The peak counting method (Figure 1) identifies all peaks (maxima) above zero and all valleys (minima) below zero. Load cycles are then constructed by combining the largest peak with the lowest valley, the second largest peak with the second lowest valley etc., until all peaks and valleys are used (Figure 1b). If there are an odd number of peaks and valleys, the last (i.e. smallest) peak or valley is counted as a half cycle. The amplitude of the cycle is then calculated as one half of the range, where the range is the difference in amplitude between the peak and valley. Further explanation is given in ASTM (2017) section 5.2.1.

Figure 1. Peak counting method from ASTM E 1049 – 85 (ASTM, 2017)
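As a rough illustration of the pairing step, the following Python sketch applies the peak counting rules to a short made-up series. The function name peak_counting is hypothetical, flat-topped extrema are ignored for simplicity, and extrema left without a partner are returned separately rather than converted to half cycles, so this should be read as a sketch under those assumptions and not as the procedure used for the database.

```python
import numpy as np

def peak_counting(x):
    """Pair the largest peak with the lowest valley, and so on.
    Returns (cycle amplitudes, unpaired extrema)."""
    x = np.asarray(x, dtype=float)
    mid = x[1:-1]
    is_max = (mid > x[:-2]) & (mid > x[2:])     # strict local maxima
    is_min = (mid < x[:-2]) & (mid < x[2:])     # strict local minima
    peaks = np.sort(mid[is_max & (mid > 0)])[::-1]   # peaks above zero, largest first
    valleys = np.sort(mid[is_min & (mid < 0)])       # valleys below zero, lowest first
    n = min(peaks.size, valleys.size)
    amplitudes = 0.5 * (peaks[:n] - valleys[:n])     # half of each range
    unpaired = np.concatenate([peaks[n:], valleys[n:]])
    return amplitudes, unpaired

series = [0, 0.3, 0.1, 0.2, -0.4, -0.1, -0.2, 0.1, 0]
amplitudes, unpaired = peak_counting(series)
print(amplitudes, unpaired)
```

Here the peaks above zero are 0.3, 0.2 and 0.1 and the valleys below zero are -0.4 and -0.2, which gives two full cycles with amplitudes 0.35 and 0.2 and one unpaired peak of 0.1.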

Mean-crossing method
Mean-crossing (called peak counting excluding non-zero crossing peaks by Hancock and Bommer, 2005) is a variation of the peak counting method where only the largest peak or lowest valley is counted between two zero crossings (Figure 2). This method counts fewer of the small amplitude cycles that occur within larger cycles, but still counts small amplitude cycles that occur about zero. Further explanation is given in ASTM (2017) section 5.2.2.

Figure 2. Counting of peaks in the time series for the mean-crossing method from ASTM E 1049 – 85 (ASTM, 2017)
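A minimal Python sketch of the mean-crossing variant is given below, assuming a zero mean and treating the first and last excursions of the record like any other. The function name mean_crossing_counting is hypothetical, and unpaired extrema are again returned separately instead of being converted to half cycles.

```python
import numpy as np

def mean_crossing_counting(x):
    """Keep one extreme value per excursion between zero crossings,
    then pair peaks and valleys as in peak counting."""
    x = np.asarray(x, dtype=float)
    x = x[x != 0.0]                                   # give every sample a sign
    # a new excursion starts wherever consecutive samples change sign
    breaks = np.flatnonzero(np.diff(np.sign(x)) != 0) + 1
    excursions = np.split(x, breaks)
    peaks = np.sort([e.max() for e in excursions if e[0] > 0])[::-1]
    valleys = np.sort([e.min() for e in excursions if e[0] < 0])
    n = min(peaks.size, valleys.size)
    amplitudes = 0.5 * (peaks[:n] - valleys[:n])      # half of each range
    unpaired = np.concatenate([peaks[n:], valleys[n:]])
    return amplitudes, unpaired

series = [0, 0.3, 0.1, 0.2, -0.4, -0.1, -0.2, 0.1, 0]
amplitudes, unpaired = mean_crossing_counting(series)
print(amplitudes, unpaired)
```

For this series the secondary peak of 0.2 and the secondary valley of -0.2 lie inside larger excursions and are dropped, leaving a single cycle of amplitude 0.35 and one unpaired peak of 0.1.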

Level crossing method
The level-crossing method (Figure 3) counts all times that the acceleration time series crosses threshold levels above zero with a positive slope (increasing) and all times that it crosses threshold levels below zero with a negative slope (decreasing). In this study we set threshold levels at -1*PGA to 1*PGA with intervals every 0.10*PGA. The most damaging cycle is then constructed by using one count from all levels. The next most damaging cycle is constructed by using one count from all levels with counts remaining. This procedure is continued until there are no counts remaining at any levels. Similar to the peak and mean crossing methods, the amplitude of the cycle is taken as one half of the cycle range. Further explanation is given in ASTM (2017) section 5.1.

Figure 3. Level-crossing method from ASTM E 1049 – 85 (ASTM, 2017)
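The counting and cycle construction can be sketched in Python as follows. This is an illustrative reading of the description above and not the database code: the function name level_crossing_cycles is hypothetical, the zero level itself is not counted, and a cycle is built from the highest positive and lowest negative levels that still have counts remaining.

```python
import numpy as np

def level_crossing_cycles(x, n_levels=10):
    """Count level crossings at +/-0.1*PGA ... +/-1.0*PGA and build cycles
    from the largest to the smallest. Returns (amplitudes, pga)."""
    x = np.asarray(x, dtype=float)
    pga = np.max(np.abs(x))
    levels = np.arange(1, n_levels + 1) / n_levels * pga
    a, b = x[:-1], x[1:]
    # positive-slope crossings of +L and negative-slope crossings of -L
    up = np.array([np.sum((a < L) & (b >= L)) for L in levels])
    down = np.array([np.sum((a > -L) & (b <= -L)) for L in levels])
    amplitudes = []
    while up.any() or down.any():
        hi = levels[np.flatnonzero(up)[-1]] if up.any() else 0.0
        lo = -levels[np.flatnonzero(down)[-1]] if down.any() else 0.0
        amplitudes.append(0.5 * (hi - lo))    # half of the cycle range
        up[up > 0] -= 1                       # use one count from every level
        down[down > 0] -= 1                   # that still has counts remaining
    return np.array(amplitudes), pga

amplitudes, pga = level_crossing_cycles([0, 1.0, -1.0, 0.55, -0.55, 0])
print(amplitudes / pga)
```

In this example the first cycle spans all levels (amplitude 1.0 PGA) and the second spans the levels up to 0.5 PGA on both sides (amplitude 0.5 PGA).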
Rainflow counting method
The rainflow counting method (Matsuishi and Endo, 1968) creates uniform cycle parcels by stepping through the time series and considering successive ranges (Figure 4a). If the absolute value of the next range in the acceleration time series is larger than the current range, the current range is counted as one cycle with the amplitude equal to half the range. The two points comprising the current range are then removed. If the current range includes the earliest time point remaining in the time series, it is counted as one half cycle and only the earliest time point is removed (Figure 4b and c). If the absolute value of the next range in the acceleration time series is smaller than the current range, the current range is skipped and the calculation steps one range forward. This is repeated until the absolute value of the next range in the acceleration time series is larger than the current range (Figure 4d). Once this occurs the previous and current ranges are then formed by the first three points remaining in the time series (Figure 4e), and the entire process is repeated. When the end of the time series is reached, all the remaining ranges are counted as half cycles (Figure 4f). Further explanation is given in ASTM (2017) section 5.4.4.

Figure 4. Rainflow method from ASTM E 1049 – 85 (ASTM, 2017)
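The stepping rules above can be written compactly in Python. The sketch below is a plain three-point implementation in the style of ASTM E 1049 and is not the Niesłony (2009) MEX routine used for the database. The function name rainflow_astm is hypothetical, flat-topped extrema are not treated, ranges of equal size are counted as in the standard (next range greater than or equal to the current range), and no small-cycle filtering is applied.

```python
def rainflow_astm(x):
    """Return a list of (amplitude, mean, count) tuples, count being 1.0 or 0.5."""
    x = [float(v) for v in x]
    # keep the first point, every interior turning point, and the last point
    pts = [x[0]]
    for prev, cur, nxt in zip(x, x[1:], x[2:]):
        if (cur - prev) * (nxt - cur) < 0:
            pts.append(cur)
    pts.append(x[-1])

    cycles, stack = [], []
    for p in pts:
        stack.append(p)
        while len(stack) >= 3:
            next_range = abs(stack[-1] - stack[-2])
            current_range = abs(stack[-2] - stack[-3])
            if next_range < current_range:
                break                              # step one range forward
            a, b = stack[-3], stack[-2]
            if len(stack) == 3:
                # current range contains the earliest remaining point: half cycle
                cycles.append((0.5 * current_range, 0.5 * (a + b), 0.5))
                stack.pop(0)
            else:
                # full cycle: remove both points of the current range
                cycles.append((0.5 * current_range, 0.5 * (a + b), 1.0))
                del stack[-3:-1]
    # whatever is left at the end is counted as half cycles
    for a, b in zip(stack, stack[1:]):
        cycles.append((0.5 * abs(a - b), 0.5 * (a + b), 0.5))
    return cycles

cycles = rainflow_astm([-2, 1, -3, 5, -1, 3, -4, 4, -2])
for amplitude, mean, count in cycles:
    print(amplitude, mean, count)
```

For this input the sketch yields one full cycle of amplitude 2 (range 4, mean 1) and six half cycles with amplitudes 1.5, 2, 4, 4.5, 4 and 3.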

Data output format
We provide data in two forms: raw and binned. The raw data is the direct output of each cycle counting method. For the binned data, we have aggregated the number of cycles per amplitude bin, using 10 amplitude bins evenly spaced between 0 and the PGA of the acceleration time series. The binned data is derived from the raw data, so users who need different amplitude bins can calculate them from the raw data.
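Re-binning the raw output is a small histogram operation. The Python sketch below shows one possible way to do it with NumPy, assuming the cycle amplitudes (in g), the cycle counts (1 or 0.5) and the component PGA (in g) have already been read into memory. The function name bin_cycles and the example values are hypothetical.

```python
import numpy as np

def bin_cycles(amplitudes_g, counts, pga_g, n_bins=10):
    """Sum cycle counts in n_bins amplitude bins evenly spaced between 0 and PGA."""
    ratios = np.asarray(amplitudes_g, dtype=float) / pga_g   # amplitude / PGA
    edges = np.linspace(0.0, 1.0, n_bins + 1)
    binned, _ = np.histogram(ratios, bins=edges, weights=counts)
    return binned

# Two full cycles and one half cycle for a component with PGA = 0.40 g
binned = bin_cycles([0.30, 0.17, 0.05], [1.0, 1.0, 0.5], pga_g=0.40)
print(binned)
```

Changing n_bins, or replacing edges with a custom array, gives a different binning of the same raw data.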
Raw data output
The raw data output consists of one folder per cycle counting method and duration filter (12 total). The folders are named first for the cycle counting method (lc = level crossing, mp = mean crossing, pk = peak counting, rf = rainflow) and then for the duration filter (no suffix = entire time series, 75 = truncated at 5% and 75% Arias intensity, 95 = truncated at 5% and 95% Arias intensity). For example, the folder lc75 contains raw data for the level crossing method applied to acceleration time series truncated at 5% and 75% Arias intensity. Each folder contains a separate csv file for each ground motion component. The name of the csv file is the same as that of the acceleration time series file, except with the extension .csv instead of .AT2. The following sections describe the output for each cycle counting method.
Peak Counting
The csv files for the peak counting method consist of one column, which lists the cycle amplitudes (half of the range) in g. Each row represents one cycle. If the number of rows is odd, the last row represents one half cycle.
Mean-crossing
The csv files for the mean-crossing method are formatted the same as for the peak counting method. The file consists of one column, which lists the cycle amplitudes (half of the range) in g. Each row represents one cycle. If the number of rows is odd, the last row represents one half cycle.
Level crossing
The csv files for the level crossing method have three columns. The first column lists the cycle amplitudes (half of the range) in g. The second column lists the cycle amplitudes (half of the range) as a fraction of the PGA. For example, an amplitude of 0.05 = 5% of the acceleration time series PGA. The PGA is for the acceleration time series, so it is component specific. Each component is divided by its own PGA. Column three lists the number of cycles that have the given amplitude. The amplitudes will only be in increments of 0.05 PGA, because we only calculated crossings for levels of -1*PGA to 1*PGA with intervals every 0.10*PGA. Therefore, there should not be more than 20 rows of data and one header row per csv.
Rainflow
The csv files for the rainflow method consist of 5 columns. The first column lists the cycle amplitudes (half of the range) in g. The second column lists the mean amplitude of the cycle in g. As seen in Figure 4, the mean amplitude of most cycles is not zero because the absolute values of the peak and valley used to construct it are not equal. The third column lists the number of cycles for the given amplitude and mean (1 or 0.5). The fourth column lists the start time of the cycle in seconds. The start times correspond to the times in the acceleration time series file. The fifth column lists the period of the cycle in seconds.
Binned data
The binned data consists of one csv file per cycle counting method and duration filter (12 total). The csv files have the same names as the folders for the raw data: first the cycle counting method (lc = level crossing, mp = mean crossing, pk = peak counting, rf = rainflow) and then the duration filter (no suffix = entire time series, 75 = truncated at 5% and 75% Arias intensity, 95 = truncated at 5% and 95% Arias intensity). Each csv file contains 12 columns. The first column is the ID of the acceleration time series, which is the same as the name of the acceleration time series .AT2 file. The second column is the PGA of the acceleration time series in g. Columns 3 through 12 are the number of cycles in each amplitude bin, from 0.0-0.1 PGA up to 0.9-1.0 PGA.
References
Ancheta, T. D., Darragh, R. B., Stewart, J. P., Seyhan, E., Silva, W. J., Chiou, B. S. J., Wooddell, K. E., Graves, R. W., Kottke, A. R., Boore, D. M., Kishida, T., and Donahue, J. L. (2014). NGA-West2 database. Earthquake Spectra, 30, 989–1005.
Arias, A. (1970). A measure of earthquake intensity. In Hansen, R., editor, Seismic Design for Nuclear Power Plants, pages 438–483. MIT Press.
ASTM (2017). Standard practices for cycle counting in fatigue analysis. Standard E 1049 – 85(2017), American Society for Testing and Materials, West Conshohocken, PA, USA.
Boore, D. M. and Bommer, J. J. (2005). Processing of strong-motion accelerograms: needs, options and consequences. Soil Dynamics and Earthquake Engineering, 25, 93–115.
Hancock, J. and Bommer, J. J. (2005). The effective number of cycles of earthquake ground motion. Earthquake Engineering and Structural Dynamics, 34(6):637–664.
Matsuishi, M., and Endo, T., (1968). Fatigue of metals subject to varying stress, in: Proc. Kyushu Branch of Japan Society of Mechanical Engineers, Fukuoka, 37-40.
Mazzoni, S., Kishida, T., Stewart, J. P., et al. (2021). Relational database used for ground-motion model development in the NGA-Sub project. Earthquake Spectra, 38(2), 1529–1548. doi:10.1177/87552930211055204
Niesłony, A., (2009). Determination of fragments of multiaxial service loading strongly influencing the fatigue of machine components, Mechanical Systems and Signal Processing, Vol. 23(8), pp. 2712-2721. https://www.mathworks.com/matlabcentral/fileexchange/3026-rainflow-counting-algorithm
Seed, H. B., Idriss, I. M., Makdisi, F., and Banerjee, N. (1975). Representation of irregular stress time histories by equivalent uniform stress series in liquefaction analysis. Technical Report EERC 75-29, Earthquake Engineering Research Center, College of Engineering, University of California, Berkeley, California, USA.
Stafford, P.J., Bommer, J.J. (2009). Empirical equations for the prediction of the equivalent number of cycles of earthquake ground motion. Soil Dynamics and Earthquake Engineering, 29: 1425-1436.
Stelzer, R., Carlton, B., Mazzoni, S. (2020). Comparison of cycle counting methods for potential liquefaction or structural fatigue assessment. Proceedings 17th WCEE, Sendai, Japan, 13-18.

ABOUT THE PROCESSING ALGORITHMS:
Summary of Key Processing Steps
✔ Read earthquake acceleration data from .AT2 files.
✔ Compute seismic parameters such as PGA and Arias intensity.
✔ Identify duration segments between 5-75% and 5-95% of the accumulated Arias intensity.
✔ Apply four cycle counting methods:
Rainflow (identifies stress cycles)
Peak counting and mean-crossing (count half-cycle amplitudes)
Level crossing (counts how often acceleration crosses certain thresholds)
✔ Save results into different CSV files for further analysis

Pseudo-code for Estimating Number of Cycles and Cycle Amplitudes — PGA bins
Initialize the script
Start a timer.
Clear variables, the command window, and close all figures.
Add the current directory to the search path.
Set up input parameters
Define thisRun as either "NGASubduction" or "NGAWest2".
Based on the selected option, set the InputFolder and OutputFolder paths.
Define bin edges (Nedge) ranging from 0 to 1 in steps of 0.1.
Retrieve file names
Change to the InputFolder directory.
Get a list of all files with the .AT2 extension.
Store file names in a list (GM).
Convert file names to valid structure names for MATLAB.
Preallocate arrays for efficiency
Create empty arrays (rf, rf75, rf95, pk, pk75, pk95, mp, mp75, mp95, lc, lc75, lc95) to store results.
Loop through each file
Set the start and end indices (istart and iend).
For each file in the list:
Read the acceleration data
Open the file and extract the time step (dt).
Read acceleration values into an array (A).
Compute the corresponding time vector (T).
Close the file.
Estimate seismic parameters
Define gravity constant g = 9.81.
Compute Peak Ground Acceleration (PGA) as the maximum absolute value of A.
Store PGA values in the result arrays.
Compute Arias Intensity as the cumulative sum of normalized acceleration squared.
Find the time intervals for significant durations (5-75% and 5-95% Arias Intensity).
If no valid duration is found, print a warning and skip to the next file.
Extract acceleration data for significant durations
Create subsets of acceleration data (A75, A95) for the identified time intervals.
Compute corresponding time vectors (T75, T95).
Apply the Rainflow Counting Method
Extract extrema from acceleration data.
Apply the rainflow function to count cycles.
Normalize cycle amplitudes by PGA.
Remove small cycles (less than 1% of PGA).
Identify half-cycles and adjust counts.
Store results in the rf array.
Repeat the process for A75 and A95 and store results in rf75 and rf95.
Apply the Peak Counting Method
Use the countpeaks function to count peaks.
Apply this to the original acceleration, as well as A75 and A95.
Store results in pk, pk75, pk95, mp, mp75, and mp95.
Apply the Level Crossing Method
Define level thresholds based on PGA.
Count how many times the acceleration crosses each level.
Apply this method to the original acceleration, A75, and A95.
Store results in lc, lc75, and lc95.
Save the results to CSV files
Convert the results into table format.
Write tables to CSV files for each method (rf, rf75, rf95, pk, pk75, etc.).
Save all output files in the OutputFolder.
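
The file-reading and PGA steps of the pseudo-code above can be sketched in Python as follows. This is an assumption-laden illustration and not a translation of the MATLAB script: it assumes a four-line header whose text contains an entry of the form "DT= <value>", followed by whitespace-separated acceleration values in g. The function name read_at2 and the sample record are hypothetical, and files with a different header layout would need a different parser.

```python
import io
import re
import numpy as np

def read_at2(handle, n_header=4):
    """Return (dt, acc, t) from an open .AT2-style text file (assumed layout)."""
    lines = handle.read().splitlines()
    header = " ".join(lines[:n_header])
    match = re.search(r"DT\s*=\s*([0-9.Ee+-]+)", header)
    if match is None:
        raise ValueError("no DT= entry found in the header")
    dt = float(match.group(1))
    acc = np.array(" ".join(lines[n_header:]).split(), dtype=float)
    t = np.arange(acc.size) * dt               # time vector starting at zero
    return dt, acc, t

# Hypothetical record, used only to exercise the parser
sample = io.StringIO(
    "PEER NGA STRONG MOTION DATABASE RECORD\n"
    "Hypothetical Event, 1/1/2000, Hypothetical Station, 000\n"
    "ACCELERATION TIME SERIES IN UNITS OF G\n"
    "NPTS=     8, DT=   .0100 SEC\n"
    "  .1000000E-02  -.2000000E-02   .5000000E-02  -.8000000E-02   .3000000E-02\n"
    " -.1000000E-02   .4000000E-03   .0000000E+00\n"
)
dt, acc, t = read_at2(sample)
pga = np.max(np.abs(acc))                      # peak ground acceleration in g
print(dt, acc.size, pga)
```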

Pseudo-Code for Estimating Number of Cycles and Cycle Amplitudes — Raw Data
1. Setup and Initialization
Clear variables, command window, and close all open figures.
Add the current directory to the MATLAB search path.
2. Define Input Parameters
Set the dataset type (thisRun) to either "NGASubduction" or "NGAWest2".
Define input and output folder paths based on the dataset type.
Define different cycle counting methods to be used (rf, pk, mp, lc with their variations).
3. Create Output Folders
Loop through each method and check if its corresponding output folder exists.
If the folder does not exist, create it.
4. Get List of Acceleration Time Series Files
Change directory to the input folder.
Get all files with the ".AT2" extension.
Store the file names and convert them to valid structure names.
5. Define Processing Range
Set the start (istart) and end (iend) indices for processing the files.
These define which files will be processed in the loop.
6. Loop Through Each File and Process Data
For each acceleration time series file in the selected range:
6.1 Read and Extract Data
Open the file and read the time step (dt).
Read the acceleration data (A) and compute the time vector (T).
Close the file.
6.2 Compute Seismic Parameters
Set gravitational acceleration constant (g = 9.81 m/s²).
Compute Peak Ground Acceleration (PGA) as the max absolute acceleration value.
Compute Arias intensity, which represents the energy content of the earthquake record.
6.3 Determine Significant Durations
Find the time intervals where Arias intensity is between 5% and 75% of the total value (t_5_75).
If no valid time interval is found, display an error message and skip this file.
Repeat for the 5% to 95% Arias intensity duration (t_5_95).
6.4 Extract Segments for Durations 5-75% and 5-95%
Extract acceleration and time data for 5-75% duration (A75, T75).
Extract acceleration and time data for 5-95% duration (A95, T95).
6.5 Apply Rainflow Cycle Counting Method
Extract extrema (turning points) from the original acceleration data.
Apply rainflow counting to detect cyclic loading patterns.
Normalize cycle amplitudes by PGA.
Remove cycles with amplitudes less than 1% of PGA.
Repeat the rainflow method for A75 and A95.
6.6 Apply Peak Counting Method
Count the number of half-cycle peaks in the original acceleration data.
Repeat peak counting for A75 and A95.
6.7 Apply Level Crossing Method
Define acceleration levels for counting crossings.
Count how many times the acceleration crosses each level.
Repeat level crossing analysis for A75 and A95.
7. Save Results to CSV Files
For each method (rf, pk, mp, lc and their variations):
Convert numerical results to scientific notation format.
Convert the results into a table format with appropriate column names.
Save the table as a .csv file in the corresponding method's output folder.
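
Step 7 can be illustrated with a few lines of Python using the standard csv module. The function name write_cycles_csv, the column names, and the choice of six decimal digits are hypothetical, since the exact header text and number format of the published files are not specified here.

```python
import csv
import io

def write_cycles_csv(handle, header, rows, digits=6):
    """Write one header row, then every value in scientific notation."""
    writer = csv.writer(handle, lineterminator="\n")
    writer.writerow(header)
    for row in rows:
        writer.writerow([f"{value:.{digits}e}" for value in row])

# One rainflow half cycle: amplitude (g), mean (g), count, start time (s), period (s)
header = ["amplitude_g", "mean_g", "count", "start_time_s", "period_s"]
rows = [(1.5e-2, -5.0e-3, 0.5, 12.34, 0.48)]
buffer = io.StringIO()                 # stands in for an open output file
write_cycles_csv(buffer, header, rows)
print(buffer.getvalue())
```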
