sinefit documentation

sinefit fits a least-squares estimate of a sinusoid to time series data that have a periodicity of 1 year.

For related functions, see sineval and sinefit_bootstrap documentation.

Back to Climate Data Tools Contents



ft = sinefit(t,y)
ft = sinefit(t,y,'terms',TermOption)
[ft,rmse] = sinefit(...)


ft = sinefit(t,y) fits a sinusoid with a periodicity of 1 year to input data y collected at times t. Input times can be in datenum, datetime, or datestr format, and do not need to be sampled at regular intervals. The output ft contains the coefficients of the terms in the calculation described below.

ft = sinefit(t,y,'terms',TermOption) specifies which terms are calculated in the sinusoid fit. TermOption can be 2, 3, 4, or 5:

[ft,rmse] = sinefit(...) returns the root-mean-square error of the residuals of y after removing the sinusoidal fit. This is one measure of how well the sinusoid fits the data, but for a more in-depth understanding of the uncertainties, including uncertainties in the timing, see sinefit_bootstrap.

Example 1: Fit a sinusoid to toy data

In this example we'll make up some data with known parameters, then use sinefit to fit a sinusoid to the data. Let's assume you have 100 measurements taken between Jan 1, 2005 and today. Use sineval to generate a sinudoid of 75 unit amplitude, having a maximum value on day 123 of each year (July 5), and a long-term linear trend of -2.2 units per year. Note that in sineval we'll also have to enter a somewhat strange value of 5e3, which is just the y-intercept at year zero, but is not very meaningful to us in the 21st century.

To make it more of a challenge for sinefit, we'll also add gaussian noise that has a standard deviation of 36 units. (This is an appreciable amount of noise relative to the 75 unit amplitude of the sinusoid.)

% 100 random dates between January 1, 2005 and today:
t = datenum('jan 1, 2005') + (now-datenum('jan 1, 2005'))*rand(100,1);

% Sinusoid of amplitude 75; max on day 123 (July 5); trend -2.2 units/year:
sine_parameters = [75 123 5e3 -2.2];
y = sineval([75 123 5e3 -2.2],t) + 36*randn(size(t)); % sinusoid + noise

% Plot the 100 data points:
axis tight % removes white space
box off    % removes frame
datetick('x','keeplimits') % formats x date labels

Believe it or not, that dataset is sinusoidal. If you have Matlab version R2014b or later, you can see for yourself by plotting the data as a function of day of year:

% get the day of year corresponding to each measurement t:
doy = day(datetime(t,'convertfrom','datenum'),'dayofyear');


axis tight
box off
xlabel 'day of year'

In this toy dataset, we know we should have a sinusoid with an amplitude of 75 units, a maximum value on day 123, and a long term trend of -2.2 units per year. And since we intentionally added 36 units of gaussian noise, we know the sinusoid should match the "measurements" to about an rms error of 36 units.

[ft,ft_error] = sinefit(t,y,'terms',4)
ft =
   1.0e+03 *
    0.0763    0.1145    4.7391   -0.0021
ft_error =

The numbers above tell us that sinefit has determined our 100 noisy datapoints are characterized by a sinusoid with an amplitude of 77.81 units (quite close to the prescribed value of 75), a maximum on day 123.4 (within a half day of the presecribed) and a linear trend of -2.26 units per year (-2.2 presecribed). The optional second output of the sinefit function tells us the sinusoid matches the "measurements" to a 1-sigma uncertainty of 36.37, which matches our prescribed value of gaussian noise.

Here is the sinusoid fit to the data:

% A daily array of times from the first datapoint to the last:
t_daily = min(t):max(t);

% The fit sinusoid at daily resolution:
y_daily = sineval(ft,t_daily);

figure(1) % goes back to the first figure
hold on

For a more in-depth analysis of uncertainty in the sinusoid fit, check out sinefit_bootstrap, which can provide 1-sigma levels of uncertainty for each parameter like this (give it a few seconds):

ftb = sinefit_bootstrap(t,y,'terms',4);

% 1 sigma uncertainty for each parameter:
ans =
   1.0e+03 *
    0.0049    0.0040    1.9926    0.0010

That tells us sinefit should be accurate to 1-sigma levels of 4.87 for the amplitude (77.81 versus the prescribed value of 75--yep); timing should be accurate within 4.14 days (day of maximum is 123.4 versus prescribed 123--yep); and the trend should be accurate within 0.93 units per year (-2.26 versus prescribed value of -2.2--yep.)

Example 2: Fit a sinusoid to real sea ice data

Let's take a look at some real sea ice data from the NSIDC. I've packaged up a time series of sea ice extent in .mat format, and included in this File Exchange upload so you can follow along:

% Load sample data:
load seaice_extent

% Plot raw data:
hold on
axis tight
box off
ylabel 'NH sea ice extent (10^6 km^2)'

Quite clearly that data has some periodicity to it at the 1/yr frequency. How much does Northern Hemisphere sea ice extent vary at subannual timescales, when does it typically reach a maximum value, and how much is sea ice changing in terms of long-term trend?

% Fit a sinusoid:
ft = sinefit(t,extent_N,'terms',4)
ft =
    4.4079   66.8394  125.2987   -0.0569

Similar to Example 1, the output of sinefit tells us Arctic sea ice tends to vary by about 4.41 million square kilometers each year, it reaches a maximum around day 66 (March 7) (see note below), and northern hemisphere sea ice has decreased by 60,000 square kilometers per year since 1978.

Here's the sinusoid fit plotted on top of the raw data:

hold on
legend('raw data','fit by sinefit')

And zoom in a bit to see the structure:

xlim([datetime(1995,1,1) datetime(2000,1,1)])

Note to users

One brief note related to the all the parameters estimated by sinefit: These parameters describe the best-fit sinusoid, but that does not necessarily mean they fully describe the behavior of the underlying data itself. For example, in terms of climatological average, Northern Hemisphere sea ice extent actually typically reaches a maximum around day 71 (March 12), whereas sinefit says the maximum value of the best-fit sinusoid occurs on day 66 (March 7). That's because the true behavior of sea ice extent is more complex than a simple sinusoid. In your work, be sure to consider the difference between true behavior and the 1/yr frequency component of the true behavior.

Author Info

This function is part of the Climate Data Toolbox for Matlab. The sinefit, sineval, and sinefit_bootstrap functions as well as the supporting documentation were written by Chad A. Greene of the University of Texas Institute for Geophysics.