CFTool uses one of three non-linear regression algorithms to find the best fit between a chosen function and your data.
Like all regression methods, these algorithms make a number of underlying assumptions about the data and its statistical properties.
Assumptions in non-linear regression
CFTool’s algorithms assume that:
- The sample is representative of the population.
- The independent variables are measured with negligible error compared to the dependent variables. (The exception is Orthogonal Distance regression, where X errors are explicitly included.)
- The variance of the residuals is constant across all observations.
- The residuals are uncorrelated with one another.
These assumptions are standard for regression analysis and are generally reasonable when the underlying data follow a normal (Gaussian) distribution — which is true for much scientific and real-world data.
Small sample size
When the number of data points is small (typically fewer than about 30, depending on the function), the confidence intervals calculated by ordinary regression will usually be too narrow — in other words, the uncertainties in the fitted coefficients will be underestimated.
This effect arises because small samples tend to give a smaller apparent spread than the true population.
In a small dataset, it is statistically more likely that most points lie near the population mean, simply because there are not enough points to sample the full tails of the distribution.
As the sample size increases, the sample standard deviation approaches the true population standard deviation, and the calculated uncertainties become more accurate.
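The effect described above can be illustrated with a short simulation. This is only a sketch, not part of CFTool: it assumes data drawn from a standard normal distribution (true standard deviation 1.0), and the function name mean_sample_stdev and the trial count are chosen here purely for illustration.

```python
import random
import statistics


def mean_sample_stdev(sample_size, trials=5000, seed=0):
    """Average sample standard deviation over many samples from N(0, 1).

    Hypothetical helper for illustration only; the true standard
    deviation of the population is 1.0.
    """
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(sample_size)]
        total += statistics.stdev(sample)  # uses the n - 1 denominator
    return total / trials


# Compare the average apparent spread for small and larger samples.
for n in (3, 10, 30):
    print(n, round(mean_sample_stdev(n), 3))
```

Under these assumptions the average for three points should come out noticeably below 1.0, and the average for thirty points should be much closer to it.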
Correcting for small-sample bias
To correct for the underestimate of uncertainty in small samples, CFTool uses the Student’s t-distribution rather than a standard Gaussian distribution when calculating confidence intervals.
The Student’s t-distribution has wider tails, reflecting the greater uncertainty inherent in small datasets.
This results in confidence intervals that more accurately represent the true range of possible values for the fitted coefficients.
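To give a feel for how much wider the Student's t intervals are, the sketch below compares the two-sided 95% multiplier from the t-distribution with the Gaussian one. It is not CFTool's own code: the function names are made up for this example, and the t-distribution is evaluated by simple numerical integration and bisection so that only the Python standard library is needed.

```python
import math
from statistics import NormalDist


def t_cdf(t, dof, steps=1000):
    """Student's t cumulative probability for t > 0.

    Integrates the density over [0, t] with Simpson's rule and adds
    the 0.5 contributed by the symmetric negative half.
    """
    norm = math.gamma((dof + 1) / 2) / (
        math.sqrt(dof * math.pi) * math.gamma(dof / 2)
    )

    def pdf(x):
        return norm * (1 + x * x / dof) ** (-(dof + 1) / 2)

    h = t / steps
    total = pdf(0.0) + pdf(t)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * pdf(i * h)
    return 0.5 + total * h / 3


def t_critical(dof, confidence=0.95):
    """Two-sided t multiplier, found by bisection on the CDF."""
    target = 0.5 + confidence / 2
    lo, hi = 0.0, 200.0  # assumed search range, ample for 95% intervals
    for _ in range(50):
        mid = (lo + hi) / 2
        if t_cdf(mid, dof) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def z_critical(confidence=0.95):
    """Two-sided Gaussian multiplier for the same confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


print("Gaussian:", round(z_critical(), 3))
for dof in (1, 2, 10, 30):
    print("t, dof =", dof, ":", round(t_critical(dof), 3))
```

The Gaussian multiplier is about 1.96 regardless of sample size, whereas the t multiplier is roughly 12.7 for a single degree of freedom and falls toward the Gaussian value as the degrees of freedom increase.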
Degrees of freedom
The width of the confidence intervals depends on the degrees of freedom of the fit, defined as:
Degrees of freedom = (Number of points) – (Number of coefficients)
Examples:
- Fitting y = a*x + b to three data points gives 1 degree of freedom.
- Fitting y = a*x to the same three points gives 2 degrees of freedom.
CFTool can still produce a fit even with a single degree of freedom, but it will issue a warning that the resulting uncertainties and fit quality indicators (such as R²) may not be statistically meaningful.
This is not a program error — it simply reflects the fact that a dataset with too few points cannot reliably define a statistical distribution for the fitted parameters.
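The degrees-of-freedom calculation and the kind of warning described above might be sketched as follows. The function names, the warning text, and the threshold of one degree of freedom are assumptions made for this illustration; CFTool's actual checks may differ.

```python
import warnings


def degrees_of_freedom(n_points, n_coefficients):
    """Degrees of freedom = (number of points) - (number of coefficients)."""
    return n_points - n_coefficients


def check_degrees_of_freedom(n_points, n_coefficients, warn_at_or_below=1):
    """Return the degrees of freedom, warning when there are very few.

    The threshold `warn_at_or_below` is a hypothetical choice for this
    sketch, not a documented CFTool setting.
    """
    dof = degrees_of_freedom(n_points, n_coefficients)
    if dof <= warn_at_or_below:
        warnings.warn(
            f"Only {dof} degree(s) of freedom: uncertainties and fit "
            "quality indicators may not be statistically meaningful."
        )
    return dof


# y = a*x + b fitted to three points: 1 degree of freedom, warning issued.
print(check_degrees_of_freedom(3, 2))

# y = a*x fitted to the same three points: 2 degrees of freedom, no warning.
print(check_degrees_of_freedom(3, 1))
```

The fit itself is not blocked in this sketch; the warning only flags that the statistics derived from it should be treated with caution.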
Summary
In short:
- CFTool calculates confidence intervals using the Student’s t-distribution.
- This ensures more realistic uncertainties, especially for small datasets.
- The reliability of the results improves as the degrees of freedom increase.
- Warnings for very low degrees of freedom are informational, not errors — they remind you that statistical significance may be limited.