Curve Fitting
Curve fitting constructs a curve from a data set so that values can be approximated at points that are not in the data set.
Interpolation of Different Degree Polynomials
A curve can be fitted through a data set using a polynomial of any degree; a polynomial of degree n has n + 1 coefficients and can pass exactly through n + 1 data points. To find the coefficients, each data point is substituted into the polynomial to form a system of linear equations, which is written as a matrix and solved using Gaussian elimination.
Example:
Data:
Displacement (m) | Velocity (m/s) |
---|---|
0 | 0 |
2 | 1.3 |
4 | 3.7 |
6 | 8.1 |
8 | 13.9 |
10 | 20 |
A polynomial of degree 5 has 6 unknown coefficients, so the 6 data points give 6 equations, each with one data point substituted for x and y.
$$a_0 + a_1(0)^1 + a_2(0)^2 + a_3(0)^3 + a_4(0)^4 + a_5(0)^5 = 0$$
$$a_0 + a_1(2)^1 + a_2(2)^2 + a_3(2)^3 + a_4(2)^4 + a_5(2)^5 = 1.3$$
...
$$a_0 + a_1(10)^1 + a_2(10)^2 + a_3(10)^3 + a_4(10)^4 + a_5(10)^5 = 20$$
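These are then put into matrix form, and Gaussian elimination produces the final coefficient values.
$$\begin{bmatrix} 1 & 0 & 0 & 0 & 0 & 0 \\ 1 & 2 & 4 & 8 & 16 & 32 \\ 1 & 4 & 16 & 64 & 256 & 1024 \\ 1 & 6 & 36 & 216 & 1296 & 7776 \\ 1 & 8 & 64 & 512 & 4096 & 32768 \\ 1 & 10 & 100 & 1000 & 10000 & 100000 \end{bmatrix} \begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ a_3 \\ a_4 \\ a_5 \end{bmatrix} = \begin{bmatrix} 0 \\ 1.3 \\ 3.7 \\ 8.1 \\ 13.9 \\ 20 \end{bmatrix}$$
This can be solved using the MATLAB command >> A\b, where A is the matrix of coefficients (the powers of the displacement values) and b is the column vector of right-hand-side values of the equations (the velocities).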
$$f(x) = 0.8125x - 0.251x^2 + 0.1021x^3 - 0.0091x^4 + 0.0003x^5$$
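The procedure described above can be sketched in Python using only the standard library. This is a minimal illustration, not a definitive implementation: the names `gaussian_solve` and `poly` are hypothetical, and the partial pivoting step is an assumption added for numerical robustness (the text only specifies Gaussian elimination).

```python
def gaussian_solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting (illustrative helper)."""
    n = len(b)
    # Work on an augmented copy [A | b] so the inputs are left untouched.
    M = [[float(v) for v in row] + [float(rhs)] for row, rhs in zip(A, b)]
    for col in range(n):
        # Partial pivoting (an assumption, not stated in the text):
        # move the largest remaining entry in this column onto the diagonal.
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if M[pivot][col] == 0.0:
            raise ValueError("matrix is singular")
        M[col], M[pivot] = M[pivot], M[col]
        # Eliminate the entries below the pivot.
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    # Back substitution, from the last row upwards.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


# Data from the example: displacement (m) and velocity (m/s).
xs = [0, 2, 4, 6, 8, 10]
ys = [0, 1.3, 3.7, 8.1, 13.9, 20]

# Each row holds the powers x^0 .. x^5 of one displacement value.
A = [[x ** k for k in range(len(xs))] for x in xs]
coeffs = gaussian_solve(A, ys)  # [a0, a1, ..., a5]


def poly(x):
    """Evaluate the fitted polynomial at x."""
    return sum(a * x ** k for k, a in enumerate(coeffs))


print([round(a, 4) for a in coeffs])
```

Rounded to four decimal places, the printed coefficients should agree with those in f(x) above. Note that the rounded coefficients no longer reproduce the data exactly at larger x, so the unrounded values in `coeffs` are the ones to use when evaluating the polynomial.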
Lagrange Interpolating Polynomial
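This type of interpolation builds the polynomial directly from the data points surrounding the point to be approximated. The following equation is a 3rd order Lagrange interpolating polynomial, using 4 data points.
$$f_3(x) = \sum_{i=0}^{3} L_i(x)\, f(x_i), \qquad L_i(x) = \prod_{\substack{j=0 \\ j \neq i}}^{3} \frac{x - x_j}{x_i - x_j}$$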
This method has several disadvantages: it requires a large amount of arithmetic, the whole polynomial must be recalculated if the data points change, and the error is difficult to estimate.
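As a rough sketch of how a Lagrange interpolating polynomial is evaluated, the Python function below multiplies out the basis terms for each data point. The function name `lagrange_interpolate` is hypothetical, and the choice of the four points around x = 5 from the displacement and velocity table is an assumption made purely for illustration.

```python
def lagrange_interpolate(xs, ys, x):
    """Evaluate the Lagrange interpolating polynomial through (xs, ys) at x."""
    total = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        # Basis polynomial L_i(x): equals 1 at xi and 0 at every other data point.
        basis = 1.0
        for j, xj in enumerate(xs):
            if j != i:
                basis *= (x - xj) / (xi - xj)
        total += yi * basis
    return total


# Four data points surrounding x = 5 give a 3rd order polynomial.
xs = [2, 4, 6, 8]
ys = [1.3, 3.7, 8.1, 13.9]
estimate = lagrange_interpolate(xs, ys, 5)
print(estimate)
```

The nested loop makes the arithmetic cost visible: every evaluation performs work proportional to the square of the number of data points, and nothing can be reused if a data point is added or changed.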
Least Squares Regression
Linear Regression
This type of curve fitting approximates the data with a straight line, y(x) = a + bx. The line is chosen to minimise the sum of the squared vertical distances between the data points and the line. Regression differs from interpolation in that the fitted line does not have to pass through every data point, whereas an interpolating curve must pass through each one.
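Linear regression requires finding the coefficients, a and b, of the approximating line. For n data points, this requires the following matrix equation to be solved:
$$\begin{bmatrix} n & \sum x \\ \sum x & \sum x^2 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} \sum y \\ \sum xy \end{bmatrix}$$
First find the sums of x, x^2, y and xy over the given data points. Assuming the data points are as follows: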
x | 2 | 4 | 6 | 8 | 10 |
---|---|---|---|---|---|
y(x) | 4000 | 8000 | 11000 | 13000 | 14000 |
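Therefore, with n = 5, we get:
$$\sum x = 30, \qquad \sum x^2 = 220, \qquad \sum y = 50000, \qquad \sum xy = 350000$$
These can be substituted into the matrix equation and then solved using Gaussian elimination and back substitution:
$$\begin{bmatrix} 5 & 30 \\ 30 & 220 \end{bmatrix} \begin{bmatrix} a \\ b \end{bmatrix} = \begin{bmatrix} 50000 \\ 350000 \end{bmatrix} \quad\Rightarrow\quad a = 2500, \quad b = 1250$$
The fitted line is therefore y(x) = 2500 + 1250x.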
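The same calculation can be sketched in Python. This is a minimal illustration under the assumption that the 2 by 2 system is small enough to eliminate by hand in code; the variable names are illustrative and not part of the original notes.

```python
# Data points from the table above.
xs = [2, 4, 6, 8, 10]
ys = [4000, 8000, 11000, 13000, 14000]

# Sums needed for the matrix equation.
n = len(xs)
sum_x = sum(xs)
sum_x2 = sum(x * x for x in xs)
sum_y = sum(ys)
sum_xy = sum(x * y for x, y in zip(xs, ys))

# System to solve for the line y = a + b*x:
#   n*a     + sum_x*b  = sum_y
#   sum_x*a + sum_x2*b = sum_xy
# Gaussian elimination: subtract (sum_x / n) times row 1 from row 2.
factor = sum_x / n
pivot2 = sum_x2 - factor * sum_x
rhs2 = sum_xy - factor * sum_y

# Back substitution: solve row 2 for b, then row 1 for a.
b = rhs2 / pivot2
a = (sum_y - sum_x * b) / n

print(a, b)
```

Comparing the fitted values a + b*x with the tabulated y values shows how far the straight line deviates from the data at each point.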
Note that this approximation will give large errors if the data is not close to linear; this is expected and should not be alarming. This type of question is also the most likely to appear in tests.