# Introduction to Statistics

Statistics is the science that deals with the collection, classification, analysis, and interpretation of numerical facts or data. It allows us to impose order on data and to find the results needed to solve important problems.[1] Statistics allows scientists and engineers to predict possible outcomes from data whose points exhibit randomness, uncertainty, and variation.

## Process of Statistical Analysis

The following steps help ensure rigour during statistical analysis:

1. Set clear goals for the data investigation, e.g. what is the question being asked?
2. Understand what data is required and collect that data.
3. Present the data in appropriate forms, e.g. graphs, and check for unusual data, e.g. outliers or errors in data collection.
4. Apply the appropriate statistical analysis to the data and extract the information that results from it.
5. Draw conclusions from the analysis and communicate them.

A large data set is needed to draw conclusions with reasonable certainty: the fewer the data points, the more the results depend on chance. Large variation in the data also makes it harder to distinguish clearly between results.
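The effect of data set size can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: the data are simulated from a normal distribution with an arbitrarily chosen mean of 50 and standard deviation of 10, and the helper name `spread_of_sample_means` is hypothetical. It repeatedly draws samples of a given size and measures how much the sample means vary from one sample to the next.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the sketch is repeatable


def spread_of_sample_means(sample_size, repeats=200):
    """Standard deviation of the means of `repeats` simulated samples."""
    means = []
    for _ in range(repeats):
        # Assumed data source: normal distribution, mean 50, std. dev. 10.
        sample = [rng.gauss(50, 10) for _ in range(sample_size)]
        means.append(statistics.mean(sample))
    return statistics.stdev(means)


small = spread_of_sample_means(10)
large = spread_of_sample_means(1000)

print(f"spread of means, samples of 10:   {small:.2f}")
print(f"spread of means, samples of 1000: {large:.2f}")
```

The means of the larger samples should vary far less than those of the smaller samples, which is why conclusions drawn from a large data set are more certain.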

## Sampling

Sampling involves taking a subset of data points from a 'population' for analysis.

• A population is the complete group of elements from which data could be collected for analysis.
• Sampling is undertaken because it is easier to take a sample than to analyse every individual element, which may be expensive or even impossible.

When sampling, it is important to get data which is representative of the population.

• To do this, random sampling is used, meaning the data points are chosen by chance, with no prior consideration given to which ones are selected.
• This randomness can be taken into account during analysis, allowing the data to be generalised for the entire population.
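As a minimal sketch of random sampling, the example below draws a simple random sample from a simulated population and compares the sample mean with the population mean. The population (10,000 heights in centimetres, generated from a normal distribution) and the sample size of 100 are assumptions made purely for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is repeatable

# Hypothetical population: heights (cm) of 10,000 individuals, simulated.
population = [rng.gauss(170, 10) for _ in range(10_000)]

# Simple random sample: every element has the same chance of being chosen,
# and no element is chosen more than once (sampling without replacement).
sample = rng.sample(population, k=100)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.1f}")
print(f"sample mean:     {sample_mean:.1f}")
```

In practice the population mean is usually unknown; it is computed here only to show that a random sample of 100 elements gives an estimate close to it.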

## Variables

Variables are the characteristics of each individual element in a data set. There are two types of variables, each of which has two subtypes:

• Categorical / Qualitative: Variables that do not have a numerical value, but are categories. E.g. eye colour, brand, degree
    • Ordinal: There is a clear order in the categories. E.g. hair colour (light, medium, dark), pain threshold (low, medium, high)
    • Nominal: There is no intrinsic order in the categories. E.g. gender
• Numerical / Quantitative: Variables which are measured numerically. E.g. height, weight, rainfall
    • Discrete: The variable takes only separate, countable values, typically integers. E.g. number of pets, number of courses enrolled
    • Continuous: The variable can take any value within a range. E.g. height, rainfall, temperature
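The four kinds of variable can be sketched as fields of a single record. The class and field names below (`Person`, `EyeColour`, `PainThreshold`, and so on) are hypothetical and chosen only to mirror the examples in the list above; this is one possible way of modelling the distinction, not a standard one.

```python
from dataclasses import dataclass
from enum import Enum, IntEnum


class EyeColour(Enum):  # nominal: categories with no intrinsic order
    BLUE = "blue"
    BROWN = "brown"
    GREEN = "green"


class PainThreshold(IntEnum):  # ordinal: categories with a clear order
    LOW = 1
    MEDIUM = 2
    HIGH = 3


@dataclass
class Person:
    eye_colour: EyeColour          # categorical, nominal
    pain_threshold: PainThreshold  # categorical, ordinal
    number_of_pets: int            # numerical, discrete
    height_cm: float               # numerical, continuous


a = Person(EyeColour.BLUE, PainThreshold.LOW, 2, 171.3)
b = Person(EyeColour.BROWN, PainThreshold.HIGH, 0, 164.8)

# Ordinal values can be ranked; nominal values can only be tested for equality.
print(a.pain_threshold < b.pain_threshold)
print(a.eye_colour == b.eye_colour)
```

Using `IntEnum` for the ordinal variable allows comparisons such as `<`, whereas a plain `Enum` deliberately does not support ordering, which matches the nominal case.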

## Dot Plots

It is important to use graphical representations to understand the characteristics of data.

• Dot plots show numerical data and are best suited to small data sets.
• A dot plot consists of data points stacked vertically above a horizontal scale.

The data is considered positively (right) skewed if there is a large clump of dots with a trail of dots extending to the right.

• Outlier: A data point which is extreme in comparison to the rest of the data set.
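A dot plot can be sketched in plain text. The function below is a minimal illustration for integer data only, and both the name `dot_plot` and the data set (a hypothetical count of pets per household) are assumptions made for this example. Each data point becomes one `o` stacked above its value on the horizontal scale.

```python
from collections import Counter


def dot_plot(values):
    """Return a text dot plot: one column per integer, one 'o' per data point."""
    counts = Counter(values)
    scale = range(min(values), max(values) + 1)
    width = max(len(str(v)) for v in scale) + 1  # column width incl. spacing
    rows = []
    # Build the plot from the tallest stack down to the first row of dots.
    for level in range(max(counts.values()), 0, -1):
        row = "".join(
            ("o" if counts[v] >= level else " ").rjust(width) for v in scale
        )
        rows.append(row.rstrip())
    # Horizontal scale underneath the dots.
    rows.append("".join(str(v).rjust(width) for v in scale))
    return "\n".join(rows)


# Hypothetical data: number of pets in 16 households.
pets = [0, 0, 0, 0, 1, 1, 1, 1, 1, 2, 2, 2, 3, 3, 4, 9]
plot = dot_plot(pets)
print(plot)
```

With this data the dots clump at the low values and trail off to the right, so the plot is positively skewed, and the single household with 9 pets stands apart from the rest as an outlier.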

## References

1. Statistics. Dictionary.com. Collins English Dictionary - Complete & Unabridged 10th Edition. HarperCollins Publishers. http://dictionary.reference.com/browse/statistics (accessed: July 16, 2012).