Statistics - Foundation - Basic Terminology


This article is part of the Statistics 101 series. You can access the full series here:

Welcome to the first article of the Statistics 101 series. After reading this article, you will learn:

This is the start of the 101 series on Statistics, a very important field in Data Science. Statistics is a very broad field and, as you may know, it has proven so useful and so widely applicable that it has grown out of Mathematics into a discipline of its own.

Learning Statistics is not easy. Not at all! But once you grasp its ideas, your perception of the world and of every problem will sharpen, allowing you to draw more concrete conclusions. So prepare yourself for that.

There will be times when you get lost or bored as you build up your knowledge. But if you truly understand why you should learn it and what it can bring you, I can guarantee one thing: you will fall in love with Statistics.

1. What is statistics?

First, let’s take the image below as an example.

Looking at that picture, can you answer the following questions?

These questions lie in the field of descriptive statistics.

Descriptive statistics provides us with tools (tables, graphs, averages, ranges, correlations) for organizing and summarizing the inevitable variability in collections of actual observations or scores.
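To make this concrete, here is a minimal sketch of a few descriptive summaries in Python. The GPA values and the `describe` helper are made up for illustration; only the standard-library `statistics` module is used.

```python
import statistics

# Hypothetical GPA scores for a small group of students (made-up data).
gpas = [3.6, 3.9, 3.7, 4.0, 3.6, 3.8]

def describe(scores):
    """Summarize a collection of observations with a few descriptive statistics."""
    return {
        "count": len(scores),
        "mean": statistics.mean(scores),       # the average
        "median": statistics.median(scores),   # the middle value
        "range": max(scores) - min(scores),    # spread between extremes
    }

summary = describe(gpas)
print(summary)
```

Each number in `summary` describes only the six observations we actually have. It makes no claim about any other students.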

There is also another field of statistics, known as inferential statistics, which helps generalize insights extracted from a set of actual observations.

Inferential statistics provides tools (a variety of tests and estimates) for generalizing beyond collections of actual observations.

Notice the terms actual observations and generalize. They will be discussed further in section 3.

To take an example of inferential statistics from the same picture, assume that these students come from the same university and all have a GPA of 3.6 or higher. A question that would lie in the field of inferential statistics is: does every student in that university achieve a GPA of at least 3.6?
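One way to picture this kind of generalization is the sketch below. It estimates the share of students with a GPA of at least 3.6 from a random sample, together with an approximate 95% interval. Everything here is an assumption made for illustration: the simulated population, the sample size of 100, the `proportion_ci` helper, and the normal-approximation formula for the interval. It shows the idea only and is not the method of any particular textbook.

```python
import math
import random

rng = random.Random(0)  # fixed seed so the sketch is reproducible

# Hypothetical population: GPAs of 5,000 students, clipped to the 0.0-4.0 scale.
population = [min(4.0, max(0.0, rng.gauss(3.2, 0.5))) for _ in range(5000)]

# The actual observations: a random sample of 100 students.
sample = rng.sample(population, 100)

def proportion_ci(observations, threshold, z=1.96):
    """Estimate the share of observations >= threshold, with an
    approximate 95% interval based on the normal approximation."""
    n = len(observations)
    p_hat = sum(x >= threshold for x in observations) / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat, max(0.0, p_hat - margin), min(1.0, p_hat + margin)

p_hat, low, high = proportion_ci(sample, 3.6)
print(f"Estimated share with GPA >= 3.6: {p_hat:.2f} "
      f"(approx. 95% interval: {low:.2f} to {high:.2f})")
```

The interval is a statement about the whole (simulated) university, even though only 100 students were observed. That step from sample to population is what makes it inferential rather than descriptive.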

2. Different types of data and their level of measurement

Any statistical analysis is performed on data, a collection of actual observations or scores in a survey or an experiment. In statistics, there are 3 types of data and 3 corresponding levels of measurement, which are:
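As a rough sketch, assume the classification used in the Witte and Witte reference: qualitative, ranked, and quantitative data, measured at the nominal, ordinal, and interval/ratio levels respectively. The snippet below shows which summaries of central tendency are meaningful at each level. The `Level` enum, the `summarize` helper, and all sample values are hypothetical.

```python
from enum import Enum
import statistics

class Level(Enum):
    NOMINAL = "nominal"                # qualitative data: unordered categories
    ORDINAL = "ordinal"                # ranked data: order matters, distances do not
    INTERVAL_RATIO = "interval/ratio"  # quantitative data: numeric amounts

def summarize(values, level):
    """Return only the summaries that make sense at the given level."""
    result = {"mode": statistics.mode(values)}  # meaningful at every level
    if level in (Level.ORDINAL, Level.INTERVAL_RATIO):
        # median_low always returns an actual data value, so it works for ranks
        result["median"] = statistics.median_low(values)
    if level is Level.INTERVAL_RATIO:
        result["mean"] = statistics.mean(values)  # requires numeric distances
    return result

majors = ["math", "physics", "math", "biology"]   # qualitative
ratings = [1, 2, 2, 3, 5]                         # ranked (1 = low, 5 = high)
weights = [60.0, 72.5, 72.5, 80.0]                # quantitative, in kg

print(summarize(majors, Level.NOMINAL))
print(summarize(ratings, Level.ORDINAL))
print(summarize(weights, Level.INTERVAL_RATIO))
```

The design point is that the level of measurement limits what you may compute: averaging college majors is meaningless, while averaging weights is fine.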

3. Sample vs Population

The terms generalize and collections of actual observations pretty much sum up the definitions of sample and population in statistics.

In statistics:

  • A population = complete collection of observations or potential observations
  • A sample = smaller collection of actual observations drawn from a population.

Whether a collection of observations is a population or a sample needs to be assessed on a case-by-case basis.

For example, the reported weights of 53 male students in a class can be seen as a population if you are concerned about exceeding the load-bearing capacity of an excursion boat (chartered by the 53 students to celebrate successfully completing their stat class!). The same weights can be seen as a sample from a population if you wish to generalize to the weights of all male statistics students or all male college students.

One important feature of a good sample is that it must represent the population; otherwise, any generalization might be erroneous.
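The sketch below illustrates why representativeness matters. It compares a randomly drawn sample with a deliberately skewed one. The simulated population of weights, the sample size of 53, and the choice of "the 53 heaviest students" as the biased sample are all assumptions made for illustration.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population: weights (kg) of 1,000 male college students.
population = [rng.gauss(75, 12) for _ in range(1000)]
population_mean = statistics.mean(population)

# A representative sample: 53 students drawn at random.
random_sample = rng.sample(population, 53)

# A non-representative sample: the 53 heaviest students.
biased_sample = sorted(population)[-53:]

# How far each sample mean lands from the true population mean.
random_error = abs(statistics.mean(random_sample) - population_mean)
biased_error = abs(statistics.mean(biased_sample) - population_mean)

print(f"Population mean:     {population_mean:.1f} kg")
print(f"Random sample error: {random_error:.1f} kg")
print(f"Biased sample error: {biased_error:.1f} kg")
```

Both samples have the same size, yet generalizing from the biased one would badly overestimate the typical weight in the population.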


How to obtain a sample that represents the population will be discussed in later articles.

4. Variable and different types of variable

A variable is any characteristic, number, or quantity that can be measured or counted. Age and gender are examples of variables. It is called a variable because its value may vary between data units in a population and may change over time.

There are 3 types of variables:

As an exercise, try to determine the type of each variable in the following situation: weight loss among obese males who choose to participate either in a weight-loss program or in a self-esteem enhancement program.

5. Last words

Now that we have gone through the basic terminology of statistics, you will be more comfortable on the road ahead towards in-depth statistics.

In the next article, we will talk about the different types of studies and how to conduct a statistical study efficiently in real life. Stay tuned!

REFERENCE: Robert S. Witte, John S. Witte - Statistics-Wiley (2016)