Data Interpretation

Data interpretation refers to the process of subjecting data to predefined processes, such as organization into tables, charts, or graphs, so that logical and statistical conclusions can be derived. This part of statistics answers a common question among researchers: what exactly are we supposed to present?



It is not ideal for researchers to present raw numerical values collected from instruments or surveys. Data need to be organized to tell the story of what you want to emphasize in your research. This should focus on the problem you want to solve, also known as the 'statement of the problem', which is the primary focus of the research.

Statistical tools are used in this process, transforming data into useful information from which you can draw important conclusions. This process is called data analysis. Only after this process can data be fully interpreted.

Statistical methods allow you to work on your data. Imagine you have the exam scores for 100 students, and you want to interpret this data. Scanning through the scores by eye alone might be quite tough! Here are two methods that would simplify this task.

Measures of central tendency describe key characteristics of the whole data set with a single value that is typical of the set. For example, the mode gives you the value that occurs most often.

The **mean** is the most commonly reported measure of central tendency: the mathematical average. To calculate the mean, add up all the available values and divide by the number of values. The mean is represented by $\mu$, and its formula is $\mu =\frac{\Sigma x}{n}$, where $n$ is the number of data items in the sample and $\Sigma x$ is the sum of all data values.

The **median** is the mid-point value in your data set. When the data set has an even number of values, the median is the average of the two middle values of the ordered data.

The **mode** is the value that occurs most often.
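As a minimal sketch, all three measures of central tendency can be computed with Python's standard `statistics` module (the exam-score list here is a made-up example, not data from the text):

```python
from statistics import mean, median, mode

# Hypothetical exam scores for illustration
scores = [72, 85, 85, 90, 78, 85, 66, 90]

print(mean(scores))    # arithmetic average: sum of values / count -> 81.375
print(median(scores))  # middle of the sorted data; average of the two middle values here
print(mode(scores))    # most frequent value -> 85
```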

Another statistical measure that is commonly used is variability, also known as spread. The range is the simplest measure of variability. Taking the exam score dataset again, the range is the difference between the highest and lowest numerical values.

Another common measure is variance, which is the average squared deviation from the mean. This number indicates how much the individual values deviate from the mean. What you will see reported more often is the standard deviation, which is the square root of the variance. The standard deviation expresses how much individual scores differ from the mean value for the group. Mathematically, it is given by:

$s=\sqrt{\frac{\Sigma {({x}_{i}-\overline{x})}^{2}}{n-1}}$
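The formula above can be sketched directly in Python; the data values below are hypothetical, and the result is checked against the standard library's `statistics.stdev`, which uses the same $n-1$ denominator:

```python
from math import sqrt
from statistics import stdev

data = [4, 8, 6, 5, 3, 7]                   # hypothetical sample
n = len(data)
x_bar = sum(data) / n                       # sample mean
ss = sum((x - x_bar) ** 2 for x in data)    # sum of squared deviations from the mean
s = sqrt(ss / (n - 1))                      # sample standard deviation (n - 1 denominator)

assert abs(s - stdev(data)) < 1e-12         # matches the library result
print(s)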

Single variable data analysis involves examining one particular variable in a dataset. It is common in descriptive forms of analysis and uses histograms, frequency distributions, and box plots, among other methods. It is mostly used as a first step in investigating data. Let's take a look at a box plot.

A box plot displays a five-number summary of a dataset: the minimum, first quartile, median, third quartile, and maximum. Quartiles tell us about the spread of data by breaking the data set into quarters: the lower quartile $Q_1$ marks 25% of the data, the middle quartile (the median) marks 50%, and the upper quartile $Q_3$ marks 75%.

The ages of 10 students in grade 12 were collected and they are as follows:

15, 21, 19, 19, 17, 16, 17, 18, 19, 18.

Let's first arrange these in ascending order.

15, 16, 17, 17, 18, 18, 19, 19, 19, 21.

We can now find the median, which is the middle number. Since we have an even number of values, there are two middle numbers, and the median is their average. Here both middle values are the same, 18.

median = 18

We will find the quartiles now. The first quartile is the median of the values to the left of the overall median.

That will mean we are finding the median for 15, 16, 17, 17, 18.

This equals 17.

The third quartile is the median of the values to the right of the overall median.

18, 19, 19, 19, 21

Which will make that 19.

Now we will document the minimum number which is 15.

And also document the maximum which is 21.

The resulting box plot represents the ages of the students in grade 12: the box spans the quartiles 17 and 19, with a line at the median of 18 and whiskers reaching the minimum of 15 and the maximum of 21.
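The five-number summary computed above can be sketched in Python, following the example's convention of splitting an even-sized sorted data set into two equal halves and taking the median of each half as a quartile:

```python
from statistics import median

ages = [15, 21, 19, 19, 17, 16, 17, 18, 19, 18]
data = sorted(ages)
half = len(data) // 2

summary = {
    "min": data[0],
    "Q1": median(data[:half]),   # lower half: 15, 16, 17, 17, 18
    "median": median(data),
    "Q3": median(data[half:]),   # upper half: 18, 19, 19, 19, 21
    "max": data[-1],
}
print(summary)  # {'min': 15, 'Q1': 17, 'median': 18.0, 'Q3': 19, 'max': 21}
```

Note that other quartile conventions exist (statistical software offers several), so library functions may give slightly different quartiles for the same data.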

We will take another example with an odd number of data points.

The table below shows a basketball player's points scored per game over a seven-game span. Visualise this data on a box and whisker plot.

| Game | Points |
| --- | --- |
| 1 | 10 |
| 2 | 17 |
| 3 | 5 |
| 4 | 32 |
| 5 | 16 |
| 6 | 18 |
| 7 | 20 |

Step 1.

Rearrange the values in the data set from lowest to highest.

5, 10, 16, 17, 18, 20, 32.

Step 2.

Now identify the highest and lowest values in the data set

Highest value: 32

Lowest value: 5

Step 3.

We can now identify the midpoint value (median) of the data set.

Median = 17

Step 4.

We will now find the upper and lower quartiles.

The lower quartile is the median for the first half of the data set.

That will mean that we are finding the median for 5, 10, 16

Lower quartile = 10

The upper quartile is the median for the second half of the data set.

That will also mean that we are finding the median for 18, 20, 32

Upper quartile = 20

Step 5.

Now that we have all our necessary values, we will construct our box and whisker plot.

Highest value = 32

Lowest value = 5

Median = 17

Upper quartile = 20

Lower quartile = 10

We will first draw a number line that fits the data, and mark all the necessary values we found.

Construct a rectangle whose vertical sides pass through the lower and upper quartiles, enclosing the median of the entire data set. Then draw a vertical line through the median that touches both ends of the rectangle, and extend the whiskers from the box out to the lowest and highest values.

There, we have our box and whisker plot for the basketball games.
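The steps above can be sketched in Python as well. For an odd-sized data set, the example excludes the overall median when forming the two halves:

```python
from statistics import median

points = [10, 17, 5, 32, 16, 18, 20]  # points per game
data = sorted(points)                 # 5, 10, 16, 17, 18, 20, 32
mid = len(data) // 2

lower_q = median(data[:mid])          # median of 5, 10, 16 -> 10
upper_q = median(data[mid + 1:])      # median of 18, 20, 32 -> 20

# lowest, Q1, median, Q3, highest
print(data[0], lower_q, data[mid], upper_q, data[-1])  # 5 10 17 20 32
```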

In contrast to single variable data, bivariate data consist of two variables for each individual. For example, in large studies in the health sector, it is common to collect variables such as height, age, and blood pressure for each individual. Let's look at an example in a two-way frequency table.

These are the numbers of males and females who received each grade on a math project in school.

| Grade | Male | Female | Total |
| --- | --- | --- | --- |
| A | 9 | 12 | 21 |
| B | 18 | 14 | 32 |
| C | 8 | 11 | 19 |
| D | 2 | 3 | 5 |
| E | 1 | 2 | 3 |
| Total | 38 | 42 | 80 |

We can see there are 9 males and 12 females who got an A, 18 males and 14 females who got a B, and so on.

Now we can answer a couple of questions.

How many students in total had an A?

Answer: 21 students.

How many males were surveyed?

Answer: 38 males.

How many males earned a grade of A?

Answer: 9.
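As a sketch, the two-way table can be stored as a nested dictionary, with the totals computed rather than stored, so that questions like the ones above become one-line lookups:

```python
# Two-way frequency table: grade -> counts by sex
grades = {
    "A": {"Male": 9,  "Female": 12},
    "B": {"Male": 18, "Female": 14},
    "C": {"Male": 8,  "Female": 11},
    "D": {"Male": 2,  "Female": 3},
    "E": {"Male": 1,  "Female": 2},
}

total_a = sum(grades["A"].values())                       # students with an A
total_male = sum(row["Male"] for row in grades.values())  # all males surveyed
males_with_a = grades["A"]["Male"]                        # males who earned an A

print(total_a, total_male, males_with_a)  # 21 38 9
```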

A scatter graph can represent two such variables, for example the sales of ice cream in a given shop against the temperature of the day, showing how much ice cream is purchased at each given temperature.

Probability is the measure of how likely an event is to happen. Probabilities can be placed on a number line between 0 and 1.

If the probability of an event is 0, the event is impossible; if it is 1, the event is certain. Values in between indicate varying degrees of likelihood, with 0.5 meaning there is an even chance of the event happening.

Probabilities are written down using the following notation:

P(A): the probability of event A happening.

P(A'): the probability of event A not happening.

Since event A must either happen or not happen, the probability of event A not happening is P(A') = 1 - P(A).

For example, if P(A) = 0.8, then P(A') = 0.2. The two probabilities always add up to 1.
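A tiny sketch of the complement rule, using exact fractions to avoid floating-point rounding noise:

```python
from fractions import Fraction

p_a = Fraction(4, 5)       # P(A) = 0.8
p_not_a = 1 - p_a          # complement rule: P(A') = 1 - P(A)

assert p_a + p_not_a == 1  # the two probabilities sum to 1
print(float(p_not_a))      # 0.2
```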

These are the basic concepts you will use throughout probability at this level. You will also be reintroduced to Venn diagrams, tree diagrams, and more.

- Data interpretation refers to the process of subjecting collected data to predefined processes so logical and statistical conclusions can be derived.
- Presentation refers to the representation of data in graphs, plots, frequency tables, etc.
- The measure of central tendency produces a single value that is typical of the whole set. The basic values are mean, mode and median.
- Single variable data involves examining one particular variable relevant in a dataset.
- In contrast to single variable data, bivariate data consist of two variables for each individual.
- Probability is the measure of how likely an event is to happen.

You carry out analysis by selecting each component of the data and seeing if there are any patterns.

Data interpretation involves explaining what these findings mean with reference to the statement of the problem.

It's necessary to organise and group ideas in a logical way.

