Data Analysis

You have been asked to prepare an assessment report for the classes in your grade, based on the scores and grades the students received in their subjects. Your principal has given you one week to complete it. You are unsure where to start and how to proceed. How will you pull this off and submit the report on time?


You can use **data analysis** to make the report. Data analysis is a way to collect and analyze data in order to interpret results from it. In this section, you will learn the concept of **data analysis in statistics** and how to apply it.

Whenever you make a decision in day-to-day life, whether by reflecting on past outcomes or by predicting what a particular choice will lead to, you are in fact analyzing information. For example, before an exam you recall which study techniques and time management worked for you last time. In doing so, you scrutinize past events to decide how to reach your goal for the next exam, so you are analyzing data. Business analysts, scientists, and researchers do the same thing to understand a phenomenon, and this process is called data analysis.

When you work with statistics and statistical methods, you need information, or data, to interpret your results. This data should be appropriate to the problem at hand, and data analysis helps you ensure that it is.

**Data analysis** is the process of collecting, transforming, processing, and analyzing raw data to extract useful information for making decisions.

The main aim of data analysis is to organize and summarize the data so that sound decisions can be made.

You might want to know why analyzing your data is worth the effort. Some of the benefits of data analysis are listed below.

- It keeps you informed of the latest trends in the area you are studying and supports sound decisions.
- It helps you identify and understand problems and errors so that you can rectify them.
- It helps you improve the efficiency of methods and processes.
- It is useful in market research for developing effective strategies.

Data analysis consists of different methods and techniques which can be applied to various types of data. Generally, data can be categorized into two types: qualitative data and quantitative data.

Data or variables used in a study can be **qualitative**; these are also known as categorical variables. Qualitative data describes, explains, and characterizes information in the form of words.

Collected data or variables that fall into categories and describe qualities rather than quantities are called **qualitative data.**

Such data is **non-numerical**: it uses words, or numbers that stand in for a concept (for example, satisfaction levels). The data can be one-variable (univariate), two-variable (bivariate), or multi-variable (multivariate). Researchers usually draw qualitative data from firsthand observations, documents, archival materials, or interviews.

Qualitative data is quite flexible and can generate new ideas, but it can be unreliable and subjective, and it is labor-intensive to collect and analyze. You can summarize and represent qualitative data with frequency distributions and bar graphs.

An example of a qualitative (categorical) variable:

Suppose you went to the movie theater with a group of friends. After the movie, you ask each of them whether they liked it. Some say they liked it, and some say they didn't.

So, your data falls into two categories: "liked" and "didn't like".
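A frequency distribution for the movie example can be sketched in a few lines of Python. The list of responses is invented for illustration, since the example gives no actual counts, and `frequency_table` is a hypothetical helper name rather than a library function.

```python
from collections import Counter

# Hypothetical responses from eight friends (made-up data for illustration).
responses = ["liked", "didn't like", "liked", "liked",
             "didn't like", "liked", "liked", "didn't like"]

def frequency_table(values):
    """Return {category: (count, relative frequency)} for categorical data."""
    counts = Counter(values)
    total = len(values)
    return {category: (n, n / total) for category, n in counts.items()}

table = frequency_table(responses)

# Print the table, with a row of '#' marks standing in for a bar graph.
for category, (count, share) in table.items():
    print(f"{category:<12} {count:>2}  {share:.3f}  {'#' * count}")
```

Only counts and proportions are computed here, because categories such as "liked" have no numeric value to average.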

More information can be found on this data type and the techniques used with it in the article Categorical Variables.

As the name suggests, **quantitative variables** or data are expressed in terms of quantities or numbers. Working with them involves numbers, percentages, calculations, and measurements in numerical form.

Data whose observations are numbers that can be counted or measured is known as **quantitative data**.

Because the data is numeric, you can perform **mathematical calculations** and statistical tests on it. Quantitative data can be summarized with dot plots, box plots, histograms, pie charts, and stem-and-leaf graphs. Just like qualitative data, quantitative data can be one-variable, two-variable, or multi-variable.

The height and weight of students, score points in a football match, and temperature are some examples of quantitative data.

More information on this kind of data and the techniques used on it can be found in the article Quantitative Variables.

Now that you know the different types of variables that can be collected, you should know how to **organize** and **summarize** them properly in order to draw a conclusion. This is done with two widely used data analysis methods:

- **Descriptive statistics**
- **Inferential statistics**

**Descriptive statistics** is the branch of statistics that organizes and summarizes data. It tells you what has happened and provides you with summarized statistical data. In other words, descriptive statistics describes the sample by providing summaries such as the mean, median, and mode.

Descriptive statistics does not include theories or conclusions; it only presents the available sample data. Common descriptive statistics include the mean, median, mode, distribution, standard deviation, and variance.

Suppose you want to find out which activities are popular among kids. You survey the kids in your neighborhood and ask them how many times they did each of the following activities:

- Dance
- Football
- Video games

From the collected data, you can build a frequency table and calculate the mean, median, or mode, as required.

You can apply these methods to one variable at a time, or use them to compare several variables.
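As a minimal sketch of these summaries, the snippet below computes the mean, median, mode, and standard deviation for the activity survey using Python's standard `statistics` module. The survey numbers are made up for illustration, and the assumption that each child reports a weekly count is ours, not part of the example.

```python
import statistics

# Hypothetical survey: times per week each of six children did each activity.
survey = {
    "Dance":       [1, 0, 2, 1, 3, 1],
    "Football":    [3, 4, 2, 4, 5, 4],
    "Video games": [6, 7, 5, 7, 7, 4],
}

# Summarize one variable (activity) at a time.
summary = {}
for activity, counts in survey.items():
    summary[activity] = {
        "mean": statistics.mean(counts),
        "median": statistics.median(counts),
        "mode": statistics.mode(counts),
        "stdev": statistics.stdev(counts),  # sample standard deviation
    }

# Print the summaries side by side so the activities can be compared.
for activity, stats in summary.items():
    print(f"{activity:<12} mean={stats['mean']:.2f} median={stats['median']} "
          f"mode={stats['mode']} stdev={stats['stdev']:.2f}")
```

Printing the rows together is one simple way to compare several variables using the same descriptive measures.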

Now that you have summarized your data, the next step is to test your claim and **get results**, which can be done with **inferential statistics**. Inferential statistics helps you make predictions and draw conclusions from your data.

Inferential statistics helps you understand a large population by taking a sample from it and testing that sample. It uses sample data to test a hypothesis and reaches a conclusion about the population based on the result. Inferences in statistics is a large category that includes methods like confidence intervals and hypothesis testing.

Suppose you randomly select the test scores of a group of students from your class. Using inferential statistics on the collected data, you can make estimates or test hypotheses about the whole class.

Note that random sampling is essential for inferential statistics to be valid.
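The test-score example can be sketched as follows: draw a random sample and estimate the class mean with a confidence interval. Everything here is an assumption made for illustration: the class scores are randomly generated, `mean_confidence_interval` is a hypothetical helper, and the interval uses the normal critical value. For a sample this small, a t-distribution would usually be preferred and would give a somewhat wider interval.

```python
import random
import statistics
from statistics import NormalDist

# Hypothetical test scores for a class of 40 students (generated, not real data).
rng = random.Random(0)  # fixed seed so the run is repeatable
class_scores = [rng.randint(40, 100) for _ in range(40)]

# Random sampling: every student has the same chance of being selected.
sample = rng.sample(class_scores, 15)

def mean_confidence_interval(data, confidence=0.95):
    """Approximate confidence interval for the mean (normal approximation)."""
    n = len(data)
    mean = statistics.mean(data)
    std_err = statistics.stdev(data) / n ** 0.5
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return mean - z * std_err, mean + z * std_err

low, high = mean_confidence_interval(sample)
print(f"Sample mean: {statistics.mean(sample):.1f}")
print(f"Approximate 95% interval for the class mean: ({low:.1f}, {high:.1f})")
print(f"Actual class mean: {statistics.mean(class_scores):.1f}")
```

Because the whole class is known in this toy setup, the last line lets you compare the estimate with the value it is trying to capture.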

One of the most useful data analysis methods is exploratory data analysis. **Exploratory data analysis** is a way to analyze data in **visual form**: you represent and analyze the data using different graphs. It is a form of descriptive statistics, and you need to perform descriptive analysis before moving on to exploratory analysis.

Exploratory data analysis can be performed at different stages of the data analysis process and uses techniques like bar graphs, box plots, histograms, and scatter plots. It can be divided into two parts based on the number of variables: univariate and multivariate.

If the data is univariate (one-variable data), you can analyze it with bar graphs, box plots, and histograms. If your data is multivariate, use scatter plots, which show two variables at a time.
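For univariate data, the numbers behind a histogram and a box plot can be computed with the standard library alone. The sketch below uses made-up exam scores; `histogram_bins` is a hypothetical helper, and the quartiles come from `statistics.quantiles` with its default method, which may differ slightly from other textbook conventions.

```python
import statistics

# Hypothetical exam scores (made-up values for illustration).
scores = [52, 55, 61, 64, 64, 68, 70, 71, 73, 75, 78, 80, 84, 88, 95]

def histogram_bins(data, start, width, n_bins):
    """Count the values in each bin [start + i*width, start + (i+1)*width)."""
    counts = [0] * n_bins
    for value in data:
        index = int((value - start) // width)
        if 0 <= index < n_bins:
            counts[index] += 1
    return counts

# Text histogram: one '#' per score in each ten-point bin.
bins = histogram_bins(scores, start=50, width=10, n_bins=5)
for i, count in enumerate(bins):
    lower = 50 + 10 * i
    print(f"{lower}-{lower + 9}: {'#' * count}")

# Five-number summary: the values a box plot is drawn from.
q1, q2, q3 = statistics.quantiles(scores, n=4)
print("min, Q1, median, Q3, max:", min(scores), q1, q2, q3, max(scores))
```

A plotting library would normally draw these graphs; the text output here only shows what such a graph summarizes.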

Exploratory data analysis is important and useful for the following reasons.

- A visual representation shows the characteristics of the data more clearly.
- It helps in spotting missing and incorrect data.
- It helps you understand the underlying structure of the data.
- It helps identify the important features in high-dimensional data.

Scientific studies are conducted to answer specific questions. For example: Is a new cancer treatment effective? Do science students need higher grades than law students for admission to college? Questions like these require data to be collected and analyzed. The steps of the data analysis process, from understanding the problem to drawing a conclusion, are listed below:

**1. Understand the problem**

For effective analysis and better results, it is important to have a clear understanding and direction of the problem.

**2. Decide what to find**

The next step is to know what information you need from the particular problem/question. Carefully define your variables and decide on the appropriate methods.

**3. Collect data**

This is a crucial step in the analysis process. According to your needs, you should collect your data from the appropriate populations. It is important to keep in mind the purpose of the data collection.

**4. Summarize data**

After you have collected the data and information you need, summarize it numerically or graphically and choose the appropriate method to analyze it.

**5. Analyze the data**

Using inferential methods, formally analyze the data to reach a conclusion.

**6. Conclude and interpret results**

In this last step, state your conclusion and interpret it to answer your question.

You can see some examples of data analysis in this section.

Classify each of the following as ordinal, nominal, discrete, or continuous data, and give a reason for your answer.

1. Genres of movies like horror, comedy, etc.

2. Quantity of rain in a year.

3. Number of pages in the math textbook.

4. Grades - \(A^{+}\), \(A\), \(A^{-}\), \(B^{+}\), \(B\).

**Solution:**

1. Nominal - Genre is a quality with no particular order, so you can list the genres in any order you like.

2. Continuous - The quantity of rain is a number that is measured rather than counted, and it can take any value within a range.

3. Discrete - The number of pages in a book is a numeric value that can be counted.

4. Ordinal - The grades are labels rather than numbers, and they have a particular order that depends on performance.
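The practical difference between ordinal and nominal data can be illustrated with a short Python sketch. The grade order comes from the exercise above, while the student grades and genre lists are made up for illustration. Ordinal values can be sorted by rank, so a middle grade exists; nominal values have no order, so the most frequent category (the mode) is the usual summary.

```python
from collections import Counter

# Order of the grades from the exercise, best first.
GRADE_ORDER = ["A+", "A", "A-", "B+", "B"]
RANK = {grade: position for position, grade in enumerate(GRADE_ORDER)}

# Hypothetical grades for seven students (made-up data).
grades = ["B+", "A", "B", "A+", "A-", "A", "B+"]

# Ordinal data: sort by rank, then take the middle value as the median grade.
ranked = sorted(grades, key=RANK.__getitem__)
median_grade = ranked[len(ranked) // 2]

# Nominal data: no meaningful order, so report the most common category.
genres = ["horror", "comedy", "comedy", "drama", "comedy"]
most_common_genre = Counter(genres).most_common(1)[0][0]

print("Grades in order:", ranked)
print("Median grade:", median_grade)
print("Most common genre:", most_common_genre)
```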

The following example shows exploratory data analysis.

The number of graduate students in a city is recorded for the years \(2010\) to \(2021\). Summarize the given data using exploratory data analysis.

| Year | No. of graduate students |
|---|---|
| \(2010\) | \(600\) |
| \(2011\) | \(650\) |
| \(2012\) | \(550\) |
| \(2013\) | \(590\) |
| \(2014\) | \(678\) |
| \(2015\) | \(742\) |
| \(2016\) | \(798\) |
| \(2017\) | \(1005\) |
| \(2018\) | \(1123\) |
| \(2019\) | \(1160\) |
| \(2020\) | \(1300\) |
| \(2021\) | \(1368\) |

Table 1. Data of graduated students per year.

**Solution:**

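Exploratory data analysis is a visual method, so represent the given data in a graph. The data are bivariate (year and number of graduate students), so a scatter graph is appropriate.

Plot a scatter graph of the given data, with the year on the horizontal axis and the number of graduate students on the vertical axis.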
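As a rough sketch of this exploration in Python, the snippet below takes the values from Table 1, prints a text stand-in for the scatter graph, and computes the year-over-year change. The variable names and the scale of one column per 50 students are our own choices; a plotting library such as matplotlib would normally be used to draw the actual graph.

```python
# Data from Table 1: year -> number of graduate students.
graduates = {
    2010: 600, 2011: 650, 2012: 550, 2013: 590, 2014: 678, 2015: 742,
    2016: 798, 2017: 1005, 2018: 1123, 2019: 1160, 2020: 1300, 2021: 1368,
}

years = list(graduates)

# Text stand-in for the scatter graph: the '*' moves right as the count grows.
for year, count in graduates.items():
    print(f"{year} | {' ' * (count // 50)}*  {count}")

# Year-over-year change shows where the trend dips or jumps.
changes = {later: graduates[later] - graduates[earlier]
           for earlier, later in zip(years, years[1:])}
largest_rise = max(changes, key=changes.get)
falls = [year for year, delta in changes.items() if delta < 0]

print("Largest rise:", largest_rise, f"(+{changes[largest_rise]})")
print("Years with a fall:", falls)
```

Read together, the plot and the changes suggest an overall upward trend with a single dip early in the period, which is the kind of pattern a scatter graph makes visible at a glance.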

- Data analysis is a process to collect and analyze data to interpret results from it.
- Collected data or variables that fall into categories and describe qualities rather than quantities are called qualitative data.
- Data whose observations are numbers that can be counted or measured is known as quantitative data.
- Descriptive statistics is the branch of statistics that organizes and summarizes data.
- Inferential statistics help in making predictions and provide conclusions for your data.
- Exploratory data analysis is the way to analyze data in visual form.


Data mining is also known as ...

Knowledge discovery in databases (KDD)

Regression analysis examines the relationship between ...

Dependent and independent variables

The statistical evaluation method used to study the strength of the relationship between two continuous variables is ...

Correlation analysis

The age and height of grade 12 learners form a two-variable data set. True or false?

True

The product of a statistical analysis is ...

A conclusion

Data that can be counted or expressed numerically is called ...

Quantitative data
