Isom 2500 - Cheat Sheet
Author: Adnan • March 21, 2018
...
A graphical representation of the relative frequency distribution (the area bounded by the curve is 1), formed by letting the midpoint of each class represent the data in that class and then connecting the sequence of midpoints.
Ogive
A graphical representation of the cumulative relative frequency distribution, whose maximum value equals 1.
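As a rough illustration of the numbers behind a polygon and an ogive, the sketch below turns class frequencies into relative and cumulative relative frequencies. The class counts are made up for the example and do not come from the notes.

```python
from itertools import accumulate

# Hypothetical class frequencies for five classes (made-up numbers).
frequencies = [4, 10, 16, 6, 4]
total = sum(frequencies)

# Relative frequency: class count divided by the total count.
# These are the heights a relative frequency polygon would plot at the class midpoints.
relative = [f / total for f in frequencies]

# Cumulative relative frequency: running sum of the relative frequencies.
# These are the heights an ogive would plot; the last one should be 1.
cumulative = list(accumulate(relative))

for rel, cum in zip(relative, cumulative):
    print(f"relative={rel:.3f}  cumulative={cum:.3f}")
```

The relative frequencies sum to 1, which is why the ogive tops out at 1.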
Describing Time-Series Data (One variable)
A plot of a variable over time, constructed by plotting the value of the variable on the vertical axis and the time periods on the horizontal axis.
Describing Two Numerical Variables
Scatter diagram: shows the relationship between two variables, where x is the independent variable and y is the dependent variable.
Summary:
One categorical variable: Frequency Distribution, Bar Chart, Pie Chart
Two categorical variables: Contingency Table, Clustered Bar Chart
One numerical variable: Dot Plot, Stem-and-Leaf, Histogram, Polygon, Ogive
Two numerical variables: Scatter Diagram
Time-series data: Time-series graph
Measures of Central Location // Central tendency (Mean, Median and Mode)
Mean
Population Mean: $\mu = \frac{\sum_{i=1}^{N} x_i}{N}$
Sample Mean: $\bar{x} = \frac{\sum_{i=1}^{n} x_i}{n}$
Weighted Mean: $\bar{x}_w = \frac{\sum w_i x_i}{\sum w_i}$ (the sums run over N values for a population or n values for a sample)
We use the mean in most situations. However, if there are extreme values in the data, the mean is not the best choice; we use the median instead.
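A minimal sketch of the arithmetic mean and the weighted mean, assuming the usual definitions (sum of values over the count, and sum of weight times value over the sum of weights). The scores, the weights and the helper name weighted_mean are hypothetical.

```python
from statistics import mean

def weighted_mean(values, weights):
    """Sum of weight * value divided by the sum of the weights."""
    return sum(w * x for x, w in zip(values, weights)) / sum(weights)

scores = [70, 80, 90]        # hypothetical data
weights = [0.2, 0.3, 0.5]    # hypothetical weights

plain = mean(scores)                        # every score counts equally
weighted = weighted_mean(scores, weights)   # later scores count more

print(plain, round(weighted, 2))
```

With equal weights the weighted mean reduces to the ordinary mean.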
Median
The median is the middle value of the data when sorted in ascending order.
Position of the median: $\frac{n+1}{2}$, where n is the sample size. If $\frac{n+1}{2}$ is not an integer, the median is the midpoint of the two values on either side of that position.
We use the median when there are extreme values; it can be used for ratio, interval and ordinal scales.
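The position rule can be sketched as follows, assuming the median sits at position (n + 1) / 2 in the sorted data and that a fractional position means averaging the two neighbouring values. The salary figures and the helper name median_by_position are made up for illustration.

```python
from statistics import mean, median

def median_by_position(data):
    """Median via the (n + 1) / 2 position rule on sorted, non-empty data."""
    ordered = sorted(data)
    n = len(ordered)
    position = (n + 1) / 2                 # 1-based position of the median
    if position.is_integer():
        return ordered[int(position) - 1]
    lower = ordered[int(position) - 1]     # value just below the position
    upper = ordered[int(position)]         # value just above the position
    return (lower + upper) / 2

salaries = [30, 32, 35, 38, 40, 400]       # hypothetical data with one extreme value

print(median_by_position(salaries))        # agrees with statistics.median
print(round(mean(salaries), 2))            # pulled upward by the extreme value
```

The single extreme value drags the mean far above most of the data, while the median stays near the middle of the bulk of the values.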
Mode
The mode is the value that occurs the greatest number of times (highest frequency).
A data set can have no mode, one mode or more than one mode. The mode is usually used for qualitative data and nominal scales.
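A small sketch of finding the mode or modes. It assumes one common convention: when no value repeats, the data set is reported as having no mode. The helper name modes and the sample data are hypothetical.

```python
from collections import Counter

def modes(data):
    """Return every value tied for the highest frequency.

    Convention assumed here: if no value repeats, report no mode.
    """
    counts = Counter(data)
    if not counts:
        return []
    top = max(counts.values())
    if top == 1 and len(counts) > 1:
        return []
    return [value for value, count in counts.items() if count == top]

print(modes(["A", "B", "A", "C"]))   # one mode
print(modes([1, 1, 2, 2, 3]))        # two modes
print(modes([1, 2, 3]))              # no mode under the assumed convention
```

Because it only counts occurrences, this works for nominal (qualitative) data as well as numbers.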
Mean and Median (QUANTITATIVE)
Median is preferred to mean when:
There are a few extreme scores in the distribution // Some scores have undetermined values // The distribution is open-ended // Data are measured on an ordinal scale
Mode is the preferred measure when: (QUALITATIVE)
Data are measured in a nominal scale
Geometric mean is the preferred measure of central tendency when:
Data are measured on a logarithmic scale.
Geometric Mean
It is the most useful measure when the variable is a growth rate or a rate of change over periods of time.
$R_g = \sqrt[n]{(1+R_1)(1+R_2)\cdots(1+R_n)} - 1$
where $R_i$ is the rate of change at period $i$.
Step 1: Find each $R_i$ by $R_i = \frac{x_i - x_{i-1}}{x_{i-1}}$, with $x_i$ the value at period $i$
Step 2: Apply the formula
**Note that , , then [pic 67][pic 68][pic 69]
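The two steps can be sketched as below, assuming the usual growth-rate form of the geometric mean: the n-th root of the product of (1 + rate), minus 1. The function names and the sample values are hypothetical.

```python
import math

def rates_of_change(values):
    """Step 1: period-over-period rate of change, (current - previous) / previous."""
    return [(cur - prev) / prev for prev, cur in zip(values, values[1:])]

def geometric_mean_rate(rates):
    """Step 2: n-th root of the product of (1 + rate), minus 1."""
    growth = math.prod(1 + r for r in rates)
    return growth ** (1 / len(rates)) - 1

values = [100, 200, 100]                  # hypothetical: doubles, then halves
rates = rates_of_change(values)           # [1.0, -0.5]
arithmetic = sum(rates) / len(rates)      # 0.25, suggests growth that never happened
geometric = geometric_mean_rate(rates)    # 0.0, matches ending where it started

print(rates, arithmetic, geometric)
```

The value ends where it began, so the average growth per period is zero; the arithmetic mean of the rates overstates it, which is why the geometric mean is preferred for rates of change.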
Measures of Variability
Range: largest value – smallest value
Mean Absolute Deviation: $\frac{\sum |x_i - \mu|}{N}$ OR $\frac{\sum |x_i - \bar{x}|}{n}$
** If the absolute value is not taken, the deviations sum to zero: $\sum (x_i - \bar{x}) = 0$
Population Variance: $\sigma^2 = \frac{\sum_{i=1}^{N} (x_i - \mu)^2}{N}$
Sample Variance: $s^2 = \frac{\sum_{i=1}^{n} (x_i - \bar{x})^2}{n-1}$
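A minimal sketch of these measures on a small made-up data set, assuming the standard definitions: the population variance divides by N and the sample variance divides by n - 1.

```python
from statistics import mean, pvariance, variance

data = [2, 4, 4, 4, 5, 5, 7, 9]     # hypothetical data

data_range = max(data) - min(data)  # largest value minus smallest value

center = mean(data)
deviations = [x - center for x in data]

# Without absolute values the deviations cancel out to zero.
signed_sum = sum(deviations)

# Mean absolute deviation: average distance from the mean.
mad = sum(abs(d) for d in deviations) / len(data)

population_var = pvariance(data)    # divides by N
sample_var = variance(data)         # divides by n - 1

print(data_range, signed_sum, mad, population_var, round(sample_var, 3))
```

The sample variance is slightly larger than the population variance on the same data because of the smaller divisor.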
Empirical Rule (Bell-Shaped Curve)
~68% of the data lie in the interval from $\mu - \sigma$ to $\mu + \sigma$
~95% of the data lie in the interval from $\mu - 2\sigma$ to $\mu + 2\sigma$
~99.7% of the data lie in the interval from $\mu - 3\sigma$ to $\mu + 3\sigma$
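As a quick check of these percentages, the sketch below computes the share of a normal (bell-shaped) distribution within 1, 2 and 3 standard deviations of the mean. It assumes an exactly normal distribution; real data only approximate these figures.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0, sigma=1)

def proportion_within(k):
    """Share of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: proportion_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```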
Chebyshev’s Theorem (For any curve)
Given any set of observations (sample or population), the proportion of the observations that lie within k standard deviations of the mean is at least $1 - \frac{1}{k^2}$ for $k > 1$.
**Note that the Empirical rule provides approximate proportions, whereas Chebyshev’s Theorem provides lower bounds on the proportions contained in the intervals.
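To see the lower bound at work, the sketch below compares the Chebyshev bound 1 - 1/k^2 with the observed share of a deliberately skewed, made-up data set. The helper names are hypothetical, and the population standard deviation is assumed.

```python
from statistics import mean, pstdev

def chebyshev_lower_bound(k):
    """Minimum proportion within k standard deviations of the mean (k > 1)."""
    if k <= 1:
        raise ValueError("the bound is only informative for k > 1")
    return 1 - 1 / k**2

def observed_share_within(data, k):
    """Observed proportion of data within k population standard deviations."""
    mu = mean(data)
    sigma = pstdev(data)
    inside = [x for x in data if abs(x - mu) <= k * sigma]
    return len(inside) / len(data)

data = [1, 1, 1, 1, 1, 1, 1, 1, 1, 20]   # hypothetical, far from bell-shaped

for k in (1.5, 2, 3):
    bound = chebyshev_lower_bound(k)
    share = observed_share_within(data, k)
    print(f"k={k}: bound={bound:.3f}, observed={share:.3f}")
```

The observed share never falls below the bound, even though the data are nothing like a bell curve.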
Coefficient of Variation
CV = $\frac{\sigma}{\mu}$, where $\sigma$ is the standard deviation and $\mu$ is the mean
In finance, a low CV indicates lower risk and a high CV indicates higher risk.
** Note that the coefficient of variation is a relative measure of the amount of variation.
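A short sketch comparing two made-up return series by their coefficient of variation, assuming CV is the standard deviation divided by the mean. The population standard deviation is used here as an assumption, and the mean must be non-zero.

```python
from statistics import mean, pstdev

def coefficient_of_variation(data):
    """Standard deviation relative to the mean (mean must be non-zero)."""
    return pstdev(data) / mean(data)

stock_a = [10, 12, 11, 9, 8]     # hypothetical returns in percent
stock_b = [20, 5, 15, -2, 12]    # hypothetical returns in percent

cv_a = coefficient_of_variation(stock_a)
cv_b = coefficient_of_variation(stock_b)

print(round(cv_a, 3), round(cv_b, 3))
```

Both series have the same mean return, but the second varies far more relative to that mean, so under this measure it would be read as the riskier of the two.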
...