1
00:00:00,000 --> 00:00:00,040

2
00:00:00,040 --> 00:00:02,460
The following content is
provided under a Creative

3
00:00:02,460 --> 00:00:03,870
Commons license.

4
00:00:03,870 --> 00:00:06,910
Your support will help MIT
OpenCourseWare continue to

5
00:00:06,910 --> 00:00:10,560
offer high quality educational
resources for free.

6
00:00:10,560 --> 00:00:13,460
To make a donation, or view
additional materials from

7
00:00:13,460 --> 00:00:17,390
hundreds of MIT courses, visit
MIT OpenCourseWare at

8
00:00:17,390 --> 00:00:22,620
ocw.mit.edu.

9
00:00:22,620 --> 00:00:23,380
OK.

10
00:00:23,380 --> 00:00:24,630
So let us start.

11
00:00:24,630 --> 00:00:55,510

12
00:00:55,510 --> 00:00:55,890
All right.

13
00:00:55,890 --> 00:01:00,230
So today we're starting a
new unit in this class.

14
00:01:00,230 --> 00:01:03,390
We have covered, so far, the
basics of probability theory--

15
00:01:03,390 --> 00:01:06,890
the main concepts and tools, as
far as just probabilities

16
00:01:06,890 --> 00:01:08,150
are concerned.

17
00:01:08,150 --> 00:01:11,300
But if that was all that there
is in this subject, the

18
00:01:11,300 --> 00:01:13,060
subject would not
be rich enough.

19
00:01:13,060 --> 00:01:16,070
What makes probability theory
a lot more interesting and

20
00:01:16,070 --> 00:01:19,590
richer is that we can also talk
about random variables,

21
00:01:19,590 --> 00:01:25,230
which are ways of assigning
numerical results to the

22
00:01:25,230 --> 00:01:27,430
outcomes of an experiment.

23
00:01:27,430 --> 00:01:32,500
So we're going to define what
random variables are, and then

24
00:01:32,500 --> 00:01:35,560
we're going to describe them
using so-called probability

25
00:01:35,560 --> 00:01:36,410
mass functions.

26
00:01:36,410 --> 00:01:39,770
Basically some numerical values
are more likely to

27
00:01:39,770 --> 00:01:43,260
occur than other numerical
values, and we capture this by

28
00:01:43,260 --> 00:01:46,260
assigning probabilities
to them the usual way.

29
00:01:46,260 --> 00:01:49,870
And we represent these in
a compact way using the

30
00:01:49,870 --> 00:01:51,950
so-called probability
mass functions.

31
00:01:51,950 --> 00:01:55,340
We're going to see a couple of
examples of random variables,

32
00:01:55,340 --> 00:01:58,260
some of which we have already
seen but with different

33
00:01:58,260 --> 00:01:59,930
terminology.

34
00:01:59,930 --> 00:02:04,950
And so far, it's going to be
just a couple of definitions

35
00:02:04,950 --> 00:02:06,790
and calculations of
the type that you

36
00:02:06,790 --> 00:02:08,810
already know how to do.

37
00:02:08,810 --> 00:02:11,370
But then we're going to
introduce the one new, big

38
00:02:11,370 --> 00:02:12,980
concept of the day.

39
00:02:12,980 --> 00:02:17,040
So up to here it's going to be
mostly an exercise in notation

40
00:02:17,040 --> 00:02:18,190
and definitions.

41
00:02:18,190 --> 00:02:20,850
But then we got our big concept
which is the concept

42
00:02:20,850 --> 00:02:24,190
of the expected value of a
random variable, which is some

43
00:02:24,190 --> 00:02:27,290
kind of average value of
the random variable.

44
00:02:27,290 --> 00:02:30,260
And then we're going to also
talk, very briefly, about

45
00:02:30,260 --> 00:02:33,560
a close relative of the
expectation, which is the

46
00:02:33,560 --> 00:02:37,010
concept of the variance
of a random variable.

47
00:02:37,010 --> 00:02:37,910
OK.

48
00:02:37,910 --> 00:02:40,455
So what is a random variable?

49
00:02:40,455 --> 00:02:43,860

50
00:02:43,860 --> 00:02:47,710
It's an assignment of a
numerical value to every

51
00:02:47,710 --> 00:02:49,950
possible outcome of
the experiment.

52
00:02:49,950 --> 00:02:51,430
So here's the picture.

53
00:02:51,430 --> 00:02:54,800
The sample space is this class,
and we've got lots of

54
00:02:54,800 --> 00:02:56,900
students in here.

55
00:02:56,900 --> 00:03:00,480
This is our sample
space, omega.

56
00:03:00,480 --> 00:03:04,220
I'm interested in the height
of a random student.

57
00:03:04,220 --> 00:03:08,990
So I'm going to use a real line
where I record height,

58
00:03:08,990 --> 00:03:12,490
and let's say this is
height in inches.

59
00:03:12,490 --> 00:03:16,380
And the experiment happens,
I pick a random student.

60
00:03:16,380 --> 00:03:19,510
And I go and measure the height
of that random student,

61
00:03:19,510 --> 00:03:22,200
and that gives me a
specific number.

62
00:03:22,200 --> 00:03:25,310
So what's a good number
in inches?

63
00:03:25,310 --> 00:03:28,430
Let's say 60.

64
00:03:28,430 --> 00:03:28,840
OK.

65
00:03:28,840 --> 00:03:33,480
Or I pick another student, and
that student has a height of

66
00:03:33,480 --> 00:03:36,030
71 inches, and so on.

67
00:03:36,030 --> 00:03:37,420
So this is the experiment.

68
00:03:37,420 --> 00:03:39,020
These are the outcomes.

69
00:03:39,020 --> 00:03:43,070
These are the numerical values
of the random variable that we

70
00:03:43,070 --> 00:03:46,340
call height.

71
00:03:46,340 --> 00:03:46,690
OK.

72
00:03:46,690 --> 00:03:49,770
So mathematically, what are
we dealing with here?

73
00:03:49,770 --> 00:03:54,100
We're basically dealing with a
function from the sample space

74
00:03:54,100 --> 00:03:56,230
into the real numbers.

75
00:03:56,230 --> 00:04:01,280
That function takes as argument,
an outcome of the

76
00:04:01,280 --> 00:04:04,720
experiment, that is a typical
student, and produces the

77
00:04:04,720 --> 00:04:07,520
value of that function, which
is the height of that

78
00:04:07,520 --> 00:04:09,100
particular student.

79
00:04:09,100 --> 00:04:12,600
So we think of an abstract
object that we denote by a

80
00:04:12,600 --> 00:04:17,480
capital H, which is the random
variable called height.

81
00:04:17,480 --> 00:04:21,279
And that random variable is
essentially this particular

82
00:04:21,279 --> 00:04:25,870
function that we talked
about here.

83
00:04:25,870 --> 00:04:26,490
OK.

84
00:04:26,490 --> 00:04:29,190
So there's a distinction that
we're making here--

85
00:04:29,190 --> 00:04:32,170
H is height in the abstract.

86
00:04:32,170 --> 00:04:33,570
It's the function.

87
00:04:33,570 --> 00:04:36,440
These numbers here are
particular numerical values

88
00:04:36,440 --> 00:04:39,580
that this function takes when
you choose one particular

89
00:04:39,580 --> 00:04:41,400
outcome of the experiment.

90
00:04:41,400 --> 00:04:44,910
Now, when you have a single
probability experiment, you

91
00:04:44,910 --> 00:04:47,690
can have multiple random
variables.

92
00:04:47,690 --> 00:04:52,690
So perhaps, instead of just
height, I'm also interested in

93
00:04:52,690 --> 00:04:55,890
the weight of a typical
student.

94
00:04:55,890 --> 00:04:58,680
And so when the experiment
happens, I

95
00:04:58,680 --> 00:05:00,690
pick that random student--

96
00:05:00,690 --> 00:05:02,250
this is the height
of the student.

97
00:05:02,250 --> 00:05:05,450
But that student would also
have a weight, and I could

98
00:05:05,450 --> 00:05:06,880
record it here.

99
00:05:06,880 --> 00:05:09,570
And similarly, every student
is going to have their own

100
00:05:09,570 --> 00:05:10,910
particular weight.

101
00:05:10,910 --> 00:05:13,850
So the weight function is a
different function from the

102
00:05:13,850 --> 00:05:17,330
sample space to the real
numbers, and it's a different

103
00:05:17,330 --> 00:05:18,680
random variable.

104
00:05:18,680 --> 00:05:21,840
So the point I'm making here is
that a single probabilistic

105
00:05:21,840 --> 00:05:26,825
experiment may involve several
interesting random variables.

106
00:05:26,825 --> 00:05:30,190
I may be interested in the
height of a random student or

107
00:05:30,190 --> 00:05:31,760
the weight of the
random student.

108
00:05:31,760 --> 00:05:33,300
These are different
random variables

109
00:05:33,300 --> 00:05:35,140
that could be of interest.

110
00:05:35,140 --> 00:05:37,580
I can also do other things.

111
00:05:37,580 --> 00:05:44,000
Suppose I define an object such
as H bar, which is 2.5 times H.

112
00:05:44,000 --> 00:05:46,420
What does that correspond to?

113
00:05:46,420 --> 00:05:50,540
Well, this is the height
in centimeters.

114
00:05:50,540 --> 00:05:55,160
Now, H bar is a function of H
itself, but if you were to

115
00:05:55,160 --> 00:05:57,740
draw the picture, the picture
would go this way.

116
00:05:57,740 --> 00:06:03,100
60 gets mapped to 150, 71 gets
mapped to, oh, that's

117
00:06:03,100 --> 00:06:04,720
too hard for me.

118
00:06:04,720 --> 00:06:10,040
OK, gets mapped to something,
and so on.

119
00:06:10,040 --> 00:06:14,220
So H bar is also a
random variable.

120
00:06:14,220 --> 00:06:15,100
Why?

121
00:06:15,100 --> 00:06:19,140
Once I pick a particular
student, that particular

122
00:06:19,140 --> 00:06:24,120
outcome determines completely
the numerical value of H bar,

123
00:06:24,120 --> 00:06:29,070
which is the height of that
student but measured in

124
00:06:29,070 --> 00:06:31,080
centimeters.

125
00:06:31,080 --> 00:06:33,820
What we have here is actually
a random variable, which is

126
00:06:33,820 --> 00:06:37,860
defined as a function of another
random variable.

127
00:06:37,860 --> 00:06:41,010
And the point that this example
is trying to make is

128
00:06:41,010 --> 00:06:42,640
that functions of
random variables

129
00:06:42,640 --> 00:06:44,630
are also random variables.

130
00:06:44,630 --> 00:06:47,390
The experiment happens, the
experiment determines a

131
00:06:47,390 --> 00:06:49,410
numerical value for
this object.

132
00:06:49,410 --> 00:06:51,630
And once you have the numerical
value for this

133
00:06:51,630 --> 00:06:54,180
object, that determines
also the numerical

134
00:06:54,180 --> 00:06:55,900
value for that object.

135
00:06:55,900 --> 00:06:59,080
So given an outcome, the
numerical value of this

136
00:06:59,080 --> 00:07:00,840
particular object
is determined.

137
00:07:00,840 --> 00:07:05,730
So H bar is itself a function
from the sample space, from

138
00:07:05,730 --> 00:07:08,340
outcomes to numerical values.

139
00:07:08,340 --> 00:07:11,350
And that makes it a random
variable according to the

140
00:07:11,350 --> 00:07:13,920
formal definition that
we have here.

141
00:07:13,920 --> 00:07:18,610
So the formal definition is that
the random variable is

142
00:07:18,610 --> 00:07:22,570
not random, it's not a variable,
it's just a function

143
00:07:22,570 --> 00:07:25,000
from the sample space
to the real numbers.

144
00:07:25,000 --> 00:07:29,220
That's the abstract, right way
of thinking about them.

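As an aside (not part of the lecture), the formal definition just given -- a random variable is a function from the sample space to the real numbers, and a function of a random variable is again a random variable -- can be sketched in code. The student names and heights below are made-up illustrations, not data from the lecture:

```python
# Hypothetical sample space: three students with made-up heights in inches.
sample_space = {"alice": 60, "bob": 71, "carol": 65}

def H(omega):
    """The random variable 'height': a function from outcomes to real numbers."""
    return sample_space[omega]

def H_bar(omega):
    """Height in centimeters: a function of H, hence also a random variable."""
    return 2.5 * H(omega)  # the lecture's rounded inches-to-cm factor

print(H("alice"), H_bar("alice"))  # 60 150.0
```

Picking an outcome (a student) determines the numerical value of H, which in turn determines the numerical value of H bar.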
145
00:07:29,220 --> 00:07:32,000
Now, random variables can
be of different types.

146
00:07:32,000 --> 00:07:34,440
They can be discrete
or continuous.

147
00:07:34,440 --> 00:07:38,490
Suppose that I measure the
heights in inches, but I round

148
00:07:38,490 --> 00:07:40,270
to the nearest inch.

149
00:07:40,270 --> 00:07:43,490
Then the numerical values I'm
going to get here would be

150
00:07:43,490 --> 00:07:45,000
just integers.

151
00:07:45,000 --> 00:07:47,080
So that would make
it an integer

152
00:07:47,080 --> 00:07:48,520
valued random variable.

153
00:07:48,520 --> 00:07:50,870
And this is a discrete
random variable.

154
00:07:50,870 --> 00:07:54,950
Or maybe I have a scale for
measuring height which is

155
00:07:54,950 --> 00:07:58,640
infinitely precise and records
your height to an infinite

156
00:07:58,640 --> 00:08:00,830
number of digits of precision.

157
00:08:00,830 --> 00:08:03,390
In that case, your height
would be just a

158
00:08:03,390 --> 00:08:05,430
general real number.

159
00:08:05,430 --> 00:08:08,510
So we would have a random
variable that takes values in

160
00:08:08,510 --> 00:08:10,720
the entire set of
real numbers.

161
00:08:10,720 --> 00:08:14,490
Well, I guess not really
negative numbers, but the set

162
00:08:14,490 --> 00:08:16,350
of non-negative numbers.

163
00:08:16,350 --> 00:08:19,880
And that would be a continuous
random variable.

164
00:08:19,880 --> 00:08:22,250
It takes values in
a continuous set.

165
00:08:22,250 --> 00:08:25,020
So we will be talking about both
discrete and continuous

166
00:08:25,020 --> 00:08:26,330
random variables.

167
00:08:26,330 --> 00:08:28,790
The first thing we will do
will be to devote a few

168
00:08:28,790 --> 00:08:32,030
lectures on discrete random
variables, because discrete is

169
00:08:32,030 --> 00:08:33,380
always easier.

170
00:08:33,380 --> 00:08:35,640
And then we're going to
repeat everything in

171
00:08:35,640 --> 00:08:37,710
the continuous setting.

172
00:08:37,710 --> 00:08:41,760
So discrete is easier, and
it's the right place to

173
00:08:41,760 --> 00:08:45,140
understand all the concepts,
even those that may appear to

174
00:08:45,140 --> 00:08:47,090
be elementary.

175
00:08:47,090 --> 00:08:50,110
And then you will be set to
understand what's going on

176
00:08:50,110 --> 00:08:51,650
when we go to the
continuous case.

177
00:08:51,650 --> 00:08:54,260
So in the continuous case, you
get all the complications of

178
00:08:54,260 --> 00:08:57,840
calculus and some extra math
that comes in there.

179
00:08:57,840 --> 00:09:00,910
So it's important to have nailed
down all the concepts very

180
00:09:00,910 --> 00:09:04,320
well in the easy, discrete case
so that you don't have

181
00:09:04,320 --> 00:09:06,770
conceptual hurdles when
you move on to

182
00:09:06,770 --> 00:09:08,730
the continuous case.

183
00:09:08,730 --> 00:09:13,610
Now, one important remark that
may seem trivial but it's

184
00:09:13,610 --> 00:09:17,920
actually very important so that
you don't get tangled up

185
00:09:17,920 --> 00:09:20,550
between different types
of concepts--

186
00:09:20,550 --> 00:09:23,440
there's a fundamental
distinction between the random

187
00:09:23,440 --> 00:09:26,220
variable itself, and
the numerical

188
00:09:26,220 --> 00:09:29,080
values that it takes.

189
00:09:29,080 --> 00:09:31,670
Abstractly speaking, or
mathematically speaking, a

190
00:09:31,670 --> 00:09:38,350
random variable, X, or H in this
example, is a function.

191
00:09:38,350 --> 00:09:39,140
OK.

192
00:09:39,140 --> 00:09:43,750
Maybe if you like programming
the words "procedure" or

193
00:09:43,750 --> 00:09:45,820
"sub-routine" might be better.

194
00:09:45,820 --> 00:09:48,290
So what's the sub-routine
height?

195
00:09:48,290 --> 00:09:51,470
Given a student, I take that
student, force them on the

196
00:09:51,470 --> 00:09:53,110
scale and measure them.

197
00:09:53,110 --> 00:09:56,610
That's the sub-routine that
measures heights.

198
00:09:56,610 --> 00:10:00,270
It's really a function that
takes students as input and

199
00:10:00,270 --> 00:10:02,670
produces numbers as output.

200
00:10:02,670 --> 00:10:05,790
The sub-routine we denoted
by capital H.

201
00:10:05,790 --> 00:10:07,450
That's the random variable.

202
00:10:07,450 --> 00:10:10,500
But once you plug in a
particular student into that

203
00:10:10,500 --> 00:10:14,460
sub-routine, you end up getting
a particular number.

204
00:10:14,460 --> 00:10:17,610
This is the numerical output
of that sub-routine or the

205
00:10:17,610 --> 00:10:19,900
numerical value of
that function.

206
00:10:19,900 --> 00:10:24,390
And that numerical value is an
element of the real numbers.

207
00:10:24,390 --> 00:10:29,040
So the numerical value is a
real number, where this

208
00:10:29,040 --> 00:10:35,670
capital X is a function from
omega to the real numbers.

209
00:10:35,670 --> 00:10:38,400
So they are very different
types of objects.

210
00:10:38,400 --> 00:10:41,510
And the way that we keep track
of what we're talking about at

211
00:10:41,510 --> 00:10:45,020
any given time is by using
capital letters for random

212
00:10:45,020 --> 00:10:49,150
variables and lower case
letters for numbers.

213
00:10:49,150 --> 00:10:52,260

214
00:10:52,260 --> 00:10:52,700
OK.

215
00:10:52,700 --> 00:11:00,520
So now once we have a random
variable at hand, that random

216
00:11:00,520 --> 00:11:04,410
variable takes on different
numerical values.

217
00:11:04,410 --> 00:11:08,980
And we want to say
something about the relative

218
00:11:08,980 --> 00:11:12,760
likelihoods of the different
numerical values that the

219
00:11:12,760 --> 00:11:15,350
random variable can take.

220
00:11:15,350 --> 00:11:21,300
So here's our sample space,
and here's the real line.

221
00:11:21,300 --> 00:11:23,850

222
00:11:23,850 --> 00:11:28,900
And there's a bunch of outcomes
that gave rise to one

223
00:11:28,900 --> 00:11:30,940
particular numerical value.

224
00:11:30,940 --> 00:11:33,660
There's another numerical value
that arises if we have

225
00:11:33,660 --> 00:11:34,270
this outcome.

226
00:11:34,270 --> 00:11:37,620
There's another numerical value
that arises if we have

227
00:11:37,620 --> 00:11:38,390
this outcome.

228
00:11:38,390 --> 00:11:40,430
So our sample space is here.

229
00:11:40,430 --> 00:11:42,530
The real numbers are here.

230
00:11:42,530 --> 00:11:46,600
And what we want to do is to ask
the question, how likely

231
00:11:46,600 --> 00:11:50,000
is that particular numerical
value to occur?

232
00:11:50,000 --> 00:11:53,620
So what we're essentially asking
is, how likely is it

233
00:11:53,620 --> 00:11:57,450
that we obtain an outcome that
leads to that particular

234
00:11:57,450 --> 00:11:59,000
numerical value?

235
00:11:59,000 --> 00:12:02,680
We calculate that overall
probability of that numerical

236
00:12:02,680 --> 00:12:07,810
value and we represent that
probability using a bar so

237
00:12:07,810 --> 00:12:13,300
that we end up generating
a bar graph.

238
00:12:13,300 --> 00:12:16,550
So that could be a possible
bar graph

239
00:12:16,550 --> 00:12:19,210
associated with this picture.

240
00:12:19,210 --> 00:12:22,860
The size of this bar is the
total probability that our

241
00:12:22,860 --> 00:12:27,590
random variable took on this
numerical value, which is just

242
00:12:27,590 --> 00:12:32,240
the sum of the probabilities of
the different outcomes that

243
00:12:32,240 --> 00:12:34,310
led to that numerical value.

244
00:12:34,310 --> 00:12:37,100
So the thing that we're plotting
here, the bar graph--

245
00:12:37,100 --> 00:12:39,370
we give a name to it.

246
00:12:39,370 --> 00:12:43,910
It's a function, which we denote
by lowercase p, subscript capital

247
00:12:43,910 --> 00:12:47,690
X. The capital X indicates which
random variable we're

248
00:12:47,690 --> 00:12:48,920
talking about.

249
00:12:48,920 --> 00:12:54,670
And it's a function of little
x, which ranges over the

250
00:12:54,670 --> 00:12:58,530
values that our random
variable can take.

251
00:12:58,530 --> 00:13:04,830
So in mathematical notation, the
value of the PMF at some

252
00:13:04,830 --> 00:13:09,220
particular number, little x,
is the probability that our

253
00:13:09,220 --> 00:13:14,060
random variable takes on the
numerical value, little x.

254
00:13:14,060 --> 00:13:17,510
And if you want to be precise
about what this means, it's

255
00:13:17,510 --> 00:13:23,020
the overall probability of all
outcomes for which the random

256
00:13:23,020 --> 00:13:26,770
variable ends up taking
that value, little x.

257
00:13:26,770 --> 00:13:34,110
So this is the overall
probability of all omegas that

258
00:13:34,110 --> 00:13:36,950
lead to that particular
numerical

259
00:13:36,950 --> 00:13:39,260
value, x, of interest.

260
00:13:39,260 --> 00:13:44,630
So what do we know about PMFs?

261
00:13:44,630 --> 00:13:47,600
Since there are probabilities,
all these entries in the bar

262
00:13:47,600 --> 00:13:49,880
graph have to be non-negative.

263
00:13:49,880 --> 00:13:54,610
Also, if you exhaust all the
possible values of little x's,

264
00:13:54,610 --> 00:13:57,840
you will have exhausted all the
possible outcomes here.

265
00:13:57,840 --> 00:14:01,030
Because every outcome leads
to some particular x.

266
00:14:01,030 --> 00:14:03,160
So the sum of these
probabilities

267
00:14:03,160 --> 00:14:04,760
should be equal to one.

268
00:14:04,760 --> 00:14:06,890
This is the second
relation here.

269
00:14:06,890 --> 00:14:10,970
So this relation tells
us that some little

270
00:14:10,970 --> 00:14:13,150
x is going to happen.

271
00:14:13,150 --> 00:14:15,500
They happen with different
probabilities, but when you

272
00:14:15,500 --> 00:14:19,370
consider all the possible little
x's together, one of

273
00:14:19,370 --> 00:14:21,750
those little x's is going
to be realized.

274
00:14:21,750 --> 00:14:25,640
Probabilities need
to add to one.

275
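A short sketch (not from the lecture) of the definition and the two properties just stated: the PMF value at x is the total probability of all outcomes mapping to x, its entries are non-negative, and they sum to one. The outcomes, probabilities, and values below are made-up for illustration:

```python
from collections import defaultdict

# Made-up example: four equally likely outcomes and a random variable X
# given as a lookup table from outcomes to numerical values.
outcome_probs = {"a": 0.25, "b": 0.25, "c": 0.25, "d": 0.25}
X = {"a": 1, "b": 2, "c": 2, "d": 3}

# p_X(x) = P({omega : X(omega) = x}): add up the probabilities of all
# outcomes that lead to the numerical value x.
pmf = defaultdict(float)
for omega, prob in outcome_probs.items():
    pmf[X[omega]] += prob

assert all(p >= 0 for p in pmf.values())      # entries are non-negative
assert abs(sum(pmf.values()) - 1.0) < 1e-12   # and they sum to one
print(dict(pmf))  # {1: 0.25, 2: 0.5, 3: 0.25}
```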
00:14:25,640 --> 00:14:26,090
OK.

276
00:14:26,090 --> 00:14:31,200
So let's get our first example
of a non-trivial bar graph.

277
00:14:31,200 --> 00:14:35,780
Consider the experiment where
I start with a coin and I

278
00:14:35,780 --> 00:14:38,270
start flipping it
over and over.

279
00:14:38,270 --> 00:14:42,670
And I do this until I obtain
heads for the first time.

280
00:14:42,670 --> 00:14:45,610
So what are possible outcomes
of this experiment?

281
00:14:45,610 --> 00:14:48,850
One possible outcome is that
I obtain heads at the first

282
00:14:48,850 --> 00:14:50,930
toss, and then I stop.

283
00:14:50,930 --> 00:14:55,040
In this case, my random variable
takes the value 1.

284
00:14:55,040 --> 00:14:59,360
Or it's possible that I obtain
tails and then heads.

285
00:14:59,360 --> 00:15:02,500
How many tosses did it take
until heads appeared?

286
00:15:02,500 --> 00:15:04,390
This would be X equals 2.

287
00:15:04,390 --> 00:15:09,930
Or more generally, I might
obtain tails for k minus 1

288
00:15:09,930 --> 00:15:15,000
times, and then obtain heads
at the k-th time, in which

289
00:15:15,000 --> 00:15:19,930
case, our random variable takes
the value, little k.

290
00:15:19,930 --> 00:15:21,210
So that's the experiment.

291
00:15:21,210 --> 00:15:25,350
So capital X is a well defined
random variable.

292
00:15:25,350 --> 00:15:29,070
It's the number of tosses it
takes until I see heads for

293
00:15:29,070 --> 00:15:30,710
the first time.

294
00:15:30,710 --> 00:15:32,330
These are the possible
outcomes.

295
00:15:32,330 --> 00:15:34,710
These are elements of
our sample space.

296
00:15:34,710 --> 00:15:38,750
And these are the values of X
depending on the outcome.

297
00:15:38,750 --> 00:15:43,950
Clearly X is a function
of the outcome.

298
00:15:43,950 --> 00:15:47,570
You tell me the outcome, I'm
going to tell you what X is.

299
00:15:47,570 --> 00:15:54,520
So what we want to do now is to
calculate the PMF of X. So

300
00:15:54,520 --> 00:15:59,210
Px of k is, by definition, the
probability that our random

301
00:15:59,210 --> 00:16:02,810
variable takes the value k.

302
00:16:02,810 --> 00:16:07,250
For the random variable to take
the value of k, the first

303
00:16:07,250 --> 00:16:09,680
head appears at toss number k.

304
00:16:09,680 --> 00:16:13,550
The only way that this event
can happen is if we obtain

305
00:16:13,550 --> 00:16:15,650
this sequence of events.

306
00:16:15,650 --> 00:16:19,290
Tails the first k minus
1 times, and

307
00:16:19,290 --> 00:16:21,820
heads at the k-th flip.

308
00:16:21,820 --> 00:16:25,980
So this event, that the random
variable is equal to k, is the

309
00:16:25,980 --> 00:16:30,590
same as this event, k minus 1
tails followed by 1 head.

310
00:16:30,590 --> 00:16:32,780
What's the probability
of that event?

311
00:16:32,780 --> 00:16:36,450
We're assuming that the coin
tosses are independent.

312
00:16:36,450 --> 00:16:39,190
So to find the probability
of this event, we need to

313
00:16:39,190 --> 00:16:41,860
multiply the probability of
tails, times the probability

314
00:16:41,860 --> 00:16:43,680
of tails, times the probability
of tails.

315
00:16:43,680 --> 00:16:47,370
We multiply k minus one times,
times the probability of

316
00:16:47,370 --> 00:16:50,660
heads, which puts an
extra p at the end.

317
00:16:50,660 --> 00:16:56,470
And this is the formula for the
so-called geometric PMF.

318
00:16:56,470 --> 00:16:58,650
And why do we call
it geometric?

319
00:16:58,650 --> 00:17:04,859
Because if you go and plot the
bar graph of this random

320
00:17:04,859 --> 00:17:10,510
variable, X, we start
at 1 with a certain

321
00:17:10,510 --> 00:17:14,300
number, which is p.

322
00:17:14,300 --> 00:17:20,550
And then at 2 we get p(1-p).

323
00:17:20,550 --> 00:17:23,640
At 3 we're going to get
something smaller, it's p

324
00:17:23,640 --> 00:17:25,900
times (1-p)-squared.

325
00:17:25,900 --> 00:17:29,740
And the bars keep going down
at the rate of geometric

326
00:17:29,740 --> 00:17:30,730
progression.

327
00:17:30,730 --> 00:17:34,490
Each bar is smaller than the
previous bar, because each

328
00:17:34,490 --> 00:17:38,380
time we get an extra factor
of 1-p involved.

329
00:17:38,380 --> 00:17:42,480
So the shape of this
PMF is the graph

330
00:17:42,480 --> 00:17:44,300
of a geometric sequence.

331
00:17:44,300 --> 00:17:48,330
For that reason, we say that
it's the geometric PMF, and we

332
00:17:48,330 --> 00:17:51,860
call X also a geometric
random variable.

333
00:17:51,860 --> 00:17:55,730
So the number of coin tosses
until the first head is a

334
00:17:55,730 --> 00:17:58,290
geometric random variable.

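The geometric PMF just derived, p_X(k) = (1-p)^(k-1) p, can be sketched in a few lines (an aside, not part of the lecture); each bar is the previous one scaled by 1-p, and the values over k = 1, 2, ... sum to one:

```python
def geometric_pmf(k, p):
    """p_X(k) = (1-p)^(k-1) * p: k-1 tails followed by a head at toss k."""
    return (1 - p) ** (k - 1) * p

p = 0.5
print([geometric_pmf(k, p) for k in range(1, 5)])  # [0.5, 0.25, 0.125, 0.0625]

# The whole PMF sums to 1 (checked numerically over many terms here).
assert abs(sum(geometric_pmf(k, p) for k in range(1, 200)) - 1.0) < 1e-9
```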
335
00:17:58,290 --> 00:18:00,730
So this was an example
of how to compute the

336
00:18:00,730 --> 00:18:02,630
PMF of a random variable.

337
00:18:02,630 --> 00:18:06,510
This was an easy example,
because this event could be

338
00:18:06,510 --> 00:18:09,520
realized in one and
only one way.

339
00:18:09,520 --> 00:18:12,650
So to find the probability of
this, we just needed to find

340
00:18:12,650 --> 00:18:15,510
the probability of this
particular outcome.

341
00:18:15,510 --> 00:18:18,680
More generally, there's going
to be many outcomes that can

342
00:18:18,680 --> 00:18:22,120
lead to the same numerical
value.

343
00:18:22,120 --> 00:18:25,010
And we need to keep track
of all of them.

344
00:18:25,010 --> 00:18:28,030
For example, in this picture,
if I want to find this value

345
00:18:28,030 --> 00:18:31,610
of the PMF, I need to add up the
probabilities of all the

346
00:18:31,610 --> 00:18:34,550
outcomes that lead
to that value.

347
00:18:34,550 --> 00:18:37,070
So the general procedure
is exactly what

348
00:18:37,070 --> 00:18:38,240
this picture suggests.

349
00:18:38,240 --> 00:18:43,050
To find this probability, you go
and identify which outcomes

350
00:18:43,050 --> 00:18:47,770
lead to this numerical value,
and add their probabilities.

351
00:18:47,770 --> 00:18:49,590
So let's do a simple example.

352
00:18:49,590 --> 00:18:51,820
I take a tetrahedral die.

353
00:18:51,820 --> 00:18:53,820
I toss it twice.

354
00:18:53,820 --> 00:18:55,850
And there's lots of random
variables that you can

355
00:18:55,850 --> 00:18:57,850
associate with the
same experiment.

356
00:18:57,850 --> 00:19:01,450
So the outcome of the first
throw, we can call it F.

357
00:19:01,450 --> 00:19:05,300
That's a random variable because
it's determined once

358
00:19:05,300 --> 00:19:09,840
you tell me what happens
in the experiment.

359
00:19:09,840 --> 00:19:11,615
The outcome of the
second throw is

360
00:19:11,615 --> 00:19:13,470
another random variable.

361
00:19:13,470 --> 00:19:16,890
The minimum of the two throws
is also a random variable.

362
00:19:16,890 --> 00:19:20,580
Once I do the experiment, this
random variable takes on a

363
00:19:20,580 --> 00:19:22,580
specific numerical value.

364
00:19:22,580 --> 00:19:26,850
So suppose I do the experiment
and I get a 2 and a 3.

365
00:19:26,850 --> 00:19:29,530
So this random variable is going
to take the numerical

366
00:19:29,530 --> 00:19:30,420
value of 2.

367
00:19:30,420 --> 00:19:32,440
This is going to take the
numerical value of 3.

368
00:19:32,440 --> 00:19:35,500
This is going to take the
numerical value of 2.

369
00:19:35,500 --> 00:19:38,830
And now suppose that I want to
calculate the PMF of this

370
00:19:38,830 --> 00:19:40,490
random variable.

371
00:19:40,490 --> 00:19:43,311
What I will need to do is to
calculate Px(0), Px(1), Px(2),

372
00:19:43,311 --> 00:19:47,980
Px(3), and so on.

373
00:19:47,980 --> 00:19:50,680
Let's not do the entire
calculation here; let's just

374
00:19:50,680 --> 00:19:54,770
calculate one of the
entries of the PMF.

375
00:19:54,770 --> 00:19:56,010
So Px(2)--

376
00:19:56,010 --> 00:19:58,870
that's the probability that the
minimum of the two throws

377
00:19:58,870 --> 00:20:00,280
gives us a 2.

378
00:20:00,280 --> 00:20:04,080
And this can happen
in many ways.

379
00:20:04,080 --> 00:20:06,390
There are five ways that
it can happen.

380
00:20:06,390 --> 00:20:11,010
Those are all of the outcomes
for which the smallest of the

381
00:20:11,010 --> 00:20:13,780
two is equal to 2.

382
00:20:13,780 --> 00:20:18,090
That's five outcomes assuming
that the tetrahedral die is

383
00:20:18,090 --> 00:20:20,920
fair and the tosses
are independent.

384
00:20:20,920 --> 00:20:24,450
Each one of these outcomes
has probability of 1/16.

385
00:20:24,450 --> 00:20:27,185
There's five of them, so
we get an answer, 5/16.

386
00:20:27,185 --> 00:20:30,490

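The counting argument for Px(2) can be checked by brute force (an aside, not part of the lecture): enumerate the 16 equally likely outcomes of two fair tetrahedral dice and count those whose minimum is 2:

```python
from fractions import Fraction
from itertools import product

# All 16 equally likely outcomes of two independent fair tetrahedral dice.
outcomes = list(product(range(1, 5), repeat=2))

# p_X(2) for X = min of the two throws: count outcomes with minimum equal to 2.
favorable = [(f, s) for f, s in outcomes if min(f, s) == 2]
p_x_2 = Fraction(len(favorable), len(outcomes))
print(len(favorable), p_x_2)  # 5 5/16
```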
387
00:20:30,490 --> 00:20:33,770
Conceptually, this is just the
procedure that you use to

388
00:20:33,770 --> 00:20:37,280
calculate PMFs the way that
you construct this

389
00:20:37,280 --> 00:20:38,730
particular bar graph.

390
00:20:38,730 --> 00:20:41,340
You consider all the possible
values of your random

391
00:20:41,340 --> 00:20:43,860
variable, and for each one of
those values you

392
00:20:43,860 --> 00:20:47,090
find the probability that the
random variable takes on that

393
00:20:47,090 --> 00:20:49,710
value by adding the
probabilities of all the

394
00:20:49,710 --> 00:20:51,750
possible outcomes that
lead to that

395
00:20:51,750 --> 00:20:54,100
particular numerical value.

396
00:20:54,100 --> 00:20:57,620
So let's do another, more
interesting one.

397
00:20:57,620 --> 00:21:00,270
So let's revisit the
coin tossing

398
00:21:00,270 --> 00:21:02,490
problem from last time.

399
00:21:02,490 --> 00:21:11,600
Let us fix a number n, and we
decide to flip a coin n

400
00:21:11,600 --> 00:21:13,080
consecutive times.

401
00:21:13,080 --> 00:21:16,100
Each time the coin tosses
are independent.

402
00:21:16,100 --> 00:21:19,300
And each one of the tosses will
have a probability, p, of

403
00:21:19,300 --> 00:21:20,960
obtaining heads.

404
00:21:20,960 --> 00:21:23,590
Let's consider the random
variable, which is the total

405
00:21:23,590 --> 00:21:26,325
number of heads that
have been obtained.

406
00:21:26,325 --> 00:21:29,690
Well, that's something that
we dealt with last time.

407
00:21:29,690 --> 00:21:33,380
We know the probabilities for
different numbers of heads,

408
00:21:33,380 --> 00:21:35,530
but we're just going
to do the same now

409
00:21:35,530 --> 00:21:37,960
using today's notation.

410
00:21:37,960 --> 00:21:41,610
So let's take, for concreteness,
n equal to 4.

411
00:21:41,610 --> 00:21:48,410
Px is the PMF of that random
variable, X. Px(2) is meant to

412
00:21:48,410 --> 00:21:52,130
be, by definition, the
probability that a random

413
00:21:52,130 --> 00:21:54,420
variable takes the value of 2.

414
00:21:54,420 --> 00:21:57,410
So this is the probability
that we have, exactly two

415
00:21:57,410 --> 00:22:00,080
heads in our four tosses.

416
00:22:00,080 --> 00:22:03,910
The event of exactly two heads
can happen in multiple ways.

417
00:22:03,910 --> 00:22:05,740
And here I've written
down the different

418
00:22:05,740 --> 00:22:06,920
ways that it can happen.

419
00:22:06,920 --> 00:22:09,230
It turns out that there's
exactly six

420
00:22:09,230 --> 00:22:10,920
ways that it can happen.

421
00:22:10,920 --> 00:22:15,010
And each one of these ways,
luckily enough, has the same

422
00:22:15,010 --> 00:22:16,180
probability--

423
00:22:16,180 --> 00:22:19,460
p-squared times (1-p)-squared.

424
00:22:19,460 --> 00:22:24,690
So that gives us the value for
the PMF evaluated at 2.

425
00:22:24,690 --> 00:22:28,370
So here we just counted
explicitly that we have six

426
00:22:28,370 --> 00:22:31,170
possible ways that this can
happen, and this gave rise to

427
00:22:31,170 --> 00:22:32,900
this factor of 6.

428
00:22:32,900 --> 00:22:37,360
But this factor of 6 turns
out to be the same as

429
00:22:37,360 --> 00:22:39,350
this 4 choose 2.

430
00:22:39,350 --> 00:22:42,490
If you remember the definition
from last time, 4 choose 2 is

431
00:22:42,490 --> 00:22:45,650
4 factorial divided by 2
factorial, divided by 2

432
00:22:45,650 --> 00:22:49,940
factorial, which is
indeed equal to 6.

433
00:22:49,940 --> 00:22:52,370
And this is the more general
formula that

434
00:22:52,370 --> 00:22:53,830
you would be using.

435
00:22:53,830 --> 00:22:59,190
In general, if you have n tosses
and you're interested

436
00:22:59,190 --> 00:23:02,540
in the probability of obtaining
k heads, the

437
00:23:02,540 --> 00:23:05,560
probability of that event is
given by this formula.

438
00:23:05,560 --> 00:23:08,710
So that's the formula that
we derived last time.
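The counting just described -- n choose k ways, each with probability p to the k times (1-p) to the (n-k) -- can be checked with a short Python sketch (Python standing in for the MATLAB mentioned later in the lecture; the helper name binomial_pmf is ours):

```python
from math import comb

def binomial_pmf(k, n, p):
    # P(X = k) = (n choose k) * p^k * (1 - p)^(n - k)
    return comb(n, k) * p**k * (1 - p) ** (n - k)

# n = 4 tosses: 4 choose 2 = 4!/(2! 2!) = 6 ways to see exactly two heads
print(comb(4, 2))               # 6
print(binomial_pmf(2, 4, 0.5))  # 6 * (1/2)^2 * (1/2)^2 = 0.375
```

For a fair coin each of the six sequences has probability 1/16, so Px(2) = 6/16 = 0.375, matching the printout.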

439
00:23:08,710 --> 00:23:11,230
Except that last time we didn't
use this notation.

440
00:23:11,230 --> 00:23:15,300
We just said the probability of
k heads is equal to this.

441
00:23:15,300 --> 00:23:18,020
Today we introduce the
extra notation.

442
00:23:18,020 --> 00:23:22,470
And also having that notation,
we may be tempted to also plot

443
00:23:22,470 --> 00:23:26,310
a bar graph of Px.

444
00:23:26,310 --> 00:23:29,390
In this case, for the coin
tossing problem.

445
00:23:29,390 --> 00:23:35,090
And if you plot that bar graph
as a function of k when n is a

446
00:23:35,090 --> 00:23:40,850
fairly large number, what you
will end up obtaining is a bar

447
00:23:40,850 --> 00:23:47,525
graph that has a shape of
something like this.

448
00:23:47,525 --> 00:23:53,840

449
00:23:53,840 --> 00:23:58,800
So certain values of k are more
likely than others, and

450
00:23:58,800 --> 00:24:00,790
the more likely values
are somewhere in the

451
00:24:00,790 --> 00:24:02,230
middle of the range.

452
00:24:02,230 --> 00:24:03,490
And extreme values--

453
00:24:03,490 --> 00:24:07,110
too few heads or too many
heads-- are unlikely.

454
00:24:07,110 --> 00:24:09,870
Now, the miraculous thing is
that it turns out that this

455
00:24:09,870 --> 00:24:15,550
curve gets a pretty definite
shape, like a so-called bell

456
00:24:15,550 --> 00:24:18,210
curve, when n is big.

457
00:24:18,210 --> 00:24:20,770

458
00:24:20,770 --> 00:24:24,920
This is a very deep and central
fact from probability

459
00:24:24,920 --> 00:24:30,210
theory that we will get to
in a couple of months.

460
00:24:30,210 --> 00:24:33,900
For now, it just could be
a curious observation.

461
00:24:33,900 --> 00:24:38,390
If you go into MATLAB and put
this formula in and ask MATLAB

462
00:24:38,390 --> 00:24:41,540
to plot it for you, you're going
to get an interesting

463
00:24:41,540 --> 00:24:43,140
shape of this form.

464
00:24:43,140 --> 00:24:46,700
And later on we will have to
sort of understand where this

465
00:24:46,700 --> 00:24:50,760
is coming from and whether
there's a nice, simple formula

466
00:24:50,760 --> 00:24:54,920
for the asymptotic
form that we get.
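In place of the MATLAB plot suggested above, here is a rough Python sketch of the same bar graph; it prints a crude text version of the bell shape for n equal to 100 and a fair coin (the display scale factor of 500 is an arbitrary choice of ours):

```python
from math import comb

n, p = 100, 0.5
pmf = [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]

# the most likely number of heads sits in the middle of the range
peak = max(range(n + 1), key=lambda k: pmf[k])
print(peak)  # 50 for a fair coin

# crude text bar graph: the bell shape around the peak
for k in range(44, 57):
    print(f"{k:3d} " + "#" * round(pmf[k] * 500))
```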

467
00:24:54,920 --> 00:24:55,370
All right.

468
00:24:55,370 --> 00:25:00,580
So, so far I've said essentially
nothing new, just

469
00:25:00,580 --> 00:25:05,240
a little bit of notation and
this little conceptual thing

470
00:25:05,240 --> 00:25:07,900
that you have to think of random
variables as functions

471
00:25:07,900 --> 00:25:09,060
on the sample space.

472
00:25:09,060 --> 00:25:11,620
So now it's time to introduce
something new.

473
00:25:11,620 --> 00:25:14,250
This is the big concept
of the day.

474
00:25:14,250 --> 00:25:17,180
In some sense it's
an easy concept.

475
00:25:17,180 --> 00:25:23,420
But it's the most central, most
important concept that we

476
00:25:23,420 --> 00:25:26,970
have to deal with random
variables.

477
00:25:26,970 --> 00:25:28,790
It's the concept
of the expected

478
00:25:28,790 --> 00:25:30,860
value of a random variable.

479
00:25:30,860 --> 00:25:34,570
So the expected value is meant
to be, let's speak loosely,

480
00:25:34,570 --> 00:25:38,100
something like an average,
where you interpret

481
00:25:38,100 --> 00:25:41,520
probabilities as something
like frequencies.

482
00:25:41,520 --> 00:25:46,490
So you play a certain game and
your rewards are going to be--

483
00:25:46,490 --> 00:25:49,660

484
00:25:49,660 --> 00:25:52,010
use my standard numbers--

485
00:25:52,010 --> 00:25:54,530
your rewards are going
to be one dollar

486
00:25:54,530 --> 00:25:58,040
with probability 1/6.

487
00:25:58,040 --> 00:26:04,711
It's going to be 2 dollars with
probability 1/2, and four

488
00:26:04,711 --> 00:26:08,670
dollars with probability 1/3.

489
00:26:08,670 --> 00:26:11,920
So this is a plot of the PMF
of some random variable.

490
00:26:11,920 --> 00:26:15,270
If you play that game and you
get so many dollars with this

491
00:26:15,270 --> 00:26:18,520
probability, and so on, how much
do you expect to get on

492
00:26:18,520 --> 00:26:21,670
the average if you play the
game a zillion times?

493
00:26:21,670 --> 00:26:23,420
Well, you can think
as follows--

494
00:26:23,420 --> 00:26:27,990
one sixth of the time I'm
going to get one dollar.

495
00:26:27,990 --> 00:26:31,620
One half of the time that
outcome is going to happen and

496
00:26:31,620 --> 00:26:34,140
I'm going to get two dollars.

497
00:26:34,140 --> 00:26:37,920
And one third of the time the
other outcome happens, and I'm

498
00:26:37,920 --> 00:26:40,690
going to get four dollars.

499
00:26:40,690 --> 00:26:45,230
And you evaluate that number
and it turns out to be 2.5.
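That calculation -- one sixth of the time one dollar, half the time two dollars, a third of the time four dollars -- is just a weighted sum, which can be sketched as:

```python
# the game's PMF: $1 w.p. 1/6, $2 w.p. 1/2, $4 w.p. 1/3
pmf = {1: 1/6, 2: 1/2, 4: 1/3}

# E[X] = sum over x of x * p_X(x) = 1/6 + 1 + 4/3 = 2.5
expected_value = sum(x * px for x, px in pmf.items())
print(expected_value)  # 2.5, up to float rounding
```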

500
00:26:45,230 --> 00:26:45,490
OK.

501
00:26:45,490 --> 00:26:50,410
So that's a reasonable way of
calculating the average payoff

502
00:26:50,410 --> 00:26:52,550
if you think of these
probabilities as the

503
00:26:52,550 --> 00:26:56,440
frequencies with which you
obtain the different payoffs.

504
00:26:56,440 --> 00:26:59,430
And loosely speaking, it doesn't
hurt to think of

505
00:26:59,430 --> 00:27:02,430
probabilities as frequencies
when you try to make sense of

506
00:27:02,430 --> 00:27:04,990
various things.

507
00:27:04,990 --> 00:27:06,480
So what did we do here?

508
00:27:06,480 --> 00:27:11,710
We took the probabilities of the
different outcomes, of the

509
00:27:11,710 --> 00:27:15,370
different numerical values, and
multiplied them with the

510
00:27:15,370 --> 00:27:17,610
corresponding numerical value.

511
00:27:17,610 --> 00:27:19,910
Similarly here, we have
a probability and the

512
00:27:19,910 --> 00:27:24,890
corresponding numerical value
and we added up over all x's.

513
00:27:24,890 --> 00:27:26,430
So that's what we did.

514
00:27:26,430 --> 00:27:29,750
It looks like an interesting
quantity to deal with.

515
00:27:29,750 --> 00:27:32,800
So we're going to give a name to
it, and we're going to call

516
00:27:32,800 --> 00:27:35,740
it the expected value of
a random variable.

517
00:27:35,740 --> 00:27:39,610
So this formula just captures
the calculation that we did.

518
00:27:39,610 --> 00:27:43,520
How do we interpret the
expected value?

519
00:27:43,520 --> 00:27:46,490
So one interpretation
is the one that I

520
00:27:46,490 --> 00:27:48,110
used in this example.

521
00:27:48,110 --> 00:27:52,290
You can think of it as the
average that you get over a

522
00:27:52,290 --> 00:27:56,200
large number of repetitions
of an experiment where you

523
00:27:56,200 --> 00:27:59,330
interpret the probabilities as
the frequencies with which the

524
00:27:59,330 --> 00:28:02,090
different numerical
values can happen.

525
00:28:02,090 --> 00:28:04,870
There's another interpretation
that's a little more visual

526
00:28:04,870 --> 00:28:07,550
and that's kind of insightful.
If you remember your freshman

527
00:28:07,550 --> 00:28:10,860
physics, this kind of formula
gives you the center of

528
00:28:10,860 --> 00:28:14,390
gravity of an object
of this kind.

529
00:28:14,390 --> 00:28:17,290
If you take that picture
literally and think of this as

530
00:28:17,290 --> 00:28:20,700
a mass of one sixth sitting
here, and the mass of one half

531
00:28:20,700 --> 00:28:24,000
sitting here, and one third
sitting there, and you ask me

532
00:28:24,000 --> 00:28:26,920
what's the center of gravity
of that structure.

533
00:28:26,920 --> 00:28:29,320
This is the formula that gives
you the center of gravity.

534
00:28:29,320 --> 00:28:30,900
Now what's the center
of gravity?

535
00:28:30,900 --> 00:28:34,960
It's the place where if you put
your pen right underneath,

536
00:28:34,960 --> 00:28:38,050
that diagram will stay in place
and will not fall on one

537
00:28:38,050 --> 00:28:40,440
side and will not fall
on the other side.

538
00:28:40,440 --> 00:28:44,950
So in this thing, by picture,
since the 4 is a little more

539
00:28:44,950 --> 00:28:47,880
to the right and a little
heavier, the center of gravity

540
00:28:47,880 --> 00:28:50,200
should be somewhere
around here.

541
00:28:50,200 --> 00:28:52,290
And that's what the
math gave us.

542
00:28:52,290 --> 00:28:54,740
It turns out to be
two and a half.

543
00:28:54,740 --> 00:28:56,920
Once you have this
interpretation about centers

544
00:28:56,920 --> 00:28:58,890
of gravity, sometimes
you can calculate

545
00:28:58,890 --> 00:29:01,090
expectations pretty fast.

546
00:29:01,090 --> 00:29:04,410
So here's our new
random variable.

547
00:29:04,410 --> 00:29:07,840
It's the uniform random variable
in which each one of

548
00:29:07,840 --> 00:29:10,420
the numerical values
is equally likely.

549
00:29:10,420 --> 00:29:13,980
Here there's a total of n plus
1 possible numerical values.

550
00:29:13,980 --> 00:29:17,600
So each one of them has
probability 1 over (n + 1).

551
00:29:17,600 --> 00:29:20,650
Let's calculate the expected
value of this random variable.

552
00:29:20,650 --> 00:29:24,620
We can take the formula
literally and consider all

553
00:29:24,620 --> 00:29:28,920
possible outcomes, or all
possible numerical values, and

554
00:29:28,920 --> 00:29:32,330
weigh them by their
corresponding probability, and

555
00:29:32,330 --> 00:29:35,170
do this calculation and
obtain an answer.

556
00:29:35,170 --> 00:29:38,520
But I gave you the intuition
of centers of gravity.

557
00:29:38,520 --> 00:29:41,990
Can you use that intuition
to guess the answer?

558
00:29:41,990 --> 00:29:46,680
What's the center of gravity
in a structure of this kind?

559
00:29:46,680 --> 00:29:47,860
We have symmetry.

560
00:29:47,860 --> 00:29:50,710
So it should be in the middle.

561
00:29:50,710 --> 00:29:51,970
And what's the middle?

562
00:29:51,970 --> 00:29:54,850
It's the average of the
two end points.

563
00:29:54,850 --> 00:29:57,850
So without having to do the
algebra, you know that's the

564
00:29:57,850 --> 00:30:01,200
answer is going to
be n over 2.
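The symmetry shortcut can be checked against the literal definition (the choice n equal to 8 is an arbitrary one of ours):

```python
# uniform PMF on {0, 1, ..., n}: each value has probability 1/(n + 1)
n = 8
pmf = {x: 1 / (n + 1) for x in range(n + 1)}

# literal calculation; by the symmetry argument it must come out to n/2
expected_value = sum(x * px for x, px in pmf.items())
print(expected_value, n / 2)
```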

565
00:30:01,200 --> 00:30:05,850
So this is a moral that you
should keep: whenever you have

566
00:30:05,850 --> 00:30:11,460
a PMF that is symmetric around
a certain point.

567
00:30:11,460 --> 00:30:15,350
That certain point is going
to be the expected value

568
00:30:15,350 --> 00:30:17,460
associated with this
particular PMF.

569
00:30:17,460 --> 00:30:21,920

570
00:30:21,920 --> 00:30:22,380
OK.

571
00:30:22,380 --> 00:30:29,610
So having defined the expected
value, what is there that's

572
00:30:29,610 --> 00:30:31,810
left for us to do?

573
00:30:31,810 --> 00:30:37,290
Well, we want to investigate how
it behaves, what kind of

574
00:30:37,290 --> 00:30:43,390
properties does it have, and
also how do you calculate

575
00:30:43,390 --> 00:30:48,040
expected values of complicated
random variables.

576
00:30:48,040 --> 00:30:52,130
So the first complication that
we're going to start with is

577
00:30:52,130 --> 00:30:54,985
the case where we deal with a
function of a random variable.

578
00:30:54,985 --> 00:30:59,002

579
00:30:59,002 --> 00:30:59,680
OK.

580
00:30:59,680 --> 00:31:05,890
So let me redraw this same
picture as before.

581
00:31:05,890 --> 00:31:07,090
We have omega.

582
00:31:07,090 --> 00:31:09,580
This is our sample space.

583
00:31:09,580 --> 00:31:12,310
This is the real line.

584
00:31:12,310 --> 00:31:17,370
And we have a random variable
that gives rise to various

585
00:31:17,370 --> 00:31:24,400
values for X. So the random
variable is capital X, and

586
00:31:24,400 --> 00:31:28,690
every outcome leads to a
particular numerical value x

587
00:31:28,690 --> 00:31:33,030
for our random variable X. So
capital X is really the

588
00:31:33,030 --> 00:31:37,930
function that maps these points
into the real line.

589
00:31:37,930 --> 00:31:42,710
And then I consider a function
of this random variable, call

590
00:31:42,710 --> 00:31:47,080
it capital Y, and it's
a function of my

591
00:31:47,080 --> 00:31:49,980
previous random variable.

592
00:31:49,980 --> 00:31:54,190
And this new random variable Y
takes numerical values that

593
00:31:54,190 --> 00:31:58,140
are completely determined once
I know the numerical value of

594
00:31:58,140 --> 00:32:03,090
capital X. And perhaps you get
a diagram of this kind.

595
00:32:03,090 --> 00:32:08,520

596
00:32:08,520 --> 00:32:10,970
So X is a random variable.

597
00:32:10,970 --> 00:32:14,506
Once you have an outcome, this
determines the value of x.

598
00:32:14,506 --> 00:32:16,760
Y is also a random variable.

599
00:32:16,760 --> 00:32:19,200
Once you have the outcome,
that determines

600
00:32:19,200 --> 00:32:21,230
the value of y.

601
00:32:21,230 --> 00:32:26,630
Y is completely determined once
you know X. We have a

602
00:32:26,630 --> 00:32:31,560
formula for how to calculate
the expected value of X.

603
00:32:31,560 --> 00:32:34,380
Suppose that you're interested
in calculating the expected

604
00:32:34,380 --> 00:32:39,910
value of Y. How would
you go about it?

605
00:32:39,910 --> 00:32:40,710
OK.

606
00:32:40,710 --> 00:32:43,580
The only thing you have in your
hands is the definition,

607
00:32:43,580 --> 00:32:47,750
so you could start by just
using the definition.

608
00:32:47,750 --> 00:32:50,150
And what does this entail?

609
00:32:50,150 --> 00:32:55,330
It entails for every particular
value of y, collect

610
00:32:55,330 --> 00:32:59,160
all the outcomes that lead
to that value of y.

611
00:32:59,160 --> 00:33:01,010
Find their probability.

612
00:33:01,010 --> 00:33:02,280
Do the same here.

613
00:33:02,280 --> 00:33:04,550
For that value, collect
those outcomes.

614
00:33:04,550 --> 00:33:07,700
Find their probability
and weight by y.

615
00:33:07,700 --> 00:33:13,050
So this formula does the
addition over this line.

616
00:33:13,050 --> 00:33:17,060
We consider the different
outcomes and add things up.

617
00:33:17,060 --> 00:33:20,290
There's an alternative way of
doing the same accounting

618
00:33:20,290 --> 00:33:23,930
where instead of doing the
addition over those numbers,

619
00:33:23,930 --> 00:33:26,540
we do the addition up here.

620
00:33:26,540 --> 00:33:30,250
We consider the different
possible values of x, and we

621
00:33:30,250 --> 00:33:31,500
think as follows--

622
00:33:31,500 --> 00:33:34,500

623
00:33:34,500 --> 00:33:38,900
for each possible value of x,
that value is going to occur

624
00:33:38,900 --> 00:33:41,310
with this probability.

625
00:33:41,310 --> 00:33:45,890
And if that value has occurred,
this is how much I'm

626
00:33:45,890 --> 00:33:47,840
getting, the g of x.

627
00:33:47,840 --> 00:33:52,990
So I'm considering the
probability of this outcome.

628
00:33:52,990 --> 00:33:56,050
And in that case, y
takes this value.

629
00:33:56,050 --> 00:34:00,240
Then I'm considering the
probabilities of this outcome.

630
00:34:00,240 --> 00:34:04,650
And in that case, g of x
takes again that value.

631
00:34:04,650 --> 00:34:08,100
Then I consider this particular
x, it happens with

632
00:34:08,100 --> 00:34:11,280
this much probability, and in
that case, g of x takes that

633
00:34:11,280 --> 00:34:14,300
value, and similarly here.

634
00:34:14,300 --> 00:34:18,170
We end up doing exactly the same
arithmetic, it's only a

635
00:34:18,170 --> 00:34:21,760
question whether we bundle
things together.

636
00:34:21,760 --> 00:34:25,790
That is, if we calculate the
probability of this, then

637
00:34:25,790 --> 00:34:28,239
we're bundling these
two cases together.

638
00:34:28,239 --> 00:34:32,110
Whereas if we do the addition
up here, we do a separate

639
00:34:32,110 --> 00:34:32,949
calculation--

640
00:34:32,949 --> 00:34:35,639
this probability times this
number, and then this

641
00:34:35,639 --> 00:34:37,989
probability times that number.

642
00:34:37,989 --> 00:34:41,420
So it's just a simple
rearrangement of the way that

643
00:34:41,420 --> 00:34:45,330
we do the calculations, but it
does make a big difference in

644
00:34:45,330 --> 00:34:49,010
practice if you actually want
to calculate expectations.

645
00:34:49,010 --> 00:34:52,389
So the second procedure that I
mentioned, where you do the

646
00:34:52,389 --> 00:34:56,790
addition by running
over the x-axis

647
00:34:56,790 --> 00:34:59,710
corresponds to this formula.

648
00:34:59,710 --> 00:35:05,830
Consider all possibilities for x
and when that x happens, how

649
00:35:05,830 --> 00:35:07,530
much money are you getting?

650
00:35:07,530 --> 00:35:10,850
That gives you the average money
that you are getting.

651
00:35:10,850 --> 00:35:11,270
All right.

652
00:35:11,270 --> 00:35:14,840
So I kind of hand-waved and
argued that it's just a

653
00:35:14,840 --> 00:35:17,690
different way of accounting; of
course, one needs to prove

654
00:35:17,690 --> 00:35:19,060
this formula.

655
00:35:19,060 --> 00:35:20,950
And fortunately it
can be proved.

656
00:35:20,950 --> 00:35:23,470
You're going to see that
in recitation.

657
00:35:23,470 --> 00:35:25,710
Most people, once they're a
little comfortable with the

658
00:35:25,710 --> 00:35:28,570
concepts of probability,
actually believe that this is

659
00:35:28,570 --> 00:35:30,130
true by definition.

660
00:35:30,130 --> 00:35:31,860
In fact it's not true
by definition.

661
00:35:31,860 --> 00:35:34,610
It's called the law of the
unconscious statistician.

662
00:35:34,610 --> 00:35:37,930
It's something that you always
do, but it's something that

663
00:35:37,930 --> 00:35:40,750
does require justification.

664
00:35:40,750 --> 00:35:41,100
All right.

665
00:35:41,100 --> 00:35:44,160
So this gives us basically a
shortcut for calculating

666
00:35:44,160 --> 00:35:47,770
expected values of functions of
a random variable without

667
00:35:47,770 --> 00:35:51,990
having to find the PMF
of that function.

668
00:35:51,990 --> 00:35:54,470
We can work with the PMF of
the original random variable.
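Both accountings can be sketched side by side (the PMF values here are made up for illustration): the expected value rule sums along the x-axis, while the definition first bundles together the x's that share the same g(x).

```python
from collections import defaultdict

pmf_x = {-1: 0.25, 0: 0.5, 1: 0.25}   # illustrative PMF of X

def g(x):
    return x * x                       # Y = g(X) = X^2

# expected value rule: sum over x of g(x) * p_X(x), no PMF of Y needed
e_y_rule = sum(g(x) * px for x, px in pmf_x.items())

# definition: first build the PMF of Y by bundling x's with the same g(x)
pmf_y = defaultdict(float)
for x, px in pmf_x.items():
    pmf_y[g(x)] += px                  # x = -1 and x = 1 both land on y = 1
e_y_def = sum(y * py for y, py in pmf_y.items())

print(e_y_rule, e_y_def)  # same arithmetic, rearranged: both give 0.5
```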

669
00:35:54,470 --> 00:35:57,140

670
00:35:57,140 --> 00:35:57,430
All right.

671
00:35:57,430 --> 00:36:00,940
So we're going to use this
property over and over.

672
00:36:00,940 --> 00:36:06,570
Before we start using it, one
general word of caution--

673
00:36:06,570 --> 00:36:10,640
the average of a function of a
random variable, in general,

674
00:36:10,640 --> 00:36:16,400
is not the same as the function
of the average.

675
00:36:16,400 --> 00:36:20,820
So these two operations of
taking averages and taking

676
00:36:20,820 --> 00:36:23,180
functions do not commute.

677
00:36:23,180 --> 00:36:28,130
What this inequality tells you
is that, in general, you can

678
00:36:28,130 --> 00:36:30,280
not reason on the average.

679
00:36:30,280 --> 00:36:34,420

680
00:36:34,420 --> 00:36:38,600
So we're going to see instances
where this property

681
00:36:38,600 --> 00:36:39,610
is not true.

682
00:36:39,610 --> 00:36:41,080
You're going to see
lots of them.

683
00:36:41,080 --> 00:36:43,920
Let me just throw it here that
it's something that's not true

684
00:36:43,920 --> 00:36:47,710
in general, but we will be
interested in the exceptions

685
00:36:47,710 --> 00:36:51,480
where a relation like
this is true.

686
00:36:51,480 --> 00:36:53,360
But these will be
the exceptions.

687
00:36:53,360 --> 00:36:56,960
So in general, expectations are

688
00:36:56,960 --> 00:36:58,850
something like averages.

689
00:36:58,850 --> 00:37:02,400
But the function of an average
is not the same as the average

690
00:37:02,400 --> 00:37:05,070
of the function.

691
00:37:05,070 --> 00:37:05,440
OK.

692
00:37:05,440 --> 00:37:09,530
So now let's go to properties
of expectations.

693
00:37:09,530 --> 00:37:15,170
Suppose that alpha is a real
number, and I ask you, what's

694
00:37:15,170 --> 00:37:17,740
the expected value of
that real number?

695
00:37:17,740 --> 00:37:21,010
So for example, if I write
down this expression--

696
00:37:21,010 --> 00:37:23,070
expected value of 2.

697
00:37:23,070 --> 00:37:25,930
What is this?

698
00:37:25,930 --> 00:37:29,470
Well, we defined random
variables and we defined

699
00:37:29,470 --> 00:37:31,860
expectations of random
variables.

700
00:37:31,860 --> 00:37:35,870
So for this to make syntactic
sense, this thing inside here

701
00:37:35,870 --> 00:37:37,670
should be a random variable.

702
00:37:37,670 --> 00:37:39,260
Is 2 --

703
00:37:39,260 --> 00:37:41,140
the number 2 -- is it
a random variable?

704
00:37:41,140 --> 00:37:44,740

705
00:37:44,740 --> 00:37:48,420
In some sense, yes.

706
00:37:48,420 --> 00:37:55,750
It's the random variable that
takes, always, the value of 2.

707
00:37:55,750 --> 00:37:59,220
So suppose that you have some
experiment and that experiment

708
00:37:59,220 --> 00:38:02,580
always outputs 2 whenever
it happens.

709
00:38:02,580 --> 00:38:05,880
Then you can say, yes, it's
a random experiment but it

710
00:38:05,880 --> 00:38:06,960
always gives me 2.

711
00:38:06,960 --> 00:38:08,600
The value of the random
variable is

712
00:38:08,600 --> 00:38:10,460
always 2 no matter what.

713
00:38:10,460 --> 00:38:13,200
It's kind of a degenerate random
variable that doesn't

714
00:38:13,200 --> 00:38:17,230
have any real randomness in it,
but it's still useful to

715
00:38:17,230 --> 00:38:20,130
think of it as a special case.

716
00:38:20,130 --> 00:38:23,000
So it corresponds to a function
from the sample space

717
00:38:23,000 --> 00:38:26,750
to the real line that takes
only one value.

718
00:38:26,750 --> 00:38:30,390
No matter what the outcome is,
it always gives me a 2.

719
00:38:30,390 --> 00:38:30,770
OK.

720
00:38:30,770 --> 00:38:34,390
If you have a random variable
that always gives you a 2,

721
00:38:34,390 --> 00:38:37,980
what is the expected
value going to be?

722
00:38:37,980 --> 00:38:40,530
The only entry that shows
up in this summation

723
00:38:40,530 --> 00:38:43,000
is that number 2.

724
00:38:43,000 --> 00:38:46,270
The probability of a 2 is equal
to 1, and the value of

725
00:38:46,270 --> 00:38:48,330
that random variable
is equal to 2.

726
00:38:48,330 --> 00:38:51,030
So it's the number itself.

727
00:38:51,030 --> 00:38:53,910
So the average value in an
experiment that always gives

728
00:38:53,910 --> 00:38:56,580
you 2's is 2.

729
00:38:56,580 --> 00:38:57,100
All right.

730
00:38:57,100 --> 00:38:59,450
So that's nice and simple.

731
00:38:59,450 --> 00:39:04,890
Now let's go to our
experiment where X was

732
00:39:04,890 --> 00:39:07,310
your height in inches.

733
00:39:07,310 --> 00:39:11,160
And I know your height in
inches, but I'm interested in

734
00:39:11,160 --> 00:39:15,880
your height measured
in centimeters.

735
00:39:15,880 --> 00:39:19,040
How is that going
to be related to

736
00:39:19,040 --> 00:39:22,675
your height in inches?

737
00:39:22,675 --> 00:39:27,440
Well, if you take your height
in inches and convert it to

738
00:39:27,440 --> 00:39:30,690
centimeters, I have another
random variable, which is

739
00:39:30,690 --> 00:39:34,280
always, no matter what, two and
a half times bigger than

740
00:39:34,280 --> 00:39:36,570
the random variable
I started with.

741
00:39:36,570 --> 00:39:40,470
If you take some quantity and
always multiplied by two and a

742
00:39:40,470 --> 00:39:43,610
half what happens to the average
of that quantity?

743
00:39:43,610 --> 00:39:46,990
It also gets multiplied
by two and a half.

744
00:39:46,990 --> 00:39:52,030
So you get a relation like
this, which says that the

745
00:39:52,030 --> 00:39:56,480
average height of a student
measured in centimeters is two

746
00:39:56,480 --> 00:39:58,660
and a half times the
average height of a

747
00:39:58,660 --> 00:40:01,660
student measured in inches.

748
00:40:01,660 --> 00:40:03,730
So that makes perfect
intuitive sense.

749
00:40:03,730 --> 00:40:07,490
If you generalize it, it gives
us this relation, that if you

750
00:40:07,490 --> 00:40:13,790
have a number, you can pull it
outside the expectation and

751
00:40:13,790 --> 00:40:16,210
you get the right result.

752
00:40:16,210 --> 00:40:20,440
So this is a case where you
can reason on the average.

753
00:40:20,440 --> 00:40:23,150
If you take a number, such as
height, and multiply it by a

754
00:40:23,150 --> 00:40:25,500
certain number, you can
reason on the average.

755
00:40:25,500 --> 00:40:27,650
I multiply the numbers
by two, the averages

756
00:40:27,650 --> 00:40:29,630
will go up by two.

757
00:40:29,630 --> 00:40:33,750
So this is an exception to this
cautionary statement that

758
00:40:33,750 --> 00:40:35,460
I had up there.

759
00:40:35,460 --> 00:40:39,860
How do we prove that
this fact is true?

760
00:40:39,860 --> 00:40:44,360
Well, we can use the expected
value rule here, which tells

761
00:40:44,360 --> 00:40:52,690
us that the expected value of
alpha X, this is our g of X,

762
00:40:52,690 --> 00:40:59,720
essentially, is going to be
the sum over all x's of my

763
00:40:59,720 --> 00:41:04,900
function, g of X, times the
probability of the x's.

764
00:41:04,900 --> 00:41:11,270
In our particular case, g of X
is alpha times X. And we have

765
00:41:11,270 --> 00:41:12,450
those probabilities.

766
00:41:12,450 --> 00:41:15,600
And the alpha goes outside
the summation.

767
00:41:15,600 --> 00:41:23,100
So we get alpha, sum over x's,
x Px of x, which is alpha

768
00:41:23,100 --> 00:41:26,740
times the expected value of X.

769
00:41:26,740 --> 00:41:30,580
So that's how you prove this
relation formally using this

770
00:41:30,580 --> 00:41:32,490
rule up here.

771
00:41:32,490 --> 00:41:35,810
And the next formula that
I have here also gets

772
00:41:35,810 --> 00:41:37,310
proved the same way.

773
00:41:37,310 --> 00:41:41,110
What does this formula
tell you?

774
00:41:41,110 --> 00:41:46,560
If I take everybody's height
in centimeters--

775
00:41:46,560 --> 00:41:49,030
we already multiplied
by alpha--

776
00:41:49,030 --> 00:41:52,800
and the gods give everyone
a bonus of ten extra

777
00:41:52,800 --> 00:41:54,670
centimeters.

778
00:41:54,670 --> 00:41:57,720
What's going to happen to the
average height of the class?

779
00:41:57,720 --> 00:42:02,800
Well, it will just go up by
an extra ten centimeters.

780
00:42:02,800 --> 00:42:08,040
So this expectation is easy:
giving everyone the bonus of

781
00:42:08,040 --> 00:42:15,710
beta just adds a beta to the
average height in centimeters,

782
00:42:15,710 --> 00:42:20,740
which we also know to be alpha
times the expected

783
00:42:20,740 --> 00:42:24,430
value of X, plus beta.

784
00:42:24,430 --> 00:42:29,390
So this is a linearity property
of expectations.

785
00:42:29,390 --> 00:42:34,140
If you take a linear function
of a single random variable,

786
00:42:34,140 --> 00:42:38,390
the expected value of that
linear function is the linear

787
00:42:38,390 --> 00:42:41,140
function of the expected
value.

788
00:42:41,140 --> 00:42:44,100
So this is our big exception to
this cautionary note, that

789
00:42:44,100 --> 00:42:48,710
we have equal if g is linear.
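The linearity rule, E of alpha X plus beta equals alpha E of X plus beta, and the failure of the analogous identity for a nonlinear g, can both be sketched on the earlier example PMF:

```python
pmf = {1: 1/6, 2: 1/2, 4: 1/3}   # the expected-value example from earlier
alpha, beta = 2.5, 10            # scale (e.g. inches to cm) plus a bonus

e_x = sum(x * px for x, px in pmf.items())

# linearity: E[alpha*X + beta] equals alpha*E[X] + beta
e_linear = sum((alpha * x + beta) * px for x, px in pmf.items())
print(e_linear, alpha * e_x + beta)   # the two agree

# but for a nonlinear g, E[g(X)] != g(E[X]) in general:
e_x_squared = sum(x**2 * px for x, px in pmf.items())
print(e_x_squared, e_x**2)            # about 7.5 vs 6.25 -- not equal
```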

790
00:42:48,710 --> 00:42:55,840

791
00:42:55,840 --> 00:42:57,090
OK.

792
00:42:57,090 --> 00:42:59,790

793
00:42:59,790 --> 00:43:00,790
All right.

794
00:43:00,790 --> 00:43:05,850
So let's get to the last
concept of the day.

795
00:43:05,850 --> 00:43:07,470
What kind of functions
of random

796
00:43:07,470 --> 00:43:11,010
variables may be of interest?

797
00:43:11,010 --> 00:43:15,660
One possibility might be the
average value of X-squared.

798
00:43:15,660 --> 00:43:18,780

799
00:43:18,780 --> 00:43:20,150
Why is it interesting?

800
00:43:20,150 --> 00:43:21,760
Well, why not?

801
00:43:21,760 --> 00:43:24,290
It's the simplest function
that you can think of.

802
00:43:24,290 --> 00:43:27,560

803
00:43:27,560 --> 00:43:30,800
So if you want to calculate
the expected value of

804
00:43:30,800 --> 00:43:35,260
X-squared, you would use this
general rule for how you can

805
00:43:35,260 --> 00:43:39,470
calculate expected values of
functions of random variables.

806
00:43:39,470 --> 00:43:41,340
You consider all the
possible x's.

807
00:43:41,340 --> 00:43:45,550
For each x, you see what's the
probability that it occurs.

808
00:43:45,550 --> 00:43:49,790
And if that x occurs, you
consider and see how big

809
00:43:49,790 --> 00:43:52,090
x-squared is.

810
00:43:52,090 --> 00:43:54,810
Now, the more interesting
quantity, a more interesting

811
00:43:54,810 --> 00:43:58,580
expectation that you can
calculate has to do not with

812
00:43:58,580 --> 00:44:03,570
x-squared, but with the distance
of x from the mean

813
00:44:03,570 --> 00:44:05,710
and then squared.

814
00:44:05,710 --> 00:44:10,800
So let's try to parse what
we've got up here.

815
00:44:10,800 --> 00:44:14,970
Let's look just at the
quantity inside here.

816
00:44:14,970 --> 00:44:16,610
What kind of quantity is it?

817
00:44:16,610 --> 00:44:19,190

818
00:44:19,190 --> 00:44:21,030
It's a random variable.

819
00:44:21,030 --> 00:44:22,370
Why?

820
00:44:22,370 --> 00:44:26,540
X is random, the random
variable, expected value of X

821
00:44:26,540 --> 00:44:28,090
is a number.

822
00:44:28,090 --> 00:44:30,800
Subtract a number from a random
variable, you get

823
00:44:30,800 --> 00:44:32,460
another random variable.

824
00:44:32,460 --> 00:44:35,630
Take a random variable and
square it, you get another

825
00:44:35,630 --> 00:44:36,810
random variable.

826
00:44:36,810 --> 00:44:40,590
So the thing inside here is a
legitimate random variable.

827
00:44:40,590 --> 00:44:44,950
What kind of random
variable is it?

828
00:44:44,950 --> 00:44:47,720
So suppose that we have our
experiment and we have

829
00:44:47,720 --> 00:44:49,452
different x's that can happen.

830
00:44:49,452 --> 00:44:52,310

831
00:44:52,310 --> 00:44:56,090
And the mean of X in this
picture might be somewhere

832
00:44:56,090 --> 00:44:57,340
around here.

833
00:44:57,340 --> 00:45:00,500

834
00:45:00,500 --> 00:45:02,570
I do the experiment.

835
00:45:02,570 --> 00:45:05,350
I obtain some numerical
value of x.

836
00:45:05,350 --> 00:45:09,610
Let's say I obtain this
numerical value.

837
00:45:09,610 --> 00:45:13,810
I look at the distance from
the mean, which is this

838
00:45:13,810 --> 00:45:18,460
length, and I take the
square of that.

839
00:45:18,460 --> 00:45:22,730
Each time that I do the
experiment, I go and record my

840
00:45:22,730 --> 00:45:25,780
distance from the mean
and square it.

841
00:45:25,780 --> 00:45:29,490
So I give more emphasis
to big distances.

842
00:45:29,490 --> 00:45:33,370
And then I take the average over
all possible outcomes,

843
00:45:33,370 --> 00:45:35,520
all possible numerical values.

844
00:45:35,520 --> 00:45:39,510
So I'm trying to compute
the average squared

845
00:45:39,510 --> 00:45:42,980
distance from the mean.

846
00:45:42,980 --> 00:45:47,770
This corresponds to
this formula here.

847
00:45:47,770 --> 00:45:51,110
So the picture that I drew
corresponds to that.

848
00:45:51,110 --> 00:45:55,920
For every possible numerical
value of x, that numerical

849
00:45:55,920 --> 00:45:59,010
value corresponds to a certain
distance from the mean

850
00:45:59,010 --> 00:46:03,580
squared, and I weight according
to how likely is

851
00:46:03,580 --> 00:46:07,360
that particular value
of x to arise.

852
00:46:07,360 --> 00:46:10,840
So this measures the
average squared

853
00:46:10,840 --> 00:46:13,280
distance from the mean.
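As a quick check of this weighted-average reading, here is a minimal Python sketch. The PMF below is a made-up example, not one from the lecture; the variance is computed exactly as described, by weighting each squared distance from the mean by its probability.

```python
# Variance as the probability-weighted average squared distance from the mean,
# illustrated on a small hypothetical PMF.
pmf = {1: 0.2, 2: 0.5, 3: 0.3}  # hypothetical values and probabilities

mean = sum(x * p for x, p in pmf.items())                     # E[X]
variance = sum(p * (x - mean) ** 2 for x, p in pmf.items())   # E[(X - E[X])^2]

print(mean)      # ~ 2.1
print(variance)  # ~ 0.49
```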

854
00:46:13,280 --> 00:46:17,180
Now, because of that expected
value rule, of course, this

855
00:46:17,180 --> 00:46:20,010
thing is the same as
that expectation.

856
00:46:20,010 --> 00:46:23,880
It's the average value of the
random variable, which is the

857
00:46:23,880 --> 00:46:26,300
squared distance
from the mean.

858
00:46:26,300 --> 00:46:29,820
With this probability, the
random variable takes on this

859
00:46:29,820 --> 00:46:33,050
numerical value, and the squared
distance from the mean

860
00:46:33,050 --> 00:46:37,200
ends up taking that particular
numerical value.

861
00:46:37,200 --> 00:46:37,680
OK.

862
00:46:37,680 --> 00:46:40,560
So why is the variance
interesting?

863
00:46:40,560 --> 00:46:45,380
It tells us how far away from
the mean we expect to be on

864
00:46:45,380 --> 00:46:46,900
the average.

865
00:46:46,900 --> 00:46:49,550
Well, actually we're not
counting distances from the

866
00:46:49,550 --> 00:46:51,630
mean; it's squared distances.

867
00:46:51,630 --> 00:46:56,500
So it gives more emphasis to the
kind of outliers in here.

868
00:46:56,500 --> 00:46:59,090
But it's a measure of
how spread out the

869
00:46:59,090 --> 00:47:01,180
distribution is.

870
00:47:01,180 --> 00:47:05,240
A big variance means that those
bars go far to the left

871
00:47:05,240 --> 00:47:07,010
and to the right, typically.

872
00:47:07,010 --> 00:47:10,230
Whereas a small variance would
mean that all those bars

873
00:47:10,230 --> 00:47:13,850
are tightly concentrated
around the mean value.

874
00:47:13,850 --> 00:47:16,190
It's the average squared
deviation.

875
00:47:16,190 --> 00:47:18,970
Small variance means that
we generally have small

876
00:47:18,970 --> 00:47:19,580
deviations.

877
00:47:19,580 --> 00:47:22,500
A large variance means that
we generally have large

878
00:47:22,500 --> 00:47:24,210
deviations.
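The point about spread can be made concrete with two hypothetical PMFs that share the same mean but have bars at very different distances from it; only the variance tells them apart. The distributions below are illustrative choices, not from the lecture.

```python
# Two hypothetical PMFs with the same mean (0) but different spread.
def variance(pmf):
    mean = sum(x * p for x, p in pmf.items())
    return sum(p * (x - mean) ** 2 for x, p in pmf.items())

narrow = {-1: 0.5, 1: 0.5}    # bars tightly concentrated around the mean
wide = {-10: 0.5, 10: 0.5}    # bars far to the left and right

print(variance(narrow))  # ~ 1.0
print(variance(wide))    # ~ 100.0
```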

879
00:47:24,210 --> 00:47:27,310
Now as a practical matter, when
you want to calculate the

880
00:47:27,310 --> 00:47:31,140
variance, there's a handy
formula which I'm not proving

881
00:47:31,140 --> 00:47:33,110
but you will see it
in recitation.

882
00:47:33,110 --> 00:47:36,270
It's just two lines
of algebra.

883
00:47:36,270 --> 00:47:40,680
And it allows us to calculate it
in a somewhat simpler way.

884
00:47:40,680 --> 00:47:43,110
We need to calculate the
expected value of the random

885
00:47:43,110 --> 00:47:45,210
variable and the expected value
of the square of the

886
00:47:45,210 --> 00:47:47,580
random variable, and
these two are going

887
00:47:47,580 --> 00:47:49,710
to give us the variance.
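Assuming the handy formula deferred to recitation is the standard identity var(X) = E[X squared] minus (E[X]) squared, the "two lines of algebra" can be sketched as follows, writing mu for E[X] and using linearity of expectation:

```latex
\begin{align*}
\mathrm{var}(X) &= \mathbf{E}\big[(X-\mu)^2\big]
                 = \mathbf{E}\big[X^2 - 2\mu X + \mu^2\big] \\
                &= \mathbf{E}[X^2] - 2\mu\,\mathbf{E}[X] + \mu^2
                 = \mathbf{E}[X^2] - \big(\mathbf{E}[X]\big)^2.
\end{align*}
```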

888
00:47:49,710 --> 00:47:53,970
So to summarize what we did
up here, the variance, by

889
00:47:53,970 --> 00:47:57,370
definition, is given
by this formula.

890
00:47:57,370 --> 00:48:01,470
It's the expected value of
the squared deviation.

891
00:48:01,470 --> 00:48:06,380
But we have the equivalent
formula, which comes from

892
00:48:06,380 --> 00:48:13,960
application of the expected
value rule, to the function g

893
00:48:13,960 --> 00:48:18,690
of X equal to (x minus the
expected value of X), squared.

894
00:48:18,690 --> 00:48:25,640

895
00:48:25,640 --> 00:48:26,330
OK.

896
00:48:26,330 --> 00:48:27,460
So this is the definition.

897
00:48:27,460 --> 00:48:31,170
This comes from the expected
value rule.

898
00:48:31,170 --> 00:48:35,010
What are some properties
of the variance?

899
00:48:35,010 --> 00:48:38,650
Of course variances are
always non-negative.

900
00:48:38,650 --> 00:48:40,880
Why is it always non-negative?

901
00:48:40,880 --> 00:48:43,650
Well, you look at the definition
and your just

902
00:48:43,650 --> 00:48:45,660
adding up non-negative things.

903
00:48:45,660 --> 00:48:47,630
We're adding squared
deviations.

904
00:48:47,630 --> 00:48:50,100
So when you add non-negative
things, you get something

905
00:48:50,100 --> 00:48:51,400
non-negative.

906
00:48:51,400 --> 00:48:55,800
The next question is, how do
things scale if you take a

907
00:48:55,800 --> 00:48:59,880
linear function of a
random variable?

908
00:48:59,880 --> 00:49:02,350
Let's think about the
effect of beta.

909
00:49:02,350 --> 00:49:06,200
If I take a random variable and
add the constant to it,

910
00:49:06,200 --> 00:49:09,820
how does this affect the amount
of spread that we have?

911
00:49:09,820 --> 00:49:10,950
It doesn't affect--

912
00:49:10,950 --> 00:49:14,610
whatever the spread of this
thing is, if I add the

913
00:49:14,610 --> 00:49:18,840
constant beta, it just moves
this diagram here, but the

914
00:49:18,840 --> 00:49:21,930
spread doesn't grow
or get reduced.

915
00:49:21,930 --> 00:49:24,470
The thing is that when I'm
adding a constant to a random

916
00:49:24,470 --> 00:49:28,160
variable, all the x's that are
going to appear are further to

917
00:49:28,160 --> 00:49:32,890
the right, but the expected
value also moves to the right.

918
00:49:32,890 --> 00:49:35,960
And since we're only interested
in distances from

919
00:49:35,960 --> 00:49:39,500
the mean, these distances
do not get affected.

920
00:49:39,500 --> 00:49:42,180
x gets increased by something.

921
00:49:42,180 --> 00:49:44,390
The mean gets increased by
that same something.

922
00:49:44,390 --> 00:49:46,180
The difference stays the same.

923
00:49:46,180 --> 00:49:49,350
So adding a constant to a random
variable doesn't do

924
00:49:49,350 --> 00:49:51,050
anything to its variance.

925
00:49:51,050 --> 00:49:54,940
But if I multiply a random
variable by a constant alpha,

926
00:49:54,940 --> 00:49:58,730
what is that going to
do to its variance?

927
00:49:58,730 --> 00:50:04,720
Because we have a square here,
when I multiply my random

928
00:50:04,720 --> 00:50:08,430
variable by a constant, this x
gets multiplied by a constant,

929
00:50:08,430 --> 00:50:12,310
the mean gets multiplied by a
constant, the square gets

930
00:50:12,310 --> 00:50:15,650
multiplied by the square
of that constant.

931
00:50:15,650 --> 00:50:18,960
And because of that reason, we
get this square of alpha

932
00:50:18,960 --> 00:50:20,210
showing up here.

933
00:50:20,210 --> 00:50:22,870
So that's how variances
transform under linear

934
00:50:22,870 --> 00:50:23,650
transformations.

935
00:50:23,650 --> 00:50:26,180
You multiply your random
variable by a constant, the

936
00:50:26,180 --> 00:50:30,540
variance goes up by the square
of that same constant.
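Both properties can be checked numerically. The sketch below uses a made-up PMF and made-up values alpha = 3, beta = 5: adding beta leaves the variance unchanged, while multiplying by alpha scales it by alpha squared.

```python
# Checking var(alpha*X + beta) = alpha^2 * var(X) on a hypothetical PMF.
def variance(pmf):
    mean = sum(x * p for x, p in pmf.items())
    return sum(p * (x - mean) ** 2 for x, p in pmf.items())

pmf = {0: 0.25, 1: 0.5, 2: 0.25}
alpha, beta = 3, 5
transformed = {alpha * x + beta: p for x, p in pmf.items()}  # PMF of alpha*X + beta

print(variance(pmf))          # ~ 0.5
print(variance(transformed))  # ~ 4.5  (= alpha^2 * 0.5; beta drops out)
```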

937
00:50:30,540 --> 00:50:31,290
OK.

938
00:50:31,290 --> 00:50:32,950
That's it for today.

939
00:50:32,950 --> 00:50:34,200
See you on Wednesday.

940
00:50:34,200 --> 00:50:34,750