1
00:00:00,499 --> 00:00:01,950
The following
content is provided

2
00:00:01,950 --> 00:00:04,900
by MIT OpenCourseWare under
a Creative Commons License.

3
00:00:04,900 --> 00:00:08,230
Additional information
about our license,

4
00:00:08,230 --> 00:00:10,560
and MIT OpenCourseWare
in general,

5
00:00:10,560 --> 00:00:11,780
is available at ocw.mit.edu.

6
00:00:16,570 --> 00:00:17,280
PROFESSOR: OK.

7
00:00:17,280 --> 00:00:21,230
Now, where am I
with this problem?

8
00:00:21,230 --> 00:00:28,697
Well, last time I spoke about
what the situation's like

9
00:00:28,697 --> 00:00:29,780
as alpha goes to infinity.

10
00:00:29,780 --> 00:00:37,530
And I want to say also a word
about -- more than a word --

11
00:00:37,530 --> 00:00:39,000
about alpha going to 0.

12
00:00:41,960 --> 00:00:48,900
And then, the real problems
come when alpha is in between.

13
00:00:48,900 --> 00:00:51,930
The real problem
-- the situations,

14
00:00:51,930 --> 00:00:58,570
these ill-posed problems that
come from inverse problems,

15
00:00:58,570 --> 00:01:01,730
trying to find out what's
inside your brain by taking

16
00:01:01,730 --> 00:01:03,520
measurements at the skull.

17
00:01:03,520 --> 00:01:12,350
All sorts of applications
involve a finite alpha.

18
00:01:12,350 --> 00:01:18,310
And I'm not quite ready
to discuss those topics.

19
00:01:18,310 --> 00:01:23,020
I mean, roughly speaking --

20
00:01:23,020 --> 00:01:26,240
I'll write down a reminder now.

21
00:01:26,240 --> 00:01:29,290
What happened when
alpha went to infinity?

22
00:01:29,290 --> 00:01:33,460
When alpha went to
infinity, this part

23
00:01:33,460 --> 00:01:35,070
became the important part.

24
00:01:35,070 --> 00:01:42,230
So as alpha went to infinity,
the limit was u_infinity,

25
00:01:42,230 --> 00:01:43,710
shall I call it? u_infinity.

26
00:01:47,210 --> 00:01:50,860
Well, so u_infinity was
a minimizer of this term,

27
00:01:50,860 --> 00:02:02,100
u_infinity minimized
B*u minus d squared.

28
00:02:02,100 --> 00:02:10,490
In fact, in my last lecture,
I was taking B*u equal d

29
00:02:10,490 --> 00:02:13,390
as an equation that
had exact solutions,

30
00:02:13,390 --> 00:02:17,060
and saying how did we
actually solve B*u equal d.

31
00:02:17,060 --> 00:02:21,450
So u_infinity
minimizes B*u minus d.

32
00:02:21,450 --> 00:02:24,080
But that might
leave some freedom.

33
00:02:24,080 --> 00:02:26,760
If B doesn't have
that many rows,

34
00:02:26,760 --> 00:02:30,350
if its rank is not
that big, then this

35
00:02:30,350 --> 00:02:32,510
doesn't finish the job.

36
00:02:32,510 --> 00:02:39,610
So among these, if
there are many --

37
00:02:39,610 --> 00:02:43,760
and that's what we're
interested in --

38
00:02:43,760 --> 00:02:51,810
u hat infinity, that limit,
will minimize the other bit,

39
00:02:51,810 --> 00:02:55,540
A*u minus b square.

40
00:02:55,540 --> 00:02:58,110
Does that make sense somehow?

41
00:02:58,110 --> 00:03:01,480
This is in the problem
here, for any finite alpha.

42
00:03:01,480 --> 00:03:03,600
As alpha gets bigger
and bigger, we

43
00:03:03,600 --> 00:03:05,920
push harder and
harder on this one,

44
00:03:05,920 --> 00:03:10,440
so we get a u that's
a winner for this one,

45
00:03:10,440 --> 00:03:14,720
but the trace of this
first part is still around

46
00:03:14,720 --> 00:03:19,770
and if there are many winners,
then having this first part

47
00:03:19,770 --> 00:03:24,970
in there will give us, among
the winners, the one that

48
00:03:24,970 --> 00:03:29,110
does the best on that term.

49
00:03:29,110 --> 00:03:33,620
And small alpha,
going to 0, will just

50
00:03:33,620 --> 00:03:35,540
be the opposite right?

51
00:03:35,540 --> 00:03:37,880
This finally struck
me over the weekend,

52
00:03:37,880 --> 00:03:40,300
you know, like I could
divide this quantity,

53
00:03:40,300 --> 00:03:44,380
this whole expression by
alpha, so then I have a 1

54
00:03:44,380 --> 00:03:48,440
there, and a 1 over alpha
here, and now as alpha

55
00:03:48,440 --> 00:03:51,710
goes to 0, this is the big term.

56
00:03:51,710 --> 00:03:56,490
So now u -- shall
I call this u_0?

57
00:03:56,490 --> 00:03:58,700
Brilliant notation, right?

58
00:03:58,700 --> 00:04:02,640
So this produces a u_alpha.

59
00:04:06,190 --> 00:04:10,200
In that limit it converges to
a u_infinity that focuses first

60
00:04:10,200 --> 00:04:13,200
on this problem, but
in the other limit,

61
00:04:13,200 --> 00:04:15,970
when alpha's going to 0, it's
this term that's biggest.

62
00:04:15,970 --> 00:04:25,530
So u_0 minimizes A*u minus b
squared and if there are many

63
00:04:25,530 --> 00:04:31,265
minimizers, among these --
well, you know what I'm going

64
00:04:31,265 --> 00:04:35,590
to write. u_0, see I
put a little hat there.

65
00:04:35,590 --> 00:04:36,330
Did I?

66
00:04:36,330 --> 00:04:38,560
I don't know I haven't
stayed with these hats

67
00:04:38,560 --> 00:04:43,060
very much but maybe
I'll add them.

68
00:04:43,060 --> 00:04:48,870
u hat minimizes the term
that's not so important,

69
00:04:48,870 --> 00:04:51,290
B*u minus d square.
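
A minimal MATLAB sketch of these two limits, using the small example that comes up later in the lecture (A = I, b = 0, B = [1 -1], d = 6); the particular alpha values are arbitrary:

A = eye(2);  b = [0; 0];            % the later example: A is the identity, b is 0
B = [1 -1];  d = 6;                 % one constraint row, B*u = d
for alpha = [1e-6 1 1e6]            % small, moderate, and large penalty
    % normal equations for min ||A*u - b||^2 + alpha*||B*u - d||^2
    u = (A'*A + alpha*(B'*B)) \ (A'*b + alpha*(B'*d));
    fprintf('alpha = %-8g  u = [%8.4f, %8.4f]\n', alpha, u(1), u(2));
end
% large alpha pushes B*u toward d (u heads for [3, -3]);
% small alpha lets the ||A*u - b||^2 term dominate instead
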

70
00:04:51,290 --> 00:05:00,780
OK, so today's lecture is still
about these limiting cases.

71
00:05:00,780 --> 00:05:09,210
As I said, the scientific
problems, ill-posed problems,

72
00:05:09,210 --> 00:05:12,470
especially these
inverse problems,

73
00:05:12,470 --> 00:05:16,180
give situations in which
these limiting problems are

74
00:05:16,180 --> 00:05:21,170
really bad, and you don't get
to the limit, you don't want to.

75
00:05:21,170 --> 00:05:27,040
The whole point is to
have a finite alpha.

76
00:05:27,040 --> 00:05:33,020
But choosing that alpha
correctly is the art --

77
00:05:33,020 --> 00:05:36,460
let me just say
why -- so I almost,

78
00:05:36,460 --> 00:05:41,030
I'm sort of anticipating what
I'm not ready to do properly.

79
00:05:41,030 --> 00:05:49,010
So, I'll say why a finite
alpha, on Wednesday.

80
00:05:49,010 --> 00:05:51,590
Why?

81
00:05:51,590 --> 00:05:54,440
Because of noisy data.

82
00:06:04,500 --> 00:06:14,360
Because of the noise, at
best u is only determined,

83
00:06:14,360 --> 00:06:20,100
because of the noise,
up to some order,

84
00:06:20,100 --> 00:06:23,110
say, order of some
small quantity

85
00:06:23,110 --> 00:06:25,390
delta that measures the noise.

86
00:06:25,390 --> 00:06:27,250
This is like a
measure of the noise.

87
00:06:32,750 --> 00:06:35,920
Then there's no reason to
do what we did last time,

88
00:06:35,920 --> 00:06:39,450
like forcing B*u equal d.

89
00:06:39,450 --> 00:06:43,970
There's no point in forcing
B*u equal d if the d in that

90
00:06:43,970 --> 00:06:51,980
equation has noise in it;
then pushing it all the way

91
00:06:51,980 --> 00:06:59,570
to the limit is unreasonable,
and may produce a very,

92
00:06:59,570 --> 00:07:03,960
you know, a
catastrophic instability.

93
00:07:03,960 --> 00:07:08,350
So that's when -- so it's
really the presence of noise,

94
00:07:08,350 --> 00:07:15,180
the presence of uncertainty in
the first place that says OK,

95
00:07:15,180 --> 00:07:18,710
a finite alpha is fine, you're
not looking for perfection,

96
00:07:18,710 --> 00:07:21,110
what you're looking
for is some stability,

97
00:07:21,110 --> 00:07:25,460
some control on the stability.

98
00:07:25,460 --> 00:07:27,040
OK, right.

99
00:07:27,040 --> 00:07:30,670
But now -- so that's Wednesday.

100
00:07:34,300 --> 00:07:43,310
Today, let me go -- I didn't
give an example, so today,

101
00:07:43,310 --> 00:07:51,990
two topics, one is an
example with B*u equal d.

102
00:07:51,990 --> 00:07:54,260
That was last
lecture's, and that's

103
00:07:54,260 --> 00:07:58,450
the case when alpha
goes to infinity,

104
00:07:58,450 --> 00:08:03,010
then secondly is something
called a pseudoinverse.

105
00:08:03,010 --> 00:08:06,440
You may have seen that
expression, the pseudoinverse

106
00:08:06,440 --> 00:08:11,160
of A, and sometimes it's
written A with a dagger or A

107
00:08:11,160 --> 00:08:13,390
with a plus sign.

108
00:08:13,390 --> 00:08:15,460
And that is worth knowing about.

109
00:08:15,460 --> 00:08:18,140
So this is a topic
in linear algebra.

110
00:08:18,140 --> 00:08:19,910
It would be in my
linear algebra book,

111
00:08:19,910 --> 00:08:23,970
but it's a topic that never
gets into the 18.06 course because

112
00:08:23,970 --> 00:08:26,490
it's sort of a little late.

113
00:08:26,490 --> 00:08:30,620
And that will appear
as alpha goes to 0.

114
00:08:30,620 --> 00:08:31,120
Right.

115
00:08:31,120 --> 00:08:34,860
So that's what
today is about, it's

116
00:08:34,860 --> 00:08:41,790
linear algebra, because I'm
not ready for the noise yet.

117
00:08:41,790 --> 00:08:48,310
But it's the noisy data
that we have in reality

118
00:08:48,310 --> 00:08:55,340
and that's why, in reality,
alpha will be chosen finite.

119
00:08:55,340 --> 00:08:55,950
OK.

120
00:08:55,950 --> 00:09:02,760
So part one, then, is to do a
very simple example with B*u

121
00:09:02,760 --> 00:09:03,900
equal d.

122
00:09:03,900 --> 00:09:06,060
And here is the example.

123
00:09:06,060 --> 00:09:07,160
OK.

124
00:09:07,160 --> 00:09:11,859
So this is my sum of
squares in which I plan

125
00:09:11,859 --> 00:09:13,150
to let alpha go to infinity.

126
00:09:18,130 --> 00:09:21,870
So A is the identity
matrix and b is 0.

127
00:09:21,870 --> 00:09:26,120
So that quantity is simple.

128
00:09:26,120 --> 00:09:36,580
Here, I have just one equation,
so p is 1; p by n is 1 by 2.

129
00:09:36,580 --> 00:09:41,940
I've just one equation u_1 minus
u_2 equals 6, and in the limit,

130
00:09:41,940 --> 00:09:43,930
as alpha goes to
infinity, I expect

131
00:09:43,930 --> 00:09:46,220
to see that that
equation is enforced.

132
00:09:49,930 --> 00:09:54,790
So there's two ways to do it,
we can let alpha go to infinity

133
00:09:54,790 --> 00:10:02,820
and look at u_alpha
going toward u_infinity,

134
00:10:02,820 --> 00:10:05,690
maybe with their little hats.

135
00:10:05,690 --> 00:10:12,670
Or the second method which is
the null space method, which is

136
00:10:12,670 --> 00:10:15,500
what I spoke about last time.

137
00:10:15,500 --> 00:10:24,290
The null space method solves the
constraint B*u equal d which is

138
00:10:24,290 --> 00:10:26,540
just u_1 minus u_2 equal 6.

139
00:10:26,540 --> 00:10:27,040
OK.

140
00:10:27,040 --> 00:10:33,390
And that's -- maybe I'll
start with that one.

141
00:10:33,390 --> 00:10:35,313
Which looks so simple,
of course, just

142
00:10:35,313 --> 00:10:37,620
to solve u_1 minus u_2 equal 6.

143
00:10:37,620 --> 00:10:41,980
I mean, everybody
would say, OK, solve it

144
00:10:41,980 --> 00:10:45,290
for u_2 equals u_1 minus 6.

145
00:10:49,580 --> 00:10:54,650
So here is the method any
sensible person would use.

146
00:10:54,650 --> 00:10:56,210
But this course doesn't.

147
00:10:56,210 --> 00:11:03,630
OK, the sensible method
would be u_2 is u_1 minus 6;

148
00:11:03,630 --> 00:11:12,440
plug that into the
squares and minimize.

149
00:11:12,440 --> 00:11:18,090
So when I plug this in,
of course, this is exact,

150
00:11:18,090 --> 00:11:23,470
and this becomes u
-- so I'm minimizing,

151
00:11:23,470 --> 00:11:28,610
minimizing u_1 squared
plus, what was it?

152
00:11:28,610 --> 00:11:30,190
u_1 minus 6 square.

153
00:11:34,930 --> 00:11:39,410
So that's reduced the
problem to one unknown,

154
00:11:39,410 --> 00:11:41,250
this is the null space method.

155
00:11:41,250 --> 00:11:45,040
The null space method
is to solve the equation

156
00:11:45,040 --> 00:11:48,260
and remove unknowns.

157
00:11:48,260 --> 00:11:51,970
Remove p unknowns coming
from the p constraints,

158
00:11:51,970 --> 00:11:54,150
and here p is 1.

159
00:11:54,150 --> 00:11:54,660
OK.

160
00:11:54,660 --> 00:11:58,170
And by the way, can we
just guess, or not guess,

161
00:11:58,170 --> 00:12:05,770
but pretty well be sure,
what's the minimizer here?

162
00:12:05,770 --> 00:12:09,120
Anybody just tell me what
u_1 would minimize that?

163
00:12:09,120 --> 00:12:10,570
Just make a guess, maybe?

164
00:12:13,090 --> 00:12:19,520
I'm looking for a number sort
of halfway between 0 and 6

165
00:12:19,520 --> 00:12:21,110
somehow.

166
00:12:21,110 --> 00:12:27,630
You won't be surprised
that the u_1 is 3.

167
00:12:27,630 --> 00:12:32,450
And then, from this equation, I
should learn that u_2 is minus

168
00:12:32,450 --> 00:12:34,700
3 -- u_2, no, u_(2, infinity).

169
00:12:37,740 --> 00:12:42,130
Now I've got too many --
u_(2, infinity) is minus 3.

170
00:12:42,130 --> 00:12:47,430
Anyway, simple calculus, if you
just set the derivative to 0,

171
00:12:47,430 --> 00:12:50,430
you'll get 3 and then
you get minus 3 for u_2.
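
The one-variable calculus he is doing in his head, written out as a quick MATLAB check (fminsearch is just a numerical stand-in for setting the derivative to zero):

% substitute u2 = u1 - 6 and minimize f(u1) = u1^2 + (u1 - 6)^2;
% f'(u1) = 2*u1 + 2*(u1 - 6) = 0  gives  u1 = 3, and then u2 = -3
f  = @(u1) u1.^2 + (u1 - 6).^2;
u1 = fminsearch(f, 0);              % returns approximately 3
u2 = u1 - 6;                        % approximately -3
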

172
00:12:50,430 --> 00:12:53,650
So that's the null space
method, except that I

173
00:12:53,650 --> 00:13:02,180
didn't follow my complicated
QR orthogonalization.

174
00:13:02,180 --> 00:13:05,220
And I just want
to do that quickly

175
00:13:05,220 --> 00:13:07,320
to reach the same answer.

176
00:13:10,270 --> 00:13:15,250
And to say, why don't
I just do this anyway?

177
00:13:15,250 --> 00:13:18,860
This is what -- this
would be the row --

178
00:13:18,860 --> 00:13:23,030
this would be the standard
method in the first month

179
00:13:23,030 --> 00:13:27,180
of linear algebra -- to
use the row reduced echelon

180
00:13:27,180 --> 00:13:30,300
form, which of course
is going to be really,

181
00:13:30,300 --> 00:13:32,520
really simple for
this matrix; in fact,

182
00:13:32,520 --> 00:13:37,440
that's already in row reduced
echelon form -- elimination,

183
00:13:37,440 --> 00:13:40,930
row reduction has nothing
to do to improve that --

184
00:13:40,930 --> 00:13:44,280
and then solve and then
plug in and then go with it.

185
00:13:44,280 --> 00:13:49,290
OK, well the thing
is that that row

186
00:13:49,290 --> 00:13:53,960
reduced echelon form, the
stuff you teach, is not,

187
00:13:53,960 --> 00:13:59,730
for large systems,
guaranteed stable.

188
00:13:59,730 --> 00:14:01,640
It's not numerically stable.

189
00:14:01,640 --> 00:14:07,200
And the option of using --
of orthogonalizing

190
00:14:07,200 --> 00:14:11,200
is the right one to
know for a large system.

191
00:14:11,200 --> 00:14:16,010
So you'll have to allow me,
on this really small example,

192
00:14:16,010 --> 00:14:19,820
to use a method that
I described last time.

193
00:14:19,820 --> 00:14:24,970
And I just want to recap with
an example on the small system.

194
00:14:24,970 --> 00:14:27,780
OK, so what was that method?

195
00:14:27,780 --> 00:14:30,950
So this is the null space
method using qr now.

196
00:14:34,920 --> 00:14:39,710
The MATLAB command qr, so
what did we -- qr of B prime.

197
00:14:39,710 --> 00:14:43,130
Do you remember
that we took that --

198
00:14:43,130 --> 00:14:47,140
that's the MATLAB command
that eventually will,

199
00:14:47,140 --> 00:14:50,750
or is actually already in
the notes for this section,

200
00:14:50,750 --> 00:14:53,460
and those notes
will get updated --

201
00:14:53,460 --> 00:14:57,540
but that's step one in
the null space method,

202
00:14:57,540 --> 00:14:58,940
qr B prime.

203
00:14:58,940 --> 00:15:00,920
And this gives me a
chance to say what's

204
00:15:00,920 --> 00:15:04,790
up with this qr algorithm.

205
00:15:04,790 --> 00:15:11,150
I mean after lu, qr is the most
important algorithm in MATLAB.

206
00:15:11,150 --> 00:15:13,340
And so what does it do?

207
00:15:13,340 --> 00:15:20,460
B prime, the transpose of B,
is just 1, minus 1, right?

208
00:15:20,460 --> 00:15:23,200
OK.

209
00:15:23,200 --> 00:15:30,800
Now what does Gram-Schmidt
do to that matrix?

210
00:15:34,240 --> 00:15:37,250
Well, the idea of
Gram-Schmidt is

211
00:15:37,250 --> 00:15:41,370
to produce orthonormal columns.

212
00:15:41,370 --> 00:15:45,220
So the most basic Gram-Schmidt
idea would say, so what would

213
00:15:45,220 --> 00:15:46,390
Gram and Schmidt say?

214
00:15:46,390 --> 00:15:49,020
They'd say, well, we
only have one column,

215
00:15:49,020 --> 00:15:52,340
and all we would have
to do is normalize it.

216
00:15:52,340 --> 00:16:01,940
So Gram-Schmidt would produce
the normalized thing --

217
00:16:01,940 --> 00:16:05,830
times square root of 2.

218
00:16:05,830 --> 00:16:09,810
That would be the q, and
this would be the r, 1 by 1,

219
00:16:09,810 --> 00:16:12,650
in Gram-Schmidt.

220
00:16:12,650 --> 00:16:20,540
OK, but here's the point, that
the qr algorithm in MATLAB,

221
00:16:20,540 --> 00:16:23,560
which no longer uses
the Gram-Schmidt idea,

222
00:16:23,560 --> 00:16:31,000
instead uses a Householder idea,
and one nice thing about this

223
00:16:31,000 --> 00:16:39,950
is that it produces not just
this column, but another one,

224
00:16:39,950 --> 00:16:47,520
it produces a column for the
-- it completes the basis

225
00:16:47,520 --> 00:16:50,220
to a full orthonormal basis.

226
00:16:50,220 --> 00:16:54,020
So it finds a second vector.

227
00:16:54,020 --> 00:16:56,770
So ordinary
Gram-Schmidt just had

228
00:16:56,770 --> 00:16:59,270
one column times one number.

229
00:16:59,270 --> 00:17:05,480
What qr actually does is it
ends up with two columns.

230
00:17:05,480 --> 00:17:12,410
And well, everybody can see
what's the other column --

231
00:17:12,410 --> 00:17:16,610
that has length 1, of course,
and is orthogonal to the first

232
00:17:16,610 --> 00:17:17,500
column.

233
00:17:17,500 --> 00:17:21,770
And now, that is
multiplied by 0.

234
00:17:27,310 --> 00:17:30,370
So this is what qr does.

235
00:17:30,370 --> 00:17:35,160
We have this 2 by 1 matrix,
it produces a 2 by 2 times a 2

236
00:17:35,160 --> 00:17:35,690
by 1.

237
00:17:38,820 --> 00:17:43,000
And you might say, it
was wasting its time,

238
00:17:43,000 --> 00:17:48,810
to find this part, because
it's multiplied by 0,

239
00:17:48,810 --> 00:17:54,910
but what are we learning
from the vector?

240
00:17:54,910 --> 00:17:58,410
From this [1, 1] vector or
1 over square root of 2,

241
00:17:58,410 --> 00:18:00,190
1 over square root of 2 vector?

242
00:18:00,190 --> 00:18:03,880
What good can that do us?

243
00:18:03,880 --> 00:18:11,490
It's the null space of
B. So B was 1, minus 1,

244
00:18:11,490 --> 00:18:16,340
So let me just -- so that's
the connection with null space

245
00:18:16,340 --> 00:18:23,120
of B. If I look at vectors
-- there's my matrix B,

246
00:18:23,120 --> 00:18:29,330
and if I'm solving B*u equal
d, if I'm solving B*u equal d,

247
00:18:29,330 --> 00:18:37,200
then u is u_particular
and u null space,

248
00:18:37,200 --> 00:18:42,960
and if I want u null space, then
that's where this -- and these,

249
00:18:42,960 --> 00:18:46,950
whatever extra columns, this
might be p columns and then

250
00:18:46,950 --> 00:18:50,640
this would be n minus p columns,
that's what that's good for.
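
A quick check of that in MATLAB; the column signs may come out flipped, since Householder qr only fixes them up to sign:

B = [1 -1];                 % the 1-by-2 matrix from the example
[Q, R] = qr(B');            % full QR of the 2-by-1 matrix B'
% Q is 2 by 2: its first column is the normalized [1; -1] (the row space direction),
% its second column is the normalized [1; 1] (a basis for the null space of B).
% R is 2 by 1: sqrt(2) on top (up to sign), 0 below.
disp(B * Q(:,2))            % essentially zero -- the extra column really is in null(B)
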

251
00:18:50,640 --> 00:18:53,180
And of course that
column tells me

252
00:18:53,180 --> 00:18:56,380
about the null space
which, for this matrix,

253
00:18:56,380 --> 00:19:05,400
is one-dimensional
and easy to find, OK.

254
00:19:05,400 --> 00:19:13,210
So that may be just to, so you
know the difference between

255
00:19:13,210 --> 00:19:16,650
Gram-Schmidt's qr
which stops with --

256
00:19:16,650 --> 00:19:19,450
if you had one column
you end with one column,

257
00:19:19,450 --> 00:19:24,500
and the MATLAB Householder
qr which finds a full square

258
00:19:24,500 --> 00:19:25,170
matrix.

259
00:19:25,170 --> 00:19:29,410
OK, just good to know and
here we've found a use for it.

260
00:19:29,410 --> 00:19:29,910
OK.

261
00:19:29,910 --> 00:19:36,010
So then, the algorithm
that I gave last time --

262
00:19:36,010 --> 00:19:39,510
and I'll give the
code in the notes --

263
00:19:39,510 --> 00:19:45,740
goes through the steps of
finding a u_particular,

264
00:19:45,740 --> 00:19:51,600
and actually, the u_particular
that it would find happens

265
00:19:51,600 --> 00:19:59,540
to be -- [3, minus 3] happens
to be the actual winner.

266
00:19:59,540 --> 00:20:06,085
And therefore, the u null space
that that algorithm would find

267
00:20:06,085 --> 00:20:08,950
-- if I went through
all the steps,

268
00:20:08,950 --> 00:20:15,020
you would see that because I'm
in this special case of b being

269
00:20:15,020 --> 00:20:16,360
0 and so on,

270
00:20:16,360 --> 00:20:19,110
that the vector that
it would choose --

271
00:20:19,110 --> 00:20:21,440
this is the basis
for the null space,

272
00:20:21,440 --> 00:20:28,620
but it would choose 0 of that
basis vector and would come up

273
00:20:28,620 --> 00:20:30,500
with that answer.

274
00:20:30,500 --> 00:20:34,240
OK so that's what the
algorithm from last time

275
00:20:34,240 --> 00:20:39,010
would have done to this problem.
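
Roughly what those steps look like in MATLAB for this example, with A = I and b = 0 (a sketch, not the code from the notes):

B = [1 -1];  d = 6;
[Q, R] = qr(B');                 % B' = Q(:,1)*R(1), so Q(:,1) spans the row space
up = Q(:,1) * (d / R(1));        % particular solution of B*u = d, in the row space: [3; -3]
Q2 = Q(:,2);                     % null space basis
% remaining freedom: u = up + Q2*z; minimize ||A*u - b||^2 = ||up + Q2*z||^2 over z
z = Q2 \ (-up);                  % least squares; here z = 0, since up is orthogonal to Q2
u = up + Q2*z                    % [3; -3], the answer above
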

276
00:20:39,010 --> 00:20:46,120
I also, over the weekend,
thought OK, if it's all true,

277
00:20:46,120 --> 00:20:51,460
I should be able to
use my first method.

278
00:20:51,460 --> 00:20:55,190
The large alpha method
and just find the answer

279
00:20:55,190 --> 00:20:58,880
to the original problem and
let alpha go to infinity.

280
00:20:58,880 --> 00:21:01,260
Are you willing to do that?

281
00:21:01,260 --> 00:21:04,010
That might take a
little more calculation,

282
00:21:04,010 --> 00:21:06,400
but let me try that.

283
00:21:06,400 --> 00:21:10,430
I'm hoping, you know, that
it approaches this answer.

284
00:21:10,430 --> 00:21:13,810
This is the answer
I'm looking for.

285
00:21:13,810 --> 00:21:22,360
OK so, do you mind just
-- suppose I had to do that

286
00:21:22,360 --> 00:21:24,650
minimization.

287
00:21:24,650 --> 00:21:27,230
Again, now I'm not using
the null space method,

288
00:21:27,230 --> 00:21:30,610
so I'm not reducing, I'm not
getting u_2 out of the problem.

289
00:21:30,610 --> 00:21:38,730
I'm doing the minimum as it
stands, and so what do I get?

290
00:21:38,730 --> 00:21:41,750
Well, I've got two
variables u_1 and u_2.

291
00:21:41,750 --> 00:21:45,210
So I take the derivatives
with respect to u_1 --

292
00:21:45,210 --> 00:21:47,520
I'm minimizing --
everybody, when I point,

293
00:21:47,520 --> 00:21:50,360
I'm pointing at that top line.

294
00:21:50,360 --> 00:21:56,030
So it's 2*u_1, and
what do I have here?

295
00:21:56,030 --> 00:22:03,990
2*alpha times (u_1 minus u_2
minus 6) equaling 0.

296
00:22:03,990 --> 00:22:08,350
Is that -- did I take the
u_1 derivative correctly?

297
00:22:08,350 --> 00:22:12,590
Now if I take the u_2
derivative I get two u_2's.

298
00:22:12,590 --> 00:22:15,800
And now, the chain rule is
going to give me a minus sign,

299
00:22:15,800 --> 00:22:22,850
so it would be a minus 2*alpha
times (u_1 minus u_2 minus 6) equals 0.

300
00:22:22,850 --> 00:22:26,770
So those two equations
will determine u_1 and u_2

301
00:22:26,770 --> 00:22:29,370
for a finite alpha.

302
00:22:29,370 --> 00:22:33,360
And then I'll let alpha head to
infinity and see what happens.

303
00:22:33,360 --> 00:22:37,730
OK, first I'll
multiply by a half

304
00:22:37,730 --> 00:22:42,470
and get rid of those
useless 2's, and then

305
00:22:42,470 --> 00:22:45,360
solve this equation.

306
00:22:45,360 --> 00:22:48,080
OK, so what do I have here?

307
00:22:48,080 --> 00:22:54,160
I've got a matrix -- u_1 is
multiplying 1 plus alpha,

308
00:22:54,160 --> 00:22:57,250
u_2 has a minus alpha.

309
00:22:57,250 --> 00:23:06,340
In this line, u_1 has a minus
alpha, u_2 has a 1 minus,

310
00:23:06,340 --> 00:23:09,470
minus, plus alpha, am I right?

311
00:23:09,470 --> 00:23:14,950
Times [u_1, u_2] equals --
what's my right-hand side?

312
00:23:14,950 --> 00:23:17,940
I guess the right-hand
side has alphas in it.

313
00:23:17,940 --> 00:23:23,300
6*alpha and minus
6*alpha, I think.

314
00:23:23,300 --> 00:23:23,810
OK.

315
00:23:27,010 --> 00:23:29,510
Two equations, two unknowns.

316
00:23:29,510 --> 00:23:34,190
These are the normal equations
for this problem, written out

317
00:23:34,190 --> 00:23:35,610
explicitly.

318
00:23:35,610 --> 00:23:39,630
And probably I can
find the solution

319
00:23:39,630 --> 00:23:41,750
and let alpha go to infinity.

320
00:23:41,750 --> 00:23:44,680
You could say,
what are you doing,

321
00:23:44,680 --> 00:23:47,870
Professor Strang, this
elementary calculation?

322
00:23:47,870 --> 00:23:50,810
But there is something sort
of satisfying about seeing

323
00:23:50,810 --> 00:23:53,030
a small example actually work.

324
00:23:53,030 --> 00:23:54,550
At least to me.

325
00:23:54,550 --> 00:23:58,850
OK, so how do I solve
those equations?

326
00:23:58,850 --> 00:24:00,050
Well, good question.

327
00:24:03,430 --> 00:24:05,610
Should I -- with
a 2 by 2 matrix,

328
00:24:05,610 --> 00:24:09,560
can I do the unforgivable and
actually find its inverse?

329
00:24:09,560 --> 00:24:14,260
I mean, it's like not allowed
in true linear algebra

330
00:24:14,260 --> 00:24:18,750
to find the inverse, but
maybe we could do it here.

331
00:24:18,750 --> 00:24:25,410
So [u_1, u_2] is going to be
the inverse matrix, which is --

332
00:24:25,410 --> 00:24:31,120
so my little recipe for finding
inverses is take the entries,

333
00:24:31,120 --> 00:24:38,790
this entry goes up here,
that entry goes down there --

334
00:24:38,790 --> 00:24:41,150
well, you couldn't
see the difference --

335
00:24:41,150 --> 00:24:46,640
this entry stays in
place, those change sign,

336
00:24:46,640 --> 00:24:49,560
and then I have to divide
by the determinant.

337
00:24:49,560 --> 00:24:51,680
So what was the
determinant of this?

338
00:24:51,680 --> 00:24:55,190
1 plus 2*alpha plus alpha
squared minus alpha squared,

339
00:24:55,190 --> 00:24:57,290
I get 1 plus 2 alpha.

340
00:24:57,290 --> 00:25:02,830
And that's the inverse matrix
now that's multiplying 6*alpha

341
00:25:02,830 --> 00:25:06,650
and minus 6*alpha, OK.

342
00:25:06,650 --> 00:25:10,100
And if I can do that
multiplication, I have -- well,

343
00:25:10,100 --> 00:25:13,390
there's this factor 1
over 1 plus 2*alpha,

344
00:25:13,390 --> 00:25:15,880
and what do I have?

345
00:25:15,880 --> 00:25:20,180
6*alpha, 6 alpha squared,
minus 6 alpha squared,

346
00:25:20,180 --> 00:25:22,730
I think 6*alpha?

347
00:25:22,730 --> 00:25:27,640
6 alpha squared, minus 6*alpha
squared plus -- minus 6 alpha,

348
00:25:27,640 --> 00:25:30,200
I think it's that,
minus 6*alpha.

349
00:25:30,200 --> 00:25:36,320
And, ready for the great moment?

350
00:25:36,320 --> 00:25:44,100
Let alpha go to infinity,
and what do I get?

351
00:25:44,100 --> 00:25:52,580
As alpha goes to infinity,
the 1 becomes insignificant,

352
00:25:52,580 --> 00:25:59,660
the alpha cancels the alpha,
so that approaches [3, -3].

353
00:25:59,660 --> 00:26:03,590
So there you see the large
alpha method in practice.
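
The same board work as a few lines of MATLAB, just to watch u_alpha approach [3, -3] numerically:

% normal equations for min u1^2 + u2^2 + alpha*(u1 - u2 - 6)^2
for alpha = [1 10 1000 1e6]
    M   = [1 + alpha, -alpha; -alpha, 1 + alpha];
    rhs = [6*alpha; -6*alpha];
    u   = M \ rhs;               % equals [6*alpha; -6*alpha] / (1 + 2*alpha)
    fprintf('alpha = %-8g  u = [%.4f, %.4f]\n', alpha, u(1), u(2));
end
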

354
00:26:03,590 --> 00:26:04,370
OK.

355
00:26:04,370 --> 00:26:11,130
And you see what -- well,
there's something quite

356
00:26:11,130 --> 00:26:13,550
important here.

357
00:26:13,550 --> 00:26:15,482
Something quite important,
and it's connected

358
00:26:15,482 --> 00:26:16,440
with the pseudoinverse.

359
00:26:19,470 --> 00:26:27,350
The pseudoinverse -- so now,
I want to, we got this answer.

360
00:26:27,350 --> 00:26:35,300
And what I want to say is that
the alpha, the limiting alpha

361
00:26:35,300 --> 00:26:43,300
system, has produced
this pseudoinverse.

362
00:26:43,300 --> 00:26:46,160
So now I have to tell you
about the pseudoinverse

363
00:26:46,160 --> 00:26:47,810
and what it means.

364
00:26:47,810 --> 00:26:50,210
And basically, the essential
thing that it means

365
00:26:50,210 --> 00:26:59,840
is, the pseudoinverse
gives the solution u which

366
00:26:59,840 --> 00:27:03,300
has no null space component.

367
00:27:03,300 --> 00:27:05,120
That's what the
pseudoinverse is about.

368
00:27:05,120 --> 00:27:07,840
I'll draw a picture to
say what I'm saying.

369
00:27:07,840 --> 00:27:12,010
But it's this fact that
means that this part,

370
00:27:12,010 --> 00:27:20,150
which was this number,
is the output --

371
00:27:20,150 --> 00:27:28,530
this is the pseudoinverse
of B applied to the 6.

372
00:27:28,530 --> 00:27:29,670
You see the point?

373
00:27:29,670 --> 00:27:33,370
B hasn't got an inverse.

374
00:27:33,370 --> 00:27:34,410
B is 1, minus 1.

375
00:27:34,410 --> 00:27:40,200
It's a rectangular matrix.

376
00:27:40,200 --> 00:27:50,220
And it's not invertible
in the normal sense.

377
00:27:50,220 --> 00:27:53,590
I can't find a
two-sided inverse;

378
00:27:53,590 --> 00:27:58,490
a B inverse doesn't exist.

379
00:27:58,490 --> 00:28:02,100
But a pseudoinverse exists.

380
00:28:02,100 --> 00:28:06,020
So, just to give a MATLAB -- as
long as I've written a MATLAB

381
00:28:06,020 --> 00:28:11,910
command here, why don't I
write the other MATLAB command?

382
00:28:11,910 --> 00:28:17,940
u is the pseudoinverse -- you
remember that pseudo starts

383
00:28:17,940 --> 00:28:27,530
with a letter p, so P-I-N-V
-- of B multiplying d.
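
Written out, for this example:

B = [1 -1];  d = 6;
u = pinv(B) * d              % gives [3; -3]
% pinv(B) is the 2-by-1 matrix [0.5; -0.5], so u is a multiple of [1; -1]:
% it lies in the row space of B, with no null space component.
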

384
00:28:27,530 --> 00:28:34,660
That's what we
got automatically.

385
00:28:34,660 --> 00:28:38,370
And it's what we get --
and the reason we got

386
00:28:38,370 --> 00:28:40,660
the pseudoinverse.

387
00:28:40,660 --> 00:28:43,420
So let me just say
what was special here.

388
00:28:43,420 --> 00:28:46,060
What was special that
produced this pseudoinverse --

389
00:28:46,060 --> 00:28:48,300
that I'm going to
speak about more --

390
00:28:48,300 --> 00:28:54,040
was this choice A equal
the identity and b equal 0,

391
00:28:54,040 --> 00:28:59,530
the fact that we just put the
norm of u squared there --

392
00:28:59,530 --> 00:29:02,470
well, the idea is this
produces the pseudoinverse.

393
00:29:06,030 --> 00:29:12,110
And if you like -- so, can I
say a little more about this

394
00:29:12,110 --> 00:29:14,710
pseudoinverse before drawing
the picture that shows what

395
00:29:14,710 --> 00:29:15,620
it's about?

396
00:29:15,620 --> 00:29:19,570
So I took this thing and
let alpha go to infinity.

397
00:29:19,570 --> 00:29:24,340
OK, so I could equally well
have divided it by alpha,

398
00:29:24,340 --> 00:29:27,370
the whole -- if I divide
the whole thing by alpha,

399
00:29:27,370 --> 00:29:32,500
that won't change the minimizer;
certainly the same u's will

400
00:29:32,500 --> 00:29:33,650
win.

401
00:29:33,650 --> 00:29:37,830
And now I see one
over alpha going to 0.

402
00:29:37,830 --> 00:29:41,510
And that's where the
pseudoinverse is usually seen.

403
00:29:41,510 --> 00:29:46,550
We take the given problem,
which does not completely

404
00:29:46,550 --> 00:29:49,920
determine u_1 and
u_2, and we throw

405
00:29:49,920 --> 00:29:54,800
in a small amount
of norm u squared,

406
00:29:54,800 --> 00:29:59,590
and find the minimum
for that, right.

407
00:29:59,590 --> 00:30:01,380
So yeah.

408
00:30:03,990 --> 00:30:06,440
Let me say it, somehow.

409
00:30:06,440 --> 00:30:15,870
I take the B transpose B
plus the 1 over alpha I --

410
00:30:15,870 --> 00:30:23,560
now alpha is still going to
infinity in this lecture,

411
00:30:23,560 --> 00:30:30,890
so 1 over alpha, the whole
thing is headed for 0 --

412
00:30:30,890 --> 00:30:34,390
times the norm of u square.

413
00:30:34,390 --> 00:30:37,720
This is the u_1 squared
plus u_2 squared.

414
00:30:37,720 --> 00:30:40,650
OK.

415
00:30:40,650 --> 00:30:49,350
And that inverse, that quantity
inverse approaches the -- well,

416
00:30:49,350 --> 00:30:54,020
once I -- I'm not giving
the complete formula,

417
00:30:54,020 --> 00:30:59,020
but that's what is entering
here and it leads to --

418
00:30:59,020 --> 00:31:05,830
may I use the vague word leads
toward the pseudoinverse B

419
00:31:05,830 --> 00:31:07,370
plus.
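
One way to write out the formula he is gesturing at and check the limit numerically (a sketch; he isn't giving the complete formula, so the normal-equation form below is filled in):

% after dividing by alpha:  minimize ||B*u - d||^2 + (1/alpha)*||u||^2,
% whose normal equations are  (B'*B + (1/alpha)*I) * u = B'*d
B = [1 -1];  d = 6;
for alpha = [1 100 1e6]
    u = (B'*B + (1/alpha)*eye(2)) \ (B'*d);
    fprintf('alpha = %-8g  u = [%.4f, %.4f]\n', alpha, u(1), u(2));
end
pinv(B) * d                  % the limit as 1/alpha goes to 0: [3; -3]
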

420
00:31:07,370 --> 00:31:07,870
Yeah.

421
00:31:07,870 --> 00:31:11,130
And I'll do better with that.

422
00:31:11,130 --> 00:31:13,750
OK, I want to go
on to the picture.

423
00:31:13,750 --> 00:31:17,770
OK, so right.

424
00:31:17,770 --> 00:31:20,430
Do you know the most important
picture of linear algebra?

425
00:31:20,430 --> 00:31:24,610
The whole picture what a
matrix is actually doing?

426
00:31:24,610 --> 00:31:29,120
Here we have a great example
to draw that picture.

427
00:31:29,120 --> 00:31:33,690
So here's the picture
that 18.06 is --

428
00:31:33,690 --> 00:31:34,860
it's at the center of 18.06.

429
00:31:34,860 --> 00:31:39,060
For our 1 by 2 matrix.

430
00:31:39,060 --> 00:31:43,680
So this is our matrix,
B equals 1, minus 1.

431
00:31:43,680 --> 00:31:45,810
This is the picture
for that matrix.

432
00:31:45,810 --> 00:31:51,660
OK, so that matrix
has a row space.

433
00:31:51,660 --> 00:31:54,210
The row space is the
set of all vectors that

434
00:31:54,210 --> 00:31:56,430
are combinations of the rows.

435
00:31:56,430 --> 00:32:00,400
But there's only one row, so
the row space is only a line.

436
00:32:00,400 --> 00:32:03,720
I guess it's probably that line.

437
00:32:03,720 --> 00:32:09,770
So the row space
of B, of my matrix,

438
00:32:09,770 --> 00:32:16,180
is all multiples of [1, -1].

439
00:32:16,180 --> 00:32:18,070
So it's a line.

440
00:32:18,070 --> 00:32:21,020
Let's put the zero point in.

441
00:32:21,020 --> 00:32:24,540
OK, then the matrix
also has a null space.

442
00:32:24,540 --> 00:32:30,420
The null space is the set
of solutions to B*x equals 0.

443
00:32:30,420 --> 00:32:36,040
It's a line, and in fact
it's a perpendicular line.

444
00:32:36,040 --> 00:32:45,830
So this is the null space
of B, and it contains all --

445
00:32:45,830 --> 00:32:47,480
what does it contain?

446
00:32:47,480 --> 00:32:51,340
All the solutions to B*u
equals 0 which, in this case,

447
00:32:51,340 --> 00:32:56,690
are all multiples of [1, 1].

448
00:32:56,690 --> 00:33:00,860
And just to come back
to my early comment,

449
00:33:00,860 --> 00:33:06,500
that's what the qr, the extra
half of the qr algorithm

450
00:33:06,500 --> 00:33:10,710
is telling us; it's giving
us a beautiful basis

451
00:33:10,710 --> 00:33:11,920
for the null space.

452
00:33:11,920 --> 00:33:16,610
And so the key point is that
the null space is always

453
00:33:16,610 --> 00:33:22,130
perpendicular to the row space,
which of course we see here.

454
00:33:22,130 --> 00:33:28,240
This z is what we had to
compute when there were

455
00:33:28,240 --> 00:33:30,590
p components and not just one.

456
00:33:30,590 --> 00:33:38,040
And now, where is -- let's
see, what else goes into this

457
00:33:38,040 --> 00:33:39,450
picture?

458
00:33:39,450 --> 00:33:44,740
Where are the solutions to
my equation B*u equal d?

459
00:33:44,740 --> 00:33:50,620
So my equation was -- my
equation was u_1 minus u_2

460
00:33:50,620 --> 00:33:56,690
equal a particular number, 6,
and where are the solutions

461
00:33:56,690 --> 00:33:58,870
to u_1 minus u_2 equals 6?

462
00:34:02,250 --> 00:34:11,080
OK, so now I want to draw all
the -- Where are all the --

463
00:34:11,080 --> 00:34:13,700
so this is the u_1, u_2 plane?

464
00:34:13,700 --> 00:34:20,320
OK, so one solution is
take c equal to 3. [3, -3],

465
00:34:20,320 --> 00:34:24,090
the combination [3, -3],
which is right there,

466
00:34:24,090 --> 00:34:29,410
is my particular solution, so
u_particular, or u row space,

467
00:34:29,410 --> 00:34:30,650
is [3, -3].

468
00:34:33,540 --> 00:34:38,530
That solves the equation,
and it lies in the row space.

469
00:34:38,530 --> 00:34:41,090
And now, if you
understand the whole point

470
00:34:41,090 --> 00:34:47,400
of linear equations, where
are the rest of the solutions?

471
00:34:47,400 --> 00:34:50,710
How do I draw the
rest of the solutions?

472
00:34:50,710 --> 00:34:55,890
Well, to a particular solution I
add on any null space solution.

473
00:34:55,890 --> 00:34:59,310
The null space
solutions go this way.

474
00:34:59,310 --> 00:35:06,100
So I add on -- so this is my
whole line of all solutions,

475
00:35:06,100 --> 00:35:08,890
so this is the line
of all solutions.

476
00:35:17,960 --> 00:35:24,230
And now, the key question is,
which solution is the smallest?

477
00:35:24,230 --> 00:35:26,990
When -- so this is the
idea of this pseudoinverse.

478
00:35:26,990 --> 00:35:31,260
When there are many solutions,
pick the smallest one,

479
00:35:31,260 --> 00:35:34,720
pick the shortest one, it's
the most stable somehow.

480
00:35:34,720 --> 00:35:36,530
It's the natural one.

481
00:35:36,530 --> 00:35:39,890
And which one is it?

482
00:35:39,890 --> 00:35:42,860
OK which -- so
here is the origin.

483
00:35:42,860 --> 00:35:46,640
What point on that line
is closest to the origin?

484
00:35:46,640 --> 00:35:51,930
What point minimizes u_1
square plus u_2 square?

485
00:35:51,930 --> 00:35:53,070
Everybody can see.

486
00:35:53,070 --> 00:36:01,000
This guy, that minimi--
so the pseudoinverse says,

487
00:36:01,000 --> 00:36:04,470
wait a minute, when you've
got a whole line of solutions,

488
00:36:04,470 --> 00:36:06,490
just tell me a good one.

489
00:36:06,490 --> 00:36:08,250
Tell me the special one.

490
00:36:08,250 --> 00:36:11,430
And the special one is
the one in the row space.

491
00:36:14,110 --> 00:36:16,740
And that's the one that
the pseudoinverse picks.
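
A small numerical illustration of that picture, parametrizing the line of solutions and looking for the shortest one:

t  = linspace(-5, 5, 1001);          % line of solutions: u(t) = [3; -3] + t*[1; 1]
u1 = 3 + t;   u2 = -3 + t;
len2 = u1.^2 + u2.^2;                % squared length = 18 + 2*t.^2
[~, k] = min(len2);
[u1(k), u2(k)]                       % [3, -3]: the closest point to the origin, in the row space
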

492
00:36:16,740 --> 00:36:23,210
So the pseudoinverse of a matrix
-- so the general rule is,

493
00:36:23,210 --> 00:36:26,170
and part of the lecture
was the fact that,

494
00:36:26,170 --> 00:36:29,550
as alpha goes to
infinity in this problem,

495
00:36:29,550 --> 00:36:31,290
the pseudoinverse will do it.

496
00:36:31,290 --> 00:36:35,310
Or I could say, just directly,
what does the pseudoinverse do?

497
00:36:35,310 --> 00:36:46,450
The pseudoinverse -- so B plus,
the pseudoinverse, chooses,

498
00:36:46,450 --> 00:36:55,500
it chooses u_p, if you like,
u_p -- that's the B plus,

499
00:36:55,500 --> 00:37:01,290
that multiplies -- the solution
-- I can't say B inverse d.

500
00:37:01,290 --> 00:37:05,070
Everybody knows my
equation is B*u equal d.

501
00:37:05,070 --> 00:37:08,850
So this is my
equation, B*u equal d.

502
00:37:08,850 --> 00:37:14,000
And my particular solution,
my pseudo-solution, my best

503
00:37:14,000 --> 00:37:16,790
solution, is going
to be B plus d,

504
00:37:16,790 --> 00:37:21,380
and it's going to
be in the row space,

505
00:37:21,380 --> 00:37:26,930
because it's the
smallest solution.

506
00:37:30,650 --> 00:37:33,640
So if you meet the
idea of pseudoinverses,

507
00:37:33,640 --> 00:37:38,130
now you know what
it's talking about.

508
00:37:38,130 --> 00:37:40,060
Because we don't
have a true inverse,

509
00:37:40,060 --> 00:37:45,230
we have a whole line of
solutions, we want to pick one,

510
00:37:45,230 --> 00:37:48,200
and the pseudoinverse
picks this one.

511
00:37:48,200 --> 00:37:50,580
It's the one in the row
space, and it's the shortest,

512
00:37:50,580 --> 00:37:53,260
because these are orthogonal.

513
00:37:53,260 --> 00:38:02,210
Because these are orthogonal
-- u is u_p plus u_n,

514
00:38:02,210 --> 00:38:03,900
and because those
are orthogonal,

515
00:38:03,900 --> 00:38:06,800
the length of u
squared, by Pythagoras,

516
00:38:06,800 --> 00:38:11,670
is the length of u_p squared
plus the length of u_n squared.

517
00:38:11,670 --> 00:38:15,820
And which one is shortest?

518
00:38:15,820 --> 00:38:18,830
The one that has no u_n.

519
00:38:18,830 --> 00:38:22,710
That orthogonal
component might as well

520
00:38:22,710 --> 00:38:26,300
be 0 if you want the shortest.

521
00:38:26,300 --> 00:38:30,420
So all solutions have
this, and this is

522
00:38:30,420 --> 00:38:32,030
the length of the shortest one.

523
00:38:32,030 --> 00:38:32,860
OK.

524
00:38:32,860 --> 00:38:35,740
So that tells you what
the pseudoinverse is.

525
00:38:35,740 --> 00:38:41,550
At least it tells you what
it is for a 1 by 2 matrix.

526
00:38:41,550 --> 00:38:48,710
As long as I'm trying to
speak about the pseudoinverse,

527
00:38:48,710 --> 00:38:52,170
let me complete this thought.

528
00:38:52,170 --> 00:38:56,570
But you saw the idea,
that the thought was --

529
00:38:56,570 --> 00:38:59,320
there are two
ways to get it, again.

530
00:38:59,320 --> 00:39:02,940
The null space method
that goes for it directly,

531
00:39:02,940 --> 00:39:09,070
or the big alpha method that
we checked actually works.

532
00:39:09,070 --> 00:39:10,920
So that was the point
of this board here.

533
00:39:10,920 --> 00:39:15,780
That the big alpha method, also
produces, in the limit as alpha

534
00:39:15,780 --> 00:39:17,610
goes to infinity, u_p.

535
00:39:17,610 --> 00:39:23,850
And there's a little
-- it doesn't have --

536
00:39:23,850 --> 00:39:27,390
if alpha was a 1,000, I wouldn't
get exactly the right answer,

537
00:39:27,390 --> 00:39:32,320
because this would be
2,001 in the denominator.

538
00:39:32,320 --> 00:39:36,780
But as 1,000 becomes a million
and alpha goes to infinity,

539
00:39:36,780 --> 00:39:40,110
I get the exact one.

540
00:39:40,110 --> 00:39:42,570
OK, so here I was going
to draw the picture,

541
00:39:42,570 --> 00:39:53,170
so if I draw row space -- can I
imagine this is the row space,

542
00:39:53,170 --> 00:39:58,580
whose dimension is the
rank of the matrix.

543
00:39:58,580 --> 00:40:07,510
Perpendicular to it is the
null space whose dimension is

544
00:40:07,510 --> 00:40:12,940
the rest -- the rank, that was
the rank that I always call r,

545
00:40:12,940 --> 00:40:16,420
then this will have the
dimension n minus r,

546
00:40:16,420 --> 00:40:19,680
the number of --
this is exactly,

547
00:40:19,680 --> 00:40:27,590
these are the two things
that MATLAB found here.

548
00:40:27,590 --> 00:40:34,030
These were the r vectors in the
row space, turned into columns,

549
00:40:34,030 --> 00:40:37,680
and these were the n minus r
-- but that was only one --

550
00:40:37,680 --> 00:40:40,340
vectors in the null space.

551
00:40:40,340 --> 00:40:46,690
So normally, we're up in n
dimensions, not just two.

552
00:40:46,690 --> 00:40:50,810
With two dimensions, I just
had lines; in n dimensions

553
00:40:50,810 --> 00:40:54,040
I have an r-dimensional
subspace perpendicular

554
00:40:54,040 --> 00:40:56,870
to an n minus r
dimensional subspace.

555
00:40:56,870 --> 00:41:01,150
And now B. What does B do?

556
00:41:01,150 --> 00:41:02,030
OK.

557
00:41:02,030 --> 00:41:06,990
So suppose I take a vector
u_n in the null space

558
00:41:06,990 --> 00:41:10,130
of B. Then B takes it to 0.

559
00:41:10,130 --> 00:41:13,180
So can I just draw
that with an arrow?

560
00:41:13,180 --> 00:41:14,630
This'll be 0.

561
00:41:14,630 --> 00:41:20,210
B*u_n is 0, that's
the whole idea.

562
00:41:20,210 --> 00:41:21,380
OK.

563
00:41:21,380 --> 00:41:26,810
But a vector in a row
space is not taken to 0.

564
00:41:26,810 --> 00:41:32,110
B will take that -- dot
dot dot dot -- into the --

565
00:41:32,110 --> 00:41:43,740
I better draw here -- the
column space of B. OK.

566
00:41:43,740 --> 00:41:49,350
Which I'm drawing as a subspace
whose dimension is also,

567
00:41:49,350 --> 00:41:57,590
is this same rank r, that's
the great fact about matrices,

568
00:41:57,590 --> 00:41:59,650
that the number of
independent rows

569
00:41:59,650 --> 00:42:02,280
equals the number of
independent columns.

570
00:42:02,280 --> 00:42:11,550
So this guy heads off
to some B u row space.

571
00:42:11,550 --> 00:42:12,730
OK.

572
00:42:12,730 --> 00:42:16,900
And if I complete the
picture, as I really should,

573
00:42:16,900 --> 00:42:23,360
there's another
subspace over here,

574
00:42:23,360 --> 00:42:28,250
which happened to be the zero
subspace in this example,

575
00:42:28,250 --> 00:42:29,970
but usually it's here.

576
00:42:29,970 --> 00:42:35,230
It's the null space
of B transpose.

577
00:42:39,040 --> 00:42:45,470
In that example, B
transpose was [1, -1]

578
00:42:45,470 --> 00:42:51,140
and its column was independent,
so there was no null space.

579
00:42:51,140 --> 00:42:53,300
So I had a simple
picture, and that's

580
00:42:53,300 --> 00:42:57,270
why I wanted to draw you
a bigger picture with it.

581
00:42:57,270 --> 00:43:02,950
Its dimension will be,
well, not n minus r,

582
00:43:02,950 --> 00:43:10,850
but if B is m by
n, let's say, then

583
00:43:10,850 --> 00:43:13,190
it turns out that
this null space will

584
00:43:13,190 --> 00:43:15,250
have dimension m minus r.

585
00:43:15,250 --> 00:43:16,100
No problem.

586
00:43:16,100 --> 00:43:16,720
OK.

587
00:43:16,720 --> 00:43:20,420
Now, in the last
three minutes, I

588
00:43:20,420 --> 00:43:25,740
want to draw the pseudoinverse.

589
00:43:25,740 --> 00:43:28,910
So what I'm saying is
that every matrix B,

590
00:43:28,910 --> 00:43:32,020
every rectangular
or square matrix B,

591
00:43:32,020 --> 00:43:34,450
has these four spaces.

592
00:43:34,450 --> 00:43:37,890
Four fundamental subspaces
they've come to be called.

593
00:43:37,890 --> 00:43:46,290
OK and the null space is the
vectors which B takes to 0.

594
00:43:46,290 --> 00:43:50,030
B takes any vector
into its column space.

595
00:43:50,030 --> 00:43:55,950
So now let me just draw what
happens to u equal u null

596
00:43:55,950 --> 00:43:59,010
space plus u row space.

597
00:43:59,010 --> 00:44:03,220
So this was a guy
in the row space.

598
00:44:03,220 --> 00:44:08,210
If I -- B, what will B do
when it multiplies this vector?

599
00:44:08,210 --> 00:44:12,010
This vector has a part
that's in the null space,

600
00:44:12,010 --> 00:44:14,320
and a part that's
in the row space.

601
00:44:14,320 --> 00:44:19,860
But when I multiply by B,
what happens to this part?

602
00:44:19,860 --> 00:44:21,180
Gone.

603
00:44:21,180 --> 00:44:24,060
When I multiply that
by B, where does it go?

604
00:44:24,060 --> 00:44:24,740
There.

605
00:44:24,740 --> 00:44:31,000
So this, all these guys
feed into that same point.

606
00:44:31,000 --> 00:44:37,400
B*u is also going there.

607
00:44:37,400 --> 00:44:39,350
That's why it's not invertible.

608
00:44:39,350 --> 00:44:40,440
Of course.

609
00:44:40,440 --> 00:44:42,800
That's why it's not invertible.

610
00:44:42,800 --> 00:44:51,120
Here, I guess --
yeah, here I -- sorry.

611
00:44:51,120 --> 00:44:51,750
Yeah.

612
00:44:51,750 --> 00:44:54,986
This was the null space of
B. I didn't write in what

613
00:44:54,986 --> 00:44:56,380
it was the null space of.

614
00:44:56,380 --> 00:44:57,310
OK.

615
00:44:57,310 --> 00:45:00,630
So the matrix couldn't be
invertible, and actually,

616
00:45:00,630 --> 00:45:04,300
because it has a null space,
and they all send those --

617
00:45:04,300 --> 00:45:07,880
so what is the
pseudoinverse, finally?

618
00:45:07,880 --> 00:45:13,640
Finally, last moment, the
pseudoinverse is the matrix --

619
00:45:13,640 --> 00:45:17,540
it's like an inverse matrix
that comes backwards, right?

620
00:45:17,540 --> 00:45:21,650
It reverses what B does.

621
00:45:21,650 --> 00:45:28,000
What it cannot do is reverse
stuff that's appeared at 0.

622
00:45:28,000 --> 00:45:31,960
No matrix could send
0 back to u_n right?

623
00:45:31,960 --> 00:45:34,650
If I multiply by
the zero vector,

624
00:45:34,650 --> 00:45:36,890
I'm only going to
get the zero vector.

625
00:45:36,890 --> 00:45:44,730
So the pseudoinverse has to --
what it can do is it can send

626
00:45:44,730 --> 00:45:50,220
this stuff back to this.

627
00:45:50,220 --> 00:45:51,760
This is what the
pseudoinverse does.

628
00:45:51,760 --> 00:45:55,460
If I had a different color
chalk I would use it now.

629
00:45:55,460 --> 00:45:59,920
But let me use two
arrows or even three.

630
00:45:59,920 --> 00:46:02,470
This is what the
pseudoinverse does.

631
00:46:02,470 --> 00:46:08,570
It takes the column space and
sends it back to the row space.

632
00:46:08,570 --> 00:46:13,710
And because these have the same
dimension r -- the point is,

633
00:46:13,710 --> 00:46:18,880
inside B is this r by
r matrix that's cool.

634
00:46:18,880 --> 00:46:20,640
It's totally invertible.

635
00:46:20,640 --> 00:46:23,270
And B plus inverts it.

636
00:46:23,270 --> 00:46:25,900
So from row space
to column space

637
00:46:25,900 --> 00:46:29,860
goes B; from column
space back to row space

638
00:46:29,860 --> 00:46:34,420
comes the pseudoinverse, but I
can't call it a genuine inverse

639
00:46:34,420 --> 00:46:40,300
because all this stuff,
including 0, the best I can do

640
00:46:40,300 --> 00:46:43,790
is send those all back to 0.

641
00:46:43,790 --> 00:46:45,400
There.

642
00:46:45,400 --> 00:46:49,320
Now I've really wiped
out that figure.

643
00:46:49,320 --> 00:46:52,490
But I'll put the
three arrows there

644
00:46:52,490 --> 00:46:56,410
that makes it crystal clear.

645
00:46:56,410 --> 00:46:58,740
So this, those three
arrows are indicating

646
00:46:58,740 --> 00:47:00,895
what the pseudoinverse does.

647
00:47:00,895 --> 00:47:08,580
It takes the column space -- Its
column space is the row space.

648
00:47:08,580 --> 00:47:13,370
The column space of B plus
is the row space of B.

649
00:47:13,370 --> 00:47:16,700
You know, sort of, in these
two spaces, that's where

650
00:47:16,700 --> 00:47:19,010
the pseudoinverse is alive.

651
00:47:19,010 --> 00:47:25,410
And B kills the null space and
B plus kills the null space --

652
00:47:25,410 --> 00:47:26,420
the other null space.

653
00:47:26,420 --> 00:47:28,000
The null space of B transpose.
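
For the 1 by 2 example those statements can be checked directly; here the null space of B transpose is just the zero vector, so only the other arrows show up:

B  = [1 -1];   Bp = pinv(B);     % Bp is 2 by 1, namely [0.5; -0.5]
ur = [3; -3];                    % a vector in the row space of B
un = [1;  1];                    % a vector in the null space of B
B * un                           % 0: B kills its null space
Bp * (B * ur)                    % [3; -3]: the column space is sent back to the row space
Bp * (B * (ur + un))             % also [3; -3]: the null space part cannot be recovered
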

654
00:47:28,000 --> 00:47:35,690
Anyway, that pseudoinverse is at
the center of the whole theory

655
00:47:35,690 --> 00:47:38,240
here.

656
00:47:38,240 --> 00:47:40,340
You know, when I take out
books from the library

657
00:47:40,340 --> 00:47:44,530
about regularizing
least squares,

658
00:47:44,530 --> 00:47:49,470
they begin by explaining
the pseudoinverse.

659
00:47:49,470 --> 00:47:55,470
Which, as we've seen, arises
as alpha goes to infinity

660
00:47:55,470 --> 00:47:59,920
or 0, whichever end we're at.

661
00:47:59,920 --> 00:48:02,220
And what I have
still to do next time

662
00:48:02,220 --> 00:48:06,740
is, what happens if
I'm not prepared to go

663
00:48:06,740 --> 00:48:08,820
all the way to
the pseudoinverse,

664
00:48:08,820 --> 00:48:17,110
because it blows up on me,
and I want a finite alpha,

665
00:48:17,110 --> 00:48:19,610
what should that alpha be?

666
00:48:19,610 --> 00:48:26,710
And that alpha will be
determined by, as I said,

667
00:48:26,710 --> 00:48:30,270
somehow by the noise
level in the system.

668
00:48:30,270 --> 00:48:30,860
Right.

669
00:48:30,860 --> 00:48:34,540
And just to emphasize another
example that I'll probably

670
00:48:34,540 --> 00:48:38,200
mention, you know,
CT scans, MRI,

671
00:48:38,200 --> 00:48:42,420
all those things that
are trying to reconstruct

672
00:48:42,420 --> 00:48:46,990
the results from a limited number
of measurements, measurements

673
00:48:46,990 --> 00:48:51,270
that are not really enough
for perfect reconstruction,

674
00:48:51,270 --> 00:48:55,760
so this is the theory of
imperfect reconstruction,

675
00:48:55,760 --> 00:49:00,640
if I can invent an
expression, having

676
00:49:00,640 --> 00:49:03,490
met perfect reconstruction
in the world of wavelets

677
00:49:03,490 --> 00:49:06,480
and signal processing,
this is the subject

678
00:49:06,480 --> 00:49:09,280
of imperfect
reconstruction and I'll

679
00:49:09,280 --> 00:49:12,270
hope to do justice
to it on Wednesday.

680
00:49:12,270 --> 00:49:12,920
OK.

681
00:49:12,920 --> 00:49:14,490
Thank you.