1
00:00:00,790 --> 00:00:03,130
The following content is
provided under a Creative

2
00:00:03,130 --> 00:00:04,550
Commons license.

3
00:00:04,550 --> 00:00:06,760
Your support will help
MIT OpenCourseWare

4
00:00:06,760 --> 00:00:10,850
continue to offer high quality
educational resources for free.

5
00:00:10,850 --> 00:00:13,390
To make a donation or to
view additional materials

6
00:00:13,390 --> 00:00:17,320
from hundreds of MIT courses
visit MIT OpenCourseWare

7
00:00:17,320 --> 00:00:18,570
at ocw.mit.edu.

8
00:00:29,770 --> 00:00:32,170
JOHN GUTTAG: We ended
the last lecture

9
00:00:32,170 --> 00:00:35,200
looking at greedy algorithms.

10
00:00:35,200 --> 00:00:38,370
Today I want to discuss the
pros and cons of greedy.

11
00:00:38,370 --> 00:00:40,240
Oh, I should mention--

12
00:00:40,240 --> 00:00:44,310
in response to popular demand,
I have put the PowerPoint up,

13
00:00:44,310 --> 00:00:48,920
so if you download the ZIP
file, you'll find the questions,

14
00:00:48,920 --> 00:00:52,600
including question 1, the
first question, plus the code,

15
00:00:52,600 --> 00:00:56,090
plus the PowerPoint.

16
00:00:56,090 --> 00:00:59,300
We actually do read Piazza,
and sometimes, at least,

17
00:00:59,300 --> 00:01:00,380
pay attention.

18
00:01:00,380 --> 00:01:03,710
We should pay
attention all the time.

19
00:01:03,710 --> 00:01:08,150
So what are the pros
and cons of greedy?

20
00:01:08,150 --> 00:01:11,160
The pro-- and it's
a big pro-- is

21
00:01:11,160 --> 00:01:14,790
that it's really easy to
implement, as you could see.

22
00:01:14,790 --> 00:01:18,280
Also enormously important--
it's really fast.

23
00:01:18,280 --> 00:01:20,100
We looked at the
complexity last time--

24
00:01:20,100 --> 00:01:23,730
it was n log n-- quite quick.

25
00:01:23,730 --> 00:01:27,090
The downside-- and this can
be either a big problem or not

26
00:01:27,090 --> 00:01:28,500
a big problem--

27
00:01:28,500 --> 00:01:30,840
is that it doesn't
actually solve

28
00:01:30,840 --> 00:01:34,440
the problem, in the sense
that we've asked ourselves

29
00:01:34,440 --> 00:01:36,460
to optimize something.

30
00:01:36,460 --> 00:01:40,930
And we get a solution that
may or may not be optimal.

31
00:01:40,930 --> 00:01:43,900
Worse-- we don't even
know, in this case,

32
00:01:43,900 --> 00:01:46,540
how close to optimal it is.

33
00:01:46,540 --> 00:01:50,830
Maybe it's almost optimal, but
maybe it's really far away.

34
00:01:50,830 --> 00:01:54,920
And that's a big problem
with many greedy algorithms.

35
00:01:54,920 --> 00:01:58,660
There are some very
sophisticated greedy algorithms

36
00:01:58,660 --> 00:02:02,200
we won't be looking at that
give you a bound on how good

37
00:02:02,200 --> 00:02:07,010
the approximation is, but
most of them don't do that.

38
00:02:07,010 --> 00:02:10,550
Last time we looked
at an alternative

39
00:02:10,550 --> 00:02:13,310
to a greedy algorithm
that was guaranteed

40
00:02:13,310 --> 00:02:14,540
to find the right solution.

41
00:02:14,540 --> 00:02:16,820
It was a brute force algorithm.

42
00:02:16,820 --> 00:02:19,040
The basic idea is simple--

43
00:02:19,040 --> 00:02:23,040
that you enumerate all
possible combinations of items,

44
00:02:23,040 --> 00:02:25,680
remove the combinations
whose total weight exceeds

45
00:02:25,680 --> 00:02:28,920
the allowable weight, and
then choose the winner

46
00:02:28,920 --> 00:02:32,400
from those that are remaining.
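The enumerate-filter-choose recipe just described can be sketched in Python. This is an illustrative reconstruction, not the course's actual code; items here are plain (value, weight) pairs:

```python
from itertools import combinations

def brute_force_knapsack(items, max_weight):
    """Enumerate every combination of items, discard those whose
    total weight exceeds max_weight, and return the best of the rest.
    Each item is a (value, weight) pair."""
    best_value, best_combo = 0, ()
    for r in range(len(items) + 1):
        for combo in combinations(items, r):
            weight = sum(w for _, w in combo)
            value = sum(v for v, _ in combo)
            if weight <= max_weight and value > best_value:
                best_value, best_combo = value, combo
    return best_value, best_combo
```

With three items this examines all 2**3 = 8 combinations, which is exactly why the approach gets expensive as the menu grows.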

47
00:02:32,400 --> 00:02:34,360
Now let's talk about
how to implement it.

48
00:02:34,360 --> 00:02:36,630
And the way I want to
implement it is using something

49
00:02:36,630 --> 00:02:39,360
called a search tree.

50
00:02:39,360 --> 00:02:42,310
There are lots of different
ways to implement it.

51
00:02:42,310 --> 00:02:44,180
In the second half
of today's lecture,

52
00:02:44,180 --> 00:02:45,950
you'll see why I
happen to choose

53
00:02:45,950 --> 00:02:48,780
this particular approach.

54
00:02:48,780 --> 00:02:52,010
So what is a search tree?

55
00:02:52,010 --> 00:02:56,810
A tree is, basically,
a kind of graph.

56
00:02:56,810 --> 00:03:00,000
And we'll hear much more
about graphs next week.

57
00:03:00,000 --> 00:03:04,450
But this is a simple form
where you have a root

58
00:03:04,450 --> 00:03:06,520
and then children of the root.

59
00:03:06,520 --> 00:03:09,310
In this particular
form, which we'll see,

60
00:03:09,310 --> 00:03:10,930
you have two children.

61
00:03:10,930 --> 00:03:12,250
So we start with the root.

62
00:03:15,220 --> 00:03:17,680
And then we look at
our list of elements

63
00:03:17,680 --> 00:03:21,430
to be considered
that we might take,

64
00:03:21,430 --> 00:03:24,700
and we look at the first
element in that list.

65
00:03:24,700 --> 00:03:28,660
And then we draw a
left branch, which

66
00:03:28,660 --> 00:03:32,590
shows the consequence of
choosing to take that element,

67
00:03:32,590 --> 00:03:36,250
and a right branch, which
shows the consequences of not

68
00:03:36,250 --> 00:03:37,210
taking that element.

69
00:03:39,980 --> 00:03:43,250
And then we consider
the second element,

70
00:03:43,250 --> 00:03:49,390
and so on and so forth, until we
get to the bottom of the tree.

71
00:03:49,390 --> 00:03:52,560
So by convention, the left
element will mean we took it,

72
00:03:52,560 --> 00:03:54,670
the right direction will
mean we didn't take it.

73
00:03:59,000 --> 00:04:02,930
And then we apply it recursively
to the non-leaf children.

74
00:04:02,930 --> 00:04:04,520
The leaf means we
get to the end,

75
00:04:04,520 --> 00:04:07,400
we've considered the last
element to be considered.

76
00:04:07,400 --> 00:04:10,130
Nothing else to think about.

77
00:04:10,130 --> 00:04:11,660
When we get to the
code, we'll see

78
00:04:11,660 --> 00:04:15,140
that, in addition to the
description being recursive,

79
00:04:15,140 --> 00:04:19,000
it's convenient to write
the code that way, too.

80
00:04:19,000 --> 00:04:21,250
And then finally,
we'll choose the node

81
00:04:21,250 --> 00:04:24,560
that has the highest value
that meets our constraints.

82
00:04:24,560 --> 00:04:26,950
So let's look at an example.

83
00:04:26,950 --> 00:04:29,560
My example is I have
my backpack that

84
00:04:29,560 --> 00:04:33,920
can hold a certain number
of calories if you will.

85
00:04:33,920 --> 00:04:37,360
And I'm choosing between, to
keep it small, a beer, a pizza,

86
00:04:37,360 --> 00:04:39,580
and a burger--

87
00:04:39,580 --> 00:04:43,320
three essential food groups.

88
00:04:43,320 --> 00:04:47,830
The first thing I explore on
the left is take the beer,

89
00:04:47,830 --> 00:04:50,350
and then I have the
pizza and the burger

90
00:04:50,350 --> 00:04:53,160
to continue to consider.

91
00:04:53,160 --> 00:04:56,520
I then say, all right,
let's take the pizza.

92
00:04:56,520 --> 00:04:58,170
Now I have just the burger.

93
00:04:58,170 --> 00:05:00,510
Now I take the burger.

94
00:05:00,510 --> 00:05:04,500
This way of traversing, or
generating, the tree

95
00:05:04,500 --> 00:05:07,980
is called leftmost, depth-first.

96
00:05:07,980 --> 00:05:11,900
So I go all the way down
to the bottom of the tree.

97
00:05:11,900 --> 00:05:14,680
I then back up a
level and say, all

98
00:05:14,680 --> 00:05:17,740
right, I'm now at the bottom.

99
00:05:17,740 --> 00:05:24,130
Let's go back and
see what happens

100
00:05:24,130 --> 00:05:29,210
if I make the other choice
one level up the tree.

101
00:05:29,210 --> 00:05:31,600
So I went up and
said, well, now let's

102
00:05:31,600 --> 00:05:37,000
see what happens if I
make a different decision,

103
00:05:37,000 --> 00:05:40,520
as in we didn't take the burger.

104
00:05:40,520 --> 00:05:41,780
And then I work my way--

105
00:05:41,780 --> 00:05:43,850
this is called backtracking--

106
00:05:43,850 --> 00:05:45,470
up another level.

107
00:05:45,470 --> 00:05:49,500
I now say, suppose, I didn't
take the piece of pizza.

108
00:05:49,500 --> 00:05:52,250
Now I have the beer,
and only the burger

109
00:05:52,250 --> 00:05:58,110
to think about, so
on and so forth,

110
00:05:58,110 --> 00:06:00,630
until I've generated
the whole tree.

111
00:06:00,630 --> 00:06:04,050
You'll notice it will always be
the case that the leftmost leaf

112
00:06:04,050 --> 00:06:08,800
of this tree has got all
the possible items in it,

113
00:06:08,800 --> 00:06:12,080
and the rightmost leaf none.

114
00:06:12,080 --> 00:06:14,960
And then I just check
which of these leaves

115
00:06:14,960 --> 00:06:19,180
meets the constraint
and what are the values.

116
00:06:19,180 --> 00:06:24,430
And if I compute the value
and the calories in each one,

117
00:06:24,430 --> 00:06:28,120
and if our constraint
was 750 calories,

118
00:06:28,120 --> 00:06:30,505
then I get to choose
the winner, which is--

119
00:06:33,356 --> 00:06:34,980
I guess, it's the
pizza and the burger.

120
00:06:34,980 --> 00:06:35,710
Is that right?

121
00:06:38,440 --> 00:06:45,810
The most value under 750.

122
00:06:45,810 --> 00:06:49,180
That's the way I go through.

123
00:06:49,180 --> 00:06:52,710
It's quite a
straightforward algorithm.

124
00:06:52,710 --> 00:06:56,350
And I don't know why we draw our
trees with the root at the top

125
00:06:56,350 --> 00:06:58,290
and the leaves at the bottom.

126
00:06:58,290 --> 00:07:00,960
My only conjecture is
computer scientists

127
00:07:00,960 --> 00:07:02,355
don't spend enough
time outdoors.

128
00:07:06,210 --> 00:07:09,870
Now let's think of the
computational complexity

129
00:07:09,870 --> 00:07:13,020
of this process.

130
00:07:13,020 --> 00:07:15,960
The time is going to be based
on the total number of nodes

131
00:07:15,960 --> 00:07:18,100
we generate.

132
00:07:18,100 --> 00:07:21,550
So if we know the number of
nodes that are in the tree,

133
00:07:21,550 --> 00:07:24,160
we then know the complexity
of the algorithm,

134
00:07:24,160 --> 00:07:27,330
the asymptotic complexity.

135
00:07:27,330 --> 00:07:31,460
Well, how many levels
do we have in the tree?

136
00:07:31,460 --> 00:07:33,920
Just the number of items, right?

137
00:07:33,920 --> 00:07:35,660
Because at each
level of the tree

138
00:07:35,660 --> 00:07:39,040
we're deciding to take
or not to take an item.

139
00:07:39,040 --> 00:07:43,210
And so we can only do that for
the number of items we have.

140
00:07:43,210 --> 00:07:46,840
So if we go back, for example,
and we look at the tree--

141
00:07:46,840 --> 00:07:50,420
not that tree, that tree--

142
00:07:50,420 --> 00:07:52,590
and we count the
number of levels,

143
00:07:52,590 --> 00:07:56,600
it's going to be based upon
the total number of items.

144
00:07:56,600 --> 00:07:59,150
We know that because if you
look at, say, the leftmost node

145
00:07:59,150 --> 00:08:04,160
at the bottom, we've made
three separate decisions.

146
00:08:04,160 --> 00:08:08,660
So counting the
root, it's n plus 1.

147
00:08:08,660 --> 00:08:10,940
But we don't care
about plus 1 when we're

148
00:08:10,940 --> 00:08:14,890
doing asymptotic complexity.

149
00:08:14,890 --> 00:08:19,310
So that tells us how many
levels we have in the tree.

150
00:08:19,310 --> 00:08:23,030
The next question we
need to ask is, how many

151
00:08:23,030 --> 00:08:26,170
nodes are there at each level?

152
00:08:26,170 --> 00:08:30,580
And you can look at this
and see-- the deeper we go,

153
00:08:30,580 --> 00:08:34,590
the more nodes we
have at each level.

154
00:08:34,590 --> 00:08:39,340
In fact, if we come
here, we can see

155
00:08:39,340 --> 00:08:41,970
that the number of
nodes at level i--

156
00:08:41,970 --> 00:08:46,650
depth i of the
tree-- is 2 to the i.

157
00:08:46,650 --> 00:08:48,840
That makes sense if you
remember last time we

158
00:08:48,840 --> 00:08:50,970
looked at binary numbers.

159
00:08:50,970 --> 00:08:53,550
We're saying we're representing
our choices as either 0

160
00:08:53,550 --> 00:08:55,560
or 1 for what we take.

161
00:08:55,560 --> 00:08:58,230
If we have n items
to choose from,

162
00:08:58,230 --> 00:09:00,240
then the number of
possible choices

163
00:09:00,240 --> 00:09:03,350
is 2 to the n, the
size of the powerset.

164
00:09:03,350 --> 00:09:05,865
So that will tell us the
number of nodes at each level.

165
00:09:09,550 --> 00:09:13,470
So if there are n items, the
number of nodes in the tree

166
00:09:13,470 --> 00:09:18,510
is going to be the sum
from 0 to n of 2 to the i

167
00:09:18,510 --> 00:09:21,450
because we have
that many levels.

168
00:09:21,450 --> 00:09:23,850
And if you've studied
a little math,

169
00:09:23,850 --> 00:09:28,400
you know that's essentially
2 to the n plus 1.

170
00:09:28,400 --> 00:09:30,980
Or if you do what I do,
you look it up in Wikipedia

171
00:09:30,980 --> 00:09:34,660
and you know it's
2 to the n plus 1.
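To be exact, the sum of 2 to the i for i from 0 to n is 2**(n + 1) - 1; the "2 to the n plus 1" figure is that total up to the constant the asymptotic analysis ignores. A quick sanity check:

```python
def total_nodes(n):
    """Total nodes in a full binary search tree with n + 1 levels:
    the sum of 2**i for i = 0..n, which equals 2**(n + 1) - 1."""
    return sum(2 ** i for i in range(n + 1))
```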

172
00:09:34,660 --> 00:09:37,210
Now, there's an
obvious optimization.

173
00:09:37,210 --> 00:09:41,320
We don't need to
explore the whole tree.

174
00:09:41,320 --> 00:09:45,760
If we get to a point where
the backpack is overstuffed,

175
00:09:45,760 --> 00:09:48,640
there's no point in saying,
should we take this next item?

176
00:09:48,640 --> 00:09:50,920
Because we know we can't.

177
00:09:50,920 --> 00:09:53,320
I generated a bunch
of leaves that

178
00:09:53,320 --> 00:09:57,820
were useless because
the weight was too high.

179
00:09:57,820 --> 00:10:01,660
So you could always
abort early and say, oh,

180
00:10:01,660 --> 00:10:05,320
no point in generating the
rest of this part of the tree

181
00:10:05,320 --> 00:10:09,320
because we know everything
in it will be too heavy.

182
00:10:09,320 --> 00:10:13,460
Adding something cannot
reduce the weight.

183
00:10:13,460 --> 00:10:14,780
It's a nice optimization.

184
00:10:14,780 --> 00:10:17,820
It's one you'll see we
actually do in the code.

185
00:10:17,820 --> 00:10:20,750
But it really doesn't
change the complexity.

186
00:10:20,750 --> 00:10:24,260
It's not going to change
the worst-case complexity.

187
00:10:28,830 --> 00:10:32,390
Exponential, as we saw--
I think, in Eric's lecture,

188
00:10:32,390 --> 00:10:33,750
is a big number.

189
00:10:33,750 --> 00:10:36,150
You don't usually
like 2 to the n.

190
00:10:36,150 --> 00:10:39,300
Does this mean that brute
force is never useful?

191
00:10:39,300 --> 00:10:40,620
Well, let's give it a try.

192
00:10:43,675 --> 00:10:44,675
We'll look at some code.

193
00:10:48,440 --> 00:10:49,770
Here is the implementation.

194
00:10:57,850 --> 00:11:02,400
So it's maxVal,
toConsider, and avail.

195
00:11:02,400 --> 00:11:08,790
And then we say, if toConsider
is empty or avail is 0--

196
00:11:08,790 --> 00:11:11,700
toConsider is a list, we're going
to go through the list using

197
00:11:11,700 --> 00:11:14,580
that to tell us
whether or not we still

198
00:11:14,580 --> 00:11:17,170
have an element to consider--

199
00:11:17,170 --> 00:11:22,610
then the result will be the
tuple 0 and the empty tuple.

200
00:11:25,280 --> 00:11:26,420
We couldn't take anything.

201
00:11:26,420 --> 00:11:29,530
This is the base
of our recursion.

202
00:11:29,530 --> 00:11:32,510
Either there's nothing
left to consider or there's

203
00:11:32,510 --> 00:11:34,250
no available weight--

204
00:11:34,250 --> 00:11:35,630
avail, the
amount of weight,

205
00:11:35,630 --> 00:11:39,410
is 0 or toConsider is empty.

206
00:11:39,410 --> 00:11:42,830
Well, if neither
of those is true,

207
00:11:42,830 --> 00:11:45,850
then we ask whether
toConsider[0],

208
00:11:45,850 --> 00:11:48,460
the first element to look at.

209
00:11:48,460 --> 00:11:51,670
Is that cost greater
than availability?

210
00:11:54,740 --> 00:11:58,960
If it is, we don't need to
explore the left branch.

211
00:11:58,960 --> 00:12:01,270
because it means we can't
afford to put that thing

212
00:12:01,270 --> 00:12:03,490
in the backpack, the knapsack.

213
00:12:03,490 --> 00:12:05,350
There's just no room for it.

214
00:12:05,350 --> 00:12:08,530
So we'll explore the
right branch only.

215
00:12:08,530 --> 00:12:13,390
The result will be whatever the
maximum value is of toConsider

216
00:12:13,390 --> 00:12:17,530
of the remainder of the list--
the list with the first element

217
00:12:17,530 --> 00:12:18,900
sliced off--

218
00:12:18,900 --> 00:12:22,070
and availability unchanged.

219
00:12:22,070 --> 00:12:24,480
So it's a recursive
implementation, saying,

220
00:12:24,480 --> 00:12:27,930
now we only have to consider
the right branch of the tree

221
00:12:27,930 --> 00:12:29,910
because we knew we
couldn't take this element.

222
00:12:29,910 --> 00:12:32,490
It just weighs too
much, or costs too much,

223
00:12:32,490 --> 00:12:36,000
or was too fattening,
in my case.

224
00:12:36,000 --> 00:12:41,210
Otherwise, we now have to
consider both branches.

225
00:12:41,210 --> 00:12:46,000
So we'll set next item to
toConsider of 0, the first one,

226
00:12:46,000 --> 00:12:47,270
and explore the left branch.

227
00:12:51,040 --> 00:12:53,980
On this branch, there
are two possibilities

228
00:12:53,980 --> 00:13:01,280
to think about, which I'm
calling withVal and withToTake.

229
00:13:01,280 --> 00:13:05,770
So I'm going to call maxVal
of toConsider of everything

230
00:13:05,770 --> 00:13:12,310
except the current element and
pass in an available weight

231
00:13:12,310 --> 00:13:15,700
of avail minus whatever--

232
00:13:15,700 --> 00:13:20,010
well, let me widen this so
we can see the whole code.

233
00:13:23,986 --> 00:13:28,040
This is not going to let me
widen this window any more.

234
00:13:28,040 --> 00:13:28,800
Shame on it.

235
00:13:28,800 --> 00:13:30,591
Let me see if I can
get rid of the console.

236
00:13:37,060 --> 00:13:38,770
Well, we'll have
to do this instead.

237
00:13:45,190 --> 00:13:48,690
So we're going to call
maxVal with everything

238
00:13:48,690 --> 00:13:51,300
except the current
element and give it

239
00:13:51,300 --> 00:13:58,200
avail minus the cost of that
next item of toConsider sub 0.

240
00:13:58,200 --> 00:14:01,050
Because we know that the
availability, available weight

241
00:14:01,050 --> 00:14:03,130
has to have that cost
subtracted from it.

242
00:14:09,160 --> 00:14:18,410
And then we'll add to withVal
next item dot getValue.

243
00:14:18,410 --> 00:14:22,200
So that's a value
if we do take it.

244
00:14:22,200 --> 00:14:23,950
Then we'll explore the
right branch-- what

245
00:14:23,950 --> 00:14:25,270
happens if we don't take it?

246
00:14:27,939 --> 00:14:29,605
And then we'll choose
the better branch.

247
00:14:33,670 --> 00:14:36,910
So it's a pretty simple
recursive algorithm.

248
00:14:36,910 --> 00:14:40,180
We just go all the
way to the bottom

249
00:14:40,180 --> 00:14:42,250
and make the right
choice at the bottom,

250
00:14:42,250 --> 00:14:46,690
and then percolate back up, like
so many recursive algorithms.
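Piecing the walkthrough together, the routine looks roughly like this. This is a reconstruction from the description, not the course's exact code: the real version uses Food objects with getValue and getCost methods, while plain (value, cost) pairs stand in here:

```python
def maxVal(toConsider, avail):
    """Best (total value, tuple of items) achievable from the items
    in toConsider with avail weight still available.  Items are
    (value, cost) pairs standing in for the course's Food objects."""
    if toConsider == [] or avail == 0:
        # Base case: nothing left to consider, or no room left.
        result = (0, ())
    elif toConsider[0][1] > avail:
        # First item can't fit: explore the right branch only.
        result = maxVal(toConsider[1:], avail)
    else:
        nextItem = toConsider[0]
        # Left branch: take nextItem, reducing the available weight.
        withVal, withToTake = maxVal(toConsider[1:], avail - nextItem[1])
        withVal += nextItem[0]
        # Right branch: don't take nextItem.
        withoutVal, withoutToTake = maxVal(toConsider[1:], avail)
        # Choose the better branch.
        if withVal > withoutVal:
            result = (withVal, withToTake + (nextItem,))
        else:
            result = (withoutVal, withoutToTake)
    return result
```

Since it returns a (value, items) pair, something like maxVal(menu, 750)[0] would be the best attainable value under a 750-calorie constraint.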

251
00:14:52,680 --> 00:14:54,570
We have a simple
program to test it.

252
00:15:02,414 --> 00:15:04,580
I better start a console
now if I'm going to run it.

253
00:15:12,190 --> 00:15:14,870
And we'll testGreedys on foods.

254
00:15:14,870 --> 00:15:18,790
Well, we'll testGreedys
and then we'll testMaxVal.

255
00:15:18,790 --> 00:15:20,710
So I'm building
the same thing we

256
00:15:20,710 --> 00:15:23,920
did in Monday's
lecture, the same menu.

257
00:15:23,920 --> 00:15:25,540
And I'll run the
same testGreedys

258
00:15:25,540 --> 00:15:27,460
we looked at last time.

259
00:15:27,460 --> 00:15:31,030
And we'll see whether or not
we get something better when

260
00:15:31,030 --> 00:15:32,770
we run the truly optimal one.

261
00:15:41,260 --> 00:15:43,240
Well, indeed we do.

262
00:15:43,240 --> 00:15:45,640
You remember that last
time and, fortunately,

263
00:15:45,640 --> 00:15:52,150
this time too, the best
we did was a value of 318.

264
00:15:52,150 --> 00:15:56,470
But now we see we can
actually get to 353 if we use

265
00:15:56,470 --> 00:16:00,600
the truly optimal algorithm.

266
00:16:00,600 --> 00:16:05,110
So we see it ran pretty
quickly and actually

267
00:16:05,110 --> 00:16:10,420
gave us a better answer than we
got from the greedy algorithm.

268
00:16:10,420 --> 00:16:12,760
And that's often the case.

269
00:16:12,760 --> 00:16:14,680
If I have time at the
end, I'll show you

270
00:16:14,680 --> 00:16:16,360
an optimization
program you might

271
00:16:16,360 --> 00:16:21,220
want to run that works
perfectly fine to use

272
00:16:21,220 --> 00:16:24,310
this kind of brute
force algorithm on.

273
00:16:24,310 --> 00:16:28,230
Let's go back to the PowerPoint.

274
00:16:28,230 --> 00:16:31,560
So I'm just going through
the code again we just ran.

275
00:16:31,560 --> 00:16:35,220
This was the header we saw--

276
00:16:35,220 --> 00:16:37,700
toConsider, as the
items that correspond

277
00:16:37,700 --> 00:16:42,200
to nodes higher up the
tree, and avail, as I said,

278
00:16:42,200 --> 00:16:44,619
the amount of space.

279
00:16:44,619 --> 00:16:46,410
And again, here's what
the body of the code

280
00:16:46,410 --> 00:16:49,035
looked like, I took
out the comments.

281
00:16:51,364 --> 00:16:53,530
One of the things you might
think about in your head

282
00:16:53,530 --> 00:16:57,190
when you look at this code is
putting the comments back in.

283
00:16:57,190 --> 00:16:59,410
I always find that for
me a really good way

284
00:16:59,410 --> 00:17:04,060
to understand code that I didn't
write is to try and comment it.

285
00:17:04,060 --> 00:17:06,579
And that helps me sort of
force myself to think about

286
00:17:06,579 --> 00:17:09,069
what is it really doing.

287
00:17:09,069 --> 00:17:12,099
So you'll have both versions--
you'll have the PowerPoint

288
00:17:12,099 --> 00:17:15,010
version without the
comments and the actual code

289
00:17:15,010 --> 00:17:16,900
with the comments.

290
00:17:16,900 --> 00:17:18,579
You can think about
looking at this

291
00:17:18,579 --> 00:17:20,710
and then looking
at the real code

292
00:17:20,710 --> 00:17:23,465
and making sure that
your understanding jibes.

293
00:17:28,000 --> 00:17:30,530
I should point out that
this doesn't actually

294
00:17:30,530 --> 00:17:33,510
build the search tree.

295
00:17:33,510 --> 00:17:42,660
We've got this local variable
result, starting here,

296
00:17:42,660 --> 00:17:48,050
that records the best
solution found so far.

297
00:17:48,050 --> 00:17:51,210
So it's not the picture I drew
where I generate all the nodes

298
00:17:51,210 --> 00:17:53,090
and then I inspect them.

299
00:17:53,090 --> 00:17:54,200
I just keep track--

300
00:17:54,200 --> 00:17:57,890
as I generate a node, I
say, how good is this?

301
00:17:57,890 --> 00:17:59,970
Is it better than the
best I've found so far?

302
00:17:59,970 --> 00:18:03,180
If so, it becomes the new best.

303
00:18:03,180 --> 00:18:07,000
And I can do that because
every node I generate

304
00:18:07,000 --> 00:18:12,610
is, in some sense, a legal
solution to the problem.

305
00:18:12,610 --> 00:18:17,200
Probably rarely is it the
final optimal solution

306
00:18:17,200 --> 00:18:19,030
but it's at least
a legal solution.

307
00:18:19,030 --> 00:18:20,530
And so if it's
better than something

308
00:18:20,530 --> 00:18:25,830
we saw before, we can
make it the new best.

309
00:18:25,830 --> 00:18:26,920
This is very common.

310
00:18:26,920 --> 00:18:29,310
And this is, in fact, what
most people do

311
00:18:29,310 --> 00:18:31,350
when they use a search tree--

312
00:18:31,350 --> 00:18:33,900
they don't actually
build the tree

313
00:18:33,900 --> 00:18:36,660
in the pictorial way
we've looked at it

314
00:18:36,660 --> 00:18:39,932
but play some trick like
this of just keeping

315
00:18:39,932 --> 00:18:40,890
track of their results.

316
00:18:44,610 --> 00:18:45,825
Any questions about this?

317
00:18:50,290 --> 00:18:52,210
All right.

318
00:18:52,210 --> 00:18:57,110
We did just try it on
the example from lecture 1.

319
00:18:57,110 --> 00:18:59,000
And we saw that it worked great.

320
00:18:59,000 --> 00:19:00,980
It gave us a better answer.

321
00:19:00,980 --> 00:19:03,320
It finished quickly.

322
00:19:03,320 --> 00:19:06,890
But we should not take too
much solace from the fact

323
00:19:06,890 --> 00:19:10,160
that it finished quickly
because 2 to the eighth

324
00:19:10,160 --> 00:19:11,750
is actually a
pretty tiny number.

325
00:19:16,160 --> 00:19:19,760
Almost any algorithm is fine
when I'm working on something

326
00:19:19,760 --> 00:19:21,260
this small.

327
00:19:21,260 --> 00:19:24,140
Let's look now at what happens
if we have a bigger menu.

328
00:19:28,870 --> 00:19:33,130
Here is some code
to do a bigger menu.

329
00:19:33,130 --> 00:19:37,440
Since, as you will discover
if you haven't already,

330
00:19:37,440 --> 00:19:39,910
I'm a pretty lazy
person, I didn't

331
00:19:39,910 --> 00:19:43,570
want to write out a menu with
100 items or even 50 items.

332
00:19:43,570 --> 00:19:46,690
So I wrote some code
to generate the menus.

333
00:19:46,690 --> 00:19:50,350
And I used randomness
to do that.

334
00:19:50,350 --> 00:19:52,810
This is a Python
library we'll be

335
00:19:52,810 --> 00:19:57,950
using a lot for the
rest of the semester.

336
00:19:57,950 --> 00:20:04,250
It's used any time you want
to generate things at random

337
00:20:04,250 --> 00:20:05,750
and do many other things.

338
00:20:05,750 --> 00:20:07,430
We'll come back to it a lot.

339
00:20:07,430 --> 00:20:10,940
Here we're just going to
use a very small part of it.

340
00:20:14,210 --> 00:20:19,472
To build a large menu
of some numItems--

341
00:20:19,472 --> 00:20:21,180
and we're going to
give the maximum value

342
00:20:21,180 --> 00:20:25,220
and the maximum
cost for each item.

343
00:20:25,220 --> 00:20:30,420
We'll assume the minimum
is, in this case, 1.

344
00:20:30,420 --> 00:20:32,010
Items will start empty.

345
00:20:32,010 --> 00:20:35,630
And then for i in
range number of items,

346
00:20:35,630 --> 00:20:41,690
I'm going to call this function
random dot randint that

347
00:20:41,690 --> 00:20:46,780
takes a range of integers from
1 to, actually in this case,

348
00:20:46,780 --> 00:20:52,790
maxVal minus 1, or 1 to
maxVal, actually, in this case.

349
00:20:52,790 --> 00:20:55,820
And it just chooses
one of them at random.

350
00:20:55,820 --> 00:20:59,050
So when you run this, you don't
know what it's going to get.

351
00:20:59,050 --> 00:21:03,040
Random dot randint might
return 1, it might return 23,

352
00:21:03,040 --> 00:21:04,779
it might return 54.

353
00:21:04,779 --> 00:21:06,820
The only thing you know
is it will be an integer.

354
00:21:09,640 --> 00:21:13,980
And then I'm going to build
menus ranging from 5 items

355
00:21:13,980 --> 00:21:14,920
to 60 items--

356
00:21:19,760 --> 00:21:29,720
buildLargeMenu, the number
of items, with maxVal of 90

357
00:21:29,720 --> 00:21:35,250
and a maxCost of 250,
pleasure and calories.

358
00:21:35,250 --> 00:21:39,330
And then I'm going to test
maxVal on each of these menus.

359
00:21:43,050 --> 00:21:46,590
So building menus of
various sizes at random

360
00:21:46,590 --> 00:21:52,220
and then just trying to find the
optimal value for each of them.

361
00:21:52,220 --> 00:21:53,240
Let's look at the code.

362
00:21:56,610 --> 00:22:03,600
Let's comment this out, we
don't need to run that again.

363
00:22:10,432 --> 00:22:13,430
So we'll build a
large menu and then

364
00:22:13,430 --> 00:22:16,010
we'll try it for a bunch of
items and see what we get.

365
00:22:29,440 --> 00:22:30,990
So it's going along.

366
00:22:30,990 --> 00:22:34,720
Trying the menu up to
30 went pretty quickly.

367
00:22:34,720 --> 00:22:38,010
So even 2 to the 30
didn't take too long.

368
00:22:38,010 --> 00:22:41,530
But you might notice it's kind
of bogging down once we got to 35.

369
00:22:46,250 --> 00:22:48,120
I guess, I could ask
the question now--

370
00:22:48,120 --> 00:22:51,290
it was one of the questions
I was going to ask as a poll

371
00:22:51,290 --> 00:22:53,660
but maybe I won't bother--

372
00:22:53,660 --> 00:22:55,592
how much patience do we have?

373
00:22:55,592 --> 00:22:57,800
When do you think we'll run
out of patience and quit?

374
00:23:03,860 --> 00:23:05,690
If you're out of
patience, raise your hand.

375
00:23:08,640 --> 00:23:11,600
Well, some of you are way
more patient than I am.

376
00:23:11,600 --> 00:23:12,850
So we're going to quit anyway.

377
00:23:18,740 --> 00:23:20,240
We were trying to do 40.

378
00:23:20,240 --> 00:23:22,820
It might have finished 40, 45.

379
00:23:22,820 --> 00:23:26,480
I've never waited long
enough to get to 45.

380
00:23:26,480 --> 00:23:27,740
It just is too long.

381
00:23:33,100 --> 00:23:35,510
That raises the
question, is it hopeless?

382
00:23:41,500 --> 00:23:43,750
And in theory, yes.

383
00:23:43,750 --> 00:23:46,840
As I mentioned last time, it
is an inherently exponential

384
00:23:46,840 --> 00:23:48,460
problem.

385
00:23:48,460 --> 00:23:50,590
The answer is-- in practice, no.

386
00:23:50,590 --> 00:23:54,920
Because there's something
called dynamic programming,

387
00:23:54,920 --> 00:23:59,200
which was invented by a
fellow at the RAND Corporation

388
00:23:59,200 --> 00:24:02,230
called Richard Bellman,
a rather remarkable

389
00:24:02,230 --> 00:24:05,500
mathematician/computer
scientist.

390
00:24:05,500 --> 00:24:07,720
He wrote a whole book
on it, but I'm not sure

391
00:24:07,720 --> 00:24:09,640
why because it's not
that complicated.

392
00:24:14,140 --> 00:24:17,020
When we talk about
dynamic programming,

393
00:24:17,020 --> 00:24:20,860
it's a kind of a funny
story, at least to me.

394
00:24:20,860 --> 00:24:23,550
I learned it and I
didn't know anything

395
00:24:23,550 --> 00:24:24,550
about the history of it.

396
00:24:24,550 --> 00:24:28,180
And I've had all sorts
of theories about why it

397
00:24:28,180 --> 00:24:30,670
was called dynamic programming.

398
00:24:30,670 --> 00:24:35,260
You know how it is, how people
try and fit a theory to data.

399
00:24:35,260 --> 00:24:37,000
And then I read a
history book about it,

400
00:24:37,000 --> 00:24:39,370
and this was Bellman's
own description

401
00:24:39,370 --> 00:24:43,890
of why he called it
dynamic programming.

402
00:24:43,890 --> 00:24:45,630
And it turned out,
as you can see,

403
00:24:45,630 --> 00:24:48,960
he basically chose a name
because it was a description

404
00:24:48,960 --> 00:24:51,430
that didn't mean anything.

405
00:24:51,430 --> 00:24:55,330
Because he was doing
mathematics, and at the time

406
00:24:55,330 --> 00:24:58,180
he was being funded by a part
of the Defense Department

407
00:24:58,180 --> 00:25:00,790
that didn't approve
of mathematics.

408
00:25:00,790 --> 00:25:04,060
And he wanted to
conceal that fact.

409
00:25:04,060 --> 00:25:08,410
And indeed at the time, the
head of Defense Appropriations

410
00:25:08,410 --> 00:25:12,239
in the US Congress didn't
much like mathematics.

411
00:25:12,239 --> 00:25:13,780
And he didn't want

412
00:25:13,780 --> 00:25:17,530
to have to go and testify and
tell people he was doing math.

413
00:25:17,530 --> 00:25:19,570
So he just invented
something that no one

414
00:25:19,570 --> 00:25:21,400
would know what it meant.

415
00:25:21,400 --> 00:25:24,400
And years of students
spent time later trying

416
00:25:24,400 --> 00:25:27,620
to figure out what
it actually did mean.

417
00:25:27,620 --> 00:25:30,280
Anyway, what's the basic idea?

418
00:25:30,280 --> 00:25:34,870
To understand it I want
to temporarily abandon

419
00:25:34,870 --> 00:25:39,250
the knapsack problem and look
at a much simpler problem--

420
00:25:39,250 --> 00:25:40,225
Fibonacci numbers.

421
00:25:42,880 --> 00:25:46,630
You've seen this already, with
cute little bunnies, I think,

422
00:25:46,630 --> 00:25:49,600
when you saw it.

423
00:25:49,600 --> 00:25:51,980
N equals 0, n
equals 1-- return 1.

424
00:25:51,980 --> 00:25:57,220
Otherwise, fib of n minus
1 plus fib of n minus 2.
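[That recursive definition, sketched in Python, with both base cases returning 1 as in the version used in this course:]

```python
def fib(n):
    """Naive recursive Fibonacci: fib(0) == fib(1) == 1."""
    if n == 0 or n == 1:
        return 1
    # Each call spawns two more recursive calls,
    # so the running time grows exponentially with n.
    return fib(n - 1) + fib(n - 2)
```

[Even modest arguments, say fib(35), already take noticeably long with this version.]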

425
00:25:57,220 --> 00:25:59,560
And as I think you saw
when you first saw it,

426
00:25:59,560 --> 00:26:03,970
it takes a long time to run.

427
00:26:03,970 --> 00:26:08,030
Fib of 120, for example,
is a very big number.

428
00:26:08,030 --> 00:26:12,410
It's shocking how
quickly Fibonacci grows.

429
00:26:12,410 --> 00:26:20,890
So let's think about
implementing it.

430
00:26:20,890 --> 00:26:22,459
If we run Fibonacci--

431
00:26:22,459 --> 00:26:23,750
well, maybe we'll just do that.

432
00:26:37,030 --> 00:26:39,464
So here is fib of n,
let's just try running it.

433
00:26:39,464 --> 00:26:41,130
And again, we'll test
people's patience.

434
00:26:54,140 --> 00:26:55,970
We'll see how long
we're letting it run.

435
00:26:55,970 --> 00:26:59,240
I'm going to try for
i in the range of 121.

436
00:26:59,240 --> 00:27:02,433
We'll print fib of i.

437
00:27:09,840 --> 00:27:11,205
Comes clumping along.

438
00:27:14,170 --> 00:27:16,970
It slows down pretty quickly.

439
00:27:16,970 --> 00:27:18,910
And if you look at it,
it's kind of surprising

440
00:27:18,910 --> 00:27:21,190
it's this slow because these
numbers aren't that big.

441
00:27:24,090 --> 00:27:25,800
These are not enormous numbers.

442
00:27:25,800 --> 00:27:28,320
Fib of 35 is not a huge number.

443
00:27:28,320 --> 00:27:32,120
Yet it took a long
time to compute.

444
00:27:32,120 --> 00:27:34,280
So you have the numbers
growing pretty quickly

445
00:27:34,280 --> 00:27:38,150
but the computation, actually,
seems to be growing faster

446
00:27:38,150 --> 00:27:40,540
than the results.

447
00:27:40,540 --> 00:27:41,160
We're at 37.

448
00:27:44,380 --> 00:27:48,250
It's going to get slower and
slower, even though our numbers

449
00:27:48,250 --> 00:27:51,650
are not that big.

450
00:27:51,650 --> 00:27:53,870
The question is,
what's going on?

451
00:27:53,870 --> 00:27:57,440
Why is it taking so
long for Fibonacci

452
00:27:57,440 --> 00:28:00,790
to compute these results?

453
00:28:00,790 --> 00:28:13,580
Well, let's call it and
look at the question.

454
00:28:13,580 --> 00:28:18,560
And to do that I want to
look at the call tree.

455
00:28:18,560 --> 00:28:23,360
This is for Fibonacci
of 6, which is only 13,

456
00:28:23,360 --> 00:28:25,630
which, I think, most
of us would agree

457
00:28:25,630 --> 00:28:27,980
was not a very big number.

458
00:28:27,980 --> 00:28:30,200
And let's look
what's going on here.

459
00:28:33,510 --> 00:28:35,750
If you look at this,
what in some sense

460
00:28:35,750 --> 00:28:39,410
seems really stupid about it?

461
00:28:39,410 --> 00:28:44,120
What is it doing that a
rational person would not want

462
00:28:44,120 --> 00:28:45,350
to do if they could avoid it?

463
00:28:52,740 --> 00:28:55,320
It's bad enough to
do something once.

464
00:28:55,320 --> 00:28:57,780
But to do the same thing
over and over again

465
00:28:57,780 --> 00:29:00,990
is really wasteful.

466
00:29:00,990 --> 00:29:04,020
And if we look at this,
we'll see, for example,

467
00:29:04,020 --> 00:29:07,440
that fib 4 is being
computed here,

468
00:29:07,440 --> 00:29:11,350
and fib 4 is being
computed here.

469
00:29:11,350 --> 00:29:16,015
Fib 3 is being computed
here, and here, and here.

470
00:29:19,190 --> 00:29:21,980
And do you think we'll get
a different answer for fib 3

471
00:29:21,980 --> 00:29:24,480
in one place when we get
it in the other place?

472
00:29:24,480 --> 00:29:27,230
You sure hope not.

473
00:29:27,230 --> 00:29:33,160
So you think, well, what
should we do about this?

474
00:29:33,160 --> 00:29:36,690
How would we go about avoiding
doing the same work over

475
00:29:36,690 --> 00:29:38,540
and over again?

476
00:29:38,540 --> 00:29:40,160
And there's kind of
an obvious answer,

477
00:29:40,160 --> 00:29:43,600
and that answer is at the
heart of dynamic programming.

478
00:29:43,600 --> 00:29:46,760
What's the answer?

479
00:29:46,760 --> 00:29:50,040
AUDIENCE: [INAUDIBLE]

480
00:29:50,040 --> 00:29:51,467
JOHN GUTTAG: Exactly.

481
00:29:51,467 --> 00:29:53,550
And I'm really happy that
someone in the front row

482
00:29:53,550 --> 00:29:57,990
answered the question because
I can throw it that far.

483
00:29:57,990 --> 00:30:03,660
You store the answer and then
look it up when you need it.

484
00:30:03,660 --> 00:30:06,580
Because we know that we can
look things up very quickly.

485
00:30:09,100 --> 00:30:12,950
Dictionary, despite what
Eric said in his lecture,

486
00:30:12,950 --> 00:30:17,510
almost all the time
works in constant time

487
00:30:17,510 --> 00:30:20,960
if you make it big enough,
and it usually is in Python.

488
00:30:20,960 --> 00:30:25,280
We'll see later in the
term how to do that trick.

489
00:30:25,280 --> 00:30:30,520
So you store it and then you'd
never have to compute it again.

490
00:30:30,520 --> 00:30:34,900
And that's the basic trick
behind dynamic programming.

491
00:30:34,900 --> 00:30:41,940
And it's something
called memoization,

492
00:30:41,940 --> 00:30:44,890
as in you create a memo and
you store it in the memo.

493
00:30:48,400 --> 00:30:50,620
So we see this here.

494
00:30:50,620 --> 00:30:56,150
Notice that what we're doing
is trading time for space.

495
00:30:56,150 --> 00:31:05,300
It takes some space to store
the old results, but negligible

496
00:31:05,300 --> 00:31:08,750
relative to the time we save.

497
00:31:08,750 --> 00:31:10,130
So here's the trick.

498
00:31:10,130 --> 00:31:13,530
We're going to create a table
to record what we've done.

499
00:31:13,530 --> 00:31:16,380
And then before
computing fib of x,

500
00:31:16,380 --> 00:31:20,370
we'll check if the value
has already been computed.

501
00:31:20,370 --> 00:31:22,920
If so, we just look
it up and return it.

502
00:31:22,920 --> 00:31:25,000
Otherwise, we'll compute it--

503
00:31:25,000 --> 00:31:27,000
it's the first time-- and
store it in the table.

504
00:31:31,560 --> 00:31:36,190
Here is a fast implementation
of Fibonacci that does that.

505
00:31:36,190 --> 00:31:38,980
It looks like the
old one, except it's

506
00:31:38,980 --> 00:31:41,410
got an extra argument--

507
00:31:41,410 --> 00:31:45,350
memo-- which is a dictionary.

508
00:31:45,350 --> 00:31:47,960
The first time we call it,
the memo will be empty.

509
00:31:52,320 --> 00:31:57,240
It tries to return
the value in the memo.

510
00:31:57,240 --> 00:32:01,080
If it's not there, an exception
will get raised, we know that.

511
00:32:01,080 --> 00:32:05,180
And it will branch to
here, compute the result,

512
00:32:05,180 --> 00:32:11,190
and then store it in
the memo and return it.

513
00:32:11,190 --> 00:32:13,260
It's the same old
recursive thing

514
00:32:13,260 --> 00:32:16,920
we did before but with the memo.

515
00:32:16,920 --> 00:32:20,190
Notice, by the way, that
I'm using exceptions

516
00:32:20,190 --> 00:32:22,350
not as an error
handling mechanism,

517
00:32:22,350 --> 00:32:26,270
really, but just as
a flow of control.

518
00:32:26,270 --> 00:32:29,510
To me, this is cleaner than
writing code that says,

519
00:32:29,510 --> 00:32:34,440
if this is in the keys, then
do this, otherwise, do that.

520
00:32:34,440 --> 00:32:37,440
It's slightly fewer lines of
code, and for me, at least,

521
00:32:37,440 --> 00:32:41,050
easier to read to use try-except
for this sort of thing.
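[A sketch of that memoized version -- the extra memo argument and the try-except structure follow the description above, though the code in the course handout may differ in small details:]

```python
def fastFib(n, memo={}):
    """Fibonacci with memoization.

    memo maps previously seen arguments to their results; the default
    empty dict is filled in as the recursive calls share it.
    """
    if n == 0 or n == 1:
        return 1
    try:
        # If fib(n) was computed before, this lookup succeeds.
        return memo[n]
    except KeyError:
        # First time we've seen n: compute it, record it, return it.
        result = fastFib(n - 1, memo) + fastFib(n - 2, memo)
        memo[n] = result
        return result
```

[With the memo, each distinct argument is computed only once, so fastFib(120) returns essentially instantly.]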

522
00:32:44,690 --> 00:32:48,930
Let's see what happens
if we run this one.

523
00:32:48,930 --> 00:33:02,810
Get rid of the slow fib
and we'll run fastFib.

524
00:33:17,780 --> 00:33:20,240
Wow.

525
00:33:20,240 --> 00:33:25,960
We're already done with fib 120.

526
00:33:25,960 --> 00:33:28,980
Pretty amazing, considering last
time we got stuck around 40.

527
00:33:31,890 --> 00:33:35,700
It really works, this
memoization trick.

528
00:33:35,700 --> 00:33:37,440
An enormous difference.

529
00:33:47,940 --> 00:33:49,560
When can you use it?

530
00:33:49,560 --> 00:33:53,010
It's not that memoization
is a magic bullet that

531
00:33:53,010 --> 00:33:54,480
will solve all problems.

532
00:33:58,710 --> 00:34:01,800
But for the problems it
can help with, it really

533
00:34:01,800 --> 00:34:02,970
is the right thing.

534
00:34:02,970 --> 00:34:06,780
And by the way, as we'll see, it
finds an optimal solution, not

535
00:34:06,780 --> 00:34:10,020
an approximation.

536
00:34:10,020 --> 00:34:13,920
Problems have two things
called optimal substructure,

537
00:34:13,920 --> 00:34:16,620
overlapping subproblems.

538
00:34:16,620 --> 00:34:19,350
What do these mean?

539
00:34:19,350 --> 00:34:21,449
We have optimal
substructure when

540
00:34:21,449 --> 00:34:23,550
a globally optimal
solution can be

541
00:34:23,550 --> 00:34:31,650
found by combining optimal
solutions to local subproblems.

542
00:34:31,650 --> 00:34:35,130
So for example, when
x is greater than 1

543
00:34:35,130 --> 00:34:42,900
we can solve fib x by solving
fib x minus 1 and fib x minus 2

544
00:34:42,900 --> 00:34:47,080
and adding those
two things together.

545
00:34:47,080 --> 00:34:50,130
So there is optimal
substructure--

546
00:34:50,130 --> 00:34:53,650
you solve these two smaller
problems independently

547
00:34:53,650 --> 00:34:58,060
of each other and then combine
the solutions in a fast way.

548
00:35:03,750 --> 00:35:09,490
You also have to have something
called overlapping subproblems.

549
00:35:09,490 --> 00:35:11,840
This is why the memo worked.

550
00:35:11,840 --> 00:35:14,570
Finding an optimal
solution has to involve

551
00:35:14,570 --> 00:35:19,200
solving the same
problem multiple times.

552
00:35:19,200 --> 00:35:21,090
Even if you have
optimal substructure,

553
00:35:21,090 --> 00:35:24,570
if you don't see the same
problem more than once--

554
00:35:24,570 --> 00:35:25,760
creating a memo.

555
00:35:25,760 --> 00:35:28,950
Well, it'll work, you can
still create the memo.

556
00:35:28,950 --> 00:35:30,790
You'll just never
find anything in it

557
00:35:30,790 --> 00:35:32,730
when you look things
up because you're

558
00:35:32,730 --> 00:35:34,275
solving each problem once.

559
00:35:36,810 --> 00:35:41,090
So you have to be solving the
same problem multiple times

560
00:35:41,090 --> 00:35:45,090
and you have to be able to
solve it by combining solutions

561
00:35:45,090 --> 00:35:45,975
to smaller problems.

562
00:35:48,940 --> 00:35:51,780
Now, we've seen things with
optimal substructure before.

563
00:35:54,920 --> 00:35:58,250
In some sense, merge
sort worked that way--

564
00:35:58,250 --> 00:36:00,770
we were combining
separate problems.

565
00:36:00,770 --> 00:36:03,260
Did merge sort have
overlapping subproblems?

566
00:36:05,930 --> 00:36:09,980
No, because-- well,
I guess, it might

567
00:36:09,980 --> 00:36:13,550
have if the list had the same
element many, many times.

568
00:36:13,550 --> 00:36:17,450
But we would expect, mostly not.

569
00:36:17,450 --> 00:36:19,940
Because each time we're
solving a different problem,

570
00:36:19,940 --> 00:36:21,860
because we have different
lists that we're now

571
00:36:21,860 --> 00:36:24,320
sorting and merging.

572
00:36:24,320 --> 00:36:27,560
So it has half of it
but not the other.

573
00:36:27,560 --> 00:36:31,520
Dynamic programming will
not help us for sorting,

574
00:36:31,520 --> 00:36:34,620
cannot be used to
improve merge sort.

575
00:36:34,620 --> 00:36:37,770
Oh, well, nothing
is a silver bullet.

576
00:36:40,510 --> 00:36:43,980
What about the knapsack problem?

577
00:36:43,980 --> 00:36:46,840
Does it have these
two properties?

578
00:36:50,640 --> 00:36:55,210
We can look at it in
terms of these pictures.

579
00:36:55,210 --> 00:36:58,990
And it's pretty clear that it
does have optimal substructure

580
00:36:58,990 --> 00:37:02,320
because we're taking the left
branch and the right branch

581
00:37:02,320 --> 00:37:03,400
and choosing the winner.

582
00:37:06,210 --> 00:37:10,490
But what about
overlapping subproblems?

583
00:37:10,490 --> 00:37:13,480
Are we ever solving, in this
case, the same problem--

584
00:37:16,330 --> 00:37:17,120
at two nodes?

585
00:37:21,480 --> 00:37:23,910
Well, do any of these
nodes look identical?

586
00:37:28,430 --> 00:37:30,860
In this case, no.

587
00:37:30,860 --> 00:37:34,250
We could write a dynamic
programming solution

588
00:37:34,250 --> 00:37:35,580
to the knapsack problem--

589
00:37:35,580 --> 00:37:40,170
and we will-- and run
it on this example,

590
00:37:40,170 --> 00:37:42,060
and we'd get the right answer.

591
00:37:42,060 --> 00:37:45,130
We would get zero speedup.

592
00:37:45,130 --> 00:37:46,960
Because at each
node, if you can see,

593
00:37:46,960 --> 00:37:49,360
the problems are different.

594
00:37:49,360 --> 00:37:53,310
We have different things in the
knapsack or different things

595
00:37:53,310 --> 00:37:54,630
to consider.

596
00:37:54,630 --> 00:37:57,215
Never do we have the same
contents and the same things

597
00:37:57,215 --> 00:37:57,840
left to decide.

598
00:38:01,770 --> 00:38:04,800
So "maybe" was not a bad
answer if that was the answer

599
00:38:04,800 --> 00:38:05,940
you gave to this question.

600
00:38:08,550 --> 00:38:11,950
But let's look at
a different menu.

601
00:38:11,950 --> 00:38:15,010
This menu happens to
have two beers in it.

602
00:38:19,110 --> 00:38:22,680
Now, if we look at
what happens, do

603
00:38:22,680 --> 00:38:25,820
we see two nodes that are
solving the same problem?

604
00:38:31,790 --> 00:38:34,640
The answer is what?

605
00:38:34,640 --> 00:38:35,510
Yes or no?

606
00:38:43,200 --> 00:38:45,480
I haven't drawn the
whole tree here.

607
00:38:45,480 --> 00:38:49,440
Well, you'll notice
the answer is yes.

608
00:38:49,440 --> 00:38:56,830
This node and this node are
solving the same problem.

609
00:38:56,830 --> 00:38:58,060
Why is it?

610
00:38:58,060 --> 00:39:02,130
Well, in this node,
we took this beer

611
00:39:02,130 --> 00:39:05,270
and still had this
one to consider.

612
00:39:05,270 --> 00:39:10,250
But in this node,
we took that beer

613
00:39:10,250 --> 00:39:12,920
but it doesn't matter
which beer we took.

614
00:39:12,920 --> 00:39:17,750
We still have a beer in
the knapsack and a burger

615
00:39:17,750 --> 00:39:20,660
and a slice to consider.

616
00:39:20,660 --> 00:39:24,070
So we got there different ways,
by choosing different beers,

617
00:39:24,070 --> 00:39:27,770
but we're in the same place.

618
00:39:27,770 --> 00:39:30,880
So in fact, we
actually, in this case,

619
00:39:30,880 --> 00:39:37,480
do have the same problem
to solve more than once.

620
00:39:37,480 --> 00:39:42,940
Now, here I had two
things that were the same.

621
00:39:42,940 --> 00:39:45,310
That's not really necessary.

622
00:39:45,310 --> 00:39:49,430
Here's another
very small example.

623
00:39:49,430 --> 00:39:56,330
And the point I want to
make here is shown by this.

624
00:39:56,330 --> 00:39:59,390
So here I have again
drawn a search tree.

625
00:39:59,390 --> 00:40:02,600
And I'm showing you this
because, in fact, it's exactly

626
00:40:02,600 --> 00:40:07,430
this tree that we'll be producing
in our dynamic programming

627
00:40:07,430 --> 00:40:10,040
solution to the
knapsack problem.

628
00:40:10,040 --> 00:40:16,220
Each node in the tree starts
with what you've taken--

629
00:40:16,220 --> 00:40:18,500
initially, nothing,
the empty set.

630
00:40:18,500 --> 00:40:22,430
What's left, the total value,
and the remaining calories.

631
00:40:22,430 --> 00:40:24,710
There's some redundancy
here, by the way.

632
00:40:24,710 --> 00:40:27,410
If I know what I've taken, I
could always compute

633
00:40:27,410 --> 00:40:31,310
the value and what's left.

634
00:40:31,310 --> 00:40:33,866
But this is just so
it's easier to see.

635
00:40:33,866 --> 00:40:35,740
And I've numbered the
nodes here in the order

636
00:40:35,740 --> 00:40:37,130
in which they get generated.

637
00:40:40,240 --> 00:40:44,650
Now, the thing that
I want you to notice

638
00:40:44,650 --> 00:40:49,420
is, when we ask whether we're
solving the same problem,

639
00:40:49,420 --> 00:40:56,060
we don't actually
care what we've taken.

640
00:40:56,060 --> 00:41:00,740
We don't even care
about the value.

641
00:41:00,740 --> 00:41:08,680
All we care is, how much room
we have left in the knapsack

642
00:41:08,680 --> 00:41:11,515
and which items we
have left to consider.

643
00:41:14,280 --> 00:41:20,490
Because what I take next or
what I take remaining really

644
00:41:20,490 --> 00:41:23,220
has nothing to do with how
much value I already have

645
00:41:23,220 --> 00:41:27,420
because I'm trying to maximize
the value that's left,

646
00:41:27,420 --> 00:41:30,600
independent of
previous things done.

647
00:41:30,600 --> 00:41:36,210
Similarly, I don't care why
I have 100 calories left.

648
00:41:36,210 --> 00:41:39,490
Whether I used it up on beers
or a burger, doesn't matter.

649
00:41:39,490 --> 00:41:44,570
All that matters is that
I just have 100 left.

650
00:41:44,570 --> 00:41:49,910
So we see in a large complicated
problem it could easily

651
00:41:49,910 --> 00:41:53,390
be a situation where different
choices of what to take

652
00:41:53,390 --> 00:41:57,620
and what to not take would
leave you in a situation

653
00:41:57,620 --> 00:41:59,835
where you have the same
number of remaining calories.

654
00:42:02,670 --> 00:42:05,700
And therefore you are solving a
problem you've already solved.

655
00:42:12,220 --> 00:42:15,490
At each node, we're just
given the remaining weight,

656
00:42:15,490 --> 00:42:19,540
maximize the value by choosing
among the remaining items.

657
00:42:19,540 --> 00:42:20,710
That's all that matters.

658
00:42:23,310 --> 00:42:26,780
And so indeed, you will have
overlapping subproblems.

659
00:42:29,320 --> 00:42:33,690
As we see in this tree, for
the example we just saw,

660
00:42:33,690 --> 00:42:36,240
the box is around a place
where we're actually

661
00:42:36,240 --> 00:42:39,900
solving the same problem,
even though we've

662
00:42:39,900 --> 00:42:44,580
made different decisions about
what to take, A versus B.

663
00:42:44,580 --> 00:42:46,890
And in fact, we have
different amounts of value

664
00:42:46,890 --> 00:42:48,060
in the knapsack--

665
00:42:48,060 --> 00:42:49,650
6 versus 7.

666
00:42:49,650 --> 00:42:53,770
What matters is we still
have C and D to consider

667
00:42:53,770 --> 00:42:56,260
and we have two units left.

668
00:43:03,930 --> 00:43:06,630
It's a small and easy step.

669
00:43:06,630 --> 00:43:08,430
I'm not going to walk
you through the code

670
00:43:08,430 --> 00:43:10,860
because it's kind
of boring to do so.

671
00:43:10,860 --> 00:43:16,610
How do you modify the maxVal we
looked at before to use a memo?

672
00:43:16,610 --> 00:43:19,790
First, you have to add the third
argument, which is initially

673
00:43:19,790 --> 00:43:23,610
going to be set to
the empty dictionary.

674
00:43:23,610 --> 00:43:26,840
The key of the memo
will be a tuple--

675
00:43:26,840 --> 00:43:32,660
the items left to be considered
and the available weight.

676
00:43:32,660 --> 00:43:37,370
Because the items left to
be considered are in a list,

677
00:43:37,370 --> 00:43:41,420
we can represent the items
left to be considered

678
00:43:41,420 --> 00:43:45,550
by how long the list is.

679
00:43:45,550 --> 00:43:47,710
Because we'll start at
the front item and just

680
00:43:47,710 --> 00:43:48,760
work our way to the end.

681
00:43:52,460 --> 00:43:55,040
And then the function
works, essentially,

682
00:43:55,040 --> 00:43:57,700
exactly the same
way fastFib worked.
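[A sketch of how that might look. The item representation here is a simplification of my own -- plain (value, calories) tuples rather than the course's Item class -- but the memo key is the pair just described: the number of items left to consider and the available weight.]

```python
def fast_max_val(to_consider, avail, memo=None):
    """Memoized 0/1 knapsack.

    to_consider: list of (value, calories) tuples not yet decided on
    avail: remaining calorie budget
    Returns (total value, tuple of items taken).
    """
    if memo is None:
        memo = {}
    # We always recurse on to_consider[1:], so the length of the list
    # uniquely identifies which suffix of the original items remains.
    key = (len(to_consider), avail)
    if key in memo:
        return memo[key]
    if not to_consider or avail == 0:
        result = (0, ())
    elif to_consider[0][1] > avail:
        # First item doesn't fit: only the right branch is possible.
        result = fast_max_val(to_consider[1:], avail, memo)
    else:
        item = to_consider[0]
        # Left branch: take the item.
        with_val, with_taken = fast_max_val(
            to_consider[1:], avail - item[1], memo)
        with_val += item[0]
        # Right branch: skip the item.
        without_val, without_taken = fast_max_val(
            to_consider[1:], avail, memo)
        if with_val > without_val:
            result = (with_val, with_taken + (item,))
        else:
            result = (without_val, without_taken)
    memo[key] = result
    return result
```

[For example, with a menu of (value, calories) pairs [(10, 5), (7, 3), (12, 8), (3, 1)] and a budget of 8, the best achievable value is 17 -- the first two items.]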

683
00:44:03,796 --> 00:44:06,170
I'm not going to run it for
you because we're running out

684
00:44:06,170 --> 00:44:08,755
of time.

685
00:44:08,755 --> 00:44:10,130
You might want to
run it yourself

686
00:44:10,130 --> 00:44:14,120
because it is kind of fun to
see how really fast it is.

687
00:44:14,120 --> 00:44:19,570
But more interestingly,
we can look at this table.

688
00:44:19,570 --> 00:44:22,810
This column is what we would
get with the original recursive

689
00:44:22,810 --> 00:44:26,320
implementation where
we didn't use a memo.

690
00:44:26,320 --> 00:44:30,730
And it was therefore 2
to the length of items.

691
00:44:30,730 --> 00:44:34,990
And as you can see,
it gets really big

692
00:44:34,990 --> 00:44:37,390
or, as we say at the end, huge.

693
00:44:40,770 --> 00:44:43,950
But the number of
calls grows incredibly

694
00:44:43,950 --> 00:44:49,200
slowly for the dynamic
programming solution.

695
00:44:49,200 --> 00:44:53,360
In the beginning
the difference is no big deal.

696
00:44:53,360 --> 00:44:58,670
But by the time we get to
the last number I wrote,

697
00:44:58,670 --> 00:45:03,290
we're looking at 43,000
versus some really big number

698
00:45:03,290 --> 00:45:06,210
I don't know how to pronounce--

699
00:45:06,210 --> 00:45:09,410
18 somethings.

700
00:45:09,410 --> 00:45:14,120
Incredible improvement
in performance.

701
00:45:14,120 --> 00:45:17,660
And then at the
end, it's a number

702
00:45:17,660 --> 00:45:21,080
we couldn't fit on the
slide, even in tiny font.

703
00:45:21,080 --> 00:45:25,460
And yet, only 703,000 calls.

704
00:45:25,460 --> 00:45:27,380
How can this be?

705
00:45:27,380 --> 00:45:30,770
We know the problem is
inherently exponential.

706
00:45:30,770 --> 00:45:34,050
Have we overturned the
laws of the universe?

707
00:45:34,050 --> 00:45:38,860
Is dynamic programming a
miracle in the liturgical sense?

708
00:45:38,860 --> 00:45:40,850
No.

709
00:45:40,850 --> 00:45:43,520
But the thing I want
you to carry away

710
00:45:43,520 --> 00:45:50,190
is that computational complexity
can be a very subtle notion.

711
00:45:50,190 --> 00:45:52,470
The running time
of fastMaxVal is

712
00:45:52,470 --> 00:45:55,620
governed by the number
of distinct pairs

713
00:45:55,620 --> 00:46:00,690
that we might be able to
use as keys in the memo--

714
00:46:00,690 --> 00:46:03,480
toConsider and available.

715
00:46:03,480 --> 00:46:08,430
The number of possible values
of toConsider is small.

716
00:46:08,430 --> 00:46:10,870
It's bounded by the
length of the items.

717
00:46:10,870 --> 00:46:16,330
If I have 100 items,
it's 0, 1, 2, up to 100.

718
00:46:16,330 --> 00:46:19,300
The possible values
of available weight

719
00:46:19,300 --> 00:46:22,340
is harder to characterize.

720
00:46:22,340 --> 00:46:26,030
But it's bounded by the number
of distinct sums of weights

721
00:46:26,030 --> 00:46:28,530
you can get.
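[One way to see that bound: every reachable value of the available weight is the starting budget minus some subset sum of the item weights. A small illustration -- the weights here are made up for the example:]

```python
from itertools import combinations

weights = [2, 3, 5, 5]  # hypothetical item weights; note the duplicate

# Every reachable "available weight" is the budget minus one of these sums.
subset_sums = {sum(combo)
               for r in range(len(weights) + 1)
               for combo in combinations(weights, r)}

# Duplicate weights and overlapping sums mean far fewer distinct values
# than the 2**len(weights) == 16 subsets.
```

[Here there are only 10 distinct sums even though there are 16 subsets; with many repeated or commensurate weights, the gap gets much larger.]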

722
00:46:28,530 --> 00:46:33,590
If I start with
750 calories left,

723
00:46:33,590 --> 00:46:35,100
what are the possibilities?

724
00:46:35,100 --> 00:46:40,330
Well, in fact, in this case,
there are at most 750 of them

725
00:46:40,330 --> 00:46:42,620
because we're using whole units.

726
00:46:42,620 --> 00:46:43,520
So it's small.

727
00:46:43,520 --> 00:46:45,940
But it's actually smaller
than that because it

728
00:46:45,940 --> 00:46:48,760
has to do with the
combinations of ways

729
00:46:48,760 --> 00:46:52,040
I can add up the units I have.

730
00:46:52,040 --> 00:46:53,510
I know this is complicated.

731
00:46:53,510 --> 00:46:56,560
It's not worth my going through
the details in the lectures.

732
00:46:56,560 --> 00:47:01,610
It's covered in considerable
detail in the assigned reading.

733
00:47:01,610 --> 00:47:03,860
Quickly summarizing
lectures 1 and 2,

734
00:47:03,860 --> 00:47:06,320
here's what I want
you to take away.

735
00:47:06,320 --> 00:47:08,330
Many problems of
practical importance

736
00:47:08,330 --> 00:47:12,082
can be formulated as
optimization problems.

737
00:47:12,082 --> 00:47:16,340
Greedy algorithms often provide
an adequate though often not

738
00:47:16,340 --> 00:47:18,750
optimal solution.

739
00:47:18,750 --> 00:47:21,870
Even though finding
an optimal solution

740
00:47:21,870 --> 00:47:24,630
is, in theory,
exponentially hard,

741
00:47:24,630 --> 00:47:29,760
dynamic programming really
often yields great results.

742
00:47:29,760 --> 00:47:33,110
It always gives you a correct
result and it's sometimes,

743
00:47:33,110 --> 00:47:37,890
in fact, most of the times
gives it to you very quickly.

744
00:47:37,890 --> 00:47:39,660
Finally, in the
PowerPoint, you'll

745
00:47:39,660 --> 00:47:42,870
find an interesting
optimization problem

746
00:47:42,870 --> 00:47:46,000
having to do with whether or
not you should roll over problem

747
00:47:46,000 --> 00:47:48,920
set grades into a quiz.

748
00:47:48,920 --> 00:47:51,900
And it's simply a question
of solving this optimization

749
00:47:51,900 --> 00:47:53,450
problem.