1
00:00:00,090 --> 00:00:02,500
The following content is
provided under a Creative

2
00:00:02,500 --> 00:00:04,019
Commons license.

3
00:00:04,019 --> 00:00:06,360
Your support will help
MIT OpenCourseWare

4
00:00:06,360 --> 00:00:10,730
continue to offer high quality
educational resources for free.

5
00:00:10,730 --> 00:00:13,330
To make a donation, or
view additional materials

6
00:00:13,330 --> 00:00:17,236
from hundreds of MIT courses,
visit MIT OpenCourseWare

7
00:00:17,236 --> 00:00:17,861
at ocw.mit.edu.

8
00:00:21,222 --> 00:00:23,180
SRINIVAS DEVADAS: All
right, let's get started.

9
00:00:23,180 --> 00:00:24,740
Good morning everyone.

10
00:00:24,740 --> 00:00:28,210
I see a lot of tired faces.

11
00:00:28,210 --> 00:00:29,030
I'm not tired.

12
00:00:29,030 --> 00:00:29,930
Why are you tired?

13
00:00:29,930 --> 00:00:30,894
[LAUGHTER]

14
00:00:31,860 --> 00:00:33,130
I only lecture half the time.

15
00:00:33,130 --> 00:00:36,240
You guys take the
class all the time.

16
00:00:36,240 --> 00:00:41,640
So today's lecture is
about hash functions.

17
00:00:41,640 --> 00:00:45,680
And you may think that you know
a lot about hash functions,

18
00:00:45,680 --> 00:00:47,860
and you probably do.

19
00:00:47,860 --> 00:00:50,840
But what we're going to do
today is talk about really

20
00:00:50,840 --> 00:00:54,990
a completely different
application of hash functions,

21
00:00:54,990 --> 00:00:57,700
and a new set of
properties that we're

22
00:00:57,700 --> 00:01:00,190
going to require
of hash functions

23
00:01:00,190 --> 00:01:01,900
that I'll elaborate on.

24
00:01:01,900 --> 00:01:04,459
And we're going to see a bunch
of different applications

25
00:01:04,459 --> 00:01:06,700
to things like
password protection,

26
00:01:06,700 --> 00:01:10,220
checking the integrity
of files, auctions,

27
00:01:10,220 --> 00:01:11,316
and so on and so forth.

28
00:01:11,316 --> 00:01:12,940
So a little bit of
a different lecture.

29
00:01:12,940 --> 00:01:15,692
Both today and on
Thursday I'm going

30
00:01:15,692 --> 00:01:18,320
to be doing
cryptography and applications,

31
00:01:18,320 --> 00:01:20,380
not too much of algorithms.

32
00:01:20,380 --> 00:01:22,960
But we will do a little bit
of analysis with respect

33
00:01:22,960 --> 00:01:25,390
to whether properties
are satisfied,

34
00:01:25,390 --> 00:01:28,010
in this case by hash
functions or not.

35
00:01:28,010 --> 00:01:29,910
So let's just dive right in.

36
00:01:29,910 --> 00:01:33,240
You all know what
hash functions are.

37
00:01:33,240 --> 00:01:37,380
There's no real change
in the definition.

38
00:01:37,380 --> 00:01:39,370
But the kinds of hash
functions that we're

39
00:01:39,370 --> 00:01:41,960
going to be looking at
today are quite different

40
00:01:41,960 --> 00:01:45,590
from the simple hash
functions, like taking a mod

41
00:01:45,590 --> 00:01:49,950
with a prime number that
we've looked at in the past.

42
00:01:49,950 --> 00:01:51,680
And the notion of
collisions is going

43
00:01:51,680 --> 00:01:53,980
to come up again,
except that again we're

44
00:01:53,980 --> 00:01:56,600
going to raise the
stakes a little bit.

45
00:01:56,600 --> 00:02:04,100
So a hash function
maps arbitrary

46
00:02:04,100 --> 00:02:08,539
strings-- let me do this right.

47
00:02:11,940 --> 00:02:15,645
So you're not making a statement
about the length of the string.

48
00:02:18,440 --> 00:02:23,490
You will break it up, even if
you had a string of length 512,

49
00:02:23,490 --> 00:02:28,830
or maybe it was 27, you do
want to get a number out of it.

50
00:02:28,830 --> 00:02:31,540
In a specific
range there's going

51
00:02:31,540 --> 00:02:34,190
to be a number of
bits associated

52
00:02:34,190 --> 00:02:35,346
with our hash functions.

53
00:02:35,346 --> 00:02:36,970
And previously we
had a number of slots

54
00:02:36,970 --> 00:02:39,620
associated with the output
of the hash function.

55
00:02:39,620 --> 00:02:42,100
But the input
could be arbitrary.

56
00:02:42,100 --> 00:02:48,080
And these arbitrary
strings of data

57
00:02:48,080 --> 00:02:50,310
are going to get
mapped, as I just said,

58
00:02:50,310 --> 00:02:53,190
to a fixed length output.

59
00:02:56,282 --> 00:02:57,990
And we're going to
think about this fixed

60
00:02:57,990 --> 00:03:01,590
length as being a
number of bits today,

61
00:03:01,590 --> 00:03:04,530
as opposed to slots
in the hash table.

62
00:03:04,530 --> 00:03:08,170
Because we really aren't
going to be storing

63
00:03:08,170 --> 00:03:10,520
a dictionary or a hash
table in the applications

64
00:03:10,520 --> 00:03:11,780
we're going to look at today.

65
00:03:11,780 --> 00:03:14,750
It's simply a question
of computing a hash.

66
00:03:14,750 --> 00:03:18,120
And because the
fixed length output

67
00:03:18,120 --> 00:03:23,880
is going to be something on the
order of 160-bits, or 256-bits,

68
00:03:23,880 --> 00:03:26,980
there's no way that you
could store 2 raised

69
00:03:26,980 --> 00:03:33,370
to 160 elements in a hash
table, or even 2 raised to 64

70
00:03:33,370 --> 00:03:34,340
really.

71
00:03:34,340 --> 00:03:37,120
And so we're going
to just assume

72
00:03:37,120 --> 00:03:41,880
that we're computing
these hashes

73
00:03:41,880 --> 00:03:45,960
and using them for
certain applications.

74
00:03:45,960 --> 00:03:48,800
I just wrote output
twice I guess.

75
00:03:48,800 --> 00:03:52,720
So map it to a
fixed length output.

76
00:03:52,720 --> 00:03:58,990
We want to do this in a
deterministic fashion.

77
00:03:58,990 --> 00:04:04,430
So once we've computed the
hash of a particular arbitrary

78
00:04:04,430 --> 00:04:07,960
string that is
given to us, we want

79
00:04:07,960 --> 00:04:10,150
to be able to repeat
that process to get

80
00:04:10,150 --> 00:04:13,090
the same hash every time.

81
00:04:13,090 --> 00:04:15,070
We want to do this
in a public fashion.

82
00:04:15,070 --> 00:04:16,170
So everything is public.

83
00:04:16,170 --> 00:04:17,420
There's no secrecy.

84
00:04:17,420 --> 00:04:19,920
There's keyed hash functions
that we won't actually

85
00:04:19,920 --> 00:04:22,210
look at today, but
maybe in passing

86
00:04:22,210 --> 00:04:24,500
I'll mention it next time.

87
00:04:24,500 --> 00:04:26,700
We're not looking at
keyed hash functions here.

88
00:04:26,700 --> 00:04:30,920
There's no secrets in any of
the descriptions of algorithms

89
00:04:30,920 --> 00:04:34,540
or techniques I'm going
to be describing today.

90
00:04:34,540 --> 00:04:37,720
And we want this to be random.

91
00:04:37,720 --> 00:04:39,870
We want it to look random.

92
00:04:39,870 --> 00:04:43,750
True randomness is going to
be impossible to achieve,

93
00:04:43,750 --> 00:04:45,142
given our other constraints.

94
00:04:45,142 --> 00:04:46,850
But we're going to
try and approximate it

95
00:04:46,850 --> 00:04:48,880
with pseudo-randomness.

96
00:04:48,880 --> 00:04:51,040
But we'd want it to
look random, because we

97
00:04:51,040 --> 00:04:55,210
are interested-- as we were
in the case of dictionaries

98
00:04:55,210 --> 00:04:58,030
and the regular application
of hash functions-- we

99
00:04:58,030 --> 00:05:00,870
are interested in
minimizing collisions.

100
00:05:00,870 --> 00:05:03,570
And in fact we're going to
raise the stakes really high

101
00:05:03,570 --> 00:05:05,470
with respect to collisions.

102
00:05:05,470 --> 00:05:09,870
We want it to be impossible
for you, or anyone else,

103
00:05:09,870 --> 00:05:12,210
to discover collisions.

104
00:05:12,210 --> 00:05:15,150
And that's going to be an
important property of collision

105
00:05:15,150 --> 00:05:20,520
resistance that obviously is
going to require randomness.

106
00:05:20,520 --> 00:05:24,220
And those are the
three things we want,

107
00:05:24,220 --> 00:05:26,840
deterministic,
public, and random.

108
00:05:26,840 --> 00:05:32,870
And so just from a function
description standpoint

109
00:05:32,870 --> 00:05:35,800
you have 0, 1 star here,
which implies that it's

110
00:05:35,800 --> 00:05:37,770
an arbitrary length string.

111
00:05:37,770 --> 00:05:41,830
And we want to go to 0, 1 to the d.

112
00:05:41,830 --> 00:05:43,955
And this is a
string of length d.

113
00:05:49,170 --> 00:05:51,250
So that means that
you're getting d-bits out

114
00:05:51,250 --> 00:05:52,650
from your hash function.

115
00:05:52,650 --> 00:05:56,630
And here the length is
greater than or equal to 0.
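
[Editor's sketch] The signature described above, arbitrary-length strings in and a fixed d bits out, computed deterministically and publicly, can be illustrated with a standard library hash. SHA-256 (d = 256) is an assumption chosen here for convenience; the lecture does not name it, and the helper name `h` is hypothetical.

```python
import hashlib

D_BITS = 256  # fixed output length d for SHA-256 (an assumed choice)


def h(message: bytes) -> bytes:
    """Map a byte string of any length (including 0) to a d-bit digest."""
    return hashlib.sha256(message).digest()


for msg in (b"", b"x" * 27, b"x" * 512):
    digest = h(msg)
    # The output is always d bits, whatever the input length was.
    assert len(digest) * 8 == D_BITS
    # Deterministic: recomputing the hash gives the same answer.
    assert h(msg) == digest
```

Nothing in the computation is secret: anyone who knows the function can recompute the same digest.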

116
00:06:00,360 --> 00:06:01,750
So that's it.

117
00:06:01,750 --> 00:06:05,000
Not a lot that's new here.

118
00:06:05,000 --> 00:06:08,880
But a few things that are going
to be a little bit different.

119
00:06:08,880 --> 00:06:11,630
And there's some subtleties
here that we'll get to.

120
00:06:11,630 --> 00:06:18,600
I want to emphasize two things,
one of which I just said.

121
00:06:18,600 --> 00:06:23,570
There's no secrecy, no secret
keys here in the hash functions

122
00:06:23,570 --> 00:06:25,350
that we are describing.

123
00:06:25,350 --> 00:06:27,250
All operations are public.

124
00:06:27,250 --> 00:06:33,330
So just like you had your hash
function, which was k mod p,

125
00:06:33,330 --> 00:06:38,220
and p was a prime and p was
public and known to everyone

126
00:06:38,220 --> 00:06:40,600
who used the dictionary,
everything here

127
00:06:40,600 --> 00:06:42,570
we are going to be
talking about is public.

128
00:06:42,570 --> 00:06:44,680
So anyone can compute h.

129
00:06:51,942 --> 00:06:53,400
And we're going to
assume that this

130
00:06:53,400 --> 00:06:57,310
is poly-time computation--
not too surprising-- but I'm

131
00:06:57,310 --> 00:07:01,700
being quite flexible here.

132
00:07:01,700 --> 00:07:03,720
When you look at
dictionaries, and you

133
00:07:03,720 --> 00:07:07,140
think about using dictionaries,
and using it to implement

134
00:07:07,140 --> 00:07:10,540
efficient algorithms,
what is the assumption

135
00:07:10,540 --> 00:07:13,690
we kind of implicitly made--
or perhaps explicitly

136
00:07:13,690 --> 00:07:18,810
in some cases-- with respect
to computing the hash?

137
00:07:18,810 --> 00:07:19,310
Anybody?

138
00:07:22,210 --> 00:07:22,710
Yeah?

139
00:07:22,710 --> 00:07:23,690
AUDIENCE: Constant time?

140
00:07:23,690 --> 00:07:25,023
SRINIVAS DEVADAS: Constant time.

141
00:07:25,023 --> 00:07:33,680
We assumed-- so this is not
necessarily order 1, right?

142
00:07:33,680 --> 00:07:34,840
So that's important.

143
00:07:34,840 --> 00:07:42,180
So we're going to-- I want
to make sure you're watching.

144
00:07:42,180 --> 00:07:45,770
So we're going to raise
the stakes even with respect

145
00:07:45,770 --> 00:07:47,990
to the complexity of the hash.

146
00:07:47,990 --> 00:07:50,459
And as you'll see, because
of the desirable properties,

147
00:07:50,459 --> 00:07:51,750
we're going to have to do that.

148
00:07:51,750 --> 00:07:54,010
We're going to ask for
really a lot with respect

149
00:07:54,010 --> 00:07:55,610
to these hash functions.

150
00:07:55,610 --> 00:07:58,350
Nobody can find a
collision, right?

151
00:07:58,350 --> 00:08:01,750
And if you have something
as simple as k mod p,

152
00:08:01,750 --> 00:08:03,910
it's going to be trivial
to find a collision.
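
[Editor's sketch] Why a collision for k mod p is trivial: adding p to any key leaves the remainder unchanged. The prime 101 and the function names are illustrative assumptions, not values from the lecture.

```python
def h(k: int, p: int) -> int:
    """The dictionary-style hash from earlier lectures: k mod p."""
    return k % p


def collide(k: int, p: int) -> tuple[int, int]:
    """Return two distinct keys with the same hash: k and k + p."""
    return k, k + p


p = 101  # a small public prime, chosen only for illustration
a, b = collide(12345, p)
assert a != b
assert h(a, p) == h(b, p)
```

The collision takes one addition to produce, which is why such functions cannot meet the requirements discussed in this lecture.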

153
00:08:03,910 --> 00:08:06,710
And so these order
1 hash functions

154
00:08:06,710 --> 00:08:08,380
that you're familiar
with aren't going

155
00:08:08,380 --> 00:08:12,160
to make the grade with respect
to any of the properties

156
00:08:12,160 --> 00:08:14,500
that we'll discuss
in a few minutes.

157
00:08:14,500 --> 00:08:16,962
All right, so remember this
is poly-time computation.

158
00:08:16,962 --> 00:08:19,170
And there's lots of examples
of these hash functions.

159
00:08:19,170 --> 00:08:21,820
And for those of you who are
kind of into computer security

160
00:08:21,820 --> 00:08:24,890
and cryptography already,
you might have heard

161
00:08:24,890 --> 00:08:29,240
of examples like MD4 and MD5.

162
00:08:29,240 --> 00:08:30,260
These are versions.

163
00:08:30,260 --> 00:08:32,309
MD stands for message digest.

164
00:08:32,309 --> 00:08:35,610
These were functions that were
invented by Professor Rivest.

165
00:08:35,610 --> 00:08:42,970
And they had d equals
128 way back when-- 1992,

166
00:08:42,970 --> 00:08:45,830
if I recall-- when
they were proposed.

167
00:08:45,830 --> 00:08:50,830
And these algorithms have
since been broken in the sense

168
00:08:50,830 --> 00:08:53,490
that it was conjectured that
they had particular properties

169
00:08:53,490 --> 00:08:59,060
of collision resistance that
it would take exponential time

170
00:08:59,060 --> 00:09:01,660
for anybody to find collisions.

171
00:09:01,660 --> 00:09:04,490
And it still kind of
takes exponential time,

172
00:09:04,490 --> 00:09:11,040
but 2 raised to 37 is
exponential at one level,

173
00:09:11,040 --> 00:09:13,700
but constant in another level.

174
00:09:13,700 --> 00:09:17,720
So you can kind of do
it in a few seconds now.

175
00:09:17,720 --> 00:09:20,610
So a little bit of history.

176
00:09:20,610 --> 00:09:23,300
I'm not going to spend
a lot of time on this.

177
00:09:23,300 --> 00:09:28,400
MD5 was used to create what was
called a secure hash algorithm.

178
00:09:28,400 --> 00:09:31,770
This is 160-bits.

179
00:09:31,770 --> 00:09:36,330
And this is not quite
broken at this point.

180
00:09:36,330 --> 00:09:41,920
But people consider it
broken, or soon to be broken.

181
00:09:41,920 --> 00:09:45,560
Right now the
recommended algorithm

182
00:09:45,560 --> 00:09:50,260
is called SHA-3, secure hash
algorithm version three.

183
00:09:50,260 --> 00:09:53,870
And there was a contest
that ran for like 18 months,

184
00:09:53,870 --> 00:09:56,770
or maybe even longer,
that eventually was won

185
00:09:56,770 --> 00:09:59,580
by what turned into the SHA-3.

186
00:09:59,580 --> 00:10:03,200
And they had a different name
for it that I can't recall.

187
00:10:03,200 --> 00:10:05,180
But it turned into SHA-3.

188
00:10:05,180 --> 00:10:08,220
And what happened along the
way, as we went from MD4,

189
00:10:08,220 --> 00:10:12,690
MD5, SHA-1 to SHA-3, is that
this amount of computation

190
00:10:12,690 --> 00:10:14,810
that you had to do increased.

191
00:10:14,810 --> 00:10:16,570
And the complexity of
operations that you

192
00:10:16,570 --> 00:10:21,690
had to do in order to compute
the hash of an arbitrary string

193
00:10:21,690 --> 00:10:24,460
increased to the
point where-- you

194
00:10:24,460 --> 00:10:27,720
want to think about this as
100 rounds of computation.

195
00:10:27,720 --> 00:10:31,780
And certainly order
d computation,

196
00:10:31,780 --> 00:10:34,200
where d is the number of bits.

197
00:10:34,200 --> 00:10:35,940
And perhaps even more.

198
00:10:35,940 --> 00:10:38,850
So it's definitely not order 1.

199
00:10:38,850 --> 00:10:43,270
So as I said a little bit
of context with respect

200
00:10:43,270 --> 00:10:45,355
to the things that
are out there.

201
00:10:45,355 --> 00:10:46,980
At the end of the
lecture I'll give you

202
00:10:46,980 --> 00:10:49,570
a sense for how these
hash functions are built.

203
00:10:49,570 --> 00:10:51,380
We're not going to
spend a lot of time

204
00:10:51,380 --> 00:10:53,220
on creating these
hash functions.

205
00:10:53,220 --> 00:10:56,320
It's really a research topic
unto itself and not really

206
00:10:56,320 --> 00:10:58,390
in the scope of 6.046.

207
00:10:58,390 --> 00:11:00,920
What is in the scope
of 6.046, and what

208
00:11:00,920 --> 00:11:02,510
I think is more
interesting, which

209
00:11:02,510 --> 00:11:05,690
is what we'll focus
our energy and time on,

210
00:11:05,690 --> 00:11:07,990
is the properties of
these hash functions.

211
00:11:07,990 --> 00:11:10,640
And why these properties
are useful in a bunch

212
00:11:10,640 --> 00:11:12,270
of different apps.

213
00:11:12,270 --> 00:11:15,100
And so what is it that we want?

214
00:11:15,100 --> 00:11:19,440
We want a random oracle.

215
00:11:19,440 --> 00:11:23,790
We want to essentially
build something

216
00:11:23,790 --> 00:11:27,960
that looks like that,
deterministic, public, random.

217
00:11:27,960 --> 00:11:31,390
And we're going to
claim that what we want

218
00:11:31,390 --> 00:11:33,135
is this random
oracle which has all

219
00:11:33,135 --> 00:11:35,892
of these wonderful properties
that I'm going to describe.

220
00:11:35,892 --> 00:11:37,850
I'm going to describe
the random oracle to you,

221
00:11:37,850 --> 00:11:40,391
and then I'm going to tell you
about what the properties are.

222
00:11:40,391 --> 00:11:44,420
And then unfortunately
this is an ideal world

223
00:11:44,420 --> 00:11:47,890
and we can't build
this in the real world.

224
00:11:47,890 --> 00:11:49,919
And so we're going to
have to approximate it.

225
00:11:49,919 --> 00:11:52,460
And that's where the MD4's and
the MD5's and the SHA-1's came

226
00:11:52,460 --> 00:11:55,200
in, OK?

227
00:11:55,200 --> 00:11:56,885
So this is not
achievable in practice.

228
00:12:05,560 --> 00:12:09,300
So what is this oracle?

229
00:12:09,300 --> 00:12:17,170
This oracle is on input
x, belonging to 0,1 star.

230
00:12:17,170 --> 00:12:20,740
So that could be an
arbitrary string.

231
00:12:20,740 --> 00:12:26,100
If x not in the book--
so there's this book,

232
00:12:26,100 --> 00:12:26,840
all right?

233
00:12:26,840 --> 00:12:29,720
And there's this
infinite capacity book

234
00:12:29,720 --> 00:12:35,170
that has all of the computations
that were ever done prior.

235
00:12:35,170 --> 00:12:36,770
And they're always
stored in the book.

236
00:12:36,770 --> 00:12:38,770
And that's how we're
going to get determinism.

237
00:12:38,770 --> 00:12:42,100
Because this book
initially gets filled in.

238
00:12:42,100 --> 00:12:44,380
All of the entries in
the book are filled

239
00:12:44,380 --> 00:12:47,000
in using pure randomness.

240
00:12:47,000 --> 00:12:55,610
So you flip a coin d
times to determine h of x.

241
00:12:55,610 --> 00:12:57,760
So that's basically it.

242
00:12:57,760 --> 00:12:59,420
And you just keep flipping.

243
00:12:59,420 --> 00:13:00,950
You have to flip d times.

244
00:13:00,950 --> 00:13:05,320
And so if x was 0, you
flip d times, d was 160.

245
00:13:05,320 --> 00:13:08,300
You flipped a coin 160
times and got a string.

246
00:13:08,300 --> 00:13:13,430
If x were 1, flip 160 times,
you get a different string

247
00:13:13,430 --> 00:13:15,530
with very high
probability, obviously.

248
00:13:15,530 --> 00:13:16,890
And so on and so forth.

249
00:13:16,890 --> 00:13:19,280
But what you do is
you have this book.

250
00:13:19,280 --> 00:13:29,870
So you're going to record
x, h of x in the book, OK?

251
00:13:29,870 --> 00:13:31,770
So at some level
your hash function

252
00:13:31,770 --> 00:13:35,220
is this giant look-up
table in the sky, right?

253
00:13:35,220 --> 00:13:37,480
Actually not giant, infinite
capacity look-up table

254
00:13:37,480 --> 00:13:38,510
in the sky.

255
00:13:38,510 --> 00:13:42,230
Because you can put
arbitrary strings into this.

256
00:13:42,230 --> 00:13:47,910
And if it's in the book-- this
is obviously the important part

257
00:13:47,910 --> 00:13:52,740
that gives you determinism--
then you return y,

258
00:13:52,740 --> 00:13:58,700
where x and y are
in the book, OK?

259
00:14:01,350 --> 00:14:05,040
So you get a random
answer every time,

260
00:14:05,040 --> 00:14:08,380
except as required
for consistency

261
00:14:08,380 --> 00:14:10,180
with previous answers.

262
00:14:10,180 --> 00:14:11,930
So the very first
time you see a string,

263
00:14:11,930 --> 00:14:16,730
or-- and the whole world
can create this book.

264
00:14:16,730 --> 00:14:17,740
It's public.

265
00:14:17,740 --> 00:14:22,430
So if I created the book at
first with a particular string,

266
00:14:22,430 --> 00:14:23,710
let's say Eric.

267
00:14:23,710 --> 00:14:25,120
I was the string.

268
00:14:25,120 --> 00:14:29,960
And I'm the one who put
the entry-- x equals Eric,

269
00:14:29,960 --> 00:14:34,040
and h of x, h of Eric equals
some random 160-bit string--

270
00:14:34,040 --> 00:14:36,640
into the book, I get
credit for that, right?

271
00:14:36,640 --> 00:14:43,650
But if you come a nanosecond
later and ask for h of Eric,

272
00:14:43,650 --> 00:14:46,090
you should get exactly
what got put into the book

273
00:14:46,090 --> 00:14:49,850
when I asked for h of Eric.

274
00:14:49,850 --> 00:14:51,290
And so on and so forth.

275
00:14:51,290 --> 00:14:53,050
So this is true for everybody.

276
00:14:53,050 --> 00:14:56,810
So this is like-- I mean
basically impossible to get.

277
00:14:56,810 --> 00:15:01,770
Because not only can
anybody and everybody query,

278
00:15:01,770 --> 00:15:05,190
you have to have this
ordering associated

279
00:15:05,190 --> 00:15:09,450
with people querying the book.

280
00:15:09,450 --> 00:15:11,330
And you have to
have consistency.

281
00:15:11,330 --> 00:15:11,990
All right.

282
00:15:11,990 --> 00:15:15,990
So everyone convinced
that we can't build this?

283
00:15:15,990 --> 00:15:16,710
All right.

284
00:15:16,710 --> 00:15:18,660
If you took anything
out of this lecture,

285
00:15:18,660 --> 00:15:20,011
that's what you should take.

286
00:15:20,011 --> 00:15:20,510
No, no.

287
00:15:20,510 --> 00:15:22,290
There's a lot more.

288
00:15:22,290 --> 00:15:26,750
So we want to approximate
the random oracle.

289
00:15:26,750 --> 00:15:28,840
And we're going to get to that.

290
00:15:28,840 --> 00:15:34,767
Obviously we're going to have to
do this in poly-space as well.

291
00:15:34,767 --> 00:15:35,850
So what's wrong with this?

292
00:15:35,850 --> 00:15:38,810
Of course this picture is--
I didn't actually say this,

293
00:15:38,810 --> 00:15:42,320
but you'd like things to be
poly-time in terms of space.

294
00:15:42,320 --> 00:15:46,210
You don't want to store
an infinite number-- this

295
00:15:46,210 --> 00:15:48,649
is worse than poly-time,
worse than exponential time,

296
00:15:48,649 --> 00:15:51,190
because it's arbitrary strings
that we're talking about here,

297
00:15:51,190 --> 00:15:51,920
right?

298
00:15:51,920 --> 00:15:53,940
So you can't possibly do that.

299
00:15:53,940 --> 00:15:56,350
So we have to do
something better.

300
00:15:56,350 --> 00:16:00,180
But before I get into how we'd
actually build this, and give

301
00:16:00,180 --> 00:16:04,346
you a sense of how SHA-1
and MD5 were built--

302
00:16:04,346 --> 00:16:06,220
and that's going to come
a little bit later--

303
00:16:06,220 --> 00:16:11,910
I want to spend a lot of time
on what is interesting,

304
00:16:11,910 --> 00:16:13,630
which are the
desirable properties.

305
00:16:13,630 --> 00:16:16,640
Which you can kind of see
using the random oracle.

306
00:16:16,640 --> 00:16:18,690
So what is cool about
the random oracle

307
00:16:18,690 --> 00:16:21,140
is that it's a simple algorithm.

308
00:16:21,140 --> 00:16:23,060
You can understand it.

309
00:16:23,060 --> 00:16:24,410
You can't implement it.

310
00:16:24,410 --> 00:16:27,130
But now you can see what
wonderful properties

311
00:16:27,130 --> 00:16:28,230
it gives you.

312
00:16:28,230 --> 00:16:30,350
And these properties are
going to be important

313
00:16:30,350 --> 00:16:32,490
for our applications, OK?

314
00:16:32,490 --> 00:16:36,146
And so let's get started with a
bunch of different properties.

315
00:16:36,146 --> 00:16:37,520
And these are all
properties that

316
00:16:37,520 --> 00:16:43,030
are going to be useful for
verification or computer

317
00:16:43,030 --> 00:16:44,850
security applications.

318
00:16:44,850 --> 00:16:51,100
The first one, it's not ow,
it's O, W. It's one-wayness,

319
00:16:51,100 --> 00:16:51,780
all right?

320
00:16:51,780 --> 00:16:53,930
So one-way, or one-wayness.

321
00:16:53,930 --> 00:17:03,230
And it's also called-- you're
not going to call it this--

322
00:17:03,230 --> 00:17:09,390
but perhaps this is a more
technical term, a more precise

323
00:17:09,390 --> 00:17:11,150
term, pre-image resistance.

324
00:17:11,150 --> 00:17:13,060
And so what does this mean?

325
00:17:13,060 --> 00:17:15,167
Well this is a very
strong requirement.

326
00:17:15,167 --> 00:17:16,750
I mean a couple of
other ones are also

327
00:17:16,750 --> 00:17:18,990
going to be perhaps stronger.

328
00:17:18,990 --> 00:17:21,270
But this is a pretty
strong requirement

329
00:17:21,270 --> 00:17:28,710
which says it's
infeasible, given y,

330
00:17:28,710 --> 00:17:45,170
which is in the-- it's basically
a d-bit vector, to find any x

331
00:17:45,170 --> 00:17:50,950
such that h of x equals y.

332
00:17:50,950 --> 00:18:00,120
And so this is x is
the pre-image of y.

333
00:18:00,120 --> 00:18:01,580
So what does this say?

334
00:18:01,580 --> 00:18:04,400
It says that I want to
create a hash function such

335
00:18:04,400 --> 00:18:08,030
that if I give you
a specific-- we call

336
00:18:08,030 --> 00:18:12,870
it a 160-bit string, because
we're talking SHA-1 here,

337
00:18:12,870 --> 00:18:16,792
and that's the hash--
I'm going to have,

338
00:18:16,792 --> 00:18:18,250
it's going to have
to be impossible

339
00:18:18,250 --> 00:18:25,430
for me to discover an x that
produced that 160-bit string,

340
00:18:25,430 --> 00:18:26,390
OK?

341
00:18:26,390 --> 00:18:29,100
Now if you go look
at our random oracle,

342
00:18:29,100 --> 00:18:32,750
you realize that if you
had a 160-bit string,

343
00:18:32,750 --> 00:18:36,970
and perhaps you
have the entire book

344
00:18:36,970 --> 00:18:39,290
and you can read
the entire book.

345
00:18:39,290 --> 00:18:41,940
It's an infinite capacity book.

346
00:18:41,940 --> 00:18:44,750
It's got a bunch of stuff in it.

347
00:18:44,750 --> 00:18:49,800
And know that any time anyone
queried the book the first time

348
00:18:49,800 --> 00:18:53,580
for a given x, that there was
this random 160-bit number that

349
00:18:53,580 --> 00:18:55,700
was generated and
put into the book.

350
00:18:55,700 --> 00:18:58,100
And there's a whole lot
of these numbers, right?

351
00:18:58,100 --> 00:18:59,954
So what's going to
happen is, you're

352
00:18:59,954 --> 00:19:01,870
going to have to look
through the entire book,

353
00:19:01,870 --> 00:19:04,710
this entire potentially
infinite capacity book,

354
00:19:04,710 --> 00:19:13,500
in order to figure out if this
particular y is in the book

355
00:19:13,500 --> 00:19:14,660
or not.

356
00:19:14,660 --> 00:19:18,310
And that's going to take a long
time to do, potentially, OK?

357
00:19:18,310 --> 00:19:23,290
So in the case where you
have a random oracle you'd

358
00:19:23,290 --> 00:19:27,660
have to go through and find--
looking at the output hash

359
00:19:27,660 --> 00:19:30,502
corresponding to each of the
entries in the random oracle,

360
00:19:30,502 --> 00:19:32,960
you're going to start matching,
match, match, match, match,

361
00:19:32,960 --> 00:19:35,290
it's going to take
you exponential time.

362
00:19:35,290 --> 00:19:37,760
Well actually worse than that,
given the infinite capacity

363
00:19:37,760 --> 00:19:38,930
of the book.

364
00:19:38,930 --> 00:19:40,970
So this clearly gives you that.
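
[Editor's sketch] With no structure to exploit, inverting a hash comes down to trying inputs until one matches, and the expected number of tries is on the order of 2 to the d. The sketch below uses SHA-256 truncated to d bits as a stand-in for an idealized hash; the truncation, the candidate encoding, and the names `h_trunc` and `find_preimage` are all assumptions made for illustration.

```python
import hashlib
from typing import Optional


def h_trunc(x: bytes, d: int) -> int:
    """Keep only the top d bits of SHA-256(x); a toy d-bit hash."""
    digest = hashlib.sha256(x).digest()
    return int.from_bytes(digest, "big") >> (256 - d)


def find_preimage(y: int, d: int, max_tries: int) -> Optional[bytes]:
    """Try candidate inputs one by one until some x hashes to y."""
    for i in range(max_tries):
        candidate = i.to_bytes(8, "big")
        if h_trunc(candidate, d) == y:
            return candidate
    return None  # gave up before finding a match


d = 12  # tiny on purpose: about 2**12 tries are expected
y = h_trunc(b"some secret input", d)
x = find_preimage(y, d, max_tries=1_000_000)
assert x is not None and h_trunc(x, d) == y
```

Each extra output bit roughly doubles the expected work, so the same loop at d = 160 is far out of reach.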

365
00:19:40,970 --> 00:19:44,070
Now you may not be completely
satisfied with that answer

366
00:19:44,070 --> 00:19:46,620
because you say well,
you can't implement that.

367
00:19:46,620 --> 00:19:48,410
But we'll talk a
little bit, as I said,

368
00:19:48,410 --> 00:19:50,410
about how you could
actually get this.

369
00:19:50,410 --> 00:19:54,510
But what's-- I should be
clear-- is that the simple hash

370
00:19:54,510 --> 00:19:59,180
functions that we've looked
at in the past just to build

371
00:19:59,180 --> 00:20:02,570
dictionaries do not
satisfy this, right?

372
00:20:02,570 --> 00:20:11,860
So suppose I had h of x
equals x squared mod p.

373
00:20:11,860 --> 00:20:18,050
Is this one-way,
given a public p?

374
00:20:18,050 --> 00:20:19,210
No of course not, right?

375
00:20:19,210 --> 00:20:22,310
Because I'm going to be--
it's going to be easy

376
00:20:22,310 --> 00:20:24,730
for me to do something.

377
00:20:24,730 --> 00:20:29,320
Even though this is discrete
arithmetic I could do something

378
00:20:29,320 --> 00:20:32,670
like, well, I know that what
I have here-- actually let's

379
00:20:32,670 --> 00:20:34,170
do it with something
that's simpler,

380
00:20:34,170 --> 00:20:36,580
and then I'll talk
about the x squared.

381
00:20:36,580 --> 00:20:38,650
If I had something
as simple as x mod p,

382
00:20:38,650 --> 00:20:42,100
I mean that's trivially broken
in terms of one-wayness.

383
00:20:42,100 --> 00:20:45,900
Because I know that h of x could
be viewed as the remainder.

384
00:20:45,900 --> 00:20:51,870
So anything-- if this
is h of x, and let's

385
00:20:51,870 --> 00:20:54,310
just call that y for a
second, because that's

386
00:20:54,310 --> 00:20:56,280
what we had it out there.

387
00:20:56,280 --> 00:21:00,340
Something that's a multiple
of y plus the remainder-- so I

388
00:21:00,340 --> 00:21:02,777
could have a-- is that right?

389
00:21:02,777 --> 00:21:03,610
Is that what I want?

390
00:21:03,610 --> 00:21:04,109
Yeah.

391
00:21:04,109 --> 00:21:05,050
No, plus y.

392
00:21:05,050 --> 00:21:13,760
So I want a of-- well since
I can't figure it out,

393
00:21:13,760 --> 00:21:16,250
why can't you?

394
00:21:16,250 --> 00:21:17,970
What do I need to
put in there in order

395
00:21:17,970 --> 00:21:24,170
to discover an x that
would produce a y?

396
00:21:24,170 --> 00:21:25,290
Can I write an equation?

397
00:21:25,290 --> 00:21:26,192
Yeah?

398
00:21:26,192 --> 00:21:27,967
AUDIENCE: Could you
just write y itself?

399
00:21:27,967 --> 00:21:29,300
SRINIVAS DEVADAS: Just y itself.

400
00:21:29,300 --> 00:21:29,960
That's right.

401
00:21:29,960 --> 00:21:30,900
Good point.

402
00:21:30,900 --> 00:21:32,635
Just y itself in this case.

403
00:21:32,635 --> 00:21:33,820
Good.

404
00:21:33,820 --> 00:21:35,650
I knew you guys were
smarter than me.

405
00:21:35,650 --> 00:21:38,060
This proves it.

406
00:21:38,060 --> 00:21:41,500
So if you just take
y-- and y remember

407
00:21:41,500 --> 00:21:46,190
is going to be something
that's 0 to p minus 1, right?

408
00:21:46,190 --> 00:21:47,520
And that's it.

409
00:21:47,520 --> 00:21:49,170
It just goes through, right?

410
00:21:49,170 --> 00:21:51,150
So that's a trivial
example, right?
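
[Editor's sketch] The audience answer in code: for h(x) = x mod p, any output y in the range 0 to p minus 1 is its own pre-image. The prime 101 and the function names are illustrative assumptions.

```python
def h(x: int, p: int) -> int:
    """The simple hash under discussion: x mod p."""
    return x % p


def preimage(y: int, p: int) -> int:
    """Return an x with h(x) == y; y itself works when 0 <= y < p."""
    if not 0 <= y < p:
        raise ValueError("y is not a possible output of x mod p")
    return y


p = 101  # small public prime, for illustration only
for y in range(p):
    assert h(preimage(y, p), p) == y
```

Inverting takes no work at all, so x mod p has no one-wayness.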

411
00:21:51,150 --> 00:21:55,780
Now if I put x squared in
here, obviously it's not y,

412
00:21:55,780 --> 00:22:03,050
but I could start looking
at-- what I have here is

413
00:22:03,050 --> 00:22:05,280
I'm going to get y that
looks like x squared.

414
00:22:05,280 --> 00:22:07,220
But I could take
the y that I have,

415
00:22:07,220 --> 00:22:09,240
take the square root
of that, and then start

416
00:22:09,240 --> 00:22:13,570
looking for x's that give
me the y that I have.

417
00:22:13,570 --> 00:22:18,020
Actually it's not a complicated
process to try and figure out,

418
00:22:18,020 --> 00:22:20,230
through trial and
error potentially,

419
00:22:20,230 --> 00:22:23,250
what an x is that
produces a particular y

420
00:22:23,250 --> 00:22:25,360
for the kinds of hash
functions that we've

421
00:22:25,360 --> 00:22:26,900
looked at, all right?
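
[Editor's sketch] The trial-and-error idea for h(x) = x squared mod p, written as a plain search over the p possible residues. This brute-force loop is an assumed illustration rather than the method described on the board, and faster modular square-root algorithms exist. The prime 101 and the name `find_preimage_sq` are hypothetical.

```python
from typing import Optional


def h(x: int, p: int) -> int:
    """The hash under discussion: x squared mod p."""
    return (x * x) % p


def find_preimage_sq(y: int, p: int) -> Optional[int]:
    """Search x = 0 .. p-1 for a value whose square is y mod p."""
    for x in range(p):
        if h(x, p) == y:
            return x
    return None  # y is not a square modulo p


p = 101  # small public prime, for illustration only
y = h(37, p)
x = find_preimage_sq(y, p)
# Any x that reproduces y breaks one-wayness; it need not be 37.
assert x is not None and h(x, p) == y
```

The search costs at most p steps, which is cheap for the table-sized primes used in dictionaries.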

422
00:22:26,900 --> 00:22:32,050
Now as you complicate this
equation it gets harder.

423
00:22:32,050 --> 00:22:34,626
Because you have to invert
this set of equations.

424
00:22:34,626 --> 00:22:36,000
And that's what
the game is going

425
00:22:36,000 --> 00:22:38,620
to be when you go create
one-way hash functions.

426
00:22:38,620 --> 00:22:41,520
The amount of computation
that you do in order

427
00:22:41,520 --> 00:22:44,770
to compute the y is going
to increase to the point

428
00:22:44,770 --> 00:22:47,770
where, as I mentioned, you have
80, 100 rounds of computation,

429
00:22:47,770 --> 00:22:49,400
things getting mixed in.

430
00:22:49,400 --> 00:22:53,210
And the hope is that you create
this circuit, if you will,

431
00:22:53,210 --> 00:22:54,970
that has all this
computation in that.

432
00:22:54,970 --> 00:22:57,170
Going forwards is
easy, because you've

433
00:22:57,170 --> 00:22:59,350
specified the
multiplications and the mods

434
00:22:59,350 --> 00:23:00,700
and so on and so forth.

435
00:23:00,700 --> 00:23:04,830
But not all of these operations
have simple inverses.

436
00:23:04,830 --> 00:23:07,790
And going backwards,
which is what

437
00:23:07,790 --> 00:23:11,010
you need to do in order
to break one-wayness,

438
00:23:11,010 --> 00:23:14,890
or discover the x
given a y, is going

439
00:23:14,890 --> 00:23:17,100
to be harder and harder
as the computations get

440
00:23:17,100 --> 00:23:18,890
more complex, OK?

441
00:23:18,890 --> 00:23:20,905
So everyone have a sense
of what one-wayness is?

442
00:23:24,810 --> 00:23:26,990
So that's one-wayness.
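
A minimal sketch of why the simple hash functions above are not one-way. The function `toy_hash`, the helper `find_preimage`, and the prime `p = 101` are hypothetical choices for illustration, not anything from the board. The point is only that trial and error recovers a valid x when the computation is this shallow.

```python
def toy_hash(x: int, p: int) -> int:
    """Toy hash in the spirit of the discussion: square the input, reduce mod p."""
    return (x * x) % p


def find_preimage(y: int, p: int):
    """Trial and error: return some x in [0, p) with toy_hash(x, p) == y, else None."""
    for x in range(p):
        if toy_hash(x, p) == y:
            return x
    return None


p = 101                      # hypothetical small prime
y = toy_hash(37, p)          # the value an attacker is handed
x = find_preimage(y, p)      # any x that maps to y breaks one-wayness
```

The search costs at most p evaluations, so it only stays feasible because the input range is tiny and the forward computation is trivial to enumerate.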

443
00:23:26,990 --> 00:23:30,930
There's four other properties,
two of which are very related.

444
00:23:30,930 --> 00:23:33,700
CR and TCR.

445
00:23:33,700 --> 00:23:35,550
So CR is collision resistance.

446
00:23:42,970 --> 00:23:54,290
It's infeasible to find x and
x prime such that x not equal

447
00:23:54,290 --> 00:24:02,269
to x prime, and h of
x equals h of x prime,

448
00:24:02,269 --> 00:24:03,560
which is of course a collision.
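
To make the definition concrete, here is a hedged sketch of a collision search. It assumes a deliberately weakened stand-in, `short_hash`, that keeps only the first 2 bytes of SHA-256; that truncation is an assumption made so the search finishes quickly, and the names are hypothetical. Any pair of distinct inputs with equal digests counts as a break.

```python
import hashlib


def short_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Stand-in hash: first `nbytes` bytes of SHA-256 (deliberately tiny output)."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_collision(nbytes: int = 2):
    """Hash the strings "0", "1", "2", ... until two distinct inputs share a digest."""
    seen = {}                      # digest -> input that produced it
    i = 0
    while True:
        x = str(i).encode()
        digest = short_hash(x, nbytes)
        if digest in seen:
            return seen[digest], x
        seen[digest] = x
        i += 1


x, x_prime = find_collision()
```

With a 16-bit output the loop must stop within 65,537 inputs by the pigeonhole principle; with a full-length digest the same search is believed to be infeasible.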

449
00:24:08,300 --> 00:24:09,690
OK?

450
00:24:09,690 --> 00:24:14,790
And that just says you have
this crazy hash function where

451
00:24:14,790 --> 00:24:16,650
you can't discover collisions.

452
00:24:16,650 --> 00:24:18,620
Well it would be
absolutely wonderful.

453
00:24:18,620 --> 00:24:21,740
In fact that's what we wanted
when we built dictionaries.

454
00:24:21,740 --> 00:24:25,290
But why don't we use
SHA-3 in dictionaries?

455
00:24:28,410 --> 00:24:30,350
Why don't we use
SHA-3 in dictionaries?

456
00:24:30,350 --> 00:24:30,851
Yeah?

457
00:24:30,851 --> 00:24:33,058
AUDIENCE: Because it's more
complicated than we need.

458
00:24:33,058 --> 00:24:35,270
SRINIVAS DEVADAS: Yeah,
it's horribly slow, right?

459
00:24:35,270 --> 00:24:39,365
It would take longer to
compute the hash than access

460
00:24:39,365 --> 00:24:40,740
the dictionary,
when you actually

461
00:24:40,740 --> 00:24:44,847
had a reasonable dictionary
that maybe had some collisions.

462
00:24:44,847 --> 00:24:46,930
I mean you just go off and
you have a linked list,

463
00:24:46,930 --> 00:24:50,090
you can afford a few collisions,
what's the big deal, right?

464
00:24:50,090 --> 00:24:51,860
So it just doesn't
make any sense

465
00:24:51,860 --> 00:24:57,420
to use this level of
heavyweight hash function,

466
00:24:57,420 --> 00:25:00,830
even if it satisfies
collision resistance-- which

467
00:25:00,830 --> 00:25:04,070
some of these are conjectured to
do-- for the applications we've

468
00:25:04,070 --> 00:25:04,617
looked at.

469
00:25:04,617 --> 00:25:06,950
But there'll be other apps
where collision resistance is

470
00:25:06,950 --> 00:25:08,520
going to be important.

471
00:25:08,520 --> 00:25:10,110
So that's collision resistance.

472
00:25:10,110 --> 00:25:15,470
And then there's-- TCR is
target collision resistance.

473
00:25:15,470 --> 00:25:18,300
It's a weaker form--
so sometimes people

474
00:25:18,300 --> 00:25:24,190
call CR strong collision resistance,
and TCR weak collision

475
00:25:24,190 --> 00:25:24,810
resistance.

476
00:25:24,810 --> 00:25:28,090
We'll use CR and TCR here.

477
00:25:28,090 --> 00:25:35,460
And this says it's
infeasible, given

478
00:25:35,460 --> 00:25:39,200
x-- so there's a
specific x that you

479
00:25:39,200 --> 00:25:41,590
want to find a collision
for, as opposed

480
00:25:41,590 --> 00:25:45,360
to just finding any pair
x and x prime that collide.

481
00:25:45,360 --> 00:25:49,700
And any pair would suffice to
break the collision resistance

482
00:25:49,700 --> 00:25:50,560
property.

483
00:25:50,560 --> 00:25:54,630
But what TCR says is I'm going
to give you a specific x.

484
00:25:54,630 --> 00:25:57,750
And I want you to
find an x prime whose

485
00:25:57,750 --> 00:26:01,050
hash collides with
the hash of x, OK?

486
00:26:01,050 --> 00:26:02,065
That's TCR.

487
00:26:16,350 --> 00:26:18,082
OK that's TCR for you.
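
A hedged sketch of the TCR game, for contrast with a free collision search. The x is fixed in advance and the attacker must hit its digest exactly. As an assumption for illustration, the hash is SHA-256 truncated to 2 bytes so the search ends quickly; `short_hash`, `find_target_collision`, and the message are hypothetical names.

```python
import hashlib


def short_hash(data: bytes, nbytes: int = 2) -> bytes:
    """Stand-in hash: first `nbytes` bytes of SHA-256 (deliberately tiny output)."""
    return hashlib.sha256(data).digest()[:nbytes]


def find_target_collision(x: bytes, nbytes: int = 2, max_tries: int = 5_000_000):
    """Given a specific x, search for x' != x with short_hash(x') == short_hash(x)."""
    target = short_hash(x, nbytes)
    for i in range(max_tries):
        candidate = b"try-" + str(i).encode()
        if candidate != x and short_hash(candidate, nbytes) == target:
            return candidate
    return None                    # gave up within the try budget


x = b"my fixed message"            # the specific x we are handed
x_prime = find_target_collision(x)
```

Each candidate matches the target with probability about 1 in 65,536 here, so the attacker has to work for one particular digest rather than accept any matching pair.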

488
00:26:18,082 --> 00:26:20,040
And just to be clear,
I think you probably

489
00:26:20,040 --> 00:26:23,420
all got this, obviously
we want this here

490
00:26:23,420 --> 00:26:26,340
because we have a
deterministic hash function.

491
00:26:26,340 --> 00:26:29,430
And it's a trivial thing
to say that if you had x,

492
00:26:29,430 --> 00:26:32,380
and you had x again, that you
get the same hash back from it.

493
00:26:32,380 --> 00:26:33,740
That's a requirement, really.

494
00:26:33,740 --> 00:26:36,670
So we want two distinct x
and x primes that are not

495
00:26:36,670 --> 00:26:38,590
equal that end up colliding.

496
00:26:38,590 --> 00:26:40,890
That's really what
a collision is.

497
00:26:40,890 --> 00:26:44,200
And so you see the difference
between CR and TCR?

498
00:26:44,200 --> 00:26:44,700
Yup?

499
00:26:44,700 --> 00:26:45,812
Yeah?

500
00:26:45,812 --> 00:26:49,144
AUDIENCE: Are we to
assume that given an x

501
00:26:49,144 --> 00:26:51,105
it's very easy to
get the h of x back?

502
00:26:51,105 --> 00:26:52,480
SRINIVAS DEVADAS:
So the question

503
00:26:52,480 --> 00:26:57,150
was, given an x, it's poly-time
computation to get h of x.

504
00:26:57,150 --> 00:26:58,230
Absolutely.

505
00:26:58,230 --> 00:27:02,480
Public poly-time computation
given an x to get h of x.

506
00:27:02,480 --> 00:27:08,840
So going this way is easy.

507
00:27:08,840 --> 00:27:15,170
Going this way-- I ran
out of room-- hard.

508
00:27:15,170 --> 00:27:16,954
OK?

509
00:27:16,954 --> 00:27:20,160
AUDIENCE: So does that mean that
TCR is basically the same as 1?

510
00:27:20,160 --> 00:27:22,230
SRINIVAS DEVADAS: No,
no, no, absolutely not.

511
00:27:22,230 --> 00:27:25,890
TCR says it's OK.

512
00:27:25,890 --> 00:27:27,620
You can compute this.

513
00:27:27,620 --> 00:27:29,030
You can get x.

514
00:27:29,030 --> 00:27:30,720
And you can get h of x.

515
00:27:30,720 --> 00:27:33,125
So given x, you know
that you can get h of x.

516
00:27:33,125 --> 00:27:35,000
I didn't actually put
that in the definition.

517
00:27:35,000 --> 00:27:36,800
And maybe I should have.

518
00:27:36,800 --> 00:27:38,860
So given x you can
always get h of x.

519
00:27:38,860 --> 00:27:40,080
Remember that.

520
00:27:40,080 --> 00:27:41,640
It's easy to get h of x.

521
00:27:41,640 --> 00:27:44,350
So any time I say given
x, you can always add it,

522
00:27:44,350 --> 00:27:46,400
saying given x and h of x.

523
00:27:46,400 --> 00:27:48,690
So I'm given x.

524
00:27:48,690 --> 00:27:49,960
I'm given h of x.

525
00:27:49,960 --> 00:27:53,600
I obviously need to
map-- I need to discover

526
00:27:53,600 --> 00:27:58,080
an x prime such that h of
x prime equals h of x, OK?

527
00:27:58,080 --> 00:28:04,490
Now you have situations
where for-- it

528
00:28:04,490 --> 00:28:07,920
may be the case that
for particular x's you

529
00:28:07,920 --> 00:28:08,900
can actually do this.

530
00:28:08,900 --> 00:28:10,363
And that's enough to break TCR.

531
00:28:13,270 --> 00:28:15,640
So you have to have
this strong property

532
00:28:15,640 --> 00:28:22,520
that you really don't want to be
able to find collisions for some--

533
00:28:22,520 --> 00:28:26,470
even if there's a constant
fraction of x's that

534
00:28:26,470 --> 00:28:29,210
break the TCR property, you
don't like your hash function,

535
00:28:29,210 --> 00:28:29,710
OK?

536
00:28:29,710 --> 00:28:31,850
Because you might end
up picking those and go

537
00:28:31,850 --> 00:28:35,490
build security applications
using those properties.

538
00:28:35,490 --> 00:28:37,990
I want to talk a little
bit about the relationship

539
00:28:37,990 --> 00:28:41,240
between OW, CR, and TCR.

540
00:28:41,240 --> 00:28:42,700
So I'm going to
get back to that.

541
00:28:42,700 --> 00:28:45,290
And we're going to be talking
about hash functions that

542
00:28:45,290 --> 00:28:48,076
satisfy one property but
don't satisfy the other.

543
00:28:48,076 --> 00:28:49,700
And I think your
question will probably

544
00:28:49,700 --> 00:28:52,150
be answered better, OK?

545
00:28:52,150 --> 00:28:53,460
Thanks for the question.

546
00:28:53,460 --> 00:28:56,160
So those are the main ones.

547
00:28:56,160 --> 00:28:59,260
And really quickly, I don't
want to spend a lot of time

548
00:28:59,260 --> 00:29:02,972
on this-- but I do
want to put up--

549
00:29:02,972 --> 00:29:05,320
I think I'll leave
these properties up here

550
00:29:05,320 --> 00:29:06,590
for the duration.

551
00:29:06,590 --> 00:29:10,350
Because it's important for you
to see these definitions as we

552
00:29:10,350 --> 00:29:13,580
look at the
applications where we

553
00:29:13,580 --> 00:29:17,090
require these properties, or
a subset of these properties.

554
00:29:17,090 --> 00:29:19,580
But then we have
pseudo-randomness.

555
00:29:19,580 --> 00:29:22,910
And this is simply a
function of the fact

556
00:29:22,910 --> 00:29:31,100
that-- so this is PRF-- we know
we can't build a random oracle.

557
00:29:31,100 --> 00:29:35,300
And so we're going to have to do
something that's pseudo-random.

558
00:29:35,300 --> 00:29:37,840
And basically what
we're saying here

559
00:29:37,840 --> 00:29:45,870
is the behavior is
indistinguishable from random.

560
00:29:50,990 --> 00:29:56,140
So we're going to have to use
non-linearity, things that

561
00:29:56,140 --> 00:29:58,730
are called non-linear
feedback shift registers,

562
00:29:58,730 --> 00:30:00,370
to create pseudo-random
functions.

563
00:30:00,370 --> 00:30:03,710
There's many ways that we can
create pseudo-random functions.

564
00:30:03,710 --> 00:30:05,310
We won't really get into that.

565
00:30:05,310 --> 00:30:07,680
But obviously
that's what we want.

566
00:30:07,680 --> 00:30:14,420
And then the last
one is a bit tricky.

567
00:30:14,420 --> 00:30:18,830
And we will have an app that
requires this way at the end.

568
00:30:18,830 --> 00:30:29,240
But this is infeasible
given h of x

569
00:30:29,240 --> 00:30:42,010
to produce h of x prime, where
x and x prime are-- and it gets

570
00:30:42,010 --> 00:30:50,150
a little bit fuzzy here-- are
related in some fashion, right?

571
00:30:50,150 --> 00:30:53,630
So a concrete
example of this is,

572
00:30:53,630 --> 00:30:59,770
let's say that x
prime is x plus 1.

573
00:30:59,770 --> 00:31:02,630
So this is a reasonable
example of this.

574
00:31:02,630 --> 00:31:09,930
So what this says is
you're just given h of x.

575
00:31:09,930 --> 00:31:12,680
It doesn't actually say
anything about one-wayness yet.

576
00:31:12,680 --> 00:31:14,670
But you could
assume, for example,

577
00:31:14,670 --> 00:31:18,510
that if this was a
one-way hash function,

578
00:31:18,510 --> 00:31:23,581
that it would be impossible to
get x from h of x, correct?

579
00:31:26,300 --> 00:31:28,070
And let's keep that thought.

580
00:31:28,070 --> 00:31:29,470
Hold that thought, all right?

581
00:31:29,470 --> 00:31:31,290
We're going to get back to it.

582
00:31:31,290 --> 00:31:36,710
So if I'm just given the hash
through some computation,

583
00:31:36,710 --> 00:31:40,300
it may be possible for me
to create another hash, h

584
00:31:40,300 --> 00:31:45,330
of x prime, such that
there's some relationship

585
00:31:45,330 --> 00:31:51,010
that I can prove or argue
for between the strings that

586
00:31:51,010 --> 00:31:54,390
created the hashes,
namely x and x prime, OK?

587
00:31:54,390 --> 00:31:57,330
That's what
malleability is, right?

588
00:31:57,330 --> 00:32:03,440
Now you might just go off and
say here's an x, here's a y,

589
00:32:03,440 --> 00:32:07,700
here's h of x,
and here's h of y.

590
00:32:07,700 --> 00:32:09,620
These look completely random.

591
00:32:09,620 --> 00:32:12,890
And you might go off-- I'm
being facetious here-- and

592
00:32:12,890 --> 00:32:17,767
say that y is x's third cousin's
roommate's brother-in-law

593
00:32:17,767 --> 00:32:18,600
or something, right?

594
00:32:18,600 --> 00:32:20,600
I mean just make
something up, right?

595
00:32:20,600 --> 00:32:26,470
So clearly there's got to be
a strong, precise relationship

596
00:32:26,470 --> 00:32:27,780
between x and y.

597
00:32:27,780 --> 00:32:32,180
If in fact you could
do this and get y

598
00:32:32,180 --> 00:32:36,160
equals x plus 1, that'd
be a problem, right?

599
00:32:36,160 --> 00:32:38,840
But if you are--
and then you can

600
00:32:38,840 --> 00:32:42,280
do this sort of consistently
for different x's and y's, that

601
00:32:42,280 --> 00:32:44,980
would absolutely be
a problem, right?

602
00:32:44,980 --> 00:32:48,440
But what you're really
asking for-- and typically

603
00:32:48,440 --> 00:32:50,710
when you want
non-malleability-- it's

604
00:32:50,710 --> 00:32:55,000
things where you have
auctions, for example, where

605
00:32:55,000 --> 00:32:58,350
you have to be careful, because
you don't want

606
00:32:58,350 --> 00:33:01,320
to expose your bid.

607
00:33:01,320 --> 00:33:04,700
And so maybe what you're
doing is exposing h of x.

608
00:33:04,700 --> 00:33:08,960
You don't want somebody
to look at your h of x

609
00:33:08,960 --> 00:33:10,420
and figure out how
they could beat

610
00:33:10,420 --> 00:33:13,540
your bid by just a little bit.

611
00:33:13,540 --> 00:33:17,140
Or in case of Vickrey auctions,
where the second highest bidder

612
00:33:17,140 --> 00:33:20,031
wins, how to just be a little
bit below you, right?

613
00:33:20,031 --> 00:33:21,530
So that's the kind
of thing that you

614
00:33:21,530 --> 00:33:25,110
want to think about when it
comes to non-malleability,

615
00:33:25,110 --> 00:33:28,880
or malleability, where you
want a strong relationship

616
00:33:28,880 --> 00:33:32,300
between two strings
that are related

617
00:33:32,300 --> 00:33:35,510
in some ordered fashion,
like x equals-- x prime

618
00:33:35,510 --> 00:33:38,950
equals x plus 1, or just
x prime equals 2 times x.

619
00:33:38,950 --> 00:33:43,040
And you don't want
to be able to-- you

620
00:33:43,040 --> 00:33:45,350
don't want the adversary
to be able to discover

621
00:33:45,350 --> 00:33:47,670
these new strings.

622
00:33:47,670 --> 00:33:51,440
Because that would break
the system, all right?
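
A hedged sketch of what a malleable function looks like, using the x prime equals x plus 1 example. As an assumption for illustration only, take h(x) = G^x mod P with the hypothetical parameters below; this is a toy, not a vetted hash. Anyone holding h(x) can produce h(x + 1) with one multiplication, without ever learning x.

```python
P = 2**127 - 1        # a Mersenne prime, used only as a convenient modulus
G = 3                 # hypothetical base


def h(x: int) -> int:
    """Toy function h(x) = G^x mod P."""
    return pow(G, x, P)


def bump(hx: int) -> int:
    """Given only h(x), return h(x + 1); x itself is never needed."""
    return (hx * G) % P


secret_bid = 1_000_000          # known only to the bidder
published = h(secret_bid)       # what the bidder exposes
forged = bump(published)        # a rival's value for secret_bid + 1
```

In the auction setting this is exactly the failure to avoid: the rival outbids by one without knowing the original bid.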

623
00:33:51,440 --> 00:33:55,580
So any questions
about properties?

624
00:33:55,580 --> 00:33:57,620
Are we all good on
these properties?

625
00:33:57,620 --> 00:33:59,840
All right, because I'm
going to start asking you

626
00:33:59,840 --> 00:34:03,010
how to use them for
particular applications,

627
00:34:03,010 --> 00:34:09,170
or what properties are required
for certain applications, OK?

628
00:34:09,170 --> 00:34:11,150
One last thing
before we get there.

629
00:34:11,150 --> 00:34:16,960
I promised a slightly
more detailed analysis

630
00:34:16,960 --> 00:34:20,170
of the relationships
between these properties.

631
00:34:20,170 --> 00:34:20,974
So let's do that.

632
00:34:24,810 --> 00:34:27,830
Now if you just look
at it, eyeball it,

633
00:34:27,830 --> 00:34:34,888
and you look at collision
resistance and TCR,

634
00:34:34,888 --> 00:34:36,429
what can I say about
the relationship

635
00:34:36,429 --> 00:34:40,820
between CR and TCR?

636
00:34:40,820 --> 00:34:45,953
If h is CR, it's going
to be TCR, right?

637
00:34:45,953 --> 00:34:46,744
It's got to be TCR.

638
00:34:46,744 --> 00:34:48,735
It's a strictly
stronger requirement.

639
00:34:54,659 --> 00:34:55,415
But not reverse.
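
The forward direction can be argued by contrapositive. The following is a sketch in notation that is not from the board; A is a hypothetical algorithm that breaks TCR.

```latex
\begin{aligned}
&\text{Suppose an algorithm } A \text{ breaks TCR: given } x, \text{ it outputs } x' \neq x \text{ with } h(x') = h(x).\\
&\text{Pick any } x, \text{ run } A(x) \text{ to get } x', \text{ and output the pair } (x, x').\\
&\text{That pair satisfies } x \neq x' \text{ and } h(x) = h(x'), \text{ so CR is broken.}\\
&\text{Hence } \neg\,\mathrm{TCR} \Rightarrow \neg\,\mathrm{CR}, \text{ i.e. } \mathrm{CR} \Rightarrow \mathrm{TCR}.
\end{aligned}
```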

640
00:34:57,940 --> 00:35:04,230
And you can actually
give a concrete example

641
00:35:04,230 --> 00:35:07,077
of a particular hash
function that is TCR.

642
00:35:07,077 --> 00:35:08,160
I'm not going to go there.

643
00:35:08,160 --> 00:35:09,659
It's actually a
little more involved

644
00:35:09,659 --> 00:35:12,780
than you might think it is,
where a TCR hash function is

645
00:35:12,780 --> 00:35:14,430
not collision resistant.

646
00:35:14,430 --> 00:35:17,180
But you can see that
examples such as these

647
00:35:17,180 --> 00:35:20,340
should exist, simply because I
have a more stringent property

648
00:35:20,340 --> 00:35:22,280
corresponding to
collision resistance

649
00:35:22,280 --> 00:35:24,680
as opposed to TCR, right?

650
00:35:24,680 --> 00:35:27,170
So if you're interested in
that particular example,

651
00:35:27,170 --> 00:35:29,780
you're not responsible for
it, get in touch with me

652
00:35:29,780 --> 00:35:32,545
and I'll point you to a,
like a three-page description

653
00:35:32,545 --> 00:35:34,180
of an example.

654
00:35:34,180 --> 00:35:35,930
So I didn't really
want to go in there.

655
00:35:35,930 --> 00:35:40,170
But what I do want to do is talk
about one-wayness and collision

656
00:35:40,170 --> 00:35:40,820
resistance.

657
00:35:40,820 --> 00:35:43,069
Because I think that's
actually much more interesting,

658
00:35:43,069 --> 00:35:43,720
all right?

659
00:35:43,720 --> 00:35:59,060
So if h is one-way--
any conjectures

660
00:35:59,060 --> 00:36:03,370
as to what the question
mark is in the middle?

661
00:36:03,370 --> 00:36:07,950
Can I make strong statements
about the collision resistance

662
00:36:07,950 --> 00:36:10,430
of a hash function,
if I'm guaranteed

663
00:36:10,430 --> 00:36:14,010
that the hash function I have
is a one-way hash function,

664
00:36:14,010 --> 00:36:14,730
or vice versa?

665
00:36:20,960 --> 00:36:23,080
Another way of
putting it is, can you

666
00:36:23,080 --> 00:36:28,970
give me an example of,
just to start with,

667
00:36:28,970 --> 00:36:35,096
a hash function which is
one-way but not TCR, not

668
00:36:35,096 --> 00:36:36,220
target collision resistant?

669
00:36:40,520 --> 00:36:43,540
So I'm going to try and
extract this out of you.

670
00:36:43,540 --> 00:36:46,870
This is somewhat subtle.

671
00:36:46,870 --> 00:36:48,990
But the way you want
to think about this

672
00:36:48,990 --> 00:36:59,260
is, let's say that h
of x is OW and TCR, OK?

673
00:36:59,260 --> 00:37:02,660
And so I have a bunch of inputs.

674
00:37:02,660 --> 00:37:03,660
And this is the output.

675
00:37:03,660 --> 00:37:06,160
And I get d-bits out.

676
00:37:06,160 --> 00:37:12,010
And I've got x1, x2, to xn, OK?

677
00:37:12,010 --> 00:37:16,620
Now I've given this h--
I've been given this h which

678
00:37:16,620 --> 00:37:18,240
is one-way and TCR.

679
00:37:18,240 --> 00:37:20,960
It satisfies those properties
that you have up there.

680
00:37:20,960 --> 00:37:24,590
In the case of one-way, I give
you an arbitrary d-bit string.

681
00:37:24,590 --> 00:37:28,770
You can't go backwards and
find a bunch of the xi's that

682
00:37:28,770 --> 00:37:34,150
produce exactly that
d-bit string, all right?

683
00:37:34,150 --> 00:37:36,530
So it's going to be
hard to get here.

684
00:37:36,530 --> 00:37:40,380
But you're allowed now
to give me an example.

685
00:37:40,380 --> 00:37:45,390
So this is some hash
function that you can create,

686
00:37:45,390 --> 00:37:48,300
which may use h as well.

687
00:37:48,300 --> 00:37:51,780
And h is kind of nice because
it has this one-way property.

688
00:37:51,780 --> 00:37:55,030
So let's say that we want
to discover something where

689
00:37:55,030 --> 00:37:59,080
one-way does not imply TCR.

690
00:37:59,080 --> 00:38:03,490
So I want to cook up a
hash function h prime such

691
00:38:03,490 --> 00:38:09,550
that h prime is one-way,
but it's not TCR, OK?

692
00:38:09,550 --> 00:38:13,610
The way you want to think about
this is you want to add to h.

693
00:38:13,610 --> 00:38:16,790
And you want to add something
to h such that it's still hard--

694
00:38:16,790 --> 00:38:20,347
if you add h it's still hard
to go from here to there.

695
00:38:20,347 --> 00:38:21,680
Because you've got to go deeper.

696
00:38:21,680 --> 00:38:23,760
If you add to, for
example, the inputs of h.

697
00:38:23,760 --> 00:38:26,170
Or you could add to the
outputs of h as well,

698
00:38:26,170 --> 00:38:27,730
or the outputs of the current h.

699
00:38:27,730 --> 00:38:34,670
But you can basically go deeper,
or need to go deeper in order

700
00:38:34,670 --> 00:38:39,580
to break
one-wayness, in order

701
00:38:39,580 --> 00:38:43,820
to find an x, whatever you have,
that produces the d-bit string

702
00:38:43,820 --> 00:38:44,910
that you have, right?

703
00:38:44,910 --> 00:38:49,690
So what's a simple way of
creating an h prime such that

704
00:38:49,690 --> 00:38:53,700
it's going to be pretty easy to
find targeted collisions even,

705
00:38:53,700 --> 00:38:56,150
not necessarily collisions,
it's pretty easy to find

706
00:38:56,150 --> 00:38:58,740
targeted collisions,
without breaking

707
00:38:58,740 --> 00:39:00,180
the one-way property of h?

708
00:39:03,785 --> 00:39:05,264
Yeah?

709
00:39:05,264 --> 00:39:11,673
AUDIENCE: So if you have
x sub i, if i odd then

710
00:39:11,673 --> 00:39:14,631
return h of x of i.

711
00:39:14,631 --> 00:39:16,603
So that's minus 1.

712
00:39:16,603 --> 00:39:18,552
So return the even group.

713
00:39:18,552 --> 00:39:19,510
SRINIVAS DEVADAS: Sure.

714
00:39:19,510 --> 00:39:21,004
Yep.

715
00:39:21,004 --> 00:39:24,241
AUDIENCE: Given
x any x of i, you

716
00:39:24,241 --> 00:39:27,478
can usually find another x of
i that was the same output?

717
00:39:27,478 --> 00:39:28,980
You can go backwards.

718
00:39:28,980 --> 00:39:29,700
SRINIVAS DEVADAS: You
can't go backwards.

719
00:39:29,700 --> 00:39:30,500
Yeah, that's good.

720
00:39:30,500 --> 00:39:31,450
That's good.

721
00:39:31,450 --> 00:39:34,114
I'm going to do something that's
almost exactly what you said.

722
00:39:34,114 --> 00:39:35,655
But I'm going to
draw it pictorially.

723
00:39:38,270 --> 00:39:42,705
And what you can do, you can
do a parity, like odd and even

724
00:39:42,705 --> 00:39:43,705
that was just described.

725
00:39:47,520 --> 00:39:51,440
And all I'll do is add
a little XOR

726
00:39:51,440 --> 00:39:55,240
gate, which is a parity
gate, to one of the inputs.

727
00:39:55,240 --> 00:39:56,830
So you have a and b here.

728
00:39:56,830 --> 00:40:01,010
So I've taken x1, and
I have a and b here.

729
00:40:01,010 --> 00:40:04,560
So I've added-- I can
add as many inputs

730
00:40:04,560 --> 00:40:06,190
as I want to this function.

731
00:40:06,190 --> 00:40:08,830
Oh I should mention
by the way, h of x

732
00:40:08,830 --> 00:40:11,290
is working on arbitrary strings.

733
00:40:11,290 --> 00:40:13,630
And obviously I
put in some number

734
00:40:13,630 --> 00:40:16,774
here that corresponds to
n, which is a fixed number.

735
00:40:16,774 --> 00:40:19,190
So you might ask, what the
heck happened here with respect

736
00:40:19,190 --> 00:40:20,610
to arbitrary strings?

737
00:40:20,610 --> 00:40:22,570
And there's two answers.

738
00:40:22,570 --> 00:40:25,000
The first answer is,
well, ignore arbitrary.

739
00:40:25,000 --> 00:40:27,350
And assume that you
only have n-bit strings.

740
00:40:27,350 --> 00:40:29,370
And n is a really
large number, right?

741
00:40:29,370 --> 00:40:31,500
And that may not be
particularly satisfying.

742
00:40:31,500 --> 00:40:34,220
The other answer is,
which is more practical,

743
00:40:34,220 --> 00:40:35,850
which is what's
used in practice,

744
00:40:35,850 --> 00:40:38,140
is that typically
what happens is,

745
00:40:38,140 --> 00:40:41,000
you do have particular
implementations

746
00:40:41,000 --> 00:40:43,180
of hash functions that
obviously need to have

747
00:40:43,180 --> 00:40:46,440
fixed inputs, n, for example.

748
00:40:46,440 --> 00:40:48,110
And n is typically 512.

749
00:40:48,110 --> 00:40:49,680
It's usually the block size.

750
00:40:49,680 --> 00:40:52,940
And you chunk the input up
into 512-bit blocks.

751
00:40:52,940 --> 00:40:54,770
And typically what
you do is, you

752
00:40:54,770 --> 00:40:57,800
take the first 512 bits,
compute the hash for it.

753
00:40:57,800 --> 00:41:02,280
And then you can do it
for the remaining blocks.

754
00:41:02,280 --> 00:41:04,530
And then you can hash all
of them together, all right?
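
A rough sketch of handling arbitrary-length input with a fixed-size step. This version chains the blocks, which is one common way to combine them and is an assumption here, not necessarily the exact scheme being described. `compress` is a hypothetical stand-in built from SHA-256, and the zero padding is a simplification.

```python
import hashlib

BLOCK_BYTES = 64      # 512 bits per block
STATE_BYTES = 32      # size of the running state in this sketch


def compress(state: bytes, block: bytes) -> bytes:
    """Stand-in fixed-size step: 32-byte state + 64-byte block -> 32-byte state."""
    return hashlib.sha256(state + block).digest()


def chunked_hash(message: bytes) -> bytes:
    """Process the message one 512-bit block at a time, chaining the state."""
    # Pad with zero bytes to a whole number of blocks. Real schemes also
    # encode the message length so that padding cannot create ambiguities.
    padded = message + b"\x00" * (-len(message) % BLOCK_BYTES)
    state = b"\x00" * STATE_BYTES
    for i in range(0, len(padded), BLOCK_BYTES):
        state = compress(state, padded[i:i + BLOCK_BYTES])
    return state


digest = chunked_hash(b"a" * 200)   # 200 bytes -> 4 blocks after padding
```

An input of 2 times n or 3 times n simply costs two or three invocations of the fixed-size step.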

755
00:41:04,530 --> 00:41:06,872
So there's typically
more invocations.

756
00:41:06,872 --> 00:41:08,330
I don't really want
to get into it.

757
00:41:08,330 --> 00:41:11,370
But there's typically
more invocations of h

758
00:41:11,370 --> 00:41:15,600
when the input would be 2 times
n, or 3 times n, all right?

759
00:41:15,600 --> 00:41:17,410
So we don't really
need to go there

760
00:41:17,410 --> 00:41:18,960
for the purposes
of this lecture.

761
00:41:18,960 --> 00:41:20,270
But keep that in mind.

762
00:41:20,270 --> 00:41:23,750
So we'll still stick with our
arbitrary string requirement.

763
00:41:23,750 --> 00:41:26,410
So having said that, take
a look at this picture.

764
00:41:26,410 --> 00:41:30,190
And see what this
picture implies.

765
00:41:30,190 --> 00:41:33,340
I have an h prime that
I've constructed, right?

766
00:41:33,340 --> 00:41:36,720
Now if I look at h
prime, and I give you

767
00:41:36,720 --> 00:41:40,270
an output for h prime--
so h prime now has,

768
00:41:40,270 --> 00:41:45,640
it's a function of a and b, and
x2 all the way to xn, right?

769
00:41:45,640 --> 00:41:47,850
So it's got an extra input.

770
00:41:47,850 --> 00:41:50,630
If I look at h prime, and I look
at the output of h prime that

771
00:41:50,630 --> 00:41:56,280
is given to me, and I need
to discover something that

772
00:41:56,280 --> 00:42:00,280
produces that, it is pretty
clear that I need to figure out

773
00:42:00,280 --> 00:42:03,400
what these values
are, all right?

774
00:42:03,400 --> 00:42:06,930
And I need to know what
the parity of a and b is.

775
00:42:06,930 --> 00:42:09,293
And maybe I don't need to
know exactly what a and b are,

776
00:42:09,293 --> 00:42:11,626
but I absolutely need to know
what the parity of a and b

777
00:42:11,626 --> 00:42:13,230
is, because that's x1.

778
00:42:13,230 --> 00:42:15,490
And breaking one-wayness
would require

779
00:42:15,490 --> 00:42:17,670
me to tell you what
the value of x1 is,

780
00:42:17,670 --> 00:42:20,070
and the value of x2,
and so on and so forth.

781
00:42:20,070 --> 00:42:23,640
So it's pretty clear that
h prime is one-way, right?

782
00:42:23,640 --> 00:42:25,520
Everybody buy that?

783
00:42:25,520 --> 00:42:28,870
h prime is one-way.

784
00:42:28,870 --> 00:42:30,160
But you know what?

785
00:42:30,160 --> 00:42:33,860
I've got target
collisions galore, right?

786
00:42:33,860 --> 00:42:37,360
All I have to do is flip-- I
have a equals 1 and b equals 1.

787
00:42:37,360 --> 00:42:39,770
And I have a equals
0 and b equals 0.

788
00:42:39,770 --> 00:42:42,350
They're going to give
me the same hash, right?

789
00:42:42,350 --> 00:42:45,690
So trivial example,
but that gets

790
00:42:45,690 --> 00:42:50,070
to the essence of the difference
between collision resistance

791
00:42:50,070 --> 00:42:52,290
and one-wayness, target
collision resistance

792
00:42:52,290 --> 00:42:54,210
and one-wayness, all right?

793
00:42:54,210 --> 00:43:03,710
So this is one-way but not TCR,
simply because a equals 0, b

794
00:43:03,710 --> 00:43:06,500
equals 0 for
arbitrary x's produce

795
00:43:06,500 --> 00:43:11,200
the same thing as a equals
1 and b equals 1, right?

796
00:43:11,200 --> 00:43:13,940
So those are collisions.
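
The counterexample can be sketched in a few lines. As an assumption, SHA-256 stands in for the one-way, TCR function h, and the bits are packed one per byte for readability; `h_prime` and `rest` are hypothetical names. The first input of h is replaced by the parity of two new inputs a and b.

```python
import hashlib


def h(bits: list) -> bytes:
    """Stand-in for the assumed one-way, TCR hash h(x1, ..., xn)."""
    return hashlib.sha256(bytes(bits)).digest()


def h_prime(a: int, b: int, rest: list) -> bytes:
    """h with x1 replaced by the parity (XOR) of the two new inputs a and b."""
    x1 = a ^ b
    return h([x1] + rest)


rest = [1, 0, 1, 1]              # arbitrary values for x2 .. xn
d00 = h_prime(0, 0, rest)        # a = 0, b = 0
d11 = h_prime(1, 1, rest)        # a = 1, b = 1: same parity, same digest
d01 = h_prime(0, 1, rest)        # different parity feeds a different x1 into h
```

Given any input with a = b, flipping both bits yields a target collision immediately, while inverting `h_prime` still requires inverting h.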

797
00:43:13,940 --> 00:43:15,690
So admittedly contrived.

798
00:43:15,690 --> 00:43:19,350
But it's a counterexample.

799
00:43:19,350 --> 00:43:21,150
Counterexamples
can be contrived.

800
00:43:21,150 --> 00:43:23,510
It's OK.

801
00:43:23,510 --> 00:43:24,710
All right.

802
00:43:24,710 --> 00:43:28,470
So that was what
happens with that.

803
00:43:28,470 --> 00:43:32,400
Let's look at one
more interesting thing

804
00:43:32,400 --> 00:43:36,150
that corresponds to
the other way, right?

805
00:43:36,150 --> 00:43:46,030
So what I want to show is that
TCR does not imply one-wayness.

806
00:43:59,040 --> 00:44:03,122
OK, so now I want an
example where it is clear

807
00:44:03,122 --> 00:44:05,580
that I have target collision
resistance, because I can just

808
00:44:05,580 --> 00:44:06,370
assume that.

809
00:44:06,370 --> 00:44:08,310
And we're going to
use the same strategy.

810
00:44:08,310 --> 00:44:10,550
I'm just going to assume
that I have an h that's

811
00:44:10,550 --> 00:44:12,240
target collision resistant.

812
00:44:12,240 --> 00:44:16,250
And I'm going to try and cook up
an h prime that is not one-way.

813
00:44:16,250 --> 00:44:21,080
So I'm going to assume that
in fact h is TCR and OW.

814
00:44:21,080 --> 00:44:24,420
And I'm going to take away
one of the properties.

815
00:44:24,420 --> 00:44:26,060
And if I take away one
of the properties

816
00:44:26,060 --> 00:44:28,350
I have a counterexample, right?

817
00:44:28,350 --> 00:44:34,320
So think about how
you could do this.

818
00:44:34,320 --> 00:44:38,355
You have h as before.

819
00:44:41,920 --> 00:44:46,330
And I want to add
some stuff around it

820
00:44:46,330 --> 00:44:52,820
such that it's going to be
easy to discover-- for a large,

821
00:44:52,820 --> 00:44:55,610
for a constant
fraction of hashes

822
00:44:55,610 --> 00:44:58,430
that are given to me,
not for any old hash.

823
00:44:58,430 --> 00:45:01,000
Because you can always
claim that one-wayness

824
00:45:01,000 --> 00:45:06,360
is broken by saying I have
x, I computed h of x, now

825
00:45:06,360 --> 00:45:09,780
I know what-- given h
of x I know what x is.

826
00:45:09,780 --> 00:45:11,970
I mean you can't do that, right?

827
00:45:11,970 --> 00:45:14,360
So that's not breaking
the one-wayness of it.

828
00:45:14,360 --> 00:45:16,420
It's when you have
an h of x and this

829
00:45:16,420 --> 00:45:18,250
is the first time
you've seen it,

830
00:45:18,250 --> 00:45:20,660
you're trying to find
what x is, right?

831
00:45:20,660 --> 00:45:23,370
So how would you-- how
would you set it up

832
00:45:23,370 --> 00:45:28,230
so you break the
one-wayness of h

833
00:45:28,230 --> 00:45:31,310
without necessarily breaking
the target collision

834
00:45:31,310 --> 00:45:37,430
resistance of the overall hash
function that you're creating?

835
00:45:37,430 --> 00:45:41,339
And you have to do something
with the outputs, OK?

836
00:45:41,339 --> 00:45:42,380
You have to do something.

837
00:45:42,380 --> 00:45:43,671
This is a little more involved.

838
00:45:43,671 --> 00:45:45,734
It's not as easy
as this example.

839
00:45:45,734 --> 00:45:46,900
It's a little more involved.

840
00:45:46,900 --> 00:45:47,920
But any ideas?

841
00:45:51,240 --> 00:45:52,761
Yeah, go ahead.

842
00:45:52,761 --> 00:45:55,707
AUDIENCE: So x is
less than b returns x.

843
00:45:55,707 --> 00:45:57,964
If x is greater than
b, return [INAUDIBLE].

844
00:45:57,964 --> 00:45:59,130
SRINIVAS DEVADAS: Beautiful.

845
00:45:59,130 --> 00:45:59,460
Right.

846
00:45:59,460 --> 00:46:00,970
What color did
you get last time?

847
00:46:00,970 --> 00:46:02,150
AUDIENCE: Blue.

848
00:46:02,150 --> 00:46:03,050
SRINIVAS DEVADAS: You
got a blue last time?

849
00:46:03,050 --> 00:46:03,800
All right.

850
00:46:03,800 --> 00:46:04,890
Well you get a purple.

851
00:46:04,890 --> 00:46:06,190
You have a set.

852
00:46:06,190 --> 00:46:09,220
Actually we have these red ones
that are precious, that are--

853
00:46:09,220 --> 00:46:12,780
no, we don't.

854
00:46:12,780 --> 00:46:14,479
We chose not to do red.

855
00:46:14,479 --> 00:46:15,020
I don't know.

856
00:46:15,020 --> 00:46:17,370
There was some
subliminal message

857
00:46:17,370 --> 00:46:20,750
I think with throwing red
Frisbees that we didn't like.

858
00:46:20,750 --> 00:46:21,380
But OK.

859
00:46:21,380 --> 00:46:22,550
So thank you.

860
00:46:22,550 --> 00:46:33,260
And h of x is simply
something where

861
00:46:33,260 --> 00:46:37,360
I'm going to concatenate
a zero to the x value

862
00:46:37,360 --> 00:46:38,660
and just put it out.

863
00:46:38,660 --> 00:46:40,810
And clearly this is
breaking one-wayness

864
00:46:40,810 --> 00:46:43,610
because I'm just taking the
input, I'm adding a zero to it,

865
00:46:43,610 --> 00:46:44,730
and shipping it out.

866
00:46:44,730 --> 00:46:46,900
So it's going to be easy
to go backwards, right?

867
00:46:46,900 --> 00:46:53,500
And this only happens
if x is less than n,

868
00:46:53,500 --> 00:46:55,460
as the gentleman just said.

869
00:46:55,460 --> 00:47:00,220
Less than or equal to n in
terms of the input length, OK?

870
00:47:00,220 --> 00:47:03,131
Otherwise I'm
going to do h of x.

871
00:47:08,270 --> 00:47:10,160
So this is good news.

872
00:47:10,160 --> 00:47:15,400
Because I'm actually using
the hash function in the case

873
00:47:15,400 --> 00:47:17,890
where I have a
longer input string.

874
00:47:17,890 --> 00:47:20,660
This is bad news for
one-wayness because I'm just

875
00:47:20,660 --> 00:47:23,010
piping out the input.

876
00:47:23,010 --> 00:47:30,927
And so if I get an x, and I
see what the x is out here,

877
00:47:30,927 --> 00:47:32,510
and let's just say
for argument's sake

878
00:47:32,510 --> 00:47:38,480
that-- you could
even say that n is

879
00:47:38,480 --> 00:47:43,330
going to be something
that is less than d,

880
00:47:43,330 --> 00:47:46,210
which is the final
output, which has d bits.

881
00:47:46,210 --> 00:47:49,090
And so if you see something
that h prime produces

882
00:47:49,090 --> 00:47:51,450
that's less than
d bits you instantly

883
00:47:51,450 --> 00:47:54,030
know that you can go
backwards and discover

884
00:47:54,030 --> 00:47:57,186
what input produced that
for the h prime, right?

885
00:47:57,186 --> 00:47:59,060
Because you just go off
and you go backwards.

886
00:47:59,060 --> 00:48:00,350
This is what it tells you.

887
00:48:00,350 --> 00:48:01,850
Now on the other
hand if it's larger

888
00:48:01,850 --> 00:48:03,160
obviously you can't do that.

889
00:48:03,160 --> 00:48:06,770
But there's a whole
lot of combinations

890
00:48:06,770 --> 00:48:08,100
that you can do that for.

891
00:48:08,100 --> 00:48:11,300
So this breaks one-wayness, OK?

892
00:48:11,300 --> 00:48:13,074
Now you think about TCR.

893
00:48:13,074 --> 00:48:14,490
And what you want
to show of course

894
00:48:14,490 --> 00:48:17,570
is that this maintains TCR.

895
00:48:17,570 --> 00:48:20,622
So that's the last thing
that we have to show.

896
00:48:20,622 --> 00:48:22,080
We know that it
breaks one-wayness.

897
00:48:22,080 --> 00:48:25,182
But if it broke TCR we don't
quite have our example.

898
00:48:25,182 --> 00:48:26,640
So we want to show
that it actually

899
00:48:26,640 --> 00:48:31,220
maintains TCR, which is
kind of a weakish property

900
00:48:31,220 --> 00:48:33,440
that we need to maintain.

901
00:48:33,440 --> 00:48:35,890
And the reason
this maintains TCR

902
00:48:35,890 --> 00:48:39,290
is that there's really only
two cases here obviously,

903
00:48:39,290 --> 00:48:41,720
corresponding to
the if statement.

904
00:48:41,720 --> 00:48:49,280
And it's pretty clear that if
x is less than or equal to n,

905
00:48:49,280 --> 00:49:03,520
clearly different x's produce
different h prime x's, correct?

906
00:49:03,520 --> 00:49:06,620
Because I'm just passing
along the x out to the output.

907
00:49:06,620 --> 00:49:09,730
So if x is less than n I am
going to get different hashes

908
00:49:09,730 --> 00:49:10,570
at the output.

909
00:49:10,570 --> 00:49:12,350
I'm just passing them out.

910
00:49:12,350 --> 00:49:13,940
So that's easy.

911
00:49:13,940 --> 00:49:17,490
And for the other case,
well I assume that h of x

912
00:49:17,490 --> 00:49:20,129
was TCR, correct?

913
00:49:20,129 --> 00:49:22,420
Because that was the original
assumption, that I had h,

914
00:49:22,420 --> 00:49:23,540
which was TCR.

915
00:49:23,540 --> 00:49:30,690
So in both cases TCR is
maintained because else h

916
00:49:30,690 --> 00:49:38,350
of x maintains TCR, all right?
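
A minimal sketch of the h prime construction described above. Assumptions not in the lecture: lengths are measured in bytes, the threshold N is 16, SHA-256 stands in for the underlying h, and the long-input branch is tagged with a 1 byte so the two cases can never produce the same output.

```python
import hashlib

N = 16  # hypothetical length threshold, in bytes


def h(x: bytes) -> bytes:
    """Stand-in for the underlying hash h (assumed one-way and TCR)."""
    return hashlib.sha256(x).digest()


def h_prime(x: bytes) -> bytes:
    """Short inputs are passed through with a 0 tag; long inputs are hashed."""
    if len(x) <= N:
        return b"\x00" + x      # input is visible in the output: not one-way
    return b"\x01" + h(x)       # assumed 1 tag keeps the two branches disjoint


def invert_short(y: bytes):
    """Recover x from h_prime(x) whenever the short branch was taken."""
    if y[:1] == b"\x00":
        return y[1:]
    return None                 # long branch: inverting means inverting h
```

Distinct short inputs give distinct outputs, and the tagged branches cannot collide with each other, so finding a second input with the same h prime value reduces to doing so for h itself.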

917
00:49:38,350 --> 00:49:41,284
So again, a bit of
a contrived example

918
00:49:41,284 --> 00:49:42,700
to show you the
difference between

919
00:49:42,700 --> 00:49:45,510
these different properties so
you know not to mix them up.

920
00:49:45,510 --> 00:49:47,630
You know what you
want to ask for,

921
00:49:47,630 --> 00:49:51,150
what is required
when you actually

922
00:49:51,150 --> 00:49:53,870
implement an
application that depends

923
00:49:53,870 --> 00:49:56,000
on particular properties.

924
00:49:56,000 --> 00:49:57,230
All right?

925
00:49:57,230 --> 00:49:59,010
Any questions so
far about properties

926
00:49:59,010 --> 00:50:01,040
or any of these examples?

927
00:50:01,040 --> 00:50:03,227
We're going to dive
in to using them.

928
00:50:06,970 --> 00:50:08,510
OK.

929
00:50:08,510 --> 00:50:12,170
So start thinking
computer security.

930
00:50:12,170 --> 00:50:18,090
Start thinking hackers,
protecting yourself

931
00:50:18,090 --> 00:50:20,655
against the bad guys
that are out there who

932
00:50:20,655 --> 00:50:22,640
are trying to discover
your passwords,

933
00:50:22,640 --> 00:50:24,924
trying to corrupt
your files, generally

934
00:50:24,924 --> 00:50:25,965
make your life miserable.

935
00:50:32,880 --> 00:50:38,880
And we'll start out with
fairly simple examples, where

936
00:50:38,880 --> 00:50:41,730
the properties are
somewhat obvious,

937
00:50:41,730 --> 00:50:46,205
and graduate to this auction
bidding example which

938
00:50:46,205 --> 00:50:48,080
should be sort of the
culmination of at least

939
00:50:48,080 --> 00:50:50,120
this part of the lecture.

940
00:50:50,120 --> 00:50:52,470
And depending on
how much time I have

941
00:50:52,470 --> 00:50:54,800
I'll tell you a
little bit about how

942
00:50:54,800 --> 00:50:56,730
to implement hash functions.

943
00:50:56,730 --> 00:50:59,640
But I think these
things are more

944
00:50:59,640 --> 00:51:03,580
important from a
standpoint of giving you

945
00:51:03,580 --> 00:51:08,610
a sense of cryptographic hashes.

946
00:51:08,610 --> 00:51:10,380
All right.

947
00:51:10,380 --> 00:51:11,970
Password storage.

948
00:51:11,970 --> 00:51:16,730
How many of you write your
password in an unencrypted text

949
00:51:16,730 --> 00:51:22,230
file and store it in
a readable location?

950
00:51:22,230 --> 00:51:24,380
There you go, man.

951
00:51:24,380 --> 00:51:27,390
Thank you for being honest.

952
00:51:27,390 --> 00:51:29,550
And I do worse.

953
00:51:29,550 --> 00:51:32,610
Not only do I do that, I
use my first daughter's

954
00:51:32,610 --> 00:51:35,334
name for four passwords.

955
00:51:35,334 --> 00:51:36,750
I won't tell you
what the name is.

956
00:51:41,350 --> 00:51:43,470
So that's something that
we'd like to fix, right?

957
00:51:43,470 --> 00:51:45,500
So what do real systems do?

958
00:51:45,500 --> 00:51:49,530
Real systems cannot protect
against me using my first

959
00:51:49,530 --> 00:51:51,400
daughter's name as
a password, right?

960
00:51:51,400 --> 00:51:53,580
So there's no way you
can protect against that.

961
00:51:53,580 --> 00:51:56,830
But if I had a reasonable
password, which

962
00:51:56,830 --> 00:51:59,030
had reasonable
entropy in it-- so

963
00:51:59,030 --> 00:52:01,344
let's assume here that we
have reasonable entropy

964
00:52:01,344 --> 00:52:02,010
in the password.

965
00:52:02,010 --> 00:52:04,000
And you can just say 128 bits.

966
00:52:04,000 --> 00:52:05,240
And it's not a lot, right?

967
00:52:05,240 --> 00:52:09,135
128 bits is 16 characters, OK?

968
00:52:09,135 --> 00:52:11,260
And you don't have to answer
this-- how many of you

969
00:52:11,260 --> 00:52:15,390
have 16 characters
in your password?

970
00:52:15,390 --> 00:52:16,710
Oh I'm impressed.

971
00:52:16,710 --> 00:52:17,350
OK.

972
00:52:17,350 --> 00:52:18,980
So you've got
128 bits of entropy.

973
00:52:18,980 --> 00:52:21,710
But the rest of you, forget it.

974
00:52:21,710 --> 00:52:25,040
This is not going
to help you, OK?

975
00:52:25,040 --> 00:52:28,140
But what I want,
assuming you have

976
00:52:28,140 --> 00:52:31,830
significant entropy in your
password-- because otherwise,

977
00:52:31,830 --> 00:52:33,940
if there's not
enough entropy you

978
00:52:33,940 --> 00:52:38,272
can just enumerate all possible
passwords of eight letters.

979
00:52:38,272 --> 00:52:39,230
And it's not that much.

980
00:52:39,230 --> 00:52:41,391
It's 2 raised to
50, what have you.

981
00:52:41,391 --> 00:52:42,390
And you can just go off.

982
00:52:42,390 --> 00:52:44,150
And none of these
properties matter.

983
00:52:44,150 --> 00:52:45,810
You just-- you have your h of x.

984
00:52:45,810 --> 00:52:48,206
It's public.

985
00:52:48,206 --> 00:52:50,080
We'll talk about how we
use that in a second.

986
00:52:50,080 --> 00:52:53,350
But clearly if the
domain is small

987
00:52:53,350 --> 00:52:55,120
you can just
enumerate the domain.
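
A rough sketch of enumerating a small domain. Assumptions for illustration only: passwords are lowercase letters of a known length, and SHA-256 stands in for h. With only 26 letters, eight characters give 26^8 candidates, roughly 2^38, which is well within reach of exhaustive search.

```python
import hashlib
from itertools import product
from string import ascii_lowercase


def h(x: bytes) -> bytes:
    """Stand-in for the public hash h."""
    return hashlib.sha256(x).digest()


def enumerate_domain(target: bytes, length: int):
    """Try every lowercase string of the given length until h matches."""
    for letters in product(ascii_lowercase, repeat=length):
        guess = "".join(letters).encode()
        if h(guess) == target:
            return guess
    return None  # the password was outside the enumerated domain


stored = h(b"cat")                       # what an attacker can read
recovered = enumerate_domain(stored, 3)  # 26**3 = 17576 guesses at most
```

No property of h is broken here. The search succeeds purely because the set of possible inputs is small.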

988
00:52:55,120 --> 00:52:57,062
So keep that in mind.

989
00:52:57,062 --> 00:52:58,770
I talked about h of
x, and it's obviously

990
00:52:58,770 --> 00:53:00,300
going to be relevant here.

991
00:53:00,300 --> 00:53:02,520
But suppose I wanted
to build a system,

992
00:53:02,520 --> 00:53:04,300
and this is how
systems are built,

993
00:53:04,300 --> 00:53:06,700
/etc/passwd
file, assuming

994
00:53:06,700 --> 00:53:11,040
you have long passwords
it does it this way,

995
00:53:11,040 --> 00:53:13,320
otherwise it needs something
that's called a salt.

996
00:53:13,320 --> 00:53:16,540
But that's 6.857
and we won't go there.

997
00:53:16,540 --> 00:53:19,590
So we just assume
a large entropy.

998
00:53:19,590 --> 00:53:21,980
What is it that a system can do?

999
00:53:21,980 --> 00:53:26,210
What can it store in order
to let you in, and only

1000
00:53:26,210 --> 00:53:28,830
let you in when you
type your password,

1001
00:53:28,830 --> 00:53:32,190
and not let some bogus
password into the system?

1002
00:53:32,190 --> 00:53:34,610
Or somebody with a bogus
password into the system.

1003
00:53:34,610 --> 00:53:35,249
Yeah, go ahead.

1004
00:53:35,249 --> 00:53:37,540
AUDIENCE: If you capture the
password when you enter it

1005
00:53:37,540 --> 00:53:39,380
and compare it to
what's stored--

1006
00:53:39,380 --> 00:53:40,347
SRINIVAS DEVADAS: Yes.

1007
00:53:40,347 --> 00:53:42,430
AUDIENCE: If it's a one-way
hash you know you have

1008
00:53:42,430 --> 00:53:42,730
what the correct password is.

1009
00:53:42,730 --> 00:53:43,820
SRINIVAS DEVADAS:
That's exactly right.

1010
00:53:43,820 --> 00:53:44,790
That's exactly right.

1011
00:53:44,790 --> 00:53:49,950
So it's a really simple
idea, a very powerful idea.

1012
00:53:49,950 --> 00:53:54,610
It, as I said, assumed that the
entropy-- and I'm belaboring

1013
00:53:54,610 --> 00:53:56,890
the obvious now--
but it is important

1014
00:53:56,890 --> 00:53:59,890
when you talk about security
to state your assumptions.

1015
00:53:59,890 --> 00:54:04,380
But you do not store the
password on your computer.

1016
00:54:04,380 --> 00:54:06,940
And you store the
hash of the password.

1017
00:54:06,940 --> 00:54:09,530
Now why do I store my
password on the computer?

1018
00:54:09,530 --> 00:54:12,200
Because this is so
inconvenient, right?

1019
00:54:12,200 --> 00:54:15,180
So this is what the
system does for me.

1020
00:54:15,180 --> 00:54:18,110
But the fact of the matter
is, if I lose my password,

1021
00:54:18,110 --> 00:54:19,470
this doesn't help me.

1022
00:54:19,470 --> 00:54:24,050
Because what the system wants
you to do is choose a password

1023
00:54:24,050 --> 00:54:26,720
that is long enough,
and the h is one-way.

1024
00:54:26,720 --> 00:54:30,960
So anybody who discovers h of
PW that is publicly readable

1025
00:54:30,960 --> 00:54:33,840
cannot discover PW, all right?

1026
00:54:33,840 --> 00:54:36,420
That's what's cool about this.

1027
00:54:36,420 --> 00:54:38,740
How do you let
the person log in?

1028
00:54:38,740 --> 00:54:47,860
Use h of PW to compare
against h of PW prime,

1029
00:54:47,860 --> 00:54:54,420
which is what is entered, where
PW prime is the typed password.

1030
00:55:00,540 --> 00:55:08,530
And clearly what we need is
the disclosure of h of PW

1031
00:55:08,530 --> 00:55:14,960
should not reveal PW.

1032
00:55:14,960 --> 00:55:19,570
So we definitely
need one-wayness.
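
The login check can be sketched as follows, under the lecture's assumption of a high-entropy password. SHA-256 is only a stand-in for h, and the function names are hypothetical. Salting, which the lecture sets aside, is deliberately left out.

```python
import hashlib
import hmac


def h(pw: str) -> bytes:
    """Stand-in for the one-way hash h."""
    return hashlib.sha256(pw.encode("utf-8")).digest()


def enroll(pw: str) -> bytes:
    """The system stores only h(PW), never PW itself."""
    return h(pw)


def login(stored: bytes, typed: str) -> bool:
    """Accept when h(PW') equals the stored h(PW)."""
    return hmac.compare_digest(stored, h(typed))


stored = enroll("correct horse battery staple")
```

Calling `login(stored, "correct horse battery staple")` accepts, while any other typed password is rejected unless it happens to collide under h.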

1033
00:55:19,570 --> 00:55:24,370
What about-- what about
collision resistance?

1034
00:55:24,370 --> 00:55:28,340
Or target collision resistance?

1035
00:55:28,340 --> 00:55:31,350
Think practitioner now, right?

1036
00:55:31,350 --> 00:55:33,590
Are we interested in
this hash function

1037
00:55:33,590 --> 00:55:34,880
being collision resistant?

1038
00:55:34,880 --> 00:55:37,150
What does that
mean in this case?

1039
00:55:37,150 --> 00:55:40,315
Give me the context in this
particular application?

1040
00:55:40,315 --> 00:55:40,940
Yeah, go ahead.

1041
00:55:40,940 --> 00:55:44,860
AUDIENCE: It means that someone
entering a different password

1042
00:55:44,860 --> 00:55:47,107
will have the same
hash [INAUDIBLE].

1043
00:55:47,107 --> 00:55:48,190
SRINIVAS DEVADAS: Exactly.

1044
00:55:48,190 --> 00:55:56,600
So it means that what you have
is a situation where you do not

1045
00:55:56,600 --> 00:56:00,900
reveal-- and so what might
happen is that h of PW prime

1046
00:56:00,900 --> 00:56:02,460
equals h of PW.

1047
00:56:02,460 --> 00:56:07,190
But h of PW equals
h of PW prime.

1048
00:56:07,190 --> 00:56:11,490
But PW is not equal to PW prime.

1049
00:56:11,490 --> 00:56:13,950
What you have is
a false positive.

1050
00:56:13,950 --> 00:56:15,570
Someone who didn't
know your password

1051
00:56:15,570 --> 00:56:19,060
but guessed right-- and
this is a 128-bit value,

1052
00:56:19,060 --> 00:56:22,840
and they guessed right--
is going to get it.

1053
00:56:22,840 --> 00:56:24,940
You don't particularly
care about the probability

1054
00:56:24,940 --> 00:56:26,190
of this occurrence.

1055
00:56:26,190 --> 00:56:27,900
It's really small.

1056
00:56:27,900 --> 00:56:30,570
Typically you're going to
have systems that lock you out

1057
00:56:30,570 --> 00:56:34,770
if you try, say, 10 times
with wrong passwords,

1058
00:56:34,770 --> 00:56:35,270
right?

1059
00:56:35,270 --> 00:56:37,965
So really in systems
you do not require--

1060
00:56:37,965 --> 00:56:39,340
you do want to
build systems that

1061
00:56:39,340 --> 00:56:42,090
have minimal
properties with respect

1062
00:56:42,090 --> 00:56:43,570
to the primitives that are used.

1063
00:56:43,570 --> 00:56:47,090
So from a system building
standpoint just require OW.

1064
00:56:47,090 --> 00:56:48,350
Don't go overboard.

1065
00:56:48,350 --> 00:56:53,100
Don't require collision
resistance or TCR, OK?

1066
00:56:53,100 --> 00:56:55,420
Let's do a slightly
different example.

1067
00:56:55,420 --> 00:56:59,010
Also a bit of a
warm-up for what's

1068
00:56:59,010 --> 00:57:01,895
coming next, which is a
file modification detector.

1069
00:57:22,080 --> 00:57:32,800
So for each file F, I'm going to
store h of F, and store it securely.

1070
00:57:32,800 --> 00:57:36,980
So you assume that this means
that h of F cannot be modified

1071
00:57:36,980 --> 00:57:40,380
by anybody, h of F itself.

1072
00:57:47,860 --> 00:57:56,030
And now we want to
check if F is modified

1073
00:57:56,030 --> 00:58:04,470
by re-computing h of
F. Which could be,

1074
00:58:04,470 --> 00:58:05,640
this could be modified.

1075
00:58:05,640 --> 00:58:07,130
So this could
actually be F prime.

1076
00:58:07,130 --> 00:58:09,250
You don't know that.

1077
00:58:09,250 --> 00:58:10,500
You have a file.

1078
00:58:10,500 --> 00:58:11,780
It's a gigabyte.

1079
00:58:11,780 --> 00:58:14,270
And somebody might
have tampered with one

1080
00:58:14,270 --> 00:58:16,030
of the bits in the file.

1081
00:58:16,030 --> 00:58:19,340
All you have is a
d-bit digest that

1082
00:58:19,340 --> 00:58:23,670
corresponds to h of F that you
stored in a secure location.

1083
00:58:23,670 --> 00:58:27,190
And you want to check
to see, by re-computing

1084
00:58:27,190 --> 00:58:31,940
h of F, the file
that is given to you,

1085
00:58:31,940 --> 00:58:34,135
and comparing it with what
you've stored, the h of F

1086
00:58:34,135 --> 00:58:35,730
that you've stored.
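
A small sketch of the modification check. Assumptions: SHA-256 plays the role of h, the file contents are held in memory as bytes, and the trusted digest is kept somewhere the adversary cannot write.

```python
import hashlib


def digest(data: bytes) -> bytes:
    """The d-bit digest h(F) of the file contents."""
    return hashlib.sha256(data).digest()


def is_modified(current: bytes, trusted_digest: bytes) -> bool:
    """Re-compute h on the file as it is now and compare with the stored value."""
    return digest(current) != trusted_digest


original = b"quarterly report, final version\n" * 1000
trusted = digest(original)          # stored securely ahead of time

tampered = bytearray(original)
tampered[500] ^= 0x01               # flip a single bit somewhere in the file
tampered = bytes(tampered)
```

Here `is_modified(original, trusted)` is false and `is_modified(tampered, trusted)` is true. The check can only be fooled by a different file that hashes to the same stored digest.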

1087
00:58:35,730 --> 00:58:42,200
And so what property do we
need in order to pull this off?

1088
00:58:42,200 --> 00:58:44,590
Of hash functions.

1089
00:58:44,590 --> 00:58:48,070
What precisely do we
need to pull this off?

1090
00:58:50,620 --> 00:58:53,040
What is the adversary
trying to do?

1091
00:58:53,040 --> 00:58:55,530
And what is a successful break?

1092
00:58:55,530 --> 00:59:02,000
A successful break is if an
adversary can modify the file

1093
00:59:02,000 --> 00:59:08,720
and keep h of F the same, right?

1094
00:59:08,720 --> 00:59:10,780
That would be a
successful break, right?

1095
00:59:10,780 --> 00:59:13,600
Yup.

1096
00:59:13,600 --> 00:59:14,125
Go ahead.

1097
00:59:14,125 --> 00:59:14,910
AUDIENCE: TCR.

1098
00:59:14,910 --> 00:59:15,550
SRINIVAS DEVADAS: TCR?

1099
00:59:15,550 --> 00:59:16,300
Yeah, absolutely.

1100
00:59:16,300 --> 00:59:16,841
You need TCR.

1101
00:59:19,350 --> 00:59:21,750
So you want to modify the file.

1102
00:59:34,830 --> 00:59:38,230
So you're given that
the file-- the adversary

1103
00:59:38,230 --> 00:59:41,980
is given the file, which
is the input to the hash,

1104
00:59:41,980 --> 00:59:47,550
and is going to try and
modify-- modify the file, right?

1105
00:59:47,550 --> 00:59:51,130
So let's do a couple more.

1106
00:59:51,130 --> 00:59:57,470
And we're going to advance our
requirements here a little bit.

1107
00:59:57,470 --> 01:00:00,891
So those two are
basic properties.

1108
01:00:00,891 --> 01:00:02,140
I want to leave this up there.

1109
01:00:04,937 --> 01:00:06,770
We're going to do
something that corresponds

1110
01:00:06,770 --> 01:00:08,690
to digital signatures.

1111
01:00:08,690 --> 01:00:13,030
So digital signatures are
this wonderful invention

1112
01:00:13,030 --> 01:00:18,290
that came out of MIT in a
computer science laboratory--

1113
01:00:18,290 --> 01:00:23,160
again, Ron Rivest and
collaborators-- which

1114
01:00:23,160 --> 01:00:28,120
are a way of digitally
signing a document using

1115
01:00:28,120 --> 01:00:31,170
a secret key, a private key.

1116
01:00:31,170 --> 01:00:35,660
But anybody who has
access to a public key,

1117
01:00:35,660 --> 01:00:37,210
so it could be
pretty much anybody,

1118
01:00:37,210 --> 01:00:41,647
could verify the authenticity
of that signature, right?

1119
01:00:41,647 --> 01:00:43,230
So that's what a
digital signature is.

1120
01:00:52,490 --> 01:00:55,960
So we're going to talk
about public cryptography

1121
01:00:55,960 --> 01:01:00,730
on Thursday, in terms
of how you could build

1122
01:01:00,730 --> 01:01:06,640
systems or encryption algorithms
that are public key algorithms.

1123
01:01:06,640 --> 01:01:12,470
But here I'll just tell you
what we want out of them.

1124
01:01:12,470 --> 01:01:15,100
Essentially what we have here
in the case of signatures,

1125
01:01:15,100 --> 01:01:18,100
we actually want to talk
about encryption here,

1126
01:01:18,100 --> 01:01:20,180
are-- there's two
keys associated

1127
01:01:20,180 --> 01:01:24,030
with a public key system.

1128
01:01:24,030 --> 01:01:26,880
Anybody and everybody
in the system

1129
01:01:26,880 --> 01:01:31,090
would have a public key that
you can put on your website.

1130
01:01:31,090 --> 01:01:34,500
And you also have a secret key--
that's like your password--

1131
01:01:34,500 --> 01:01:35,930
that you don't
want to write down,

1132
01:01:35,930 --> 01:01:38,221
you don't want to give away,
because that's effectively

1133
01:01:38,221 --> 01:01:39,930
your identity.

1134
01:01:39,930 --> 01:01:44,700
And what digital
signatures correspond to

1135
01:01:44,700 --> 01:01:46,880
are that you have
two operations.

1136
01:01:46,880 --> 01:01:51,030
You have signing
and verification.

1137
01:01:51,030 --> 01:01:56,760
So signing means that you
create a signature sigma that

1138
01:01:56,760 --> 01:02:06,420
is the sign using your
private key, your secret key,

1139
01:02:06,420 --> 01:02:10,070
of a message M. So you're
saying this is this message,

1140
01:02:10,070 --> 01:02:12,060
it came from me, right?

1141
01:02:12,060 --> 01:02:13,655
That's what signing means.

1142
01:02:13,655 --> 01:02:16,030
You have this long message
and you sign it at the bottom.

1143
01:02:16,030 --> 01:02:20,620
You're taking responsibility for
the contents of that message.

1144
01:02:20,620 --> 01:02:27,710
And then verification is you
have M, sigma, and a public key.

1145
01:02:27,710 --> 01:02:31,770
And this is simply going
to output true or false.

1146
01:02:35,780 --> 01:02:42,260
And so the public key should
not reveal any information

1147
01:02:42,260 --> 01:02:43,260
about the secret key.

1148
01:02:48,570 --> 01:02:51,700
And that's the challenge
of building PKI systems,

1149
01:02:51,700 --> 01:02:56,800
that we'll talk about in
some detail next time.

1150
01:02:56,800 --> 01:03:01,440
But we don't need to
think about that other

1151
01:03:01,440 --> 01:03:06,100
than acknowledging it today.

1152
01:03:06,100 --> 01:03:09,680
So the public and private
key are two distinct things,

1153
01:03:09,680 --> 01:03:12,150
neither one of which reveals
anything about the other.

1154
01:03:12,150 --> 01:03:14,430
Think of them as completely
distinct passwords.

1155
01:03:14,430 --> 01:03:16,730
But they happen to be
mathematically related.

1156
01:03:16,730 --> 01:03:18,500
That's why this
whole thing works.

1157
01:03:18,500 --> 01:03:20,260
And that mathematical
relationship

1158
01:03:20,260 --> 01:03:24,750
we'll look at in some
detail on Thursday.

1159
01:03:24,750 --> 01:03:26,920
But having said
that, take a look

1160
01:03:26,920 --> 01:03:29,490
at what this app is
doing for us, right?

1161
01:03:29,490 --> 01:03:31,370
This is a security application.

1162
01:03:31,370 --> 01:03:33,930
And I haven't quite gotten
to hash functions yet.

1163
01:03:33,930 --> 01:03:36,600
But I'll get to it
in just a minute.

1164
01:03:36,600 --> 01:03:39,330
But what I want to do is
emphasize that there's

1165
01:03:39,330 --> 01:03:41,150
two operations going on.

1166
01:03:41,150 --> 01:03:42,760
One of which is a
signature, which

1167
01:03:42,760 --> 01:03:46,050
is a private signature, in the
sense that it's private to me,

1168
01:03:46,050 --> 01:03:47,160
if I'm Alice.

1169
01:03:47,160 --> 01:03:48,500
Or private to Alice.

1170
01:03:48,500 --> 01:03:50,590
And you're using
secret information

1171
01:03:50,590 --> 01:03:52,810
on this public message,
M, because that's

1172
01:03:52,810 --> 01:03:54,690
going to be publicized.

1173
01:03:54,690 --> 01:03:57,580
And you're going to
sign the public message.

1174
01:03:57,580 --> 01:04:01,160
And then anybody in the
world who has access

1175
01:04:01,160 --> 01:04:04,190
to Alice's public key is
going to be able to say,

1176
01:04:04,190 --> 01:04:06,840
oh I'm looking at the signature,
which is a bunch of bits.

1177
01:04:06,840 --> 01:04:09,900
I'm looking at the message,
which is a whole lot of bits.

1178
01:04:09,900 --> 01:04:12,590
And I have this public key,
which is a bunch of bits.

1179
01:04:12,590 --> 01:04:16,150
And I'm going to be
able to tell for sure

1180
01:04:16,150 --> 01:04:19,340
that either Alice
signed this message,

1181
01:04:19,340 --> 01:04:22,560
or Alice did not
sign this message.

1182
01:04:22,560 --> 01:04:26,710
And the assumption
here is that Alice

1183
01:04:26,710 --> 01:04:28,950
kept her private key secret.

1184
01:04:28,950 --> 01:04:30,970
And of course, what
I just wrote there,

1185
01:04:30,970 --> 01:04:33,450
that the public key
does not reveal anything

1186
01:04:33,450 --> 01:04:35,530
about the secret key, OK?

1187
01:04:35,530 --> 01:04:38,350
So that's digital signatures
for you, in a nutshell.

1188
01:04:38,350 --> 01:04:40,990
And when you do MIT
certificates you're

1189
01:04:40,990 --> 01:04:45,130
using digital signatures a la
Rivest-Shamir-Adleman, the RSA

1190
01:04:45,130 --> 01:04:45,900
algorithm.

1191
01:04:45,900 --> 01:04:48,580
So you're using
this all the time,

1192
01:04:48,580 --> 01:04:52,290
when you click on 6.046
links, for example.

1193
01:04:52,290 --> 01:04:56,440
And what happens is M is
typically really large.

1194
01:04:56,440 --> 01:04:58,060
I mean it could
be a file, right?

1195
01:04:58,060 --> 01:04:59,510
It could be a large file.

1196
01:04:59,510 --> 01:05:02,730
And you don't necessarily want
to compute these operations

1197
01:05:02,730 --> 01:05:04,150
on large files.

1198
01:05:04,150 --> 01:05:09,580
So for convenience, what happens
is you end up hashing the file.

1199
01:05:09,580 --> 01:05:22,550
And for large M it's
easier to sign h of M.

1200
01:05:22,550 --> 01:05:29,810
And so replace the M's that
you see here with h of M,

1201
01:05:29,810 --> 01:05:30,720
all right?
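
A toy sketch of signing h of M instead of M. Everything concrete here is an assumption for illustration: textbook RSA with no padding, two Mersenne primes as the secret factors (convenient to write down, unsuitable for real keys), the exponent 65537, and SHA-256 as h. It requires Python 3.8 or later for the modular inverse.

```python
import hashlib

# Toy key material: known Mersenne primes, chosen only so the sketch is short.
P = 2**521 - 1
Q = 2**607 - 1
N = P * Q                               # public modulus
E = 65537                               # public exponent
D = pow(E, -1, (P - 1) * (Q - 1))       # secret exponent


def h_int(message: bytes) -> int:
    """h(M) interpreted as an integer; 256 bits, so always smaller than N."""
    return int.from_bytes(hashlib.sha256(message).digest(), "big")


def sign(message: bytes) -> int:
    """sigma = sign(secret key, h(M)): only the short digest is exponentiated."""
    return pow(h_int(message), D, N)


def verify(message: bytes, sigma: int) -> bool:
    """True exactly when sigma is a valid signature on h(M) under (N, E)."""
    return pow(sigma, E, N) == h_int(message)


M = b"a very large file " * 10_000
sigma = sign(M)
```

The cost of `sign` does not grow with the size of M beyond the time to hash it. The price is that the signature now vouches for any message with the same digest.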

1202
01:05:30,720 --> 01:05:38,640
So now that we're given that
we're going to be doing h of M

1203
01:05:38,640 --> 01:05:42,550
in here, think
about what we wanted

1204
01:05:42,550 --> 01:05:45,390
to accomplish with M, right?

1205
01:05:45,390 --> 01:05:48,150
I told you what we wanted
to accomplish with M.

1206
01:05:48,150 --> 01:05:49,360
There's a particular message.

1207
01:05:49,360 --> 01:05:50,190
I'm Alice.

1208
01:05:50,190 --> 01:05:53,850
I'm going to keep my
secret key secret.

1209
01:05:53,850 --> 01:05:57,910
But I want to commit to signing
this message M, all right?

1210
01:05:57,910 --> 01:06:00,330
And I want to make
sure that nobody

1211
01:06:00,330 --> 01:06:05,320
can pretend to be me who
doesn't know my secret key.

1212
01:06:05,320 --> 01:06:07,290
And nobody does.

1213
01:06:07,290 --> 01:06:10,760
So if I'm going to be signing
the hash of the message,

1214
01:06:10,760 --> 01:06:13,930
now it comes down
to today's lecture.

1215
01:06:13,930 --> 01:06:16,680
I'm signing the hash
of the message h of M.

1216
01:06:16,680 --> 01:06:22,120
What property do I require of
h in order for this whole thing

1217
01:06:22,120 --> 01:06:23,640
to work out?

1218
01:06:23,640 --> 01:06:24,636
Yeah, go ahead.

1219
01:06:24,636 --> 01:06:26,540
AUDIENCE: Is it
non-malleability?

1220
01:06:26,540 --> 01:06:28,665
SRINIVAS DEVADAS: Non
malleability, but even before

1221
01:06:28,665 --> 01:06:31,770
that-- suppose-- absolutely,
but non-malleability

1222
01:06:31,770 --> 01:06:36,590
is kind of beyond one of these
properties over on the right.

1223
01:06:36,590 --> 01:06:39,570
You're on the
right track, right?

1224
01:06:39,570 --> 01:06:45,219
So do you want to give
me a different answer?

1225
01:06:45,219 --> 01:06:46,677
You can give me a
different answer.

1226
01:06:46,677 --> 01:06:50,090
AUDIENCE: Oh, I'm not sure.

1227
01:06:50,090 --> 01:06:52,190
SRINIVAS DEVADAS: OK.

1228
01:06:52,190 --> 01:06:52,690
What?

1229
01:06:52,690 --> 01:06:53,898
Yeah, back there.

1230
01:06:53,898 --> 01:06:56,766
AUDIENCE: I think you want it to be
one-way because otherwise you

1231
01:06:56,766 --> 01:07:00,112
could take that signature and
find another message that you

1232
01:07:00,112 --> 01:07:01,080
could credit.

1233
01:07:01,080 --> 01:07:02,740
SRINIVAS DEVADAS: I
can make M public.

1234
01:07:02,740 --> 01:07:05,480
I can make M-- M can be public.

1235
01:07:05,480 --> 01:07:07,060
And h of M is public.

1236
01:07:07,060 --> 01:07:13,570
So one-wayness is not
interesting for this example

1237
01:07:13,570 --> 01:07:14,690
if M is public.

1238
01:07:14,690 --> 01:07:16,690
And we can assume that M
eventually gets public.

1239
01:07:16,690 --> 01:07:18,840
Because that's the message
I'm signing, right?

1240
01:07:18,840 --> 01:07:21,082
I can also put M out.

1241
01:07:21,082 --> 01:07:22,540
So I want the
relationship-- I want

1242
01:07:22,540 --> 01:07:25,760
you to focus on the relationship
between h of M and M

1243
01:07:25,760 --> 01:07:28,720
and tell me what would
break this system.

1244
01:07:28,720 --> 01:07:31,120
And you're on the right track.

1245
01:07:31,120 --> 01:07:31,970
Yeah, go ahead.

1246
01:07:31,970 --> 01:07:32,932
Or way back there.

1247
01:07:32,932 --> 01:07:33,890
Yeah, sorry about that.

1248
01:07:33,890 --> 01:07:35,074
AUDIENCE: TCR.

1249
01:07:35,074 --> 01:07:35,990
SRINIVAS DEVADAS: TCR.

1250
01:07:35,990 --> 01:07:36,780
Why TCR?

1251
01:07:36,780 --> 01:07:37,696
AUDIENCE: [INAUDIBLE].

1252
01:07:46,130 --> 01:07:49,070
SRINIVAS DEVADAS: So I have
M. So what happens here--

1253
01:07:49,070 --> 01:07:51,920
I should write this out.

1254
01:07:51,920 --> 01:08:12,640
I'm given-- as an adversary I
have M and h of M. It is bad

1255
01:08:12,640 --> 01:08:33,010
if Alice signs h of M, but Bob
claims Alice signed M prime.

1256
01:08:33,010 --> 01:08:39,830
Because h of M equals
h of M prime, right?

1257
01:08:39,830 --> 01:08:41,600
That is bad.

1258
01:08:41,600 --> 01:08:44,729
So the M is public--
could you stand up?

1259
01:08:49,229 --> 01:08:50,600
M is given.

1260
01:08:50,600 --> 01:08:53,329
There's a specific
M, and a specific h

1261
01:08:53,329 --> 01:08:56,470
of M in particular,
that has been exposed.

1262
01:08:56,470 --> 01:08:59,620
And h of M is what was
used for the signature.

1263
01:08:59,620 --> 01:09:01,140
So you want to keep
h of M the same.

1264
01:09:01,140 --> 01:09:02,170
It's a specific one.

1265
01:09:02,170 --> 01:09:03,544
So it's not
collision resistance,

1266
01:09:03,544 --> 01:09:05,850
it's target
collision resistance,

1267
01:09:05,850 --> 01:09:07,460
because that's given to you.

1268
01:09:07,460 --> 01:09:09,430
And you want to
keep that the same.

1269
01:09:09,430 --> 01:09:13,600
But you want to claim that oh,
you promised me $10,000, not

1270
01:09:13,600 --> 01:09:15,319
$20, right?

1271
01:09:15,319 --> 01:09:17,899
If you can do that,
you signed saying

1272
01:09:17,899 --> 01:09:22,149
you want to pay $10,000, not
$20, then you've got a problem.

1273
01:09:22,149 --> 01:09:24,160
So your thing is very close.

1274
01:09:24,160 --> 01:09:27,130
It's just that it doesn't need
to be a strong relationship

1275
01:09:27,130 --> 01:09:28,710
between the 10,000 or the 20.

1276
01:09:28,710 --> 01:09:31,000
I mean I gave you a
concrete example of that.

1277
01:09:31,000 --> 01:09:33,720
But it could be more,
it could be less.

1278
01:09:33,720 --> 01:09:36,479
Anything that is different
from what you signed,

1279
01:09:36,479 --> 01:09:38,870
be it with the numerical
relationship or not,

1280
01:09:38,870 --> 01:09:43,080
would cause a problem and
break this scheme, all right?
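
To see why target collision resistance is the property at stake, here is a sketch of the attack against a deliberately weak hash. The 16-bit truncation of SHA-256, the message wording, and the nonce field are all made up for the demonstration.

```python
import hashlib
from itertools import count


def weak_h(message: bytes) -> bytes:
    """A deliberately weak hash: only the first 16 bits of SHA-256."""
    return hashlib.sha256(message).digest()[:2]


def second_preimage(message: bytes) -> bytes:
    """Given a specific M, search for M' != M with weak_h(M') == weak_h(M)."""
    target = weak_h(message)
    for i in count():
        candidate = b"Alice owes Bob $10,000. nonce=%d" % i
        if candidate != message and weak_h(candidate) == target:
            return candidate


signed = b"Alice owes Bob $20."      # the message Alice actually signed
forged = second_preimage(signed)     # expected after about 2**16 tries
```

Any signature on `weak_h(signed)` is equally valid for `forged`. With a full-length hash that is TCR, the same search is infeasible.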

1281
01:09:43,080 --> 01:09:45,260
Are we good?

1282
01:09:45,260 --> 01:09:50,490
All right, one last example,
the most interesting one.

1283
01:09:50,490 --> 01:09:57,250
And as I guessed I'm
probably not going

1284
01:09:57,250 --> 01:10:01,670
to get to saying very much
about how cache functions are

1285
01:10:01,670 --> 01:10:02,250
implemented.

1286
01:10:02,250 --> 01:10:04,041
But maybe I'll spend
a minute or two on it.

1287
01:10:08,770 --> 01:10:12,700
So let's do this example that
has to do with commitments.

1288
01:10:19,260 --> 01:10:20,890
Commitment is important, right?

1289
01:10:20,890 --> 01:10:22,640
You want to commit
to doing things.

1290
01:10:22,640 --> 01:10:24,420
You want to keep your promises.

1291
01:10:24,420 --> 01:10:28,310
And in this case we
have a legal requirement

1292
01:10:28,310 --> 01:10:34,550
that you want to be able to make
people honor their commitments,

1293
01:10:34,550 --> 01:10:37,040
and not weasel their way
out of commitments, right?

1294
01:10:37,040 --> 01:10:39,670
And we want to deal with
this computationally.

1295
01:10:39,670 --> 01:10:42,720
And let's think about auctions.

1296
01:10:42,720 --> 01:10:51,325
So Alice has value x,
e.g. an auction bid.

1297
01:10:54,940 --> 01:11:02,170
Alice computes what
we're going to call

1298
01:11:02,170 --> 01:11:11,500
C of x, which is a commitment
of x, and submits it, right?

1299
01:11:11,500 --> 01:11:26,670
C of x, C of x is-- let's
assume that the auctioneer,

1300
01:11:26,670 --> 01:11:32,470
and perhaps other auctionees
as well, see C of x.

1301
01:11:32,470 --> 01:11:34,770
You have to submit it
to somebody, right?

1302
01:11:34,770 --> 01:11:37,100
So you can assume
that that's exposed.

1303
01:11:37,100 --> 01:11:49,460
And what is going to happen
is, when bidding is over Alice

1304
01:11:49,460 --> 01:11:53,145
is going to open--
so this is-- C

1305
01:11:53,145 --> 01:12:00,069
of x can be thought
of as sealing the bid.

1306
01:12:00,069 --> 01:12:01,110
So that's the commitment.

1307
01:12:01,110 --> 01:12:03,030
You're sealing the--
you're making a bid

1308
01:12:03,030 --> 01:12:04,600
and you're sealing
it in an envelope.

1309
01:12:04,600 --> 01:12:05,650
You've committed to that.

1310
01:12:05,650 --> 01:12:08,110
That's obviously what
happens in real life

1311
01:12:08,110 --> 01:12:09,740
without cryptography,
but we want

1312
01:12:09,740 --> 01:12:12,300
to do this with cryptography,
with hash functions.

1313
01:12:12,300 --> 01:12:19,250
And so now Alice opens
C of x to reveal x.

1314
01:12:19,250 --> 01:12:25,670
So she has to prove that
in fact x was her bid.

1315
01:12:25,670 --> 01:12:28,580
And that it matches
what she sealed.
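
[Editor's sketch] The seal-then-open flow described here can be illustrated roughly as follows. This is a minimal sketch, not the lecture's own code: it assumes C(x) is simply h(x), uses SHA-256 as a stand-in for h, and the names `commit` and `verify_open` are hypothetical.

```python
import hashlib

def commit(bid: int) -> str:
    """Seal a bid: C(x) = h(x), with SHA-256 standing in for h (an assumption)."""
    return hashlib.sha256(str(bid).encode()).hexdigest()

def verify_open(commitment: str, revealed_bid: int) -> bool:
    """Check that the bid revealed at opening time matches the sealed value."""
    return commit(revealed_bid) == commitment

# Alice submits the sealed value; the auctioneer and other bidders may see it.
sealed = commit(10_000)

# When bidding is over, Alice reveals x and anyone can recompute the hash.
honest_opening_ok = verify_open(sealed, 10_000)
dishonest_opening_ok = verify_open(sealed, 20)
print(honest_opening_ok, dishonest_opening_ok)
```

The check only re-hashes the revealed value, so whether Alice is really bound to one bid depends entirely on the properties of h.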

1316
01:12:28,580 --> 01:12:31,930
When you open it up, think
about it conceptually

1317
01:12:31,930 --> 01:12:34,660
from a standpoint of
what happens with paper,

1318
01:12:34,660 --> 01:12:38,620
and then we have to think
about this computationally

1319
01:12:38,620 --> 01:12:41,120
and what this implies, right?

1320
01:12:41,120 --> 01:12:43,245
So again I'll do a
little bit of set up.

1321
01:12:43,245 --> 01:12:45,370
And then we'll start
talking about the properties

1322
01:12:45,370 --> 01:12:48,997
that we want for this
particular application.

1323
01:12:48,997 --> 01:12:50,580
So there are a bunch
of people who are

1324
01:12:50,580 --> 01:12:54,680
doing bidding for this auction.

1325
01:12:54,680 --> 01:12:56,999
I don't-- I want
to be the first--

1326
01:12:56,999 --> 01:12:58,540
I don't want to
spend a lot of money.

1327
01:12:58,540 --> 01:12:59,560
But I want to win.

1328
01:12:59,560 --> 01:13:01,640
All of us are like that, right?

1329
01:13:01,640 --> 01:13:04,350
If I know information
about your bid,

1330
01:13:04,350 --> 01:13:06,490
that is obviously a
tremendous advantage.

1331
01:13:06,490 --> 01:13:09,110
So clearly that
can't happen, right?

1332
01:13:09,110 --> 01:13:13,000
If I know one other person's
bid I just do plus 1 on that.

1333
01:13:13,000 --> 01:13:16,090
If I know everybody else's I
just do plus 1 on the maximum.

1334
01:13:16,090 --> 01:13:19,420
So clearly there's some secrecy
that's required here, correct?

1335
01:13:19,420 --> 01:13:23,000
So C of x is going to
have to do two things.

1336
01:13:23,000 --> 01:13:26,160
It can't reveal x.

1337
01:13:26,160 --> 01:13:28,760
Because then even maybe
the auctioneer is bad.

1338
01:13:28,760 --> 01:13:31,570
Or other people are
looking at this.

1339
01:13:31,570 --> 01:13:34,760
And you can just assume that C
of x is-- the C of x's are all

1340
01:13:34,760 --> 01:13:36,000
public.

1341
01:13:36,000 --> 01:13:39,840
But I also need a
constraint that's

1342
01:13:39,840 --> 01:13:43,530
associated with C of x
that corresponds to making

1343
01:13:43,530 --> 01:13:46,540
sure Alice is honest, correct?

1344
01:13:46,540 --> 01:13:50,940
So I need to make Alice
commit to something, right?

1345
01:13:50,940 --> 01:13:56,000
So what are the different
properties of the hash function

1346
01:13:56,000 --> 01:14:03,350
that if I use h of
x here, that I'd

1347
01:14:03,350 --> 01:14:08,090
want h to satisfy in order
for this whole process

1348
01:14:08,090 --> 01:14:14,700
to work like it's supposed to
work with paper and envelopes?

1349
01:14:14,700 --> 01:14:15,695
Yeah, go ahead.

1350
01:14:15,695 --> 01:14:18,406
AUDIENCE: It has to be
one-way [INAUDIBLE].

1351
01:14:18,406 --> 01:14:20,030
SRINIVAS DEVADAS: It
has to be one-way.

1352
01:14:20,030 --> 01:14:24,210
And explain to me-- so I
want a description of it

1353
01:14:24,210 --> 01:14:26,260
has to be one-way, because why?

1354
01:14:26,260 --> 01:14:27,957
AUDIENCE: Because
you want all the c

1355
01:14:27,957 --> 01:14:29,790
x's to be hidden from
all the other auctionees.

1356
01:14:29,790 --> 01:14:31,200
SRINIVAS DEVADAS: Right.

1357
01:14:31,200 --> 01:14:40,930
C of x should not
reveal x, all right?

1358
01:14:40,930 --> 01:14:41,430
All right.

1359
01:14:41,430 --> 01:14:41,950
That's good.

1360
01:14:41,950 --> 01:14:44,320
Do you have more?

1361
01:14:44,320 --> 01:14:46,765
AUDIENCE: It has to be
collision resistant.

1362
01:14:53,180 --> 01:14:55,560
SRINIVAS DEVADAS: OK.

1363
01:14:55,560 --> 01:14:57,852
I guess.

1364
01:14:57,852 --> 01:15:00,580
A little bit more.

1365
01:15:00,580 --> 01:15:02,560
You're getting there.

1366
01:15:02,560 --> 01:15:05,672
What-- why is it
collision resistant?

1367
01:15:05,672 --> 01:15:08,132
AUDIENCE: Because you want
to make sure that Alice,

1368
01:15:08,132 --> 01:15:12,560
when she makes a bid, that
she commits to that bid.

1369
01:15:12,560 --> 01:15:15,512
If it's not collision resistant
then she could bid $100

1370
01:15:15,512 --> 01:15:16,805
and then find something else.

1371
01:15:16,805 --> 01:15:18,430
SRINIVAS DEVADAS:
That's exactly right.

1372
01:15:18,430 --> 01:15:26,540
So CR, because
Alice should not be

1373
01:15:26,540 --> 01:15:37,760
able to open this in
multiple ways, right?

1374
01:15:37,760 --> 01:15:41,940
And in this case it's
not TCR in the sense

1375
01:15:41,940 --> 01:15:45,350
that Alice controls
what her bids are.

1376
01:15:45,350 --> 01:15:51,440
And so she might find a pair
of bids that collide, correct?

1377
01:15:51,440 --> 01:15:55,840
She might realize that in
this particular hash function,

1378
01:15:55,840 --> 01:16:01,000
you know $10,000 and a billion
dollars collide, right?

1379
01:16:01,000 --> 01:16:04,450
And so she figures
depending on what happens,

1380
01:16:04,450 --> 01:16:07,820
she's a billionaire,
let's assume.

1381
01:16:07,820 --> 01:16:09,320
She's going to open
the right thing.

1382
01:16:09,320 --> 01:16:11,320
She's a billionaire, but
she doesn't necessarily

1383
01:16:11,320 --> 01:16:13,390
want to spend the billion, OK?

1384
01:16:13,390 --> 01:16:15,040
So that's that, right?
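
[Editor's sketch] To see why collision resistance matters when Alice picks her own bids, here is a toy illustration. It assumes a deliberately weak 16-bit hash (the first two bytes of SHA-256), which is not anything from the lecture; `weak_hash` and `find_colliding_bids` are hypothetical names. With only 65,536 possible outputs, a search over enough bids is guaranteed to hit a collision.

```python
import hashlib

def weak_hash(bid: int) -> bytes:
    """Toy 16-bit hash: first 2 bytes of SHA-256. Deliberately NOT collision resistant."""
    return hashlib.sha256(str(bid).encode()).digest()[:2]

def find_colliding_bids(limit: int = 100_000):
    """Try bids 0, 1, 2, ... until two distinct bids share the same toy hash."""
    seen = {}  # maps digest -> first bid that produced it
    for bid in range(limit):
        digest = weak_hash(bid)
        if digest in seen:
            return seen[digest], bid
        seen[digest] = bid
    return None  # unreachable here: limit exceeds the 65,536 possible outputs

low_bid, high_bid = find_colliding_bids()

# Alice could seal weak_hash(low_bid) and later open it as either bid.
print(low_bid, high_bid, weak_hash(low_bid) == weak_hash(high_bid))
```

Because Alice chooses both values, she only needs any colliding pair, which is why the requirement is collision resistance rather than target collision resistance.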

1385
01:16:15,040 --> 01:16:18,360
But I want more.

1386
01:16:18,360 --> 01:16:19,115
Go ahead.

1387
01:16:19,115 --> 01:16:21,590
AUDIENCE: You don't
want it to be malleable.

1388
01:16:21,590 --> 01:16:23,482
Assuming that the
auctioneer is not honest

1389
01:16:23,482 --> 01:16:25,690
because you don't want them to
accept a bribe from someone

1390
01:16:25,690 --> 01:16:27,200
and then change
everyone else's bid

1391
01:16:27,200 --> 01:16:29,485
to square root of
whatever they bid.

1392
01:16:29,485 --> 01:16:31,110
SRINIVAS DEVADAS:
That's exactly right.

1393
01:16:31,110 --> 01:16:34,480
Or plus 1, which is a
great example, right?

1394
01:16:34,480 --> 01:16:37,050
So there you go.

1395
01:16:37,050 --> 01:16:38,000
I ran out of Frisbees.

1396
01:16:38,000 --> 01:16:39,083
You can get one next time.

1397
01:16:42,610 --> 01:16:45,640
So yeah, I don't
need this anymore.

1398
01:16:45,640 --> 01:16:47,020
You're exactly right.

1399
01:16:47,020 --> 01:16:49,790
There's another-- it turns out
it's even more subtle than what

1400
01:16:49,790 --> 01:16:51,070
you just described.

1401
01:16:51,070 --> 01:16:54,470
And I think I might be able
to point that out to you.

1402
01:16:54,470 --> 01:16:59,730
But let me just first
describe this answer, which

1403
01:16:59,730 --> 01:17:02,960
gives us non-malleability.

1404
01:17:02,960 --> 01:17:06,130
So the claim is that you
also want non-malleability

1405
01:17:06,130 --> 01:17:08,000
in your hash function.

1406
01:17:08,000 --> 01:17:14,147
And the simple reason is,
given C of x-- and let's assume

1407
01:17:14,147 --> 01:17:14,980
that this is public.

1408
01:17:14,980 --> 01:17:16,646
It's certainly public
to the auctioneer,

1409
01:17:16,646 --> 01:17:19,530
and it could be public to
the other bidders as well.

1410
01:17:19,530 --> 01:17:23,370
Because the notion of
sealing is that you've

1411
01:17:23,370 --> 01:17:24,372
sealed it using C of x.

1412
01:17:24,372 --> 01:17:26,580
But people can see the
outside of the envelope, which

1413
01:17:26,580 --> 01:17:27,990
is C of x.

1414
01:17:27,990 --> 01:17:29,510
So everyone can see C of x.

1415
01:17:29,510 --> 01:17:32,250
You still want this to work,
even though all other bidders

1416
01:17:32,250 --> 01:17:33,650
can see C of x.

1417
01:17:33,650 --> 01:17:44,990
So given C of x, it should
not be possible to produce

1418
01:17:44,990 --> 01:17:48,110
C of x plus 1.

1419
01:17:48,110 --> 01:17:49,250
You don't know what x is.

1420
01:17:49,250 --> 01:17:54,050
But if you can produce C of
x plus 1, you win, all right?

1421
01:17:54,050 --> 01:17:57,590
And so that's the problem.
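
[Editor's sketch] A toy example of a malleable commitment may make the "plus 1" attack concrete. This is an assumption for illustration only, not a function from the lecture: it uses modular exponentiation, C(x) = G^x mod P, with hypothetical parameters `P` and `G`. Anyone who sees C(x) can compute C(x + 1) by one multiplication, without learning x.

```python
# Toy parameters chosen for illustration only (assumptions, not from the lecture).
P = 2**127 - 1   # a Mersenne prime used as the modulus
G = 3            # base for the exponentiation

def toy_commit(x: int) -> int:
    """Malleable toy commitment: C(x) = G^x mod P."""
    return pow(G, x, P)

def shift_by_one(commitment: int) -> int:
    """Turn C(x) into C(x + 1) using only the public commitment."""
    return (commitment * G) % P

alice_bid = 10_000                    # secret; the attacker never reads this
alice_sealed = toy_commit(alice_bid)  # public

# The attacker derives a commitment to Alice's bid plus 1 from the public value.
attacker_sealed = shift_by_one(alice_sealed)
print(attacker_sealed == toy_commit(alice_bid + 1))
```

A non-malleable hash is one where no such shortcut from C(x) to a commitment on a related value exists.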

1422
01:17:57,590 --> 01:18:04,930
Now it turns out you
now say OK, am I done?

1423
01:18:04,930 --> 01:18:06,930
I want these three properties.

1424
01:18:06,930 --> 01:18:10,350
And I'm done, right?

1425
01:18:10,350 --> 01:18:13,060
There's a little
subtlety here which

1426
01:18:13,060 --> 01:18:15,750
these properties don't capture.

1427
01:18:15,750 --> 01:18:18,290
So that's why there's more here.

1428
01:18:18,290 --> 01:18:21,770
And I don't mean to
titillate, because I'll

1429
01:18:21,770 --> 01:18:24,000
tell you what is missing here.

1430
01:18:24,000 --> 01:18:29,370
But let's say that I have a hash
function that looks like this.

1431
01:18:33,600 --> 01:18:39,970
And this here is non-malleable.

1432
01:18:39,970 --> 01:18:41,690
It is collision resistant.

1433
01:18:41,690 --> 01:18:43,290
And it's one-way, all right?

1434
01:18:43,290 --> 01:18:46,730
So h of x has all these
wonderful properties,

1435
01:18:46,730 --> 01:18:48,710
all right?

1436
01:18:48,710 --> 01:18:52,160
I'm creating an h
prime x that looks

1437
01:18:52,160 --> 01:18:54,660
like this, which is
a concatenation of h

1438
01:18:54,660 --> 01:19:00,210
of x, and giving away the most
significant bit of x, which

1439
01:19:00,210 --> 01:19:01,670
is my bid, right?

1440
01:19:01,670 --> 01:19:03,780
I'm just giving
that away, right?

1441
01:19:03,780 --> 01:19:08,190
The problem here is
that we haven't really

1442
01:19:08,190 --> 01:19:11,660
made our properties
broad enough to solve

1443
01:19:11,660 --> 01:19:14,230
this particular
application to the extent

1444
01:19:14,230 --> 01:19:19,140
that there's contrived cases
where these properties aren't

1445
01:19:19,140 --> 01:19:20,420
enough, OK?

1446
01:19:20,420 --> 01:19:22,180
And the reason is simple.

1447
01:19:22,180 --> 01:19:30,000
h prime x is arguably
NM, CR, and OW.

1448
01:19:30,000 --> 01:19:32,660
And I won't go into
each of those arguments.

1449
01:19:32,660 --> 01:19:36,630
But you can think
about it, right?

1450
01:19:36,630 --> 01:19:40,030
If I'm just giving you one
bit, there's 159 others,

1451
01:19:40,030 --> 01:19:42,140
there's a couple of
hundred others, whatever it

1452
01:19:42,140 --> 01:19:43,860
is that I have in the domain.

1453
01:19:43,860 --> 01:19:46,230
It's not going to be invertible.

1454
01:19:46,230 --> 01:19:49,420
h prime x is not going to
be invertible if h of x

1455
01:19:49,420 --> 01:19:51,080
is not invertible.

1456
01:19:51,080 --> 01:19:57,880
h prime x is not going to be
breakable in terms of collision

1457
01:19:57,880 --> 01:20:00,950
resistance if h of
x is not breakable,

1458
01:20:00,950 --> 01:20:02,450
and so on and so forth.

1459
01:20:02,450 --> 01:20:04,740
But if I had a hash
function like that,

1460
01:20:04,740 --> 01:20:09,340
is it a good hash function
for my commitment application?

1461
01:20:09,340 --> 01:20:10,090
No, obviously not.

1462
01:20:10,090 --> 01:20:12,298
Because if I publicize this
hash function-- remember,

1463
01:20:12,298 --> 01:20:13,890
everything is public
here with respect

1464
01:20:13,890 --> 01:20:18,030
to h and h prime-- you
are giving away the most

1465
01:20:18,030 --> 01:20:21,360
significant bit that
corresponds to your bid

1466
01:20:21,360 --> 01:20:23,350
in this particular
hash function, right?
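
[Editor's sketch] The contrived h prime construction can be written out roughly as below. This is a sketch under stated assumptions: SHA-256 stands in for h, bids are fixed at 32 bits, and the names `h`, `h_prime`, and `leaked_msb` are hypothetical.

```python
import hashlib

BID_BITS = 32  # assumed fixed bid width, for illustration

def h(x: int) -> bytes:
    """Stand-in for a hash assumed to be one-way, collision resistant, non-malleable."""
    return hashlib.sha256(x.to_bytes(BID_BITS // 8, "big")).digest()

def h_prime(x: int) -> bytes:
    """h'(x) = h(x) concatenated with the most significant bit of x."""
    msb = (x >> (BID_BITS - 1)) & 1
    return h(x) + bytes([msb])

def leaked_msb(commitment: bytes) -> int:
    """What any observer can read off a public h'(x): the top bit of the bid."""
    return commitment[-1]

small_bid_sealed = h_prime(20)
large_bid_sealed = h_prime(3_000_000_000)  # at least 2**31, so the top bit is 1

print(leaked_msb(small_bid_sealed), leaked_msb(large_bid_sealed))
```

The observer still cannot recover the full bid, but learns whether it is at least 2^31, which is already an advantage in an auction.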

1467
01:20:23,350 --> 01:20:33,170
So you really need a little bit
more than these for secrecy,

1468
01:20:33,170 --> 01:20:34,200
for true secrecy.

1469
01:20:37,510 --> 01:20:39,890
But in the context
of this example,

1470
01:20:39,890 --> 01:20:41,770
I mean it's common
sense that you would not

1471
01:20:41,770 --> 01:20:43,550
use the hash function
like that, right?

1472
01:20:43,550 --> 01:20:46,950
So it's not that there's
anything profound here.

1473
01:20:46,950 --> 01:20:48,540
It's just that I
want to make sure

1474
01:20:48,540 --> 01:20:51,480
that you understand the
nuances of the properties

1475
01:20:51,480 --> 01:20:52,580
that we're requiring.

1476
01:20:52,580 --> 01:20:55,560
We had all the
requirements corresponding

1477
01:20:55,560 --> 01:20:58,900
to the definitions
of NM and CR and OW.

1478
01:20:58,900 --> 01:21:01,150
And you need a little bit
more for this example, where

1479
01:21:01,150 --> 01:21:04,300
you have to say something,
perhaps informally,

1480
01:21:04,300 --> 01:21:10,870
like the bits of your auction
are scrambled in the final hash

1481
01:21:10,870 --> 01:21:14,010
output, which most hash
functions should do anyway,

1482
01:21:14,010 --> 01:21:15,730
and h of x will definitely do.

1483
01:21:15,730 --> 01:21:19,290
But you kind of unscrambled
it by adding this little thing

1484
01:21:19,290 --> 01:21:22,210
in here, corresponding to
the most significant bit,

1485
01:21:22,210 --> 01:21:23,050
all right?

1486
01:21:23,050 --> 01:21:25,480
So I'll stop with that.

1487
01:21:25,480 --> 01:21:29,760
Let me just say that the
operation-- or sorry,

1488
01:21:29,760 --> 01:21:33,590
the work involved in
creating hash functions that

1489
01:21:33,590 --> 01:21:37,430
are poly-time computable
is research work.

1490
01:21:37,430 --> 01:21:40,290
People put up hash functions
and they get broken,

1491
01:21:40,290 --> 01:21:43,770
like MD4 was put up in '92 and
then got broken, SHA-1 and so

1492
01:21:43,770 --> 01:21:44,700
on and so forth.

1493
01:21:44,700 --> 01:21:49,580
And so I just encourage you
to look up SHA-3 and just take

1494
01:21:49,580 --> 01:21:52,480
a quick scan of what
the complexity of SHA-3

1495
01:21:52,480 --> 01:21:56,820
is with respect to computing the
hash given an arbitrary string,

1496
01:21:56,820 --> 01:21:57,590
all right?
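
[Editor's sketch] For anyone following up on the SHA-3 suggestion, a minimal way to compute a SHA-3 hash of an arbitrary string is shown below. It assumes Python 3.6 or later, where the standard `hashlib` module provides `sha3_256`; the wrapper name `sha3_hex` is hypothetical.

```python
import hashlib

def sha3_hex(message: str) -> str:
    """Return the SHA3-256 digest of a string as hexadecimal text."""
    return hashlib.sha3_256(message.encode("utf-8")).hexdigest()

digest = sha3_hex("an arbitrary string")

# A 256-bit digest is 32 bytes, which is 64 hexadecimal characters.
print(len(digest))
```

The output length is fixed regardless of how long the input string is.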

1497
01:21:57,590 --> 01:21:59,575
I'll stick around for questions.