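Git uses SHA1 as its message digest, both to name data and to identify objects.
Now that Google and CWI have announced a practical collision attack on SHA1, does this affect Git's integrity and security?
Linus says: yes, Git will want to migrate to another hash function. But is it, as some people would like to put it, "Ha! Git uses SHA1, so it is surely doomed"? Not necessarily. Git does not simply run SHA1 over the raw data; it also prepends a type/length field, and further sanity checks can be added in the future. Both make collision attacks against Git harder.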
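To make the type/length field concrete, here is a minimal Python sketch of how a Git object ID is derived. The function name `git_object_id` is made up for this illustration; the header layout (`<type> <size>`, a NUL byte, then the content) is the one `git hash-object` uses for loose objects.

```python
import hashlib


def git_object_id(obj_type: str, data: bytes) -> str:
    """Return the SHA-1 ID Git would assign to `data` stored as `obj_type`.

    `git_object_id` is a hypothetical helper name used only for this sketch.
    """
    # Git does not hash the content alone: it hashes "<type> <size>\0"
    # followed by the raw bytes, so the length is part of the hashed input.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + data).hexdigest()


if __name__ == "__main__":
    content = b"hello\n"
    # Same value as `echo hello | git hash-object --stdin`.
    print(git_object_id("blob", content))
    # Plain SHA-1 of the same bytes, for comparison; the two differ.
    print(hashlib.sha1(content).hexdigest())
```

Because the size is hashed together with the content, two colliding payloads must either have the same length or the attacker must also control the header, which is the point made in the reply below.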
Subject: Re: SHA1 collisions found
From: Linus Torvalds <torvalds () linux-foundation ! org>
Date: 2017-02-23 17:19:06
Message-ID: CA+55aFxJGDpJXqpcoPnwvzcn_fB-zaggj=w7P2At-TOt4buOqw () mail ! gmail ! com
On Thu, Feb 23, 2017 at 8:43 AM, Joey Hess <email@example.com> wrote:
>
> IIRC someone has been working on parameterizing git’s SHA1 assumptions
> so a repository could eventually use a more secure hash. How far has
> that gotten? There are still many “40” constants in git.git HEAD.

I don’t think you’d necessarily want to change the size of the hash.
You can use a different hash and just use the same 160 bits from it.

> Since we now have collisions in valid PDF files, collisions in valid git
> commit and tree objects are probably able to be constructed.

I haven’t seen the attack yet, but git doesn’t actually just hash the
data, it does prepend a type/length field to it. That usually tends to
make collision attacks much harder, because you either have to make
the resulting size the same too, or you have to be able to also edit
the size field in the header.

pdf’s don’t have that issue, they have a fixed header and you can
fairly arbitrarily add silent data to the middle that just doesn’t get
shown.

So pdf’s make for a much better attack vector, exactly because they
are a fairly opaque data format. Git has opaque data in some places
(we hide things in commit objects intentionally, for example, but by
definition that opaque data is fairly secondary.

Put another way: I doubt the sky is falling for git as a source
control management tool. Do we want to migrate to another hash? Yes.
Is it “game over” for SHA1 like people want to say? Probably not.

I haven’t seen the attack details, but I bet

(a) the fact that we have a separate size encoding makes it much
harder to do on git objects in the first place

(b) we can probably easily add some extra sanity checks to the opaque
data we do have, to make it much harder to do the hiding of random
data that these attacks pretty much always depend on.
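The remark in the reply about using a different hash while keeping the same 160 bits can be sketched in a few lines of Python. This is only an illustration of the idea: the choice of SHA-256 and the name `truncated_object_id` are assumptions made for this example, not something the email specifies and not a description of what Git itself does.

```python
import hashlib


def truncated_object_id(obj_type: str, data: bytes, bits: int = 160) -> str:
    """Hash a Git-style object with SHA-256 and keep only the first `bits` bits.

    Hypothetical helper: SHA-256 is an assumed stand-in for "a different hash".
    """
    if bits % 8 != 0 or not 0 < bits <= 256:
        raise ValueError("bits must be a multiple of 8 between 8 and 256")
    # Same "<type> <size>\0" header as before, so the size stays part of the input.
    header = f"{obj_type} {len(data)}".encode("ascii") + b"\x00"
    digest = hashlib.sha256(header + data).digest()
    # Keeping the first 20 bytes gives a 40-character hex ID, the same width as SHA-1.
    return digest[: bits // 8].hex()


if __name__ == "__main__":
    print(truncated_object_id("blob", b"hello\n"))
```

The resulting ID is still 40 hex characters long, so code that hard-codes the “40” constants mentioned in the quoted question would keep seeing IDs of the width it expects.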