Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

Well, to be fair, 14 Gigayears in terms of an attack geared towards a totally different case. I wonder if we can optimize an attack on sentence passwords using hidden markov modeling, or other natural language processing methods. The entropy at the character level decreases if we assume the password is an english sentence [1], and my gut feeling is that we can apply this same thinking at the word level as well (this may very well have been demonstrated already in research).

[1] http://www.nd.edu/~busiforc/handouts/cryptography/cryptograp...



I thought about that, but presumably since so very few people use sentence based passwords, crackers don't optimize for this use case yet. Regardless it's quite a long sentence and that should afford some additional protection against this kind of attack, i can imagine 3 or 4 word sentences being defeated by this kind of technique, 30 words seems a lot more difficult


This is purely a thought experiment at this point, but it would actually be a pretty interesting project. I wonder if it's true that longer sentences would be more secure -- or perhaps entropy decreases with larger inputs that give more data to analyze. In the cipher world this would be true, but for O(1) hashes, I'm curious how this could be tested.

Edit Ultimately security from these types of attack would be with unusual/non-sensical adject/noun, adverb/verb pairs etc -- which your example does reasonably well. Picking a bible verse (for example) would be bad, though.


The problem is that you have no hints about what is correct or not because presumably a one way hash is being used, so even if you get the first 29 words right, you don't know that.

Assuming the cracker just uses a wordlist containing 200,000 terms and unsophisticated brute force, this thing could crack a 3 word password in a day and a 4 word password in 800 years[1]. It would certainly be an interesting project, but I honestly think this is a safe approach for now.

[1] http://www.wolframalpha.com/input/?i=%28200000**4+%2F+63+bil...


I know we've pretty much made our points by now, but my argument is that brute forcing at this level might be more feasible than what you suggest here (if we assume the sentence is correctly using a particular language).

Say we choose one word from the 200,000 and it's a noun. Then we can make assumptions about the next word (eg it's likely a verb) and immediately the number of options for the next word collapses from 200,000 to a subset of some smaller cardinality. In fact we can use our knowledge of common English to restrict the next options down further -- to only verbs that make sense to this particular noun and that agree with the noun's plurality.

So unlike current password brute forcing, where every character is independent of the others -- thus having exponential complexity -- brute forcing a sentence using current NLP methods could be much less expensive. Perhaps a hierarchical method exists that would scale at O(n log n).

Anyway, this is just fun thinking. You're right that current password strategies are a long way away from making this type of cracking worthwhile.


I agree that you could potentially reduce the search space based on rules, but English especially is not well known for it's adherence to rules, see http://en.wikipedia.org/wiki/Buffalo_buffalo_Buffalo_buffalo...

Conversations like this are why I come to hacker news.




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: