A few secure, random bytes without `pgcrypto`

(brandur.org)

55 points | by surprisetalk 5 days ago

40 comments

  • mhio 5 days ago

    gen_random_uuid() produces a v4 UUID.

    Taking the first 5 bytes of a v6 UUID (time) and last 5 (node) would be a bad random day.

    • hinkley a day ago

      I read this exact reply on this exact article two days ago.

      What is happening right now? Why is your comment marked four hours ago?

    • manwe150 5 days ago

      Wait, is this blog actually about how to introduce a backdoor into your Postgres install by rolling your own very bad rng?

      • thadt 5 days ago

        Nah, mhio is saying that the blog post has a typo:

        > Postgres 13’s gen_random_uuid() which generates a V6 UUID that’s secure...

        gen_random_uuid gives you a V4 UUID, not a V6 UUID (it's even in the code comments in the snippet included in the blog). I don't believe Postgres even has a function to generate a V6 UUID, which would indeed be a bad idea to use as a source of randomness.

      • fanf2 5 days ago

        No, a v4 uuid comes from a good RNG. The blog post just said v6 by mistake when it meant v4.

        • hinkley 2 days ago

          V6 is just a v4 rearranged to behave more like v7 for the purposes of b-tree insertion.

          • dwattttt a day ago

            I believe V6 is a reordering of V1, not V4. V4 is random aside from the bits specifying version & variant, ~6/7 bits.
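
A minimal Python sketch of the point above, assuming the RFC 4122 layout as implemented by the standard `uuid` module: a v4 UUID fixes only the 4 version bits and the 2 variant bits, so its first 5 and last 5 bytes are entirely random. The helper names are hypothetical, and this is not the blog post's SQL.

```python
import uuid

def fixed_bits(u: uuid.UUID) -> tuple[int, int]:
    """Return (version nibble, top two variant bits) of a UUID."""
    raw = u.bytes
    return raw[6] >> 4, raw[8] >> 6

def first_and_last_five(u: uuid.UUID) -> bytes:
    """Take bytes 0-4 and 11-15, which avoid the version and variant fields."""
    raw = u.bytes
    return raw[:5] + raw[11:]

u = uuid.uuid4()
print(fixed_bits(u))                 # version 4, variant 0b10
print(first_and_last_five(u).hex())  # 10 bytes with no fixed bits
```

The same slicing applied to a time-based UUID (v1 or v6) would pick up timestamp and node fields instead, which is the "bad random day" mentioned at the top of the thread.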

    • hinkley a day ago

      I read this exact reply on this exact article two days ago.

      What is happening right now?

      • tom_ a day ago

        Take the blue pill, press the back button, and pretend you never saw it.

        Or take the red pill, pull back the curtains of reality, and see the machinery behind: https://news.ycombinator.com/item?id=41197775

        • hinkley a day ago

          And I got two replies recorded after making an edit to expand on my question.

          Glitch in the matrix.

          • tough a day ago

            Man reading this whole thread glitched my brain a bit

            • hinkley 21 hours ago

              There’s a few decades after you stop worrying you’re crazy and before you start worrying you’re senile. Leaving you a lot more energy for other things. Enjoy them.

      • frutiger a day ago

        My HN client uses the HN API which reveals the true post time of the comment. See https://seville.protostome.com/item/?id=41641314.

  • literalAardvark 13 hours ago

    I'm really confused by this post. Wouldn't it be simpler to read a few bytes from /dev/random?

    Sure, it wouldn't be portable to windows but that's more of a feature than a bug.

    • freeqaz 11 hours ago

      How do you read from a file like that in SQL? I know that this is in "theory" possible but I've never had a legitimate use case where I've needed to do file I/O from my ORM, lol.

      This is the ChatGPT answer that I was able to derive:

      > You can read from `/dev/urandom` in a PostgreSQL query using `plperlu`, which allows executing unsafe Perl code.
      >
      > Create a function to read random bytes:

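        CREATE EXTENSION IF NOT EXISTS plperlu;

        CREATE OR REPLACE FUNCTION get_random_bytes(num_bytes int)
        RETURNS bytea
        LANGUAGE plperlu
        AS $$
        my $num_bytes = $_[0];
        # Open in raw mode so Perl applies no encoding layer to the bytes.
        open my $urandom, '<:raw', '/dev/urandom' or die "Cannot open /dev/urandom: $!";
        my $got = read $urandom, my $bytes, $num_bytes;
        close $urandom;
        die "Short read from /dev/urandom" unless defined $got && $got == $num_bytes;
        # PL/Perl hands bytea back in escaped text form, so encode the raw bytes.
        return encode_bytea($bytes);
        $$;

        SELECT get_random_bytes(16);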

    • aaomidi 9 hours ago

      /dev/urandom.

      Some systems have basically made them equivalent though.
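
For comparison outside the database, here is a small Python sketch of the "just read the device" idea. It assumes a POSIX system where `/dev/urandom` exists; the function name and the fallback to `os.urandom` (itself backed by the OS CSPRNG) are illustrative choices, not something the thread prescribes.

```python
import os

def read_random_bytes(n: int, path: str = "/dev/urandom") -> bytes:
    """Read n bytes from the kernel RNG device, falling back to os.urandom."""
    try:
        # Unbuffered, so only the requested bytes are pulled from the device.
        with open(path, "rb", buffering=0) as dev:
            data = dev.read(n)
    except OSError:
        # Device missing or unreadable (e.g. on Windows): use the OS API instead.
        return os.urandom(n)
    if len(data) != n:
        raise OSError(f"short read from {path}: got {len(data)} of {n} bytes")
    return data

print(read_random_bytes(16).hex())
```

In application code `os.urandom(n)` or `secrets.token_bytes(n)` alone is usually enough; the explicit file read is shown only to mirror the comment above.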

  • heavensteeth 17 hours ago

    > I’m broadly against the use of Postgres extensions because they make upgrades harder and projects less portable [1],

    I can't find that footnote anywhere

  • davidfiala 5 days ago

    Exercise extreme caution.

    Having your security strategy rely on quirky behaviors of an implementation detail which might change is incredibly dangerous.

    • yunohn 4 days ago

      Not everything is a quirky implementation detail? It’s important for us developers not to write pure glue code between others’ functions, but also to understand them and write our own useful code that may extend others’ work.

    • hinkley 5 days ago

      UUID v6 isn’t going to change. There’s a reason we have seven of them now. And v8, which would warrant your warning.

      • poincaredisk 5 days ago

        UUIDv6 won't change, but what about gen_random_uuid()

        • creatonez 5 days ago

          There is widespread acceptance nowadays that randomized UUIDs must be generated from the system CSPRNG or something equivalent, and that any non-cryptographically secure method is a bug. Most library implementations across languages have converged on this in some way.

          That being said, the PostgreSQL documentation doesn't say anything in particular about the predictability of `gen_random_uuid`, so the behavior is unspecified. But it's worth noting the function has an explicit guard to raise an error if secure random is not available, so they were conscious of this possibility and did not attempt any misguided fallbacks.

          And unfortunately this requirement is not baked into the UUID spec either, which uses the word "should" instead of "must" when discussing CSPRNG usage.
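
As an illustration of "secure source or an error, never a weak fallback", here is a hedged Python sketch that builds a v4 UUID directly from CSPRNG bytes. `secure_uuid4` is a hypothetical name; the sketch relies on `secrets.token_bytes`, which draws from the OS random source and raises if none is available. It is not how PostgreSQL implements `gen_random_uuid`.

```python
import secrets
import uuid

def secure_uuid4() -> uuid.UUID:
    """Build a v4 UUID from CSPRNG bytes; failure to obtain them propagates."""
    raw = bytearray(secrets.token_bytes(16))
    raw[6] = (raw[6] & 0x0F) | 0x40  # set the version nibble to 4
    raw[8] = (raw[8] & 0x3F) | 0x80  # set the RFC 4122 variant bits (10xx)
    return uuid.UUID(bytes=bytes(raw))

u = secure_uuid4()
print(u, u.version)
```

There is deliberately no `except` clause that swaps in `random.getrandbits` or a timestamp: if the secure source is missing, the caller sees the exception.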

        • masklinn 5 days ago

          gen_random_uuid isn’t going to change either, the entire point is to generate a secure uuid4. At most it’ll get faster due to using platform-specific syscalls.

  • hinkley 5 days ago

    If you’re shopping for a CSPRNG, one of the items that should be very high on your list is being able to call the setSeed function multiple times and have the inputs compose instead of clobber each other.

    You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy. Do not chop, rearrange, hash, or bit shift the data trying to make it “stronger”; the CSPRNG will do an infinitely better job of that for you. Just treat it like a Mr Fusion. Drop a can, a banana peel and the stale beer in and let it cook.

    I gave a similar speech to a team trying to initialize SSL sessions on an embedded machine. “But what if we XOR…” No. Stahp.

    • yunohn 4 days ago

      AFAIU the blog author is taking the correctly randomly generated UUID and just cutting out the timestamp portion.

      Why are you equating that to a hacky attempt to make less random data more random?

      • hinkley 4 days ago

        Because it’s essentially setSeed(getTimeMillis()). V6 and v7 are sortable, that’s why they exist. Which means like getTimeMillis() there are a finite number of starting points to try to guess the seed.

        • yunohn 4 days ago

          This is v4.

          • hinkley 2 days ago

            v6 is a transform of v1 UUIDs to behave like v7 keys with respect to database indexing - increasing over time. If it’s a function of time, then it’s still guessable by brute force.

            Another responder suggested that the mention of v6 UUIDs is an error. Maybe. But that’s a truly bizarre typo to make. And they still haven’t fixed it.
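
The general argument about time-derived seeds can be sketched in Python. This illustrates why `setSeed(getTimeMillis())`-style seeding is guessable; it says nothing about `gen_random_uuid()`, which other commenters point out produces a v4 UUID. The function names, timestamps and the 5-second search window are assumptions made for the example.

```python
import random

def token_from_time_seed(seed_ms: int) -> int:
    """Derive a 64-bit 'secret' from a PRNG seeded only with a timestamp."""
    rng = random.Random(seed_ms)
    return rng.getrandbits(64)

def recover_seed(token: int, earliest_ms: int, latest_ms: int):
    """Try every millisecond in the window until the token is reproduced."""
    for candidate in range(earliest_ms, latest_ms + 1):
        if token_from_time_seed(candidate) == token:
            return candidate
    return None

secret_seed = 1_700_000_123_456  # hypothetical creation time in ms
token = token_from_time_seed(secret_seed)

# An attacker who knows the creation time to within a few seconds
# needs only a few thousand guesses.
found = recover_seed(token, 1_700_000_120_000, 1_700_000_125_000)
print(found == secret_seed)  # True
```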

    • ronsor 5 days ago

      Can you give some examples of CSPRNG implementations that allow this?

      • hinkley 4 days ago

        The ones in Java did in the context of the original discussion. Any cryptographic-hash-based PRNG should be capable of doing so; it’s an implementation detail whether you restart the data collection or append the data.

        Just poke in the setSeed function and see what it does.

      • andreareina 5 days ago

        The Linux rng allows writing to a file to effect this.
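
A hedged Python sketch of that mechanism, assuming a Linux system where `/dev/urandom` is writable by the current user. As far as I know, data written this way is mixed into the kernel pool without being credited as entropy. The function name is hypothetical.

```python
def mix_into_kernel_pool(data: bytes, path: str = "/dev/urandom") -> bool:
    """Write data to the RNG device; return False if the write is not possible."""
    try:
        with open(path, "wb", buffering=0) as dev:
            dev.write(data)
    except OSError:
        # Device missing or not writable on this platform.
        return False
    return True

print(mix_into_kernel_pool(b"some extra, possibly low-quality input"))
```

Writing never replaces the pool's existing state, so poor input does not make later output worse.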

    • amenhotep 4 days ago

      You seem to be saying that

      setSeed(0); setSeed(1); rand()

      and

      setSeed(1); rand()

      returning different values is not only a good idea but is already a thing. Am I wrong?

      This would confuse the hell out of me, what specifically has this behaviour?

      • hinkley 2 days ago

        If you’re trying to create a repeatable source of randomness for unit testing for example, you would create a new PRNG for each run, not try to recycle an existing instance. You’re making assumptions about state that aren’t supportable.

        • Dylan16807 a day ago

          The behavior you are describing is not setting a seed. A seed should wipe out all existing state.

          Adding entropy is a very different operation.
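
The two operations being contrasted can be sketched in Python. `random.Random.seed` replaces the generator's state, while the toy pool below folds each input into the existing state. `ToyPool` is a hypothetical illustration of composition only: it is not a real CSPRNG and must not be used as one.

```python
import hashlib
import random

class ToyPool:
    """Toy hash-based pool: NOT a real CSPRNG, it only shows inputs composing."""

    def __init__(self) -> None:
        self.state = bytes(32)

    def add_entropy(self, data: bytes) -> None:
        # The new state depends on the old state and the new input.
        self.state = hashlib.sha256(self.state + data).digest()

    def output(self) -> bytes:
        return hashlib.sha256(b"out" + self.state).digest()

def seeded_value(*seeds: int) -> float:
    """Seed a PRNG repeatedly; each call wipes out the previous state."""
    rng = random.Random()
    for s in seeds:
        rng.seed(s)
    return rng.random()

def pooled_value(*inputs: bytes) -> bytes:
    """Feed inputs into the pool one after another, then draw output."""
    pool = ToyPool()
    for data in inputs:
        pool.add_entropy(data)
    return pool.output()

print(seeded_value(0, 1) == seeded_value(1))           # True: seeding clobbers
print(pooled_value(b"0", b"1") == pooled_value(b"1"))  # False: inputs compose
```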

    • ynik 5 days ago

      > You can send half-random input in and then send more half-random input in until you’re satisfied that the RNG has gotten a suitable amount of entropy.

      This does not actually work. If an attacker can observe output of the CSPRNG, and knows the initial state (when it did not yet have enough entropy), then piecemeal addition of entropy allows the attacker to bruteforce what the added entropy was. To be safe, you need to add a significant amount of entropy at once, without allowing the attacker to observe output from an intermediate state. But after you've done that, you won't ever need to add entropy again.
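
A toy Python sketch of this attack, under stated assumptions: the attacker knows the initial state, entropy arrives one byte at a time, and an output is observable after every addition. The `mix` and `output` functions are a hypothetical hash-based stand-in for a real CSPRNG.

```python
import hashlib

def mix(state: bytes, data: bytes) -> bytes:
    """Fold new input into the state (stand-in for a CSPRNG reseed)."""
    return hashlib.sha256(state + data).digest()

def output(state: bytes) -> bytes:
    """Produce observable output from the current state."""
    return hashlib.sha256(b"out" + state).digest()

def brute_force_chunks(initial_state: bytes, observed: list) -> bytes:
    """Recover one-byte additions from outputs seen after each addition."""
    state = initial_state
    recovered = []
    for leak in observed:
        for guess in range(256):
            candidate = mix(state, bytes([guess]))
            if output(candidate) == leak:
                recovered.append(guess)
                state = candidate
                break
        else:
            raise ValueError("no guess matched")
    return bytes(recovered)

known_state = bytes(32)              # attacker knows the low-entropy start
secret_chunks = b"\x13\x37\xc0\xde"  # 32 bits of entropy, added 8 bits at a time
state = known_state
leaks = []
for b in secret_chunks:
    state = mix(state, bytes([b]))
    leaks.append(output(state))      # output observed between additions

recovered = brute_force_chunks(known_state, leaks)
print(recovered == secret_chunks)    # True
```

Recovering the four bytes takes at most 4 × 256 guesses here. Had all 32 bits been mixed in before any output was visible, the search space would be 2^32.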

      • beng-nl 4 days ago

        You’re right, but I did not read GP to suggest otherwise.

        GP does not suggest using the output before enough entropy has been gathered; see, e.g., ‘until’ in:

        > until you’re satisfied that the RNG has gotten a suitable amount of entropy.

      • hinkley 4 days ago

        Sibling already answered this. I don’t know how you came to this conclusion.