SQL: Incorrect by Construction

(chreke.com)

43 points | by ingve 20 hours ago

38 comments

  • tcp_handshaker 16 hours ago

    Post shows beginner T-SQL under SQL Server lock-based READ COMMITTED defaults, then generalizes that to "SQL is incorrect by construction."

    The conclusion should be much narrower. For the same-row double debit, true SNAPSHOT isolation (enabled and actually used, not just RCSI) turns the loser into an update conflict plus a retry. The idiomatic SQL fix is a conditional debit UPDATE, a row-count check, constraints, and one transaction for the whole transfer.

    The deadlock example is also overstated. SQL Server detects the cycle, rolls back a victim, returns error 1205 [1], and the application is expected to retry. Microsoft has shitty defaults, and the author turned that into a much broader claim.

    [1] - "Your transaction (process ID #...) was deadlocked on {lock | communication buffer | thread} resources with another process and has been chosen as the deadlock victim. Rerun your transaction"

    "Deadlocks guide" - https://learn.microsoft.com/en-us/sql/relational-databases/s...
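    That expected retry can be sketched at the application layer. A toy Python sketch: DeadlockError and run_with_retry are hypothetical stand-ins for whatever exception your driver raises for error 1205 and your own retry helper.

```python
import time

DEADLOCK_VICTIM = 1205  # SQL Server: "chosen as the deadlock victim"

class DeadlockError(Exception):
    """Hypothetical stand-in for a driver exception carrying error 1205."""
    number = DEADLOCK_VICTIM

def run_with_retry(txn, attempts=3, backoff=0.01):
    """Re-run a transaction when it loses a deadlock, as the docs expect."""
    for attempt in range(attempts):
        try:
            return txn()
        except DeadlockError:
            if attempt == attempts - 1:
                raise  # give up after the last attempt
            time.sleep(backoff * (2 ** attempt))  # brief exponential backoff

# Simulated transaction: loses one deadlock, then commits on the retry.
calls = {"n": 0}
def transfer():
    calls["n"] += 1
    if calls["n"] == 1:
        raise DeadlockError()
    return "committed"

result = run_with_retry(transfer)
print(result)  # → committed (after one retry)
```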

  • hilariously 16 hours ago

    Generally I would recommend an append-only data structure here, not a bunch of updates:

      BEGIN TRANSACTION;
    
      IF EXISTS (
        SELECT 1
        FROM account_balances WITH (UPDLOCK, HOLDLOCK)
        WHERE owner = 'alice'
          AND balance >= 10
      )
      BEGIN
        INSERT INTO account_ledger (owner, amount, memo)
        VALUES
            ('alice', -10, 'transfer to bob'),
            ('bob',    10, 'transfer from alice');
      END
    
      COMMIT TRANSACTION;
    
    or if I wanted to use the update pattern since I am taking the lock anyway

      BEGIN TRANSACTION;
    
      UPDATE accounts WITH (UPDLOCK)
      SET balance = balance - 10
      WHERE owner = 'alice'
      AND balance >= 10;
    
      IF @@ROWCOUNT = 1
      BEGIN
        UPDATE accounts
        SET balance = balance + 10
        WHERE owner = 'bob';
      END
    
      COMMIT TRANSACTION;
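    For what it's worth, the row-count pattern above behaves the same way outside SQL Server. A toy sketch in Python with SQLite standing in (no UPDLOCK hints there, but the conditional debit plus rowcount check is identical):

```python
import sqlite3

# SQLite stands in for SQL Server here: no lock hints, but the
# conditional-debit-plus-rowcount pattern is the same.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE accounts (owner TEXT PRIMARY KEY, balance INTEGER)")
con.executemany("INSERT INTO accounts VALUES (?, ?)",
                [("alice", 15), ("bob", 0)])
con.commit()

def transfer(con, src, dst, amount):
    """Debit only if funds cover it; credit only if the debit hit a row."""
    with con:  # one transaction for the whole transfer
        cur = con.execute(
            "UPDATE accounts SET balance = balance - ? "
            "WHERE owner = ? AND balance >= ?",
            (amount, src, amount))
        if cur.rowcount != 1:  # insufficient funds or unknown account
            return False
        con.execute(
            "UPDATE accounts SET balance = balance + ? WHERE owner = ?",
            (amount, dst))
        return True

ok1 = transfer(con, "alice", "bob", 10)  # succeeds: alice 15 -> 5
ok2 = transfer(con, "alice", "bob", 10)  # refused: alice only has 5
print(ok1, ok2)  # → True False
```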
    • LgWoodenBadger 16 hours ago

      What’s amusing is you’ve now stolen $10 from all of Alice’s accounts if she has more than one, or if Bob has none.

      • hilariously 4 hours ago

        Sure, you'd need a better key; I was just giving an example.

  • crazygringo 17 hours ago

    This is just like... SQL 101 for transactions and locking. These are basic, elementary concepts in databases.

    There's nothing "incorrect by construction".

    The author claims the original snippet "looks completely reasonable". It absolutely does not, if you know anything about client-server databases.

    • Bjartr 17 hours ago

      Not going past the end of a string or array is C 101 and yet buffer overruns abound. Rust's memory model doesn't do anything you shouldn't already be doing in C, but it provides real value because in practice people are demonstrably bad at doing those things. So would a SQL variant or successor that made these "beginner" mistakes a lot harder to make.

      • jmye 10 hours ago

        No one is going to adopt some new “sql, but it solves a dumb problem you’ve already solved!” variant to solve a dumb problem they’ve already solved, no matter how many beginners seem to be writing accounting software with no knowledge of all the ways that problem has been solved.

  • traderj0e 18 hours ago

    I've encountered this dozens of times. It's not intuitive, but this implicitly locks the row against concurrent reads, whereas SELECTing first won't:

      UPDATE accounts  
      SET balance = balance - 10  
      WHERE owner = 'alice' AND balance >= 10;
    
    Another possible surprise, say two xacts do this at the same time:

      INSERT INTO foo(num) (
        SELECT 1 WHERE NOT EXISTS (
          SELECT * FROM foo WHERE num = 1
        )
      );
    
    Without a UNIQUE on num, you get num=1 twice. Of course adding UNIQUE would prevent this, but what you might not expect is that UNIQUE implicitly adds a lock too. So not only do you get num=1 once, but both xacts are also guaranteed to succeed, which in some situations is an important distinction.
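    SQLite makes the fixed version easy to poke at. INSERT OR IGNORE below is SQLite's spelling of "let the UNIQUE constraint resolve the race" (Postgres would use ON CONFLICT DO NOTHING); a toy sketch, not SQL Server semantics:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE foo (num INTEGER UNIQUE)")

# Two "transactions" run the same statement. The UNIQUE constraint, not
# the application-level NOT EXISTS check, is what guarantees a single row,
# and OR IGNORE means neither statement errors out.
for _ in range(2):
    con.execute("INSERT OR IGNORE INTO foo (num) VALUES (1)")

rows = con.execute("SELECT num FROM foo").fetchall()
print(rows)  # → [(1,)] exactly one row, both statements succeeded
```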

    Schools teach that databases are ACID, but in most cases they aren't by default, and enabling full ACID comes with other caveats and also a large performance hit.

    • mrits 17 hours ago

      One issue is that a lot of database enhancements and known side effects were introduced at a time when SQL was not only a full-time job, but often one of the best-paid and most senior engineering roles on the team.

      It has since become a tool of even front end engineers.

      • traderj0e 17 hours ago

        Yeah I keep running into situations where everyone else is a backend engineer but only has cursory knowledge of databases. Maybe they think the DB is just some modular add-on you can swap out, or they understand it's the foundation of all their backend code but don't really know how to do it right.

  • galkk 16 hours ago

    > Let the user manage locks themselves, and make sure the correct locks are acquired before mutating a database object.

    ??? This doesn't make sense. It's like saying "just implement it properly".

    what about distributed clients? what about _different_ clients?

    • traderj0e 15 hours ago

      Yeah, SERIALIZABLE will guarantee you get enough locks to avoid race conditions, it's just very slow. The only way to do better in theory is to have your client code aware of all other client code, even then idk how exactly you'd make that take the right locks.

      Also idk what the Rust suggestion is. Nothing in Rust would guard against this kind of race condition if you replicated the example over there. You can do mutex or even RwMutex on a single struct, but that doesn't force you to hold struct A's read mutex while you write to struct B.
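      The "mutexes don't compose" point doesn't even need a database. Here's a plain-Python sketch where the bad interleaving is forced by hand (no real threads, just two stale reads):

```python
# Two "transactions" each read alice's balance, pass the check, then write.
# The interleaving is forced: both reads happen before either write.
# A lock around each individual read or write would not prevent this;
# only a lock held across the whole read-check-write sequence would.
balances = {"alice": 10, "bob": 0}

t1_seen = balances["alice"]  # T1 reads 10, check passes
t2_seen = balances["alice"]  # T2 also reads 10, check also passes

balances["alice"] = t1_seen - 10  # T1 writes
balances["bob"] += 10
balances["alice"] = t2_seen - 10  # T2 writes over T1's debit
balances["bob"] += 10

print(balances)  # → {'alice': 0, 'bob': 20}: 20 credited, only 10 debited
```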

  • chasil 17 hours ago

    This document relies strongly upon Transact-SQL:

    https://en.wikipedia.org/wiki/Transact-SQL

    A more universal industry standard is SQL/PSM, which originated from Oracle PL/SQL:

    https://en.wikipedia.org/wiki/SQL/PSM

    Demonstrating the flaws in question in the PSM standard would be more useful.

    • traderj0e 17 hours ago

      You can in fact show the same flaws in PL/SQL or whatever else

  • Jtsummers 16 hours ago

    I don't understand how, after the preceding sections, this makes it into the conclusion:

    > Let the user manage locks themselves, and make sure the correct locks are acquired before mutating a database object.

    As demonstrated by the extended (corrected) version of the transaction, the user is controlling which locks get used. So how does this make it into the conclusion as a want, when it already is how it works?

  • cowboylowrez 41 minutes ago

    yes wrap everything in transactions

      set @balance = (....
    
    heh, assigning a value to @balance: is that really part of the transaction? because @balance isn't even part of the database lol

    still, anytime you use an isolation level besides serializable you have to know the details, and even with serializable, if you aren't catching failures in your app then you're sending your deadlock or timeout message to your end user.

    I used to debug procedures with "sleep" tsql interspersed through code, this way I could deliberately overlap procedure executions and see how that went.

    In my last job I had to support app developers who were happy with the "nolock" keyword even when marshalling info for updates. Shortly before I left that job in disgust I was at least able to have the developers quit using "nolock" everywhere, unfortunately their remediation to prevent use of the "nolock" keyword was to use "SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED" at the top of their code haha

  • mrkeen 17 hours ago

    > I want an alternative to SQL

    I don't think it's SQL itself - it's that DB vendors ship weak isolation so people aren't hit by deadlocks, isn't it?

    > Make transactions atomic by default

    Not the issue, right? It's the weak isolation.

  • socketcluster 15 hours ago

    It's interesting reading this.

    Preventing these kinds of concurrency issues is exactly why I built https://socketcluster.io years ago. Though it solves the problem at the app layer rather than the storage layer.

    But not many developers care about these race conditions it seems.

    It's not just an issue with SQL but a more general issue with many programming languages and approaches.

    This is a great example because it shows how concurrent executions can lead to significant issues.

  • gonzalohm 18 hours ago

    If you need concurrency design your system for concurrency.

    Have a transactions table with the payer and receiver and calculate the current balance using the transactions.

    Each transaction must have a unique Id (pk)

    • traderj0e 18 hours ago

      That is actually worse, I've been there. It's good to keep logs like that, but you can't use that for locking, you need a separate balances table.

      Edit: Well another option is to add a "pending" col and do three separate db xacts: 1. insert pending=true row 2. select balance with pending debits deducted (which ages out pending rows older than 1min) 3. update row to pending=false if successful. This is a useful pattern if you're waiting on an external system too, but not good in this case where you're just trying to update in one DB.

      • cozzyd 17 hours ago

        your goal is to find if there is any combination of plausible transaction orders that results in a balance less than 0, so you can issue an overdraft fee.

        • traderj0e 17 hours ago

          The original stated goal is that we want to disallow overdraft. If you want to allow it instead, then there are some followup questions like do you want to limit how much they could overdraft. But this is meant to be an example of race conditions, not a real world bank.

    • grebc 17 hours ago

      This is the answer for any serious banking/accounting software.

      Balance is calculated & stored after the fact from a known correct value.

  • taeric 18 hours ago

    SQL is intended as a declarative query language. That it is not the correct tool for imperative processing of updates feels expected? And mostly fine?

    Fair that things often grow beyond their original intent.

    • SkiFire13 8 hours ago

      If you don't use SQL for this, you'll need to implement the locking externally yourself, even though databases already implement it.

      • taeric 17 minutes ago

        For the example of an account, moving to double entry accounting is probably the correct move. With an external reconciliation process to take action if they don't match.

        But, that "external" part is what trips up a lot of people. Few things are confined to only exist within the database. Such that sometimes you can't do locks that accurately portray what we can order outside of the computer system. Think legal clawbacks and the like.

    • 18 hours ago
      [deleted]
    • traderj0e 17 hours ago

      You run into the same issues without using the weird imperative syntax in this article.

      • taeric 16 hours ago

        Fair. It would help if the examples didn't always fall back to processes that should have reconciliation procedures to cover when they differ. Double-entry accounting combined with the necessary follow-on processes can't be "coded away."

        Which is to say that I am definitely indexing on the idea that you would try and get a query language to encode processes being the problem, here.

  • exabrial 11 hours ago

    > What if Alice and Bob both try to transfer money to each other at the same time?

    > Let’s map out the transactions again:

    > T1: Acquire a lock on Alice’s account

    > T2: Acquire a lock on Bob’s account......

    ::loud sigh::

    In MySQL, if you acquire a lock on Alice, then attempt to acquire Bob... then another session acquires Bob, then attempts to acquire Alice: the engine notices. Nothing bad happens. One of the threads gets marked as the deadlock victim and is rolled back; the other succeeds. So... yeah, not really a problem. MySQL chooses at "random" which one lives.

    Actually... hang on, pause, let me remind you what you learned in your undergrad CS 400 class: acquire locks in a consistent order. You could sort by account id, or even by first name. Hell, sort by their cats' names. It doesn't matter; it just has to be deterministic, so: Alice comes before Bob. So, doing this properly:

    T1: Acquire a lock on Alice’s account with SELECT FOR UPDATE

    T2: Acquire a lock on Alice’s account with SELECT FOR UPDATE... ::waits::

    Annnnnnd, yeah, magic. Problem solved.
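    That ordering rule is a one-liner in any language. A Python sketch with in-process locks standing in for row locks (names and amounts are illustrative):

```python
import threading

# Deterministic lock ordering: always lock the lexicographically smaller
# account first, so two opposing transfers can never deadlock.
balances = {"alice": 100, "bob": 100}
locks = {name: threading.Lock() for name in balances}

def transfer(src, dst, amount):
    first, second = sorted([src, dst])  # the consistent global order
    with locks[first], locks[second]:
        if balances[src] >= amount:
            balances[src] -= amount
            balances[dst] += amount

# Opposing transfers at the same time: the classic deadlock setup.
t1 = threading.Thread(target=transfer, args=("alice", "bob", 10))
t2 = threading.Thread(target=transfer, args=("bob", "alice", 10))
t1.start(); t2.start()
t1.join(); t2.join()

print(balances)  # both complete, no deadlock; total is still 200
```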

    I don't find any of this too difficult, and really the only thing SQL engines do poorly is reporting on what locks are being held. Adding in XA transactions can make things even more intuitive and gives you a nice building block if you need to coordinate sending messages to a broker or something else.

    Oh and lastly, you should be using a ledger table for this; not storing balances as columns. Lock on both accounts, insert parallel rows into the ledger. And don't forget to use fixed point decimal datatypes.

    I'll show myself to the door now. Thank you.

  • bena 17 hours ago

    You don't update balances, you enter transactions. Then you derive balances from transactions.

    You can even insert them in an unvalidated state, then validate them later. That way if you have two transactions that come one after another, it doesn't matter because you can process them sequentially anyway.
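    A minimal sketch of that in SQLite (schema and names are illustrative): the ledger is append-only and the balance is always derived, never stored:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE ledger (
    id     INTEGER PRIMARY KEY,   -- unique id per ledger entry
    owner  TEXT NOT NULL,
    amount INTEGER NOT NULL,      -- signed: debits negative, credits positive
    memo   TEXT)""")

con.executemany("INSERT INTO ledger (owner, amount, memo) VALUES (?, ?, ?)", [
    ("alice", 100, "deposit"),
    ("alice", -10, "transfer to bob"),
    ("bob",    10, "transfer from alice"),
])
con.commit()

def balance(con, owner):
    # Derived on demand. COALESCE covers owners with no entries yet.
    (bal,) = con.execute(
        "SELECT COALESCE(SUM(amount), 0) FROM ledger WHERE owner = ?",
        (owner,)).fetchone()
    return bal

print(balance(con, "alice"), balance(con, "bob"))  # → 90 10
```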

  • giancarlostoro 18 hours ago

    This assumes you don't do any sort of caching or use distributed systems that can cache the data and choose to hold off to write it all to the DB. The cached system can show both users the in-process transactions as well.

    • traderj0e 18 hours ago

      That introduces more questions, like: are cache reads fully consistent?

  • karmakaze 13 hours ago

    Yes, that's incorrectly constructed SQL for the selected isolation level.

  • selimthegrim 18 hours ago

    Wonder what the [check-constraints] part meant or if it's a placeholder.

    • MattRogish 16 hours ago

      “I intentionally wrote the example code the way a beginner might. More experienced users would probably reach for solutions like: updating and checking the balance in a single UPDATE, or using [check constraints] to ensure an account balance can never be negative.” Not the author, but presumably: ALTER TABLE foo ADD CONSTRAINT BalanceCantBeNegative CHECK (balance >= 0)
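      In SQLite that constraint looks like this (a toy sketch; the point is that the database itself rejects any statement that would drive a balance negative):

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""CREATE TABLE accounts (
    owner   TEXT PRIMARY KEY,
    balance INTEGER NOT NULL CHECK (balance >= 0))""")
con.execute("INSERT INTO accounts VALUES ('alice', 10)")

rejected = False
try:
    # Would overdraft alice by 10: the CHECK constraint refuses it.
    con.execute(
        "UPDATE accounts SET balance = balance - 20 WHERE owner = 'alice'")
except sqlite3.IntegrityError:
    rejected = True

(bal,) = con.execute(
    "SELECT balance FROM accounts WHERE owner = 'alice'").fetchone()
print(rejected, bal)  # → True 10: the overdraft never happened
```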

    • minkeymaniac 18 hours ago

      [dead]

  • analyticsfs 17 hours ago

    [dead]