Anatomy of High-Performance Matrix Multiplication (2008) [pdf]

(cs.utexas.edu)

36 points | by tosh 3 days ago

1 comment

  • srean 15 hours ago

    FLAME [0] was a joy to work with and work on.

    The fun and honestly quite revealing part of the course was that operations on matrices, even when expressed through higher-level primitives of conformal partitioning (the same techniques one uses for proving matrix properties), could yield fast routines.
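
    To make the partitioning idea concrete, here is a minimal sketch in plain Python. It is not FLAME's API; the function names and the `block` parameter are made up for illustration. It splits A into column panels and B into matching row panels, then accumulates C as a sum of panel products, which is the same identity one would write down in a proof.

```python
def matmul_ref(A, B):
    """Textbook triple loop, used only as a reference result."""
    m, k, n = len(A), len(B), len(B[0])
    return [[sum(A[i][p] * B[p][j] for p in range(k)) for j in range(n)]
            for i in range(m)]


def matmul_partitioned(A, B, block=2):
    """Compute C = A B by sweeping conformal partitions of A and B.

    A is split into column panels A_p and B into matching row panels B_p,
    so that C = sum over p of A_p B_p. (Hypothetical helper, not a FLAME call.)
    """
    m, k, n = len(A), len(B), len(B[0])
    C = [[0] * n for _ in range(m)]
    for p0 in range(0, k, block):
        p1 = min(p0 + block, k)  # the last panel may be narrower
        # Panel update: C += A[:, p0:p1] * B[p0:p1, :]
        for i in range(m):
            for j in range(n):
                C[i][j] += sum(A[i][p] * B[p][j] for p in range(p0, p1))
    return C


A = [[1, 2, 3], [4, 5, 6]]
B = [[7, 8], [9, 10], [11, 12]]
print(matmul_partitioned(A, B, block=2))  # [[58, 64], [139, 154]]
```

    The pure-Python version is of course not fast; the point is only that the loop walks the partition boundaries, so the panel size can later be tuned to the cache without touching the correctness argument.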

    A side effect was that these primitives, used as a DSL, generated not only fast C code but also a human-believable proof of correctness (as opposed to automated theorem proving), rendered in LaTeX.

    Anyone who likes this style should also check out PLAPACK [1].

    Both FLAME and PLAPACK are by the same author.

    [0] https://www.cs.utexas.edu/~flame/pubs/fire.pdf

    [1] https://www.cs.utexas.edu/~plapack/