3 comments

  • rkwz 10 hours ago

    This is nice!

    How do you extract the data from PDFs or images? How do you reduce inaccuracies in this process?

    • essaylor 8 hours ago

      It's a combination of an LLM and some pre- and post-processing. Data extraction itself has been fairly accurate in my experience. The bigger challenge has been biomarker name normalization, because different labs often name the same biomarker quite differently.
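
For readers wondering what biomarker name normalization might look like, here is a minimal sketch that assumes a hand-maintained alias table. It is not necessarily how the tool discussed in this thread works; the `ALIASES` table, its entries, and the `normalize_biomarker` function are hypothetical names chosen for illustration.

```python
import re

# Hypothetical alias table: canonical name -> variants seen on lab reports.
# A real table would be much larger and curated by hand or from a reference set.
ALIASES = {
    "hemoglobin a1c": ["HbA1c", "A1C", "Glycated Hemoglobin"],
    "ldl cholesterol": ["LDL", "LDL-C", "LDL Chol Calc"],
    "vitamin d, 25-hydroxy": ["25-OH Vitamin D", "25-Hydroxyvitamin D"],
}


def _key(name: str) -> str:
    """Lowercase the name and turn every run of non-alphanumerics into one space."""
    return re.sub(r"[^a-z0-9]+", " ", name.lower()).strip()


# Flatten the table into a lookup from cleaned variant to canonical name.
LOOKUP = {}
for canonical, variants in ALIASES.items():
    LOOKUP[_key(canonical)] = canonical
    for variant in variants:
        LOOKUP[_key(variant)] = canonical


def normalize_biomarker(raw_name: str):
    """Return the canonical biomarker name, or None if the name is unknown."""
    return LOOKUP.get(_key(raw_name))
```

Names that return `None` could be queued for manual review or handed to a fuzzier matcher; an exact lookup like this only covers variants someone has already listed.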

      • rkwz an hour ago

        Thanks, sounds interesting!