# About Me

I graduated with my PhD in Computer Science / NLP from the [University of Maryland]() in 2022. For my dissertation, I did research on [building NLP tools for collection browsing](). My advisors were [Philip Resnik]() and [Doug Oard]().

Since then, I have worked at:

- [Adobe Research](https://adobe.com), in their Document Intelligence Lab
- [Pattern Data](https://patterndata.ai), leading their ML/NLP work

I am __not__ Bengals quarterback [Joe Burrow](), for those of you who landed here Googling him. 😉

# Penpusher

My hobby project is [penpusher.app](https://penpusher.app), an AI-powered form-filling service. Email me, DM me, or sign up for the waitlist if you want beta access.

# Blog Posts

## The Document AI Landscape

A series of posts in which I dig deep into the current state of document AI: summarizing model releases, datasets, and interesting papers, and running a few experiments of my own.

**[[1. Tracking Large Multimodal Model (LMM) Announcements and Metrics (2023-2024)]]** _December 2024_ `🚧 under construction`
Looking at the landscape of document tasks that model labs publish benchmark results on.

**[[2. What can we learn about LMMs from Document Question Answering?]]** _January 2025_ `🚧 under construction`
An in-depth analysis of the DocVQA dataset. What is it, how was it constructed, and what can we learn from it?

**[[3. What can we learn about LMMs from Charts?]]** _February 2025_ `🚧 under construction`
A look at chart datasets and LMM performance on them. What are the limitations, and why aren't current models *there* yet?

## Paper Notes and Miscellaneous

Various paper notes, plus some explainers, guides, and general thoughts.

**[[Google Gemini 101 - Object Detection with Vision and Structured Outputs]]** _December 19th, 2024_
The missing manual for getting up and running quickly with Gemini 2.0 Flash: making calls with vision mode, getting structured returns, and getting/showing bounding boxes. We build a neat little zero-shot tea set detector. (**[Code](https://gist.github.com/jbarrow/c35279cbed578eeba4e1253dc6907c8c)**)

**[[Google Gemini 102 - Advanced Structured Outputs]]** _December 26th, 2024_ `🚧 under construction`
A deeper look at Gemini's structured outputs: required fields, the subset of the OpenAPI spec it supports, `$refs`, `$defs`, and using it with `openai`, oh my!

**[[Speeding up Visual Question Answering 10x (Faster ANLS)]]** _January 8th, 2025_ `🚧 under construction`
An explanation of Average Normalized Levenshtein Similarity (ANLS), which is used as a metric for evaluating document question answering (e.g. [[2. What can we learn about LMMs from Document Question Answering?|DocVQA]]). By swapping how edit distance is computed, we can get a 10x speedup and still pass all the tests.

**[[Evaluate Your Structured Outputs with ANLS*]]** _January 15th, 2025_ `🚧 under construction`
An explanation of ANLS*, a metric for evaluating typed/structured question answering.

%%
`🚧 under construction` How does localization work (and not work) in LMMs.
%%

## `TinyHNSW`: Build Your Own Vector Database

A series about how HNSW vector search works, from zero to a working approximate nearest neighbors (ANN) library in Python, with minimal dependencies (`numpy` and `networkx`, plus `matplotlib` for visualization).

**All the code is at [github.com/jbarrow/tinyhnsw](https://github.com/jbarrow/tinyhnsw).**

**[[0. Introduction to `TinyHNSW`|0. Introduction to `TinyHNSW`]]**
**Start here to build your own (reasonably efficient) vector database.** We'll start with a naive implementation, then go through how HNSW works piece-by-piece, and eventually put it all together into a simple and clean Python HNSW implementation.
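To give a flavor of where the series starts, here is a minimal sketch of that naive baseline: exact, brute-force nearest-neighbor search over a `numpy` matrix. The names here are illustrative and not necessarily the ones used in `tinyhnsw`; the point is that HNSW exists to (approximately) avoid this full scan over every stored vector.

```python
import numpy as np

def brute_force_search(index: np.ndarray, query: np.ndarray, k: int = 5):
    """Exact k-NN by cosine similarity: compare the query to every row of `index`.

    This costs O(n * d) per query, which is the scaling that HNSW's
    graph-based search is designed to sidestep.
    """
    # Normalize rows and query so a dot product equals cosine similarity.
    index_norm = index / np.linalg.norm(index, axis=1, keepdims=True)
    query_norm = query / np.linalg.norm(query)

    scores = index_norm @ query_norm           # similarity to every stored vector
    top_k = np.argsort(-scores)[:k]            # indices of the k most similar rows
    return top_k, scores[top_k]

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    vectors = rng.normal(size=(10_000, 128))   # 10k random 128-d "documents"
    query = rng.normal(size=128)
    ids, sims = brute_force_search(vectors, query, k=5)
    print(ids, sims)
```

Even once the approximate index is built, an exact search like this is handy as a correctness and recall reference.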
**[[1. 🚧 Nearest Neighbor Search and Vector Databases]]**

**[[2. 🚧 An Overview of HNSW]]**

# Publications

```
working on transferring publications over from:
https://scholar.google.com/citations?user=A0mvKlgAAAAJ&hl=en
```

# Contact Me

Feel free to reach out to me at:

```
joseph.d.barrow [at] gmail [dot] com
```

If you **include the URL of my blog in the subject line**, I'll look at the email.