Sunday, February 28, 2021

Learning Vespa

No, not the scooter :-).

I meant Vespa.AI, a search engine that supports structured search, text search, and approximate vector search. While Vespa's vector search functionality was probably built in response to search engines incorporating vector based signals into their ranking algorithms, many ML/NLP pipelines can also benefit from vector search, i.e., the ability to find nearest neighbors in high dimensional space at scale. It was Vespa's vector search feature that interested me as well.

The last couple of times I needed to implement a vector search feature in an application, I had considered using Vespa, and even spent a couple of hours on their website, but ultimately gave up and ended up using NMSLib (Non-Metric Space Library). This was because the learning curve looked pretty steep, and I was concerned that trying to learn it in parallel with the project would impact the project timeline.

So this time, I decided to learn Vespa by implementing a toy project with it. Somewhat to my surprise, I had better luck this time around. Some of it is definitely thanks to the timely and knowledgeable help I received from Vespa employees (and Vespa experts, obviously) on the Relevancy slack workspace. But I would attribute at least some of the success to the epiphany that there were correspondences between Vespa functionality and Solr. I wrote the post How I learned Vespa by thinking in Solr on the Vespa blog, which is based on that epiphany, and which describes my experience implementing the toy project with Vespa. If you have a background in Solr (and probably Elasticsearch) and are looking to learn Vespa, you might find it helpful.

One other thing I generally do for my ML/NLP projects is to create a couple of interfaces for users to interact with them. The first interface is for human users, and so far it has almost always been a skeletal but fully functional custom web application, minus most UI bells and whistles, since my front end skills are firmly stuck in the mid 1990s. In the past these were Java/Spring applications; more recently they have been CherryPy and Flask applications.

I have often felt that a full application is overkill. For example, my toy application does text search against the CORD-19 dataset, and MoreLikeThis style vector search to find papers similar to a given paper. A custom application not only needs to demonstrate the individual features but also the interactions between them. Of course, these are just two features, but you can see how it can get complicated real quick. However, most of the time, your audience is just looking to try out your features with different inputs, and has the imagination to see how it will all fit together. A web application is just a convenient way for them to do the former.

Which brings me to Streamlit. I had heard of Streamlit from one of my Labs colleagues, but I got a chance to see it in action during an informal demo by a co-member (non-work colleague?) of a meetup I attend regularly. Based on the demo, I decided to use it for my own application, giving each feature its own separate dashboard. The screenshots below show these two features with some actual data. The code to do this is quite simple, just Python calls to Streamlit functions, and doesn't involve any web frontend skills.

The second interface is for programmatic consumers. This toy example was relatively simple, but often an ML/NLP/search pipeline will involve talking to multiple services or other random complexities, and a consumer of your application doesn't really need or want to care about what's going on under the hood. In the past, I would build JSON API front-ends that mimicked the web front end (in terms of information content), and I did the same here with FastAPI, another library I had been planning to take a look at. As with Streamlit, FastAPI code is very simple and takes very little work to set up. As a bonus, it comes with a built-in Swagger UI that automatically documents your API and allows the user of your API to try out various services without an external client. The screenshots below show the request parameters and JSON response for the two services in my toy application.

You can find the code for both the dashboard and the API in the python-scripts/demo subdirectory of my sujitpal/vespa-poc repository. I factored out the application functionality into its own "package" (demo_utils.py) so it can be used from both Streamlit and FastAPI.

If you have read this far, you probably realize that the title of the post is somewhat misleading. This post has been more about the visible artifacts of my first toy Vespa application than about learning Vespa itself. However, I decided to keep the title as-is, since it was a natural lead-in for my dad joke in the next line. For a more thorough coverage of my experience with learning Vespa, I will point you once again to my blog post How I learned Vespa by thinking in Solr. Hopefully you will find that as interesting (if not more so) as this post.