It’s a foodie’s worst nightmare: you find yourself surrounded by baskets of fine delicacies, but unable to order due to a language barrier and a lack of familiarity with the dishes.
In real life, this happens every day: travelers to Southern China, Hong Kong, or even a local Chinatown have difficulty ordering dim sum.
The Problem
There’s frequently a knowledge gap between the restaurant and its patrons, as people who didn’t grow up eating dim sum lack visual familiarity with the offerings. On top of that, if there’s a language gap and no multilingual menu, the ordering process quickly falls apart.
As a customer, it’s difficult to tell what’s inside a basket of dumplings before you actually take a bite. Imagine dining out with people who have food allergies or dietary restrictions. If you can’t communicate with your waiter effectively, it becomes a big problem.
The Solution
Instead of the scenarios above, what if you could simply reach into your pocket, pull out your phone and immediately identify each of the dozens of dishes?
As you wave your phone over the steaming baskets, the names pop up on screen, quickly allowing you to identify what’s inside. Have friends that don’t eat pork? No problem. Need a substitute for steamed pork buns? The app can show you common alternatives, directing you towards vegetarian or chicken options that you can point to, or search the menu for.
With real-time identification of dishes, Dim Sum Detective enables an entirely new way to communicate with servers. Now instead of a grumpy smirk, the old ladies who push the carts might even give you half a smile.
Creating a deep learning model to identify dim sum
O.K., so we’ve identified a problem and a potential solution. Now how do we go about building it?
There are many steps, but just three major ones: build a custom dataset, train a model, and deploy it as an app.
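As a rough illustration of the first step, here is a minimal Python sketch of how a set of labeled photos might be split into training and validation sets before any training happens. It is not the code behind Dim Sum Detective: the function name `split_dataset`, the file names, and the 80/20 split are all assumptions made for the example.

```python
import random


def split_dataset(images_by_dish, valid_fraction=0.2, seed=42):
    """Split each dish's photos into (path, label) training and validation lists."""
    rng = random.Random(seed)  # fixed seed so the split is repeatable
    train, valid = [], []
    for dish, paths in images_by_dish.items():
        shuffled = list(paths)
        rng.shuffle(shuffled)
        # Hold out a fraction of every dish so each class appears in validation.
        n_valid = int(len(shuffled) * valid_fraction)
        valid += [(path, dish) for path in shuffled[:n_valid]]
        train += [(path, dish) for path in shuffled[n_valid:]]
    return train, valid


# Hypothetical file names, used only for illustration.
photos = {
    "har gow": [f"har_gow_{i}.jpg" for i in range(10)],
    "siu mai": [f"siu_mai_{i}.jpg" for i in range(10)],
}
train, valid = split_dataset(photos)
print(len(train), "training images,", len(valid), "validation images")
```

Splitting per dish, rather than shuffling everything together, keeps every class represented in the validation set even when the dataset is small.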
Eager to try it? You can test how the model performs using the demo app linked below. The current version can identify four dishes: shrimp dumplings (‘har gow’), steamed BBQ pork buns (‘charsiu bao’), pork and shrimp dumplings (‘siu mai’), and soup dumplings (‘xiao long bao’).
The web-based interface lets you submit an image from your phone’s library or take a new photo with your camera. If you’re on a laptop, press ‘Select Image’ to choose an image, then submit it by pressing ‘Analyze’.
Good sample photos to test the model contain just one type of dim sum at a time, like the three below:
Feel free to use these images to test it out. Update: The demo version is currently offline, but I’ll add a video of it in action here soon.
Telling one dish from another is not as easy as you might think! The model works by learning to identify visual patterns that provide clues as to what an object is.
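To make that a little more concrete, here is a minimal Python sketch of the last step a typical image classifier performs: turning one raw score per dish into probabilities with a softmax, then picking the most likely label. The scores below are made up for illustration, and the helper names `softmax` and `top_prediction` are hypothetical. A real model would compute the scores from the photo itself, and this is not necessarily how Dim Sum Detective is implemented.

```python
import math

# The four dishes the current version can identify.
LABELS = ["har gow", "charsiu bao", "siu mai", "xiao long bao"]


def softmax(scores):
    """Convert raw scores into probabilities that sum to 1."""
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(score - peak) for score in scores]
    total = sum(exps)
    return [value / total for value in exps]


def top_prediction(scores, labels=LABELS):
    """Return the most likely label and its probability."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]


# Hypothetical scores for a single photo, one per dish in LABELS.
example_scores = [2.0, 0.5, 3.1, 1.2]
label, confidence = top_prediction(example_scores)
print(f"{label} ({confidence:.0%})")
```

The probability attached to the winning label is useful in an app like this: when it is low, the interface could show the top two or three guesses instead of a single name.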
Next Steps
There are several steps I’d like to take to develop Dim Sum Detective further in 2020. I’m going to train the model on more types of dim sum and then design a full user interface for the app. These are just the early stages of the app’s development! Stay tuned for part II.