According to the Pfeiffer report, 80% of unstructured content is image-based and 10% of it is video-based. With the appetite for visual media growing (42% of Gen Z uses TikTok for search) and generative AI usage increasing every day, the amount of unstructured content is set to explode. Managing digital assets effectively in the face of this is critical.
In this post, we’ll walk through the challenges of managing digital assets, the problems those challenges lead to, and how AI-first multimodal search can solve them and even unlock new ways of working.
Digital assets have a search problem
Anyone who works with a digital asset library wants to do one thing: find the right asset at the right time. Sounds obvious, right?
Still, 33% of teams waste three weeks per year searching for assets (Canto, 2021) and 52% of organizations reported their employees can’t find the information they need (AIIM, 2018).
Assets may be scattered across multiple stores (Google Drive, Dropbox, etc.), which forces users to run multiple searches in different places. Even with a Digital Asset Management (DAM) system to keep everything in one place, search can still fall short.
When you need to find something specific like a skydiving clip with a company logo on the parachute, or a 3D asset of foliage for a scene, being able to express what you want in plain language and have the system understand and retrieve what you’re looking for is vital.
Isn’t that what tagging is supposed to solve?
There are three problems with tagging:
- Maintaining discoverability with tagging is a time-consuming chore.
- It’s impossible to think of all the ways someone might search for something (e.g. mood, colors, visual perspective, purpose), so tagging doesn’t lead to adequate search support.
- Automatic taggers are unreliable. We know because we tested a bunch of them. Contact us if you’re interested in details.
The combination of shallow, keyword-only search and tagging chores leads to cumbersome workflows, wasted effort duplicating work that already exists, and missed opportunities.
How multimodal search addresses all these problems
Here’s multimodal search in one sentence:
Multimodal search is a semantic search method that allows users to find and explore different types of media, such as text, images, and videos, using simple and descriptive language queries.
To make this possible, there needs to be a platform that can ingest and automatically “understand” media of any kind. Given an image, for example, the platform should recognize objects and landmarks, as well as attributes such as shapes and colors. Beyond that, it should also be aware of sentiment and intent.
At Kailua Labs, we’ve built exactly that. Check out how it works on a set of stock images:
With a system like this, companies achieve three improvements for digital asset management:
- They don’t need to spend any time tagging to make assets discoverable. Just load the data and they’re done.
- Users can now search using descriptive language to find exactly what they’re looking for.
- Greater asset usage. The more easily users can find what already exists, the less likely they are to recreate it or settle for a subpar asset.
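To make "search by meaning instead of by tags" a little more concrete, here is a minimal sketch of the general idea behind semantic search: assets and queries are represented as vectors in a shared space, and results are ranked by how close those vectors are. This is an illustration only, not a description of how Kailua Labs' platform is implemented. The file names, the three-dimensional vectors, and the `search` helper are all hypothetical; in a real system the vectors would come from a multimodal embedding model rather than being written by hand.

```python
import math


def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical, hand-written embeddings (assumption for illustration).
# The three dimensions loosely stand for: air sports, vegetation, office life.
ASSET_INDEX = {
    "skydiving_logo_parachute.mp4": [0.9, 0.1, 0.0],
    "foliage_3d_model.glb": [0.0, 0.95, 0.05],
    "team_meeting.jpg": [0.05, 0.0, 0.9],
}


def search(query_vec, index, top_k=2):
    """Rank assets by similarity to the query vector, best match first."""
    scored = [
        (name, cosine_similarity(query_vec, vec)) for name, vec in index.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Stand-in for the embedding of a plain-language query such as
# "parachute jump with a company logo" (hand-written, not model output).
query_vec = [0.8, 0.2, 0.1]
results = search(query_vec, ASSET_INDEX)

for name, score in results:
    print(f"{name}: {score:.2f}")
```

Notice that no asset carries a tag here. The skydiving clip ranks first simply because its vector points in nearly the same direction as the query's, which is why this style of search does not depend on anyone anticipating the right keywords in advance.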
Multimodal search also unlocks new opportunities
Here’s an example: social media is moving from a social graph to a content graph. Increasingly, what’s being recommended is whatever the algorithm thinks a user will engage with. This is driving creators and companies to market with memes, which are cheap and relatable, solicit instant feedback, and result in up to 60% organic engagement.
If a trend or a meme takes off and you want to play off it using your company’s brand, finding the right asset “at the speed of memes” makes the difference between driving traffic and being leapfrogged by the competition.
If you’re interested in helping your team be more efficient and less frustrated, let's chat!