
How ‘Seeing’ AI Focuses On Large Vision Models

AI is agnostic, thankfully. As software developers create the new breed of Artificial Intelligence (AI) enriched applications that we will use to drive our lives, we can perhaps be thankful that AI holds no grudges, harbors no preferences and is agnostically unfazed about where it is applied, what tasks it is given and who ends up using it.

This reality (albeit a largely virtual one) means that we can apply AI to petrochemical installations in the oil and gas sector, we can use it in financial markets and its use cases also extend to running small businesses that specialize in bespoke cupcake production.

AI works anywhere, but it also works on any thing.

AI for images

Driving AI into the image space is Landing AI, a computer vision cloud platform company that specializes in helping enterprises build what it calls domain-specific Large Vision Models (LVMs).

Just to break that down: the domain-specific element refers to specialist image library collections specific to individual industries, which in this instance includes agriculture, medical devices, food & beverage and manufacturing, plus we might also add ‘infrastructure’ (as in civil engineering city infrastructures) as well. In this context, domain-specific also means LVMs are trained using the enterprise's own private images i.e. many companies have hundreds of thousands, millions or billions of images, most of which differ from the Internet images that other models have been trained on.

Additionally, just as a Large Language Model (LLM) is a collection of text-based intelligence, facts and sentence strings, words and values, a Large Vision Model (LVM) is a collection of images depicting groups of objects and things in various stages of classification, or not, as the case may be.

Landing AI says that its work enables businesses with vast image libraries to bring AI to their proprietary image data, allowing versatile applications within their domain to meet business needs. Using LVMs, enterprises are promised the ability to unlock intelligence from their images at a much faster pace than before while protecting their privacy with domain-specific LVMs.

So can Landing AI provide us with a working example? Let’s take biotech and pharmaceuticals, where a microparticle in a syringe can mean the difference between life and death.

Spot the microparticle

“High volume fluid test machines require dynamic measurement of relative volumes for accurate diagnostics to save millions of lives from early disease detection. Landing AI’s visual inspection workflow empowers teams and inspectors to build reliable AI models that solve problems previously considered impossible to automate for biotech companies,” notes the company, on its web pages.

With the company’s LVM technology, companies can take unlabeled image data and create high-performing LVMs that serve as a foundation to solve a diverse set of computer vision tasks in their specific domains. This will occur much faster than with traditional approaches because companies will save months of work by not having to label vast image libraries. And they'll see improved accuracy and performance in terms of completed computer vision tasks given the intelligence of the LVM.
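Landing AI has not published its training code, but the pattern this paragraph describes (pretrain a backbone on unlabeled domain images, then solve downstream tasks with only a handful of labels) can be sketched in miniature. Everything below is a hypothetical illustration: `lvm_embed` is a toy stand-in for a pretrained domain-specific backbone (a real one would be a large vision model), and the downstream task is a few-shot nearest-centroid defect detector on synthetic "production line" images.

```python
import numpy as np

rng = np.random.default_rng(0)

def lvm_embed(images: np.ndarray) -> np.ndarray:
    """Stand-in for a pretrained domain-specific LVM backbone.

    A real backbone, pretrained on the enterprise's own unlabeled
    images, would produce rich embeddings; here we use simple
    per-image statistics so the sketch runs anywhere.
    """
    flat = images.reshape(len(images), -1)
    return np.stack([flat.mean(axis=1), flat.std(axis=1)], axis=1)

# Synthetic images: normal parts vs. defective parts, where defects
# add bright microparticle-like specks (raising mean and variance).
normal = rng.normal(0.2, 0.05, size=(50, 16, 16))
defect = rng.normal(0.2, 0.05, size=(50, 16, 16))
defect[:, :3, :3] += 1.0  # simulated specks

# Few-shot setup: only 5 labeled examples per class.
train = np.concatenate([normal[:5], defect[:5]])
labels = np.array([0] * 5 + [1] * 5)

emb = lvm_embed(train)
centroids = np.stack([emb[labels == c].mean(axis=0) for c in (0, 1)])

def classify(images: np.ndarray) -> np.ndarray:
    # Assign each image to the nearest class centroid in embedding space.
    e = lvm_embed(images)
    d = np.linalg.norm(e[:, None, :] - centroids[None, :, :], axis=2)
    return d.argmin(axis=1)

test_images = np.concatenate([normal[5:], defect[5:]])
test_labels = np.array([0] * 45 + [1] * 45)
acc = (classify(test_images) == test_labels).mean()
```

Because the backbone already encodes the domain's visual structure, the downstream detector needs only a few labeled examples per class, which is the labeling saving the paragraph describes.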

"The Large Vision Model revolution is following the Large Language Model revolution, but with one key difference - whereas Internet text that LLMs learned from are similar enough to most corporate text for the model to apply, a lot of companies in manufacturing, life sciences, geo-spatial data, agriculture, retail, and other sectors have proprietary images that look nothing like the typical Instagram pictures found online," said Andrew Ng, Landing AI CEO. "That's why developing domain-specific LVMs is key to unlocking the value of the images in these domains."

Histopathology images

Landing AI is building and running domain-specific LVMs for enterprises in scenarios where (as another example) there is a need to analyze such things as production line images for finding defects in manufacturing, or histopathology (the diagnosis and study of diseases of the tissues) images for finding cancer cells in life sciences.

While generic LVMs built on Internet images are one size fits all, the Landing AI LVMs focus on one domain at a time, helping solve proprietary problems facing enterprises. Today enterprises spend a non-trivial amount of effort training individual models for each vision task, even when these tasks belong to the same business domain. With domain-specific LVMs, the goal is for companies to use a limited set of LVMs, one for each business domain, and meet their needs to solve a multitude of vision tasks in each domain.

The company says that through the use of LVMs, companies will more quickly pinpoint solutions for tasks such as object detection, image segmentation, visual prompting, or other AI vision-enabled applications. The addition of LVM capability to Landing AI runs in line with the organization’s work in generative AI. In April, it announced its Visual Prompting capability as part of its LandingLens offering.
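The "one LVM per domain, many tasks" idea in the paragraphs above can be sketched roughly as follows. This is a generic illustration, not Landing AI's method: `frozen_lvm` is a hypothetical stand-in for a shared frozen backbone, and two independent lightweight heads (a classifier and a regressor, each fit by ordinary least squares on synthetic data) represent separate vision tasks in the same business domain.

```python
import numpy as np

rng = np.random.default_rng(1)

def frozen_lvm(images: np.ndarray) -> np.ndarray:
    # Stand-in for a frozen domain-specific backbone shared by all tasks.
    flat = images.reshape(len(images), -1)
    return np.stack([flat.mean(axis=1), flat.std(axis=1), flat.max(axis=1)], axis=1)

# One batch of domain images, embedded once by the shared backbone.
X = rng.normal(size=(100, 8, 8))
emb = frozen_lvm(X)

# Two different tasks, each with its own cheap head on the same embedding:
# Task A: binary classification (synthetic labels from a threshold).
# Task B: regression (synthetic target that is linear in the embedding).
y_cls = (emb[:, 1] > emb[:, 1].mean()).astype(float)
y_reg = emb[:, 0] * 2.0 + 0.1

A = np.c_[emb, np.ones(len(emb))]  # embedding plus bias column
w_cls, *_ = np.linalg.lstsq(A, y_cls, rcond=None)
w_reg, *_ = np.linalg.lstsq(A, y_reg, rcond=None)

cls_acc = ((A @ w_cls > 0.5) == y_cls.astype(bool)).mean()
reg_err = np.abs(A @ w_reg - y_reg).mean()
```

The point of the sketch is the shape of the work: the expensive part (the backbone) is trained once per domain, while each new vision task only adds a small head, rather than a whole new model.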

As a final example here, AI in the form of LVMs has been used in the food industry for a long time. It is now moving out of the processing plant and closer to the field. Vision systems are being used to help farmers optimize yields, minimize the use of chemicals for greater return and sustainability, and take over jobs such as weeding and picking.

How many more large models?

What next? Large Sound Models (LSMs) for audio signal processing? Yes, they already exist. Large Touch Models (LTMs) don’t necessarily exist, but NTT is already working on haptic sensory touch sharing simulation technologies at the NTT Docomo labs in Tokyo. Haptic (i.e. touch-related) information is quantified in terms of human-touch vibrations measured with a device similar to a piezoelectric sensor.

We might have to get ready for Large Smell Models (LSMs) if we can perfect machine olfaction soon. After all that, Large Emotion Models (LEMs) may start tracking our ability to fall in love. Let’s hope that part of life (even with the existence of dating apps) stays mostly organic and natural for now, shall we?
