As the dust settles, focus is shifting away from research that is promising only in potential and towards implementable products that add real value to people; away from the math behind the models and towards the agents that can be built upon them.
True to its name, an AI agent is simply an entity that is artificially intelligent (i.e. good at mimicking human intelligence) and of some, often specialized, use. This can look like voice bots, self-driving systems, editors, recommendation engines, investment advisors, and whatever else you can think of as having an algorithmic job.
To be more useful, these agents often collaborate with other single-purpose AI agents to diversify their skill sets. This complicates the rather unnecessary debate over what is and isn't an AI agent, and hence, as suggested by Andrew Ng, we will refer to this technology as agentic AI whenever the waters get muddy.
The goal of this blog is to look at some of the application areas of agentic AI as well as the underlying skills that enable those applications. Importantly, AI agents need some degree of autonomy, be that in decision-making, action, or information capture, for them to be of practical help. This makes certain pipelines and workflows a better fit than others, which in turn motivates the examples we chose to look at.
Natural Language Processing
Natural language processing (NLP) is the branch of machine learning concerned with understanding and generating human language, producing models like the ChatGPT and Claude series. This is the technology that helps AI read, write, and chat.
Agents that exploit this are more front-facing than most, commonly seeing deployment in customer service and support. Being unexciting, algorithmic, and crucial, customer support fits the bill perfectly.
Agentic NLP is also reasonably good at detecting sentiment in text; whether a user is happy, irritated, impatient, or affirming is now extractable data. Agents are therefore also finding use in research on sentiment across mass media and the internet, and on how it shifts with changing coverage styles, media houses, and so forth.
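For concreteness, here is a minimal sketch of how a support agent might score incoming messages for sentiment. It assumes the Hugging Face transformers library and its default sentiment model; any hosted or fine-tuned classifier could stand in.

```python
# A minimal sketch of sentiment extraction, assuming the Hugging Face
# `transformers` library and its default sentiment-analysis model.
from transformers import pipeline

# Load a generic sentiment-analysis pipeline (the model choice is illustrative).
sentiment = pipeline("sentiment-analysis")

messages = [
    "I've been on hold for forty minutes. This is ridiculous.",
    "Thanks, that fixed my issue right away!",
]

for msg in messages:
    result = sentiment(msg)[0]
    # Each result is a dict like {"label": "NEGATIVE", "score": 0.99}.
    print(f"{result['label']:>8} ({result['score']:.2f})  {msg}")
```

An agent could route high-confidence negative messages to a human representative and let the rest flow through the automated pipeline.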
Deep Learning
Most machine learning seeing the light of day today is deep learning: a subclass designed to loosely mimic the neural linkages of the human brain. In turn, this leads to more human-like downstream behavior, the poster child of which has become pattern recognition.
Expectedly, then, detecting fraudulent behavior by a particular user becomes easier for a bank with these agents. Similar degrees of success are found in personalized recommendations and predictive maintenance. The underlying themes are huge amounts of data and anomaly detection (positive or negative).
These agents excel at finding needles in haystacks, and as long as your problem has this structure, they have you covered.
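As a concrete illustration, the sketch below flags unusual transactions with an isolation forest. It assumes scikit-learn, and the toy features (amount, hour of day, merchant risk score) are purely illustrative.

```python
# A minimal sketch of transaction anomaly detection, assuming scikit-learn;
# the feature set and contamination rate are illustrative, not tuned values.
import numpy as np
from sklearn.ensemble import IsolationForest

# Toy features per transaction: [amount, hour_of_day, merchant_risk_score]
transactions = np.array([
    [25.0, 13, 0.1],
    [40.0, 18, 0.2],
    [32.0, 12, 0.1],
    [18.0, 9, 0.3],
    [9500.0, 3, 0.9],   # an unusually large late-night transaction
])

model = IsolationForest(contamination=0.2, random_state=0)
labels = model.fit_predict(transactions)  # -1 flags an anomaly, 1 is normal

for row, label in zip(transactions, labels):
    flag = "FLAG" if label == -1 else "ok"
    print(flag, row)
```

In production the same pattern applies, just with far more features and data, which is exactly the regime where these agents shine.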
Computer Vision
The visual counterpart of NLP, computer vision is the machine learning of images and video. The natural progression for a lot of NLP technology is giving the agent eyes, as OpenAI did with GPT-4o. The 'brain' stays the same, but now the agent can see more than plain text and reason about what it sees.
So, as with NLP, the use cases that click involve algorithmic, repetitive decision-making that now requires visual input.
Can I discard a product in my manufacturing pipeline by looking at visual damage? Yes, and so can my agentic solution. A particularly impressive agentic feat powered by computer vision is autonomous vehicle navigation. Taking in side-mirror and rear-view visuals, traffic-signal data, and shortest-route suggestions in real time, and converting the decisions made off of them into actions that drive a car around, is the perfect symbol of what agentic AI is. The core guiding principles are still machine learning algorithms, yes, but in addition the program interacts with its environment in a rich, timely, complicated fashion. This is the promise!
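Returning to the inspection example, the decision step might be sketched roughly as below. It assumes the Hugging Face transformers library and a classifier fine-tuned on your own damaged-vs-intact product images; the model path, labels, and image path are hypothetical.

```python
# A minimal sketch of visual quality inspection, assuming the Hugging Face
# `transformers` library and a classifier fine-tuned on damaged-vs-intact
# product images (the model path and labels below are hypothetical).
from transformers import pipeline

inspector = pipeline("image-classification", model="path/to/damage-classifier")

def should_discard(image_path: str, threshold: float = 0.8) -> bool:
    """Discard the product if 'damaged' is the top prediction with high confidence."""
    predictions = inspector(image_path)  # e.g. [{"label": "damaged", "score": 0.93}, ...]
    top = predictions[0]
    return top["label"] == "damaged" and top["score"] >= threshold

if should_discard("line_camera/frame_0042.jpg"):
    print("Reject: visible damage detected")
```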
Robotic Process Automation (RPA)
Much less flashy but much more bang for the buck are agentic solutions that can scrape, validate, format, and maintain data that would otherwise have wasted countless human hours for companies. Drawing on principles from NLP, RPA-centric agents can also be deployed to work with textual data in the background, serving businesses internally.
Claims processing, regulatory compliance, HR onboarding, and the like become end-to-end solvable problems once agentic pipelines are configured around a company's nuances, and they trail right behind customer support in deployment frequency.
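To give a flavor of the validate-and-format step in such a pipeline, here is a minimal sketch in plain Python. The claim fields and rules are illustrative, not a real schema.

```python
# A minimal sketch of the validate-and-format step of an RPA pipeline;
# the field names and rules are illustrative, not a real claims schema.
from datetime import datetime

REQUIRED_FIELDS = {"claim_id", "customer_name", "amount", "date_filed"}

def validate_claim(record: dict) -> list[str]:
    """Return a list of problems found in a raw claim record."""
    problems = [f"missing field: {f}" for f in REQUIRED_FIELDS - record.keys()]
    if "amount" in record:
        try:
            if float(record["amount"]) <= 0:
                problems.append("amount must be positive")
        except ValueError:
            problems.append("amount is not a number")
    if "date_filed" in record:
        try:
            datetime.strptime(record["date_filed"], "%Y-%m-%d")
        except ValueError:
            problems.append("date_filed must be YYYY-MM-DD")
    return problems

raw = {"claim_id": "C-1009", "customer_name": "A. Rao",
       "amount": "-50", "date_filed": "2024/01/05"}
print(validate_claim(raw))  # ['amount must be positive', 'date_filed must be YYYY-MM-DD']
```

The agentic part is wiring steps like this into a loop that pulls records from email or document scans, fixes what it can, and escalates only the genuinely ambiguous cases to a human.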
Speech
Read our detailed coverage on voice-to-voice agents here. For a brief overview, text-based and textless NLP solutions are being implemented in multimodal agents, adding a third modality alongside text and video.
Audio recognition and synthesis are both skills that already have a rudimentary footing in agents and are being polished with continuing research.
We again circle back to customer support, but now the agent is able to pick up the phone. Reading tones, moods, and accents, understanding queries, looking up company policies and guidelines, and troubleshooting, all in a human voice, makes a world of difference to customers who don't have the time to text for support.
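The listening half of such an agent can be sketched roughly as below, assuming the Hugging Face transformers library with the openai/whisper-small model for transcription and a generic sentiment classifier; the audio file path is hypothetical.

```python
# A minimal sketch of the listening half of a voice agent, assuming the
# Hugging Face `transformers` library; the audio file path is hypothetical.
from transformers import pipeline

transcriber = pipeline("automatic-speech-recognition", model="openai/whisper-small")
sentiment = pipeline("sentiment-analysis")

# Transcribe the caller's audio, then gauge their mood before routing the query.
transcript = transcriber("calls/incoming_0042.wav")["text"]
mood = sentiment(transcript)[0]

print("Caller said:", transcript)
print("Detected mood:", mood["label"], round(mood["score"], 2))
```

A full voice-to-voice agent would close the loop with a speech-synthesis step, reading the generated response back to the caller.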
Simpler speech agents already displayed the ability to take instructions and manage music, lights, and temperature during the Internet of Things (IoT) boom. These will only gain skills with time, possibly transforming into full-fledged personal assistants.
Explainable AI (XAI)
Still more a research interest than a business one, XAI aims to make AI decision-making processes more transparent and understandable to humans.
Rather than being a standalone focus area, XAI can augment the previously mentioned technologies so that they become usable in more sensitive, high-stakes scenarios.
For instance, a clinical agentic solution could use RPA and speech to collect patient data, including medical history and lab results; use NLP and CV to process these along with image scans; and use XAI to explain the rationale behind each suggestion, referencing specific data points and medical guidelines. This fosters trust and transparency where their absence would have prevented the use of AI solutions at all.
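As a rough sketch of what "explaining the rationale" can look like in code, the example below attaches per-feature contributions to a toy risk model using the shap library; the clinical features and data are entirely illustrative.

```python
# A minimal sketch of attaching explanations to a model's prediction, assuming
# the `shap` and scikit-learn libraries; the clinical features are illustrative.
import numpy as np
import shap
from sklearn.ensemble import RandomForestRegressor

feature_names = ["age", "systolic_bp", "cholesterol", "glucose"]
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 4))
y = X[:, 1] + 0.5 * X[:, 3] + rng.normal(scale=0.1, size=200)  # toy risk score

model = RandomForestRegressor(random_state=0).fit(X, y)
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X[:1])  # per-feature contributions, one patient

# Report which features pushed this patient's predicted risk up or down.
for name, value in zip(feature_names, shap_values[0]):
    print(f"{name}: {value:+.3f}")
```

An agent can translate these signed contributions into plain-language statements ("elevated systolic blood pressure raised the risk estimate the most"), which is exactly the kind of transparency a clinician needs before acting on a suggestion.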
Taking models to agentic solutions requires work - both in identifying where to apply them and determining the extent of their implementation. The goal has always been to emulate human performance as closely as possible. In addition to cognitive abilities, this has also meant replicating sensory input pathways and increasing autonomy - all with the aim of creating more capable and helpful agents.
Moving these agents from proofs of concept to deployment is all that stands between your enterprise and the benefits of agentic AI. Reach out to us at hello@nurix.ai to take the step forward!