Speed Demon: NVIDIA Blackwell Takes Pole Position in Latest MLPerf Inference Results
April 2, 2025 at 3:00 pm

In the latest MLPerf Inference V5.0 benchmarks, which reflect some of the most challenging inference scenarios, the NVIDIA Blackwell platform set records — and marked NVIDIA’s first MLPerf submission using the NVIDIA GB200 NVL72 system, a rack-scale solution designed for AI reasoning.

Delivering on the promise of cutting-edge AI takes a new kind of compute infrastructure, called AI factories. Unlike traditional data centers, AI factories do more than store and process data — they manufacture intelligence at scale by transforming raw data into real-time insights. The goal for AI factories is simple: deliver accurate answers to queries quickly, at the lowest cost and to as many users as possible.

The complexity of pulling this off is significant and takes place behind the scenes. As AI models grow to billions and trillions of parameters to deliver smarter replies, the compute required to generate each token increases. This requirement reduces the number of tokens that an AI factory can generate and increases cost per token. Keeping inference throughput high and cost per token low requires rapid innovation across every layer of the technology stack, spanning silicon, network systems and software.
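The link between throughput and cost per token can be made concrete with a little arithmetic. The sketch below is illustrative only: the function name, the hourly cost and the throughput figures are hypothetical and do not come from the MLPerf results.

```python
def cost_per_million_tokens(hourly_cost_usd: float, tokens_per_second: float) -> float:
    """Return the USD cost of generating one million tokens at a steady throughput.

    Both inputs are hypothetical illustration values, not published figures.
    """
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    tokens_per_hour = tokens_per_second * 3600
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# A larger model that halves throughput on the same hardware doubles cost per token.
baseline = cost_per_million_tokens(hourly_cost_usd=36.0, tokens_per_second=10_000)
larger_model = cost_per_million_tokens(hourly_cost_usd=36.0, tokens_per_second=5_000)
print(f"baseline: ${baseline:.2f}, larger model: ${larger_model:.2f} per million tokens")
```

With the infrastructure cost held fixed, any gain in tokens per second lowers cost per token by the same factor, which is why throughput is the headline figure in these benchmarks.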

The latest updates to MLPerf Inference, a peer-reviewed industry benchmark of inference performance, include the addition of Llama 3.1 405B, one of the largest and most challenging-to-run open-weight models. The new Llama 2 70B Interactive benchmark features much stricter latency requirements compared with the original Llama 2 70B benchmark, better reflecting the constraints of production deployments in delivering the best possible user experiences.

In addition to the Blackwell platform, the NVIDIA Hopper platform demonstrated exceptional performance across the board, with performance increasing significantly over the last year on Llama 2 70B thanks to full-stack optimizations.

NVIDIA Blackwell Sets New Records

The GB200 NVL72 system — connecting 72 NVIDIA Blackwell GPUs to act as a single, massive GPU — delivered up to 30x higher throughput on the Llama 3.1 405B benchmark over the NVIDIA H200 NVL8 submission this round. This feat was achieved through more than triple the performance per GPU and a 9x larger NVIDIA NVLink interconnect domain.
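The two factors cited above can be multiplied as a rough sanity check. This sketch assumes throughput scales linearly with the number of GPUs in the NVLink domain, which the article does not state. The function names are hypothetical.

```python
def implied_system_speedup(per_gpu_speedup: float, gpus_new: int, gpus_old: int) -> float:
    """System-level speedup, assuming linear scaling with GPU count (an assumption)."""
    if gpus_new <= 0 or gpus_old <= 0:
        raise ValueError("GPU counts must be positive")
    return per_gpu_speedup * (gpus_new / gpus_old)


def implied_per_gpu_speedup(system_speedup: float, gpus_new: int, gpus_old: int) -> float:
    """Per-GPU speedup implied by a system-level speedup under the same assumption."""
    if gpus_new <= 0 or gpus_old <= 0:
        raise ValueError("GPU counts must be positive")
    return system_speedup / (gpus_new / gpus_old)


# 72 GPUs in GB200 NVL72 versus 8 in H200 NVL8 is the 9x larger domain.
print(implied_system_speedup(3.0, 72, 8))
print(round(implied_per_gpu_speedup(30.0, 72, 8), 2))
```

Exactly 3x per GPU across a 9x larger domain would give 27x, so the "up to 30x" figure is consistent with a per-GPU gain of roughly 3.3x under this simplified model.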

While many companies run MLPerf benchmarks on their hardware to gauge performance, only NVIDIA and its partners submitted and published results on the Llama 3.1 405B benchmark.

Production inference deployments often have latency constraints on two key metrics. The first is time to first token (TTFT), or how long it takes for a user to begin seeing a response to a query given to a large language model. The second is time per output token (TPOT), or how quickly tokens are delivered to the user.
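Both metrics can be derived from the timestamps at which tokens arrive. The sketch below is a minimal illustration, assuming TPOT is measured as the mean gap between successive output tokens after the first. The function name and the sample timestamps are hypothetical, and MLPerf's exact measurement rules may differ.

```python
from typing import Sequence, Tuple


def ttft_and_tpot(request_time: float, token_times: Sequence[float]) -> Tuple[float, float]:
    """Compute (TTFT, TPOT) in seconds from a request time and token arrival times.

    TTFT is the delay until the first token arrives.
    TPOT is taken here as the mean gap between later tokens (an assumption).
    """
    if not token_times:
        raise ValueError("at least one token timestamp is required")
    ttft = token_times[0] - request_time
    if len(token_times) == 1:
        return ttft, 0.0  # no inter-token gaps to average
    tpot = (token_times[-1] - token_times[0]) / (len(token_times) - 1)
    return ttft, tpot


# Request sent at t=0.0 s; four tokens arrive at the times below.
ttft, tpot = ttft_and_tpot(0.0, [0.5, 0.6, 0.7, 0.8])
print(f"TTFT={ttft:.3f}s TPOT={tpot:.3f}s")
```

A latency-constrained benchmark would then count only the throughput achieved while both values stay under their limits.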

The new Llama 2 70B Interactive benchmark has a 5x shorter TPOT and 4.4x lower TTFT — modeling a more responsive user experience. On this test, NVIDIA’s submission using an NVIDIA DGX B200 system with eight Blackwell GPUs tripled performance over using eight NVIDIA H200 GPUs, setting a high bar for this more challenging version of the Llama 2 70B benchmark.

Combining the Blackwell architecture and its optimized software stack delivers new levels of inference performance, paving the way for AI factories to deliver higher intelligence, increased throughput and faster token rates.

NVIDIA Hopper AI Factory Value Continues Increasing

The NVIDIA Hopper architecture, introduced in 2022, powers many of today’s AI inference factories, and continues to power model training. Through ongoing software optimization, NVIDIA increases the throughput of Hopper-based AI factories, leading to greater value.

On the Llama 2 70B benchmark, first introduced a year ago in MLPerf Inference v4.0, H100 GPU throughput has increased by 1.5x. The H200 GPU, based on the same Hopper GPU architecture with larger and faster GPU memory, extends that increase to 1.6x.

Hopper also ran every benchmark, including the newly added Llama 3.1 405B, Llama 2 70B Interactive and graph neural network tests. This versatility means Hopper can run a wide range of workloads and keep pace as models and usage scenarios grow more challenging.

It Takes an Ecosystem

This MLPerf round, 15 partners submitted stellar results on the NVIDIA platform, including ASUS, Cisco, CoreWeave, Dell Technologies, Fujitsu, Giga Computing, Google Cloud, Hewlett Packard Enterprise, Lambda, Lenovo, Oracle Cloud Infrastructure, Quanta Cloud Technology, Supermicro, Sustainable Metal Cloud and VMware.

The breadth of submissions reflects the reach of the NVIDIA platform, which is available across all cloud service providers and server makers worldwide.

MLCommons’ work to continuously evolve the MLPerf Inference benchmark suite to keep pace with the latest AI developments and provide the ecosystem with rigorous, peer-reviewed performance data is vital to helping IT decision makers select optimal AI infrastructure.

Learn more about MLPerf

Images and video taken at an Equinix data center in Silicon Valley.

 

NVIDIA’s Jacob Liberman on Bringing Agentic AI to Enterprises
April 2, 2025 at 4:00 pm

AI is rapidly transforming how organizations solve complex challenges.

The early stages of enterprise AI adoption focused on using large language models to create chatbots. Now, enterprises are using agentic AI to create intelligent multi-agent systems that reason, act and execute complex tasks with a degree of autonomy.

Jacob Liberman, director of product management at NVIDIA, joined the NVIDIA AI Podcast to explain how agentic AI bridges the gap between powerful AI models and practical enterprise applications.

Enterprises are deploying AI agents to free human workers from time-consuming and error-prone tasks. This allows people to spend more time on high-value work that requires creativity and strategic thinking.

Liberman anticipates it won’t be long before teams of AI agents and human workers collaborate to tackle complex tasks requiring reasoning, intuition and judgement. For example, enterprise software developers will work with AI agents to develop more efficient algorithms. And medical researchers will collaborate with AI agents to design and test new drugs.

NVIDIA AI Blueprints help enterprises build their own AI agents, including for many of the use cases listed above.

“Blueprints are reference architectures implemented in code that show you how to take NVIDIA software and apply it to some productive task in an enterprise to solve a real business problem,” Liberman said.

The blueprints are entirely open source. A developer or service provider can deploy a blueprint directly, or customize it by integrating their own technology.

Liberman highlighted the versatility of the AI Blueprint for customer service, for example, which features digital humans.

“The digital human can be made into a bedside digital nurse, a sportscaster or a bank teller with just some verticalization,” he said.

Other popular NVIDIA Blueprints include a video search and summarization agent, an enterprise multimodal PDF chatbot and a generative virtual screening pipeline for drug discovery.

Time Stamps: 

1:14 – What is an AI agent?

17:25 – How software developers are early adopters of agentic AI.

19:50 – Explanation of test-time compute and reasoning models.

23:05 – Using AI agents in cybersecurity and risk management applications.

You Might Also Like…

Imbue CEO Kanjun Qiu on Transforming AI Agents Into Personal Collaborators

Kanjun Qiu, CEO of Imbue, discusses the emerging era of personal AI agents, drawing a parallel to the PC revolution and explaining how modern AI systems are evolving to enhance user capabilities through collaboration.

Telenor’s Kaaren Hilsen on Launching Norway’s First AI Factory

Kaaren Hilsen, chief innovation officer and head of the AI factory at Telenor, highlights Norway’s first AI factory, which securely processes sensitive data within the country while promoting data sovereignty and environmental sustainability through green computing initiatives, including a renewable energy-powered data center in Oslo.

Firsthand’s Jon Heller Shares How AI Agents Enhance Consumer Journeys in Retail 

Jon Heller of Firsthand explains how the company’s AI Brand Agents are boosting retail and digital marketing by personalizing customer experiences and converting marketing interactions into valuable research data.

 

No Foolin’: GeForce NOW Gets 21 Games in April
April 3, 2025 at 1:00 pm

GeForce NOW isn’t fooling around.

This month, 21 games are joining the cloud gaming library of over 2,000 titles. Whether chasing epic adventures, testing skills in competitive battles or exploring immersive worlds, members can dive into April’s arrivals, which are truly no joke.

Get ready to stream, play and conquer the eight games available this week. Members can also get ahead of the pack with advanced access to South of Midnight, streaming soon before launch.

Unleash the Magic

South of Midnight, an action-adventure game developed by Compulsion Games, offers advanced access for gamers who purchase its Premium Edition. Dive into the title’s hauntingly beautiful world before launch, exploring its rich Southern gothic setting and unique magical combat system while balancing magic with melee attacks.

South of Midnight Advanced Access on GeForce NOW
Step into the shadows.

Set in a mystical version of the American South, the game combines elements of magic, mystery and adventure, weaving a compelling story that draws players in. The endless opportunities for exploration and combat, along with deep lore and engaging characters, make the game a must-play for fans of the action-adventure genre.

With its blend of dark fantasy and historical influences, South of Midnight is poised to deliver a unique gaming experience that will leave players spellbound.

GeForce NOW members can be among the first to get advanced access to the game without the hassle of downloads or updates. With an Ultimate or Performance membership, experience the game’s haunting landscapes and cryptid encounters with the highest frame rates and lowest latency — no need for the latest hardware.

April Is Calling

Call of Duty Warzone Season 3 on GeForce NOW
Verdansk is back! Catch it in the cloud.

Verdansk, the original and iconic map from Call of Duty: Warzone, is making its highly anticipated return in the game’s third season and is available to stream on GeForce NOW. Known for its sprawling urban areas, rugged wilderness and points of interest like Dam and Superstore, Verdansk offers a dynamic battleground for intense combat. The map has been rebuilt from the ground up with key enhancements across audio, visuals and gameplay, getting back to basics and delivering nostalgia for fans.

Look for the following games available to stream in the cloud this week:

Here’s what to expect for April: 

  • South of Midnight (New release on Steam and Xbox, available on PC Game Pass, April 8)
  • Commandos Origins (New release on Steam and Xbox, available on PC Game Pass, April 9)
  • The Talos Principle: Reawakened (New release on Steam, April 10)
  • Night Is Coming (New release on Steam, April 14)
  • Mandragora: Whispers of the Witch Tree (New release on Steam, April 17)
  • Sunderfolk (New release on Steam, April 23)
  • Clair Obscur: Expedition 33 (New release on Steam and Xbox, available on PC Game Pass, April 24)
  • Tempest Rising (New release on Steam, April 24)
  • Aimlabs (Steam)
  • Backrooms: Escape Together (Steam)
  • Blood Strike (Steam) 
  • ContractVille (Steam)
  • EXFIL (Steam)

March Madness

In addition to the 14 games announced last month, 26 more joined the GeForce NOW library:

What are you planning to play this weekend? Let us know on X or in the comments below.

 

Nintendo Switch 2 Leveled Up With NVIDIA AI-Powered DLSS and 4K Gaming
April 3, 2025 at 1:00 pm

The Nintendo Switch 2, unveiled April 2, takes performance to the next level, powered by a custom NVIDIA processor featuring an NVIDIA GPU with dedicated RT Cores and Tensor Cores for stunning visuals and AI-driven enhancements.

With 1,000 engineer-years of effort across every element — from system and chip design to a custom GPU, application programming interfaces (APIs) and world-class development tools — the Nintendo Switch 2 brings major upgrades.

The new console enables up to 4K gaming in TV mode and up to 120 frames per second at 1080p in handheld mode. Nintendo Switch 2 also supports high dynamic range and AI upscaling to sharpen visuals and smooth gameplay.

AI and Ray Tracing for Next-Level Visuals

The new RT Cores bring real-time ray tracing, delivering lifelike lighting, reflections and shadows for more immersive worlds.

Tensor Cores power AI-driven features like Deep Learning Super Sampling (DLSS), boosting resolution for sharper details without sacrificing image quality.

Tensor Cores also enable AI-powered face tracking and background removal in video chat use cases, enhancing social gaming and streaming.

With millions of players worldwide, the Nintendo Switch has become a gaming powerhouse and home to Nintendo’s storied franchises. Its hybrid design redefined console gaming, bridging TV and handheld play.

More Power, Smoother Gameplay

With 10x the graphics performance of the Nintendo Switch, the Nintendo Switch 2 delivers smoother gameplay and sharper visuals.

  • Tensor Cores boost AI-powered graphics while keeping power consumption efficient.
  • RT Cores enhance in-game realism with dynamic lighting and natural reflections.
  • Variable refresh rate via NVIDIA G-SYNC in handheld mode ensures ultra-smooth, tear-free gameplay.

Tools for Developers, Upgrades for Players

Developers get improved game engines, better physics and optimized APIs for faster, more efficient game creation.

Powered by NVIDIA technologies, Nintendo Switch 2 delivers for both players and developers.

 

NVIDIA Showcases Real-Time AI and Intelligent Media Workflows at NAB
April 3, 2025 at 3:00 pm

Real-time AI is unlocking new possibilities in media and entertainment, improving viewer engagement and advancing intelligent content creation. 

At NAB Show, a premier conference for media and entertainment running April 5-9 in Las Vegas, NVIDIA will showcase how emerging AI tools and the technologies underpinning them help streamline workflows for streamers, content creators, sports leagues and broadcasters.  

Attendees can experience the power of the NVIDIA Blackwell platform, which serves as the foundation of NVIDIA Media2 — a collection of NVIDIA technologies including NVIDIA NIM microservices and NVIDIA AI Blueprints for live video analysis, accelerated computing platforms and generative AI software.   

Attendees can also see NVIDIA Holoscan for Media — an advanced real-time AI platform designed for live media workflows and applications — in action at the Dell booth, as well as experience the NVIDIA AI Blueprint for video search and summarization, which makes it easy to build and customize video analytics AI agents.  

NVIDIA will also present in these sessions: 

Driving Innovation With Partners  

Partners across the industry are showcasing innovative solutions using NVIDIA technologies to accelerate live media. 

Amazon Web Services (booth W1701) will collaborate with NVIDIA to showcase an esport racing challenge through a live cloud production. The professional-grade racing simulator allows users to analyze their performance through cutting-edge AI-powered insights and step into the spotlight for their own post-race interview. Other demos will offer a peek into the future of live cloud production and generative AI in sports broadcasting. 

Beamr (booth SL1730MR) will demonstrate how it’s driving AV1 adoption with GPU-accelerated video processing. Beamr’s technology, powered by the NVIDIA NVENC encoder, enables cost-efficient, high-quality and scalable AV1 transformation. 

Dell (booth SL4616) is collaborating with a wide range of partners to highlight their latest innovations in the media industry. Autodesk will feature its Flame visual effects software for AI-driven compositing; Avid will demonstrate real-time editing and AI metadata tagging on Dell Pro Max high-performance PCs; and Boris FX and RE:Vision Effects will showcase their motion-tracking, slow-motion interpolation and object-removal technologies — all running on NVIDIA accelerated computing. In addition, Speed Read AI will showcase the use of NVIDIA RTX-powered workstations to analyze scripts in seconds, while Arcitecta and Elements will demonstrate high-speed media collaboration and post-production workflows on Dell PowerScale storage.  

HP (booth SL3723) will showcase its desktop and mobile workstation portfolio with NVIDIA RTX PRO Blackwell GPUs, delivering cutting-edge AI performance in a variety of use cases. Attendees can also find HP’s newly announced AI solutions, the HP ZGX Nano AI Station G1n and HP ZGX Fury AI Station G1n, developed in collaboration with NVIDIA.  

Qvest (booth W2055) will spotlight two new AI solutions that help clients increase audience engagement, simplify insight gathering and streamline workflows. The Agentic Live Multi-Camera Video Event Extractor identifies, detects and extracts near-real-time events into structured outputs in an easily configurable, natural language, no-code interface, and the No-Code Media-Centric AI Agent Builder extracts meaningful structured data from unstructured media formats including video, images and complex documents. Both use NVIDIA NIM microservices, NVIDIA NeMo, NVIDIA Holoscan for Media, the NVIDIA AI Blueprint for video search and summarization and more. 

Monks (booth W2530) will announce its complete suite of products and services for the media and entertainment industry, designed to drive innovation, monetization and efficiency. Monks uses tools under NVIDIA Media2, such as NVIDIA NIM microservices and Holoscan for Media, to enable real-time audience feedback, AI-powered selective encoding and contextual content analysis for large archives. The company will also launch a new suite of vision language model service offerings with its strategic partner TwelveLabs.

Supermicro (booth W3713) will demonstrate the ease of setting up and running a complete AI video pipeline with WAN 2.1 and Adobe Premiere Pro, all running on the new high-performance Supermicro AS -531AW-TC workstation with an NVIDIA RTX PRO 6000 Blackwell Workstation Edition GPU. With RAVEL Orchestrate handling workstation and AI cluster orchestration, everything can run smoothly — from setup and deployment to user access and workload management.  

Speechmatics (booth W2317) will demonstrate its speech-to-text technology, which taps into NVIDIA accelerated computing to deliver highly accurate, real-time transcription across multiple languages and use cases, from media production to broadcast captioning. 

Telestream (booth W1501) will showcase its waveform monitoring solution, which seeks to bridge the gap for cloud-native workflows with a microservices architecture that taps into NVIDIA Holoscan for Media. In collaboration with NVIDIA, Telestream will demonstrate the ability to introduce cloud-native waveform monitoring to replicate broadcast center and master control room capabilities for engineering and creative teams. 

TwelveLabs (booth W3921) will showcase its newest models, which are being trained in part on NVIDIA DGX Cloud, to bring state-of-the-art video understanding to the world’s largest sports teams, clubs and leagues. The company is currently developing models based on NVIDIA NIM microservices to bring media and entertainment customers highly efficient inference and easy integration with leading software frameworks and agentic applications. 

VAST Data (booth SL9213) will spotlight the VAST InsightEngine, a solution that securely ingests, processes and retrieves all enterprise data in real time, in a demo powered by the NVIDIA AI Enterprise software platform. Developed in collaboration with the National Hockey League, the demo showcases instant access to an archive of over 550,000 hours of hockey game footage. The work is set to redefine sponsorship analytics and empower video producers to instantly search, edit and deliver dynamic broadcast clips, fueling hyper-personalized fan experiences.

Vizrt (booth W3031) will present its solution portfolio, which when matched with NVIDIA accelerated computing and NVIDIA Maxine technology, simplifies complex processes to support the immersive talent reflections, shadow casting and 3D pose tracking of Reality Connect, in addition to Particle Effects, Talent Gesture Control, XR Draw and the AI Gaze Correction feature available in the TriCaster Vizion. 

V-Nova (booths W1252 and W1454) will spotlight its 6DoF virtual-reality experiences with new immersive content (Sharkarma and Weightless, in booth W1252) and AI-accelerated optimization in booth W1454, demonstrating how NVIDIA NVENC and NVIDIA GPUs unlock incredible video quality, efficiency and performance for critical video, AI and VR streaming cloud applications.

Join NVIDIA at NAB Show 2025

 

From Browsing to Buying: How AI Agents Enhance Online Shopping
April 3, 2025 at 3:00 pm

Editor’s note: This post is part of the AI On blog series, which explores the latest techniques and real-world applications of agentic AI, chatbots and copilots. The series also highlights the NVIDIA software and hardware powering advanced AI agents, which form the foundation of AI query engines that gather insights and perform tasks to transform everyday experiences and reshape industries.

Online shopping puts a world of choices at people’s fingertips, making it convenient for them to purchase and receive orders — all from the comfort of their homes.

But too many choices can turn experiences from exciting to exhausting, leaving shoppers struggling to cut through the noise and find exactly what they need.

By tapping into AI agents, retailers can deepen their customer engagement, enhance their offerings and maintain a competitive edge in a rapidly shifting digital marketplace.

Every digital interaction results in new data being captured. This valuable customer data can be used to fuel generative AI and agentic AI tools that provide personalized recommendations and boost online sales. According to NVIDIA’s latest State of AI in Retail and Consumer-Packaged Goods report, 64% of respondents investing in AI for digital retail are prioritizing hyper-personalized recommendations.

Smart, Seamless and Personalized: The Future of Customer Experience

AI agents offer a range of benefits that significantly improve the retail customer experience, including:

  • Personalized Experiences: Using customer insights and product information, these digital assistants can deliver the expertise of a company’s best sales associate, stylist or designer — providing tailored product recommendations, enhancing decision-making, and boosting conversion rates and customer satisfaction.
  • Product Knowledge: AI agents enrich product catalogs with explanatory titles, enhanced descriptions and detailed attributes like size, warranty, sustainability and lifestyle uses. This makes products more discoverable and recommendations more personalized and informative, which increases consumer confidence.
  • Omnichannel Support: AI provides seamless integration of online and offline experiences, facilitating smooth transitions between digital and physical retail environments.
  • Virtual Try-On Capabilities: Customers can easily visualize products on themselves or in their homes in real time, helping improve product expectations and potentially lowering return rates.
  • 24/7 Availability: AI agents offer around-the-clock customer support across time zones and languages.

Real-World Applications of AI Agents in Retail

AI is redefining digital commerce, empowering retailers to deliver richer, more intuitive shopping experiences. From enhancing product catalogs with accurate, high-quality data to improving search relevance and offering personalized shopping assistance, AI agents are transforming how customers discover, engage with and purchase products online.

AI agents for catalog enrichment automatically enhance product information with consumer-focused attributes. These attributes can range from basic details like size, color and material to technical details such as warranty information and compatibility.

They also include contextual attributes, like sustainability, and lifestyle attributes, such as “for hiking.” AI agents can also integrate service attributes — including delivery times and return policies — making items more discoverable and relevant to customers while addressing common concerns to improve purchase results.

Amazon faced the challenge of ensuring complete and accurate product information for shoppers while reducing the effort and time required for sellers to create product listings. To address this, the company implemented generative AI using the NVIDIA TensorRT-LLM library. This technology allows sellers to input a product description or URL, and the system automatically generates a complete, enriched listing. The work helps sellers reach more customers and expand their businesses effectively while making the catalog more responsive and energy efficient.

AI agents for search tap into enriched data to deliver more accurate and contextually relevant search results. By employing semantic understanding and personalization, these agents better match customer queries with the right products, making the overall search experience faster and more intuitive.

Amazon Music has optimized its search capabilities using the Amazon SageMaker platform with NVIDIA Triton Inference Server and the NVIDIA TensorRT software development kit. This includes implementing vector search and transformer-based spell-correction models.

As a result, when users search for music — even with typos or vague terms — they can quickly find what they’re looking for. These optimizations, which make the search bar more effective and user friendly, have led to faster search times and 73% lower costs for Amazon Music.
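To give a feel for why vector search tolerates typos, the toy Python sketch below ranks titles by cosine similarity between character-trigram count vectors. This is not Amazon Music's implementation: the trigram "embedding" is a simple stand-in for a learned transformer model, and the function names and sample titles are assumptions made for illustration.

```python
import math
from collections import Counter


def trigram_vector(text: str) -> Counter:
    """Toy embedding: counts of character trigrams (stand-in for a learned model)."""
    padded = f"  {text.lower()} "
    return Counter(padded[i:i + 3] for i in range(len(padded) - 2))


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def search(query: str, titles: list, top_k: int = 1) -> list:
    """Return the top_k titles most similar to the query."""
    q = trigram_vector(query)
    ranked = sorted(titles, key=lambda t: cosine(q, trigram_vector(t)), reverse=True)
    return ranked[:top_k]


# Illustrative data only.
TRACKS = ["Bohemian Rhapsody", "Purple Rain", "Smells Like Teen Spirit"]

print(search("bohemian rapsody", TRACKS))  # ['Bohemian Rhapsody']
```

Because the misspelled query still shares most of its trigrams with the intended title, it lands closest to that title in vector space. A production system would typically replace the trigram counts with dense model embeddings and the linear scan with an approximate nearest-neighbor index.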

AI agents for shopping assistants build on the enriched catalog and improved search functionality. They offer personalized recommendations and answer queries in a detailed, relevant, conversational manner, guiding shoppers through their buying journeys with a comprehensive understanding of products and user intent.

SoftServe, a leading IT advisor, has launched the SoftServe Gen AI Shopping Assistant, developed using the NVIDIA AI Blueprint for retail shopping assistants. SoftServe’s shopping assistant offers seamless and engaging shopping experiences by helping customers discover products and access detailed product information quickly and efficiently. One of its standout features is the virtual try-on capability, which allows customers to visualize how clothing and accessories look on them in real time.

Defining the Essential Traits of a Powerful AI Shopping Agent

Highly skilled AI shopping assistants are designed to be multimodal, understanding text- and image-based prompts, voice and more through large language models (LLMs) and vision language models. These AI agents can search for multiple items simultaneously, complete complicated tasks, such as creating a travel wardrobe, and answer contextual questions, like whether a product is waterproof or requires dry cleaning.

This high level of sophistication offers experiences akin to engaging with a company’s best sales associate, delivering information to customers in a natural, intuitive way.

Diagram showing NVIDIA technologies used to build agentic AI applications, such as NVIDIA AI Blueprints (top), NVIDIA NeMo (middle) and NVIDIA NIM microservices (bottom).
With software building blocks, developers can design an AI agent with various features.

The building blocks of a powerful retail shopping agent include:

  • Multimodal and Multi-Query Capabilities: These agents can process and respond to queries that combine text and images, making search processes more versatile and user friendly. They can also easily be extended to support other modalities such as voice.
  • Integration With LLMs: Advanced LLMs, such as the NVIDIA Llama Nemotron family, bring reasoning capabilities to AI shopping assistants, enabling them to engage in natural, humanlike interactions. NVIDIA NIM microservices provide industry-standard application programming interfaces for simple integration into AI applications, development frameworks and workflows.
  • Management of Structured and Unstructured Data: NVIDIA NeMo Retriever microservices provide the ability to ingest, embed and understand retailers’ suites of relevant data sources, such as customer preferences and purchases, product catalog text and image data, and more, helping ensure AI agent responses are relevant, accurate and context-aware.
  • Guardrails for Brand-Safe, On-Topic Conversations: NVIDIA NeMo Guardrails help keep conversations with the shopping assistant safe and on topic, protecting brand values and bolstering customer trust.
  • State-of-the-Art Simulation Tools: The NVIDIA Omniverse platform and partner simulation technologies can help visualize products in physically accurate spaces. For example, customers looking to buy a couch could preview how the furniture would look in their own living room.

By using these key technologies, retailers can design AI shopping agents that exceed customer expectations, driving higher satisfaction and improved operational efficiency.

Retail organizations that harness AI agents are poised to experience evolving capabilities, such as enhanced predictive analytics for further personalized recommendations.

Integrating AI with augmented- and virtual-reality technologies is also expected to create more engaging shopping environments, delivering a future where shopping experiences are more immersive, convenient and customer-focused than ever.

Learn more about the AI Blueprint for retail shopping assistants.


National Robotics Week — Latest Physical AI Research, Breakthroughs and Resources
April 6, 2025 at 4:00 pm


Check back here throughout the week to learn the latest on physical AI, which enables machines to perceive, plan and act with greater autonomy and intelligence in real-world environments.

This National Robotics Week, running through April 12, NVIDIA is highlighting the pioneering technologies that are shaping the future of intelligent machines and driving progress across manufacturing, healthcare, logistics and more.

Advancements in robotics simulation and robot learning are driving this fundamental shift in the industry. Plus, the emergence of world foundation models is accelerating the evolution of AI-enabled robots capable of adapting to dynamic and complex scenarios.

For example, by providing robot foundation models like NVIDIA GR00T N1, frameworks such as NVIDIA Isaac Sim and Isaac Lab for robot simulation and training, and synthetic data generation pipelines to help train robots for diverse tasks, the NVIDIA Isaac and GR00T platforms are empowering researchers and developers to push the boundaries of robotics.

Teaching Robots to Think: Nicklas Hansen’s AI Breakthroughs

What does it take to teach robots complex decision-making in the real world? For Nicklas Hansen, a doctoral candidate at UC San Diego and an NVIDIA Graduate Research Fellow, the answer lies in scalable, robust machine learning algorithms.

With experience from the University of California, Berkeley, Meta AI (FAIR) and the Technical University of Denmark, Hansen is pushing the boundaries of how robots perceive, plan and act in dynamic environments. Their research sits at the intersection of robotics, reinforcement learning and computer vision — bridging the gap between simulation and real-world deployment.

Nicklas Hansen, a doctoral candidate at UC San Diego and an NVIDIA Graduate Research Fellow.

Hansen’s recent work tackles one of robotics’ toughest challenges: long-horizon manipulation. Their paper, Multi-Stage Manipulation With Demonstration-Augmented Reward, Policy and World Model Learning, introduces a framework that enhances data efficiency in sparse-reward environments by using multistage task structures.

Left: A simulated Franka robot solves a peg insertion manipulation task. Right: Hansen’s method, DEMO3, infers task progress directly from raw visual observations.

Another key project of Hansen’s, Hierarchical World Models as Visual Whole-Body Humanoid Controllers, advances control strategies for humanoid robots, enabling more adaptive and humanlike movements.

Beyond their own research, Hansen advocates for making AI-driven robotics more accessible.

“My advice to anyone looking to get started with AI for robotics is to simply play around with the many open-source tools available and gradually start contributing to projects that align with your goals and interests,” they said. “With the availability of free simulation tools like MuJoCo, NVIDIA Isaac Lab and ManiSkill, you can make a profound impact on the field without owning a real robot.”

Hansen is the lead author of TD-MPC2, a model-based reinforcement learning algorithm capable of learning a variety of control tasks without any domain knowledge. The algorithm is open source and can be run on a single consumer-grade GPU.

Learn more about Hansen and other NVIDIA Graduate Fellowship recipients driving innovation in AI and robotics. Watch a replay of the “Graduate Program Fast Forward” session from the NVIDIA GTC AI conference, where doctoral students in the NVIDIA Graduate Fellowship showcased their groundbreaking research.

Hackathon Features Robots Powered by NVIDIA Isaac GR00T N1

The Seeed Studio Embodied AI Hackathon, which took place last month, brought together the robotics community to showcase innovative projects using the LeRobot SO-100ARM motor kit.

The event highlighted how robot learning is advancing AI-driven robotics, with teams successfully integrating the NVIDIA Isaac GR00T N1 model to speed humanoid robot development. A notable project involved developing leader-follower robot pairs capable of learning pick-and-place tasks by post-training robot foundation models on real-world demonstration data.

How the project worked:

  • Real-World Imitation Learning: Robots observe and mimic human-led demonstrations, recorded through Arducam vision systems and an external camera.
  • Post-Training Pipeline: Captured data is structured into a modality.json dataset for efficient GPU-based training with GR00T N1.
  • Bimanual Manipulation: The model is optimized for controlling two robotic arms simultaneously, enhancing cooperative skills.

The dataset is now publicly available on Hugging Face, with implementation details on GitHub.

Team “Firebreathing Rubber Duckies” celebrating with NVIDIA hosts.

Learn more about the project.

Advancing Robotics: IEEE Robotics and Automation Society Honors Emerging Innovators

The IEEE Robotics and Automation Society in March announced the recipients of its 2025 Early Academic Career Award, recognizing outstanding contributions to the fields of robotics and automation.

This year’s honorees — including NVIDIA’s Shuran Song, Abhishek Gupta and Yuke Zhu — are pioneering advancements in scalable robot learning, real-world reinforcement learning and embodied AI. Their work is shaping the next generation of intelligent systems, driving innovation that impacts both research and real-world applications.

Learn more about the award winners:

These researchers will be recognized at the International Conference on Robotics and Automation in May.

Stay up to date on NVIDIA’s leading robotics research by reading the Robotics Research and Development Digest (R2D2) tech blog series, subscribing to the newsletter and following NVIDIA Robotics on YouTube, Discord and developer forums.
