NVIDIA introduced its NemoClaw stack at GTC 2026. This move supports and secures the rapidly expanding open-source AI agent platform, OpenClaw (formerly Moltbot). NVIDIA CEO Jensen Huang described OpenClaw as the next ChatGPT. He highlighted a shift from passive chatbots to proactive AI engines that can replace conventional PC application workflows.  

OpenClaw enables users to run autonomous agents, known as Claws, locally on their personal computers. These agents can perform a variety of tasks, such as launching applications, managing files, automating workflows, interacting with other software, and processing information, all without user input.

Overview of OpenClaw 

  • Local-first agent. OpenClaw is an open-source tool that runs directly on a user’s computer. It works with Windows, Mac, or Linux systems and prioritizes privacy by keeping data on the device.
  • Action-oriented. Unlike ChatGPT, which primarily generates text, OpenClaw is designed to perform actions. It can read and write files, send emails, browse the web, manage calendars, and control software via APIs.
  • OpenClaw offers over 100 built-in skills for integrating with applications and automating complex, repetitive tasks.  
  • Observers characterize OpenClaw as the fastest-growing open source project to date. Many associate it with a lobster mascot and the term “lobster fever.”  

NVIDIA NemoClaw: Security and Infrastructure 

OpenClaw has high-level access to personal files and software, which creates significant security risks. NVIDIA’s NemoClaw addresses these risks by supplying a secure infrastructure for these agents.

  • NemoClaw enables users to deploy open-source models, such as NVIDIA’s Nemotron, directly on RTX-enabled personal computers with a single command.
  • NVIDIA OpenShell. This sandboxed runtime, developed by NVIDIA, adds security and privacy controls for agents. It ensures safe operation without access to unauthorized files or networks.  
  • NemoClaw comes with important safety features. These make it trustworthy for use in professional and business settings.  
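The sandboxing idea in the list above, an agent that cannot reach unauthorized files or networks, can be illustrated with a small allowlist check. This is a minimal sketch under stated assumptions: `SandboxPolicy`, its fields, and the example paths and host are hypothetical names chosen for illustration and do not reflect the actual OpenShell or NemoClaw configuration format.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath


@dataclass(frozen=True)
class SandboxPolicy:
    """Hypothetical allowlist policy; NOT the real OpenShell format."""
    allowed_dirs: tuple = ()
    allowed_hosts: frozenset = frozenset()

    def can_access_file(self, path: str) -> bool:
        candidate = PurePosixPath(path)
        if not candidate.is_absolute():
            return False  # reject relative paths outright
        # Collapse ".." segments lexically so "/work/../etc" cannot escape.
        parts = []
        for part in candidate.parts:
            if part == "..":
                if len(parts) > 1:  # never pop the root "/"
                    parts.pop()
            else:
                parts.append(part)
        resolved = PurePosixPath(*parts)
        return any(resolved.is_relative_to(d) for d in self.allowed_dirs)

    def can_connect(self, host: str) -> bool:
        return host.lower() in self.allowed_hosts


# Illustrative policy: one working directory and one permitted host.
policy = SandboxPolicy(
    allowed_dirs=(PurePosixPath("/home/user/agent"),),
    allowed_hosts=frozenset({"api.example.com"}),
)
print(policy.can_access_file("/home/user/agent/notes.txt"))
print(policy.can_access_file("/home/user/agent/../.ssh/id_rsa"))
```

A real sandbox would enforce such rules at the operating-system level rather than in application code; the sketch only shows the shape of a deny-by-default policy.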

Consequences For PC Applications 

  • Agentic workflows. OpenClaw is increasingly used to develop specialized agents that eliminate the need for conventional standalone applications for tasks such as research, document drafting, and project workflow management.
  • Operating system for personal AI. Jensen Huang has stated that OpenClaw serves as an operating system for AI on devices, potentially replacing traditional software interaction paradigms much as Windows changed personal computing.
  • Elevating capabilities. This technology enables non-expert users to perform complex tasks like design or planning. Users provide instructions to the agent instead of using specialized software.  

Peter Steinberger developed OpenClaw, previously known as Clawdbot. A trademark complaint from Anthropic prompted the platform to rebrand quickly. Steinberger later joined OpenAI.

Huang repeatedly described OpenClaw as the next ChatGPT, emphasizing its significance at GTC 2026 and forecasting that it would mark a new era in software development, akin to Windows’s impact on personal computing. He stressed that OpenClaw is the largest and most successful open-source project in history, underscoring its viral adoption and transformative role in agentic AI.  

Huang stated that OpenClaw begins a new era in software development.

He described OpenClaw as the largest and most successful open-source project in history.  

Huang equated OpenClaw’s emergence to the revolutionary impact of ChatGPT in 2022, suggesting it marks a key milestone in the field of artificial intelligence.  

In an interview with CNBC’s Jim Cramer during NVIDIA’s GPU Technology Conference, Huang highlighted how OpenClaw shifts the paradigm from passive chatbots to proactive, action-oriented AI agents.

Huang reiterated that this is definitely the next ChatGPT and emphasized that OpenClaw signifies the beginning of a new era in software. He further described it as the largest, most popular, most successful open-source project in the history of humanity, noting its rapid adoption as the fastest-growing open-source project to date.

Significance of OpenClaw in the AI Sector 

Unlike large language models, OpenClaw is an open-source framework for building autonomous AI agents. These agents can execute real-world tasks with minimal human oversight, such as managing emails, automating scheduling, or performing repetitive digital operations. The platform enables users to create, deploy, and run these agents locally on personal computers.  

OpenClaw enables users to create personalized agents with minimal code, allowing them to manage complex workflows. For example, agents can autonomously design kitchen layouts by analyzing images, researching tools, iterating concepts, and refining outputs.  

Huang stressed the potential social impact of OpenClaw: “Every carpenter can now be an architect. Every plumber will become an architect. We are going to improve everyone’s capabilities.” He compared OpenClaw’s significance for the agentic AI era to the roles of Windows in personal computing and of Linux or Kubernetes in cloud infrastructure.

Originally developed by Peter Steinberger, who later joined OpenAI, the project runs locally on Mac, Windows, or Linux and supports models from Anthropic, OpenAI, and other large language model providers. It integrates with messaging applications such as WhatsApp, Telegram, Discord, and Slack. OpenClaw prioritizes privacy by retaining data on the device and functions as a personal AI assistant that actually does things, including managing calendars, handling emails, and automating monetization tasks.

NVIDIA Expands Offerings With NemoClaw Enterprise Stack 

To leverage the momentum of open-source AI development, NVIDIA announced NemoClaw, an enterprise-grade, secure version of OpenClaw. NemoClaw integrates NVIDIA’s comprehensive software stack and introduces robust safeguards for privacy, oversight, compliance, and scalability, addressing businesses’ concerns about uncontrolled agent actions.

Huang emphasized NVIDIA’s commitment to making agentic AI safe and widely accessible by addressing key concerns around security, privacy, and oversight as these AI agents become more autonomous.

Source: ‘OpenClaw definitely the next ChatGPT’: Nvidia CEO Jensen Huang hails viral AI agent platform at GTC 2026

At NVIDIA GTC 2026, Samsung Electronics made waves by unveiling its next-generation HBM4E (High Bandwidth Memory 4E) and announcing mass production of its 6th-generation HBM4 for NVIDIA’s upcoming Vera Rubin platform. This milestone marks a major technical triumph for Samsung, driving AI processing rates and ensuring a strong memory supply for NVIDIA’s next-gen AI infrastructure. 

Key Breakthroughs And Technical Specifications 

  • HBM4 (6th gen), now in production. Samsung’s HBM4 delivers 11.7 Gbps per pin for NVIDIA’s Vera Rubin AI platform.
  • HBM4E (7th gen), newly unveiled. HBM4E reaches 16 Gbps per pin and 4.0 TB/s bandwidth.
  • Hybrid copper bonding. This enables stacks of 16 or more layers and cuts thermal resistance by over 20%, improving efficiency.
  • Process node. Both HBM products use an advanced 10nm-class (1c) DRAM process for high performance.
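The relationship between the per-pin rates and the bandwidth figure quoted above can be checked with simple arithmetic. The sketch below assumes a 2048-bit interface per HBM4 stack, which is the width defined by the JEDEC HBM4 standard but is not stated in the source. It also assumes HBM4E uses the same width.

```python
# Assumption (not stated in the source): 2048 data pins per stack,
# as in the JEDEC HBM4 standard; HBM4E is assumed to match.
HBM4_INTERFACE_BITS = 2048


def stack_bandwidth_tb_s(pin_rate_gbps: float,
                         interface_bits: int = HBM4_INTERFACE_BITS) -> float:
    """Peak per-stack bandwidth in decimal TB/s from the per-pin rate."""
    gigabytes_per_second = pin_rate_gbps * interface_bits / 8  # bits to bytes
    return gigabytes_per_second / 1000


hbm4 = stack_bandwidth_tb_s(11.7)   # HBM4 per-pin rate from the text
hbm4e = stack_bandwidth_tb_s(16.0)  # HBM4E per-pin rate from the text
print(f"HBM4  at 11.7 Gbps per pin: {hbm4:.1f} TB/s per stack")
print(f"HBM4E at 16.0 Gbps per pin: {hbm4e:.1f} TB/s per stack")
```

Under these assumptions, 16 Gbps per pin works out to 4096 GB/s, which lines up with the 4.0 TB/s quoted for HBM4E, while 11.7 Gbps per pin gives roughly 3.0 TB/s for HBM4.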

Strategic Impact on NVIDIA Partnership 

  • Samsung’s introduction of HBM4 and HBM4E is strategically aligned to enhance NVIDIA’s high-performance AI accelerators. This ensures improved AI training and inference capabilities for NVIDIA’s Vera Rubin platform, directly supporting NVIDIA’s competitive edge in AI infrastructure.
  • Diversified supply chain. By adopting Samsung’s HBM4, NVIDIA fortifies its next-generation GPU platforms with a more resilient and diversified supply chain, reducing the strategic risk of dependence on a single supplier.
  • Total AI solution. By offering an integrated turnkey service across memory (HBM4, HBM4E), logic design, foundry, and advanced packaging, Samsung aims to position itself as a total AI solution provider for NVIDIA, enabling NVIDIA to accelerate development and streamline supply with a single partner.

GTC 2026 Showcase 

Samsung’s presence at GTC 2026 highlighted a comprehensive AI alliance featuring: 

  • NVIDIA Gallery. A special section featuring Samsung’s HBM4, SoCAMM2, and PM1763 SSDs, all optimized for NVIDIA AI infrastructure.
  • AI Factory cooperation. Implementation of NVIDIA accelerated computing to scale Samsung’s AI Factory and expedite manufacturing with digital twins powered by NVIDIA Omniverse.
  • Inference efficiency. Samsung’s PM1753 SSD improves energy efficiency and system performance for inference workloads.

Samsung Electronics, recognized for its leadership in advanced microchip technology, has announced the AI computing technologies it will present at NVIDIA GTC 2026 in San Jose, California, from March 16 to 19. As the only semiconductor company in the industry to supply a comprehensive AI solution encompassing memory, logic, foundry, and advanced packaging, Samsung will display a complete portfolio of products and solutions that support the design and development of advanced AI systems. Additional information about Samsung’s AI solutions will be available at the company’s GTC 2026 booth (#1207). 

The primary focus of Samsung’s presentation at NVIDIA GTC 2026 will be the 6th-generation HBM4, now in mass production and designed for the NVIDIA Vera Rubin platform. Samsung’s HBM4 is projected to advance the development of future AI applications by delivering consistent data rates of 11.7 Gbps per pin, surpassing the industry standard of 8 Gbps and enabling potential enhancements up to 13 Gbps.

Furthermore, utilizing the 6th-generation 10nm-class (1c) DRAM process, Samsung has achieved stable yields and high performance. The company’s next-generation HBM4E, which delivers 16 Gbps per pin and 4.0 TB/s bandwidth, will also be exhibited for the first time at GTC 2026.

In addition to its HBM portfolio, Samsung will also present its Hybrid Copper Bonding (HCB) technology, a new chip connection method that lets next-generation HBM reach 16 or more stacked memory layers while lowering thermal resistance by more than 20% compared to the traditional Thermal Compression Bonding (TCB) method, which makes cooling more effective.

Advancing AI Through Tactical Collaboration 

The teamwork between Samsung and NVIDIA will be highlighted in a special NVIDIA gallery inside the booth. This area will show a range of Samsung technologies, including HBM4, SoCAMM2 (a server memory module), and the PM1763 SSD (a storage device), all built for NVIDIA AI systems.

To further meet the requirements for efficiency and expandability in AI systems, Samsung’s SoCAMM2, based on low-power DRAM, serves as a server memory module offering high bandwidth and flexible system integration for next-gen AI infrastructure. SoCAMM2 is currently in mass production, denoting an industry-first achievement. 

Samsung’s PM1763 SSD, designed for next-gen AI storage solutions, uses the PCIe 6.0 interface to deliver fast data transfers and high capacities. The performance of the PM1763 will be demonstrated on servers running the NVIDIA BlueField-4 STX reference architecture for accelerated storage infrastructure on the NVIDIA Vera Rubin platform. Samsung’s PM1753 SSD will demonstrate its contribution to increased energy efficiency and system performance for inference workloads. 

Memory Architecture to Scale 

At GTC 2026, Samsung will present its collaboration with NVIDIA on AI factory development, including plans to utilize NVIDIA accelerated computing to expand Samsung’s AI factory and expedite digital twin manufacturing using NVIDIA Omniverse libraries. This partnership supports a comprehensive chip manufacturing infrastructure that includes memory, logic, foundry, and advanced packaging. 

Yong Ho Song, Executive Vice President and Head of AI Center at Samsung Electronics, will discuss the strategic cooperation between the two companies during his speaker session on March 17, 2026. The session titled Transforming Semiconductor Manufacturing with Agentic AI from Design and Engineering to Production will detail the AI Factory and present real-world use cases where AI and digital twins are advancing semiconductor manufacturing, including development chain, electronic design automation (EDA), computational lithography, and the operation of advanced manufacturing facilities powered by NVIDIA. 

Turning to local AI, Samsung’s memory solutions are engineered to maximize efficiency for local AI workloads on personal devices. At GTC 2026, Samsung will present customized solutions for personal AI supercomputers, including the PM9E3 and PM9E1 NAND for NVIDIA DGX Spark. It will also display the DRAM solutions LPDDR5X and LPDDR6, designed for embedding in smartphones, tablets, and wearable devices, providing increased data throughput and reduced latency. LPDDR5X achieves speeds up to 25 Gbps per pin and reduces power consumption by up to 15%, supporting responsive mobile experiences, high-resolution gaming, and advanced AI-enhanced applications while maintaining battery life. LPDDR6 offers further bandwidth, scaling to 30–35 Gbps per pin, and provides advanced power management features, for example adaptive voltage scaling and dynamic refresh control, which together provide the performance needed for next-gen edge AI workloads.

Source: Samsung Unveils HBM4E, Showcasing Comprehensive AI Solutions, NVIDIA Partnership and Vision at NVIDIA GTC 2026 

Imagine yourself as a developer creating a research assistant with GPT-5.4. This agent can retrieve documents, summarize findings, and answer follow-up questions over several interactions. Early tests show strong reasoning, but as the agent combines retrieval, tool use, and generation, delays can increase. For interactive experiences these delays are important. So many teams use a multi-model approach: a larger model handles planning while smaller models quickly complete subtasks at scale. 
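The multi-model approach described above can be sketched as a planner that splits a question into subtasks and a pool of smaller workers that run them in parallel. This is only an illustration under stated assumptions: `call_model` is a hypothetical stand-in that echoes its input instead of calling any real API, and the model labels `planner-model` and `small-model` are placeholders, not real deployment names.

```python
from concurrent.futures import ThreadPoolExecutor


def call_model(model: str, prompt: str) -> str:
    """Hypothetical stand-in for a model call; returns a tagged echo."""
    return f"[{model}] {prompt}"


def plan(question: str) -> list:
    """Planner step: a larger model would decide the subtasks here."""
    call_model("planner-model", f"Plan: {question}")
    return [
        f"retrieve documents about {question}",
        f"summarize findings about {question}",
    ]


def run_agent(question: str) -> str:
    subtasks = plan(question)
    # Smaller, faster workers handle the subtasks concurrently.
    with ThreadPoolExecutor(max_workers=4) as pool:
        results = list(pool.map(lambda t: call_model("small-model", t), subtasks))
    # The planner combines the partial results into one answer.
    return call_model("planner-model", "Combine: " + " | ".join(results))


answer = run_agent("HBM4 memory")
print(answer)
```

Running the subtasks concurrently is what keeps end-to-end latency close to that of the slowest subtask rather than the sum of all of them.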

That’s where GPT-5.4 Mini and GPT-5.4 Nano help. These smaller versions are built for developer tasks needing low latency, cost savings, and flexibility, now available in Microsoft Foundry. They give you more options for efficient agent design. 

GPT-5.4 Mini: Efficient Reasoning for Production Workflows

GPT-5.4 Mini packs the strengths of GPT-5.4 into a smaller, more efficient model for tasks that need quick responses. It’s a step up from GPT-5 mini in coding, reasoning, understanding images and text, and using tools, while running about twice as fast.

  • Text and image inputs let you create experiences that use both prompts and images like screenshots. 
  • You can reliably use tools and call APIs to support agent workflows. 
  • Web and file search features help ground responses in outside or company content during multi-step tasks. 
  • Computer use support means the model can interpret a software UI and take specific actions as needed.

Where GPT-5.4 Mini Thrives

  • Developer copilots and coding assistants benefit from quick coding help, code review suggestions, and fast feedback loops where speed is important.
  • Multimodal developer workflows include apps that can read screenshots, understand UI states, or process images during coding and debugging. 
  • Computer use sub-agents are fast helpers that take specific actions in software such as navigating UIs or handling repetitive tasks, all within a larger agent system managed by a planner model. 

GPT-5.4 Nano: Ultra-Low-Latency Automation at Scale

GPT-5.4 Nano is the smallest and fastest model, built for low-latency, low-cost API use at volume. It excels at quick tasks like classification, extraction, and ranking, as well as at simple sub-agent jobs where speed and cost matter more than deep reasoning.

  • It follows instructions while sticking closely to what developers want in short clear tasks. 
  • It can reliably call tools and APIs for agent and automation tasks. 
  • It’s tuned for common programming tasks that require quick results. 
  • It supports image inputs so it can handle basic image interpretation along with text. 
  • It’s designed to give fast, efficient responses at scale while keeping costs low. 

Where GPT-5.4 Nano Thrives

GPT-5.4 Nano works best when you need reliable results at high volume and your tasks are quick and clearly defined.

  • It’s great for classification and intent detection as well as for quickly labeling and routing lots of requests. 
  • It can extract structured fields from text, check formats, and standardize outputs. 
  • It helps with ranking and triage like re-ordering candidates, prioritizing tickets or leads, and picking the next best action when speed is important. 
  • It can handle guardrails and policy checks, such as simple safety and policy reviews, prompt filtering, and enforcement decisions before sending tasks to tools or bigger models.
  • It’s useful for high-volume text processing, such as batch transformations, data cleanup, duplicate removal, and content normalization, where cost and speed are key.
  • It can route and prioritize jobs at the edge, choosing the right workflow template, queue, or model for each request when speed is critical. 

Choosing The Right GPT-5.4 Model 

With Microsoft Foundry, you can use different GPT-5.4 models at the same time. This lets teams send each request to the model best suited to its specific requirements. Here is a simple way to evaluate which model to use:

Model | Best suited for | Typical workloads
GPT-5.4 | Sustained multi-step reasoning with reliable follow-through | Agentic workflows, research assistants, document analysis, complex internal tools
GPT-5.4 Pro | Deeper, higher-reliability reasoning for complex production scenarios | High-stakes agentic workflows, long-form analysis and synthesis, complex planning, advanced internal copilots
GPT-5.4 Mini | Balanced reasoning with lower latency for interactive systems | Real-time agents, developer tools, retrieval-augmented applications
GPT-5.4 Nano | Ultra-low latency and high throughput | High-volume request routing, real-time chat, lightweight automation
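One way to turn the table above into routing logic is a small rule-based selector. The sketch below is an assumption-laden illustration: the `Request` fields, the decision order, and the model identifier strings are hypothetical and would need to match your own deployment names and workload signals.

```python
from dataclasses import dataclass


@dataclass
class Request:
    """Hypothetical workload signals used for routing."""
    needs_multistep_reasoning: bool = False
    high_stakes: bool = False
    interactive: bool = False


def pick_model(req: Request) -> str:
    """Map a request to a model tier, loosely following the table."""
    if req.high_stakes:
        return "gpt-5.4-pro"      # deeper, higher-reliability reasoning
    if req.needs_multistep_reasoning:
        # Interactive systems trade some depth for lower latency.
        return "gpt-5.4-mini" if req.interactive else "gpt-5.4"
    return "gpt-5.4-nano"         # quick, clearly defined, high-volume tasks


print(pick_model(Request(needs_multistep_reasoning=True, interactive=True)))
print(pick_model(Request()))
```

In practice the signals themselves could come from a cheap classifier, which is one of the roles the text suggests for the Nano tier.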

Responsible AI In Microsoft Foundry 

At Microsoft, our objective is to empower people and organizations. As AI becomes more common, trust is vital to adoption and building that trust means being transparent, safe, and accountable. Microsoft Foundry offers governance tools, monitoring, and evaluation features to help organizations use GPT-5.4 models responsibly in production following Microsoft’s responsible AI principles. 

Pricing 

Model | Deployment | Input (USD per 1M tokens) | Cached input (USD per 1M tokens) | Output (USD per 1M tokens)
GPT-5.4 Mini | Standard Global | $0.75 | $0.075 | $4.50
GPT-5.4 Nano | Standard Global | $0.22 | $0.02 | $1.25
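To see what these rates mean per request, the following sketch computes a cost from token counts. The prices are copied from the pricing rows above; the dictionary keys, the function name, and the example token counts are illustrative assumptions.

```python
# USD per 1M tokens, taken from the pricing rows above.
PRICES = {
    "gpt-5.4-mini": {"input": 0.75, "cached_input": 0.075, "output": 4.50},
    "gpt-5.4-nano": {"input": 0.22, "cached_input": 0.02, "output": 1.25},
}


def request_cost(model: str, input_tokens: int, output_tokens: int,
                 cached_input_tokens: int = 0) -> float:
    """Cost in USD for one request; cached tokens are a subset of input."""
    price = PRICES[model]
    fresh_input = input_tokens - cached_input_tokens
    total_per_million = (
        fresh_input * price["input"]
        + cached_input_tokens * price["cached_input"]
        + output_tokens * price["output"]
    )
    return total_per_million / 1_000_000


# Example workload (assumed): 2,000 input tokens and 500 output tokens.
mini_cost = request_cost("gpt-5.4-mini", 2000, 500)
nano_cost = request_cost("gpt-5.4-nano", 2000, 500)
print(f"Mini: ${mini_cost:.6f}  Nano: ${nano_cost:.6f}")
```

For this assumed workload, the request costs about $0.00375 on Mini and about $0.001065 on Nano, so output tokens dominate the bill on both tiers.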

The models are available in data zone US and will soon be available in data zone EU. 

Try the models in Microsoft Foundry: sign in, browse the catalog, compare Mini and Nano with other options, and choose the best fit for your workload.

Source: Introducing OpenAI’s GPT-5.4 mini and GPT-5.4 nano for low-latency AI 

News Summary 

  • The NVIDIA Vera CPU delivers twice the efficiency and 50% higher performance than traditional CPUs, enabling faster, more cost-effective computing for demanding workloads. 
  • NVIDIA is working with clients in the global cloud, AI, and enterprise sectors to deploy the Vera CPU. 
  • Manufacturers across the industry have already started using the Vera CPU in their systems. 

At GTC, NVIDIA launched the Vera CPU, the world’s first processor for agentic AI and reinforcement learning, offering twice the efficiency and 50% more speed than traditional rack-scale CPUs. 

As reasoning and agentic AI improve, the infrastructure behind these models becomes more important for scaling performance and cost. This includes systems that plan tasks, run tools, interact with data, execute code, and check results. 

The NVIDIA Vera CPU builds on the NVIDIA Grace CPU, allowing organizations of any size to create AI factories that use agentic AI at scale. Vera offers top single-thread performance and per-core bandwidth, making it ideal for large-scale AI services such as coding assistants and both consumer and enterprise agents. 

Major cloud providers like Alibaba Cloud, CoreWeave, Meta, and Oracle Cloud Infrastructure, along with system makers such as Dell Technologies, HPE, Lenovo, and Supermicro, are working with NVIDIA to deploy Vera. This widespread adoption positions Vera as the new standard for critical AI workloads, making AI easier to use and accelerating innovation for developers, startups, institutions, and businesses.

“Vera is arriving at a turning point for AI as intelligence becomes agentic, capable of reasoning and acting. The importance of the systems orchestrating that work is elevated,” said Jensen Huang, founder and CEO of NVIDIA. “The CPU is no longer simply supporting the model. It’s driving it with breakthrough performance and energy efficiency. Vera unlocks AI systems that think faster and expand further.”

Configurable for Every Data Center 

NVIDIA announced a new Vera CPU rack that holds 256 liquid-cooled Vera CPUs. This setup can support over 22,500 CPU environments running at full performance. At the same time, AI factories can quickly scale up to tens of thousands of instances of agentic tools in one rack. 

The new Vera rack uses the NVIDIA MGX modular reference architecture, a flexible blueprint for building different server configurations, and is supported by partners around the world.

On the NVIDIA Vera Rubin NVL72 platform, Vera CPUs connect to NVIDIA GPUs using NVIDIA NVLink-C2C interconnect technology, a high-speed connection that lets CPUs and GPUs share data quickly. It offers 1.8 TB/s of bandwidth, which is seven times more than PCIe Gen 6. NVIDIA also introduced new reference designs that use Vera as the host CPU for NVIDIA HGX Rubin NVL8 systems, managing data movement and system control for GPU-accelerated tasks.
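The bandwidth comparison above can be made concrete with a transfer-time estimate. The sketch derives the implied PCIe Gen 6 figure from the "seven times" claim in the text rather than from the PCIe specification, and the 100 GB payload is an arbitrary example. Real transfers would also see protocol overhead that this idealized calculation ignores.

```python
NVLINK_C2C_TB_S = 1.8      # from the text
SPEEDUP_OVER_PCIE6 = 7     # "seven times more", from the text

# Implied PCIe Gen 6 bandwidth, derived from the two figures above.
pcie6_tb_s = NVLINK_C2C_TB_S / SPEEDUP_OVER_PCIE6


def transfer_seconds(gigabytes: float, link_tb_s: float) -> float:
    """Idealized time to move a payload at the link's peak bandwidth."""
    return gigabytes / (link_tb_s * 1000)


payload_gb = 100  # arbitrary example payload
print(f"Implied PCIe Gen 6 bandwidth: {pcie6_tb_s * 1000:.0f} GB/s")
print(f"NVLink-C2C: {transfer_seconds(payload_gb, NVLINK_C2C_TB_S) * 1000:.1f} ms")
print(f"PCIe Gen 6: {transfer_seconds(payload_gb, pcie6_tb_s) * 1000:.1f} ms")
```

The implied figure of roughly 257 GB/s is close to the bidirectional bandwidth of a 16-lane PCIe 6.0 link, which suggests that is the baseline behind the comparison.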

Vera system partners offer both dual- and single-socket CPU server setups. These are ideal for tasks such as reinforcement learning, agentic inference, data processing, orchestration, storage management, cloud applications, and high-performance computing. 

Across all configurations, Vera systems include NVIDIA ConnectX SuperNIC cards and BlueField-4 DPUs, delivering high-speed networking, storage, and security.

Key Benefits for Agentic AI

Customers can optimize performance with a single software stack across the NVIDIA platform, while high-performance CPU cores, high-bandwidth memory, and an advanced coherency fabric ensure quick responses even under heavy agentic workflows and reinforcement learning.

Vera has 88 custom NVIDIA-designed Olympus cores that provide strong performance for compilers (software that translates programming code), runtime engines (systems that execute application code), analytics pipelines (processes for analyzing data), agentic tools (AI tools that perform tasks independently), and orchestration services (systems that coordinate complex processes). Each core can handle two tasks at once using NVIDIA Spatial Multithreading (a technology that lets a single CPU core run multiple threads simultaneously), ensuring steady, predictable performance. This is ideal for AI factories that run many jobs at the same time.
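The rack-level figure quoted earlier (256 CPUs supporting over 22,500 CPU environments) can be cross-checked against the per-CPU core count. The arithmetic below uses only numbers from the text; reading the result as one environment per physical core is an interpretation, not something the source states.

```python
CPUS_PER_RACK = 256     # from the Vera CPU rack description
CORES_PER_CPU = 88      # Olympus cores per Vera CPU
THREADS_PER_CORE = 2    # two tasks per core via multithreading

total_cores = CPUS_PER_RACK * CORES_PER_CPU
total_threads = total_cores * THREADS_PER_CORE

print(f"Cores per rack:   {total_cores:,}")
print(f"Threads per rack: {total_threads:,}")
```

The product is 22,528 cores, which matches the "over 22,500" figure if each environment is given a dedicated core, with 45,056 hardware threads available in total.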

Vera also improves energy efficiency with the second generation of NVIDIA’s low-power memory system, now using LPDDR5X (a high-performance, low-power memory type). This provides up to 1.2 TB/s of bandwidth, which is twice that of general-purpose CPUs, at half the power.

Widespread Ecosystem Support 

Cursor, a company focused on AI-native software development, is using NVIDIA Vera to improve performance of its AI coding agents. 

“We are excited to use NVIDIA Vera CPUs to improve overall throughput and latency so we can deliver faster, more responsive coding agent experiences for our customers,” said Michael Truell, co-founder and CEO of Cursor.

Redpanda, a top streaming data platform and AI platform, is using Vera to greatly improve performance. 

“Redpanda recently tested NVIDIA Vera running Apache Kafka-compatible workloads and saw dramatically better performance than other systems in a benchmark. It delivered up to 5.5x lower latency,” said Alex Gallego, founder and CEO of Redpanda. “Vera represents a new direction in CPU architecture with more memory and less overhead per core. This enables our customers to scale real-time streaming workloads further than ever and unlock new AI and agentic applications.”

National labs planning to use Vera CPUs include the Leibniz Supercomputing Center, Los Alamos National Laboratory, Lawrence Berkeley National Laboratory, National Energy Research Scientific Computing Center, and the Texas Advanced Computing Center (TACC). 

“At TACC, we recently tested NVIDIA’s Vera CPU platform as we prepared for launching our upcoming Horizon system, running six of our scientific applications. We saw impressive early results,” said John Cazes, Director of High-Performance Computing at TACC. “Vera’s per-core performance and memory throughput represent a giant leap forward for scientific computing, and we look forward to bringing Vera-based nodes to our CPU users on Horizon later this year.”

Leading cloud service providers planning to deploy Vera CPUs include Alibaba Cloud, ByteDance, Cloudflare, CoreWeave, Crusoe, Lambda, Nebius, Nscale, Oracle Cloud Infrastructure, Together.ai, and Vultr.

Leading infrastructure providers adopting Vera CPUs include Aivres, ASRock Rack, ASUS, Compal, Cisco, Dell, Foxconn, Gigabyte, HPE, Hyve, Inventec, Lenovo, MiTAC Computing, MSI, Pegatron, Quanta Cloud Technology (QCT), Supermicro, Wistron, and Wiwynn.

Source: NVIDIA Launches Vera CPU, Purpose-Built for Agentic AI 

Recent developments from Refroid Technologies and TierX Data Center illustrate how infrastructure is rapidly evolving to power next-generation AI-driven customer experiences.

Refroid and TierX have joined forces to develop infrastructure tailored for AI-powered customer experiences.

As more organizations use artificial intelligence and analytics-based services, the technology driving these tools is becoming a key focus not just for CIOs and CTOs but also for those leading customer experience.  

They are building a modular data center system for high-performance computing and AI workloads.  

The partnership combines Refroid’s advanced liquid cooling with TierX’s modular, standards-based data centers.

Their goal is to build scalable infrastructure that can handle high-density AI computing and be quickly deployed in research centers, businesses, and edge locations.  

Although the announcement is mainly about new engineering in data-center design, its impact extends beyond that: as companies digitize customer engagements and increasingly use AI services, a reliable, efficient, and scalable infrastructure is becoming vital to modern customer-experience strategies.  

How Infrastructure Drives Customer Experience Behind the Scenes 

Traditionally, customer experience strategies have focused on front-end areas like user interfaces, service journeys, personalization, and engagement channels.  

But as AI is increasingly used in customer engagements, the backend systems that support these experiences are becoming more important.  

Tools such as recommendation engines, predictive support systems, real-time analytics, and AI assistance all require computing systems capable of handling large amounts of data quickly and reliably.  

As these systems become central to digital engagement, organizations must ensure their infrastructure meets the performance and scalability demands of AI workloads.  

Traditional data centers struggle to manage the heat and power demands of new AI processors. High-performance chips for AI training and inference generate much more heat than regular computer hardware.  

This challenge is driving greater interest in new solutions, such as modular data centers and liquid-cooling systems. These technologies help organizations run high-density computing environments more efficiently.  

Strong technology infrastructure is now crucial to effective digital customer experience.  

Strategic Standing in the AI Landscape 

Their partnership responds to evolving strategies for managing digital infrastructure.  

TierX offers modular prefabricated data centers, enabling faster setup and easier expansion than traditional builds.  

This method is especially useful for sectors where computing needs are growing rapidly, such as artificial intelligence, digital services, and research computing.

Meanwhile, Refroid Technologies brings expertise in liquid-cooling systems, helping solve one of the biggest challenges in today’s data centers: controlling heat.  

As AI processors become increasingly powerful, they generate substantially more heat. Inadequate cooling can significantly impair system performance and stability.  

Together, they deliver fast-deploying, energy-efficient, high-performance computing environments that reduce costs and support demanding workloads, such as AI.  

Satya Bhavaraju, CEO of Refroid Technologies, emphasizes that the initiative centers on developing infrastructure capable of accommodating the intense thermal requirements of next-generation processors, leveraging innovations conceived and manufactured locally.

Ravikumar Enamsetti, CEO of TierX Data Center, highlights that modular infrastructure expedites the deployment of sophisticated computing environments, which is particularly beneficial to research institutions and distributed computing applications.  

The Technology Behind The Partners 

Artificial intelligence workloads require high-performance processors that consume substantial energy. Frequently, those processors demand more than 500 watts per socket, far exceeding the requirements of conventional server hardware.  

These processors generate substantial heat. Direct-to-chip liquid cooling (DLC) circulates liquid over hot components, providing better cooling than traditional air cooling. The collaboration also includes immersion cooling, in which hardware is submerged in fluids that efficiently remove heat.
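The cooling requirement can be estimated from the heat balance Q = m * c_p * ΔT, where Q is the heat load, m the coolant mass flow, c_p its specific heat, and ΔT the temperature rise across the cold plate. The sketch below is a rough illustration under assumptions not found in the source: water as the coolant (about 4186 J/(kg·K) and 1 kg/L), a 10 K temperature rise, and all heat being carried away by the liquid.

```python
WATER_CP_J_PER_KG_K = 4186.0   # assumption: water-like coolant
WATER_DENSITY_KG_PER_L = 1.0   # assumption: approximately 1 kg per litre


def coolant_flow_l_per_min(power_w: float, delta_t_k: float) -> float:
    """Coolant flow needed to remove power_w with a delta_t_k rise."""
    mass_flow_kg_s = power_w / (WATER_CP_J_PER_KG_K * delta_t_k)
    return mass_flow_kg_s / WATER_DENSITY_KG_PER_L * 60


# 500 W per socket comes from the text; the 10 K rise is an assumption.
flow = coolant_flow_l_per_min(power_w=500, delta_t_k=10)
print(f"About {flow:.2f} L/min per 500 W socket")
```

Under these assumptions a single 500 W socket needs roughly 0.7 L/min of flow, and the requirement scales linearly with the number of sockets in a rack.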

These cooling methods let data centers handle more powerful computing while using less energy to keep things at a safe temperature.

For Refroid and TierX, integrating advanced cooling systems into modular data center units enables customers to rapidly deploy infrastructure capable of handling high computing loads with lower energy requirements and enhanced cooling reliability. These modules are adaptable for deployment across diverse environments, including university campuses, research laboratories, industrial facilities, and edge computing sites. This modular strategy enables organizations to incrementally expand computational capacity, eliminating the prolonged construction timelines associated with traditional data center development.

Consequences for Customer Experience Strategy 

Data center technologies may seem unrelated to customer engagement, but they significantly affect customer experience.  

More digital services use AI platforms that process large volumes of customer data in real time. These systems enable features like dynamic product recommendations, predictive support, fraud detection, and smart service routing.  

For these features to work, organizations need environments that handle complex tasks rapidly and reliably.  

New infrastructure options, such as modular data centers, help organizations quickly boost computing power as demand grows. This flexibility enables businesses to scale AI services more quickly.  

Lower latency is another benefit when computing resources are placed closer to where data is generated. Digital platforms can respond more quickly to customers.  
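A rough sense of the latency benefit comes from signal propagation alone. The sketch below assumes light travels through optical fiber at about 200,000 km/s (roughly two thirds of its speed in vacuum) and ignores routing, queuing, and processing delays. The two distances are arbitrary examples, not figures from the source.

```python
FIBER_KM_PER_S = 200_000  # assumption: approximate signal speed in fiber


def round_trip_ms(distance_km: float) -> float:
    """Propagation-only round-trip time over a fiber path."""
    return 2 * distance_km / FIBER_KM_PER_S * 1000


remote_rtt = round_trip_ms(1500)  # example: distant regional data center
edge_rtt = round_trip_ms(50)      # example: nearby edge module
print(f"Remote site: {remote_rtt:.1f} ms  Edge site: {edge_rtt:.1f} ms")
```

With these example distances, the propagation delay drops from about 15 ms to about 0.5 ms per round trip, before any other network or compute time is counted.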

This results in more responsive chatbots, faster recommendations, and better real-time analytics, all of which shape customer perceptions.  

Energy efficiency is also becoming more important within digital infrastructure. Cooling technologies that use less power can cut costs and help meet green targets for environmental accountability. Infrastructure efficiency may play an indirect but meaningful role in shaping customer perceptions and trust.  

Wider Industry Implications 

The Refroid and TierX partnership highlights major trends in global digital infrastructure.  

One major trend is the shift toward modular, distributed computing. As organizations expand digital services and deploy AI, they need infrastructure that scales quickly and operates closer to the edge.  

Another trend is liquid cooling in high-performance computing as powerful processors outpace air-cooling solutions.  

Additionally, the announcement emphasizes the growing importance of regional infrastructure ecosystems. Many countries are seeking to strengthen domestic capabilities in semiconductor manufacturing, AI infrastructure, and advanced computing technologies.  

Both governments and businesses now see reducing dependence on global supply chains for key technology as a major strategy. These trends suggest that infrastructure innovation will play an increasingly central role in enabling digital transformation initiatives.

Source: Refroid and TierX: The Infrastructure Behind AI-Powered CX 

Samsung Electronics announced plans to turn all its manufacturing operations into AI-driven factories by 2030. The company will integrate AI throughout the manufacturing process, from materials, logistics, and production to quality checks and final shipment, creating a new autonomous production environment.  

To support this change, Samsung will use digital twin simulations in its manufacturing and introduce specialized AI agents for quality control, production, and logistics. These agents will help improve data analysis and pre-validation, raising quality, efficiency, and productivity throughout Samsung’s global factories.  

Samsung will also bring more AI into its environmental, health, and safety operations through proactive detection and automated hazard prevention. The company intends to raise safety standards at its production sites worldwide.  

The core of this change is agentic AI, first seen in the Galaxy S26 series. This AI can plan, execute, and optimize decisions on its own to meet set goals. Samsung is now using its mobile AI expertise to build a strong base for self-governance in manufacturing.  

With custom AI agents, Samsung will improve production workflows, predictive maintenance, repairs, and logistics. This will help ensure high standards and consistent quality at all its sites worldwide.  

To accelerate the transition from automation to advanced autonomy, Samsung is adding humanoid and specialized robots to its production lines. These include operating robots for managing lines and facilities, logistics robots for moving materials, and assembly robots for precise manufacturing. In places where it is hard or unsafe for people to work, Samsung will use environmental safety robots with digital twin technology to monitor conditions, spot risks, and prevent hazards.  

“The next phase of manufacturing innovation is about creating autonomous environments where AI understands operations in real time and makes the best decisions on its own,” said Young-Soo Lee, executive vice president and head of global technology research at Samsung Electronics. “We are committed to leading the way in AI-powered global manufacturing innovation.”  

Global Industry Engagement 

Samsung will present its industrial AI strategy and digital twin manufacturing vision at MWC 2026 in Barcelona. The company will show how industrial AI can improve safety and efficiency in actual environments.  

At the Samsung Mobile Business Summit, which marks its 10th anniversary this year, the company will share its governance strategy for expanding AI autonomy. This approach includes adding safety features from the start, making sure industrial AI grows responsibly and can be trusted by customers and partners worldwide.  

SMBS is a private, invitation-only event for key B2B customers and partners. Samsung uses this event to share its B2B strategy and latest technology plans, and to explore new ways to work together across industries.

Source: Samsung Electronics Announces Strategy To Transition Global Manufacturing Into ‘AI-Driven Factories’ by 2030 

Samsung Electronics brought together global experts for a panel called “In Tech, We Trust: Rethinking Security and Privacy in the AI Age” during its tech forum at CES 2026. The event, held at the Wynn in Las Vegas, focused on how trust is becoming a key factor in how people use and accept AI as it becomes a bigger part of everyday life.  

Making Invisible Intelligence Feel Trustworthy 

As AI increasingly anticipates needs, curates routines, and operates autonomously across devices, panelists Allie K. Miller (CEO of Open Machine), Amy Webb (CEO of The Future Now Strategy Group), Zach Kass (Global AI Advisor at ZKAI Advisory and former Head of Go-to-Market at OpenAI), and Shin Baik (AI Platform Center Group Head at Samsung Electronics) emphasized that trust must be earned not through promises but via consistent, understandable behavior. Each expert contributed their view on how transparency and reliability are critical for earning user trust.  

During the session, Samsung explained its trust-by-design approach, stressing that AI should be predictable, transparent, and easy for users to control. Allie Miller said that when it comes to AI, users are looking for openness and control. They want to be leaders in their own personalized experiences, to understand whether an AI model is running locally or in the cloud, to know that their data is secure, and to clearly see what is powered by AI and what is not. That level of visibility builds confidence in the provider. Providers, she added, have a responsibility to show up for users by designing personalized experiences around the core components of trust: clarity, security, and accountability.  

Samsung also pointed out that on-device AI keeps personal data on the device whenever possible. Cloud-based AI is used only when more speed or scale is needed, so users get flexibility without sacrificing privacy.  

Security for an AI-Driven World 

The panel discussed how security needs to change as AI spreads across phones, TVs, and home appliances. Samsung presented its Knox security platform, which now protects billions of devices through chip-level security and the Knox Matrix, a system that enables devices to authenticate and protect one another.  

“Trust in AI starts with security that is proven, not promised,” Shin Baik said. “For more than a decade, Samsung Knox has provided a deeply embedded security platform that protects sensitive data at every layer. But trust goes beyond a single device; it requires an ecosystem that protects itself.” Samsung Knox devices continuously authenticate and monitor one another, so each device acts as a shield for the rest, creating a resilient, secure environment users can rely on.  

A Cross-Industry Look At The Future Of Trust 

Shin Baik emphasized that trust grows when AI behaves predictably and securely across devices, arguing that users need visible signals of control rather than black-box systems. Samsung pointed to its alliances with industry leaders such as Google and Microsoft as a way to strengthen shared security research, interoperability, and ecosystem-wide protection. Allie Miller pointed out the value of transparency for users, including clear visibility into where AI models run, how data is used, and explicit labels that show what is powered by AI and what is not. Zach Kass added that while misinformation and misuse present real challenges, every risk has a countermeasure, and technology itself will play a critical part in mitigating AI’s downsides.  

Amy Webb evaluated the relationship between trust and consumer purchasing habits. “I don’t think they’re making decisions based solely on trust,” she said. “People aren’t paying for trust. They don’t buy things because of trust. They buy things because of convenience. So if the AI piece of this hooks people in, it makes their lives easier and more convenient.”  

As AI becomes part of daily life, the panel agreed on one point. The technologies people trust most will be those that prioritize security, transparency, and user choice from the start.

Source: Samsung Explores How Trust, Security and Privacy Shape the Future of AI at CES 2026 

Apple’s OpenELM 2.0 models work with the MLX framework to deliver fast, private, local retrieval-augmented generation (RAG) on Apple Silicon in optimized setups. The back-end processes over 150 tokens/sec by using layer-wise scaling and unified memory. This configuration delivers secure, high-speed AI inference directly on your device, eliminating the need for the cloud.  

Key Aspects Of The OpenELM And MLX Stack 

  • High-speed local inference: The MLX framework is designed for Apple Silicon and efficiently runs models, processing over 150 tokens/sec in some cases for real-time responses.  
  • Privacy-focused RAG: Build local RAG servers using the MLX framework to keep sensitive data on your MacBook and protect privacy.  
  • OpenELM architecture: Layer-wise scaling improves the accuracy of OpenELM models by optimizing parameter distribution across transformer layers.  
  • Optimized deployment: MLX provides libraries like MLX-LM for fast model deployment with minimal code and supports fine-tuning for on-device machine learning.  

A typical local RAG workflow on this stack has four steps:  

  1. Index documents: Split them into small text chunks.  
  2. Generate embeddings: Use MLX to turn chunks into numerical representations that capture their meaning.  
  3. Store in a vector database: Save the embeddings locally in a vector database, a system for storing and searching embeddings efficiently.  
  4. Retrieve and generate: Locate the best context for each query, then use OpenELM models to generate responses.  
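The chunk, embed, store, and retrieve steps can be sketched in plain Python. This is a minimal illustration, not the MLX implementation: the word-count `embed` function stands in for a real embedding model, `LocalVectorStore` is a hypothetical in-memory substitute for a vector database, and the final generation step is reduced to building the prompt that a local model would receive.

```python
import math
import re
from collections import Counter


def chunk(text, size=12):
    """Step 1: split a document into chunks of roughly `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Step 2 (toy stand-in): a word-count vector instead of a learned embedding."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a if w in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LocalVectorStore:
    """Step 3 (hypothetical): an in-memory list standing in for a vector database."""

    def __init__(self):
        self.items = []  # (embedding, chunk text) pairs, kept on the device

    def add(self, text):
        self.items.append((embed(text), text))

    def search(self, query, k=1):
        q = embed(query)
        ranked = sorted(self.items, key=lambda item: cosine(q, item[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(query, store, k=1):
    """Step 4: retrieve context and assemble the prompt for a local model."""
    context = "\n".join(store.search(query, k))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"


docs = [
    "Liquid cooling circulates coolant over hot processors.",
    "Unified memory lets the CPU and GPU share one memory pool.",
]
store = LocalVectorStore()
for doc in docs:
    for piece in chunk(doc):
        store.add(piece)

prompt = build_prompt("How do the CPU and GPU share memory?", store)
print(prompt)
```

In a real deployment, `embed` would call an embedding model and the prompt would be passed to the language model; nothing in the pipeline requires data to leave the machine.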

Note: these results are based on ideal conditions and specific Apple Silicon hardware.  

On-device AI has long promised a local-first approach, but hardware limits and slow speeds have held it back. With the release of OpenELM 2.0 for the MLX framework, that is changing: Apple pairs its open-source efficient language model, OpenELM, with the MLX array framework to deliver private, high-speed local RAG (retrieval-augmented generation) at up to 150 tokens per second on everyday consumer hardware.  

For developers and privacy-focused businesses, this update is about more than faster performance. It changes how sensitive data, from medical records to proprietary code, is handled and searched, keeping everything on the device through private RAG.  

The Architecture of OpenELM 2.0: Layer-wise Scaling Reimagined 

The key innovation in OpenELM 2.0 is its new layer-wise scaling approach. Instead of using the same layer sizes throughout, it spreads parameters unevenly across the transformer. Layers near the input and output have different sizes and attention head counts, allowing the model to use its parameter budget more efficiently.  
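A small sketch of how layer-wise scaling can allocate parameters: the number of attention heads and the feed-forward width are interpolated linearly from the first layer to the last, following the scheme described in the original OpenELM paper. The `alpha` and `beta` ranges and the model dimensions below are illustrative assumptions, not values from any released configuration.

```python
def layerwise_config(num_layers, d_model, d_head, alpha=(0.5, 1.0), beta=(0.5, 4.0)):
    """Return per-layer head counts and feed-forward widths.

    alpha scales the number of attention heads, beta scales the FFN width.
    Both are interpolated linearly from layer 0 to the last layer.
    """
    configs = []
    for i in range(num_layers):
        t = i / (num_layers - 1) if num_layers > 1 else 0.0
        a = alpha[0] + (alpha[1] - alpha[0]) * t
        b = beta[0] + (beta[1] - beta[0]) * t
        configs.append({
            "layer": i,
            "heads": max(1, round(a * d_model / d_head)),
            "ffn_dim": round(b * d_model),
        })
    return configs


# Illustrative dimensions only (assumed for this sketch).
configs = layerwise_config(num_layers=4, d_model=1024, d_head=64)
for cfg in configs:
    print(cfg)
```

With these assumed settings, early layers get fewer heads and narrower feed-forward blocks than later ones, so the same total budget is spent where it helps most rather than uniformly.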

When used with the MLX framework, Apple’s open-source array library for Apple Silicon, OpenELM 2.0 leverages the unified memory architecture. Here, the CPU and GPU share a single fast memory pool, eliminating the usual PCIe bottleneck found with separate GPUs. Thanks to 4-bit quantization with MLX-LM, the model fits easily in a base MacBook Air’s memory, leaving room for the vector database needed for RAG.  
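A rough back-of-the-envelope estimate shows why 4-bit quantization matters for memory. The sketch assumes a model of about 3 billion parameters (the size this article gives for the larger model) and counts weights only. It ignores quantization scales, the KV cache, and activations, so real usage will be somewhat higher.

```python
def weight_memory_gb(num_params, bits_per_weight):
    """Approximate weight storage in gigabytes (10^9 bytes), weights only."""
    return num_params * bits_per_weight / 8 / 1e9


params = 3e9  # assumed parameter count for this estimate
fp16_gb = weight_memory_gb(params, 16)
int4_gb = weight_memory_gb(params, 4)

print(f"16-bit weights: {fp16_gb:.1f} GB")
print(f" 4-bit weights: {int4_gb:.1f} GB")
```

Under these assumptions, the weights shrink from about 6 GB to about 1.5 GB, which is the headroom that leaves space for a local vector store on a machine with limited unified memory.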

Enabling 150 Tokens/Sec Performance: Speculative Decoding And MLX Kernels 

OpenELM 2.0 reaches 150 tokens/sec, which feels almost instant to users, through two main updates: optimized Metal kernels and speculative decoding. The kernels cover Grouped Query Attention (GQA) and RMSNorm, and they reduce how often the processor needs to access memory, which is usually the main slowdown for large language models (LLMs).  
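For readers unfamiliar with RMSNorm, the operation itself is simple: each vector is divided by its root mean square and multiplied by a learned gain. The plain-Python sketch below shows only the arithmetic; it is not the Metal kernel, and the `eps` default is an assumed value.

```python
import math


def rms_norm(x, gain, eps=1e-6):
    """Scale x so its root mean square is about 1, then apply a per-element gain."""
    mean_square = sum(v * v for v in x) / len(x)
    scale = 1.0 / math.sqrt(mean_square + eps)
    return [v * scale * g for v, g in zip(x, gain)]


hidden = [3.0, 4.0]
normed = rms_norm(hidden, gain=[1.0, 1.0])
print(normed)
```

A fused GPU kernel performs the same reduction and scaling in a single pass over memory, which is where the savings in memory traffic come from.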

For speculative decoding, OpenELM 2.0 uses a draft-and-verify method: a small model, such as the 270M version, proposes several tokens first, and then the larger 3B model checks them in parallel. This lets the system accept several tokens per pass of the large model, reaching speeds of 150 tokens/sec.  
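The draft-and-verify loop can be illustrated with toy models. In this sketch, `target_next` and `draft_next` are hypothetical stand-ins for the large and small models (simple functions over integer tokens), decoding is greedy, and verification is written as a loop even though a real implementation would score all drafted positions in one batched forward pass.

```python
def greedy_decode(next_token, prompt, max_new_tokens):
    """Reference decoder: one model call per generated token."""
    tokens = list(prompt)
    for _ in range(max_new_tokens):
        tokens.append(next_token(tokens))
    return tokens


def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy draft-and-verify decoding; returns (tokens, verification rounds)."""
    tokens = list(prompt)
    rounds = 0
    while len(tokens) - len(prompt) < max_new_tokens:
        # 1. The cheap draft model proposes k tokens, one after another.
        draft = []
        for _ in range(k):
            draft.append(draft_next(tokens + draft))
        # 2. The target model checks every proposed position (batched in practice).
        rounds += 1
        for i in range(k):
            expected = target_next(tokens + draft[:i])
            tokens.append(expected)
            if draft[i] != expected:
                break  # first mismatch: keep the target's token, drop the rest
    return tokens[:len(prompt) + max_new_tokens], rounds


def target_next(ctx):
    """Toy 'large' model: counts upward modulo 10."""
    return (ctx[-1] + 1) % 10


def draft_next(ctx):
    """Toy 'small' model: agrees with the target except after token 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


reference = greedy_decode(target_next, [0], 8)
output, rounds = speculative_decode(target_next, draft_next, [0], 8, k=4)
print(output, rounds)
```

The output matches plain greedy decoding with the target model, but the target is consulted in fewer rounds than there are generated tokens. The speedup depends on how often the draft model agrees with the target.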

The speed is especially important for local RAG workflows. In these pipelines, the model needs to read and summarize the context it finds before answering. High throughput means that, even with extensive documentation, the wait for the first token is barely noticeable.  

The Private RAG Advantage: Security at the Edge 

The “private” in private 150 tokens/sec local RAG is more than just a label. It’s a technical guarantee based on the local-first design. In typical RAG setups, you send data, such as health records or company spreadsheets, to the cloud for processing.  

With OpenELM 2.0 and MLX, the whole process stays on your device.  

  • Local embedding: Data is converted into vectors using MLX-optimized embedding models, such as Hugging Face’s Sentence Transformers, and the results are saved locally in an encrypted store.  
  • Local inference: The OpenELM 2.0 model searches the local store and creates responses using the device’s GPU. This reduces ongoing subscription costs and keeps the system running even when offline. For fields like healthcare or law, this may be the only practical way to use generative AI every day.  

Developer Implementation: The MLX LM Ecosystem 

Apple has made it easy for engineers to get started. The MLX-LM package lets you add OpenELM 2.0 to a Swift or Python project and handles converting and quantizing Hugging Face weights with just one CLI command or a few lines of Python code. It also manages memory limits, allowing the system to adjust its GPU cache for devices ranging from iPhones with 8 GB of RAM to M3 Max workstations with 128 GB.  

Final Thoughts 

The combination of OpenELM 2.0 and the MLX framework signals a new era for local machine learning with private, 150 tokens/sec local RAG. Apple shows that “Retina-level” AI, which delivers instant responses, can run without a huge server farm. As the open-source community continues to improve these models and the MLX framework gains more support, cloud-only LLMs will lose their edge for anyone who cares about both privacy and performance. Local LLMs are not just here; they’re running at 150 tokens per second. 

Source: OpenELM: An Efficient Language Model Family with Open Training and Inference Framework 

Today, Microsoft released its March 2026 Patch Tuesday updates addressing 79 security flaws. This includes two zero-day vulnerabilities that have been publicly disclosed.  

This Patch Tuesday also fixes three serious vulnerabilities. Two are remote code execution issues, and one is an information disclosure flaw.  

Here is a breakdown of the number of bugs in each vulnerability category. In cybersecurity, a vulnerability is a weakness that could allow unauthorized actions in a system.  

  • 46 elevation of privilege vulnerabilities  
  • 2 security feature bypass vulnerabilities  
  • 18 remote code execution vulnerabilities  
  • Information disclosure vulnerabilities  
  • 4 denial-of-service vulnerabilities  
  • 4 spoofing vulnerabilities  

BleepingComputer only counts Patch Tuesday security updates released by Microsoft on the day itself. This means the total does not include 9 Microsoft Edge flaws or issues in Mariner, Payment Orchestrator Service, Azure, and Microsoft Devices Pricing Program that were fixed earlier this month.  

Two Zero-Day Vulnerabilities and Microsoft Office Flaws 

This month’s Patch Tuesday addresses two publicly disclosed zero-day vulnerabilities with no reports of active exploitation.  

Microsoft defines a zero-day flaw as one that is either publicly disclosed or actively exploited before an official fix is available.  

The two publicly disclosed zero-day vulnerabilities are:  

CVE-2026-21262 – SQL Server Elevation of Privilege Vulnerability  

Microsoft has fixed this publicly disclosed SQL Server flaw.  

Improper access control in SQL Server allows an authorized attacker to elevate their privileges on a network, Microsoft explains.  

Microsoft credited Erland Sommarskog for discovering this flaw. He told BleepingComputer that it was first disclosed in the “Packaging Permissions in Stored Procedures” article.   

CVE-2026-2612 – .NET Denial of Service Vulnerability  

An “Out of bounds read” in .NET allows an unauthorized attacker to deny service over a network, Microsoft explains.  

An anonymous researcher is credited with finding this flaw.  

Microsoft has also addressed two remote code execution bugs (CVE-2026-26110 and CVE-2026-26113) in Microsoft Office. Remote code execution (RCE) refers to bugs that could allow attackers to run programs on your computer from a distance. These can be exploited through the preview pane, so users should update the application as soon as possible.  

One notable issue is the Microsoft Excel information disclosure flaw (CVE-2026-26144), which could allow attackers to steal data using Microsoft Copilot. An information disclosure flaw lets unauthorized parties access confidential data.   

An attacker who successfully exploited this vulnerability could cause Copilot Agent Mode to exfiltrate data via unintended network egress, enabling a zero-click information disclosure attack, Microsoft explains.  

Turning to Azure Cloud Services, Microsoft has also fixed a vulnerability in Azure Container Instances (ACI) that could have allowed users to access information belonging to other Azure customers.  

Microsoft did not share technical details about the vulnerability. However, Palo Alto Networks researchers say attackers could have used the bug to run code on other users’ containers, steal sensitive data like crypto secrets, or even install crypto-mining malware.   

Microsoft said it notified customers who might have been affected through service health notifications in the Azure portal. The company added that anyone who did not receive a notification does not need to take any action.  

There is no indication that any customer data was accessed as a result of this vulnerability. Out of an abundance of caution, Microsoft sent notifications to customers potentially affected by the researcher’s activities, advising them to revoke any privileged credentials deployed to the platform before August 31, 2021, Microsoft said in a statement.  

Microsoft recommends that all Azure customers regularly rotate their privileged credentials as a precaution. Credential rotation involves frequently changing passwords or secure keys to reduce the risk of theft.  
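A rotation policy like this is easy to check mechanically. The sketch below flags credentials that are older than a maximum age or that were created before a cutoff date, mirroring the advice above. The credential names, the dates, and the 90-day limit are hypothetical values chosen for illustration, not Microsoft guidance.

```python
from datetime import date, timedelta


def needs_rotation(created, today, max_age_days=90, cutoff=None):
    """True if a credential is older than max_age_days or predates the cutoff."""
    too_old = (today - created) > timedelta(days=max_age_days)
    before_cutoff = cutoff is not None and created < cutoff
    return too_old or before_cutoff


# Hypothetical inventory: credential name -> creation date.
credentials = {
    "registry-pull-token": date(2021, 8, 15),
    "storage-key": date(2021, 9, 20),
}
today = date(2021, 10, 1)
cutoff = date(2021, 8, 31)  # revoke anything deployed before this date

stale = sorted(
    name
    for name, created in credentials.items()
    if needs_rotation(created, today, cutoff=cutoff)
)
print(stale)
```

Here the pull token is flagged because it predates the cutoff even though it is younger than 90 days, while the newer storage key passes both checks.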

Palo Alto Networks researchers say the issue, known as Azurescape, could allow attackers to compromise Kubernetes clusters that host ACI. This would give them full control over other Azure customers’ containers.  

ACI was designed to prevent attacks from malicious neighboring containers, such as cross-account or cross-tenant attacks. However, it used an older version of runc (the standard container runtime), which was vulnerable to several container escape flaws.  

The researchers used modified proof-of-concept code for CVE-2019-5736 to escape from a container and establish a root reverse shell on the host system. They then found token permissions in the kube-system namespace that allowed them to run commands on any pod in the cluster and carry out the cross-account attack.  

A malicious Azure user could have compromised the multi-tenant Kubernetes cluster hosting ACI. As a cluster administrator, an attacker could execute commands in other customers’ containers, exfiltrate secrets and private images deployed to the platform, or deploy crypto miners. A sophisticated adversary would further investigate the detection protocols protecting ACI to avoid detection, Palo Alto said.  

ACI, launched in July 2017, is a container-as-a-service (CaaS) that runs on multi-tenant clusters such as Kubernetes and Service Fabric. It lets users deploy containers without managing the underlying infrastructure.  

In a related development, two weeks ago, Microsoft fixed a similar problem in Azure Cosmos DB. In that case, users could access other customers’ databases with full administrative rights, potentially gaining full control of them.

Source: Microsoft March 2026 Patch Tuesday fixes 2 zero-days, 79 flaws 

Microsoft Warns of Information Leak Flaw in Azure Container Instances

Silicon Motion has launched the SM8008, a PCIe Gen5 x4 NVMe SSD controller. This controller delivers high performance and industry-leading energy efficiency for data-center boot drives and other enterprise storage.  

With the growth of AI and cloud services, data centers are adding more servers than ever. Each server requires a system drive to run its software. In large-scale data centers, the energy used by these systems accumulates and can account for a significant portion of total power consumption.  

Silicon Motion SM8008 Features And Support 

The SM8008 directly tackles the challenge of balancing high performance with low power draw in modern data centers. Built on TSMC’s 6nm process, it reaches up to 14 GB/s in sequential throughput, delivers over 2.3 million 4 KB random IOPS, and keeps power consumption under 5 watts. With 8 NAND channels and support for ONFI and Toggle DDR 5.0 interfaces up to 3600 MT/s, the SM8008 offers robust and efficient storage management.  

The controller uses a PCIe Gen5 x4 interface and runs on the NVMe 2.0a protocol, delivering the required performance for new server platforms. Depending on the SSD configuration, it supports up to 16 TB of storage.  
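A quick calculation puts the headline figures in context. PCIe Gen5 signals at 32 GT/s per lane with 128b/130b encoding, so a x4 link tops out near 15.75 GB/s before protocol overhead. The sketch compares the quoted 14 GB/s, 2.3 million IOPS, and 5 W figures against that ceiling. It assumes 4 KB means 4,096 bytes and treats 5 W as the power draw, so the efficiency figure is a lower bound.

```python
def pcie_bandwidth_gb_per_s(gt_per_s, lanes, payload_bits=128, encoded_bits=130):
    """Raw one-direction link bandwidth in GB/s after line encoding."""
    return gt_per_s * lanes * payload_bits / encoded_bits / 8


link_gb_s = pcie_bandwidth_gb_per_s(gt_per_s=32, lanes=4)

sequential_gb_s = 14.0   # quoted sequential throughput
random_iops = 2.3e6      # quoted 4 KB random IOPS
power_w = 5.0            # quoted power ceiling

utilization = sequential_gb_s / link_gb_s
random_gb_s = random_iops * 4096 / 1e9
gb_s_per_watt = sequential_gb_s / power_w

print(f"Gen5 x4 link ceiling: {link_gb_s:.2f} GB/s")
print(f"Sequential utilization: {utilization:.0%}")
print(f"Random 4 KB throughput: {random_gb_s:.2f} GB/s")
print(f"Sequential efficiency: at least {gb_s_per_watt:.1f} GB/s per watt")
```

On these assumptions, the quoted sequential speed uses close to nine-tenths of what the link can carry, which suggests the interface rather than the controller is the next limit.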

To help data centers cut energy costs when running thousands of drives, the controller supports DDR4-3200 or LPDDR4-3200 DRAM featuring built-in error correction. This approach reduces power consumption while maintaining reliability, making it cost-effective for enterprise environments.  

Silicon Motion SM8008 Security 

With built-in security features like TCG Opal 2.0 encryption and hardware support for AES-256, SHA-512, and RSA-3072, the SM8008 helps enterprises ensure robust protection for critical data while supporting secure boot and firmware integrity.  

The controller also incorporates security schemes such as DICE and SPDM and already meets the CNSA 2.0 cryptographic standard. As a result, it will satisfy new requirements expected for some government and business users starting in 2027.  

Silicon Motion SM8008 Storage Capabilities 

For storage, the controller uses Silicon Motion’s NANDCommand technology and a strong LDPC error-correction engine. These features protect data and help the drive last longer, even with TLC and QLC memory. The controller’s hardware also supports NAND processing methods that increase speed and throughput.  

The SM8008 enables flexible deployment in demanding environments by supporting standards used in large-scale data centers and enterprises. Its compatibility with the NVMe 2.0a protocol and the OCP Hyperscale NVMe Boot SSD Specification version 1.0 ensures seamless interoperability. The controller’s support for multiple SSD form factors such as M.2, U.2, E1.S, and E3.S allows vendors to streamline server and storage architectures for diverse applications.  

Silicon Motion SM8008 Specifications 

Category  Details  
Host Interface  PCIe Gen5 x4  
Specifications  NVMe 2.0a protocol support; OCP Hyperscale NVMe Boot SSD Specification Version 1.0 compliant (partial)  
NAND Interface  8 NAND channels supporting ONFI and Toggle DDR 5.0 up to 3600 MT/s  
DRAM Interface  DDR4-3200 and LPDDR4-3200 with inline ECC support  
Performance  Up to 14 GB/s sequential performance with active power under 5 W  
Security Features  TCG Opal 2.0; AES-256; SHA2-512; RSA-3072; DICE; SPDM; Secure Boot  
NANDCommand  Maximizes enterprise performance of next-generation NAND geometries with LDPC error correction and endurance extension for QLC and beyond  
Enterprise Features  NVMe Management Interface Basic Management Command (NVMe-MI v1.2); advanced data placement of images; SR-IOV with 64 virtual functions; CNSA 2.0 support  
Capacity  Up to 16 TB  

Silicon Motion SM8008 Availability 

Interest in controllers built specifically for boot storage is growing as hyperscale infrastructure evolves. Early deployments of the SM8008 are underway, with companies such as ATP and Exascend integrating the controller into new enterprise SSD platforms for large server environments. Silicon Motion’s boot storage lineup now includes SATA and PCIe Gen3/Gen4 controllers, PCIe NVMe BGA SSDs, and the SM8008 Gen5 controller, underscoring its focus on dedicated enterprise boot storage.  

About Silicon Motion 

Silicon Motion Technology Corporation is a global leader in NAND flash controllers for solid-state storage devices. The company ships more SSD controllers than any other supplier worldwide for servers, PCs, and other devices. It is also the leading merchant supplier of eMMC and UFS embedded storage controllers for smartphones, IoT products, and automotive applications.  

Silicon Motion delivers custom high-efficiency solutions for hyperscale data centers, industrial systems, and automotive SSDs. The company designs its controllers for advanced AI, cloud, and enterprise storage platforms, ensuring high performance, low power use, and reliable operation.  

Many of the world’s NAND flash vendors, data center and enterprise storage providers, storage device makers, and top OEMs rely on Silicon Motion’s controller technologies to achieve innovative, high-quality storage solutions. www.siliconmotion.com

Source: Silicon Motion Launches SM8008, a Purpose-Built PCIe Gen5 Controller for Enterprise Boot Drives and Ultra-Low-Power Storage