A new sovereign AI framework created for state and municipal governments in the U.S. by NVIDIA allows these types of government organisations to construct and operate their own AI infrastructures, while maintaining sensitive information on or within the borders of each individual state’s jurisdiction. 

This change reflects an overall shift in how governments are now viewing the use of AI in their operations; the focus now is on ensuring and enforcing control over their respective data, as well as minimising risk associated with loss of or unauthorised access to that information due to the use of an external cloud computing company providing AI capabilities. 

A Shift Toward Localized AI Systems 

As organisations in the public sector continue to adopt AI as part of their governance model, there is an increased desire among these organisations to maintain total control over their data. Most cloud-based AI systems require that data be sent to a third-party platform for processing and storage; this can cause a multitude of concerns regarding possible access to that data and/or where the data will be processed and/or stored. 

NVIDIA’s sovereign AI framework will solve the above-mentioned concerns by allowing states to turn to AI systems that can be created and executed in jurisdictions where the states have full authority over the creation and/or use of that data. 

Understanding Sovereign AI 

Sovereign AI refers to a type of system where the processing and storage of data, as well as decision making, takes place within a defined geographic/administrative area. This gives governments full authority over the use of the data that is processed or stored in the system. Additionally, this will assist with meeting regulatory requirements and compliance with privacy laws. 

How the Framework Works 

The framework leverages high-performance computing with AI software tools, enabling agencies to provision dedicated AI clusters that act as standalone systems for performing data processing and training models. These systems can be configured based on the needs of agencies, thus providing flexibility for use across multiple public sector use cases. 

Why This Matters 

There are multiple reasons why the introduction of sovereign AI is important. 

One, Enhanced Privacy. A significant benefit of sovereign AI is that sensitive data (such as personally identifiable data) remains on local infrastructure rather than being exposed to external risk. 

Two, Greater Security. The systems described are less exposed to external threats and thus will provide enhanced security through increased resilience. 

Three, Expanded Control. The majority of today’s governments are highly dependent on global technology platforms for their AI strategies, and therefore, reliance on sovereign AI will allow governments to have increased control and autonomy over their AI strategies. 

Key Applications in the Public Sector 

The framework is applicable in many different industries. In the healthcare industry, these frameworks will help to improve data analytics and prediction systems. In urban planning, these frameworks will aid in the optimisation of traffic and infrastructure. They will also support activities such as disaster relief efforts, law enforcement, and fraud detection. 

The Rise of Decentralized AI 

The many varied applications for these localized AI systems demonstrate the potential for them to be used in many ways. This impact reflects an overarching movement towards decentralised technology systems, as these organizations, as well as governments, have begun to de-emphasize centralised technology development and place funding in technological infrastructures that they could develop and manage. 

Future Implications 

Sovereign AI, being a piece of this transition, aims to offer a singular balance between innovation and governance. While there are many benefits associated with the implementation of a sovereign AI solution, its adoption is a more complex process. 

The development and maintenance of the requisite AI infrastructure require not only significant capital investment, but also a high level of experience and skills and also require long-term planning. Smaller states or agencies may not be able to successfully implement these types of systems without the assistance of external organisations. 

The introduction of a new framework of this type could alter the manner in which other nations develop governance systems for AI. This new framework could provide a standard for building AI systems that emphasise ownership of data and regulatory ownership and control. As AI technology is further developed and widely adopted, it is expected that the need for such frameworks will increase in the future 

Conclusion 

NVIDIA’s Sovereign AI Framework is a significant advancement in public sector technological evolution. The Sovereign AI Framework enables independent and geographically distributed localized AI systems, thus mitigating privacy and data concerns. 

This framework demonstrates a future of governance that not only uses AI but also provides complete control over government data and infrastructure. This framework changes how technology functions within a government organisation.

Source- National Transformation With Sovereign AI 

What Is Sovereign AI?

Apple has introduced the M5, which delivers substantial AI acceleration and upgrades throughout the silicon. Fabricated with third-generation 3-nanometer process technology, the M5 integrates a newly designed 10-core GPU with a dedicated neural accelerator per core. This architecture enables significantly faster GPU execution of AI workloads, achieving over 4X the peak compute throughput of its predecessor, the M4. The GPU also implements advanced graphics capabilities, including third-generation hardware, ray tracing, and increasing graphics performance by up to 45% over M4. The M5 features the world’s fastest CPU core with up to ten CPU cores six for efficiency and up to four for performance delivering up to 15% greater multithreaded performance than the M4. 

 The upgraded sixteen-core neural engine, enhanced media engine, and a 30% increase in unified memory bandwidth (up to 153 GBs) further accelerate demanding tasks. The M5’s efficient, high-performance architecture now powers the new 14-inch MacBook Pro, iPad Pro, Apple Vision Pro, and Apple Vision Pro, all engineered to leverage these advanced capabilities. All three devices are available for pre-order today.  

M5 sets a thrilling new benchmark for AI performance on Apple silicon,” said Johny Srouji, Apple’s senior vice president of Hardware Technologies. “With innovative neural accelerators in the GPU, M5 supercharges AI workloads. It delivers blazing graphics performance, the world’s fastest CPU core, and an even more powerful neural engine. Thanks to soaring unified memory bandwidth, M5 delivers unprecedented performance and groundbreaking capabilities to MacBook Pro, iPad Pro, and Apple Vision Pro.”  

A New GPU Architecture Designed For AI And Graphics 

The M5’s GPU architecture is purpose-built for AI workloads. Its 10-core GPU features a dedicated neural accelerator in each core, delivering 4X the peak GPU compute power of the M4 and over 6X the AI throughput of M1. With M5, MacBook Pro, and iPad Pro, you can execute AI tasks faster, such as running stable diffusion models in Draw Things or processing large language models locally with web AI.  

The M5’s new GPU with enhanced shader cores delivers up to 30% faster graphics performance than M4 and is 2.5 times faster than M1. The third-generation ray tracing engine improves ray-traced graphics by up to 45% in compatible applications. The second-generation dynamic caching in the GPU optimizes memory usage, enabling smoother gameplay, higher-fidelity 3D visuals, and faster rendering for graphics-intensive tasks. On Apple Vision Pro, the M5 powers micro-OLED displays with 10% more pixels and refresh rates up to 120 hertz for enhanced sharpness and reduced motion blur. The GPU architecture is designed to work smoothly with Apple’s software frameworks. Apps that use built-in Apple frameworks and APIs such as Core ML, Metal Performance Shaders, and Metal 4 instantly benefit from improved performance. Developers can also create custom solutions for their apps. They can program the neural accelerators directly with Tensor APIs in Metal 4.  

A Faster Neural Engine To Power Intelligent Features.  

The new 16-core neural engine offers strong AI performance while using very little energy. It works together with the neural accelerators in the CPU and GPU. This makes the M5 chip well-suited for AI tasks. For example, AI features on Apple Vision Pro now run faster and more efficiently. Turning 2D photos into spatial scenes in the Photos app or creating a Persona is also faster. 

The neural engine in M5 also boosts performance for Apple Intelligence. Thanks to these enhancements, on-device AI tools, such as Image Playground, run faster. The improved neural engine and unified memory in M5 help Apple Intelligence models run better overall. In addition, developers using Apple’s Core Models framework will also see faster results.  

Enhanced Memory to Do Even More with AI 

M5 has a unified memory bandwidth of 153 GBs. This is almost 30% more than M4 and over twice as much as M1. This unified memory lets the entire chip access a single large pool of memory. Devices like MacBook Pro, iPad Pro, and Apple Vision Pro can run bigger AI models right on the device. The unified memory also powers the faster CPU, GPU, and neural engine. This delivers better multi-threaded performance in apps, faster graphics in creative apps and games, and improved AI performance when running models on neural accelerators or the neural engine. With 32 GB of memory, M5 lets users run demanding creative programs like Adobe Photoshop and Final Cut Pro simultaneously. They can do this even while uploading large files to the cloud.  

Apple Silicon and the Environment 

Apple 2030 is the company’s plan to become carbon neutral across its entire business by the end of this decade. The goal is to cut product emissions from materials, electricity, and transportation. The energy-efficient M5 chips help the new fourteen-inch MacBook Pro, iPad Pro, and Apple Vision Pro meet Apple’s high energy-efficient standards. It also lowers the total energy these products use over their lifetime.

SourceApple unleashes M5, the next big leap in AI performance for Apple silicon 

Right now, we are shifting from using models AI systems that are strong at specific isolated tasks to reusing agents that can autonomously manage and coordinate more advanced workflows. When you prompt a model, you only engage its pattern-matching intelligence on single-step tasks. But if you give a model the environment and autonomy of an agent, it can perform far more complex operations, such as running services, requesting data from APIs, or generating useful inputs, such as spreadsheets or reports.  

When building agent systems, orchestrating models for complex automated workflows, several technical hurdles arise. For instance, you must determine intermediate storage solutions, prevent embedding large data structures in prompts, enable workflow-level network access while maintaining robust security, and implement timeout and retry mechanisms without developing a custom workflow engine.  

To streamline developer adoption, we engineered core infrastructure to provide the responses API with a controlled computational environment, enabling robust real-world task automation without requiring developers to manage execution contexts.  

OpenAI’s Responses API, together with the shell interface and hosted containerized workspace, addresses these engineering constraints. The model generates stepwise actions and shell commands executed within an isolated environment featuring its own visual file system, optional transactional storage via SQLite, and restricted outbound network connectivity.  

In this post, we’ll explain how we built a computer environment for agents and share some early lessons on using it to speed up and standardize production workflows. The model suggests an action, such as reading files or fetching data with an API; the platform runs it, and the result is used in the next step. To illustrate this in practice, we’ll start looking at the Shell tool, then move on to the container workspace, networking, reusable skills, and context compaction.  

To understand the shell tool, it helps to know how language models and agents interact with tools differently. A model uses tools by suggesting actions such as calling functions or issuing computer commands. During training, the model observes examples of tool use and their effects, learning when and how to suggest a tool. However, a model operating alone can only suggest using a tool; it can’t execute it. An agent, in contrast, provides the environment and orchestration needed to execute the model’s suggestions and complete real tasks.  

The Shell tool significantly extends the model’s capabilities. It provides direct access to a system’s command-line interface, enabling operations such as text searches or API communication. Built atop standard Unix utilities, it provides immediate access to commands such as grep, curl, and awk.  

Compared to our existing code interpreter, which only executes Python, the shell tool enables a much wider range of use cases, such as running Go or Java programs or starting a Node.js server. Such flexibility enables the model to perform complex agentic tasks.  

Orchestrating the Agent Loop 

By itself, a model can only suggest shell commands. But how do these commands actually run? We need an orchestrator to take the model’s output, run the tools, and send the results back to the model in a loop until the task is complete.  

Developers use the Responses API to interact with OpenAI models. When you use custom tools, the Responses API gives control back to the client, so you need your own setup to run the tools. But the API can also automatically manage the connection between the model and hosted tools.  

When the responses API receives a prompt, it assembles the model’s context, including the user prompt, prior dialogue, and tool instructions. For shell execution, the prompt must specify the shell tool, and the model must be trained to suggest shell commands. This applies to GPT 5.2 and later. With this context, the model decides what to do next. If it picks shell execution, it sends one or more shell commands to the responses API. The API sends these commands to the container runtime, streams the shell output back, and includes it in the next request’s context. The model can then review the results, send more commands, or give a final answer. The loop continues until the model finishes without any more shell commands.  

When the responses API initiates shell command execution, it maintains a live stream from the container service. As output becomes available, the API transmits it to the model in near real time, enabling the model to decide whether to wait or further output, initiate a new command, or compose a final output.  

The model can suggest several shell commands at once, and the responses API runs them simultaneously in separate container sessions. Each session streams its output independently, and the API combines these streams into structured tool outputs for context. This means the agent loop can run tasks such as file searches, data fetches, and result checks in parallel.  

For example, if you set a limit of 1,000 characters, the responses API will keep the first 500 and last 500 characters of the command output. The omitted section in the middle will be clearly marked, so you can see both the start and the end with skipped content indicated.  

By running commands in parallel and limiting output size, the agent loop stays fast. It uses context efficiently. This helps the model focus on important results instead of getting lost in long terminal logs.  

When Context Window Gets Full: Compaction 

A challenge with agent loops is that some tasks can take a long time to finish. These long tasks fill up the context window, which is needed to keep track of information across steps and agents. For example, when an agent calls a skill, gets a response, adds tool calls, and writes summaries, the limited context window can fill up fast. To keep important details while removing extraneous information, we added built-in compaction to the responses API. This means developers don’t have to build their own summarization or state systems. The compaction works with how the model is trained.  

Our latest models can review previous dialogue states and generate a compaction item that stores important information in an encrypted token-efficient format. After compaction, the next context window includes this compaction item and the most valuable parts of the earlier window. This lets windows operate uninterruptedly across context windows, even during long multi-step or tool-driven sessions.  

Codex uses this system to handle long programming tasks and repeated tool use without losing quality.  

You can use compaction either as a built-in server feature or through a separate /compact endpoint. With server-side compaction, you set a threshold, and the system manages compaction timing for you, so you don’t need complex client logic. This setup allows a slightly larger input context window, so requests just over the limit can still be processed and compacted rather than rejected. As models improve, the compaction feature updates with each OpenAI release.  

Codex played a key role in building the compaction system by being one of its first users. If a Codex instance encountered a compaction error, we started another instance to investigate the issue. This process helped Codex develop a strong built-in compaction system by solving real problems. Codex’s ability to review and improve itself is a unique part of working at OpenAI. Most tools only require users to learn them, but Codex learns with us.  

Container Context 

Now let’s cover the state and resources. The container is not only a place to run commands, but also the model’s working context. Inside the container, the model can read files, query databases, and access external systems under network policy controls.  

File Systems 

The first part of the container context is the file system. It is used to upload, organize, and manage resources. We created container and file APIs to give the model a clear view of available data. This helps it pick specific file operations rather than running broad scans.  

A common mistake is putting all input directly into the prompt context. As inputs get larger, the prompt becomes more expensive and harder for the model to use. It’s better to store resources in the container file system. Then the model can choose which files to open, read, or change using shell commands. Like most people, models work better when information is organized.  

Databases 

The second part of the container context is databases. We often recommend that developers store structured data in databases like SQLite and query them. Instead of putting a whole spreadsheet into the prompt, you can give the model a description of the tables, including the columns and their meanings, and let it pull only the rows it needs. In this quarter’s sales, the model can query only the relevant rows, rather than scanning the entire spreadsheet. This is faster, cheaper, and more scalable for larger datasets.  

Network Access 

The third part of the container context is network access, an essential part of the agent workload. Agents may need to give live data, call external APIs, or install packages. But giving containers full internet access can be risky. It could expose information to outside websites, accidentally reach sensitive systems, or make it harder to stop leaks and data theft. Use a sidecar egress proxy. All outbound network requests flow through a centralized layer, a policy layer that enforces allowlists and access controls while keeping traffic observable. For train chairs, we use domain skate secret injection at egress. The model and container only see placeholders while raw secret values remain outside the model’s visible context and are applied only to approved destinations. This reduces the risk of leakage while still enabling authenticated external calls.  

Agent Skills 

Shell commands are powerful, but many tasks follow the same multi-step patterns. Agents often have to figure out the workflow each time. This means replanning, reissuing commands, and relearning steps. It can result in inconsistent results and wasted effort. Agent skills package these patterns into reusable building blocks. A skill is a folder bundle. It includes a SKILL.md file with metadata and instructions, along with any necessary resources, such as API specs and UI assets.  

This structure fits well with the runtime setup described earlier. The container gives persistent files and an execution context. The shell tool provides a way to run commands. With both, the model can find skill files using shell commands like ls and cat. It can read instructions and run skill scripts all within the same agent loop.  

We offer APIs to manage skills on the OpenAI platform. Developers upload the end store skill folders as versioned bundles, which can be retrieved later by skill ID before sending a prompt to the model. The responses API loads the skill and adds it to the model context. This process follows the same steps: fetch skill metadata and include the name and descriptions.  

  • Fetch the skill bundle, copy it into the container, and unpack it.  
  • Update the model context, scale metadata, and the container path.  

When deciding if a skill is relevant, the model reviews its instructions step by step. Then it runs its scripts using shell commands in the container.  

How Agents Are Made 

To sum up: the responses API handles orchestration. The shell tool runs actions. The hosted container gives a persistent runtime context. Scales add reusable workflow logic. Compaction lets an agent run for a long time with the context it needs.  

With these building blocks, a single prompt can turn into a full workflow. It can find the right skill, get data, turn it into a structured local state, query it efficiently, and produce lasting results.  

Make Your Own Agent 

For an in-depth example of combining the shell tool and computer environment for end-to-end workflows, see our developer blog post and cookbook. These walk through packaging a skill and executing it through the Responses API.  

We look forward to seeing what developers create with these tools. Language models can do much more than just generate text, images, or audio. We will continue improving our platform to handle complex real-world tasks at scale.

SourceFrom model to agent: Equipping the Responses API with a computer environment 

We’re excited to announce that the new Azure Cobalt 100-based virtual machines (VMs) are now generally available. These VMs use Microsoft’s first 64-bit ARM-based Azure Cobalt 100 CPU, designed entirely in-house. This launch constitutes a major step forward in how we build and improve our cloud infrastructure, with improvements at every level—from hardware to services. By integrating hardware and software, Azure Cobalt 100-based VMs demonstrate our commitment to delivering the right balance of performance, power efficiency, and scale for our customers.  

The Cobalt 100-based VMs include our new general-purpose DPSv6 series and DPSLV6 series, as well as the memory-optimized EPSV6 series. They deliver up to 50% better price-to-performance than our previous ARM-based VMs. This makes them a great choice for many cloud-native Linux workloads, such as data analytics, web applications, servers, open-source databases, caches, and more.  

Azure Cobalt 100-based VMs offer up to 1.4x better CPU performance and 1.5x better Java performance, with twice the web.net and cache app performance compared to prior ARM-based VMs. They also provide up to 4x local storage IOPS and 1.5x bandwidth.  

The new VMs are now available in many regions, including Canada Central, Central US, East US 2, East US, Germany West Central, Japan East, Mexico Central, North Europe, South East Asia, Sweden Central, Switzerland North, UAE North, West Europe, and West US. We plan to add more regions in 2024 and beyond, such as Australia East, Brazil South, France Central, India Central, South Central, US, UK South, and West US. 3 and the West US. Microsoft Teams is serving its growing customer base more efficiently, attaining up to 45% better performance on Cobalt 100-based VMs.  

We also offer Cobalt 100-based VMs to independent software vendors providing PaaS and SaaS on Azure.  

The Journey to ARM: Adopting Innovation and Customer Benefits 

Microsoft has a long-standing history of working with ARM architecture and technology. This experience helped us develop key industry standards to prepare for data center scale computing. We are also partnering with others to launch initiatives such as Silver Ready and System Ready, earning industry recognition. Our move to ARM-based VMs stems from our goal to deliver better price-performance and power efficiency. The Cobalt 100-based VMs reflect this goal by delivering strong performance and cost savings for our customers. STEM for ARM has continued to thrive and has seen tremendous progress over the last couple of years. Major developer platforms and languages, such as C++, .NET, and Java, offer ARM-native versions. We have invested in ARM-specific optimizations for each of these platforms and languages, enabling us to fully leverage the capabilities of the ARM architecture.  

Many popular infrastructure and deployment tools now support Arm natively. GitHub Actions, which many developers use for continuous integration and delivery, is now available for Arm in two ways: self-hosted runners that run on an Arm VM or local Arm hardware, and GitHub-hosted runners.  

Containers are a popular method for application deployment due to their support for workflow streamlining, isolation, security, resource efficiency, portability, and reproducibility. Microsoft Azure Kubernetes Service (AKS) extends the ARM ecosystem by enabling users to create ARM agent nodes and supporting mixed deployments of both X86 and ARM nodes within the same cluster, emphasizing ARM’s flexibility.  

Specifications 

You can choose from several Azure virtual machines with three different memory ratios for each vCPU size. This gives you the flexibility to choose the setup that best fits your CPU and memory needs. All VM series are available with or without local disks, so you can choose the option that best suits your workload. The Dpsv6 series and the dpdsv6 series offer up to 96 vCPUs, 384 GB of RAM, and a 4:1 memory-to-vCPU ratio. They suit scale-out workloads, databases, applications, and web servers, and ARM-based development tasks. The Dplsv6 and dpldsv6 series VMs have up to 96 vCPUs and 192 GiB of RAM (2:1 ratio) and are suited for media coding, encoding, small databases, gaming servers, and lighter workloads. The Epsv6 and epdsv6 series offer up to 96 vCPUs and 72 GiB of RAM (8:1 ratio) for memory-intensive workloads such as large databases and data analytics.  

The new VMs support all remote disk types, including standard SSD, HDD, premium SSD, and ultra disks. For more on disk types and locations, see Azure Managed Disk Types. Disk storage is billed separately. Deploy VMs via portal, SDKs, APIs, PowerShell, or CLI.  

You can learn more about the new Azure Cobalt 100-based VMs by reading the documentation. Embrace this new breakthrough and unlock new possibilities for innovation, performance, and cloud transformation with Azure. 

SourceAzure Cobalt 100-based Virtual Machines are now generally available 

As developers create more advanced AI applications, they regularly face situations where large amounts of context such as long documents, detailed instructions, or code bases must be repeatedly sent to the model. While this information helps models respond accurately, it can also increase costs and slow performance because the same data is processed repeatedly.  

Vertex AI context caching, launched by Google Cloud in 2024, addresses this. Since then, we’ve continued to improve Gemini for greater speed and cost-effectiveness. Caching saves and reuses pre-computed input tokens. Benefits include:  

  • Significant cost reduction. Cached tokens on supported Gemini 2.5 and newer models cost just 10% of the standard input token rate. Implicit caching applies this saving automatically to recognized repeated inputs. Explicit caching lets you intentionally reuse inputs, guaranteeing the discount and predictable savings.  
  • Latency caching reduces latency by retrieving previously computed content rather than recomputing it.  

With these core benefits outlined, let’s now explore in greater detail how context caching operates and how you can take full advantage of it in your applications.  

What is Vertex AI Context Caching? 

Vertex AI context caching stores tokens for repeated content. There are two types of caching available:  

  • Implicit caching: enabled by default, saving money on cache hits without API changes. Vertex AI uses prior request states to inform future prompts, reducing costs. The cache deletes within 24 hours, depending on system load and reuse.  
  • Explicit caching: gives you control over what to cache and lets you reference it as needed. You receive a guaranteed discount for explicit caching, allowing for predictable savings.  

To support different prompt sizes and use cases, caching can range from 2,048 tokens to the entire model context window. For Gemini 2.5 Pro, that’s over a million tokens. You can cache any content supported by Gemini multimodal models, including text, PDFs, images, audio, and video. For example, you might cache a large amount, a large document, an audio file, or a video. See the list of supported models here.  

Both types of caching work with global and regional endpoints, so you get the benefits no matter how you use Gemini. Implicit caching is also integrated with provisioned throughput to support production-level traffic. For extra security and compliance, you can encrypt and explicit caches using customer-managed encryption keys (CMEKs).  

Ideal Use Cases for Context Casing 

Large-scale document processing: cache lengthy contracts or regulatory documents to enable repeated queries for clauses or compliance checks.  

  • A financial analyst using Gemini can upload and cache documents, such as annual reports, to enable repeated querying or analysis without re-uploading the files each time. The explicit cache can be cleared manually, while the implicit cache clears automatically.  
  • For customer support chatbots, cache detailed persona instructions and comprehensive product information. This ensures that the chatbot delivers consistent, accurate, and relevant responses to customer inquiries without reprocessing instructions or data for every new session.  
  • For example, a customer support chatbot can use cached instructions and information. Instead of reprocessing these for each conversation, compute them once and let the chatbot reference them, resulting in faster responses and lower costs.  
  • For code use cases, you can cache your entire codebase or frequently used code files. This supports more efficient code Q&A, enables faster auto-complete, accelerates bug fixing, and improves the pace of feature development by reducing redundant processing.  
  • For enterprise knowledge bases, organizations can cache large collections of internal documentation, wikis, or manuals. This helps employees quickly access essential process details or compliance information without re-entering data, streamlining daily workflows.  

Cost Consequences for Implicit and Explicit Caching 

Implicit caching: enabled by default for all Google Cloud projects. On repeated content with a cache hit, you automatically get a discount. Standard input tokens are billed with no extra storage charge.  

Explicit caching:  

  • Cached token count: Creating a cached content object incurs a one-time fee at the standard input token rate. Later, each generate content request using this cache is billed at a 90% discount.  
  • Storage duration (TTL): You are also billed for the time the cached content is stored, based on its time-to-live (TTL), which determines how many hours the data remains available before it is deleted. The charge is an hourly rate per million tokens, prorated down to the minute.  

Title Cache, Best Practices, and How to Optimize Cache Hit Rate 

  • Check the limitations. Make sure you meet the caching requirements, such as the minimum cache size and supported models.  
  • Granularity: Place repeated or cached context at the start of the prompt. Avoid caching frequently changing small pieces.  
  • Monitor usage and costs: Check Google Cloud billing to see the caching’s impact. To see token count, check cachedcontenttokencount in usagemetadat.a.  
  • Frequency: Implicit caches are cleared within 24 hours or sooner. Making repeated requests quickly keeps the cache available.  

For explicit caching specifically:  

  • TTL management: Choose the time-to-live (TTL) setting carefully. A longer TTL means higher storage costs, but less need to create the cache. Balance this based on how long your context is useful and how often it is used.  

Get Started: 

Context caching can greatly improve the efficiency and cost-effectiveness of your AI applications. By using this feature, you can reduce redundant token processing, achieve faster response times, and build more scalable, cost-effective generative AI solutions.  

Implicit caching is turned on by default for all GCP projects, so you can start using it right away.  

To get started with explicit caching, check out our documentation, which includes sample code to create your first cache and a Colab notebook with common examples and code.

SourceSave costs and decrease latency while using Gemini with Vertex AI context caching 

The Kuiper project, one of Amazon’s latest satellite initiatives, is also a significant factor in the company’s plan to develop commercial services for businesses and government. Kuiper will provide low-latency, high-speed internet service to business customers and public entities in the U.S., especially in rural and underserved areas where access to broadband networks capable of supporting high-capacity connections is limited.  

Since Amazon began to develop its Kuiper satellite constellation, an effort that represents one of the largest private-sector efforts to build a large-scale LEO satellite constellation, it has shifted to focus on enterprise and AI-driven connectivity, rather than just consumer broadband. An example of this shift was Amazon’s announcement last month of the launch of its first five satellites in the Kuiper constellation and plans to launch 19 additional satellites in the future.  

Building a Space-Based Internet Backbone  

Kuiper will use a system of many satellites to continuously connect the globe, rather than a single stationary satellite in geostationary orbit, providing better service and lower latency than today’s traditional satellite services.  

Kuiper’s performance model is geared towards the needs of today’s businesses: real-time access to their data, integration with cloud computing, and the use of AI to make decisions, all of which require a reliable, high-speed internet connection. Kuiper’s design will enable low-latency, more reliable networks for businesses operating in areas with limited infrastructure.  

Amazon has stated that this is a business-focused network to provide a better solution than the traditional consumer broadband offerings.  

Expanding Enterprise Connectivity Across the US  

Satellite internet has primarily been limited to remote areas until now; however, Project Kuiper is positioning itself as a provider of connectivity solutions for enterprises across the US. This includes multiple industries such as logistics, energy and agriculture, defense, and disaster response.  

Many enterprises have redundant systems in place to ensure their business operations continue in the event of issues with their existing network or disruptions, because outages and interruptions occur frequently and unexpectedly. Kuiper offers businesses a backup or primary source of connectivity through its satellite-based architecture, enabling connectivity for businesses operating in remote, geographically isolated, and/or infrastructure-deficient locations.  

The trend toward hybrid telecom networks is becoming the standard model for large-scale networks, in which fiber, wireless, and satellite-based connectivity systems will connect end users.  

Competing in the LEO Satellite Race  

In a very competitive new industry that already has successful players like SpaceX’s Starlink network, Amazon is entering the race to deploy LEO (low Earth orbit) satellite constellations. The push for increased global broadband and more reliable internet connections continues to grow exponentially.  

LEO satellite constellations are the first step toward solving this problem, but building them comes with high costs; for example, an LEO constellation requires high upfront costs to set up manufacturing and launch logistics, and then the cost to cover all ground infrastructure will be very high. Once the system is operational, they are relatively less expensive to implement, since the satellites can scale and reach areas that would normally be too expensive to build fiber networks.  

Amazon is placing significant emphasis on integrating the Project Kuiper system into its existing cloud infrastructure. This could give Amazon an advantage in providing enterprise services to its current customers who already use its cloud technologies.  

The Role of Satellite Internet in the AI Era  

Increasingly, the need for low-latency/high-bandwidth networks is driven by the growing use of artificial intelligence. AI systems require continuous data communication across multiple devices, edge systems, and cloud computing centers.  

In this environment, satellite internet is rapidly emerging as an important infrastructure element within this ecosystem. For sectors currently using AI at scale (e.g., autonomous logistics, remote sensing, and smart agricultural products), connectivity issues can significantly reduce overall performance.  

Project Kuiper directly addresses these issues by ensuring consistent connectivity where traditional networks either fail or do not exist; this is especially important for AI-based organizations that cannot afford any downtime or data delays.  

Integration with Cloud and Edge Computing  

The primary benefit of Kuipers is its easy integration with AWS (Amazon Web Services) and its cloud computing services. Connecting satellite communications directly to a cloud-based infrastructure enables Amazon to provide an end-to-end solution that integrates data collection, transmission, storage, processing, and analysis with AI.  

Amazon can also leverage the trend toward edge computing and use its satellites as intermediary nodes to transmit data from remote sensors, vehicles, or industrial systems directly into cloud-based AI models, rather than processing all the data centrally via traditional cloud servers.  

This type of integration is critical for applications such as disaster monitoring, defense communication, and manufacturing process automation that require immediate feedback from multiple data sources within milliseconds of occurrence.  

Infrastructure Challenges and Deployment Scale  

Though it has great potential, Project Kuiper faces many difficult engineering and logistical challenges. Putting a complete satellite constellation into operation requires launching many rockets simultaneously, each satellite reaching the correct orbit, and a robust ground station network.  

One of the major hurdles to manufacturing satellite systems on a large scale is ensuring an efficient, well-streamlined production process that can produce large quantities of advanced satellite systems while maintaining consistency and controlling costs.  

Amazon has made substantial investments in creating manufacturing facilities and forming rocket launch partnerships to propel satellite deployment times, but we won’t see a complete global footprint for many years.  

Regulatory and Spectrum Considerations  

The expansion of satellite Internet largely depends on the satellite regulatory authority (RA) related to satellites, such as spectrum allocation, orbital slots, and frequencies, to minimize interference among competing satellite networks.    

As more companies enter the LEO market, international coordination becomes increasingly complex. Regulators are also paying closer attention to issues such as space debris, orbital congestion, and long-term sustainability of satellite constellations.  

Satellite Internet, Inc.’s (SII) success will be determined in part by how it navigates the regulatory landscape as it grows its business.  

Economic and Industry Implications  

If Project Kuiper is successful, it will have far-reaching effects on telecommunications and enterprise technology markets. New models of connectivity may reduce the need for traditional fiber infrastructure in many parts of the world and provide greater agility for businesses.  

With Project Kuiper, enterprises can expect increased redundancy, improved uptime, and additional access to remote locations for operations. There is also potential for new competitors to enter the telecommunications market and change how pricing and service are structured.  

Finally, by entering the satellite broadband market, Amazon demonstrates that cloud computing, artificial intelligence (AI) infrastructure, and global connectivity are converging into a single technology stack.  

The Future of Satellite-Powered Connectivity  

With the increasing prevalence of AI-enabled applications that require connectivity as much as, or more than, computational capability, Project Kuiper aims to help ensure that how we access networks keep pace with the development of data-intensive technologies driven by AI.  

Satellite constellations will increasingly become an essential means of providing a global digital infrastructure over the next few years, creating a seamless link between disparate urban and rural areas while enabling the deployment of new AI-based workloads.  

As such, Project Kuiper has the potential to be an essential part of the infrastructure for building what will undoubtedly be one of the most significant new economies based upon AI, with applications for enterprise logistics, autonomous systems, and global cloud services.  

Conclusion: A New Layer of Digital Infrastructure  

The expansion of the Project Kuiper network indicates a new way of thinking about connecting with one another in an era dominated by AI. Instead of being just a backup option for remote areas, satellite internet is becoming a primary component of the infrastructure that enterprises and cloud-based systems rely on.  

From Amazon’s perspective, this is both a technological and strategic gamble: it believes that the future of connecting with each other will be through space, will utilize artificial intelligence in the connection, and will become entrenched in operating global enterprises.  

As Kuiper deployment proceeds, it has the potential to transform how businesses connect, compute, and expand in a world that increasingly depends on artificial intelligence. 

Source: Amazon Leo mission updates: Amazon Leo completes ninth mission, two more on deck 

According to recent updates to both Google’s AdSense policy and platform guidance regarding a major evolution in its advertising ecosystem due to rapid changes in how search delivers results, the search engine has been evolving from traditional “link-based” search results to “AI-generated” search responses. Google is preparing for the continued transformation of digital advertising in which conversational AI will replace the traditional “ten blue links” model of digital advertising.  

With generative AI playing a major role in API-based search experiences, Google is working to position its ad-serving infrastructure as being closely integrated with AI-based search results. This work will include adapting how ad serving locations are determined, refining the monetization model, and rethinking the interpretation of user intent when they provide search requests, distinguishing between synthesized and listed answers.  

From Search Links to AI Answers  

Keyword-targeted ads have dominated search marketing for a long time. They are triggered whenever a user searches by entering keywords and provide a ranked list of relevant websites and pay-per-click ads. Advertisers bid on keywords to have their ad displayed to a user based on the keywords used on a search engine results page.  

With the introduction of AI-driven search functions, users who make keyword searches are now more frequently getting AI-generated answers from the system instead of multiple clickable websites. This change to an AI-generated search results model has dramatically reduced the number of traditional click-through opportunities a user has throughout the search process.  

Google is having to address how to support successful monetization for advertisers in a world where users receive only a single, synthesized AI-generated answer to their search queries.  

Rebuilding AdSense for an AI-First Web  

AdSense has historically been based mostly on the placement of contextual ads in print media and on websites. Today, Google is looking for new ways to apply these principles in the context of artificial intelligence.  

Some examples of potential implementations include placing ads within AI-generated answers, using contextual sponsorship overlays, and developing new formats specifically for conversational interfaces such as chatbots. Therefore, whereas traditional advertising focused on placing ads next to static pages, advertisers must now identify and match relevant ads to user intent through dynamic, ongoing conversations.  

Accordingly, Google has released updated guidelines stating that it will work toward offering more adaptable monetization options that work well with generative environments while also preserving a positive user experience.  

The Challenge of Monetizing AI Search  

Users increasingly find their searches involve fewer traditional link listings, which may impede advertiser visibility and make user journeys even harder to predict. Traditional search has provided many touchpoints for displaying ads via multiple links as the users scroll through listings.  

As AI continues to generate responses and display them in search results, advertisers could be left with fewer opportunities to gain exposure if only one answer is displayed for each search query. To solve this problem, companies like Google are forced to rethink how they derive value from the ads and information displayed at the end of searches.  

One approach that has recently been discussed by several companies is embedding sponsored content directly into AI-generated responses to address reduced exposure opportunities; however, this will raise concerns about transparency, trust, and regulation for users.  

Intent-Based Advertising in the AI Era  

A major benefit of AI-powered searches is their ability to better understand a person’s intent than previous methods by leveraging data from prior inquiries. AI’s ability to analyze the context, follow-up questions, and conversational nuances of an individual’s online behavior enables it to provide a more accurate understanding of their desires.  

Using real-time data about an individual’s intent will enable advertisers to develop a more sophisticated approach toward advertising to their target audience. For example, if an individual searches for travel-planning information across multiple searches, relevant ads may be served to that individual at each stage of the planning process, such as flights, hotels, or travel insurance.  

As such, Google is expected to use this intent data to deliver a more customized experience for advertisers and users alike in online advertising.  

Risks of Over-Monetization in AI Systems  

Artificial intelligence-based advertising creates new possibilities and returns; however, it also creates risks associated with those possibilities, e.g., AI-generated ads. One of those risks is that the organic use of AI can be used to create ads, blurring the lines between “informational” and “advertising.”  

If a user can’t easily tell whether a response from an AI is from a “neutral” or “sponsored” source, there is potential for a loss of trust in AI-generated responses. Thus, countries around the globe are now reviewing how AI technologies generate or provide information about the commercial intent behind their creation.  

A pivotal activity for companies developing monetization models for AI-driven search will be ensuring transparency in their monetization framework.  

Impact on Publishers and the Open Web  

AI-generated results can alter the revenue model and affect publishers reliant on surfacing their content through search. If search engine users receive a direct response within the search interface, there will likely be fewer clicks to external websites.  

This could reduce referral traffic to these sites and, in turn, decrease advertising revenue for many independent publishers, blogs, and news organizations. To counteract this potential loss of revenue, many publishers are calling for clearer compensation models for when their established content has been used to train or support learning AI systems.  

The evolution of Google’s ad platform will be critical in determining how value is allocated across the digital content economy.  

New Ad Formats for Conversational Search  

To address these challenges, the next generation of search advertising is expected to include formats specifically designed for conversational interfaces. These may include:  

  • Contextual sponsored suggestions within AI responses  
  • Product recommendations embedded in chat-like search flows  
  • Dynamic ad insertion based on conversation stage  
  • Interactive ads that respond to user follow-ups  

These formats aim to preserve advertising effectiveness while adapting to a less structured user experience.  

Competition in AI Search Monetization  

This movement away from traditional search and advertising towards an AI-enhanced model is not unique to Google. There are a number of other major tech companies in the marketplace investing in and testing their own versions of AI-based search and advertising, and therefore, there is intense competition among these companies to establish what constitutes the ‘standard’ for AI-enhanced search and advertising.  

The results of this competition will likely have a significant impact on the way users interact with information and purchase goods online. Companies that can successfully incorporate monetization into AI products without compromising on user experience should stand to experience superior long-term benefits.  

The Road Ahead for AI-Powered Advertising  

With AI as a primary means of accessing information, advertising will increasingly rely on platforms that understand contextual nuances, user intent, and dialogue progression, rather than solely on standard keywords.  

For Google, this presents an opportunity and a challenge: to maintain its leading role in digital advertising while also updating the fundamental technologies it relies on to deliver services to customers on a very different Internet platform.  

The success of this transformation will determine how effectively AI-driven search can sustain the economic model that has powered the internet for decades.  

Conclusion: Reinventing the Economics of Search  

The transition to an AI-first search experience is driving a fundamental overhaul of the digital advertising landscape. Google is taking the lead in this shifted paradigm by reworking its AdSense platform to become central to the entire change process.  

In this way, AI answers overtake traditional search results. How advertising can be placed in a conversational environment without sacrificing consumer trust or user experience will determine how monetization proceeds moving forward.  

The next phase of search will be more than just providing better answers; it will reimagine how information itself provides funding.

Source: Announcements 

Firefly Aerospace has integrated NVIDIA’s Jetson AI platform into its lunar mission systems. By moving decision-making away from traditional Earth-based methods for on-orbit spacecraft, this integration serves as an essential capability for autonomous space transportation. Onboard AI will leverage real-time image processing via the Jetson AI platform to enable intelligent decision-making on lunar missions, eliminating communication delays that compromise the time-critical nature of many missions. 

The onboard AI will be used through Firefly’s lunar imaging and data services, enabling it to process and analyze data as it is generated before human operators are involved. By providing onboard computing capabilities to enable edge AI to drive and develop autonomous space infrastructures, Firefly is at the forefront of this transformational change in the aerospace industry.  

Bringing AI to the Edge of Space  

For decades, most of the data processing for space missions has been performed on the ground using Earth-based resources. The raw data obtained from space is then transmitted back to Earth for processing before being used to convey commands to the spacecraft. This creates time delays of several seconds to minutes, depending on orbit distance and mission design.  

The addition of NVIDIA Jetson technology to Firefly’s lunar systems transforms much of the ground-based data processing into onboard operations. This allows near-real-time interpretation of captured data, such as image processing, real-time terrain mapping, and immediate adjustments to the spacecraft without waiting for data from Earth.  

This shift is extremely valuable for missions to the Moon because of rapidly changing surface conditions and the limited time available to establish a two-way communication link with Earth; therefore, it is imperative that lunar missions be performed autonomously.  

What Edge AI Changes for Space Missions  

Edge AI refers to computers that analyze their own data on-site rather than sending it to a central location for processing. In space exploration, spacecraft will be able to perform independent sensor analysis and respond immediately without needing to send all data back to Earth for processing.  

The Firefly lunar camera service demonstrates how Edge AI will improve space mission operations. The camera could assist in detecting hazards, analyzing surface features, and optimizing landing paths, among other tasks. Edge AI enables data savings by transmitting processed insights to Earth rather than sending complete sensor data. 

These efficiencies enable improved communication with deep-space vehicles, as the number of communication pathways far exceeds the number of available channels.  

NVIDIA Jetson’s Role in Space Computing  

Jetson by NVIDIA has been created specifically for power-efficient, compact, high-performance AI computing. Primarily designed for use in drones, robots, and autonomous machines operating in Earth environments, their inherent ability to manage heavy-duty AI workloads while conserving power has led to an increasing number of applications in the aerospace and defense industries.  

Firefly’s design relies on the Jetson platform because it enables the spacecraft to run onboard machine learning models that perform image classification, object identification, and environmental analysis.  

This capability is extremely beneficial for lunar missions because the three largest engineering challenges will be power consumption, weight, and reliability.  

Firefly’s Push Toward Autonomous Lunar Systems  

Firefly Aerospace is among a host of expanding private-sector space companies creating commercial infrastructure on the surface of the Moon and is devoted to using AI as part of a trend across the industry toward the development of autonomous systems in support of space exploration.  

As the increasing complexity of space exploration missions requires real-time navigation and control during surface operations or extended periods in orbit, autonomous systems with AI will be necessary to scale space exploration efforts.  

Firefly is embedding intelligence directly into spacecraft to reduce reliance on ground stations for mission design flexibility.  

Why Real-Time Processing Matters in Lunar Missions  

Mission planners are tasked with developing and executing mission plans that will be accomplished as quickly as possible while adhering to very tight time constraints and significant communication delays. A small amount of latency in decision-making can have major adverse effects on landing accuracy, data quality, and the overall mission’s safety.  

Spacecraft use their onboard AI system to handle unexpected events, which include surface anomalies and navigation errors. The technology will enhance mission success rates by enabling autonomous landers and orbiters to conduct mapping and reconnaissance activities. 

Onboard AI enables dynamic planning of mission objectives; i.e., spacecraft can modify their objectives based on new data obtained in real time, rather than following pre-programmed objectives.  

A Step Toward Fully Autonomous Space Infrastructure  

Spacecraft that utilize AI computing platforms such as Jetson are poised to provide a fundamentally new perspective for fully autonomous spacecraft. In addition to collecting and gathering data through an onboard system, future space vehicles will also be capable of autonomously interpreting, making decisions based on, and acting on that data.  

This transition to an increasingly autonomous spacecraft architecture supports the long-term objectives of an ongoing and permanent presence on the Moon and, ultimately, deep-space travel, both of which are impeded by the inability to effectively communicate in real time with human operators due to the timing of interplanetary communications.  

As AI integration into advanced space technology advances, onboard intelligence will become a standard feature of all advanced spacecraft systems.  

Industry Implications and Future Applications  

With their joint efforts, Firefly Aerospace and NVIDIA could create a model that may inspire many other aerospace companies to evaluate incorporating edge AI into their operations. Future examples of this type of system could be found in Mars missions, asteroid exploration, and Earth-observing satellites.  

Beyond exploration, real-time AI capabilities in orbit could also support commercial satellite services such as environmental monitoring and disaster detection and enable global imaging systems.  

The success of these types of integrations will ultimately be a significant contributor to the timeline of when autonomous spacecraft will become a major component of commercial spaceflight.  

Conclusion: Intelligence Moves Beyond Earth  

Firefly has successfully combined NVIDIA Jetson capabilities with its lunar exploration system, enabling autonomous exploration through real-time AI processing while reducing Earth’s ground control requirements. 

With increasingly advanced missions planned for our solar system, onboard intelligence will begin to rival propulsion and navigation systems in importance, completely transforming our method of planetary exploration beyond Earth.

Source: Firefly Aerospace Enables On-Orbit Processing for Moon Imaging Service with NVIDIA Jetson 

At 12:56 PM CDT on Monday, four Artemis II astronauts set a new space flight record. They reached 248,655 miles from Earth, surpassing Apollo 13’s distance in 1970. Orion will peak at about 252,756 miles before returning. This marks a milestone in human space travel.  

Six days into the first crewed Artemis mission, astronauts Wiseman, Glover, Koch, and Hansen continued taking photos of the moon as they traveled farther from Earth.  

At NASA, we dare to reach higher, explore farther, and achieve the impossible. That’s embodied perfectly by our Artemis II astronauts, Reid, Victor, Christina, and Jeremy. They are charting uncharted territories for all humanity, said Dr. Glaze, Acting Associate Administrator, regarding the Exploration Systems Development Mission Directorate at NASA Headquarters in Washington. Their commitment is about more than breaking records. It’s powering our hope for a bold future. Their mission is to carry out our pledge to return to the moon’s surface, this time to stay, as we build a moon base. Following their historic achievement, NASA’s Orion spacecraft launched on April 1 using the Space Launch System rocket from Kennedy Space Center. The next day, Orion performed engine burns to leave Earth’s orbit and head to the Moon. 

After setting the new record, the crew shared brief, heartfelt remarks. Canadian astronaut Jeremy Hansen spoke to the world from aboard Orion.  

From the cabin of Integrity here, as we surpass the furthest distance humans have ever traveled from planet Earth, we do so in honoring the extraordinary efforts and feats of our predecessors in manned space exploration. We will continue our journey even further into space before Mother Earth succeeds in dragging us back to everything that we hold dear. Most importantly, we chose this moment to challenge this generation and the next. Make sure this record is not long-lived.  

Along with breaking the spaceflight record, the crew suggested naming two lunar craters, one after their spacecraft, Integrity, and another honoring Wiseman’s late wife, Carol. After the mission, these names will be submitted to the International Astronomical Union, which oversees the naming of astronomical features. Looking ahead, the crew will pass within about 4,067 miles of the moon’s surface. They will be the first to see some areas on the far side and witness a solar eclipse. During this phase, NASA expects a 40-minute planned communications blackout as the moon blocks signals between Orion and Earth. Once Orion emerges, it should quickly reconnect with flight controllers in Houston. 

During the lunar flyby, many cameras will capture images of the Moon, including areas never seen before. The astronauts will use different digital handheld cameras to take high-resolution photos of the lunar surface. Artemis II gives the crew a chance to collect valuable data. Direct observations are a powerful tool for studying the Moon’s features in varying lighting and texture. 

Photos, videos, telemetry, and communication data from this test flight will guide future Artemis missions as NASA works to build its moon base.  

The Artemis II astronauts have passed the halfway point of their mission. The crew is expected to splash down off the coast of San Diego at about 8:07 PM EDT (5:07 PM PDT) on Friday, April 10. After the splashdown, recovery teams will retrieve the crew by helicopter. They will then transport the astronauts to the USS John P. Murtha. The astronauts will receive post-flight medical exams in the ship’s infirmary. Later, they will return to shore and board the aircraft for NASA Johnson.  

Through the Artemis initiative, NASA aims to return humans to the moon and establish a sustainable presence there. The program will send astronauts on progressively challenging missions to explore more of the lunar surface. Artemis missions will advance scientific research, foster economic development, and help lay the foundation for the first crewed missions to Mars.  

For the latest mission progress, visit https://www.nasa.gov/artemis-II.

SourceNASA’s Artemis II Crew Eclipses Record for Farthest Human Spaceflight 

Google Search is committed to helping people find useful information. That’s why we’re introducing the Helpful Content Guide. This update is part of our ongoing work to show more original, helpful content created by people for people in search results. Here’s what you need to know about the update and what content creators should keep in mind.  

Focus on People First Content 

The helpful content update is designed to reward content that gives visitors a satisfying experience. Content that doesn’t meet visitors’ expectations may not perform as well.  

To help your content perform well with this update, create for people, not just search engines. Focus on creating content that is helpful and satisfying while using SEO best practices to add value. Answering yes to any of these questions means you’re likely taking a people-first approach.  

  • Do you have an existing or intended audience for your business or site that would find the content useful if they came directly to you?  
  • Does your content clearly demonstrate first-hand expertise and a depth of knowledge? (For example, expertise that comes from actually having used a product or service or visited a place).   
  • Does Your Site Have a Primary Purpose or Focus?  
  • After reading your content, will someone leave feeling they have learned enough about a topic to help achieve their goal?  
  • Will someone who reads your content leave feeling satisfied?  
  • Are you keeping our guidance on core updates and product reviews in mind?  

Avoid creating content for search engines first. 

We recommend following SEO best practices outlined in Google’s SEO Guide. SEO works best when it supports people-first content, while content made mainly for search engines often leaves searchers unsatisfied.  

How can you avoid focusing on search engines first? If you answer yes to some or all of the following questions, it may be time to rethink your content strategy:  

  • Is the content primarily to attract people from search engines rather than made for humans?  
  • Are you producing lots of content on various topics in the hopes that some of it might perform well in search results?  
  • Are you using extensive automation to produce content on many topics?  
  • Are you mainly summarizing what others have to say without adding much value?  
  • Are you writing about things simply because they seem trending and not because you’d write about them otherwise for your existing audience?  
  • Does your content leave readers feeling like they need to search again to get better information from other sources?  
  • Are you writing to a particular word count because you read or heard that Google has a preferred word count? (No, we don’t.)  
  • Did you decide to enter some niche, uh, topic area without any real expertise, but instead mainly because you thought you’d get search traffic?  
  • Does your content promise to answer a question that actually has no answer? Such as suggesting a release for a product, movie, or TV show when no release has been committed?  

How the Update Works 

The update will begin rolling out next week. We’ll update our Google ranking updates page when it starts and when it’s finished, which may take up to two weeks. This update adds a new site-wide signal that we use along with other signals to rank web pages. Our systems automatically look for content that has little value, no added value, or isn’t very helpful to searchers.  

Any content on sites with a lot of unhelpful material is less likely to rank well in search, especially if better content is available elsewhere. Removing unhelpful content can improve your pages’ rankings. 

You might wonder how long it takes for a site to improve after removing unhelpful content. Sites affected by this update may have the signal applied for several months. Over time, our system runs continuously, checking both new and existing sites. Once it sees that unhelpful content hasn’t come back over time, the classification will be removed. The classification process is entirely automated using a machine learning model. It is not a manual or spam action. Instead, it is just another signal, one of many that Google evaluates to rank content.  

People-first content on sites with unhelpful content can still rank well if other signals show it’s helpful and relevant. The signal is weighted. – Sites with more unhelpful content may see a bigger impact. For best results, remove unhelpful content and follow our guidelines.  

This update will affect English searches worldwide, and we plan to add support for other languages later. In the coming months, we’ll keep improving how our system detects unhelpful content and will work to better reward people-first content. We appreciate everyone who’s submitted feedback. We received enough reports for this specific update, and the feedback form is now closed. However, for historical accuracy, we left the link in the blog post.  

If you have any feedback about this update, you can comment on this thread in our help forum. If you’d like to give us feedback on this update, you can comment in our help forum thread. If your response concerns your own site, use the feedback form for this update. Your input helps our engineers improve our systems.

SourceWhat creators should know about Google’s August 2022 helpful content update