HK Businesswire
WEKA and Oracle Cloud Infrastructure Validate 10x Throughput Gains for Long-Context AI Inference

By PR Newswire
9 June 2026

Joint benchmarks on OCI H100 infrastructure showed 10x more concurrent users, 10x higher token throughput, and 7x more tokens served without adding GPUs

CAMPBELL, Calif., June 10, 2026 /PRNewswire/ — WEKA, the AI data and memory infrastructure company, today announced production-scale benchmarks that show how organizations can improve the economics of long-context AI inference by serving more users and tokens on the same GPU footprint. The benchmarks show that WEKA’s NeuralMesh™ platform with Augmented Memory Grid™ on Oracle Cloud Infrastructure (OCI) serves 10x more concurrent users, delivers 10x higher token throughput, and produces 7x more tokens per GPU than DRAM-only configurations without adding infrastructure. The results were validated on a nine-node OCI bare-metal H100 cluster with 100,000-token context windows.


“Enterprise AI workloads are pushing context windows and GPU utilization to new limits,” said Pablo Selem, senior director, software development, Oracle Cloud Infrastructure. “These benchmarks show how WEKA’s NeuralMesh platform with Augmented Memory Grid on OCI helps remove memory bottlenecks so customers can support larger, more demanding inference workloads without simply adding more GPUs.”

Three Outcomes That Change the Math on Inference
Validated at production scale on a bare-metal H100 cluster (nine nodes, 72 GPUs, 100,000-token context windows, thousands of concurrent users), NeuralMesh with Augmented Memory Grid on OCI delivered:

  • 10x more concurrent users served, without adding infrastructure. NeuralMesh with Augmented Memory Grid scaled past 5,000 concurrent users vs. about 600 for DRAM-only configurations. Expanding the active cache working set from 8.64 TiB of DRAM to 287 TiB of usable NVMe eliminates the failure cliff that hits when the cache saturates. More users per GPU also means the same investment stretches further.
  • 10x higher token throughput. More output from every GPU in the cluster. On OCI, NeuralMesh with Augmented Memory Grid reached approximately two million tokens per second, compared to under 200,000 for the DRAM-only baseline. For product teams running real-time AI features, including search, summarization, code assist, and multi-turn agents, throughput determines the ceiling for how many users can be served, how fast features respond, and how much revenue the infrastructure can support.
  • 7x more tokens served. Lower cost per token at scale. NeuralMesh with Augmented Memory Grid served five billion tokens, compared to 700 million for the DRAM-only baseline, in a single one-hour, 2,400-user test. For organizations running agentic workflows, DRAM saturation quietly drains GPU capacity through constant recomputation, creating a direct hit on cost per token and ROI.
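
The ratios behind the three outcomes can be checked from the rounded figures quoted above. The sketch below is illustrative arithmetic only: the release gives several figures as bounds ("past 5,000", "about 600", "under 200,000"), so the computed ratios are approximations, and the headline multiples presumably reflect exact measurements that the release does not list. The dictionary keys and variable names are made up for this example.

```python
# Rounded figures quoted in the release: (with Augmented Memory Grid, DRAM-only).
# Several are stated as bounds, so the ratios below are approximate.
figures = {
    "concurrent_users": (5_000, 600),
    "tokens_per_second": (2_000_000, 200_000),
    "tokens_served_in_one_hour": (5_000_000_000, 700_000_000),
    "cache_working_set_tib": (287, 8.64),
}

# Improvement factor for each metric.
ratios = {name: with_amg / dram_only for name, (with_amg, dram_only) in figures.items()}

for name, value in ratios.items():
    print(f"{name}: {value:.1f}x")

# Average rate implied by five billion tokens in the one-hour, 2,400-user test,
# for the cluster as a whole and per GPU (72 GPUs).
avg_tokens_per_second = 5_000_000_000 / 3_600
avg_tokens_per_second_per_gpu = avg_tokens_per_second / 72
print(f"average in one-hour test: {avg_tokens_per_second:,.0f} tokens/s")
print(f"per GPU: {avg_tokens_per_second_per_gpu:,.0f} tokens/s")
```

From the rounded inputs, the concurrency ratio comes out at a little over 8x as a lower bound, the throughput ratio at 10x or more, and the tokens-served ratio at just over 7x.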

“Inference is bottlenecked by how much effective memory is available to GPUs,” said Liran Zvibel, CEO of WEKA. “These results prove that AI token economics aren’t solved by hardware alone; they’re solved by eliminating the memory wall that has been the real ceiling on what existing hardware can do. NeuralMesh with Augmented Memory Grid running on OCI brings orders of magnitude more tokens to customers in an extremely cost-efficient way.”

Transforming AI Economics with Context Memory Infrastructure
As inference demand grows, AI infrastructure inefficiencies compound. Every key-value (KV) cache eviction is a tax: on GPU cycles, latency, user experience, and the cost of every token served. For long-context and agentic workloads, where inputs routinely run to 100,000 tokens or more, that tax is not a rounding error. It is a direct hit on the unit economics of every organization running production AI.
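
To see why evictions become frequent at 100,000-token contexts, it helps to estimate how much KV cache a single session occupies. The sketch below uses the standard per-token KV cache size formula for a transformer (keys and values for every layer and KV head). The model shape is a hypothetical assumption chosen for illustration; the release does not say which model was benchmarked. Only the 100,000-token context and the 8.64 TiB and 287 TiB capacities come from the release.

```python
from dataclasses import dataclass

GIB = 1024**3
TIB = 1024**4


@dataclass(frozen=True)
class ModelShape:
    """Hypothetical transformer shape; NOT taken from the release."""
    layers: int = 80
    kv_heads: int = 8
    head_dim: int = 128
    bytes_per_value: int = 2  # fp16 or bf16


def kv_bytes_per_token(m: ModelShape) -> int:
    # Factor of 2: one key vector and one value vector per layer per KV head.
    return 2 * m.layers * m.kv_heads * m.head_dim * m.bytes_per_value


def sessions_that_fit(capacity_bytes: float, context_tokens: int, m: ModelShape) -> int:
    # How many full-context sessions fit if the tier held nothing but KV cache.
    per_session = kv_bytes_per_token(m) * context_tokens
    return int(capacity_bytes // per_session)


shape = ModelShape()
context = 100_000  # context window size from the release

per_session_gib = kv_bytes_per_token(shape) * context / GIB
dram_sessions = sessions_that_fit(8.64 * TIB, context, shape)
nvme_sessions = sessions_that_fit(287 * TIB, context, shape)

print(f"KV cache per session: {per_session_gib:.1f} GiB")
print(f"sessions resident in 8.64 TiB: {dram_sessions}")
print(f"sessions resident in 287 TiB: {nvme_sessions}")
```

Under these assumed dimensions a single session needs roughly 30 GiB of KV cache, so a few hundred sessions fill the DRAM tier while several thousand fit in the NVMe tier. These counts are not comparable to the benchmark's user counts, which depend on the actual model, cache sharing, and workload.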

Augmented Memory Grid, a capability of NeuralMesh, solves the problem at the architectural level by decoupling KV cache from local GPU memory and storing it in a high-performance token warehouse accessible across the cluster. Any host can serve any session with cache hits intact, eliminating rigid session stickiness while delivering superior performance to DRAM, improving load balancing, and enabling clean horizontal scaling as concurrency grows. The result is persistent context memory for AI agents and the cost lever that makes long-context inference economical to run at scale.
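
The effect of a cluster-wide cache tier on recomputation can be illustrated with a toy simulation. This is a minimal sketch under stated assumptions, not WEKA's implementation: each host has a small LRU cache standing in for local memory, requests are routed round-robin with no session stickiness, the shared tier is modeled as an unbounded set, and loading from it is treated as free. All class and function names are hypothetical.

```python
from collections import OrderedDict


class LocalLRU:
    """Small per-host LRU cache standing in for local KV cache memory."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()

    def get(self, session):
        if session in self.entries:
            self.entries.move_to_end(session)  # mark as most recently used
            return True
        return False

    def put(self, session):
        self.entries[session] = True
        self.entries.move_to_end(session)
        while len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # evict least recently used


def simulate(requests, hosts, local_capacity, shared_store):
    """Count KV cache recomputations for a round-robin-routed request stream."""
    caches = [LocalLRU(local_capacity) for _ in range(hosts)]
    shared = set()  # toy shared tier: unbounded, visible to every host
    recomputes = 0
    for i, session in enumerate(requests):
        cache = caches[i % hosts]  # any host may serve any session
        if cache.get(session):
            continue  # local hit
        if shared_store and session in shared:
            cache.put(session)  # reload from the shared tier, no recompute
            continue
        recomputes += 1  # miss everywhere: rebuild the KV cache from scratch
        cache.put(session)
        if shared_store:
            shared.add(session)
    return recomputes


# 12 sessions, 5 turns each, spread across 5 hosts.
requests = [s for _ in range(5) for s in range(12)]
local_only = simulate(requests, hosts=5, local_capacity=2, shared_store=False)
with_shared = simulate(requests, hosts=5, local_capacity=2, shared_store=True)
print(f"recomputations, local caches only: {local_only}")
print(f"recomputations, with shared tier: {with_shared}")
```

In this setup every turn of a session lands on a different host, so local-only caching recomputes on every request, while the shared tier limits recomputation to the first turn of each session. A real system would also have to account for the time to load cached state from the shared tier, which this sketch ignores.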

Production-Grade Proof
OCI published the full benchmark methodology, system configuration, and results on its AI & Data Science blog on May 13, 2026. The benchmarks, executed on a nine-node OCI bare-metal H100 cluster, move beyond the prior phase of validation, which demonstrated 1000x more KV cache capacity and up to 20x faster time to first token at 128,000 tokens. This latest phase tests the full economics of inference in production: concurrency density, sustained throughput, cache persistence, and service level objective (SLO) stability when demand spikes under high load.

Available on Oracle Marketplace
NeuralMesh with Augmented Memory Grid is generally available to WEKA customers and on the Oracle Marketplace, with OCI as WEKA’s exclusive cloud launch partner. Organizations running long-context inference on OCI can deploy a validated, production-ready architecture today. For more on the OCI and WEKA Augmented Memory Grid benchmark, read the OCI blog: https://blogs.oracle.com/ai-and-datascience/scaling-long-context-inference-on-oci-with-wekas-augmented-memory-grid.

About WEKA
WEKA is the AI data and memory infrastructure company transforming the economics of agentic AI. Its NeuralMesh™ platform unifies high-performance data storage with extended GPU memory, giving enterprises, AI cloud providers, and AI builders a single foundation for training, inference, and agentic workloads. With Augmented Memory Grid, NeuralMesh extends GPU memory capacity by 1000x, accelerates time to first token by up to 20x, and delivers 10x more concurrent users from the same GPU footprint, proven in production benchmarks. Trusted by 30% of the Fortune 50, WEKA enables organizations to scale AI faster, optimize GPU utilization, and reduce the cost of every token served. Learn more at www.weka.io or connect with us on LinkedIn and X.

WEKA and the W logo are registered trademarks of WekaIO, Inc. Other trade names herein may be trademarks of their respective owners.

© 2025 by HKBusinesswire.com
