Researchers at Trend Micro Research have uncovered a significant security issue affecting NVIDIA Riva AI deployments in cloud environments.
A pattern of exposed, authentication-free Riva API endpoints was detected across numerous organizations, potentially granting attackers unrestricted access to powerful GPU resources and sensitive AI models.
Two core vulnerabilities, now tracked as CVE-2025-23242 and CVE-2025-23243, were identified as the root cause of these exposures.
Both flaws have since been addressed and were disclosed as ZDI-25-145 and ZDI-25-144, respectively, following a responsible disclosure process.

Attack Surface Expanded by Defaults and Advanced Configurations
NVIDIA Riva, celebrated for its real-time speech recognition, translation, and synthesis capabilities, offers enterprises a robust toolkit for building speech-driven applications.
However, its complex deployment architecture inadvertently creates a broad attack surface prone to human error and misconfiguration.
The urgency to deploy such cutting-edge solutions often leads teams to rely on default settings or incomplete security hardening, especially when integrating Riva with cloud services.

Notably, Trend Micro identified more than 54 unique IP addresses with exposed Riva instances, highlighting how widespread the issue is.
The vulnerabilities primarily arise from how Riva services are launched. Upon initialization, often via NVIDIA’s QuickStart guide, Riva servers default to listening for gRPC connections on port 50051, binding to all network interfaces (0.0.0.0).
Unless explicitly restricted by firewall rules, this makes the endpoint externally accessible.
Moreover, gRPC reflection is enabled by default, which makes service discovery easy for developers, but equally easy for potential attackers.
While the documentation implies that modifying the configuration and enabling TLS/SSL secures the deployment, this only encrypts traffic in transit and verifies the server’s identity. It does not authenticate the client, so anyone able to reach the endpoint can connect.
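The distinction between transport security and client authentication can be captured in a small triage helper. The sketch below is illustrative only; the parameter names are hypothetical and do not correspond to actual Riva configuration keys.

```python
def assess_riva_endpoint(bind_address: str, tls_enabled: bool,
                         client_auth: bool) -> str:
    """Rough risk triage of a gRPC endpoint configuration.

    Parameter names are illustrative, not real Riva settings.
    """
    if bind_address in ("0.0.0.0", "::"):
        reach = "externally reachable (binds all interfaces)"
    else:
        reach = f"bound to {bind_address}"
    if not tls_enabled:
        return f"HIGH RISK: plaintext gRPC, {reach}"
    if not client_auth:
        # TLS alone encrypts traffic and proves the *server's* identity;
        # any client that connects is still accepted.
        return f"HIGH RISK: TLS without client authentication, {reach}"
    return f"OK: mutual TLS enforced, {reach}"
```

For example, `assess_riva_endpoint("0.0.0.0", True, False)` flags the exact misconfiguration described above: TLS is on, yet the endpoint remains open to any client.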
Downstream Exposure Via Triton Inference Server
The problem extends beyond Riva itself: the deployment also exposes internal endpoints for the underlying Triton Inference Server, which executes the actual inference jobs.
These endpoints, a REST API (default port 8000), a gRPC interface (8001), and a metrics endpoint (8002), are often left accessible under default container orchestration.
Consequently, even if Riva’s gRPC endpoint is secured, an attacker can still bypass it by directly targeting Triton’s open interfaces.
This risk is especially pronounced when Triton is operated in advanced configurations that may reveal additional vulnerabilities.
The implications of these exposures are serious: unauthorized parties can exploit exposed endpoints to run compute-intensive workloads on enterprise hardware, consume paid API keys, leak proprietary models, and potentially orchestrate denial-of-service attacks.
For organizations relying on AI for proprietary or sensitive applications, these risks extend to intellectual property theft and operational disruptions.
Experts recommend an immediate review of public cloud AI deployments, especially for those using default Riva or Triton configurations.
Companies should ensure network controls are in place, enforce strict firewall rules, and properly configure authentication for all exposed API endpoints.
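One layer of such network control can sit in application code, in front of the firewall rules rather than instead of them. The sketch below is a minimal source-address allowlist check using Python's standard `ipaddress` module; the CIDR ranges are hypothetical placeholders for an organization's actual VPN and management networks, and in practice a check like this would live in a reverse proxy or gRPC interceptor fronting Riva and Triton.

```python
import ipaddress

# Hypothetical allowlist: only the (illustrative) corporate VPN range
# and localhost may reach the inference ports.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("10.8.0.0/16"),  # example VPN range
    ipaddress.ip_network("127.0.0.0/8"),  # local traffic
]

def source_allowed(client_ip: str) -> bool:
    """True if the client address falls inside an allowed network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in ALLOWED_NETWORKS)
```

For instance, `source_allowed("10.8.3.4")` passes while a request from an arbitrary public address such as `203.0.113.9` is rejected before it ever reaches the model server.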
Additionally, comprehensive monitoring tools, such as Trend Vision One™ Cloud Risk Management, can help organizations detect and remediate unintended network exposures.
NVIDIA’s Riva suite, while offering remarkable advancements in conversational AI, underscores the pressing need for robust cloud security practices.
As enterprises accelerate AI adoption, ensuring secure configuration and deployment will remain a critical challenge.