The Next Frontier for GPUs Is Sound: A Tech Founder's Vision for Real-Time Audio

Alexander Talashov

For years, graphics processing units (GPUs) have powered some of the world's most demanding experiences—from gaming and 3D rendering to AI model training. But one domain remained largely untouched: audio processing. Traditionally handled by aging digital signal processors (DSPs) or central processing units (CPUs), audio workflows have long faced performance bottlenecks that limited scalability, added latency, and constrained spatial precision. Today, that's changing. A new generation of GPU-powered audio technologies is emerging, enabling everything from multi-zone sound in vehicles and neural guitar amps to spatialized audio in music production and VR.

The pioneer behind this shift is Alexander Talashov, co-founder and CEO of GPU Audio. He took on what many considered an unsolvable problem: how to make highly sequential, real-time audio workloads run smoothly on inherently parallel GPU architectures. What began as a side project with a friend has since grown into a full developer platform powering cutting-edge applications across automotive, music and other industries. In this interview, Talashov talks about the origins of GPU Audio, the deep technical challenges behind high-performance sound processing, and how GPU acceleration is unlocking personalized, real-time audio experiences—particularly in automotive applications.

Let's start with the beginning of your career. How did technology and entrepreneurship find you?

— I've never studied entrepreneurship formally—I'm a technology engineer at heart. I graduated from Russia's top technical university, renowned as the birthplace of many avionics, telemetry, and networking systems developed during the Soviet era and beyond. Although my degree program focused on Wi-Fi and 5G hardware, I was personally drawn to GPU programming, which I found far more exciting.

Around 2011, GPUs were just starting to break out of graphics into general computing. It was fringe territory; few people knew you could program GPUs. For my diploma and postgraduate theses at university, I immersed myself in GPU development. That decision shaped my career from the very beginning.

What happened after university?

— After graduating, I took a risk management job at the Russian branch of one of the largest German banks. To be completely honest, I thought finance would make me wealthy, but I quickly discovered German banking wasn't a meritocratic environment. However, that is when I started programming high-frequency trading bots as a side project alongside my main role—and that's where I first applied GPU computing practically.

After a while, I realized I couldn't climb the corporate ladder as fast as I wanted. I quit and soon met my future co-founder, who had worked as a sound engineer for 20 years. That was when we started building what later became GPU Audio.

How did you two come up with the idea of GPU Audio?

— It all began when my co-founder asked: "Why can't GPUs be used for audio computing?" I didn't know the answer—but I knew it was intriguing. Audio was sequential, low latency, and complex, so using GPUs, originally built for graphics, was uncharted territory. From 2012 to around 2015 we experimented relentlessly, making prototypes when we could, failing more than we succeeded.

Meanwhile, I also held engineering jobs, including at a logistics company called Pickpoint that operates a nationwide network of automated parcel lockers. That experience, as unrelated as it might seem, played an important role in shaping my confidence as a problem-solver.

What exactly was your involvement at Pickpoint, and how did it influence your later work?

— Pickpoint was scaling fast, and by the time they had a few hundred parcel lockers deployed across Moscow, they hit a massive technical hurdle. Each of these kiosks was built from a wide array of different hardware configurations. Some were placed inside residential buildings, others outdoors in subzero conditions, which required thermal regulation systems, heaters, and all sorts of extra embedded components.

Here's the catch: whenever a graphical user interface (GUI) update was needed—and these happened often—a technician had to manually remote into each unit's OS and run a 30-minute update process per kiosk. The process was slow, error-prone, and unsustainable at scale.

I was a new hire, not yet assigned to any project. But I saw this problem and thought, why not try to solve it? I designed and wrote the first mass-update system that could deploy new firmware and GUIs across all kiosks automatically. The system ran in production at a time when CI/CD wasn't yet common practice and Pickpoint hadn't adopted it. That experience gave me a valuable realization: with enough persistence and curiosity, I could build complex, real-world solutions.

And what was happening with GPU Audio during that period? When did the product reach a production-ready stage?

— In 2017, our first research scientist joined the team—a former advisor to the founders of Qualcomm whose academic papers had won numerous awards at venues including ACM CHI and Eurographics. His specialty was developing custom ray-tracing engines, which use GPUs to render complex 3D worlds with high speed and precision.

We spent three years iterating through major failures. But by late 2020 or early 2021, GPU architectures had evolved, AI had progressed, and we built our first live demos. We created the world's first DSP framework to enable real-time audio algorithms on GPUs, published demo videos on YouTube, and shared progress on Reddit, building a community of tens of thousands. Around the same time, we closed a major financing round led by RTP Global, with participation from other prominent funds and from angel investors including former VPs from Airbnb, SoftBank, Disney, and Amazon, as well as ex-Googlers.

In 2022, we made our public debut at NVIDIA GTC, where we revealed our Scheduler technology, followed by appearances at multiple industry exhibitions alongside AMD, Adobe, and other tech leaders. Initially, we focused on developing software for musicians and sound engineers—tools that traditionally ran on CPUs. But over time, we shifted our strategy.

Rather than building end-user applications ourselves, we began offering our software development kit (SDK), which provides access to high-performance DSP for seamless, real-time audio processing and supports multiple GPU vendors. This allows other companies to transition their pro audio plugins and standalone tools to GPU-based workflows, so that they can create innovative features and products. That pivot proved to be a smarter and far more scalable approach.

So, why is audio processing uniquely challenging for GPUs?

— Audio processing is fundamentally different from graphics rendering. GPUs are Single Instruction, Multiple Data (SIMD) devices, optimized to run a small number of tasks across massive datasets—think pixels in an image. In contrast, real-time audio involves hundreds of lightweight, interdependent tasks running across very small datasets. That's a terrible fit for how traditional GPUs expect to work. So our challenge was to reshape the GPU into something that could handle many micro-tasks with complex dependencies in real time.
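For a sense of scale, here is a rough back-of-the-envelope budget. The sample rate and buffer size are my own assumptions (typical pro-audio settings, not figures quoted in the interview):

```python
# Rough real-time audio budget, assuming a 48 kHz sample rate
# and a 128-sample processing buffer (typical pro-audio settings).
SAMPLE_RATE_HZ = 48_000
BUFFER_SAMPLES = 128

# Every buffer must be fully processed before the next one arrives.
deadline_ms = BUFFER_SAMPLES / SAMPLE_RATE_HZ * 1000
print(f"Per-buffer deadline: {deadline_ms:.2f} ms")            # ~2.67 ms

# A session with hundreds of plugin instances shares that budget,
# so per-task overhead (e.g. kernel launch latency) must stay tiny.
num_tasks = 300
per_task_budget_us = deadline_ms * 1000 / num_tasks
print(f"Average budget per task: {per_task_budget_us:.1f} µs")  # ~8.9 µs
```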

The major barrier was the mathematics itself. Most pro audio relies on IIR (Infinite Impulse Response) filters, which are inherently sequential and difficult to parallelize. We had to reinvent that math—finding or creating designs that could emulate IIR behavior but run efficiently on a GPU. For instance, instead of traditional IIR filters, we used their state-space representation (a set of first-order differential equations using matrices). This idea made it possible to run many computations in parallel—processing 4, 8, 16, or even 32 instances at the same time on the GPU.
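As a rough illustration of that state-space idea, here is a minimal NumPy/SciPy sketch of my own (not GPU Audio's actual implementation): a biquad is converted to state-space form and 32 independent filter instances are stepped in lock-step, so each time step becomes one batched matrix operation of the kind that maps well onto SIMD or GPU hardware.

```python
import numpy as np
from scipy import signal

# A simple low-pass biquad (coefficients chosen arbitrarily for the demo).
b, a = signal.butter(2, 0.2)           # 2nd-order Butterworth, cutoff at 0.2*Nyquist
A, B, C, D = signal.tf2ss(b, a)        # state-space matrices (2x2, 2x1, 1x2, 1x1)

n_instances = 32                       # e.g. 32 filter instances processed together
n_samples = 1024
u = np.random.randn(n_instances, n_samples)   # one input stream per instance
x = np.zeros((n_instances, A.shape[0]))       # per-instance filter state
y = np.empty_like(u)

for n in range(n_samples):
    # One matrix multiply updates all 32 instances at once; on a GPU this
    # batched step is where the parallel speed-up comes from.
    y[:, n] = (x @ C.T).ravel() + D.ravel() * u[:, n]
    x = x @ A.T + np.outer(u[:, n], B.ravel())

# Sanity check against SciPy's sequential reference implementation.
ref = np.vstack([signal.lfilter(b, a, u[i]) for i in range(n_instances)])
assert np.allclose(y, ref)
```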

What did you ultimately achieve?

— We ended up building and optimizing our DSP and ML algorithm implementations—from FIR filters and delay lines to dynamic range compressors, classic DSP components, and neural networks—all specifically tuned for efficient GPU execution.

Our key breakthroughs came from studying how ray tracing engines resolve dependencies between physical rules when rendering 3D scenes. Our chief scientist had built one of the fastest ray tracers in the industry. The techniques he used to prioritize and process rule chains inspired how we architected task scheduling for audio. We took those lessons and built our own patented Scheduler. It acts like a micro-operating system for GPU audio—collecting tasks from different library client calls, optimizing their execution, and delivering results with ultra-low latency.
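To make the scheduling idea concrete, here is a toy dependency-wave scheduler, a simplified sketch of my own rather than the patented Scheduler itself. It groups micro-tasks into "waves" of mutually independent tasks that could, in principle, be dispatched together in a single GPU launch.

```python
from collections import defaultdict, deque

def schedule_in_waves(tasks, deps):
    """tasks: iterable of task ids; deps: dict mapping task -> set of prerequisites."""
    indegree = {t: len(deps.get(t, ())) for t in tasks}
    dependents = defaultdict(list)
    for t, prereqs in deps.items():
        for p in prereqs:
            dependents[p].append(t)

    ready = deque(t for t, d in indegree.items() if d == 0)
    waves = []
    while ready:
        wave = list(ready)          # everything currently ready is mutually independent
        ready.clear()
        waves.append(wave)
        for t in wave:              # completing a wave unlocks its dependents
            for nxt in dependents[t]:
                indegree[nxt] -= 1
                if indegree[nxt] == 0:
                    ready.append(nxt)
    return waves

# Hypothetical signal chain: two EQs feed a compressor, which feeds a limiter.
deps = {"comp": {"eq_left", "eq_right"}, "limiter": {"comp"}}
print(schedule_in_waves(["eq_left", "eq_right", "comp", "limiter"], deps))
# [['eq_left', 'eq_right'], ['comp'], ['limiter']]
```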

Ultimately, what we achieved is a complete rethink of how real-time audio or any dependency-aware time series data can be processed using GPUs, NPUs and APUs. We've made it possible to run complex, multi-threaded audio environments on GPUs—something the industry has long considered impossible. And with our SDK, we've enabled others to tap into this power and build next-gen audio tools that are faster, smarter, and far more scalable.

Where are you applying GPU Audio technology right now?

— We're focusing heavily on automotive audio, and the reason is simple: this industry is overdue for a fundamental transformation. Despite all the innovation in electric vehicles, infotainment systems, and autonomous driving, audio processing in most cars is still running on decade-old DSPs.

Meanwhile, the cabin experience has changed dramatically. You've got multiple passengers engaged in completely different tasks: someone taking a phone call, someone else watching a show, while the driver needs to hear navigation prompts or voice assistant responses. And yet audio is still treated as a single, uniform stream that floods the entire cabin equally. It's a one-size-fits-all approach that no longer fits.

We saw a huge opportunity to rethink in-cabin audio from the ground up, leveraging the powerful in-vehicle GPUs to enable multi-zone, spatially aware, real-time audio rendering. Our technology lets OEMs deliver premium, personalized sound experiences.

How exactly does your solution work inside vehicles?

— Take my own experience. I've got two kids, and most mornings I drive them to school. Let's say I'm on a work call using hands-free, while my kids are watching Peppa Pig on the backseat screen. In a typical car, that means two audio streams competing in the same physical space. Everyone hears everything. It's messy, distracting, and ineffective.

With our solution, we've implemented what we call Zone Compensation—a GPU-accelerated, real-time spatial filtering technique that actively suppresses unwanted audio during simultaneous multi-zone playback. So in that same drive, I can hear my call clearly through the front speakers, while my kids hear Peppa Pig in the back—with no crosstalk between the two zones. It's not just volume balancing or simple cancellation. It's true acoustic separation, tailored to the vehicle cabin and dynamically adapted to real-world conditions.

Could you break it down from a technical perspective?

— Technically, this works by measuring impulse responses (IRs) for each car model and generating a precise spatial audio model of compensation IRs that accounts for cabin geometry, interior materials, speaker placement, and seat positions. The GPU then runs a spatial matrix convolver algorithm to apply zone compensation in real time, using only around 5% of its compute capacity. It delivers 5–12 dB of zone suppression at the listener's position over a frequency range of 250 Hz to 8 kHz, a range optimized for speech intelligibility and clarity.
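Conceptually, the data flow resembles a bank of FIR convolutions: every source stream is convolved with a compensation IR for every speaker, and the results are summed per speaker. The sketch below is my own illustration with placeholder dimensions and random IRs, run offline; the production system would use measured per-vehicle IRs and block-wise, real-time convolution on the GPU.

```python
import numpy as np
from scipy.signal import fftconvolve

fs = 48_000                            # assumed sample rate
n_sources, n_speakers = 2, 8           # e.g. front call audio + rear entertainment
ir_len, block = 2048, fs               # compensation IR length and 1 s of audio

# In the real system these IRs come from per-vehicle measurements; here
# they are just random placeholders with the right shape.
comp_irs = np.random.randn(n_speakers, n_sources, ir_len) * 0.01
sources = np.random.randn(n_sources, block)

speaker_feeds = np.zeros((n_speakers, block + ir_len - 1))
for spk in range(n_speakers):
    for src in range(n_sources):
        # Convolve source `src` with the IR that shapes how it should
        # reach (or be suppressed at) the zone served by speaker `spk`.
        speaker_feeds[spk] += fftconvolve(sources[src], comp_irs[spk, src])
```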

All of this happens without requiring additional chips. And best of all, it's completely deployable via OTA updates, which means the technology can be installed or updated without physical access to the device. OEMs don't have to modify their processing platforms and audio hardware components to deliver a drastically better cabin experience. This intelligent software addresses a key challenge for audio engineers, enabling precise zone separation that makes 2- and 3-zone simultaneous audio playback both personalized and comfortable. In the confined space of a car cabin, such separation simply isn't possible through geometry and speaker design alone—it requires precise zone compensation at the listener's position.

Have you already attracted any customers?

— I'm proud to share that we've secured our first customer, who plans to integrate our technology into 100,000 to 1 million vehicles annually. We're also in active discussions with several other clients aiming for similar-scale deployments of our technology and flagship product.

How do you see the future of this technology?

— The big picture is this: the car is turning into a media consumption capsule, especially as we get closer to full self-driving and modern vehicles become increasingly equipped with advanced infotainment features like multiple screens, microphones, and even head-tracking systems. People aren't just listening to music—they're watching, calling, interacting. We're making sure every passenger hears only what they're supposed to hear, as if they were wearing headphones—but they're not. The entire soundscape is sculpted invisibly, in real time, through intelligent software that demands high-performance computing.

However, what's truly exciting is that this GPU-powered audio solution is just the beginning. The same platform is extensible to gaming, AR/VR, and of course, professional and consumer audio. As I've already mentioned, we're not developing just one solution, but a full SDK. We're creating the kind of core technology that companies across multiple verticals—from billion-dollar automotive giants to indie music software startups—can license and build on. It's all about enabling developers to build new, more advanced ways to create and experience sound.

