
A Q&A with Andrius Kūkšta, R&D Engineer @ Oxylabs
How do technical and non-technical users differ in their approach to AI-augmented development tools? Can you "accept all" in Cursor? Those are just some of the questions that will be discussed at the Ai4 Conference in Las Vegas on August 11–13, 2025. Ahead of the event, we caught up with Andrius Kūkšta, R&D Engineer at Oxylabs and one of the upcoming panelists, to discuss the evolving intersection of AI, development workflows, and web data.
Before we dive into development workflows, could you explain how AI and web scraping have become so interconnected in today's tech landscape?
We're witnessing a two-way relationship develop between the two. Web scraping has always been a very technical field, but with the emergence of LLM-driven tools, it has become more accessible. AI is also enhancing scraping tools—they can now better navigate the architecture of more complex websites and contextualize content in ways that traditional scrapers couldn't.
At the same time, web scraping is critical for AI development. These models require massive, diverse datasets to function effectively. While LLM breakthroughs often steal the spotlight, people tend to forget how crucial the data-gathering process is behind the scenes. Web scraping offers developers a way to access current, specific data—far more targeted than relying on generic, static datasets used by everyone else.

How do you navigate the growing tensions around data access, especially with platforms like Cloudflare implementing stricter anti-scraping measures?
The conversation around data access is rapidly evolving, and we're seeing a range of approaches emerge—from opt-in frameworks to direct licensing deals with publishers. Models like Cloudflare's recent stance represent just one path among many, and each approach reflects a legitimate effort to assert control, protect rights, and promote transparency.
As this landscape matures, the key challenge is finding a fair and sustainable balance between the rights of content creators, the integrity of digital ecosystems, and the need for broad, representative data to fuel innovation. The diversity of approaches being explored today signals a move toward more intentional data stewardship, and that's a positive step.
But it's also essential to recognize the risks of limiting access too far. If we move toward models that fragment or gatekeep data too aggressively—whether through high paywalls, exclusive partnerships, or restrictive defaults—we risk building AI systems that are less representative, more biased, and more concentrated in the hands of a few.
How has AI been integrated into your daily development workflows at Oxylabs?
The adoption has been super quick, which makes things both exciting and dynamic. Company-wide, there's been a surge of new AI tools. Everyone's experimenting, although of course, anything we use must pass the scrutiny of our Risk Team.
Specific tools have gained widespread adoption. For most developers now, Cursor AI is what Excel is for accountants. Instead of only having Stack Overflow as a platform for knowledge, developers can now also turn to Cursor or even ChatGPT for assistance.
While I haven't dabbled much with no-code or low-code testing solutions, some of my colleagues swear by them and use them to generate comprehensive test suites, including bench tests. These AI-powered testing tools are remarkably good at catching bugs.
Beyond development teams, we're seeing AI adoption across the organization. Customer support teams receiving inquiries can now use AI Studio—a tool we built ourselves—to generate parsing code on the fly.
Trust is a major concern with AI-generated code. How do you ensure quality and manage risks?
This is critical, and our approach centers on always having a person in the driving seat. When it comes to building, we never let AI operate without supervision. Some might call it micro-managing, but AI doesn't have feelings; it's a tool, after all.
When reviewing code, Cursor, a crowd favorite at Oxylabs, offers two options: "accept all" or review every change individually. Our policy strictly favors the latter, even when the changes seem minuscule. If you skip reviewing several minor modifications, you'll eventually lose track of what's happening in your codebase. It might look like you're saving hours of work, but you may be inadvertently planting ticking time bombs that go off when you least expect them.
Where have you seen the biggest productivity gains from AI, and are there any unexpected bottlenecks?
AI tools help massively with repetitive tasks, and in web scraping this translates to the ability to easily adapt to layout changes, pinpoint relevant data points, and clean unstructured data. However, when building software, especially from scratch, AI tools might limit or even mislead you. Being hands-on with envisioning the architecture pays off in the long run. Plus, the better you understand the feature you're building, the better your prompts become.
There's an interesting phenomenon I've noticed when it comes to AI adoption by non-developers. Typically, they overestimate the things LLMs can do. While I'm all for empowering as many people to create things as possible, this democratization comes with certain challenges. People will bring you seemingly great ideas generated by ChatGPT that create false hopes and unreasonable expectations.
And when it comes to the biggest bottleneck we've encountered so far, I'd say it's when people ask LLMs to generate frameworks without providing sufficient context. Without understanding the available options and constraints, both in terms of business logic and technology, AI can lead teams significantly off track.
What's been the biggest challenge in integrating AI tools into your existing workflows and tech stack?
A healthy dose of initial chaos, driven by everyone being super enthusiastic. I'm not saying there's anything wrong with being an early adopter. Still, when everyone jumps into new technology headfirst, it can lead to fragmentation, which calls for structured effort.
For example, we recently established an AI Guild, where people from different teams can come together and share their experiences with different tools. We discuss everything from Cursor usage to broader AI strategies.
Oxylabs recently launched AI Studio for public use. What's the story behind that product?
AI Studio started with a typical problem developers face—having a ton of great ideas that are too small to justify a standalone product. So, one of our managers proposed this: why not build a platform uniting a few smaller but impactful AI-powered tools?
It started as an internal experiment, but thanks to fast feedback loops, it evolved into a full-fledged product. We're currently in a free trial phase with token-based access available to anyone who wants to try it.
In a way, AI Studio can also serve as an introduction to the key steps in data integration: public data collection, web scraping, and parsing. Scraping can be a technically challenging task, but with the user-friendliness of natural language prompting, it becomes a breeze.
The best part? It works for everything—personal or business. For example, I recently needed to gather data about Lithuanian Basketball League teams for a hobby project. Instead of writing complex scraping code, I simply used AI Studio and got all the information I needed.
Andrius Kūkšta will be speaking at the Ai4 Conference in Las Vegas, August 11–13, 2025, where he'll join other industry leaders in exploring the cutting-edge applications of AI across various sectors. Established in 2018, Ai4 is now North America's largest Artificial Intelligence industry event.
ⓒ 2025 TECHTIMES.com All rights reserved. Do not reproduce without permission.