I Let AI Agents Plan My Vacation—and It Wasn't Terrible

The worst part of travel is often the planning. The endless scrolling through websites, comparing prices, finding transport, booking accommodation, making restaurant reservations, and mapping out attractions can feel like a second job before the actual vacation even begins. This tedious process is precisely what the latest wave of AI agents claims to solve. Tools like OpenAI’s Operator (available to ChatGPT Pro subscribers) and Anthropic’s Computer Use are designed to go beyond simply answering questions; they can interact with websites, fill out forms, and perform tasks much like a human assistant would. The promise is compelling: offload the dreary, cumbersome tasks and let AI handle the logistics, leaving you free to dream about your destination.

But how good are these digital assistants at navigating the complex, often messy world of travel booking? To find out, I decided to put them to the test. I tasked OpenAI's Operator with planning a last-minute weekend getaway. My criteria were simple: it needed to be budget-friendly, offer good food and art, and ideally be accessible by train. I wanted to see if AI could truly take the reins and dig out the 'good stuff'—the hidden gems and efficient routes—or if it would simply regurgitate generic information.

The Experiment Begins: Choosing a Destination

The process with Operator is fascinating because you can watch it work in real time. It opens a browser window and begins searching, much like I would, for destinations reachable by rail from my location. It scrolls through articles, processes information, and then presents its initial suggestions: Paris or Bruges. Having recently visited Paris, the choice was easy. "I recently went to Paris," I typed into the chat. "Let’s do Bruges!"

With the destination decided, Operator moved on to the next step: finding transport. It navigated to the Eurostar website and searched for train times, successfully locating a return ticket that included travel to Brussels and onward connections within Belgium. This initial step felt promising. The AI was actively browsing and finding relevant information.

Navigating the Logistics: Booking Transport and Accommodation

However, the first hiccup appeared when Operator presented the train timings. It had selected an early-morning train out on Saturday and an equally early train back on Sunday. This wasn't exactly maximizing the weekend experience. I intervened, pointing out that I'd prefer later timings. Impressively, the AI understood and found a later return option.

This interaction highlighted a key aspect of working with current AI agents: they often require supervision and refinement. While they can perform tasks, their 'common sense' or ability to infer user preferences beyond explicit instructions is still developing. It felt less like handing off a task entirely and more like collaborating with a slightly naive but capable assistant.

I paused the process to double-check my calendar before committing to the booking. When I returned, I encountered a significant limitation of Operator: the session had timed out. Unlike a persistent chat interface like standard ChatGPT, Operator closes conversations between tasks, forcing me to start the train booking process over from scratch. This felt frustrating, like being disconnected from a human travel agent mid-conversation. To add insult to injury, the fares had changed in the interim. I found myself 'haggling' with the AI, asking if it could find a cheaper option. Eventually, tickets were selected, but I decided to take over for the final step, entering my personal and payment details. While I was willing to trust the AI with searching and selecting, handing over sensitive information felt like a step too far for now.

Using ChatGPT's Operator to book a train ticket to Bruges.
Courtesy of Victoria Turk

With trains sorted (mostly by me in the end), Operator seemed to consider its job complete. But a trip requires more than just transport. I needed somewhere to stay. I reminded the AI, asking it to book a hotel. I kept my instructions intentionally vague: comfy and conveniently located. Comparing hotels is often the most tedious part of travel planning for me, so I was happy to delegate the task of scrolling through Booking.com listings.

Again, I observed its process. It initially set the wrong dates, but to its credit, it identified the error and corrected it without my intervention. It spent some time examining an Ibis listing before eventually selecting a three-star hotel called Martin’s Brugge. I later checked user reviews and noted its excellent location, which aligned perfectly with my vague requirement. This was a win for the AI agent – successfully navigating a booking site and making a reasonable choice based on implicit criteria (location being key for a short trip).

The Itinerary Challenge: AI's Strengths and Weaknesses in Planning Activities

Having booked transport and accommodation, the final piece of the puzzle was the itinerary. This is where Operator seemed to lose momentum. It provided a rather perfunctory one-day schedule, which appeared to be largely lifted from a vegetarian travel blog. For the second day, the suggestion was remarkably unhelpful: "visit any remaining attractions or museums." This lack of detail and creativity was disappointing.

This experience highlighted a distinction between AI agents designed for task execution (like Operator's booking capabilities) and large language models (LLMs) designed for information retrieval and generation. While Operator could browse and interact with booking sites, its ability to synthesize information into a compelling, detailed plan was weak. I realized that for research-heavy tasks like building an itinerary, a standard LLM might be more effective.

To test this hypothesis, I turned to other AI models. Google’s Gemini and Anthropic’s Claude, along with Operator's sibling, ChatGPT, all provided much more thorough plans. They plotted activities by the hour, suggested specific places to eat (like recommending Flemish stew at De Halve Maan brewery), and listed key attractions: walk to the market square, see the belfry tower, visit the Basilica of the Holy Blood. These plans were detailed and structured, offering a clear roadmap for the weekend.

However, a new question arose: were these detailed itineraries genuinely tailored or simply the standard tourist route? Bruges is a small city, and the suggestions from different LLMs were remarkably similar. This raised the possibility that they were all drawing from the same common pool of popular tourist information, rather than offering unique or personalized recommendations. This genericness is a challenge for AI in travel – how to move beyond the obvious and suggest experiences that truly match an individual's interests and travel style?

Beyond Generic: Travel-Specific AI Tools

Recognizing this limitation, various travel-specific AI tools are emerging, attempting to offer more personalized and dynamic planning. I briefly explored MindTrip, which integrates a map with a written itinerary, offers personalization through a quiz, and includes collaborative features for group trips. According to CEO Andy Moss, these tools aim to build upon broad LLM capabilities by incorporating a travel-specific "knowledge base" that includes real-time data like weather and availability. This approach suggests that combining the conversational power of LLMs with structured, domain-specific data could be the key to more effective AI travel planning.

The Real-World Test: AI on the Go

The true test of an AI travel assistant comes during the trip itself. As I dragged myself out of bed at 4:30 AM for the early train (a timing I had failed to correct with Operator), I was reminded of the practicalities AI needs to handle. Getting to Brussels was smooth, but upon arrival, I faced a common traveler's dilemma: finding the correct platform for my onward train to Bruges. My ticket allowed the connection, but the specific platform wasn't listed.

I fired up Operator on my phone, asking for the platform number. It began searching the Belgian railway timetables. Minutes passed. The AI was still searching, seemingly struggling to access or process the real-time departure information quickly. I looked up at the station display board and found the details instantly. I was on the platform waiting for the train before Operator had managed to provide the information. This highlighted a critical weakness: AI agents, despite their browsing capabilities, may not be as fast or reliable as simply checking a local information source for real-time data.

Courtesy of Victoria Turk

Adjusting Plans and Finding Hidden Gems

Bruges itself was delightful. Given Operator's lackluster itinerary, I decided to branch out and rely more on my own instincts and the more detailed plans provided by other LLMs. However, even those detailed plans proved overly ambitious. According to ChatGPT's schedule, I should have spent the afternoon on a boat tour, taking photos in another square, and visiting a museum. This vastly overestimated the stamina of a human who had been awake since 4:30 AM. I admitted defeat and returned to my hotel for a rest. The hotel, while basic, was indeed ideally located, a testament to Operator's successful booking task. I started to appreciate Operator's 'lazier' itinerary – perhaps a less packed schedule was exactly what was needed.

As a final task for the AI agent, I asked it to make a dinner reservation. I specified somewhere authentic but not too expensive. This task involved navigating a restaurant's website, potentially interacting with a booking system. Operator initially got bamboozled by a dropdown menu during the process, a common challenge for AI agents interacting with varied web interfaces. However, after a little encouragement from me, it managed to find a workaround and complete the reservation.

I was genuinely impressed when I walked to the restaurant. It was located away from the obvious tourist traps, a quiet dining room serving classic local cuisine, themed rather uniquely around pigeons. It felt like a genuine local find, the kind of place that doesn't typically appear on the first page of TripAdvisor or The Fork. This success demonstrated that AI agents, despite their flaws, can sometimes uncover less obvious options, potentially offering a path away from the homogenized 'top 10' lists that dominate traditional travel search.

Reflecting on the AI Travel Planning Experience

On the train journey home, I had time to reflect on my experience. The AI agent, specifically Operator, certainly required supervision. It struggled to string multiple tasks together seamlessly, as the session timeout showed, and it lacked an element of common sense, as its initial choice of train timings demonstrated. The need for human oversight was clear; I couldn't simply delegate the entire process and walk away.

However, there was a refreshing aspect to outsourcing some of the decision-making. Instead of scrolling through hundreds of train times or hotel listings, the AI presented a few select options based on my criteria. This curation, even if imperfect, significantly reduced the initial cognitive load of planning. It felt like having a research assistant who did the initial legwork, allowing me to focus on refining the choices rather than sifting through vast amounts of data.

Comparing the agentic AI (Operator) with the conversational LLMs (ChatGPT, Gemini, Claude) for itinerary planning revealed different strengths. The LLMs were better at generating detailed, text-based plans based on their training data, while the agent showed potential for interacting with real-world booking systems, albeit with some fumbles. Travel-specific AI tools like MindTrip suggest a future where these capabilities are combined with structured data for a more robust experience.

The Human Touch vs. The Algorithmic Assistant

Despite the advancements, the human element in travel planning remains significant. Emma Brennan at the travel agent trade association ABTA notes that for many, AI is currently used more for inspiration than for booking. The core value proposition of a human travel agent, she argues, is the safety net they provide. "An increasing number of people are booking with the travel agents for the reason that they want someone there if something goes wrong," she says. AI agents, in their current form, cannot replicate this level of support, especially when unexpected issues arise during a trip.

The experiment also raised questions about the future landscape of travel information. If AI agents become proficient at finding and presenting options, they could potentially disrupt the role of traditional search engines and review sites. Businesses might soon be clamoring not just for high search rankings but for inclusion in AI-generated suggestions. "Google isn’t going to be the front door for everything in the future," suggests Andy Moss of MindTrip. Are we, as travelers, ready to give this much power and influence over our choices to a machine?

Conclusion: An Evolving Relationship with AI in Travel

Embarking on this AI-planned trip, I worried I would spend more time staring at my screen, debugging the AI's efforts, than actually experiencing the destination. While there were moments of frustration and necessary intervention, by the end of the trip, I realized something unexpected: I had probably spent less time glued to my phone for planning and searching *during* the trip than I usually would. The AI, despite its flaws, had handled enough of the initial heavy lifting that I felt more present once I arrived.

My experience suggests that AI agents are not yet ready to be fully autonomous travel planners. They require supervision, struggle with real-time dynamics, and lack the nuanced understanding and problem-solving skills of a human agent. However, their ability to automate tedious search and booking tasks, and occasionally uncover less conventional options, makes them a promising tool. They are evolving from simple conversational assistants into agents capable of action, albeit with a learning curve.

The future of AI in travel likely lies in a hybrid approach. AI can excel at data aggregation, initial research, and automating bookings for straightforward trips. Human agents will continue to be invaluable for complex itineraries, handling unexpected issues, and providing personalized recommendations based on deep understanding and experience. And travelers will need to adapt, learning how to effectively prompt and supervise their AI assistants, much like collaborating with a junior colleague. The journey towards truly seamless AI-powered travel planning has begun, and while we're not quite at the destination, the route is becoming clearer.
