Why OpenAI’s ChatGPT plugins feature, combined with the power of GPT agents, is the new Internet gateway and the real Web 3.0
The Internet
The development of the World Wide Web in the 1990s led to a surge in Internet use, as individuals and businesses began to create and access web pages. Today, the Internet is a ubiquitous presence in our lives, providing a wealth of information, entertainment, and social connection. People interact with it in a variety of ways, using a range of devices and applications. They may use web browsers to access websites, social media platforms to connect with friends and family, or messaging apps to communicate with others in real time. Online shopping has become increasingly popular, with consumers able to browse and purchase products from a range of retailers. The entry point of a user’s online journey (i.e. the Internet gateway), whether a website, an app, or the user’s device, is a crucial strategic asset for companies. It is the main battlefield of the world’s leading companies. The earlier in the journey a company engages with the user, the more it can determine and control the user’s path and monetize it. And this means money, lots of money.
Internet User’s Typical Journeys
Throughout this blog post, we will use 3 examples of users’ journeys to accompany us. These examples may not encompass all possible journeys, but they do represent major ones. The examples will help us demonstrate how online behavior has changed and is going to change, and possibly help us prove the claim in the title of this blog post.
Example 1 – I need to collect information in order to write an essay about the dangers of the Internet for young adults. This is a typical data/information retrieval task.
Example 2 – I wish to go on a vacation. There are several things I need to do: research possible destinations, book a flight and a hotel, etc. This task involves data/information retrieval but also requires the user to take actual actions online (such as booking and paying). We shall refer to this as an executable data/information task.
Example 3 – I’m interested in hiking. Ahead of a hiking day, I use the Internet to select hiking routes and to buy hiking accessories. On a daily basis I don’t search for anything related to hiking, but I still enjoy reading about it or being notified about sales of hiking gear. While this may seem similar to Example 1, the information I consume here has to be pushed to me. So this is in fact more of a data/information consumption task.
If you’re in a hurry and just want to glimpse what the future holds, go here.
Disclaimer: In this blog post, I will refer to known companies by name and talk about their positioning in this new AI era, but the truth is that this battle is far from over and things are changing rapidly. Hence, any of the mentioned companies may eventually gain market share. It is simply easier to mention specific companies and their current (publicly known) offerings.
Google’s Internet – The 2000s (AKA Web 1.0, the “read-only Web”)
Towards the end of the last millennium, with the rapid development of the Internet and the growing need to search for and locate information easily, the Google search engine quickly became the go-to search tool, overshadowing its competitors by giving the most relevant search results. The way it operates is simple: Google crawls and indexes as many websites as it can, and upon a query gives its users the best results it can. This gave Google almost infinite possibilities (and power), and it became the de facto gateway to the Internet. Google’s whole financial model was based on showing users relevant ads and business information related to the query. This may not be very valuable when the user’s query is “Abraham Lincoln’s birthdate”, but it is critical for a travel agency to be at the top of the results (or in the ads) when the user’s query is “Book a flight to Thailand”. Companies have literally risen and fallen on the success of their search engine optimization (SEO).
Search Engine Optimization (SEO)
[Skip this paragraph if you know what SEO means] Search engine optimization (SEO) is the practice of optimizing websites and content to improve their ranking and visibility on search engine results pages (SERPs) for specific keywords or phrases. The ultimate goal of SEO is to increase organic (unpaid) traffic to a website.
SEO involves a variety of techniques and strategies, including optimizing website content and structure, keyword research and targeting, building backlinks from other websites, and improving the user experience of a website. By implementing these techniques, website owners can make their sites more attractive and relevant to search engines, which can result in higher rankings and increased visibility.
By optimizing their websites for search engines, website owners can improve their chances of being found by people searching for related topics or products. This can lead to increased website traffic, higher engagement, and ultimately, increased revenue or other business objectives.
Our Examples in the 2000s
Let’s use our examples to see how the Internet (with the help of Google) changed our lives.
Example 1 – Previously, I would go to the library or manually search the web, spend hours looking at the scarce selection of material on this topic, and eventually find only several paragraphs in a handful of books or websites. Google allowed me to search (in milliseconds) an endless database, with the option to fine-tune my query to the subtopics I am interested in. I am also not limited to books or official websites. I can find unofficial information as well, such as doctors answering questions in online forums and other people’s own experiences. In fact, all I’m left with is to collect the most relevant data out of the already relevant data, and summarize it into a coherent essay.
Example 2 – Previously, I would go to the local travel agency, explain my needs, and get a limited number of travel deals at the agency’s fixed prices. I also needed to trust the agency’s experience with specific hotels, not knowing whether I myself, with my own special preferences, would like them or not. Nowadays, I can review unlimited deals to any destination I wish for, as niche as it may be, and I can be my own travel agent. I can research destinations, view images and videos of the places I’m going to visit, and see other travelers’ impressions of the sights and hotels. I can also compare flight and hotel prices, check the proximity of the hotel to places I want to visit, and even see what the room and its view will look like. But I have to initiate all the research, collect the data, find the best prices using flight & hotel booking websites, and eventually make all the bookings myself. I sometimes miss the travel agent doing all this work for me.
Example 3 – While the Google search engine offered revolutionary solutions for the first 2 use cases, it didn’t go too deep into passive information consumption. We consumed most of our information by actively searching for it.
To conclude, Google search was and still is the most popular search engine, and as such, it is a major gateway or starting point for any Internet journey, whether it is online shopping, reading, etc. But Google search is limited. It is merely a pointer to existing websites that answer the user’s query, it gives you almost no insights, it is definitely not executable, and it offers no information push capabilities.
Social and media networks’ Internet – The 2010s (AKA Web 2.0, the “participative social Web”)
To date, search engines are still the primary way that people find information online. But search is not where users spend most of their time, and thus a new, alternative Internet gateway has appeared. Social and media networks have gained immense popularity, with billions of users spending hours consuming content shared by other users of the platforms. Some of the content is users’ own thoughts, feelings, and moments, and some is information produced by official business entities, such as news, products, etc.
For example, for almost a decade now, most published articles have been accessed after content publishers share them through their social presence. These articles then spread either because social network users actively follow the publishers, or virally, when users share them with their followers or friends.
This was the beginning of a new Internet gateway. Internet users were no longer starting their journey to consume information on Google, but rather on their social platforms, and hence a significant share of ad budgets shifted from Google to popular social networks, such as Facebook.
Our Examples in the 2010s
Let’s use our examples to see how the Internet (with the help of social networks) changed our lives.
Example 1– Nothing much changed, except more social media sources became available to the search engines.
Example 2 – Again, nothing much changed. More social media sources became available to the search engine, and you can now go to specific social network groups and ask other users specific questions to help your research and get firsthand information.
Example 3 – People spend more time than ever on social media platforms, and through their engagement they explicitly and implicitly declare their interests. Their information consumption has transitioned from active to passive. Previously, I only got hiking information when I actively searched for it. Now, since I’m in hiking-related social groups, I get information on hiking and ads for sales even when I’m just passing time.
To conclude, Facebook (and other social networks) have dramatically changed the way we interact with the Internet. Instead of actively searching for information we are interested in, we “configure” social networks to stream and push the things we find interesting to our news feeds. This makes them the perfect platform for spending time when we don’t have a specific task in mind. Google noticed that it was losing its gateway supremacy and tried to compete with its own social networks, but failed.
ChatGPT’s Internet – The 2020s (AKA, Web 3.0, the “read, write, execute Web”)
From its early days, people have compared ChatGPT with Google and predicted that it can and will replace the search engine. Headlines such as this, taken from Forbes, and the following tweet, appeared:
But even this enormous hype doesn’t reflect ChatGPT’s full potential. It only scratches the surface of what ChatGPT plugins, powered by GPT agents, can and will do.
While it is clear why people find information retrieval using ChatGPT satisfactory, it still suffers from limitations, such as a lack of grounding and referencing, that give an advantage to classic search engines. The Bing search engine is now addressing exactly that. But since GPT is much more than a search engine, we need to look at other aspects. Before we look at how our journey examples will change, we wish to introduce two new products/technologies:
ChatGPT Plugins
[Skip this paragraph if you know what ChatGPT plugins are] A software plugin (or simply “plugin”) is a software component that adds specific functionality to an existing software application. A plugin is typically designed to extend or enhance the features of the host application, allowing users to perform tasks or access capabilities that would not be possible otherwise.
Plugins are often used in a variety of applications, including web browsers, audio and video editing software, and content management systems. For example, a web browser plugin might allow users to view certain types of multimedia content, such as videos or interactive applications, while an audio editing plugin might provide additional audio effects or filters.
Plugins are typically installed separately from the host application and may be created by the original software developer, although usually they are created by third-party developers. Some plugins may be freely available, while others may be sold commercially or licensed to individual users. A host application that is based on allowing third-party developers to develop substantial, possibly profitable plugins is essentially a platform.
OpenAI’s ChatGPT is such a host application. With the announcement of ChatGPT plugins, it is taking its first step towards becoming a platform. ChatGPT’s plugins will give ChatGPT new functions and enhance our daily engagement with it. Currently (15/4/23), in order to develop new plugins for the ChatGPT platform you need to join a waiting list, but you can already get a sense of what the future holds and the potential of such a platform.
Here are some details from the documentation:
OpenAI plugins connect ChatGPT to third-party applications. These plugins enable ChatGPT to interact with APIs defined by developers, enhancing ChatGPT’s capabilities and allowing it to perform a wide range of actions.
Plugins can allow ChatGPT to do things like:
- Retrieve real-time information; e.g. stock prices, the latest news, etc.
- Retrieve knowledge-base information; e.g. company docs, notes, etc.
- Perform actions on behalf of the user; e.g. book a flight, order food, etc.
Plugin developers expose one or more API endpoints, accompanied by a standardized manifest file and an OpenAPI specification. These define the plugin’s functionality, allowing ChatGPT to consume the files and make calls to the developer-defined APIs.
The AI model acts as an intelligent API caller. Given an API spec and a natural-language description of when to use the API, the model proactively calls the API to perform actions. For instance, if a user asks, “Where should I stay in Madrid for a couple of nights?”, the model may choose to call a hotel reservation plugin API, receive the API response, and generate a user-facing answer combining the API data and its natural language capabilities. Over time, they anticipate the system will evolve to accommodate more advanced use cases.
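The flow described above (a natural-language description of when to use each API, the model choosing one, calling it, and turning the response into an answer) can be sketched in a few lines of Python. This is only a toy illustration: the `Plugin` class, the two fake endpoints, and the word-overlap scoring are assumptions made for this post. They stand in for the real manifest, the real HTTP calls, and the model’s own judgment, none of which are reproduced here.

```python
import re
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Plugin:
    """Hypothetical, simplified stand-in for a plugin's manifest plus its API."""
    name: str
    description_for_model: str      # natural-language hint on when to use the API
    call: Callable[[str], Dict]     # stand-in for an HTTP call to the developer's endpoint


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and split it into a set of words."""
    return set(re.findall(r"[a-z]+", text.lower()))


def choose_plugin(request: str, plugins: List[Plugin]) -> Plugin:
    """Toy replacement for the model's decision: pick the plugin whose
    description shares the most words with the user's request."""
    words = tokenize(request)
    return max(plugins, key=lambda p: len(words & tokenize(p.description_for_model)))


def hotel_api(request: str) -> Dict:
    # Fake response; a real plugin would query a reservation service.
    return {"city": "Madrid", "hotels": [{"name": "Hotel Example", "price_per_night": 120}]}


def stocks_api(request: str) -> Dict:
    # Fake response; a real plugin would query a market-data service.
    return {"ticker": "EXMP", "price": 10.0}


PLUGINS = [
    Plugin("hotels",
           "Use when the user asks where to stay or wants hotel rooms for some nights in a city",
           hotel_api),
    Plugin("stocks",
           "Use when the user asks about stock prices or market quotes",
           stocks_api),
]


def answer(request: str) -> str:
    """Choose a plugin, call it, and wrap the raw data in a user-facing reply."""
    plugin = choose_plugin(request, PLUGINS)
    data = plugin.call(request)
    return f"(via {plugin.name}) {data}"


if __name__ == "__main__":
    print(answer("Where should I stay in Madrid for a couple of nights?"))
```

In the real system the choice is made by the language model reading the descriptions, not by counting shared words, and the final reply is free-form text rather than a formatted dictionary.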
Currently, there are about 20 plugins available on the ChatGPT platform that offer functionality similar to the example above. In terms of potential usage, it is almost the equivalent of using Google when the Internet had only 20 websites, so it may be a little hard to imagine, but this will revolutionize the way we interact with the Internet.
To sum up, plugins allow ChatGPT to do things it has trouble doing on its own, such as reserving a table at a restaurant, performing complex calculations, getting real-time information on stocks, etc.
One example of an important plugin (so important, in fact, that OpenAI developed it itself) is the ability to browse the Internet. Without it, as most of you have already experienced, you get the irritating response that the model is not connected to the Internet and hence can’t answer your query. Another revolutionary capability is a plugin that can generate and run code. This, of course, helps with code-related tasks (or goals), but that by itself is not new. The thought behind this plugin is that the variety of available plugins will always be limited, and the ability to write and run code essentially lets the platform write and run its own plugins. For example, let’s say you need a piece of code that compares deals coming from several different APIs.
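To make the last example concrete, here is a minimal sketch of the kind of glue code such a plugin might generate. Everything in it is hypothetical: the two “providers”, their response formats, and the prices are invented for illustration and hard-coded instead of fetched over the network. The point is only that each API returns deals in its own shape, so the code first normalizes them and then compares.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Deal:
    provider: str
    hotel: str
    total_price: float  # price for the whole stay, assumed to be in one currency


def fetch_from_provider_a() -> List[Dict]:
    # Hypothetical API A: reports a nightly price and the number of nights.
    return [
        {"hotel_name": "Sea View", "nightly": 90.0, "nights": 3},
        {"hotel_name": "Old Town Inn", "nightly": 70.0, "nights": 3},
    ]


def fetch_from_provider_b() -> List[Dict]:
    # Hypothetical API B: reports the total price in cents.
    return [
        {"title": "Sea View", "total_cents": 25500},
        {"title": "Garden Hostel", "total_cents": 19800},
    ]


def normalize_a(raw: List[Dict]) -> List[Deal]:
    """Convert provider A's records into the common Deal format."""
    return [Deal("A", r["hotel_name"], r["nightly"] * r["nights"]) for r in raw]


def normalize_b(raw: List[Dict]) -> List[Deal]:
    """Convert provider B's records into the common Deal format."""
    return [Deal("B", r["title"], r["total_cents"] / 100) for r in raw]


def best_deal_per_hotel(deals: List[Deal]) -> Dict[str, Deal]:
    """Keep only the cheapest offer for each hotel across all providers."""
    best: Dict[str, Deal] = {}
    for deal in deals:
        if deal.hotel not in best or deal.total_price < best[deal.hotel].total_price:
            best[deal.hotel] = deal
    return best


def cheapest(deals: List[Deal]) -> Deal:
    """Return the single cheapest deal overall."""
    return min(deals, key=lambda d: d.total_price)


if __name__ == "__main__":
    all_deals = normalize_a(fetch_from_provider_a()) + normalize_b(fetch_from_provider_b())
    for hotel, deal in best_deal_per_hotel(all_deals).items():
        print(f"{hotel}: {deal.total_price:.2f} via provider {deal.provider}")
    print("Cheapest overall:", cheapest(all_deals))
```

A real version would replace the two `fetch_` functions with actual API calls and would also have to deal with currencies, dates, and cancellation terms.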
GPT-based Agents
[Skip this paragraph if you know what GPT-based agents are] GPT agents enable users to build autonomous artificial intelligence agents that carry out different online tasks. They are based on GPT, which enables them to function autonomously without help from a human. These agents can structure online searches, assign subtasks, and launch new agents to finish them. For example, an agent can recursively debug, develop, and self-improve code that was written using GPT-4. Users define the goals of their AI, and systems such as AgentGPT and AutoGPT automatically divide them into more manageable tasks. These systems are very accessible to the public because of their no-code, browser-based approach and because they are open source.
In other words, take a GPT-like engine, give it a task, then ask it to break the task down into smaller tasks, prioritize them, launch them, evaluate the progress, and then continue and retry different approaches until all the subtasks are completed successfully, and hence the main goal as well. This ability to split the task into subtasks, reflect on the results, assign another agent to failed tasks, try new solutions, and so on, repeatedly and with strong perseverance, leads to surprising results. Add to that a memory, a live connection to the Internet, the ability to write and run code, and an endless number of ChatGPT plugins, and you get a hard-working, goal-oriented autonomous entity with unprecedented AI assistant capabilities.
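The loop described above can be sketched roughly as follows. This is a deliberately simplified illustration, not how AutoGPT or any other specific tool is implemented: `plan` and `execute` are hypothetical stubs that stand in for calls to a GPT-like model and to plugins, and the “failure” of the first booking attempt is simulated so that the retry path is visible.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple


def plan(goal: str) -> List[str]:
    """Stub for asking a GPT-like model to break a goal into ordered subtasks."""
    return [
        f"research destinations for: {goal}",
        f"compare flights for: {goal}",
        f"book hotel for: {goal}",
    ]


def execute(task: str, attempt: int) -> Tuple[bool, str]:
    """Stub for running one subtask (e.g. via a plugin).
    Simulates a booking that fails on the first attempt and succeeds on the second."""
    if task.startswith("book hotel") and attempt == 1:
        return False, "first approach failed"
    return True, f"completed on attempt {attempt}"


def run_agent(goal: str, max_attempts: int = 3):
    """Plan, execute, evaluate, and retry until every subtask succeeds or gives up."""
    queue: Deque[Tuple[str, int]] = deque((task, 1) for task in plan(goal))
    results: Dict[str, str] = {}              # the agent's "memory" of finished subtasks
    log: List[Tuple[str, int, bool]] = []     # every attempt, for later reflection

    while queue:
        task, attempt = queue.popleft()
        ok, outcome = execute(task, attempt)
        log.append((task, attempt, ok))
        if ok:
            results[task] = outcome
        elif attempt < max_attempts:
            # Re-queue the failed subtask so it is retried (a real agent
            # would also change its approach based on the failure).
            queue.append((task, attempt + 1))
        else:
            results[task] = "gave up"
    return results, log


if __name__ == "__main__":
    results, log = run_agent("a week in Thailand")
    for task, attempt, ok in log:
        print(f"attempt {attempt} {'ok  ' if ok else 'FAIL'} {task}")
```

The interesting part in practice is what the sketch leaves out: the model rewriting its plan after each failure, and deciding on its own when the main goal has been reached.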
Researchers (from Stanford and Google) suggest that such an agent, with memory, character, and a list of desires, could in the future be a kind of duplicate of you and represent you in the real world.
You can try some of these chaining and agent tools here:
https://github.com/yoheinakajima/babyagi
https://github.com/Significant-Gravitas/Auto-GPT
https://python.langchain.com/en/latest/index.html
But how do agents improve my life? And what do ChatGPT plugins have to do with this? Let’s explain it using our 2nd example.
Our Examples in the 2020s
Example 1 – Google allowed me to search an endless database, with the option to fine-tune my query to the subtopics I am interested in. I can also find unofficial information. But I have to read the results, screen out the most relevant data, and summarize it into a coherent essay. Now, I can ask ChatGPT to write the essay for me, and I can give instructions to fine-tune the areas on which I want to focus. This saves me the need to read, filter, summarize, and write. A task that may have taken at least a day now takes 1 hour tops, including proofreading the outcome. This is usually the use case that people think about when they suggest that ChatGPT will replace Google. Still, our 2nd example, a classic executable information retrieval task, will demonstrate why ChatGPT is more disruptive than this.
Example 2 – Google freed us from going to the local travel agency, explaining our needs, and getting a limited number of travel deals at the agency’s prices. It allowed us to be our own travel agents. We can research destinations, look at pictures and videos of the places we’re going to visit, compare flight and hotel prices, etc.
Now imagine that, with enough APIs and information on your preferences, an agent will not only reach its goals but also mimic the way you act online. For example, let’s say you have a habit of sending an email to the hotel you’re about to stay at and asking for a quiet room. At some point, when asked to book a hotel room, the agent will figure out that this is part of the task of booking a room that is customized to you. Essentially, you will talk to this AI assistant the way you used to talk to your travel agent, but with the full context of what you usually prefer and with endless options. The agent will ask for the preferred dates, offer alternatives, recommend and book hotels & restaurants, etc. This has profound implications, as it is now your agents that roam the Internet and not you. You have a multi-agent interface that you engage with, which digests this information and either serves it to you to choose from or acts independently. Think of what this can do to the online ads industry. Business models are going to change profoundly, as this is no longer a human-facing network but rather an API network.
Example 3 – This is actually an interesting missing piece of the puzzle. How will ChatGPT change the way we act and (passively) consume information on social platforms? One theory states that one or more agents will be able to replace your social presence, mimicking your behavior. The agents will tailor and summarize content according to your interests and needs, and eventually also act on your behalf.
To Conclude
ChatGPT plugins combined with a GPT agents system will be our new Internet gateway and will become the real Web 3.0, the execute Web. Why? Because the combination allows us to execute (basically, act) on the basis of complex executable-information retrieval. Some say the plugins are a kind of app store, but they are actually much more than that. In an app store, you still have to search for the app that will best serve you, review it, download it, and try it, which is a huge overhead. Wouldn’t it be simpler for an agent to try a few plugins, compare them, and give you the output of the plugin that produced the best results?
OpenAI has not yet declared its vision for GPT agents, but it exists implicitly in the plugin announcement. Why would anyone stop at a single task when they can perform elaborate ones? In fact, even today, if you ask ChatGPT for a high-level goal, it will already try to divide it into subtasks.
The Internet’s business model disrupted
We used to have a journey that was accompanied by ads, which financed the creators and services. Now it’s the agents who make the journey for me. Are there going to be “ads” for agents? Surely, the chat interface will still be able to display ads, but what about the rest of the web? In this case, bidding on which service (API) the agents are going to use means a lot of money for the API providers. Will APIs use SEO to convince agents to use their services over others? Will the user pay a commission for each API that helps them reach their goals? When a goal involves a monetary transaction, this is a clear yes, but when we only retrieve information, what do the content creators make?
To sum up, we shouldn’t underestimate the implications such capabilities can have on the future of the Internet, and this also applies to other areas and industries such as travel, education, content creation, etc. But it’s not all perfect. A lot has been written about ChatGPT’s tendency to hallucinate wrong information and about its lack of references or grounding to support its results, but these drawbacks are dwarfed by the advantages. It really depends on the end application and how fail-proof it needs to be.