A friendly introduction to MCP, the USB of AI

Hands On Getting large language models to actually do something useful usually means wiring them up to external data, tools, or APIs. The trouble is, there’s no standard way to do that – yet.
Anthropic thinks it has an answer: MCP, an open protocol that promises to be USB-C for AI. So we took it for a spin to see what works.
Introduced late last year, the open source Model Context Protocol (MCP) project was developed by Claude’s model builders as “a universal, open standard for connecting AI systems with data sources.”
It’s not just data stores like databases, either. MCP servers can expose various tools and resources to AI models, enabling functionality such as querying databases, launching Docker containers, or interacting with messaging platforms like Slack or Discord.
If MCP sounds at all familiar, that’s because it’s garnered a lot of attention in recent months. The official MCP server GitHub repo alone now counts dozens of official integrations with major software vendors including Grafana, Heroku, and Elasticsearch, along with more than 200 community and demo servers.
If you want to connect an LLM to a SQL database, manage your Kubernetes cluster, or automate Jira, there’s a good chance there’s already an MCP server available to do it. In fact, MCP has attracted so much attention that OpenAI and Google are now throwing their weight behind the project.
In this hands-on guide, we’ll take a closer look at how MCP works in practice, what you can do with it, some of the challenges it faces, and how to deploy and integrate MCP servers, both with Claude Desktop and with your own models via Open WebUI.
A quick overview of MCP
Before we jump into how to spin up an MCP server, let’s take a quick look at what’s happening under the hood.
At a high level, MCP uses a typical client-server architecture, with three key components: the host, client, and server.

Here’s a high-level look at MCP’s architecture … Credit: modelcontextprotocol.io
- The host is typically a user-accessible front end, such as Claude Desktop or an IDE like Cursor, and is responsible for managing one or more MCP clients.
- Each client maintains a one-to-one connection with a server over the MCP protocol. All messages between client and server are exchanged using JSON-RPC, but the transport layer varies by implementation, with stdio, HTTP, and server-sent events (SSE) currently supported.
- The MCP server itself exposes specific capabilities to the client, making them accessible to the host in a standardized way. This is why MCP is described in the docs as being like USB-C for AI.
Just like USB largely eliminated the need for different interfaces to interact with peripherals and storage devices, MCP aims to allow models to talk to data and tools using a common language.
Depending on whether the resource is local, a SQLite database for example, or remote, such as an S3 bucket, the MCP server will either access the resource directly or act as a bridge to relay API calls. The USB-C analogy is particularly apt in the latter case, as many MCP servers effectively serve as adapters, translating vendor-specific interfaces into a standardized format that language models can more easily interact with.
However, the important bit is that the way these resources are exposed and responses are returned to the model is consistent.
One of the more interesting nuances of MCP is that it works both ways. Not only can the host application request data from the server, but the server can also talk to the LLM via a sampling/createMessage request to the client. Unfortunately, this functionality isn’t universally supported just yet, but it could open the door to some interesting agentic workflows.
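To make that a bit more concrete, here is roughly what the traffic looks like in each direction. Both messages are ordinary JSON-RPC; the tool name and arguments below are invented purely for illustration, so don't expect them to match any real server.

import json

# Client -> server: ask an MCP server to run one of the tools it exposes
# (the tool name and arguments here are made up for illustration)
tool_call = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "get_forecast", "arguments": {"city": "London"}},
}

# Server -> client: a sampling request, asking the client to have the LLM generate text
sampling_request = {
    "jsonrpc": "2.0",
    "id": 2,
    "method": "sampling/createMessage",
    "params": {
        "messages": [
            {"role": "user", "content": {"type": "text", "text": "Summarize these results"}}
        ],
        "maxTokens": 200,
    },
}

print(json.dumps(tool_call, indent=2))
print(json.dumps(sampling_request, indent=2))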
Now that we’ve got a better understanding of what MCP is and how it works, let’s dive into putting it to use.
Testing MCP with Claude Desktop
Given that Anthropic birthed MCP, one of the easiest ways to get your hands dirty with it is, unsurprisingly, using Claude Desktop.
If you’d rather not use an outside LLM provider, such as Anthropic, in the next section we’ll explore how to connect MCP servers to local models and the popular Open WebUI interface.
To get started, we’ll need a few dependencies in addition to Claude Desktop, as MCP servers can run in a number of different environments. For the purposes of this demo, you’ll need to install Node.js, Python 3, and Python’s uv package manager, which provides the uvx command used below.
With your dependencies installed, launch Claude Desktop and sign in using your Anthropic account. Next, navigate to the application settings and then to the “Developer” tab.

From the Claude Desktop settings, open the “Developer” tab and click “Edit Config” to generate a new MCP config file
Once there, click the “Edit Config” button. This will automatically generate an empty claude_desktop_config.json file under the ~/Library/Application Support/Claude/ folder on Mac or the %APPDATA%\Claude\ folder on Windows. This is where we’ll add our MCP client configuration. To test things out we’ll be using the System Time and File System MCP servers.
Open the claude_desktop_config.json file in your preferred text editor or IDE — we’re using VSCodium — and replace its contents with the time-server config below. Feel free to adjust to your preferred time zone.
{ "mcpServers": { "time": { "command": "uvx", "args": ["mcp-server-time", "--local-timezone=UTC"] } } }
Save the file and restart Claude Desktop. When it relaunches, you should notice a new icon in the chat box indicating that the tool is available for use.
We can then test it out by asking a simple question, like: “What time is it in New York?” Claude on its own doesn’t know the local time, but now has the ability to query your time server to figure it out.
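Incidentally, you don't need Claude Desktop to poke at an MCP server. Here's a minimal sketch using the official MCP Python SDK (installed with pip install mcp) that launches the same time server over stdio and prints the tools it advertises; treat it as a starting point rather than gospel.

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    # Launch the time server over stdio, the same way Claude Desktop does
    params = StdioServerParameters(command="uvx", args=["mcp-server-time", "--local-timezone=UTC"])
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            tools = await session.list_tools()
            print([tool.name for tool in tools.tools])

asyncio.run(main())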

Claude on its own doesn’t know what time it is at any given point, but given access to the MCP time server, it can now tell time
Now we’ll try out the File System MCP server by updating the claude_desktop_config.json file with the following:
{ "mcpServers": { "time": { "command": "uvx", "args": ["mcp-server-time", "--local-timezone=UTC"] }, "filesystem": { "command": "npx", "args": [ "-y", "@modelcontextprotocol/server-filesystem", "/Users/username/Desktop", "/path/to/other/allowed/dir" ] } } }
Make sure you update /Users/username/Desktop and /path/to/other/allowed/dir with the directories on your file system you’d like to give the model access to before saving.
Relaunch Claude Desktop and you should notice you now have access to more tools than before. Specifically, the File System MCP server allows the model to perform a variety of file system functions, including the following (we’ll exercise one of them from a script right after this list):
- Reading and writing files
- Editing files
- Creating or listing directories
- Moving or searching files
- Retrieving file info like size or creation date
- Listing which directories it has access to
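The same stdio-client sketch from the time-server test works here too. Below we call the tool that reports which directories the server is allowed to touch; we're assuming it's named list_allowed_directories, so check the output of list_tools() on your install if the call fails.

import asyncio

from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def main():
    params = StdioServerParameters(
        command="npx",
        args=["-y", "@modelcontextprotocol/server-filesystem", "/Users/username/Desktop"],
    )
    async with stdio_client(params) as (read, write):
        async with ClientSession(read, write) as session:
            await session.initialize()
            # Assumed tool name for "list which directories it has access to"
            result = await session.call_tool("list_allowed_directories", {})
            print(result.content)

asyncio.run(main())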
In this case we’ve given Claude access to our desktop, so we can ask things like:
- Prompt: “What’s on my desktop?”
- Prompt: “Can you tidy up my desktop?”
- Prompt: “Rename file.txt to doc.md”
Some observations from the Vulture technical docs desk:
- We did observe some flakiness with the File System MCP server on longer tasks, so your mileage may vary.
- If you prefer to use pip or Docker, you can find alternative configurations for the MCP Time and File System servers here.
Using MCP with your own models and Open WebUI
If you’d prefer to try out MCP with your own self-hosted models, Open WebUI recently merged support for the protocol via an OpenAPI-compatible proxy.
If you’re not familiar with Open WebUI, it’s a popular open source web interface similar to ChatGPT’s, which integrates with inference servers like Ollama, Llama.cpp, vLLM, or really any OpenAI-compatible API endpoint.
Prerequisites
- This guide assumes you’re familiar with running models locally. If you need help, we’ve got a guide on deploying local LLMs on just about any hardware in a matter of minutes right here.
- You’ll also need to deploy and configure Open WebUI in Docker. We have a detailed guide on setting this up here.
- Speaking of Docker, we’ll be using the container runtime to spin up our MCP servers as well.
Once you’ve got Open WebUI up and running with your locally hosted models, extending MCP tool support is fairly easy using Docker.
As we mentioned earlier, Open WebUI supports MCP via an OpenAPI proxy server, which exposes MCP tools as standard RESTful APIs. According to the developers, this has a number of benefits, including better security, broader compatibility, and error handling, while keeping things simple.
Configuring MCP servers is arguably simpler as a result, but it does require converting the JSON configs used by Claude Desktop into a standard executable string.
For example, if we want to spin up a Brave Search MCP server, which will query Brave Search as needed from your input prompt, we would decompose the config into a simple docker run command.
config.json:
{ "mcpServers": { "brave-search": { "command": "npx", "args": [ "-y", "@modelcontextprotocol/server-brave-search" ], "env": { "BRAVE_API_KEY": "YOUR_API_KEY_HERE" } } } }
Docker run command:
docker run -p 8001:8000 --name MCP-Brave-Search -e BRAVE_API_KEY=YOUR_API_KEY_HERE ghcr.io/open-webui/mcpo:main --api-key "top-secret" -- npx -y @modelcontextprotocol/server-brave-search
If you don’t already have a Brave Search API key, you can get one for free here and use it to replace YOUR_API_KEY_HERE. Also, change the top-secret API key to something unique, private, and secure; it will be needed later.
Tip: If you want to run this server in the background, append a -d after run. You can then check server logs by running docker logs MCP-Brave-Search.
If you’d like to spin up multiple MCP servers in Docker, you can run this command again by:
- Changing out 8001 for another open port
- Updating the --name value
- Adjusting the server command accordingly
Connecting the server to Open WebUI
Once your MCP server is up and running, we can connect to it either at the user or the system level. The latter requires an additional access control list (ACL) configuration to make the tools available to users and models. To keep things simple, we’ll be going over how to connect to MCP servers at the individual user level.
From the Open WebUI dashboard, navigate to user settings and open the “Tools” tab. From there, create a new connection, and enter the URL — usually something like http://Your_IP_Address_Here:8001 — and the top-secret API key you set earlier.
If everything works correctly, you should get a couple of green-toast messages in the top right corner, and you should see a new icon indicating how many tools are available to the model next to the chat box.
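If instead the connection fails, it's worth checking the proxy itself before blaming Open WebUI. Since mcpo presents the MCP server as an OpenAPI service, we're assuming it publishes its generated schema at /openapi.json and accepts the --api-key value as a Bearer token; a quick sketch:

import requests

# Assumptions: mcpo serves its schema at /openapi.json and expects the
# --api-key value as a Bearer token
resp = requests.get(
    "http://localhost:8001/openapi.json",
    headers={"Authorization": "Bearer top-secret"},
)
resp.raise_for_status()
schema = resp.json()
print(list(schema.get("paths", {}).keys()))  # the endpoints the proxy exposes as tools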
Now ask your locally installed and selected model something it wouldn’t know; it can automatically trigger a search query and return the results.

Once enabled, the model will perform a Brave search if you ask it a question it wouldn’t otherwise know, such as when the International Supercomputing Conference kicks off this year
Note that this particular MCP server only includes the search API and doesn’t actually scrape the pages. For that, you’d want to look at something like the Puppeteer MCP server, or take advantage of Open WebUI’s built-in web search and crawling features, which we previously explored in our RAG tutorial.
A word on native function calling in Open WebUI
By default, Open WebUI handles tool calling internally, determining the appropriate tool to call each time a new message is sent. The advantage of this approach is that it works with just about any model and is generally consistent in its execution.
The downside is that tools can only be called once per exchange, which becomes a problem if the model needs to access a tool multiple times to meet the user’s request. For example, if the model were querying a SQL database, it might first need to retrieve the schema to figure out how to format the actual query.
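Here's a purely hypothetical sketch of that two-step exchange, with made-up tool names, just to show why a single tool call per turn isn't always enough:

# Hypothetical tool names; the point is that the second call depends on the
# result of the first, so the model needs more than one tool call per turn.
step_one = {"tool": "describe_table", "arguments": {"table": "orders"}}
# ...the model reads the returned schema, then builds the real query...
step_two = {
    "tool": "run_query",
    "arguments": {"sql": "SELECT COUNT(*) FROM orders WHERE status = 'shipped'"},
}
print(step_one, step_two, sep="\n")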
To get around this, you can take advantage of the model’s native tool-calling functionality, which accesses tools in a reasoning and acting (ReAct) style loop.
The tricky bit is that while plenty of models advertise native tool support, many smaller ones aren’t that reliable. With that said, we’ve had pretty good luck with the Qwen 2.5 family of models running in Ollama.
Enabling native function calling in Open WebUI is relatively easy: it can be toggled on from the “Controls” menu in the top right corner of the chat window. Note that when native function calling is enabled, many inference servers, such as Ollama, disable token streaming, so don’t be surprised if messages start appearing all at once rather than streaming in as they normally would.

You can enable native function calling in Open WebUI from the Chat Controls menu (the little hamburger menu in the top right)
Now when you trigger a tool call, you’ll notice a different tooltip indicating which tool was used, with a drop-down to see what information, if any, was returned.
In addition to making it relatively easy to integrate existing MCP servers, the protocol’s developers have also gone to great lengths to make it easy to build new ones.
They provide SDKs for multiple languages, including Python, TypeScript, Java, Kotlin, and C#, to make it easier to adapt existing code for use in an MCP server.
To test this out, we mocked up this simple calculator MCP server in about five minutes using the Python example template.
calculator_server.py
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("MathSupport")

@mcp.tool()
def calculator(equation: str) -> str:
    """
    Calculate the answer to an equation.
    """
    try:
        # Fine for a local demo, but eval() is not safe on untrusted input
        result = eval(equation)
        return f"{equation} = {result}"
    except Exception as e:
        print(e)
        return "Invalid equation"
From there, connecting it to Open WebUI is as simple as spinning up another MCP proxy server in Docker.
docker run -p 8002:8000 --name MCP-Calculator -v ~/calculator/calculator_server.py:/calculator_server.py ghcr.io/open-webui/mcpo:main --api-key "top-secret" -- uv run --with mcp[cli] mcp run /calculator_server.py
Or if you prefer Claude Desktop:
{ "mcpServers": { "MathSupport": { "command": "uv", "args": [ "run", "--with", "mcp[cli]", "mcp", "run", "/PATH_TO_PYTHON_SCRIPT_HERE/calculator_server.py" ] } } }
Obviously, this doesn’t even scratch the surface of all the features and capabilities supported by MCP, but it should at least give you some idea of how existing code can be adapted for use with the protocol.
MCP is far from perfect
With thousands of available servers, and now OpenAI and Google backing the open source protocol, MCP is on track to become the de facto standard for AI integrations.
But while the protocol has managed to attract considerable attention in the months since its debut, not everyone is happy with its current implementation, particularly with regard to security, complexity, and scalability.
Security remains one of the biggest criticisms of MCP. The ease with which these servers can be deployed, combined with the fact that many of them can execute arbitrary code, is potentially problematic unless proper vetting, safeguards, and sandboxing are put in place.
We’ve already seen at least one instance in which an MCP server can be exploited to leak message history in WhatsApp.
Beyond the obvious security concerns, there’s also the issue that while MCP can vastly simplify the integration of services and data, it’s still reliant on an LLM to take advantage of them. And while most modern generative models claim some kind of tool-calling functionality, a quick peek at the Berkeley Function-Calling Leaderboard reveals that some are better than others.
This is why Open WebUI defaults to its integrated, albeit rudimentary, function-calling capabilities, as it is still more reliable than many models’ built-in tool calling capabilities.
And then of course, from a manageability standpoint, wrangling potentially dozens of MCP servers adds operational complexity to deployments, even if they require less work to build or deploy than more mature AI integrations.
With that said, for a project announced less than six months ago, a lot of this is to be expected and many of these concerns will be addressed as it matures. Or so we hope. Who says we’re not optimists?
Additional resources
If you’re interested in playing with more MCP servers, we recommend checking out the official project page on GitHub as well as Frank Fiegel’s Open-Source MCP servers catalog, which includes more than 3,500 servers as of this writing.
Meanwhile, if you’re interested in building your own MCP servers or clients, we recommend checking out the official MCP docs for more information and example code.
The Register aims to bring you more local AI content like this in the near future, so be sure to share your burning questions in the comments section and let us know what you’d like to see next. ®