ollama-remote

Automatically configure Ollama on any remote Linux server (GPU providers, Google Colab, Kaggle, etc.) and obtain a URL for it via a Cloudflared tunnel.

Note: For Google Colab, here is an example notebook. Per Colab's terms of service, this is only allowed for paid Colab users. If you use it with a free account, do so at your own risk.

This is useful for faster experimentation when Ollama models run too slowly locally, and for generating synthetic data in large batches.

Usage

Install the package via pip.

pip install ollama-remote

Then, just run ollama-remote on the remote server and it will give you back the URL.

ollama-remote

You will get back the commands to copy and run locally.

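To quickly check that the tunnel is reachable from your local machine, you can list the models on the remote server through Ollama's /api/tags endpoint. This is a minimal sketch assuming the requests package is installed locally; the URL is a placeholder of the kind ollama-remote prints.

import requests

OLLAMA_URL = "https://spa-visiting-voices-omissions.trycloudflare.com"  # replace with your assigned URL

# Ollama's /api/tags endpoint lists the models currently pulled on the server
resp = requests.get(f"{OLLAMA_URL}/api/tags", timeout=10)
resp.raise_for_status()
print([m["name"] for m in resp.json()["models"]])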

Use case 1: Run via ollama

Once you set OLLAMA_HOST to the assigned URL, you can run any ollama command from your local terminal. It will feel like working locally, but the actual model inference happens on the server. Make sure you have the ollama CLI installed locally.

export OLLAMA_HOST='https://spa-visiting-voices-omissions.trycloudflare.com'
ollama run phi3:mini --verbose

If the server has a GPU, as Colab does, this will be much faster.


The commands are the same as with regular ollama, and you can pull any model that fits on the server's GPU.

ollama pull phi3:mini
ollama run phi3:mini
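
If you prefer Python over the CLI, the official ollama Python package can also point at the tunneled URL. Below is a minimal sketch; it assumes you have installed the package locally (pip install ollama) and already pulled phi3:mini on the server.

from ollama import Client

# Point the client at the tunneled server instead of the default localhost
client = Client(host="https://spa-visiting-voices-omissions.trycloudflare.com")

response = client.chat(
    model="phi3:mini",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response["message"]["content"])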

Use case 2: OpenAI SDK

You are also given code for using the model through the OpenAI SDK. Make sure to pull the model specified in the code beforehand via ollama pull phi3:mini.

from openai import OpenAI

client = OpenAI(
    base_url="https://spa-visiting-voices-omissions.trycloudflare.com/v1/",
    api_key="ollama",
)

response = client.chat.completions.create(
    model="phi3:mini",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
)
print(response.choices[0].message.content)
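
Because the tunnel exposes a standard OpenAI-compatible endpoint, large-batch synthetic data generation is just a loop (or a small thread pool) over the same client. The sketch below uses hypothetical prompts and assumes phi3:mini is already pulled on the server.

from concurrent.futures import ThreadPoolExecutor

from openai import OpenAI

client = OpenAI(
    base_url="https://spa-visiting-voices-omissions.trycloudflare.com/v1/",
    api_key="ollama",
)

# Hypothetical prompts; replace with your own generation tasks
prompts = [f"Write a one-sentence product description for item #{i}." for i in range(20)]

def generate(prompt):
    response = client.chat.completions.create(
        model="phi3:mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

# A few concurrent workers keep the remote GPU busy without overwhelming the tunnel
with ThreadPoolExecutor(max_workers=4) as pool:
    for text in pool.map(generate, prompts):
        print(text)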