Merged
19 changes: 18 additions & 1 deletion CHANGELOG.md
@@ -5,7 +5,24 @@ All notable changes to kagglelink will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
## [1.2.0] - 2025-01-09

### Added
- **Environment variable fallback support** - Configure via `KAGGLELINK_KEYS_URL` and `KAGGLELINK_TOKEN` env vars as alternative to CLI flags (#14)
- **Unified logging system** - Consistent logging with emoji indicators (⏳ ✅ ❌), timestamps, elapsed time tracking, and error categorization (#15)
- **Success banner with zrok token** - Clear connection instructions displayed after setup with the zrok share token
- **Configuration source logging** - Shows whether config came from CLI args or environment variables
- **Save & Run All tip in README** - Guidance for avoiding Kaggle session timeouts

### Changed
- **Shallow clone** (`--depth 1`) for faster repository setup (#16)
- **Improved error messages** - Error categorization (prerequisite, network, upstream) with actionable suggestions (#16)
- **Commit hash logging** - Shows git commit after successful clone for debugging (#16)

### Fixed
- **gum color codes in Kaggle logs** - Removed ANSI escape codes from success banner for cleaner output in Kaggle's minimal log viewer (#17)
- **SSH command removed from banner** - Prevents confusion with host key warnings on ephemeral instances; SSH instructions now in README only (#17)
- **gum --yes flag** - Fixed interactive prompt during package installation for non-interactive environments

## [1.1.0] - 2025-12-07

165 changes: 113 additions & 52 deletions README.md
@@ -4,93 +4,139 @@ A streamlined solution for accessing Kaggle computational resources via SSH and

## Overview

kagglelink allows you to SSH into Kaggle and use its resources, or to run Kaggle notebooks remotely from VSCode with better coding support and a better development environment.
KaggleLink allows you to connect to Kaggle environments via SSH, enabling you to leverage Kaggle's computational resources.

![Image](https://github.com/user-attachments/assets/db4454ff-5545-4094-adeb-47b74ab0c33a)
![](https://github.com/user-attachments/assets/db4454ff-5545-4094-adeb-47b74ab0c33a)

## Requirements
## Getting Started

1. A Zrok token is required for establishing the tunnel. Create an account at [myZrok.io](https://myzrok.io/) to get your token.
### Requirements

2. Ensure your account is on the Starter plan to utilize NetFoundry's public Zrok instance.
To use KaggleLink, you need:

3. You need to upload your public key to a github repository or a public file hosting service
1. **Zrok Token**: A Zrok token is essential for establishing the secure tunnel. Create an account at [myZrok.io](https://myzrok.io/) to obtain your token. Ensure your account is on the **Starter plan** to utilize NetFoundry's public Zrok instance, which offers 2 environment connections (one for your local machine, one for the Kaggle instance).
2. **Public SSH Key**: Your public SSH key needs to be accessible via a URL, either from a GitHub repository or another public file hosting service.

## Quick Setup
### Quick Setup (on Kaggle)

One line command setup?

Paste this into Kaggle cell
Execute the following one-line command in a Kaggle notebook cell. This script will set up Zrok and SSH on your Kaggle instance.

```bash
!curl -sS https://bhdai.github.io/setup | bash -s -- -k <public_key_url> -t <zrok_token>
```

> [!NOTE]
>
> replace <public_key_url> with the URL of your public key file and <zrok_token> with your Zrok token.
> Replace `<public_key_url>` with the URL of your public SSH key file and `<zrok_token>` with your Zrok token.

Wait for the setup to complete. You should see output similar to this upon successful configuration:

Wait for the setup to finish, you should see something like this at the end
![](https://github.com/user-attachments/assets/22f564f3-8622-4c6c-bb82-9c9c63dd322a)

![Image](https://github.com/user-attachments/assets/22f564f3-8622-4c6c-bb82-9c9c63dd322a)
> [!TIP]
> **Avoiding Session Timeouts**: Kaggle's interactive notebook sessions have idle timeouts. For long-running remote development, use the **"Save & Run All"** feature by clicking the **Save Version** button (top right) and selecting "Save". This runs your notebook as a background job, avoiding timeout interruptions. You can still get the zrok share token from the log screen (click the active event at the bottom left -> **Open Logs in Viewer**).

### How to setup public key?
#### How to set up your public SSH key?

Generate a new SSH key pair on your local machine (if you haven't already):
1. **Generate an SSH key pair** on your local machine (if you haven't already). Use a descriptive filename, for example:

```bash
ssh-keygen -t rsa -b 4096 -C "kaggle_remote_ssh" -f ~/.ssh/kaggle_rsa
```
```bash
ssh-keygen -t rsa -b 4096 -C "kaggle_remote_ssh" -f ~/.ssh/kaggle_rsa
```

Create a github repository and push the `~/.ssh/kaggle_rsa.pub` file to it. Make sure the repository is public. Once finished, you can get the public key URL by navigating to the file in your repository and clicking on the "Raw" button.
2. **Upload your public key** (`~/.ssh/kaggle_rsa.pub`) to a public GitHub repository or a similar public file hosting service.
3. **Obtain the Raw URL**: Navigate to your uploaded public key file in your repository and click the "Raw" button.

![Image](https://github.com/user-attachments/assets/ec9a884c-1c97-4be6-bd6d-03ac5dd16de7)
![](https://github.com/user-attachments/assets/ec9a884c-1c97-4be6-bd6d-03ac5dd16de7)

Copy the URL from your browser's address bar. It usually takes the form like this `https://raw.githubusercontent.com/<username>/<repo_name>/refs/heads/main/<file_path>`
Copy the URL from your browser's address bar. It typically looks like `https://raw.githubusercontent.com/<username>/<repo_name>/refs/heads/main/<file_path>`.
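
As a rough illustration of the URL form above, the following Python sketch builds a raw URL from its parts and does a loose sanity check on the text of a public key. The helper names `raw_key_url` and `looks_like_public_key`, the example values, and the list of key types are assumptions made for this example; they are not part of KaggleLink.

```python
from urllib.parse import quote

# Key types commonly seen at the start of an OpenSSH public key line.
# This list is illustrative, not exhaustive.
KNOWN_KEY_TYPES = ("ssh-rsa", "ssh-ed25519", "ecdsa-sha2-nistp256")


def raw_key_url(username: str, repo: str, file_path: str, branch: str = "main") -> str:
    """Build a raw.githubusercontent.com URL in the form shown above."""
    return (
        "https://raw.githubusercontent.com/"
        f"{quote(username)}/{quote(repo)}/refs/heads/{quote(branch)}/{quote(file_path)}"
    )


def looks_like_public_key(text: str) -> bool:
    """Return True if the first line has the 'type base64 [comment]' shape."""
    lines = text.strip().splitlines()
    if not lines:
        return False
    parts = lines[0].split()
    return len(parts) >= 2 and parts[0] in KNOWN_KEY_TYPES


# Hypothetical username and repository, for illustration only.
print(raw_key_url("alice", "keys", "kaggle_rsa.pub"))
```

A check like `looks_like_public_key` can help catch the common mistake of uploading the private key (`kaggle_rsa`) instead of the `.pub` file.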

### How to get zrok token?
#### How to get your Zrok token?

Create your zrok account, if you haven't already, go [here](https://myzrok.io/billing) and change your plan to Starter plan, and then create a new token. Finally visit [https://api-v1.zrok.io](https://api-v1.zrok.io/), you should setup and get your token there
1. If you don't have one, create your Zrok account at [myZrok.io](https://myzrok.io/).
2. Go to the [billing page](https://myzrok.io/billing) and ensure your plan is set to **Starter**.
3. Create a new token.
4. Visit [https://api-v1.zrok.io](https://api-v1.zrok.io/) to retrieve and manage your Zrok tokens.

## Client Setup
### Advanced: Environment Variables

After completing the Kaggle setup, you'll receive a token. Follow these steps on your local machine:
For automated pipelines or power users, you can configure KaggleLink using environment variables instead of CLI flags.

1. Install Zrok locally by following the [official installation guide](https://docs.zrok.io/docs/guides/install/).
| Variable | CLI Equivalent | Description |
|----------|----------------|-------------|
| `KAGGLELINK_KEYS_URL` | `-k` | URL to your public SSH key |
| `KAGGLELINK_TOKEN` | `-t` | Your Zrok token |

For Arch-based distributions, you can use:
```bash
yay -S zrok-bin
```
> [!NOTE]
> CLI arguments (`-k`, `-t`) always override environment variables if both are present.
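
The precedence rule in the note above can be pictured with a small Python sketch. This only illustrates the rule (a CLI value wins, otherwise the environment variable is used); it is not the setup script's actual implementation, and the function name `resolve_setting` and the source labels are hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    env: Mapping[str, str] = os.environ,
) -> Tuple[Optional[str], str]:
    """Return (value, source), preferring the CLI value over the environment."""
    if cli_value:
        return cli_value, "cli"
    env_value = env.get(env_name)
    if env_value:
        return env_value, "env"
    return None, "missing"


# Both are present, so the CLI value is chosen.
value, source = resolve_setting(
    "token-from-flag", "KAGGLELINK_TOKEN", {"KAGGLELINK_TOKEN": "token-from-env"}
)
print(source)
```

Returning the source alongside the value mirrors the idea of logging where the configuration came from.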

2. Enable zrok in your local machine
```bash
zrok enable <zrok-token>
```
#### Setting Environment Variables in Kaggle

2. Access your Kaggle instance using the token:
```bash
zrok access private <the_token_from_kaggle>
```
The most secure way to pass these credentials is using **Kaggle Secrets**.

3. This will open a dashboard displaying your connection details, including a local address like `127.0.0.1:9191`.
1. Add your secrets in the Kaggle notebook sidebar (**Add-ons** -> **Secrets**).
2. Use the following Python snippet in a cell *before* running the setup script:

## SSH Connection
```python
from kaggle_secrets import UserSecretsClient
import os

*For VSCode, check out the [old instructions](https://github.com/bhdai/kagglelink/blob/ngrok/README.md#connect-via-ssh) (will update this eventually)*
user_secrets = UserSecretsClient()

# Set environment variables from secrets
# Ensure you have added 'KAGGLELINK_TOKEN' and 'KAGGLELINK_KEYS_URL' (optional) to your secrets
os.environ['KAGGLELINK_TOKEN'] = user_secrets.get_secret("KAGGLELINK_TOKEN")

# You can also set the URL directly if it's public and not stored as a secret
os.environ['KAGGLELINK_KEYS_URL'] = "https://raw.githubusercontent.com/your/repo/main/key.pub"
```
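
Before running the setup script without flags, it may help to confirm that both variables are set. The check below is a minimal sketch that assumes both `KAGGLELINK_KEYS_URL` and `KAGGLELINK_TOKEN` are needed when no CLI flags are passed; the helper name `missing_vars` is made up for this example.

```python
import os

# Variable names taken from the table above.
REQUIRED_VARS = ("KAGGLELINK_KEYS_URL", "KAGGLELINK_TOKEN")


def missing_vars(env=None):
    """List required variable names that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]


missing = missing_vars()
if missing:
    print("Not set:", ", ".join(missing))
else:
    print("Both variables are set.")
```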

Connect to your Kaggle instance via SSH:
Once the environment variables are set, you can run the setup script without arguments:

```bash
!curl -sS https://bhdai.github.io/setup | bash
```

## Usage

After completing the Kaggle setup, your Kaggle instance is ready for connection. The script will output a Zrok private token at the end which you'll use to connect from your local machine.

### Client Setup (on your Local Machine)

1. **Install Zrok locally**: Follow the [official Zrok installation guide](https://docs.zrok.io/docs/guides/install/).
For Arch-based distributions, you can use:

```bash
yay -S zrok-bin
```

2. **Enable Zrok**: Enable Zrok on your local machine using your personal Zrok token:

```bash
zrok enable <your_personal_zrok_token>
```

3. **Access the private tunnel**: Use the Zrok `private_token` obtained from the Kaggle setup output to establish the connection:

```bash
zrok access private <the_private_token_from_kaggle_setup>
```

This command will open a dashboard in your terminal, displaying your connection details, including a local address like `127.0.0.1:9191`.

### SSH Connection

Connect to your Kaggle instance via SSH using the local address and port provided by Zrok (e.g., `127.0.0.1:9191`).

```bash
ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/kaggle_rsa -p 9191 root@127.0.0.1
```
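
If the local address shown by Zrok differs from `127.0.0.1:9191`, the same command can be assembled from it. The Python sketch below splits a `host:port` string and builds the argument list used above; the function name `ssh_command` and its defaults are illustrative assumptions, not part of KaggleLink.

```python
import os
import shlex


def ssh_command(local_address: str, key_path: str = "~/.ssh/kaggle_rsa", user: str = "root") -> list:
    """Build the ssh argument list for a 'host:port' address reported by Zrok."""
    host, _, port = local_address.rpartition(":")
    if not host or not port.isdigit():
        raise ValueError(f"expected host:port, got {local_address!r}")
    return [
        "ssh",
        "-o", "UserKnownHostsFile=/dev/null",
        "-o", "StrictHostKeyChecking=no",
        # Expand '~' here because no shell will do it when the list is passed to subprocess.
        "-i", os.path.expanduser(key_path),
        "-p", port,
        f"{user}@{host}",
    ]


# Print a copy-pasteable command line for the default address.
print(shlex.join(ssh_command("127.0.0.1:9191")))
```

The list form can be passed directly to `subprocess.run`, which avoids shell quoting issues.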

Note: The port (e.g., 9191) generally remains consistent across sessions, so no need to adjust it for each new instance.
> [!NOTE]
> The port (e.g., 9191) generally remains consistent across sessions, so you typically won't need to adjust it for each new instance.

### SSH Configuration
#### SSH Configuration

To simplify future connections, add this configuration to your `~/.ssh/config` file:
To simplify future connections, add the following configuration to your `~/.ssh/config` file:

```
Host Kaggle
@@ -104,21 +150,36 @@ Host Kaggle

With this configuration, you can simply use `ssh Kaggle` to connect.

## File Transfer with Rsync
### File Transfer with Rsync

Transfer files between your local machine and Kaggle instance:
Transfer files between your local machine and Kaggle instance using `rsync`:

```bash
# From local to remote
rsync -e "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/kaggle_rsa -p 9191" <path_to_local_file> root@127.0.0.1:/kaggle/working
rsync -e "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/kaggle_rsa -p 9191" <path_to_local_file> root@127.0.0.1:<remote_destination_path>
# or if you have your SSH config set up (see above)
rsync -avz <path_to_local_file> Kaggle:<remote_destination_path>

# From remote to local
rsync -e "ssh -o UserKnownHostsFile=/dev/null -o StrictHostKeyChecking=no -i ~/.ssh/kaggle_rsa -p 9191" root@127.0.0.1:<path_to_remote_file> <local_destination_path>
# or if you have your SSH config set up (see above)
rsync -avz Kaggle:<path_to_remote_file> <local_destination_path>
```

> [!NOTE]
>
> If you're using the Starter plan, note that it only offers 2 environment connections: one for your local machine and one for the Kaggle instance. Although the script automatically releases the Kaggle instance when you turn off Kaggle, it's best to check [https://api-v1.zrok.io/](https://api-v1.zrok.io/) to make sure your local machine is connected and there are no other active connections before running the script again.
> [!IMPORTANT]
> The Zrok Starter plan limits you to two environment connections. While the script automatically releases the Kaggle instance's connection upon shutdown, it's good practice to verify your active connections at [https://api-v1.zrok.io/](https://api-v1.zrok.io/) before rerunning the script, ensuring your local machine is the primary active connection.

## Contributing

We welcome contributions to KaggleLink! If you're interested in improving this project, please follow these steps:

1. **Fork the repository**.
2. **Create a new branch** for your feature or bug fix (`git checkout -b feature/your-feature-name` or `bugfix/issue-description`).
3. **Make your changes**, adhering to the existing coding style and standards.
4. **Write and run tests** to ensure your changes work as expected and don't introduce regressions.
5. **Commit your changes** with clear and concise commit messages.
6. **Push your branch** to your forked repository.
7. **Open a Pull Request** to the main branch, providing a detailed description of your changes.

## License
