google.colab.auth is unsupported in this environment for custom GCE VM runtime #2533

Open
wcwong opened this issue Dec 31, 2021 · 41 comments

@wcwong

wcwong commented Dec 31, 2021

After deploying a custom GCE VM runtime as per the instructions at https://research.google.com/colaboratory/marketplace.html and connecting, when trying to use the following code

from google.colab import auth
auth.authenticate_user()

I get the following error

NotImplementedError                       Traceback (most recent call last)
<ipython-input-1-1f759c1655bd> in <module>()
      1 from google.colab import auth
----> 2 auth.authenticate_user()

/usr/local/lib/python3.7/dist-packages/google/colab/auth.py in authenticate_user(clear_output)
    144   """
    145   if _os.path.exists('/var/colab/mp'):
--> 146     raise NotImplementedError(__name__ + ' is unsupported in this environment.')
    147   if _check_adc():
    148     return

NotImplementedError: google.colab.auth is unsupported in this environment.

My expectation was that the GCE VM deployed from the marketplace would have the same software environment as the standard runtime, while also letting me specify the compute/memory/GPU resources that are available to my GCP project. As such, I was not expecting to need to make code changes to the notebook for it to work on the marketplace GCE VM.
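
For notebooks that have to run on both kinds of runtime, one option is to test for the marker path shown in the traceback before calling auth.authenticate_user(). The following is only a minimal sketch: the path /var/colab/mp is taken from the traceback, reading it as a "marketplace VM" marker is an assumption, and the helper names are hypothetical.

```python
import os

# Path copied from the traceback above. Treating it as a "marketplace VM"
# marker is an assumption based on that single line of auth.py.
MARKETPLACE_MARKER = "/var/colab/mp"


def auth_supported(marker=MARKETPLACE_MARKER):
    """Return True when the marker path is absent."""
    return not os.path.exists(marker)


def authenticate_if_supported():
    """Call auth.authenticate_user() only where the marker is absent."""
    if not auth_supported():
        print("google.colab.auth is unsupported here; skipping authentication.")
        return False
    # Imported lazily so this helper can also be defined outside Colab.
    from google.colab import auth
    auth.authenticate_user()
    return True
```

This only avoids the exception; it does not provide credentials on the custom VM.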

@wcwong wcwong added the bug label Dec 31, 2021
@cperry-goog

This is on our radar; apologies for the friction. We don't support auth.authenticate_user() today for a few reasons, and we're tracking a fix at b/207007587.

@metehanpinarli

How do I connect to Colab with a private GCE server?

@flyosity

@cperry-goog Any updates on this? I launched an A100 instance with Google Colab VM specifically to use my Colab Notebook I was using on the Colab Pro + account I was paying for, but on beefier hardware, but can't connect to Drive so it's useless.

@cmtg

cmtg commented Jan 19, 2022

Is there a workaround? It would be really handy if results from a Colab Notebook could be saved to Drive.

@blois
Contributor

blois commented Jan 21, 2022

Keep in mind that a custom GCE VM will be accessible to all users who have access to VMs within that project. Because of this, you need to be careful about putting credentials on the VM: they will be accessible to everyone with access to that VM.

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

An alternative is to use something such as https://github.com/astrada/google-drive-ocamlfuse

@wcwong
Author

wcwong commented Jan 21, 2022

@blois - I'm a little surprised that anyone who has access to the project has access to notebook data by default. My assumption was that the environment ran in its own container with each different user connecting being given their own containers and their own container local storage. Isn't that how it works in the hosted environment?

I guess that's mostly the crux of my confusion. If there are sufficient environmental protections in place for the hosted environment, why isn't a project security boundary considered equivalent? How is this different than any other project level security boundaries in GCP?

Specifically, doesn't the workaround you describe also put the credentials on the VM? And with FUSE can't anyone in the project, by default, ssh to the VM, then sudo su to the user and have access to the FUSE drive? So this doesn't materially change the security posture?

@alexandrnikitin

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

It doesn't necessarily have to be a Gmail login. It could be a service account (the one the VM has access to). Why not support that for a better UX?

@thanawatn

thanawatn commented Jan 27, 2022

This is on our radar, apologies for the friction. We don't support auth.authenticate_user() today for a few reasons, we're tracking a fix at b/207007587

You could add a policy that requires users to accept an agreement, and then allow google.colab.auth for those who use a custom GCE VM runtime. It would also be nice if it were available in Google Cloud Functions.

@EV3RETH

EV3RETH commented Feb 8, 2022

Has this been resolved? Colab Pro + only ever gives me P100s, so I upgraded to an A100 with GCE VMs, but now I can't access any of my Google Drive files.

@reinisindans

I was using the ocamlfuse solution to access my Drive, but that has just stopped working too. I have to look for an alternative solution, again.
I hope this issue gets addressed; using Drive for data storage was quite convenient for smaller personal and research projects.

@jmilagroso

how do i connect to colab with private GCE server.

https://research.google.com/colaboratory/marketplace.html

@RayH1975

@cperry-goog Any updates on this? I launched an A100 instance with Google Colab VM specifically to use my Colab Notebook I was using on the Colab Pro + account I was paying for, but on beefier hardware, but can't connect to Drive so it's useless.

@itsuzef

itsuzef commented Jun 6, 2022

Any updates on this? Trying to connect a custom GCE VM, but it is an unsupported environment

@blois
Contributor

blois commented Jun 6, 2022

#2533 (comment) is still the current status.

@nicoleitte

If we're connecting to the custom GCE VM through a locally hosted runtime (via port forwarding), there's no way to install ocamlfuse, since terminal functionality is disabled.

@djgish485

What's the point of using Colab if we can't use beefier hardware? Any recommendations for alternative services?

@xmalina-aibuild

It's September. This still hasn't been resolved? Very disappointed... we just upgraded for the same reasons and got caught by this bug.

@seb-tc

seb-tc commented Sep 22, 2022

Hey everyone, I'm just as confused and annoyed at the lack of Google Drive integration with GCE. I hope we find a fix soon.

  • I think I know why ocamlfuse is failing now. I haven't found a fix yet, but as soon as I figure out alternative credentials for ocamlfuse, I'll post my results here.

  • Out-Of-Band Error :
    -- https://developers.google.com/identity/protocols/oauth2/resources/oob-migration

  • Using OAuth 2.0 to Access Google APIs: "On February 16 2022, we announced plans to make Google OAuth interactions safer by using more secure OAuth flows. This guide helps you to understand the necessary changes and steps to successfully migrate from the OAuth out-of-band (OOB) flow to supported alternatives. This effort is a protective measure against phishing and app impersonation attacks during interactions with Google's OAuth 2.0 authorization endpoints."

@alexandrnikitin

Keep in mind that a custom GCE VM will be accessible to all users who have access to VMs within that project. Because of this you need to be careful about putting credentials on the VM- they will be accessible to everyone with access to that VM.

Because the Colab service cannot guarantee the VM is only accessible to a single user we are not allowed to provide credentials to it.

I mentioned it already in the thread and will do it again. If the only concern is that Google Drive creds/tokens will be accessible to everyone who has access to that VM then we can use a service account.

  1. The VM gets a service account associated with it
  2. One goes to Google Drive and explicitly shares "Colab Notebooks" and any other folder with the service account
  3. google.colab.auth knows to authenticate and access Drive using the service account
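
The first two steps can be approximated by hand today, without any change to Colab. The following is a minimal sketch, not what google.colab.auth does. It assumes the google-auth and google-api-python-client packages are installed, that the VM's service account is allowed to use the Drive API (on GCE this also depends on the VM's access scopes), and that a folder has been shared with that service account. The function names and the folder ID argument are hypothetical.

```python
DRIVE_SCOPE = "https://www.googleapis.com/auth/drive.readonly"


def folder_query(folder_id):
    """Build a Drive v3 search query for non-trashed children of a folder."""
    return f"'{folder_id}' in parents and trashed = false"


def list_shared_folder(folder_id, page_size=20):
    """List files in a folder that was shared with the VM's service account."""
    # Third-party imports are local so the helper above stays importable
    # even where these packages are not installed.
    import google.auth
    from googleapiclient.discovery import build

    # Application Default Credentials; on a GCE VM these normally come from
    # the attached service account via the metadata server.
    credentials, _project = google.auth.default(scopes=[DRIVE_SCOPE])
    service = build("drive", "v3", credentials=credentials)
    response = service.files().list(
        q=folder_query(folder_id),
        pageSize=page_size,
        fields="files(id, name, mimeType)",
    ).execute()
    return response.get("files", [])
```

Calling list_shared_folder with the ID of a shared folder would return a list of dictionaries with id, name and mimeType keys if the assumptions above hold.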

@DasDominus

DasDominus commented Oct 30, 2022

It would be nice if there were a bypass/opt-out.
Not everyone cares about data privacy that much. Our lab, for example, keeps all of its data in a shared space (within our lab, of course). Essentially, anyone who has access to the VM and the Google account should also have access to the data/Drive.

I mean, google-drive-ocamlfuse works, but I'd expect this to work out of the box.

At least show a prompt when trying to mount, etc.

@tomasyany

Any update on this issue?

@doshik

doshik commented Dec 2, 2022

Scrolled through this thread hoping for a solution and was met with disappointment..

@ruidi-huang

Disappointment in 2023...

@alexandrnikitin

@cperry-goog @blois Any updates on the issue? What is the status of b/207007587?

@corngk

corngk commented Mar 18, 2023

After a year and two months of waiting, any update on this issue?

@iamjakob

iamjakob commented Mar 25, 2023

Any updates? Paid for custom GCE VM and immediately regretted.

@caioflexa

Any updates?

@mauricio-repetto

I came here because I'm facing the same issue... unbelievable that there are no updates on this yet.

@kurshakuz

@cperry-goog are there any updates on that issue?

@pjspol

pjspol commented Apr 23, 2023

Same issue. Hoping for an update!

@chriscast88

Hi all! I wanted to share the solution that has been working for me since it seems that this has been an ongoing issue for a lot of people.

I've been using google-drive-ocamlfuse to mount my gDrive on a custom GCE VM. The process is a bit involved and not the most elegant, but it works.

First you'll need to create a new project and OAuth credentials via the API Console. The key here is that we'll need to set it up for Headless Usage since Google Colab doesn't have a web browser.

Follow the Headless Usage steps in ocamlfuse's documentation. This should give you API access to your Drive, with a client ID and secret key.

Once you have your client ID and secret key set up, you can install ocamlfuse with the following commands

!sudo add-apt-repository -y ppa:alessandro-strada/ppa
!sudo apt-get update
!sudo apt-get install -y google-drive-ocamlfuse

and then you should be able to now mount your drive with this

!google-drive-ocamlfuse -headless -label me -id ##yourClientID##.apps.googleusercontent.com -secret ###yoursecret##### 

which should then show you something similar to this

   Please, open the following URL in a web browser: https://accounts.google.com/o/oauth2/auth?client_id=##yourClientID##.apps.googleusercontent.com&redirect_uri=urn%3Aietf%3Awg%3Aoauth%3A2.0%3Aoob&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force

Opening that URL takes you to a credentials page, where you can copy the verification code and paste it at this prompt

   Please enter the verification code: 

And that's basically it! You should be able to mount your drive with the code below

!mkdir -p /content/drive/MyDrive
!google-drive-ocamlfuse /content/drive/MyDrive
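
The mount command can fail without raising anything in the notebook, so it may be worth confirming the result from Python. A minimal sketch, assuming the mount point used in the commands above; the helper name is hypothetical.

```python
import os

MOUNT_POINT = "/content/drive/MyDrive"  # path used in the commands above


def drive_is_mounted(path=MOUNT_POINT):
    """Return True if `path` is currently a mount point."""
    return os.path.ismount(path)


if drive_is_mounted():
    print(f"Drive is mounted at {MOUNT_POINT}")
else:
    print(f"Nothing is mounted at {MOUNT_POINT}; check the ocamlfuse output")
```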

The only drawback is that this is cumbersome when I have a single notebook that I want to run on either a hosted runtime or a GCE VM. So I wrote the code below, which determines whether it's running on a GCE VM, installs ocamlfuse if needed, and mounts the drive either the old-fashioned way or with ocamlfuse. I pretty much have this code block in all of my notebooks now. Hope this helps! Just make sure to replace the client ID and secret key with your own.

# Mount Google Drive
import re
import shutil

# Read the kernel version string (the IPython shell capture returns a list of lines)
version = !cat /proc/version

if re.search("gce", version[0]):
    print("Session is connected to a custom GCE VM, running ocamlfuse")
    # ocamlfuse is an apt package, not a pip package, so look for the binary on PATH
    if shutil.which("google-drive-ocamlfuse"):
        print("ocamlfuse is already installed, mounting...")
    else:
        # If not installed, install it from the PPA without interactive prompts
        print("ocamlfuse is not installed, installing...")
        !sudo add-apt-repository -y ppa:alessandro-strada/ppa
        !sudo apt-get update
        !sudo apt-get install -y google-drive-ocamlfuse
    # Is anything already mounted? Let's jiggle the handle
    !umount /content/drive/MyDrive
    !rm -rf ~/.gdfuse/default
    !rm -rf /content/drive/MyDrive
    !mkdir -p /content/drive/MyDrive

    # Authorize in headless mode, then mount with ocamlfuse
    !google-drive-ocamlfuse -headless -id REPLACE_CLIENT_ID_HERE.apps.googleusercontent.com -secret REPLACE_SECRET_KEY_HERE
    !google-drive-ocamlfuse /content/drive/MyDrive

else:
    print("Session is connected to a hosted runtime, running Google Auth")
    from google.colab import drive
    drive.mount('/content/drive')
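
The runtime check in that block can also be written as plain Python, which makes it easier to reuse and test outside a notebook. A minimal sketch: the "gce" substring heuristic is taken from the post, whether that string appears depends on the VM image's kernel, and the function names are hypothetical.

```python
import re


def is_gce_kernel(version_text):
    """Return True if the kernel version string contains 'gce'."""
    return re.search("gce", version_text) is not None


def read_kernel_version(path="/proc/version"):
    """Read the kernel version string without shelling out to `cat`."""
    with open(path, encoding="utf-8") as handle:
        return handle.read()
```

In a notebook, is_gce_kernel(read_kernel_version()) could replace the `!cat /proc/version` capture and the re.search call.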

@Great-Bucket

Hi chriscast88,

Thanks for posting this. After following your guide, I ran this code:
!mkdir -p /content/drive/MyDrive
!google-drive-ocamlfuse /content/drive/MyDrive

But got this error:
/usr/bin/xdg-open: 869: www-browser: not found
/usr/bin/xdg-open: 869: links2: not found
/usr/bin/xdg-open: 869: elinks: not found
/usr/bin/xdg-open: 869: links: not found
/usr/bin/xdg-open: 869: lynx: not found
/usr/bin/xdg-open: 869: w3m: not found
xdg-open: no method available for opening 'https://accounts.google.com/o/oauth2/auth?client_id=XXXXXXXXXXXX.apps.googleusercontent.com&redirect_uri=httpsXXXXXXFgd-ocaml-auth.appspot.com%2Foauth2callback&scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force&state=XXXXXXXXXXXXXXXXXXXXX'
/bin/sh: 1: firefox: not found
/bin/sh: 1: google-chrome: not found
/bin/sh: 1: chromium-browser: not found
/bin/sh: 1: open: not found
Cannot retrieve auth tokens.
Failure("Error opening URL:https://accounts.google.com/o/oauth2/auth?client_id=XXXXXXXXXXXX.apps.googleusercontent.com&redirect_uri=httpsXXXXXXXXFgd-ocaml-auth.appspot.com%2Foauth2callback&scope=httpsXXXXXXXXXXXwww.googleapis.com%2Fauth%2Fdrive&response_type=code&access_type=offline&approval_prompt=force&state=9XXXXXXXXXXXXXXXXXXX"

If anyone has any advice, I'd appreciate it.

Thanks!

@divyanshsinghvi

!google-drive-ocamlfuse -headless -id REPLACE_CLIENT_ID_HERE.apps.googleusercontent.com -secret REPLACE_SECRET_KEY_HERE
!google-drive-ocamlfuse /content/drive/MyDrive

Try this?

@SishaarRao

SishaarRao commented Nov 17, 2023

For my use case, using a Google Storage Bucket as the backing datastore was an equivalent option to Google Drive. It's very straightforward to connect to a bucket with the following code (utilizing gcsfuse)

### MOUNT GOOGLE STORAGE BUCKET
from google.colab import auth
auth.authenticate_user()

# Register the gcsfuse apt repository and its signing key, then install gcsfuse
!echo "deb https://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

# Mount the bucket "audio-model-data"; the path is relative, so this assumes the working directory is /content
!mkdir -p mounted-bucket
!gcsfuse --implicit-dirs audio-model-data mounted-bucket
BASE_PATH = "/content/mounted-bucket"

@drkousek

Still no fix?
Thanks @chriscast88 for the solution; it worked flawlessly, even for shared drives, with a little bit of tweaking.

@deanp70

deanp70 commented Mar 17, 2024

Posting with permission from @cperry-goog - we're collaborating with the Colab team to provide DagsHub Storage as an alternative to GDrive that is more scalable and built for use with large datasets. It's an S3-compatible bucket that has much simpler access controls, and can be mounted easily.

It might help avoid the issues above - here's a link to an example notebook to try it out

We're looking for community feedback, so I'd love to get your input if it helps with the issue at hand.

(If you're curious, DagsHub is a platform for ML teams which is why we think Colab should have a storage solution suitable for ML workloads)

@nonlin

nonlin commented Jun 14, 2024

For my use case, using a Google Storage Bucket as the backing datastore was an equivalent option to Google Drive. It's very straightforward to connect to a bucket with the following code (utilizing gcsfuse)

### MOUNT GOOGLE STORAGE BUCKET
from google.colab import auth
auth.authenticate_user()

!echo "deb https://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

!mkdir -p mounted-bucket
!gcsfuse --implicit-dirs audio-model-data mounted-bucket
BASE_PATH = "/content/mounted-bucket"

Umm, doesn't this run into the same issue, namely that "google.colab is unsupported in this environment"?
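
One way around that may be to drop the google.colab.auth call on the custom VM and let gcsfuse use the VM's own credentials. The following is an untested sketch: it assumes gcsfuse is already installed, that it picks up Application Default Credentials (for example the service account attached to the VM), and that this account can read the bucket. The function names are hypothetical; the bucket and mount point are the ones from the quoted snippet.

```python
import os
import subprocess


def gcsfuse_command(bucket, mount_point, implicit_dirs=True):
    """Build the gcsfuse invocation as an argument list."""
    command = ["gcsfuse"]
    if implicit_dirs:
        command.append("--implicit-dirs")
    command += [bucket, mount_point]
    return command


def mount_bucket(bucket, mount_point):
    """Mount `bucket` at `mount_point`, relying on the VM's own credentials."""
    os.makedirs(mount_point, exist_ok=True)
    # No google.colab.auth call here: gcsfuse is assumed to find
    # Application Default Credentials on the VM by itself.
    subprocess.run(gcsfuse_command(bucket, mount_point), check=True)
    return os.path.abspath(mount_point)
```

For example, BASE_PATH = mount_bucket("audio-model-data", "/content/mounted-bucket") would mirror the quoted snippet without the authentication step.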

@nishchay-veer

How can I change my Google Colab compute engine?

@luckandrew

luckandrew commented Oct 16, 2024

auth.authenticate_user()

Still a problem after 2 years... This took time and $

@Jahetthana

@pranavhh

@cperry-goog, Any updates on this?
