This repository has been archived by the owner on Oct 30, 2023. It is now read-only.

This repo will demonstrate how to connect a GCS bucket with a Labelbox Dataset using Google Cloud Functions

Labelbox/labelbox-gcs-stream

Data Streaming with GCS using Google Cloud Functions

This repo provides three deployable Cloud Functions:

  • stream_data_rows - Creates a Labelbox data row every time a new asset is uploaded to the GCS bucket
  • update_metadata - Updates Labelbox metadata whenever the metadata of a GCS asset is updated
  • delete_data_rows - Deletes a data row in Labelbox if the corresponding asset is deleted in GCS
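
As a rough illustration of the first function: a Cloud Storage-triggered background function in the Python runtime receives an event dict and a context object. The sketch below only shows how the bucket and object name from that event might be turned into a data row payload. The helper name build_data_row, the gs:// URL form, and the row_data / global_key keys are assumptions made for illustration, not taken from this repo's main.py, and the Labelbox SDK call itself is left out.

```python
def build_data_row(event):
    """Turn a GCS object event into a dict describing one data row (hypothetical helper)."""
    bucket = event["bucket"]  # bucket that fired the event
    name = event["name"]      # object path inside the bucket
    return {
        "row_data": f"gs://{bucket}/{name}",  # assumed URL form
        "global_key": name,                   # assumed unique key
    }


def stream_data_rows(event, context):
    """Entry point sketch: runs once per newly uploaded object."""
    data_row = build_data_row(event)
    # A real function would hand data_row to the Labelbox SDK here.
    print(f"Would create data row: {data_row}")
    return data_row
```

Keeping the payload construction in a separate helper makes it possible to check locally with a hand-written event dict before deploying.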

Prerequisites:

  1. Set up Delegated Access to Labelbox and save the integration name for deployment
  2. Google Cloud account with permissions to create Cloud Functions and read bucket events
  3. Labelbox API key, needed to create/delete data rows and update data row metadata values

How to Create / Deploy Google Cloud Functions from your Command Line

Setup

Use gcloud auth login to authenticate with Google Cloud. Expect a browser pop-up from Google prompting you to sign in:

gcloud auth login

[Recommended] Use gcloud config to set the Google Cloud project and region for your Cloud Functions and bucket (example region: "us-east1")

  • The bucket's region can be found on the Configuration tab in the GCS bucket UI
gcloud config set project PROJECT_NAME
gcloud config set functions/region "us-east1"

Clone this repo and define your CLI environment variables

git clone https://github.com/Labelbox/labelbox-gcs-stream.git
cd labelbox-gcs-stream
GCS_BUCKET_NAME=my_bucket_name
LABELBOX_API_KEY=my_api_key
LABELBOX_INTEGRATION_NAME=my_integration_name

Create Cloud Functions from GitHub using gcloud functions deploy

  • If no function with the name provided exists, gcloud functions deploy creates a Google Cloud function
  • If a function with the name provided exists, gcloud functions deploy updates the existing Google Cloud function
  • You can deploy Cloud Functions from GitHub by giving your root directory a main.py file and a requirements.txt file, as in this repo. When one repo holds several Cloud Functions, use the --entry-point parameter of gcloud functions deploy to specify which Python function each Cloud Function runs

Deploy stream_data_rows:

gcloud functions deploy stream_data_rows_function --entry-point stream_data_rows --runtime python37 --trigger-bucket=$GCS_BUCKET_NAME --timeout=540 --set-env-vars=labelbox_api_key=$LABELBOX_API_KEY,labelbox_integration_name=$LABELBOX_INTEGRATION_NAME
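
The values passed with --set-env-vars become ordinary environment variables inside the deployed function. A minimal sketch of how a function might read them is shown below; the helper name load_labelbox_settings and its error handling are assumptions for illustration and may differ from what this repo's main.py does.

```python
import os


def load_labelbox_settings(environ=os.environ):
    """Read the values passed via --set-env-vars (hypothetical helper)."""
    api_key = environ.get("labelbox_api_key")
    if not api_key:
        # Fail early with a clear message instead of a later auth error
        raise RuntimeError("labelbox_api_key is not set")
    # delete_data_rows is deployed without the integration name, so it may be None
    integration_name = environ.get("labelbox_integration_name")
    return api_key, integration_name
```

Note that the variable names are lowercase, matching the keys given to --set-env-vars rather than the uppercase shell variables defined earlier.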

Deploy update_metadata:

gcloud functions deploy update_metadata_function --entry-point update_metadata --runtime python37 --trigger-resource=$GCS_BUCKET_NAME --trigger-event="google.storage.object.metadataUpdate" --timeout=540 --set-env-vars=labelbox_api_key=$LABELBOX_API_KEY,labelbox_integration_name=$LABELBOX_INTEGRATION_NAME
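
For a metadataUpdate event, the event dict describes the updated object, and any custom key/value pairs set on it appear under its metadata field (which is absent when the object has none). The sketch below shows one way the function might pull those values out; the helper name extract_metadata and the shape of the returned dict are assumptions for illustration, not this repo's actual implementation.

```python
def extract_metadata(event):
    """Collect custom GCS metadata from an object event (hypothetical helper)."""
    # "metadata" is missing when the object carries no custom metadata
    custom = event.get("metadata") or {}
    return {
        "global_key": event["name"],  # assumed key linking back to the data row
        "metadata": dict(custom),     # copy so the event itself is not mutated
    }
```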

Deploy delete_data_rows:

gcloud functions deploy delete_data_rows_function --entry-point delete_data_rows --runtime python37 --trigger-resource=$GCS_BUCKET_NAME --trigger-event="google.storage.object.delete" --timeout=540 --set-env-vars=labelbox_api_key=$LABELBOX_API_KEY
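
The three deploy commands differ only in function name, entry point, trigger flags, and environment variables. For scripting the deployment, a sketch like the following could build them as argument lists; the helper names deploy_command and all_deploy_commands are hypothetical, and only flags already used in the commands above appear.

```python
def deploy_command(entry_point, trigger_flags, env_vars):
    """Build the argv list for one `gcloud functions deploy` call."""
    env = ",".join(f"{key}={value}" for key, value in env_vars.items())
    return [
        "gcloud", "functions", "deploy", f"{entry_point}_function",
        "--entry-point", entry_point,
        "--runtime", "python37",
        *trigger_flags,
        "--timeout=540",
        f"--set-env-vars={env}",
    ]


def all_deploy_commands(bucket, api_key, integration_name):
    """Return the three deploy commands from this README as argv lists."""
    both = {"labelbox_api_key": api_key,
            "labelbox_integration_name": integration_name}
    key_only = {"labelbox_api_key": api_key}
    return [
        deploy_command("stream_data_rows", [f"--trigger-bucket={bucket}"], both),
        deploy_command("update_metadata",
                       [f"--trigger-resource={bucket}",
                        "--trigger-event=google.storage.object.metadataUpdate"],
                       both),
        deploy_command("delete_data_rows",
                       [f"--trigger-resource={bucket}",
                        "--trigger-event=google.storage.object.delete"],
                       key_only),
    ]
```

Each returned list could then be passed to subprocess.run(cmd, check=True); building lists rather than a single string avoids shell quoting issues with the API key.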
