Google Artifacts Registry
  • 20 Jan 2025

Overview

Google Artifact Registry (GAR) offers a robust and scalable solution for Dataloop integrations by supporting multiple artifact formats, including Docker images. This allows users to manage not only containerized workloads but also custom plugins, scripts, and machine learning models securely and efficiently.

  • GAR's fine-grained IAM controls and VPC Service Controls enhance security, ensuring that data workflows in Dataloop remain compliant and secure.
  • As Google’s recommended and actively developed service, GAR provides a future-proof foundation for managing complex data and AI pipelines on the Dataloop platform.

To integrate Google Artifacts Registry (GAR) with the Dataloop platform, follow these steps:

  1. Create and Configure a GCP Google Artifacts Registry
  2. Integrate GAR with the Dataloop Platform

Create and Configure a GCP Google Artifacts Registry

Step 1: Enable the Google Artifacts Registry API

  1. Go to the GCP Console.
  2. Navigate to APIs & Services > Library.
  3. Search for Artifact Registry API.
  4. Click Enable.

Alternatively, use the command:

gcloud services enable artifactregistry.googleapis.com

Step 2: Set Up Permissions

Ensure your account has the necessary permissions:

  • Artifact Registry Admin (roles/artifactregistry.admin)
  • Storage Admin (roles/storage.admin) (optional, for managing storage)

Use this command to assign the role:

gcloud projects add-iam-policy-binding [PROJECT-ID] \
  --member="user:[USER-EMAIL]" \
  --role="roles/artifactregistry.admin"
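
To confirm that the binding took effect, the project's IAM policy can be queried for the roles held by a given user. The following is a minimal sketch: list_user_roles is a hypothetical helper name, and the project ID and email in the usage line are placeholders.

```shell
# Hypothetical helper: list the roles bound to a user in a project.
# $1 = project ID, $2 = user email
list_user_roles() {
  gcloud projects get-iam-policy "$1" \
    --flatten="bindings[].members" \
    --filter="bindings.members:user:$2" \
    --format="value(bindings.role)"
}
```

For example, `list_user_roles my-project alice@example.com` should include roles/artifactregistry.admin in its output once the role has been assigned.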

Step 3: Create a Repository

Artifact Registry organizes artifacts in repositories.

Command Format:

gcloud artifacts repositories create [REPO_NAME] \
  --repository-format=[FORMAT] \
  --location=[REGION] \
  --description="[DESCRIPTION]"

Example (Docker repository):

gcloud artifacts repositories create my-docker-repo \
  --repository-format=docker \
  --location=us-central1 \
  --description="Docker repository for container images"

  • --repository-format: The artifact format. Supported values include docker, maven, npm, and python.
  • --location: The repository region. Choose one close to your deployment (e.g., us-central1).
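
After creating the repository, it can be useful to check that it exists in the expected location and has the expected format. The sketch below wraps the describe command in a hypothetical helper named repo_format; the repository name and region in the usage line are the ones from the example above.

```shell
# Hypothetical helper: print the format of an existing repository.
# $1 = repository name, $2 = location
repo_format() {
  gcloud artifacts repositories describe "$1" \
    --location="$2" \
    --format="value(format)"
}
```

For example, `repo_format my-docker-repo us-central1` prints the repository's format, or an error if the repository does not exist in that location.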

Step 4: Configure Docker Authentication

To push and pull Docker images, configure authentication:

gcloud auth configure-docker [REGION]-docker.pkg.dev

Step 5: Push Docker Images to Artifact Registry

  1. Tag your Docker image:
docker tag [IMAGE_NAME] [REGION]-docker.pkg.dev/[PROJECT_ID]/[REPO_NAME]/[IMAGE_NAME]:[TAG]
  2. Push the image to Artifact Registry:
docker push [REGION]-docker.pkg.dev/[PROJECT_ID]/[REPO_NAME]/[IMAGE_NAME]:[TAG]
  3. Verify the image:
    1. Navigate to Artifact Registry in the GCP Console: https://console.cloud.google.com/artifacts.
    2. Open your repository and confirm that the image is listed.
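
Docker image names in Artifact Registry follow the pattern [REGION]-docker.pkg.dev/[PROJECT_ID]/[REPO_NAME]/[IMAGE_NAME]:[TAG]. As a minimal sketch, the two hypothetical helpers below compose that reference once and reuse it for tagging and pushing. The function names and the values in the usage line are illustrative assumptions, not part of the Dataloop or Google tooling.

```shell
# Compose a fully qualified Artifact Registry image reference.
# $1 = region, $2 = project ID, $3 = repository, $4 = image name, $5 = tag
gar_image_ref() {
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "$1" "$2" "$3" "$4" "$5"
}

# Tag a local image with that reference and push it.
# Assumes a local image named $4:$5 exists and Docker authentication
# has already been configured for the registry host.
gar_push() {
  ref="$(gar_image_ref "$1" "$2" "$3" "$4" "$5")"
  docker tag "$4:$5" "$ref" && docker push "$ref"
}
```

For example, `gar_push us-central1 my-project my-docker-repo my-app v1` tags the local image my-app:v1 and pushes it to the my-docker-repo repository.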

Integrate GAR with the Dataloop Platform

Step 1: Generate a Service Account Key

  1. Create a service account:
    1. Navigate to IAM & Admin > Service Accounts.
    2. Click Create Service Account.
    3. Provide a Name (e.g., dataloop-access) and click Create.
    4. Assign the following roles:
      1. Storage Object Viewer.
      2. Artifact Registry Reader (roles/artifactregistry.reader).
    5. Click Done.
  2. Generate a key for the service account:
    1. Find your service account in the list and click it to open its details.
    2. Select the Keys tab.
    3. Click Add Key > Create New Key.
    4. Select JSON and click Create.
    5. Save the JSON key file securely.
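
The console steps above can also be approximated from the command line. The sketch below is one possible equivalent, not the documented Dataloop procedure: create_registry_sa and sa_email are hypothetical helper names, and it grants only roles/artifactregistry.reader, Google's predefined read-only role for Artifact Registry, on the assumption that read access is all the integration needs. Adjust the roles to match your setup.

```shell
# Derive a service account email from its name and project ID.
# $1 = service account name, $2 = project ID
sa_email() {
  printf '%s@%s.iam.gserviceaccount.com\n' "$1" "$2"
}

# Create the service account, grant it read access to Artifact Registry,
# and write a JSON key to the given path.
# $1 = project ID, $2 = service account name, $3 = output key file
create_registry_sa() {
  email="$(sa_email "$2" "$1")"
  gcloud iam service-accounts create "$2" --project="$1" \
    --display-name="Dataloop registry access" &&
  gcloud projects add-iam-policy-binding "$1" \
    --member="serviceAccount:$email" \
    --role="roles/artifactregistry.reader" &&
  gcloud iam service-accounts keys create "$3" --iam-account="$email"
}
```

For example, `create_registry_sa my-project dataloop-access key.json` writes the key to key.json. Treat that file as a credential and store it securely.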

Step 2: Upload the Service Account Key to Dataloop

  1. Log in to the Dataloop platform.
  2. Navigate to the Integrations page.
  3. Click Create Integration > Create Registry Integration.
    1. Integration Name: Provide a name for the integration.
    2. Provider: Select GCP from the list.
      1. Service: Select Google Artifacts Registry from the list.
      2. Click Import JSON file and select the service account key file you saved in Step 1.
      3. Location: Enter the full location of your Google Artifact Registry (GAR), including the region (e.g., us-central1-docker.pkg.dev).
  4. Click Create Integration.

Step 3: Test the Configuration

  • In the Dataloop platform, try pulling or referencing a Docker image hosted in your GAR repository.
  • Confirm that the integration works without errors.
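
If the integration fails, it can help to rule out the key itself by testing it outside Dataloop. The sketch below assumes Docker is installed locally and uses the _json_key username that Artifact Registry accepts for service account key authentication. The helper names is_sa_key and gar_login_and_pull are hypothetical.

```shell
# Return success if the file looks like a service account key.
# $1 = path to the JSON key file
is_sa_key() {
  grep -q '"type": *"service_account"' "$1"
}

# Log in to the registry with the key, then pull an image.
# $1 = key file, $2 = registry host (e.g., us-central1-docker.pkg.dev),
# $3 = full image reference
gar_login_and_pull() {
  is_sa_key "$1" || { echo "not a service account key: $1" >&2; return 1; }
  docker login -u _json_key --password-stdin "https://$2" < "$1" &&
  docker pull "$3"
}
```

A successful pull here suggests the key and its roles are valid, which narrows any remaining problem to the integration settings, such as the Location value.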