Overview
  • 18 Jul 2024

Integrations are secure connections between Dataloop and external data providers, including cloud providers such as AWS, GCP, and Azure. Each provider supports several connection methods, so you can choose the one that best fits your security requirements. An integration handles authentication, authorization, and the secure storage of secrets, so your data stays accessible to the platform while remaining protected.

Create an Integration Using the Data Management Page

You can create integrations in Dataloop from the Data Management and Data Governance pages. To create an integration, navigate to the Data page, click Create Dataset, and select the Integrations section.


Integration Validation for all Supported Cloud Integrations

  • All supported integrations are now validated to confirm actual resource creation on the customer's side.
  • Validation applies only to new integrations. Existing storage integrations are not affected.
  • Supported on:
    • GCP: Cross Project, Private Key
    • Azure: Client Secret
    • AWS: Cross Account, STS

Duplicate Naming Enforcement for New Integrations & Secrets

  • Newly created integrations and secrets must now have unique names.
  • Enforcement applies only to new integrations and secrets. Existing integrations and secrets are not affected.

Good to know

AWS

Why Cross Account Integration is preferable to other integrations

  • Least privilege principle: AWS Cross-Account integration lets you grant fine-grained IAM roles and permissions to users or service accounts across different AWS accounts. You can restrict access to specific actions and resources, which reduces the attack surface if the third-party service is compromised.

  • Rotational capability: With AWS Cross-Account, you can easily rotate IAM roles, periodically refreshing the credentials used by a third-party service. This is more secure than using long-lived access keys, which persist until you revoke them.

  • Better auditability: AWS Cross-Account integration produces comprehensive logs of all actions taken by the third-party service, facilitating more effective monitoring and detection of any suspicious activity. This gives you better visibility and control over third-party access to your AWS resources.

  • Separation of responsibilities: The Cross-Account feature in AWS allows you to maintain control over your AWS accounts and delegate specific tasks to third-party services without exposing your credentials. This ensures that third-party services only have access to the resources they require for their tasks, thus minimizing the risk of credential theft or misuse.

In short, AWS Cross-Account integration offers a more secure and flexible mechanism than Access Key or STS integrations for granting third-party services access to AWS. It lets you apply the principle of least privilege, rotate credentials easily, and gain better auditability and separation of responsibilities.

To learn how to do this, see the AWS Cross Account Integration article.
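
As a rough illustration of the least-privilege and separation-of-responsibilities points above, the sketch below builds the two JSON documents that a cross-account IAM role typically carries: a trust policy that lets one external AWS account assume the role, and a permissions policy scoped to a single bucket. The function names, account ID, external ID, and bucket name are hypothetical placeholders, and the list of S3 actions is an assumption. Take the real values and required permissions from the AWS Cross Account Integration article.

```python
import json


def build_trust_policy(trusted_account_id: str, external_id: str) -> dict:
    """Trust policy allowing one external AWS account to assume the role.

    Both arguments are placeholders for this sketch; the real values
    come from the integration setup instructions.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"AWS": f"arn:aws:iam::{trusted_account_id}:root"},
                "Action": "sts:AssumeRole",
                # The external ID condition guards against another party
                # reusing the same role ARN (the "confused deputy" problem).
                "Condition": {"StringEquals": {"sts:ExternalId": external_id}},
            }
        ],
    }


def build_bucket_permissions(bucket_name: str) -> dict:
    """Permissions policy limited to one bucket.

    The action list is an assumption made for illustration only.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket_name}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
        ],
    }


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    print(json.dumps(build_trust_policy("111122223333", "example-external-id"), indent=2))
    print(json.dumps(build_bucket_permissions("example-bucket"), indent=2))
```

Scoping the permissions policy to a single bucket ARN, rather than to all of S3, is what keeps the third party's access limited to the resources it needs.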


GCP

Why Cross Project Integration is preferable to other integrations

  • Least privilege principle: Cross-Project access in GCP allows you to grant fine-grained IAM roles and permissions to users or service accounts across different GCP projects. This means you can restrict access to specific actions and resources, reducing the attack surface in case the third-party service is compromised.

  • Rotational capability: Cross-Project access in GCP enables you to easily rotate IAM roles, allowing you to periodically refresh the credentials used by a third-party service. This is a more secure approach than using long-lived access keys that persist until you revoke them.

  • Better auditability: Cross-Project access in GCP generates detailed logs of all actions taken by the third-party service, making it easier to monitor and detect suspicious activity. This provides better visibility and control over third-party access to your GCP resources.

  • Separation of responsibilities: The Cross-Project access in GCP enables you to maintain control over your GCP projects and delegate specific tasks to third-party services without exposing your credentials. This ensures that third-party services only have access to the resources they need to perform their tasks and reduces the risk of credential theft or misuse.

In short, GCP Cross-Project integration offers a more secure and flexible mechanism than Private Key integration for granting third-party services access to GCP. It lets you apply the principle of least privilege, rotate credentials easily, and gain better auditability and separation of responsibilities.

To learn how to do this, see the GCP Cross Project Integration article.
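
To make the least-privilege point concrete, the sketch below shows how a single role binding for a service account from another project could be added to a GCP IAM policy document, which is a JSON object with a "bindings" list of roles and members. The helper name, the service account email, and the choice of role are hypothetical. Take the actual member and role from the GCP Cross Project Integration article.

```python
import copy


def add_binding(policy: dict, role: str, member: str) -> dict:
    """Return a copy of an IAM policy in which `member` holds `role`.

    The input policy is left unmodified. Conditional bindings are
    skipped so that the member is never added under someone else's
    condition.
    """
    updated = copy.deepcopy(policy)
    bindings = updated.setdefault("bindings", [])
    for binding in bindings:
        if binding.get("role") == role and "condition" not in binding:
            members = binding.setdefault("members", [])
            if member not in members:
                members.append(member)
            return updated
    # No unconditional binding for this role yet, so create one.
    bindings.append({"role": role, "members": [member]})
    return updated


if __name__ == "__main__":
    # Placeholder policy and service account, not real identifiers.
    current = {
        "bindings": [
            {"role": "roles/storage.objectViewer", "members": ["user:owner@example.com"]}
        ]
    }
    member = "serviceAccount:integration-sa@example-project.iam.gserviceaccount.com"
    print(add_binding(current, "roles/storage.objectAdmin", member))
```

Granting the role on one bucket's policy, rather than at the project level, limits the external service account to the storage it actually needs.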


Important considerations when setting up external cloud storage

  • Consider storing your files in a region close to your annotators, for faster file serving. In annotation work, files are streamed from your storage directly to the end user, without having to go through Dataloop servers first, so faster serving can be key for efficient work.
  • Write access is required so that the platform can save thumbnails, modalities, and converted files to a hidden 'dataloop' folder on your storage. When the platform validates permissions, it writes a permission "test-file" to your storage.
  • Annotations and metadata are stored in the Dataloop platform. If you delete a file from your external storage, you need to trigger a file delete process in Dataloop, or set up Upstream-sync in advance to ensure these events are covered.