Manage Storage Drivers
- Updated On 18 Dec 2024
Dataloop lets you create, edit, and delete storage drivers.
AWS S3 Storage Driver
If you're using the Amazon S3 cloud storage service, storage drivers establish the connection between your applications and the cloud storage infrastructure. A storage driver is an abstract representation of a bucket in the S3 service. You can use your existing buckets and folder paths to allow Dataloop to read and write data. For more information about Amazon S3, see Amazon S3.
This topic describes how to create an Amazon S3 storage driver on the Dataloop platform as part of your AWS integration.
You can create an Amazon S3 storage driver on the Dataloop platform if you complete one of the following Amazon integrations:
- Cross Account
- Access Key
- STS
For more information, see the Integration Overview.
Create an AWS S3 Storage Driver on the Dataloop Platform
- Log in to the Dataloop platform.
- Select Data from the left-side panel.
- Select the Storage Drivers tab.
- Click Create Storage Driver. The Data Management Resource Creation right-side panel is displayed.
- Storage Driver Name: Enter a Name for the storage driver.
- Provider: Select AWS from the list.
- Resource: Ensure that S3 Bucket is selected (it is the default).
- Integration: Select the relevant AWS Integration from the list.
- Bucket Name: Enter the S3 bucket name.
- Path (Optional): Enter the folder path, if required.
- Storage Class (Optional): Enter the storage class, if required.
- Region: Select the region where the S3 bucket is located from the list.
- Allow Delete Items (Optional): Select the checkbox, if required. When enabled, Dataloop removes items from the storage driver when those items are deleted from the Dataloop dataset.
- The reverse does not apply: deleting a file directly from the storage driver does not delete the corresponding item from the Dataloop dataset.
- Items in the bucket are deleted only when the last reference (pointer) to them is removed. As long as at least one pointer exists, the item remains.
- Click Create Storage Driver. A confirmation message is displayed.
To create a dataset based on external cloud storage, see Create a Dataset Based on an External Cloud Storage.
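As a rough illustration of which fields in the form above are required and which are optional, the following sketch collects them in a small data class and checks them before submission. `S3DriverForm` and `problems` are hypothetical names invented for this example and are not part of the Dataloop SDK. The bucket-name pattern is a simplified subset of Amazon's naming rules and is assumed here only for demonstration.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Simplified check (assumption): 3-63 characters, lowercase letters, digits,
# dots, and hyphens, starting and ending with a letter or digit.
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$")


@dataclass
class S3DriverForm:
    """Hypothetical container for the form fields; not a Dataloop SDK class."""
    name: str
    integration: str
    bucket_name: str
    region: str
    path: Optional[str] = None            # optional folder path
    storage_class: Optional[str] = None   # optional storage class
    allow_delete_items: bool = False      # optional checkbox

    def problems(self) -> List[str]:
        """Return human-readable problems; an empty list means the form is ready."""
        issues = []
        for field_name in ("name", "integration", "bucket_name", "region"):
            if not getattr(self, field_name).strip():
                issues.append(f"{field_name} is required")
        if self.bucket_name and not _BUCKET_RE.match(self.bucket_name):
            issues.append("bucket_name does not look like a valid S3 bucket name")
        return issues
```

A form with all four required fields filled in and a lowercase bucket name returns an empty list. Leaving the name blank or using uppercase letters in the bucket name produces one message per problem.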
GCP Storage Driver
If you're using the Google Cloud Storage (GCS) service, storage drivers establish the connection between your applications and the cloud storage infrastructure. A storage driver is an abstract representation of a bucket in the GCS service. For more information about GCS, see Cloud Storage.
This topic describes how to create GCS storage drivers in the Dataloop platform as part of your GCP integration.
You can create a GCS storage driver on the Dataloop platform only if you complete one of the following GCP integrations:
- Private Key
- Cross Project
For more information, see the Integration Overview.
Create a GCS Storage Driver on the Dataloop Platform
- Log in to the Dataloop platform.
- Select Data from the left-side panel.
- Select the Storage Drivers tab.
- Click Create Storage Driver. The Data Management Resource Creation right-side panel is displayed.
- Storage Driver Name: Enter a Name for the storage driver.
- Provider: Select GCP from the list.
- Resource: Ensure that GCS Bucket is selected (it is the default).
- Integration: Select the Cross Project or Private Key integration from the list.
- Bucket Name: Enter the GCS bucket name.
- Path (Optional): Enter the folder path, if required.
- Allow Delete Items (Optional): Select the checkbox, if required. When enabled, Dataloop removes items from the storage driver when those items are deleted from the Dataloop dataset.
- The reverse does not apply: deleting a file directly from the storage driver does not delete the corresponding item from the Dataloop dataset.
- Items in the bucket are deleted only when the last reference (pointer) to them is removed. As long as at least one pointer exists, the item remains.
- Click Create Storage Driver. A confirmation message is displayed.
To create a dataset based on an external cloud storage, see the Create a Dataset Based on an External Cloud Storage.
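The Allow Delete Items behavior described above amounts to reference counting: a file leaves the bucket only when its last pointer is removed. The following toy model sketches that rule. `BucketRefs` and its methods are hypothetical names for illustration only, and how Dataloop tracks pointers internally is not documented here, so treat this as an assumption-based sketch rather than the platform's implementation.

```python
class BucketRefs:
    """Toy model: counts how many dataset items point at each bucket file."""

    def __init__(self, allow_delete_items: bool):
        self.allow_delete_items = allow_delete_items
        self.refs = {}      # file path -> number of pointers
        self.files = set()  # files currently present in the bucket

    def add_pointer(self, path: str) -> None:
        """Register one more dataset item that points at the file."""
        self.files.add(path)
        self.refs[path] = self.refs.get(path, 0) + 1

    def delete_item(self, path: str) -> bool:
        """Remove one pointer; return True only if the bucket file was deleted."""
        if self.refs.get(path, 0) == 0:
            raise KeyError(f"no pointer to {path}")
        self.refs[path] -= 1
        # The file is removed only when no pointers remain AND the option is on.
        if self.refs[path] == 0 and self.allow_delete_items:
            self.files.discard(path)
            return True
        return False
```

With two pointers to the same file, the first deletion leaves the file in place and the second removes it. With the option disabled, the file stays in the bucket even after the last pointer is gone.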
Azure Blob Storage Driver
If you're using the Azure Blob Storage service, storage drivers establish the connection between your applications and the cloud storage infrastructure. A storage driver is an abstract representation of a container in Azure Blob Storage. For more information, see Azure Blob Storage.
This topic describes how to create Azure Blob storage drivers in the Dataloop platform as part of your Azure integration.
You can create an Azure storage driver on the Dataloop platform if you complete the Azure Secret Key integration.
For more information, see the Integration Overview.
Create an Azure Blob Storage Driver on the Dataloop Platform
- Log in to the Dataloop platform.
- Select Data from the left-side panel.
- Select the Storage Drivers tab.
- Click Create Storage Driver. The Data Management Resource Creation right-side panel is displayed.
- Storage Driver Name: Enter a Name for the storage driver.
- Provider: Select Azure from the list.
- Resource: Select Blob Storage from the list.
- Integration: Select the Azure (Client Secret) integration from the list.
- Bucket Name: Enter the name of the Azure container.
- Path (Optional): Enter the folder path, if required.
- Allow Delete Items (Optional): Select the checkbox, if required. When enabled, Dataloop removes items from the storage driver when those items are deleted from the Dataloop dataset.
- The reverse does not apply: deleting a file directly from the storage driver does not delete the corresponding item from the Dataloop dataset.
- Items in the bucket are deleted only when the last reference (pointer) to them is removed. As long as at least one pointer exists, the item remains.
- Click Create Storage Driver. A confirmation message is displayed.
To create a dataset based on external cloud storage, see Create a Dataset Based on an External Cloud Storage.
Azure Data Lake Gen2 Storage Driver
Azure Data Lake Storage Gen2 is a set of capabilities dedicated to big data analytics, built on Azure Blob Storage.
Data Lake Storage Gen2 provides file system semantics with a hierarchical directory structure.
A storage driver is an abstract representation of a container in Azure. For more information, see Azure Data Lake Storage Gen2.
This topic describes how to create Azure Data Lake Storage Gen2 storage drivers in the Dataloop platform as part of your Azure integration.
You can create an Azure storage driver on the Dataloop platform only if you complete the Azure Secret Key integration.
For more information, see the Integration Overview.
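The integration prerequisites listed on this page for each provider can be summarized as a lookup table. The sketch below is illustrative only: `REQUIRED_INTEGRATIONS` and `can_create_driver` are hypothetical names, and the integration labels are copied from the prerequisite lists on this page rather than from any Dataloop API.

```python
# Integration types this page lists as prerequisites for each provider.
REQUIRED_INTEGRATIONS = {
    "AWS": {"Cross Account", "Access Key", "STS"},
    "GCP": {"Private Key", "Cross Project"},
    "Azure": {"Secret Key"},
}


def can_create_driver(provider: str, completed: set) -> bool:
    """Return True if at least one required integration for the provider is done."""
    allowed = REQUIRED_INTEGRATIONS.get(provider, set())
    return bool(allowed & set(completed))
```

For example, a project with only an STS integration satisfies the AWS prerequisite but not the GCP or Azure one.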
Create an Azure Data Lake Gen2 Storage Driver in the Dataloop Platform
- Log in to the Dataloop platform.
- Select Data from the left-side panel.
- Select the Storage Drivers tab.
- Click Create Storage Driver. The Data Management Resource Creation right-side panel is displayed.
- Storage Driver Name: Enter a Name for the storage driver.
- Provider: Select Azure from the list.
- Resource: Select Data Lake Storage Gen2 from the list.
- Integration: Select the Azure (Client Secret) integration from the list.
- Bucket Name: Enter the name of the Azure container.
- Path (Optional): Enter the folder path, if required.
- Allow Delete Items (Optional): Select the checkbox, if required. When enabled, Dataloop removes items from the storage driver when those items are deleted from the Dataloop dataset.
- The reverse does not apply: deleting a file directly from the storage driver does not delete the corresponding item from the Dataloop dataset.
- Items in the bucket are deleted only when the last reference (pointer) to them is removed. As long as at least one pointer exists, the item remains.
- Click Create Storage Driver. A confirmation message is displayed.
To create a dataset based on external cloud storage, see Create a Dataset Based on an External Cloud Storage.
Edit Storage Drivers
- Navigate to the Data page using the left-side navigation.
- Select the Storage Drivers tab.
- In the Storage Drivers tab, find the storage driver that you want to edit.
- Click the Edit icon. An edit panel is displayed on the right side.
- Edit the Name of the Storage Driver.
- Click Save Changes. A confirmation message is displayed.
Delete Storage Drivers
- A storage driver cannot be deleted while datasets are connected to it. Delete the connected datasets before you delete the storage driver.
- Deleting a storage driver results in loss of access to any connected datasets.
- Navigate to the Data page using the left-side navigation.
- Select the Storage Drivers tab.
- In the Storage Drivers tab, find the storage driver that you want to delete.
- Click the Trash icon. A confirmation message is displayed.
- Click Delete Storage Driver. A message indicating successful deletion is displayed.
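The rule that a storage driver cannot be deleted while datasets are connected to it can be expressed as a simple pre-check. The sketch below assumes datasets are available as (dataset name, driver name) pairs. `blocking_datasets` and `can_delete` are hypothetical helpers for illustration, not Dataloop SDK functions.

```python
from typing import Iterable, List, Tuple


def blocking_datasets(driver_name: str,
                      datasets: Iterable[Tuple[str, str]]) -> List[str]:
    """Return the names of datasets still connected to the given driver."""
    return [ds_name for ds_name, drv_name in datasets if drv_name == driver_name]


def can_delete(driver_name: str, datasets: Iterable[Tuple[str, str]]) -> bool:
    """A driver is deletable only when no dataset is connected to it."""
    return not blocking_datasets(driver_name, datasets)
```

Listing the blocking datasets first shows which datasets must be deleted before the driver itself can be removed.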