
Upload File Azure Data Lake Rest Api

Azure Data Lake Storage Gen2 (ADLS Gen2) is a set of capabilities dedicated to big data analytics, built on Azure Blob storage, so it supports the Azure Blob Storage API while also having its own File System API.

Blob Storage API: https://docs.microsoft.com/en-us/rest/api/storageservices/operations-on-blobs

File System API: https://docs.microsoft.com/en-us/rest/api/storageservices/data-lake-storage-gen2

These interfaces allow you to create and manage file systems, as well as to create and manage directories and files in a file system. Azure Data Lake Storage Gen2 APIs support Azure Active Directory (Azure AD), Shared Key, and shared access signature (SAS) authorization.

In this blog, we will introduce how to use an Azure AD service principal to upload a file to ADLS Gen2 through the File System API using a PowerShell script.

Part 1: Register an application with the Microsoft identity platform and apply the valid role assignment for access. https://docs.microsoft.com/en-us/azure/active-directory/develop/quickstart-register-app

1. Register a new application in Azure AD.

Frank_Pan_0-1611987361652.png

2. Select the account type based on your business requirements.

Frank_Pan_2-1611987494564.png

3. Assign the Storage Blob Data Owner role to the service principal, which grants the service principal full access to blob data. You may assign other blob data roles according to your business requirements. For details of the built-in roles' permissions, please refer to the document https://docs.microsoft.com/en-us/azure/role-based-access-control/built-in-roles#storage-blob-data-ow....

Frank_Pan_3-1611987578967.png

Part 2: Generate an access token of the service principal for the REST API calls. https://docs.microsoft.com/en-us/rest/api/azure/#client-credentials-grant-non-interactive-clients

1. In the Azure Portal application Overview, we can obtain the Application ID (client id) and Directory ID (tenant id).

Frank_Pan_4-1611987728000.png

2. In Certificates & secrets, create a secret with an expiration time.

Frank_Pan_5-1611987780748.png

Frank_Pan_6-1611987813799.png

3. To generate an access token for the storage, we need to name the resource endpoint for the storage resource provider as storage.azure.com.

In the document https://docs.microsoft.com/en-us/azure/active-directory/develop/v2-oauth2-client-creds-grant-flow#ge..., we can see how a token endpoint works in a common scenario.

PowerShell function example:

          function Get-StorageAADAccessToken() {
              param($TENANT_ID, $client_id, $client_secret)

              $URI = "https://login.microsoftonline.com/$TENANT_ID/oauth2/v2.0/token" # We are using OAuth version 2
              $CONTENT_TYPE = "application/x-www-form-urlencoded"

              $HEADERS = @{
                  "Content-Type" = $CONTENT_TYPE
              }

              $grant_type = "client_credentials"
              $resource = "https://storage.azure.com/.default"

              $BODY = "grant_type=$grant_type&client_id=$client_id&client_secret=$client_secret&scope=$resource"
              $ACCESS_TOKEN = (Invoke-RestMethod -Method POST -Uri $URI -Headers $HEADERS -Body $BODY).access_token

              return $ACCESS_TOKEN
          }
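A client secret can contain characters such as + or & that have a special meaning in a form-encoded body. As a minimal sketch, the same request body could be built with every value URL-encoded first. The helper name New-TokenRequestBody and the sample values are hypothetical and not part of the original script.

```powershell
# Hypothetical helper (not in the original post): builds the
# application/x-www-form-urlencoded body with every value URL-encoded.
function New-TokenRequestBody {
    param($client_id, $client_secret)

    $fields = [ordered]@{
        grant_type    = "client_credentials"
        client_id     = $client_id
        client_secret = $client_secret
        scope         = "https://storage.azure.com/.default"
    }

    # Encode each value so that characters like '+', '&' and '=' survive transport.
    $pairs = foreach ($key in $fields.Keys) {
        "$key=" + [System.Uri]::EscapeDataString([string]$fields[$key])
    }
    return ($pairs -join "&")
}

# Sample values are placeholders, not real credentials.
New-TokenRequestBody -client_id "00000000-0000-0000-0000-000000000000" -client_secret "a+b&c=d"
```

The returned string could be passed as -Body in place of the hand-built $BODY string, assuming the rest of the function stays unchanged.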

Part 3: Upload the file using the File System interface.

Uploading a file through the File System interface uses three APIs: Create File, Append Data and Flush Data. All APIs use the *.dfs.core.windows.net endpoint instead of the *.blob.core.windows.net endpoint.

  • Create: https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/create
  • Update (Append & Flush): https://docs.microsoft.com/en-us/rest/api/storageservices/datalakestoragegen2/path/update
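All three calls target the same kind of URI on the dfs endpoint and differ only in the query string. The sketch below shows one way to build it. The helper Get-AdlsPathUri and its -QUERY parameter are hypothetical, and escaping each path segment is an added precaution for names that contain spaces, not something the original script does.

```powershell
# Hypothetical helper (not in the original post): builds a File System API URI.
function Get-AdlsPathUri {
    param($STORAGE_ACCOUNT_NAME, $ROOT, $PREFIX, $QUERY)

    # Escape each path segment separately so that "/" keeps separating directories.
    $segments = foreach ($segment in ($PREFIX -split "/")) {
        [System.Uri]::EscapeDataString($segment)
    }
    $path = $segments -join "/"

    # The "?" is concatenated separately so it is not parsed as part of a variable name.
    return "https://$STORAGE_ACCOUNT_NAME.dfs.core.windows.net/$ROOT/$path" + "?" + $QUERY
}

# Account, container and path are sample values.
Get-AdlsPathUri -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX "dir one/file1" -QUERY "resource=file"
```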

Here is the logic flow to upload a large file.

  • The first position is 0.
  • The next position is the last position plus the last content length.
  • We can send multiple append data requests at the same time, but the position of each one needs to be calculated beforehand.
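The position rules above can be sketched as a small planning function that, given a total size and a chunk size, lists the position and length of every append request. The function name Get-AppendPlan and the sizes used are assumptions chosen for illustration.

```powershell
# Hypothetical helper (not in the original post): plans the append requests
# for a payload of $TotalBytes split into chunks of at most $ChunkBytes.
function Get-AppendPlan {
    param([long]$TotalBytes, [long]$ChunkBytes)

    if ($ChunkBytes -le 0) { throw "ChunkBytes must be greater than zero." }

    $pos = [long]0
    while ($pos -lt $TotalBytes) {
        # The last chunk may be shorter than the others.
        $len = [Math]::Min($ChunkBytes, $TotalBytes - $pos)
        [pscustomobject]@{ Position = $pos; Length = $len }
        # Next position = last position + last content length.
        $pos += $len
    }
}

# Example: a 10 MB payload sent in 4 MB chunks.
$plan = @(Get-AppendPlan -TotalBytes 10MB -ChunkBytes 4MB)
$plan

# The flush position is the total number of bytes appended.
$flushPosition = ($plan | Measure-Object -Property Length -Sum).Sum
```

Because every position is known up front, the append requests in such a plan do not depend on each other's responses.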

Frank_Pan_7-1611988384485.png

PowerShell method examples:

1. Create File is a Create API in the file system. By default, the destination is overwritten if the file already exists and has a broken lease.

          function Create-AzureADLS2File() {
              ## storage account is the name of the ADLS Gen2 account, root is the file system container,
              ## prefix is the path and file name in the storage account
              param($STORAGE_ACCOUNT_NAME, $ROOT, $PREFIX)

              $URI = "https://$STORAGE_ACCOUNT_NAME.dfs.core.windows.net/" + $ROOT + "/" + $PREFIX + "?resource=file"
              $DATE = [System.DateTime]::UtcNow.ToString("R")

              $ACCESS_TOKEN = Get-StorageAADAccessToken -TENANT_ID $TENANT_ID -client_id $CLIENT_ID -client_secret $CLIENT_SECRET

              $HEADERS = @{
                  "x-ms-date" = $DATE
                  "x-ms-version" = "2019-12-12"
                  "authorization" = "Bearer $ACCESS_TOKEN"
              }
              Invoke-RestMethod -Method PUT -Uri $URI -Headers $HEADERS
          }

After creating a file with the custom PowerShell function as shown below, you will get a zero size file.

          Create-AzureADLS2File -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX file1                  

Frank_Pan_8-1611988803457.jpeg

2. Append Data is a part of the Update API in the file system. "append" uploads data by appending it to a file.

          function Upload-AzureADLS2File() {
              param($STORAGE_ACCOUNT_NAME, $ROOT, $PREFIX, $POS, $BODY)

              $URI = "https://$STORAGE_ACCOUNT_NAME.dfs.core.windows.net/" + $ROOT + "/" + $PREFIX + "?action=append&position=$POS"
              $DATE = [System.DateTime]::UtcNow.ToString("R")

              $ACCESS_TOKEN = Get-StorageAADAccessToken -TENANT_ID $TENANT_ID -client_id $CLIENT_ID -client_secret $CLIENT_SECRET

              # For append, Content-Length must be the body length; Invoke-RestMethod sets it from -Body.
              $HEADERS = @{
                  "x-ms-date" = $DATE
                  "x-ms-version" = "2019-12-12"
                  "authorization" = "Bearer $ACCESS_TOKEN"
              }
              Invoke-RestMethod -Method PATCH -Uri $URI -Headers $HEADERS -Body $BODY
          }

If we have the content below, we can get a list of positions and content lengths.

data row 1

data row 22

data row 333

Frank_Pan_0-1611989162194.png

          Upload-AzureADLS2File -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX file1 -POS 0 -BODY "data row 1`n"
          Upload-AzureADLS2File -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX file1 -POS 11 -BODY "data row 22`n"
          Upload-AzureADLS2File -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX file1 -POS 23 -BODY "data row 333`n"

There will be no data in the file until you flush all content in the file.
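The positions 0, 11 and 23 used in these calls come from the byte length of each row, including its trailing line feed. A short sketch of that arithmetic follows, assuming the body is sent as UTF-8 (for these ASCII rows that is one byte per character):

```powershell
# The three example rows; each one ends with a line feed.
$rows = "data row 1`n", "data row 22`n", "data row 333`n"

$pos = 0
foreach ($row in $rows) {
    # Positions count bytes, not characters, so measure the encoded bytes.
    $len = [System.Text.Encoding]::UTF8.GetByteCount($row)
    "position=$pos length=$len"
    $pos += $len
}

# After the loop, $pos is the total length, which is the flush position.
"flush position=$pos"
```

This prints lengths of 11, 12 and 13 bytes at positions 0, 11 and 23, and a flush position of 36.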

Frank_Pan_1-1611989257000.jpeg

3. Flush Data is a part of the Update API in the file system. "flush" flushes previously uploaded data to a file. This request is similar to PutBlockList in the Blob Storage API, but it needs to specify the position.

          function Flush-AzureADLS2File() {
              param($STORAGE_ACCOUNT_NAME, $ROOT, $PREFIX, $POS)

              $URI = "https://$STORAGE_ACCOUNT_NAME.dfs.core.windows.net/" + $ROOT + "/" + $PREFIX + "?action=flush&position=$POS"
              $DATE = [System.DateTime]::UtcNow.ToString("R")

              $ACCESS_TOKEN = Get-StorageAADAccessToken -TENANT_ID $TENANT_ID -client_id $CLIENT_ID -client_secret $CLIENT_SECRET

              # Flush sends no body, so Content-Length is 0; the position goes in the query string.
              $HEADERS = @{
                  "x-ms-date" = $DATE
                  "x-ms-version" = "2019-12-12"
                  "authorization" = "Bearer $ACCESS_TOKEN"
                  "content-length" = 0
              }
              Invoke-RestMethod -Method PATCH -Uri $URI -Headers $HEADERS
          }

          Flush-AzureADLS2File -STORAGE_ACCOUNT_NAME frankpanadls2 -ROOT test -PREFIX file1 -POS 36

We will see the flushed file like below with all content.

Frank_Pan_2-1611989481613.png


Source: https://techcommunity.microsoft.com/t5/azure-paas-blog/how-use-storage-adls-gen2-rest-api-to-upload-file-via-aad-access/ba-p/2108778
