ADLS Gen1 Writer
Writes to files in Azure Data Lake Storage Gen1. A common use case is writing data from on-premises sources to an ADLS staging area, from which it can be consumed by Azure-based analytics tools.
ADLS Gen1 Writer properties
property | type | default value | notes |
---|---|---|---|
Auth Token Endpoint | String | | the token endpoint URL for your web application (see "Generating the Service Principal" under Using Client Keys) |
Client ID | String | | the application ID for your web application (see "Generating the Service Principal" under Using Client Keys) |
Client Key | encrypted password | | the key for your web application (see "Generating the Service Principal" under Using Client Keys) |
Compression Type | String | | Set to gzip to compress the output. Leave blank (the default) to write uncompressed files. |
Data Lake Store Name | String | | the name of your Data Lake Storage Gen1 account, for example, mydlsname.azuredatalakestore.net (do not include adl://) |
Directory | String | | The full path to the directory in which to write the files. See Setting output names and rollover / upload policies for advanced options. |
File Name | String | | The base name of the files to be written. See Setting output names and rollover / upload policies. |
Rollover on DDL | Boolean | True | Has effect only when the input stream is the output stream of a CDC reader source. With the default value of True, rolls over to a new file when a DDL event is received. Set to False to keep writing to the same file. |
Rollover Policy | String | eventcount:10000, interval:30s | See Setting output names and rollover / upload policies. |
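For example, a target that gzip-compresses its output and rolls over every 5,000 events or 60 seconds might be declared as follows. This is a sketch: the property names compressiontype and rolloverpolicy are assumed from the display names in the table above, and the account, credential, and stream names are placeholders.

```
CREATE TARGET CompressedADLSTarget USING ADLSGen1Writer (
  directory: 'staging',
  filename: 'events.json',
  datalakestorename: 'mydlsname.azuredatalakestore.net',
  clientid: '<application-id>',
  authtokenendpoint: 'https://login.microsoftonline.com/<tenant-id>/oauth2/token',
  clientkey: '<client-key>',
  compressiontype: 'gzip',
  rolloverpolicy: 'eventcount:5000,interval:60s'
)
FORMAT USING JSONFormatter ()
INPUT FROM SomeInputStream;
```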
This adapter has a choice of formatters. See Supported writer-formatter combinations for more information.
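For example, to write delimited text instead of JSON, the FORMAT USING clause in the sample application below could name DSVFormatter rather than JSONFormatter (assuming that combination is supported for this writer; check the linked topic):

```
FORMAT USING DSVFormatter ()
```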
Data is written in 4 MB batches or whenever rollover occurs.
ADLS Gen1 Writer sample application
```
CREATE APPLICATION testADLSGen1;

CREATE SOURCE PosSource USING FileReader (
  wildcard: 'PosDataPreview.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF: false
)
PARSE USING DSVParser (
  header: Yes,
  trimquote: false
)
OUTPUT TO PosSource_Stream;

CREATE CQ PosSource_Stream_CQ
INSERT INTO PosSource_TransformedStream
SELECT TO_STRING(data[1]) AS MerchantId,
  TO_DATE(data[4]) AS DateTime,
  TO_DOUBLE(data[7]) AS AuthAmount,
  TO_STRING(data[9]) AS Zip
FROM PosSource_Stream;

CREATE TARGET testADLSGen1target USING ADLSGen1Writer (
  directory: 'mydir',
  filename: 'myfile.json',
  datalakestorename: 'mydlsname.azuredatalakestore.net',
  clientid: '********-****-****-****-************',
  authtokenendpoint: 'https://login.microsoftonline.com/********-****-****-****-************/oauth2/token',
  clientkey: '********************************************'
)
FORMAT USING JSONFormatter ()
INPUT FROM PosSource_TransformedStream;

END APPLICATION testADLSGen1;
```
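To run the sample, you would then deploy and start the application from the console (a sketch, assuming the standard Striim console commands):

```
DEPLOY APPLICATION testADLSGen1;
START APPLICATION testADLSGen1;
```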
Since the test data set contains fewer than 10,000 events, and ADLSGen1Writer is using the default rollover policy, the data will be uploaded when the 30-second interval elapses.