DSV Formatter
Formats a writer's output as delimited text.
DSV Formatter properties
property | type | default value | notes |
---|---|---|---|
Charset | String | ||
Column Delimiter | String | , | |
Header | Boolean | False | Set to True to add a header to the output files for FileWriter or S3Writer. When the target's input stream is of a user-defined type, the header will include the field names. When the target's input stream is the output of a DatabaseReader or CDC reader source, the header will include the source table's column names. In this case, the writer's properties must include When Header is True, if any of the special characters listed in Using non-default case and special characters in table identifiers are used in source column names, they will be preserved in the output. Known issue DEV-14229: in this release, headers may be incorrect if the writer's rolloverpolicy includes an interval. Workaround: use only eventcount and/or filesize in the rolloverpolicy. |
Members | String | comma-separated list of fields to be selected from the writer's input stream; if left blank, selects all fields One use for this property is to remove fields used only to name the output directory in the target (see Setting output names and rollover / upload policies). | |
Null Value | String | NULL | |
Quote Character | String | " | |
Row Delimiter | String | \n | |
Standard | String | none | set to RFC4180 to format output using the RFC 4180 standard and ignore any conflicting values in other properties |
Use Quotes | Boolean | False | set to True to escape values of type String using the quotecharacter |
DSV Formatter sample application
For example, this variation on the PosApp sample application writes to a file using DSVFormatter:
CREATE SOURCE DSVFormatterTestSource USING FileReader ( directory:'Samples/PosApp/appData', wildcard:'PosDataPreview.csv', blocksize: 10240, positionByEOF:false ) PARSE USING DSVParser ( header:Yes, trimquote:false ) OUTPUT TO DSVSource_Stream; CREATE CQ CsvToPosData INSERT INTO DSVTransformed_Stream SELECT TO_STRING(data[1]), TO_DATEF(data[4],'yyyyMMddHHmmss'), DHOURS(TO_DATEF(data[4],'yyyyMMddHHmmss')), TO_DOUBLE(data[7]), TO_STRING(data[9]) FROM DSVSource_Stream; CREATE TARGET DSVFormatterOut using FileWriter( filename:'DSVFormatterOutput') FORMAT USING DSVFormatter () INPUT FROM DSVTransformed_Stream;
The first lines of DSVFormatterOutput
are:
D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu,2013-03-12T17:32:10.000-07:00,17,2.2,41363 OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1,2013-03-12T17:32:10.000-07:00,17,22.78,16950 ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx,2013-03-12T17:32:10.000-07:00,17,218.57,18224
If you set DSVFormatter to escape the strings, as follows:
FORMAT USING DSVFormatter ( usequotes:True, quotecharacter:'"')
Then the first lines of DSVFormatterOutput
would be:
"D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu",2013-03-12T17:32:10.000+01:00,17,2.2,"41363" "OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1",2013-03-12T17:32:10.000+01:00,17,22.78,"16950" "ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx",2013-03-12T17:32:10.000+01:00,17,218.57,"18224"