Skip to main content

DSV Formatter

Formats a writer's output as delimited text.

DSV Formatter properties

property

type

default value

notes

Charset

String

Column Delimiter

String

,

Header

Boolean

False

Set to True to add a header to the output files for FileWriter or S3Writer.

When the target's input stream is of a user-defined type, the header will include the field names.

When the target's input stream is the output of a DatabaseReader or CDC reader source, the header will include the source table's column names. In this case, the writer's properties must include directory: '%@metadata(TableName)%' or, in S3, bucketname: '%@metadata(TableName)%' or foldername: '%@metadata(TableName)%', and events for each table will be written to a separate directory or S3 bucket.

When Header is True, if any of the special characters listed in Using non-default case and special characters in table identifiers are used in source column names, they will be preserved in the output.Using non-default case and special characters in table identifiers

Known issue DEV-14229: in this release, headers may be incorrect if the writer's rolloverpolicy includes an interval. Workaround: use only eventcount and/or filesize in the rolloverpolicy.

Members

String

comma-separated list of fields to be selected from the writer's input stream; if left  blank, selects all fields

One use for this property is to remove fields used only to name the output directory in the target (see Setting output names and rollover / upload policies).

Null Value

String

NULL

Quote Character

String

"

Row Delimiter

String

\n

see Setting rowdelimiter values

Standard

String 

none

set to RFC4180 to format output using the RFC 4180 standard and ignore any conflicting values in other properties

Use Quotes

Boolean

False

set to True to escape values of type String using the quotecharacter

DSV Formatter sample application

For example, this variation on the PosApp sample application writes to a file using DSVFormatter:

CREATE SOURCE DSVFormatterTestSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  blocksize: 10240,
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) OUTPUT TO DSVSource_Stream;

CREATE CQ CsvToPosData
INSERT INTO DSVTransformed_Stream
SELECT TO_STRING(data[1]),
       TO_DATEF(data[4],'yyyyMMddHHmmss'),
       DHOURS(TO_DATEF(data[4],'yyyyMMddHHmmss')),
       TO_DOUBLE(data[7]),
       TO_STRING(data[9])
FROM DSVSource_Stream;

CREATE TARGET DSVFormatterOut using FileWriter(
  filename:'DSVFormatterOutput')
FORMAT USING DSVFormatter ()
INPUT FROM DSVTransformed_Stream;

The first lines of DSVFormatterOutput are:

D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu,2013-03-12T17:32:10.000-07:00,17,2.2,41363
OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1,2013-03-12T17:32:10.000-07:00,17,22.78,16950
ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx,2013-03-12T17:32:10.000-07:00,17,218.57,18224

If you set DSVFormatter to escape the strings, as follows:

FORMAT USING DSVFormatter (
  usequotes:True,
  quotecharacter:'"')

Then the first lines of DSVFormatterOutput would be:

"D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu",2013-03-12T17:32:10.000+01:00,17,2.2,"41363"
"OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1",2013-03-12T17:32:10.000+01:00,17,22.78,"16950"
"ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx",2013-03-12T17:32:10.000+01:00,17,218.57,"18224"