Skip to main content

DSV Formatter

Formats a writer's output as delimited text.

DSV Formatter properties

property	type	default value	notes
Charset	String
Column Delimiter	String	,
Header	Boolean	False	Set to True to add a header to the output files for FileWriter or S3Writer. When the target's input stream is of a user-defined type, the header will include the field names. When the target's input stream is the output of a DatabaseReader or CDC reader source, the header will include the source table's column names. In this case, the writer's properties must include `directory: '%@metadata(TableName)%'` or, in S3, `bucketname: '%@metadata(TableName)%'` or `foldername: '%@metadata(TableName)%'`, and events for each table will be written to a separate directory or S3 bucket. When Header is True, if any of the special characters listed in Using non-default case and special characters in table identifiers are used in source column names, they will be preserved in the output. Known issue DEV-14229: in this release, headers may be incorrect if the writer's rolloverpolicy includes an interval. Workaround: use only eventcount and/or filesize in the rolloverpolicy.
Members	String		comma-separated list of fields to be selected from the writer's input stream; if left blank, selects all fields One use for this property is to remove fields used only to name the output directory in the target (see Setting output names and rollover / upload policies).
Null Value	String	NULL
Quote Character	String	"
Row Delimiter	String	\n	see Setting rowdelimiter values
Standard	String	none	set to RFC4180 to format output using the RFC 4180 standard and ignore any conflicting values in other properties
Use Quotes	Boolean	False	set to True to escape values of type String using the quotecharacter

DSV Formatter sample application

For example, this variation on the PosApp sample application writes to a file using DSVFormatter:

CREATE SOURCE DSVFormatterTestSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'PosDataPreview.csv',
  blocksize: 10240,
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
) OUTPUT TO DSVSource_Stream;

CREATE CQ CsvToPosData
INSERT INTO DSVTransformed_Stream
SELECT TO_STRING(data[1]),
       TO_DATEF(data[4],'yyyyMMddHHmmss'),
       DHOURS(TO_DATEF(data[4],'yyyyMMddHHmmss')),
       TO_DOUBLE(data[7]),
       TO_STRING(data[9])
FROM DSVSource_Stream;

CREATE TARGET DSVFormatterOut using FileWriter(
  filename:'DSVFormatterOutput')
FORMAT USING DSVFormatter ()
INPUT FROM DSVTransformed_Stream;

The first lines of DSVFormatterOutput are:

D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu,2013-03-12T17:32:10.000-07:00,17,2.2,41363
OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1,2013-03-12T17:32:10.000-07:00,17,22.78,16950
ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx,2013-03-12T17:32:10.000-07:00,17,218.57,18224

If you set DSVFormatter to escape the strings, as follows:

FORMAT USING DSVFormatter (
  usequotes:True,
  quotecharacter:'"')

Then the first lines of DSVFormatterOutput would be:

"D6RJPwyuLXoLqQRQcOcouJ26KGxJSf6hgbu",2013-03-12T17:32:10.000+01:00,17,2.2,"41363"
"OFp6pKTMg26n1iiFY00M9uSqh9ZfMxMBRf1",2013-03-12T17:32:10.000+01:00,17,22.78,"16950"
"ljh71ujKshzWNfXMdQyN8O7vaNHlmPCCnAx",2013-03-12T17:32:10.000+01:00,17,218.57,"18224"

Would you like to provide feedback? Just click here to suggest edits.