File Reader

Reads files from disk using a compatible parser.

You can create FileReader sources in the web UI using Source Preview.

See Supported reader-parser combinations for parsing options.

File Reader properties

Block Size
  Type: Integer
  Default value: 64
  Notes: Amount of data, in KB, for each read operation.

Compression Type
  Type: String
  Notes: Set to gzip when the wildcard specifies a file or files in gzip format. Otherwise, leave blank.

Directory
  Type: String
  Notes: Specify the path to the directory containing the file(s). The path may be relative to the Striim installation directory (for example, Samples/PosApp/appData) or from the root.

Include Subdirectories
  Type: Boolean
  Default value: False
  Notes: Set to True if the files are written to subdirectories of the Directory path, for example, if each day's files are in a subdirectory named by date. When this property is False, the filename metadata in the File Reader output includes only the file name. When this property is True, the filename metadata includes the absolute path to the file.

Position By EOF
  Type: Boolean
  Default value: True
  Notes:

  • If set to True, reading starts at the end of the file, so only new data is acquired.

  • If set to False, reading starts at the beginning of the file and then continues with new data.

  • When FileReader is used with a cache, this setting is ignored and reading always begins from the beginning of the file.

  • When you create a FileReader using Source Preview, this is set to False.

Rollover Style
  Type: String
  Default value: Default
  Notes: Set to log4j if reading Log4J files created using RollingFileAppender.

Skip BOM
  Type: Boolean
  Default value: True
  Notes: If set to True, when the wildcard value specifies multiple files, Striim will read the Byte Order Mark (BOM) in the first file and skip the BOM in all other files. If set to False, it will read the BOM in every file.

Wildcard
  Type: String
  Notes: Specify the name of the file, or a wildcard pattern to match multiple files (for example, *.xml).

  • When reading multiple files, Striim will read them in the default order for the operating system.

  • While File Reader is reading a file, it will ignore any changes to the portion of the file that has already been read.

  • If a file is modified after File Reader has read it, it will be read again, which results in duplicate events.

The output type is WAevent except when using Avro Parser or JSONParser.
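
To illustrate how several of these properties might combine, the following is a minimal sketch of a source that reads gzip-compressed CSV files written to dated subdirectories. The source name, stream name, directory path, and wildcard are hypothetical. The TQL spellings compressiontype and includesubdirectories are assumptions: they are the property names above with the spaces removed, following the pattern of positionByEOF.

```
-- Hypothetical example: the names and paths below are placeholders.
-- compressiontype and includesubdirectories are assumed TQL spellings
-- of the Compression Type and Include Subdirectories properties.
CREATE SOURCE GzipCsvSource USING FileReader (
  directory:'/var/data/incoming',
  wildcard:'*.csv.gz',
  compressiontype:'gzip',
  includesubdirectories:true,
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes
)
OUTPUT TO GzipCsvStream;
```

With Include Subdirectories enabled, the filename metadata for each event would carry the absolute path rather than only the file name, as described above.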

File Reader sample code

When used with DSV Parser, the type for the output stream can be created automatically from the file header (see Creating the FileReader output stream type automatically).

Striim also provides templates for creating applications that read from files and write to various targets. See Creating an application using a template for details.

An example from the PosApp sample application:

CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'posdata.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
)
OUTPUT TO CsvStream;

See PosApp for a detailed explanation and MultiLogApp for additional examples.
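
Because the output type of this source is WAevent, a downstream query typically addresses fields by their position in the event's data array. The following sketch, modeled on the PosApp sample, shows one way that might look. The field positions and output field names are assumptions about the layout of posdata.csv, and PosDataStream is a hypothetical output stream.

```
-- Assumed layout: data[1] is the merchant ID, data[7] the amount,
-- and data[9] the zip code. Adjust the indexes to match the actual file.
CREATE CQ CsvToPosData
INSERT INTO PosDataStream
SELECT TO_STRING(data[1]) AS merchantId,
  TO_DOUBLE(data[7]) AS amount,
  TO_STRING(data[9]) AS zip
FROM CsvStream;
```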

Creating the output stream type automatically

When FileReader is used with DSV Parser, the type for the output stream can be created automatically from the file header using OUTPUT TO <stream name> MAP(filename:'<source file name>'). For example:

CREATE SOURCE CsvDataSource USING FileReader (
  directory:'Samples/PosApp/appData',
  wildcard:'posdata*.csv',
  positionByEOF:false
)
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
)
OUTPUT TO CsvStream MAP(filename:'posdata*.csv');

Notes:

  • The specified source file must exist when the source is created.

  • The header must be the first line of the file (the HeaderLineNo setting is ignored by MAP).

  • If multiple files are specified by the wildcard property, the header will be taken from the first one read.

  • All files must be like the first one read, with headers in the first line and the same number of fields.

Creating the FileReader output stream type automatically

When FileReader is used with DSV Parser, the type for the output stream can be created automatically from the file header using OUTPUT TO <stream name> MAP(filename:'<source file name>'). A regular, unmapped output stream must also be specified. For example:

CREATE SOURCE PosSource USING FileReader (
  wildcard: 'PosDataPreview*.csv',
  directory: 'Samples/PosApp/appData',
  positionByEOF:false )
PARSE USING DSVParser (
  header:Yes,
  trimquote:false
)
OUTPUT TO PosSource_Stream,
OUTPUT TO PosSource_Mapped_Stream MAP(filename:'PosDataPreview.csv');

Notes:

  • When you use a MAP clause, you may not specify the Column Delimit Till, Header Line No, Line Number, or No Column Delimiter properties.

  • The file specified in the MAP clause must be in the directory specified by FileReader's directory property when the source is created.

  • The header must be the first line of that file.

  • The column names in the header can contain only alphanumeric, _ (underscore) and $ (dollar sign) characters, and may not begin with numbers.

  • All files to be read must be similar to the one specified in the MAP clause, with headers (which will be ignored) in the first line and the same number of fields.

  • All fields in the output stream type will be of type String.

  • In this release, this feature is available only in the console. The MAP clause cannot be edited in the web UI, and changing the Wildcard property value in the web UI will break the source.
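
Since every field in the mapped stream type is a String named after a header column, a downstream query can refer to fields by name and convert them as needed. The following is a sketch only: merchantId and authAmount are hypothetical header names (the actual names depend on the header of PosDataPreview.csv), and PosTypedStream is a hypothetical output stream.

```
-- merchantId and authAmount are hypothetical column names taken from
-- an assumed file header. Both arrive as String, so the amount is converted.
CREATE CQ ConvertMappedFields
INSERT INTO PosTypedStream
SELECT merchantId,
  TO_DOUBLE(authAmount) AS amount
FROM PosSource_Mapped_Stream;
```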