MultiFile Reader

Reads files from disk. This reader is similar to File Reader except that it reads from multiple files at once.

See Supported reader-parser combinations for parsing options.

MultiFile Reader properties

| property | type | default value | notes |
|---|---|---|---|
| Block Size | Integer | 64 | amount of data in KB for each read operation |
| Compression Type | String | | Set to gzip when wildcard specifies a file or files in gzip format. Otherwise, leave blank. |
| Directory | String | | Specify the path to the directory containing the file(s). The path may be relative to the Striim installation directory (for example, Samples/PosApp/appdata) or from the root. |
| Group Pattern | String | | a regular expression defining the rollover pattern for each set of files (see Using regular expressions (regex)) |
| Position by EOF | Boolean | True | If set to True, reading starts at the end of the file, so only new data is acquired. If set to False, reading starts at the beginning of the file and then continues with new data. |
| Rollover Style | String | Default | Set to log4j if reading Log4J files created using RollingFileAppender. |
| Skip BOM | Boolean | True | If set to True, when the wildcard value specifies multiple files, Striim will read the Byte Order Mark (BOM) in the first file and skip the BOM in all other files. If set to False, it will read the BOM in every file. |
| Thread Pool Size | Integer | 20 | For best performance, set to the maximum number of files that will be read at once. |
| Wildcard | String | | name of the file, or a wildcard pattern to match multiple files (for example, *.xml) |
| Yield After | Integer | 20 | the number of events after which a thread will be handed off to the next read process |

The output type is WAEvent except when using Avro Parser or JSONParser.
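
As a rough sketch of how several of these properties might be combined, the following fragment reads gzip-compressed files from the beginning of each file rather than from the end. The source name, directory, and wildcard are hypothetical. The property spellings (compressiontype, positionbyeof, threadpoolsize) assume that TQL accepts the property names listed above with the spaces removed, as the examples on this page do for grouppattern.

```sql
-- Sketch only: GzipLogSource and Samples/logs are hypothetical names.
-- Property spellings are assumed from the property list above.
CREATE SOURCE GzipLogSource USING MultiFileReader (
  directory:'Samples/logs',
  wildcard:'*.gz',
  compressiontype:'gzip',
  positionbyeof:false,
  threadpoolsize:4
)
```

Here threadpoolsize is set to 4 on the assumption that at most four files will be read at once.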

MultiFile Reader example

This example would recognize log.proc1.0 and log.proc1.1 as parts of one log and log.proc2.0 and log.proc2.1 as parts of another, ensuring that all the events from each log will be read in the correct order.

CREATE SOURCE MFRtest USING MultiFileReader (
  directory:'Samples',
  WildCard:'log.proc*',
  grouppattern:'(?:(?:(?:<[^>]+>)*[^<.]*)*.){2}'
)

Alternatively, you can use this statement, whose group pattern matches log.proc followed by one to three digits, to ensure the events from each log are read in the correct order:

CREATE SOURCE MFRtest USING MultiFileReader (
  directory:'Samples',
  WildCard:'log.proc*',
  grouppattern:'log\\.proc[0-9]{1,3}'
)
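
The statements above show only the reader properties. In an application, a source would typically also name a parser and an output stream. The following is a minimal sketch of what that might look like, not a tested configuration: the stream name MFRtestStream is hypothetical, DSVParser with default properties is assumed purely for illustration, and the positionbyeof spelling is assumed from the property list.

```sql
-- Sketch only: MFRtestStream is a hypothetical stream name, and
-- DSVParser with default properties is an assumed parser choice.
CREATE SOURCE MFRtest USING MultiFileReader (
  directory:'Samples',
  wildcard:'log.proc*',
  grouppattern:'log\\.proc[0-9]{1,3}',
  positionbyeof:false
)
PARSE USING DSVParser ()
OUTPUT TO MFRtestStream;
```

Setting positionbyeof to false makes the reader start at the beginning of each file, so existing events in log.proc1.0 and log.proc1.1 are read before any new data.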