File Format for Data Forwarding to an Amazon S3 Bucket

Note

Data forwarding is not currently supported for data assigned to the Infrequent Tier. 

After you start forwarding data to S3, file objects should begin to appear in your configured bucket. Sumo accumulates log messages as they are ingested and then forwards them to the bucket.

The log messages are saved in gzip-compressed CSV files, named according to the convention you specified when you configured Sumo to start data forwarding, as described in step 5 of Start data forwarding to S3. The file naming convention for legacy data forwarding is described below in Legacy File Naming Format.

Messages are buffered during ingestion for approximately 5 minutes or until 100MB of data is received, whichever comes first. The buffered data is then written to a new CSV file and forwarded.
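The "time or size, whichever comes first" rule above can be sketched as follows. This is an illustration only, not Sumo's implementation: the `MessageBuffer` class, its method names, and the reading of 100MB as 100 * 1024 * 1024 bytes are all assumptions made for the example.

```python
import time

FLUSH_INTERVAL_SECONDS = 5 * 60        # approximately 5 minutes
FLUSH_SIZE_BYTES = 100 * 1024 * 1024   # 100MB (assumed to mean binary megabytes)


class MessageBuffer:
    """Hypothetical buffer that flushes on age or size, whichever comes first."""

    def __init__(self, now=None):
        # `now` can be injected so the timing logic is testable.
        self.started_at = time.monotonic() if now is None else now
        self.size_bytes = 0
        self.messages = []

    def add(self, message: str) -> None:
        self.messages.append(message)
        self.size_bytes += len(message.encode("utf-8"))

    def should_flush(self, now=None) -> bool:
        now = time.monotonic() if now is None else now
        too_old = now - self.started_at >= FLUSH_INTERVAL_SECONDS
        too_big = self.size_bytes >= FLUSH_SIZE_BYTES
        return too_old or too_big
```

In practice this means a low-volume Source tends to produce a small file roughly every 5 minutes, while a high-volume Source can produce files more often, each close to the size limit.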

These file objects will contain the messages received as well as the system metadata for the messages, including:

  • messageId: The unique ID for the specific message within Sumo Logic.
  • sourceName: Is returned blank.
  • sourceHost: Is returned blank.
  • sourceCategory: Is returned blank.
  • messageTime: The parsed message time from the log message, as epoch.
  • receiptTime: The time the service originally received the message, as epoch.
  • sourceId: The unique ID of the Source configured to send the message to the service.
  • collectorId: The unique ID of the Collector configured to send the message to the service.
  • count: The message number from the specific log Source Name. These should be sequential for a specific Source file.
  • format: The timestamp format used to parse the message time from the log message.
  • encoding: The encoding of the original file contents.
  • message: The raw log message as read from the original Source.
  • <field>: Aggregate fields are added based on your query.
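The messageTime and receiptTime values are epoch timestamps. The 13-digit values in the sample object on this page appear to be in milliseconds, so a conversion along these lines may be useful when reading them. The helper name `epoch_millis_to_utc` is hypothetical, and the millisecond unit is an assumption based on that sample.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def epoch_millis_to_utc(value: str) -> datetime:
    """Convert an epoch-milliseconds string (assumed unit) to an aware UTC datetime."""
    # Integer arithmetic avoids floating-point rounding of the millisecond part.
    return EPOCH + timedelta(milliseconds=int(value))


# The messageTime value from the sample object on this page.
print(epoch_millis_to_utc("1472590091453").isoformat())
```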

Example

Metadata fields:

::: messageId,sourceName,sourceHost,sourceCategory,messageTime,receiptTime,sourceId,collectorId,count,format,view,encoding,message,field1,field2 :::

Sample object:

::: "-9223371513354977010","","","","1472590091453","1472590094034","101688020","100607825","979","plain:atp:o:0:l:29:p:yyyy-MM-dd HH:mm:ss,SSSZZZZ","JchenTest2","UTF8","2016-08-30 13:48:11,453 -0700 WARN [hostId=nite-cqsplitter-1][module=cqsplitter] [localUserName=cqsplitter][logger=cqsplitter.engine.CQsMultiMatchersManager] [thread=DTP-cqsplitter.receiver.consumer.v2.threadpool-6] MultiMatcher queue for customer 0000000000000131 is at capacity, adding element will block.","25","0000000000000131" :::
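A forwarded object like the one above can be read with Python's standard `gzip` and `csv` modules. The sketch below makes two assumptions: that the first row of each file is the header row of metadata field names, and that the file is UTF-8 encoded. The function name `parse_forwarded_csv` is hypothetical.

```python
import csv
import gzip
import io


def parse_forwarded_csv(data: bytes) -> list:
    """Decompress a gzip-compressed CSV object and return one dict per message.

    Assumes the first row holds the field names and the content is UTF-8.
    """
    text = gzip.decompress(data).decode("utf-8")
    # newline="" lets the csv module handle line breaks inside quoted messages.
    reader = csv.DictReader(io.StringIO(text, newline=""))
    return list(reader)


# Usage with a file downloaded from the bucket (the path is a placeholder):
# with open("forwarded-object.csv.gz", "rb") as handle:
#     for row in parse_forwarded_csv(handle.read()):
#         print(row["messageTime"], row["message"])
```

Because every value is quoted in the file, commas and line breaks inside the raw message field do not split the row when parsed this way.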

Legacy File Naming Format

The file naming convention for legacy data forwarding (prior to January 2017) is: <start_epoch>-<end_epoch>--<objectid>.csv.gz

Where:

  • start_epoch is the epoch time representing the parsed message time of the first message contained within the file.
  • end_epoch is the epoch time representing the parsed message time of the last message contained within the file.
  • objectid is a unique ID for the file object, which is generated by Sumo at creation time.
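A legacy file name can be split into its three parts with a regular expression, as sketched below. The function name `parse_legacy_name` and the example file name are made up for illustration, and the pattern assumes both epoch values are plain digit strings.

```python
import re

# <start_epoch>-<end_epoch>--<objectid>.csv.gz
LEGACY_NAME = re.compile(r"^(?P<start_epoch>\d+)-(?P<end_epoch>\d+)--(?P<objectid>.+)\.csv\.gz$")


def parse_legacy_name(name: str) -> dict:
    """Return the start_epoch, end_epoch, and objectid parts of a legacy file name."""
    match = LEGACY_NAME.match(name)
    if match is None:
        raise ValueError(f"not a legacy data forwarding file name: {name!r}")
    return match.groupdict()


# Hypothetical file name, not taken from a real bucket.
print(parse_legacy_name("1472590091453-1472590391453--abc123.csv.gz"))
```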