Describe the question/issue
Hi team, the metadata fields container_name, container_id, and source go missing once we enable the multiline parser. We are on AWS ECS with Fargate and use FireLens as the log driver.
It appears that after multiline aggregation, the re-emitted record no longer contains the original metadata fields that FireLens injected for stdout logs.
Before adding the multiline parser, the metadata fields were present:
{
"@timestamp": "202x-xx-xxT00:00:00.000Z",
"container_id": "XX",
"container_name": "XX",
"source": "stdout",
"log": "XX",
"ecs_cluster": "XX",
"ecs_task_arn": "arn:aws:ecs:XX",
"ecs_task_definition": "XX-task-definition:314"
}
After deploying task definition revision 315 with the multiline parser enabled, stdout logs are missing container_id, container_name, and source:
{
"@timestamp": "202x-xx-xxT00:00:00.000Z",
"log": "XX",
"ecs_cluster": "XX",
"ecs_task_arn": "arn:aws:ecs:XX",
"ecs_task_definition": "XX-task-definition:315"
}
Interestingly, stderr logs still contain those fields.
So in our case:
Multiline disabled = all fields present
Multiline enabled = fields missing only for stdout
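To make the difference explicit, the two redacted sample records above can be compared key by key. This is only a small sketch: the helper name `missing_keys` is made up for illustration, and the record values are the same placeholders shown above.

```python
import json

# Redacted stdout record from revision 314 (multiline disabled).
before = json.loads("""
{
  "@timestamp": "202x-xx-xxT00:00:00.000Z",
  "container_id": "XX",
  "container_name": "XX",
  "source": "stdout",
  "log": "XX",
  "ecs_cluster": "XX",
  "ecs_task_arn": "arn:aws:ecs:XX",
  "ecs_task_definition": "XX-task-definition:314"
}
""")

# Redacted stdout record from revision 315 (multiline enabled).
after = json.loads("""
{
  "@timestamp": "202x-xx-xxT00:00:00.000Z",
  "log": "XX",
  "ecs_cluster": "XX",
  "ecs_task_arn": "arn:aws:ecs:XX",
  "ecs_task_definition": "XX-task-definition:315"
}
""")


def missing_keys(old: dict, new: dict) -> list[str]:
    """Return the keys present in `old` but absent from `new`, sorted."""
    return sorted(set(old) - set(new))


print(missing_keys(before, after))
# → ['container_id', 'container_name', 'source']
```

The same comparison run against a stderr record pair would be expected to return an empty list, since stderr records keep all fields.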
Any help would be greatly appreciated 🙏.
Configuration
Our configuration:
[SERVICE]
    HTTP_Server      On
    HTTP_Listen      0.0.0.0
    HTTP_PORT        2020
    storage.metrics  On
    Parsers_File     /custom-parsers.conf
    Log_Level        debug

[FILTER]
    Name                   multiline
    Match                  *
    Multiline.Key_Content  log
    Multiline.Parser       go, unknown-query-context

[OUTPUT]
    Name               cloudwatch_logs
    Match              *
    log_group_name     xx
    log_stream_prefix  xx
    region             xx
    retry_limit        2

[OUTPUT]
    Name               kafka
    Match              *
    message_key_field  ecs_task_arn
    xx xx // kafka properties redacted
    retry_limit        2
    timestamp_format   iso8601_ns
The custom multiline parser:
[MULTILINE_PARSER]
    name          unknown-query-context
    type          regex
    #
    # Regex rules for multiline parsing
    # ---------------------------------
    #
    # configuration hints:
    #
    #  - first state always has the name: start_state
    #  - every field in the rule must be inside double quotes
    #
    # rules |   state name  | regex pattern                                   | next state
    # ------|---------------|-------------------------------------------------|------------
    rule      "start_state"   "^\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} "           "cont"
    rule      "cont"          "^(?!\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} ).*"     "cont"
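For anyone who wants to sanity-check the two rules outside Fluent Bit, here is a rough sketch of how they group lines. It is an approximation only: it uses Python's `re` module instead of the regex engine Fluent Bit uses, the grouping loop is a simplified stand-in for the real state machine, and the sample log lines are invented for illustration.

```python
import re

# The two patterns copied from the MULTILINE_PARSER rules above.
START = re.compile(r"^\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} ")
CONT = re.compile(r"^(?!\d{4}\/\d{2}\/\d{2} \d{2}:\d{2}:\d{2} ).*")


def group_lines(lines: list[str]) -> list[str]:
    """Concatenate continuation lines onto the preceding start line.

    Simplified model: a line matching START opens a new group; a line
    matching CONT is appended to the current group, if there is one.
    """
    groups: list[list[str]] = []
    for line in lines:
        if START.match(line) or not groups:
            groups.append([line])
        elif CONT.match(line):
            groups[-1].append(line)
        else:
            groups.append([line])
    return ["\n".join(group) for group in groups]


# Hypothetical sample input, not taken from the real application logs.
sample = [
    "2024/01/02 03:04:05 unknown query context",
    "    at some detail line",
    "    at another detail line",
    "2024/01/02 03:04:06 next message",
]

for record in group_lines(sample):
    print(repr(record))
```

With this sample, the first three lines end up in one record and the last line in a second one, which matches the aggregation we see. The sketch says nothing about the metadata fields; it only shows that the rules themselves group lines as intended.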
Fluent Bit Log Output
No Fluent Bit log lines appear to be related to this problem.
Relevant startup excerpts (sanitized):
Fluent Bit v4.2.0
[input:forward] listening on unix socket
[input:forward] listening on 127.0.0.1:24224
[filter:multiline] created emitter: emitter_for_multiline.1
[input:emitter] initializing
[output:cloudwatch_logs] worker started
[output:kafka] brokers='REDACTED:PORT' topics='REDACTED'
[http_server] listen iface=0.0.0.0 tcp_port=2020
Fluent Bit Version Info
aws-for-fluent-bit:v3.1.1
Cluster Details
- AWS ECS Fargate
- FireLens log driver
- Fluent Bit runs as FireLens sidecar
- Metadata enrichment (container_name, container_id, source) is injected by FireLens
Application Details
The application logs have not changed: the redacted values stand for the exact same message, produced in both deployments.
Steps to reproduce issue
- Deploy ECS task with FireLens log driver
- Confirm metadata fields exist for both stdout and stderr
- Enable multiline filter
- Observe:
  - stdout logs lose container_id, container_name, and source
  - stderr logs retain those fields
Related Issues
Yes, the exact same issue: #959.
Since that issue was marked as completed, I opened a new one.