Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix conditional evaluation with a dangling filter #4058

Merged

Conversation

bazsi
Copy link
Collaborator

@bazsi bazsi commented Jun 24, 2022

This PR started out as a fix for an issue reported by @faxm0dem on gitter, with config and patterndb here: https://gist.github.com/faxm0dem/3dfc0873842d4c05ebcd8cb7f8eda8ab

The issue turned out to be an issue how we process the "unmatched" state for a log path. With time, another case came up in #4339 and it turns out we have a limited-incomplete definition on how we define that a log path did not match.

Since this is a complicated issue, let me try to write this up, hopefully it will be useful for my future self or anyone trying to understand the compilation process.

So let's take one step back.

Why do we need the notion of a message being "unmatched" for a log path?

syslog-ng supports both independent and dependent log paths. An independent one is simple: let's take the stream of messages and route them to multiple outputs. These outputs are independent, as one output does not depend on the result of the other.

For example, there are two independent outgoing paths from the whatever source:

log {
    source(whatever);
    log {
        filter(app1);
        destination(app1_logfile);
   };
   log {
       filter(app2);
       destinatrion(app2_logfile);
   };
};

Sometimes however we only want to take a route if a previous did not match, like this:

log {
    source(whatever);
    log {
        filter(app1);
        destination(app1_logfile);
        flags(final);
   };
   log {
       destinatrion(catchall_logfile);
   };
};

Notice the "final" flag in the first log {} statement. This means that if the first log statement "matches", the 2nd is not processed at all. This form of routing is sometimes easier to write, but it also means that syslog-ng needs the notion of whether a message matched a specific log path.

Sometimes, we have hundreds of these log {} rules in a configuration, syslog-ng trying to match the right one automatically.

So what does it mean that a message matches?

In simple cases, a message is assumed to always match a path, unless a filter or parser drops it. From the example above:

    log {
        filter(app1);
        destination(app1_logfile);
        flags(final);
   };

if filter(app1) matches the message, this log path would match. If it does not, then the entire path is considered unmatching.

What about complex cases?

syslog-ng's language is flexible enough to route messages around based on complex criteria. So a simple filter might actually be a complex set of parsing/filtering/processing rules, hidden behind a single line of reference. Just look at our cisco parser (https://github.com/syslog-ng/syslog-ng/blob/master/scl/cisco/plugin.conf).

What about destinations?

So knowing what is "matching" is super useful. But what is considered a match, if we have "multiple" exit points from a log path:

Let's look at this example:

log {
    source(whatever);
    destination(dest1);
    filter(filter_app);
    destination(dest2);
};

As you can see, dest1 and dest2 are both destinations of this log path. dest1 would get a copy of all messages while dest2 would only get those that match the filter filter_app. What message is "matching" this log path?

Those that reach the first destination? Or those that also reach the second one too?

While this configuration may not make sense in most cases, sometimes it is pretty useful. If you want to debug your configuration for example, then using a destination right in the middle-point would make sense. You could get a sample of what the messages look like that traverse this part of the configuration.

I decided that a log path is considered matching if it fully matches all filtering elements along a log path, with the exception of filtering done by destinations themselves. If a destination were to drop messages as a part of its operation, the log path that references this destination would still be matching. This more or less matches existing behaviour except for implementation bugs. These bugs are now fixed and the definition above is the guiding rule.

This means that this rule would not match if filter_app does not match:

log {
    source(whatever);
    destination(dest1);
    filter(filter_app);
    destination(dest2);
};

While this is doing the same, this would actually be matching:

log {
    source(whatever);
    destination(dest1);
    destination {
        log {
            filter(filter_app);
            destination(dest2);
        };
    };
};

The reason this would match is that the filter at this time is part of a "destination" reference, and filtering embedded in destinations do not affect the parent log path. This is useful when a destination wants to be selective about what is delivered and what is not. It may have relatively complex rules that affect delivery of messages.

What else?

Another closely related issue was conditional evaluation, which is also fixed here.

The problem was roughly this:

log {
    if {
       parser { something-that-parses-a-message-and-drops-it-if-invalid-format() };
    } else {
       parser { unmatched-messages-are-processed-here(); };
    };
    ...
    ...
    filter { we-drop-message-here(); };
};

Without this PR, the else branch of the if statements could be taken, even if the parser in the if branch itself was successful/matching. The reason that happened was that the "unmatched" state from the subsequent filter was propagated back to the if (incorrectly!), which caused the else to be taken (as well).

What is in this PR

So here's a list of changes in this PR:

  • it fixes the bug in conditional evaluation (if/else/elif) by using a new ENL_CONDITIONAL expr node and dropping an older, partial solution of the same problem (LC_DROP_UNMATCHED)
  • it defines the midpoint destination problem and makes sure destinations don't propagate their matched state to the parent logpath.
  • it adds more details into the "info" member of log paths
  • it adds light based tests to exercise both the conditional evaluation and the midpoint destination cases

@kira-syslogng
Copy link
Contributor

Build FAILURE

@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch 2 times, most recently from b444123 to 5a0e91a Compare July 16, 2022 08:31
@bazsi bazsi changed the title cfg-tree: use "drop-unmatched" flag for the if {} format as well WIP: fix conditional evaluation with a dangling filter Jul 16, 2022
@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch from 846057e to 26fbd7c Compare September 14, 2022 10:37
@kira-syslogng
Copy link
Contributor

Build FAILURE

@MrAnno MrAnno marked this pull request as draft September 18, 2022 11:31
@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch from 26fbd7c to fe48a19 Compare February 24, 2023 22:09
@bazsi bazsi mentioned this pull request Feb 24, 2023
@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch 2 times, most recently from 15b82fe to 3150052 Compare February 26, 2023 14:56
@bazsi bazsi changed the title WIP: fix conditional evaluation with a dangling filter fix conditional evaluation with a dangling filter Feb 26, 2023
@bazsi bazsi marked this pull request as ready for review February 26, 2023 14:56
@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch 2 times, most recently from e2d0a2d to 00519c6 Compare February 26, 2023 17:12
@bazsi bazsi added this to the syslog-ng 4.1 milestone Feb 26, 2023
@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch from 00519c6 to 6478a14 Compare February 28, 2023 19:15
@bazsi
Copy link
Collaborator Author

bazsi commented Feb 28, 2023

As discussed IRL, we needed to handle N level depth of ifs, which I have added now, along with tests that exercise these complex conditionals.

I'll still have to edit the history probably, to squash the 2-level solution with the N-level, but I've figured it's easier to review just the change.

@kira-syslogng
Copy link
Contributor

Build FAILURE

@bazsi bazsi force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch 2 times, most recently from e74c105 to 37742f5 Compare March 2, 2023 15:43
@kira-syslogng
Copy link
Contributor

Build FAILURE

@MrAnno
Copy link
Collaborator

MrAnno commented Mar 3, 2023

@kira-syslogng retest this please

@MrAnno
Copy link
Collaborator

MrAnno commented Mar 3, 2023

@kira-syslogng do stresstest

@kira-syslogng
Copy link
Contributor

Kira-stress-test: Build SUCCESS

lib/cfg-tree.c Outdated Show resolved Hide resolved
tests/light/functional_tests/logpath/test_conditionals.py Outdated Show resolved Hide resolved
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
@alltilla alltilla force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch from 37742f5 to 4b5e916 Compare March 7, 2023 15:14
This patch reverts f1c3090 which was added
as a special log path flag to support if/elif/else conditionals in
log paths.

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Attila Szakacs <attila.szakacs@axoflow.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
bazsi added 14 commits March 7, 2023 16:33
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
On a specific log path, we might be using a junction to classify
a log message along multiple alternatives.


                 / alt1 \
path -> ... ->  +--alt2--+ -> ... -> filter -> ...
                 \ alt3 /


In this case, the entire path is "matched" if none of the filtering
elements drop the message.

In the case of a junction, we should set matched to FALSE, if neither
the alt1, alt2 or alt3 branches match.

What might happen though is that one of our branches actually lets the
message through (for example alt1) but then a subsequent filter drops it.

What would happen in this case is that the junction would retry with alt2,
even though alt1 was already successful and it was not the junction itself
that decided the message did not match.

This will cause alternative paths to be considered even though the problem
is outside of the junction. Also, those alternatives may have side-effects
(e.g. the delivery of the message to a destination).

The solution is to make sure that once the message leaves the junction,
our "matched" pointer should not propagate any further, rather any
further processing should rather change the "matched" of our parent
log path.

The solution is to add "outer_matched" pointer to LogPathOptions and that
the last LogPipe of the junction (the join_pipe) will switch the the outer
value. This means that even though the log path may not match as a whole,
the junction itself will not try alternative branches.

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Add a dedicated LogExprNode for conditionals, so we can delay building the
exact LogPipe structure until compilation time.

Previously a conditional was nothing but a junction, each branch
being "final", conditions being inserted into the LogPipe sequence.

The new structure is aware whether we are using simple ifs (e.g. if (filter-expr)) or
compound ifs (e.g. if { }). The two needs to be compiled differently.

A compound if is the same as it was, a junction, each branch is final,
filtering happens as the body of the if.

A simple if however is different, once the if (filter-expr) matches, we
should not wonder to alternative branches, even if the body of if does not
match at the end.

Therefore we need a midpoint within the conditional: after the filter and
before the body of the conditional. Once we reach this point, we are
considered "matching", any subsequent filters will only impact the parent
log path.

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Attila Szakacs <attila.szakacs@axoflow.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Instead of relying on the value of pipe_next, use an explicit setting
to control delivery propagation to the parent log path.

This should be set in case we are a real junction and as such the multiplexer
as a whole is performing a filtering function.

This should be unset in case the multiplexer just serves to dispatch
messages to a set of destinations (e.g. destination reference or inline
destinations).

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Just to mark implicitly generated source/destination junctions
differently from user configured ones.

Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Attila Szakacs <attila.szakacs@axoflow.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Attila Szakacs <attila.szakacs@axoflow.com>
@alltilla alltilla force-pushed the fix-if-block-evaluation-with-a-dangling-filter branch from 4b5e916 to 004cda0 Compare March 7, 2023 15:33
@MrAnno
Copy link
Collaborator

MrAnno commented Mar 7, 2023

LGTM (only @alltilla's changes)

@MrAnno
Copy link
Collaborator

MrAnno commented Mar 7, 2023

@kira-syslogng do stresstest

@kira-syslogng
Copy link
Contributor

Kira-stress-test: Build SUCCESS

@alltilla alltilla merged commit 73d3d91 into syslog-ng:master Mar 7, 2023
15 checks passed
@bazsi bazsi deleted the fix-if-block-evaluation-with-a-dangling-filter branch March 10, 2023 13:56
bazsi added a commit to bazsi/syslog-ng that referenced this pull request Mar 10, 2023
…not be considered unmatching

As a followup to syslog-ng#4366 and syslog-ng#4058, another edge case was identified
where syslog-ng evaluation handles matched messages incorrectly.

If we are using a correlation while processing the message, the incoming
messages are consumed into a context and then the individual messages
get dropped, at least when using inject-mode(aggregate-only).

In these cases we should not consider dropping of these messages to
be unmatching, as dropping happens as a part of correlation.

This patch adds a light based testcase to exercise this problem and includes
a fix, so that messages consumed into corrlation in
inject-mode(aggregate-only) mode will not cause those messages to traverse
any alternative paths.

Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>
bazsi added a commit to bazsi/syslog-ng that referenced this pull request Mar 11, 2023
…not be considered unmatching

As a followup to syslog-ng#4366 and syslog-ng#4058, another edge case was identified
where syslog-ng evaluation handles matched messages incorrectly.

If we are using a correlation while processing the message, the incoming
messages are consumed into a context and then the individual messages
get dropped, at least when using inject-mode(aggregate-only).

In these cases we should not consider dropping of these messages to
be unmatching, as dropping happens as a part of correlation.

This patch adds a light based testcase to exercise this problem and includes
a fix, so that messages consumed into corrlation in
inject-mode(aggregate-only) mode will not cause those messages to traverse
any alternative paths.

Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>
bazsi added a commit to bazsi/syslog-ng that referenced this pull request Mar 15, 2023
…not be considered unmatching

As a followup to syslog-ng#4366 and syslog-ng#4058, another edge case was identified
where syslog-ng evaluation handles matched messages incorrectly.

If we are using a correlation while processing the message, the incoming
messages are consumed into a context and then the individual messages
get dropped, at least when using inject-mode(aggregate-only).

In these cases we should not consider dropping of these messages to
be unmatching, as dropping happens as a part of correlation.

This patch adds a light based testcase to exercise this problem and includes
a fix, so that messages consumed into corrlation in
inject-mode(aggregate-only) mode will not cause those messages to traverse
any alternative paths.

Signed-off-by: Balazs Scheidler <balazs.scheidler@axoflow.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants