Duplicated logs on rolling file

classic Classic list List threaded Threaded
6 messages Options
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Duplicated logs on rolling file

rsyslog-users mailing list
Hallo,

I'm developing a webapp Java, which logs to a rolling file using log4j-2.
The logged file(s) are later forwarded with rsyslog to 2 actions: 'omfwd'
to a Splunk server and 'omprog'.

I've been executing the high load tests on the webapp, and once they finish
I've realized that on both destinations are more logs than in files. And
these duplicated logs are the first(s) line(s) of the rolled files.

When trying to activate rsyslog with debug[1], doesn't send all the logs
from the original files then.

I've published a simpler example[2] which reproduces almost all the times
the issue. Is a Java application which can be run with gradle inside a
docker container.

There is also a Splunk server runnable with docker, in the same repo.

Can be a rsyslog rolling detection issue?

Thanks,

  [1]: http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html
  [2]: https://github.com/lucasvc/rsyslog-tests/tree/master/log4j2-rollover
--
Lucas
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Duplicated logs on rolling file

Rainer Gerhards
Hola Lucas,


2017-06-02 11:39 GMT+02:00 Lucas Ventura Carro via rsyslog
<[hidden email]>:

> Hallo,
>
> I'm developing a webapp Java, which logs to a rolling file using log4j-2.
> The logged file(s) are later forwarded with rsyslog to 2 actions: 'omfwd'
> to a Splunk server and 'omprog'.
>
> I've been executing the high load tests on the webapp, and once they finish
> I've realized that on both destinations are more logs than in files. And
> these duplicated logs are the first(s) line(s) of the rolled files.
>
> When trying to activate rsyslog with debug[1], doesn't send all the logs
> from the original files then.
>
> I've published a simpler example[2] which reproduces almost all the times
> the issue. Is a Java application which can be run with gradle inside a
> docker container.
>
> There is also a Splunk server runnable with docker, in the same repo.
>
> Can be a rsyslog rolling detection issue?

It sounds like the log4j config overwrites the same inode. If so, this
is the root cause of the problem. Overwriting the same inode is highly
racy. On a busy system, you will almost always loose some logs. It can
also happen that the logic that *tries* to detect overwrites
misdetects them and duplicates potentially large amounts of data. We
have seen this in the past and for that reason imfile did not support
overwrites at all for quite some while.

Based on user requests, we *experiementally* added the option
"reopenOnTruncate" to support this somewhat again. However, as we know
this is racy and highly problematic, we do not even try to debug such
situations - we've done this often enough in the past and the plain
truth is that problems are inevitable in that mode.

The real cure is to ensure that a new inode is created when the log
file is rolled over.

We should probably add an even bolder warning to the doc that highly
discourages to use "reopenOnTruncate".

Sorry I have no better answer, but that's simply how it is...
Rainer

>
> Thanks,
>
>   [1]: http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html
>   [2]: https://github.com/lucasvc/rsyslog-tests/tree/master/log4j2-rollover
> --
> Lucas
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Duplicated logs on rolling file

rsyslog-users mailing list
Log4j has a way to rollover to new file (without truncating old file).
I'd look at Rolling Appender if you want to use it with imfile.

Or its simpler and more efficient to make log4j send logs over to
Rsyslog (or any other syslog impl) directly. Here what a log4j
properties file with syslog appender may look like:

log4j.rootLogger=debug, syslog
log4j.appender.stdout=org.apache.log4j.ConsoleAppender
log4j.appender.stdout.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.stdout.layout.ConversionPattern=%5p [%t] (%F:%L) - %m%n
%throwable{1000}

log4j.appender.syslog=org.apache.log4j.net.SyslogAppender
log4j.appender.syslog.syslogHost=127.0.0.1:11514
log4j.appender.syslog.layout=org.apache.log4j.EnhancedPatternLayout
log4j.appender.syslog.layout.conversionPattern=log@1,myapp,myteam
%d{ISO8601} %-5p [%t] %c{2} %x - %m%n %throwable{1000}
log4j.appender.syslog.Facility=LOCAL0



On Fri, Jun 2, 2017 at 4:23 PM, Rainer Gerhards
<[hidden email]> wrote:

> Hola Lucas,
>
>
> 2017-06-02 11:39 GMT+02:00 Lucas Ventura Carro via rsyslog
> <[hidden email]>:
>> Hallo,
>>
>> I'm developing a webapp Java, which logs to a rolling file using log4j-2.
>> The logged file(s) are later forwarded with rsyslog to 2 actions: 'omfwd'
>> to a Splunk server and 'omprog'.
>>
>> I've been executing the high load tests on the webapp, and once they finish
>> I've realized that on both destinations are more logs than in files. And
>> these duplicated logs are the first(s) line(s) of the rolled files.
>>
>> When trying to activate rsyslog with debug[1], doesn't send all the logs
>> from the original files then.
>>
>> I've published a simpler example[2] which reproduces almost all the times
>> the issue. Is a Java application which can be run with gradle inside a
>> docker container.
>>
>> There is also a Splunk server runnable with docker, in the same repo.
>>
>> Can be a rsyslog rolling detection issue?
>
> It sounds like the log4j config overwrites the same inode. If so, this
> is the root cause of the problem. Overwriting the same inode is highly
> racy. On a busy system, you will almost always loose some logs. It can
> also happen that the logic that *tries* to detect overwrites
> misdetects them and duplicates potentially large amounts of data. We
> have seen this in the past and for that reason imfile did not support
> overwrites at all for quite some while.
>
> Based on user requests, we *experiementally* added the option
> "reopenOnTruncate" to support this somewhat again. However, as we know
> this is racy and highly problematic, we do not even try to debug such
> situations - we've done this often enough in the past and the plain
> truth is that problems are inevitable in that mode.
>
> The real cure is to ensure that a new inode is created when the log
> file is rolled over.
>
> We should probably add an even bolder warning to the doc that highly
> discourages to use "reopenOnTruncate".
>
> Sorry I have no better answer, but that's simply how it is...
> Rainer
>>
>> Thanks,
>>
>>   [1]: http://www.rsyslog.com/doc/v8-stable/troubleshooting/debug.html
>>   [2]: https://github.com/lucasvc/rsyslog-tests/tree/master/log4j2-rollover
>> --
>> Lucas
>> _______________________________________________
>> rsyslog mailing list
>> http://lists.adiscon.net/mailman/listinfo/rsyslog
>> http://www.rsyslog.com/professional-services/
>> What's up with rsyslog? Follow https://twitter.com/rgerhards
>> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
> _______________________________________________
> rsyslog mailing list
> http://lists.adiscon.net/mailman/listinfo/rsyslog
> http://www.rsyslog.com/professional-services/
> What's up with rsyslog? Follow https://twitter.com/rgerhards
> NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.



--
Regards,
Janmejay
http://codehunk.wordpress.com
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Duplicated logs on rolling file

rsyslog-users mailing list
Wow! Thanks for so fast and explicative replies! :D

2017-06-02 12:53 GMT+02:00 Rainer Gerhards <[hidden email]>:

> It sounds like the log4j config overwrites the same inode. If so, this
> is the root cause of the problem. Overwriting the same inode is highly
> racy. On a busy system, you will almost always loose some logs. It can
> also happen that the logic that *tries* to detect overwrites
> misdetects them and duplicates potentially large amounts of data. We
> have seen this in the past and for that reason imfile did not support
> overwrites at all for quite some while.
>
> Based on user requests, we *experiementally* added the option
> "reopenOnTruncate" to support this somewhat again. However, as we know
> this is racy and highly problematic, we do not even try to debug such
> situations - we've done this often enough in the past and the plain
> truth is that problems are inevitable in that mode.
>
> The real cure is to ensure that a new inode is created when the log
> file is rolled over.
>

> We should probably add an even bolder warning to the doc that highly
> discourages to use "reopenOnTruncate".
>

Ok, I'll review then the way log4j is writting and rolling. For sure I'll
avoid the experimental and problematic feature ;)


>  Sorry I have no better answer, but that's simply how it is...
>

No problem at all! You pointed me where next to look for or who ask :)


2017-06-02 13:04 GMT+02:00 singh.janmejay via rsyslog <[hidden email]
.com>:

> Log4j has a way to rollover to new file (without truncating old file).
> I'd look at Rolling Appender if you want to use it with imfile.
>

I've tried my test with the RollingFile appender but I got same bad results.

Or its simpler and more efficient to make log4j send logs over to
> Rsyslog (or any other syslog impl) directly. Here what a log4j
> properties file with syslog appender may look like:


I wanted to avoid UDP/TCP connection for two reasons: IMHO is faster write
to file and the main use case is sending to the 'omprog' plugin.
Thanks for alternative anyway!
--
Lucas
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Duplicated logs on rolling file

David Lang
On Tue, 6 Jun 2017, Lucas Ventura Carro via rsyslog wrote:

> I wanted to avoid UDP/TCP connection for two reasons: IMHO is faster write
> to file and the main use case is sending to the 'omprog' plugin.
> Thanks for alternative anyway!

writing to /dev/log doesn't use the network and will be faster than writing to a
file and then reading from the file (no disk I/O involved)

David Lang
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Reply | Threaded
Open this post in threaded view
|  
Report Content as Inappropriate

Re: Duplicated logs on rolling file

rsyslog-users mailing list
Hallo again!

2017-06-06 22:28 GMT+02:00 David Lang <[hidden email]>:

> writing to /dev/log doesn't use the network and will be faster than
> writing to a file and then reading from the file (no disk I/O involved)


Oh! Seems a good alternative, I will try to explore it later.

Meanwhile, I did a little shell script which showed the file inodes from
the log4j files[1]. So I executed during the test and all the inodes seems
correct: when the first column changes, the previous inode moves to the end.
Inspecting log4j code (and its logs), when rollover has to be done is with
Java 'Files.move' method, which states as atomic move.

The last test I could do is to write not using log4j and use Unix
'logrotate' to rollover the files.

    [1]:
https://github.com/lucasvc/rsyslog-tests/blob/master/log4j2-rollover/inodes.sh

--
Lucas
_______________________________________________
rsyslog mailing list
http://lists.adiscon.net/mailman/listinfo/rsyslog
http://www.rsyslog.com/professional-services/
What's up with rsyslog? Follow https://twitter.com/rgerhards
NOTE WELL: This is a PUBLIC mailing list, posts are ARCHIVED by a myriad of sites beyond our control. PLEASE UNSUBSCRIBE and DO NOT POST if you DON'T LIKE THAT.
Loading...