Log file timestamp not updated on some file systems. Support periodic sync to flush metadata to FS #3593


Open
roytmana opened this issue Apr 8, 2025 · 14 comments
Labels
appenders (Affects one or more Appender plugins) · waiting-for-maintainer

Comments

@roytmana

roytmana commented Apr 8, 2025

Hello, we are experiencing an issue on Azure file shares where log file metadata (particularly the last-modified timestamp) is not updated until the log file is closed. We confirmed that it is not a Log4j issue but a problem with any FileOutputStream on this file system.

Would it be possible to add functionality to the file manager implementations whereby the FileOutputStream's file descriptor would be fsynced at a frequency configured per appender?

FileOutputStream.getFD().sync();

Or perhaps a more generalized solution

@vy
Member

vy commented Apr 13, 2025

@roytmana, I am reluctant to push a fix for an OpenJDK bug that will probably be addressed by either OpenJDK or Azure in the future. Would it be possible for you to repeat the following in a background thread?

  1. Programmatically obtain the appender
  2. Get your hands on its FileOutputStream (via reflection?)
  3. Run FileOutputStream.getFD().sync() on it
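The steps above could be sketched as follows. This is a minimal, self-contained sketch: it opens its own FileOutputStream as a stand-in for the stream you would extract from the appender's manager via reflection (the file name and the 60-second period are illustrative assumptions, not Log4j settings).

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class PeriodicFsync {
    public static void main(String[] args) throws Exception {
        // Stand-in for the stream obtained from the appender (via reflection).
        FileOutputStream out = new FileOutputStream("app.log", true);
        out.write("hello\n".getBytes());

        // Background heartbeat: sync the file descriptor periodically so the
        // file system flushes metadata (e.g. the last-modified timestamp).
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                out.getFD().sync();
            } catch (IOException e) {
                e.printStackTrace();
            }
        }, 60, 60, TimeUnit.SECONDS);

        // One immediate sync so the effect is visible without waiting.
        out.getFD().sync();
        scheduler.shutdown();
        out.close();
    }
}
```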

@vy added the appenders (Affects one or more Appender plugins) and waiting-for-user (More information is needed from the user) labels and removed the waiting-for-maintainer label · Apr 13, 2025
@ppkarwasz ppkarwasz moved this from To triage to Waiting for user in Log4j bug tracker Apr 13, 2025
@roytmana
Author

roytmana commented Apr 14, 2025

Hello @vy, sorry for the delay. I had that hack working before I submitted the ticket, but to be absolutely sure I did not mislead you, I wanted Ops to deploy my test to Azure in exactly the way our prod apps are deployed.

I just got a confirmation from them that it did solve the issue.

I did not do it in a background thread. Instead, I copied the rolling file appender and overrode the append() method, where I used reflection to get the output stream on every write.

So if it is feasible to add something like fileSystemSyncFrequencySec=60 to the file-based appenders (or at least RollingFileAppender), indicating that FileOutputStream.getFD().sync() should be called at that frequency, it would be great.
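As a sketch only, such a setting might look like this in a log4j2.xml. Note that fileSystemSyncFrequencySec is the hypothetical attribute proposed in this issue and does not exist in Log4j Core; the other attributes are ordinary RollingFile configuration:

```xml
<Appenders>
  <!-- fileSystemSyncFrequencySec is hypothetical (this issue's proposal) -->
  <RollingFile name="App"
               fileName="logs/app.log"
               filePattern="logs/app-%d{yyyy-MM-dd}.log.gz"
               fileSystemSyncFrequencySec="60">
    <PatternLayout pattern="%d %p %c{1.} [%t] %m%n"/>
    <Policies>
      <TimeBasedTriggeringPolicy/>
    </Policies>
  </RollingFile>
</Appenders>
```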

We have a support ticket open with Microsoft on the issue, but it does not seem to be getting anywhere. I can see if I can get a link to post here, but I am not sure it is publicly available.

Thanks again for your prompt response!
It would be hard to add this sync from outside of the Log4j project source code: all appenders are final and the builders are statically associated with them, so it takes lots of copying of your code just to get past protected members and final classes :-)

github-actions bot added the waiting-for-maintainer label and removed the waiting-for-user label · Apr 14, 2025
@roytmana
Author

@vy
From my perspective it would probably be sufficient to do it inside the append() method, checking the elapsed time since the previous sync.
The disadvantage, of course, is that if a log does not get much traffic, it will not be synced for some time and the timestamps will not be flushed. A thread-based sync would be much nicer, since it would work as a heartbeat regardless of a particular appender's activity, though you would need to keep a list of all appenders configured for this heartbeat and sync each of them. In both cases you may need to synchronize normal appender write operations with the call to sync(); I do not know whether calling them concurrently poses any issues.
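The append()-path variant could be sketched like this. The class and method names are illustrative, not Log4j API; it shows only the idea of syncing on the write path at most once per configured interval, with the write and the sync under one lock so they never race.

```java
import java.io.FileOutputStream;
import java.io.IOException;

// Illustrative sketch, not Log4j code: write-path sync rate-limited by
// elapsed time, as an overridden append() might do.
public class SyncingWriter {
    private final FileOutputStream out;
    private final long syncIntervalMillis;
    private long lastSyncMillis;

    public SyncingWriter(FileOutputStream out, long syncIntervalMillis) throws IOException {
        this.out = out;
        this.syncIntervalMillis = syncIntervalMillis;
        this.lastSyncMillis = System.currentTimeMillis();
    }

    // synchronized so a descriptor sync never runs concurrently with a write.
    public synchronized void write(byte[] data) throws IOException {
        out.write(data);
        long now = System.currentTimeMillis();
        if (now - lastSyncMillis >= syncIntervalMillis) {
            out.getFD().sync(); // flush metadata (timestamps) to the FS
            lastSyncMillis = now;
        }
    }

    public synchronized void close() throws IOException {
        out.close();
    }
}
```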

@roytmana
Author

@vy
Potentially the file output stream could be opened via NIO with the java.nio.file.StandardOpenOption.SYNC option. It probably (needs testing) would do the trick, but I wonder whether there would be a noticeable performance penalty, since it would happen on every write.

I can test it as well (not with Log4j, but with a plain text file):

/**
 * Requires that every update to the file's content or metadata be written
 * synchronously to the underlying storage device.
 *
 * @see <a href="package-summary.html#integrity">Synchronized I/O file integrity</a>
 */
SYNC,

/**
 * Requires that every update to the file's content be written
 * synchronously to the underlying storage device.
 *
 * @see <a href="package-summary.html#integrity">Synchronized I/O file integrity</a>
 */
DSYNC;
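A quick way to try this idea outside Log4j is to open the file with NIO's FileChannel. Note that SYNC alone is not enough to create or append to a file, so CREATE and APPEND are supplied as well; the file name here is illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import java.nio.file.StandardOpenOption;

public class SyncOpenDemo {
    public static void main(String[] args) throws Exception {
        Path log = Path.of("sync-demo.log");
        // SYNC requires companion options: CREATE to make the file if absent,
        // APPEND (or WRITE) to allow writing.
        try (FileChannel ch = FileChannel.open(log,
                StandardOpenOption.CREATE,
                StandardOpenOption.APPEND,
                StandardOpenOption.SYNC)) {
            // With SYNC, every write also pushes content and metadata
            // synchronously to the underlying storage device.
            ch.write(ByteBuffer.wrap("log line\n".getBytes()));
        }
    }
}
```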

@vy
Member

vy commented Apr 16, 2025

@roytmana, double-checking: does immediateFlush not help?

@ppkarwasz
Contributor

Potentially the file output stream could be opened via NIO with the java.nio.file.StandardOpenOption.SYNC option. It probably (needs testing) would do the trick, but I wonder whether there would be a noticeable performance penalty, since it would happen on every write.

I disagree: if the performance penalty were negligible, that option would be on by default.

Microsoft will probably solve the problem sooner or later, so I see no reason to modify the behavior for all Log4j Core users; however, we can:

Does immediateFlush not help?

That would be really strange: immediateFlush should only flush the Java buffers (mostly our buffers, but some implementations of OutputStream also have buffers), while the problem here seems to be related to OS buffers that are not synced with the underlying device.

@roytmana
Author

roytmana commented Apr 17, 2025

I just confirmed that immediateFlush does not work. Just in case, I also tried the random-access rolling file appender; it did not help either.

I tried SYNC and DSYNC on a text file (they also require a bunch of other options to be supplied, like CREATE, APPEND, etc.). It does not appear to fix the issue, but I may need to test it a bit more (a new file vs. appending to an existing one, etc.) if we want to give that a try.

@ppkarwasz
Contributor

@rm5248, @FreeAndNil,

Do you have similar issues in Log4cxx and Log4net on Azure?

@roytmana
Author

We are a Java shop :-)

@ppkarwasz
Contributor

We are a Java shop :-)

I am trying to figure out whether this is Java-specific.

@roytmana
Author

We are Java, and we run on Linux (Red Hat 8) containers in Azure Kubernetes, so unfortunately I have no way to test. If you could give me an executable, I could see whether we can run it from a Windows machine against an Azure file share, but that would likely be apples to oranges, since it may be the OS/networking stack, or perhaps the SMB file shares, that is at fault.

I very much appreciate your sticking with it, because it is really not a Log4j issue. I was just hoping that an enhancement for periodic fsync would be helpful enough in general, and non-intrusive enough, to make it into the codebase :-)

@rm5248
Contributor

rm5248 commented Apr 17, 2025

@rm5248, @FreeAndNil,

Do you have similar issues in Log4cxx and Log4net on Azure?

I feel like I might have seen something similar a long time ago, but I can't recall what it was. I definitely haven't heard of anything like this happening recently, though. It sounds like an OS or JVM issue, as metadata such as the last-updated timestamp should be handled by the filesystem.

Interestingly, it seems that NTFS may not update the last-access timestamp immediately:

The NTFS file system delays updates to the last access time for a file by up to 1 hour after the last access.

That's probably not relevant here, unless the filesystem where logs are being written to is somehow NTFS.

@FreeAndNil

@rm5248, @ppkarwasz, on NTFS I often see that the file size is not updated for log files.
It looks like the log file is empty, but when I look into it, there is content older than the point at which I checked the file size. When I look at the file size again, it is then "correct". I assume this is an optimization by Microsoft.

Our few Azure systems use Azure analytics instead of log4net.

@roytmana
Author

In our case the timestamp is not updated at all, so after 60 days, when an archival process (external to the app) looks for old logs, it hits live, open log files and deletes them, completely breaking logging: Log4j does not recover and keeps "writing" to the deleted file, writing nothing and never re-creating the file.

I do not know what actual file system is behind those Azure file shares, but they are SMB shares mounted in Red Hat ubi8-based pods on Azure AKS.
