
"FileNotFoundError" during montlhy rotation with compression #220

Closed
Delgan opened this issue Mar 2, 2020 · 12 comments
Closed

"FileNotFoundError" during montlhy rotation with compression #220

Delgan opened this issue Mar 2, 2020 · 12 comments
Labels
question Further information is requested

Comments

@Delgan
Owner

Delgan commented Mar 2, 2020

Hi @Delgan ,
I have a similar problem. My handler is configured like this:

logger.add(
  sink='task-scheduler-bootstrap-{time:YYYY-MM}.log',
  format=LOG_FORMAT,
  rotation="monthly",
  compression="xz",
  level="INFO")

and my program raised the following exception when the month rolled over to March:

Traceback (most recent call last):
  File "/home/nsiq/.local/lib/python3.8/site-packages/loguru/_ctime_functions.py", line 32, in get_ctime
    return float(os.getxattr(filepath, b"user.loguru_crtime"))
FileNotFoundError: [Errno 2] No such file or directory: 'task-scheduler-bootstrap-2020-02.log'

It seems that February's log was properly rotated and compressed, but the program still tried to write to it. Is there a way to fix this?

Thank you!

Originally posted by @ns-kyuan in #213 (comment)

@Delgan
Owner Author

Delgan commented Mar 2, 2020

Hey @ns-kyuan, can you please tell me which Loguru version you are using (print(loguru.__version__))?

@ns-kyuan

ns-kyuan commented Mar 2, 2020

Thanks for the reply. The version I am using is 0.4.1.

@Delgan
Owner Author

Delgan commented Mar 2, 2020

Ok, it looks like a bug then... 😬

Do you have the same error if you try to rotate "hourly" instead of "monthly"? Are you using threads, maybe?

@ns-kyuan

ns-kyuan commented Mar 2, 2020

I did not try the hourly rotation.
I use multiprocessing: the main process subscribes to Kafka messages and starts worker processes to do tasks.
I use different handlers for the main and worker processes, but both have this issue.

@Delgan
Owner Author

Delgan commented Mar 3, 2020

Hum... This might be related to multiprocessing. This part of the documentation may be interesting, although you're using Linux rather than Windows, so it should be less problematic: Compatibility with multiprocessing using enqueue argument.

Where are you configuring the logger? Is it possible that .add() is called multiple times when your workers are started?

Basically, it seems multiple processes are trying to write logs to the same file. Rotation does not happen at exactly the same time in each process, so when the first one closes and compresses the file, a conflict may occur if a second process later tries to write to the removed file.

@ns-kyuan

ns-kyuan commented Mar 3, 2020

No, I only call add() twice, at the very beginning of the main process.

logger.add(
  sink=scheduler_log,
  format=LOG_FORMAT,
  filter=lambda record: record['process'].id == pid,
  backtrace=True,
  diagnose=True,
  rotation="monthly",
  compression="xz",
  level="INFO")

logger.add(
  sink=worker_log,
  format=LOG_FORMAT,
  filter=lambda record: record['process'].id != pid,
  backtrace=True,
  diagnose=True,
  rotation="monthly",
  compression="xz",
  level="INFO")

And the problem did not just happen at the moment we entered March: it persisted, and even the workers created on March 2nd hit the same exception.

@Delgan
Owner Author

Delgan commented Mar 4, 2020

Ok, I will double-check the rotation source code, but I think your problem is related to multiprocessing. You see, even if you call .add() only twice in your code, it may happen that add() is called again while creating a new process. This happens if the new process is "spawned" rather than "forked": in that case your whole program needs to be re-executed (this is how Python implements spawn-based multiprocessing). If the module is executed multiple times, then handlers will be added multiple times too, without you explicitly doing it.

I don't know exactly how Kafka works or how it manages workers. Is there any chance you could create a minimal reproducible example or share your code with me? It's hard to tell without seeing all the configuration steps.

@ns-kyuan

ns-kyuan commented Mar 4, 2020

# test_log.py
from multiprocessing import Process
import os
import sys
import time

from loguru import logger
from test_worker import worker
pid = os.getpid()
LOG_FORMAT = "[{time:YYYY-MM-DD HH:mm:ss.SSS!UTC}][{level}]<{file}::{module}> {message}"


print("calling logger.add")
logger.add(
  sink="main.log",
  format=LOG_FORMAT,
  filter=lambda record: record['process'].id == pid,
  backtrace=True,
  diagnose=True,
  rotation="monthly",
  compression="xz",
  level="INFO")

print("calling logger.add")
logger.add(
  sink="worker.log",
  format=LOG_FORMAT,
  filter=lambda record: record['process'].id != pid,
  backtrace=True,
  diagnose=True,
  rotation="monthly",
  compression="xz",
  level="INFO")

logger.info('main started.')


while True:
  p = Process(target=worker)
  p.daemon = True
  p.start()  # non-blocking
  time.sleep(5)



# test_worker.py
import random
import time
from loguru import logger
import os


def worker():
  pid = os.getpid()
  load = random.randint(1, 4)
  logger.info("worker ({}) started".format(pid))
  time.sleep(load)
  logger.info("worker ({}) terminating".format(pid))

Results:

❯ python tmp/test_log.py
calling logger.add
calling logger.add
2020-03-04 11:32:41.469 | INFO     | __main__:<module>:34 - main started.
2020-03-04 11:32:41.472 | INFO     | test_worker:worker:10 - worker (95976) started
2020-03-04 11:32:42.477 | INFO     | test_worker:worker:12 - worker (95976) terminating
2020-03-04 11:32:46.475 | INFO     | test_worker:worker:10 - worker (97361) started
2020-03-04 11:32:47.479 | INFO     | test_worker:worker:12 - worker (97361) terminating
2020-03-04 11:32:51.475 | INFO     | test_worker:worker:10 - worker (98404) started
^CProcess Process-3:
Traceback (most recent call last):
  File "tmp/test_log.py", line 43, in <module>
    time.sleep(5)
KeyboardInterrupt

@Delgan
Owner Author

Delgan commented Mar 4, 2020

Thanks! Multiple processes using the logger without explicit synchronisation will run into conflicts.

You should try adding your handlers with the enqueue=True parameter (it ensures that rotation and other related operations occur in only one process).

@ns-kyuan

ns-kyuan commented Mar 6, 2020

Thanks for your response. Let me try that and get back to you!

@ns-kyuan

ns-kyuan commented Mar 7, 2020

It works!

@Delgan Delgan added the question Further information is requested label Mar 8, 2020
@Delgan
Owner Author

Delgan commented Mar 8, 2020

Great! 🎉

I'm closing this issue then, don't hesitate to ask me if you have other problems. 😉
