mariadb crashing since 2.5.0 #893

Open
johnassel opened this issue Jan 19, 2025 · 18 comments

@johnassel

johnassel commented Jan 19, 2025

I'm running the mariadb data directory (/var/lib/mysql/) inside a gocryptfs mount.
After upgrading to 2.5.0 I'm experiencing frequent crashes of the mariadb process:
Process 39501 (mariadbd) crashed in page_cur_tuple_insert(page_cur_t*, dtuple_t const*,...

There might be an underlying permissions problem: since 2.5.0 I'm also getting errors from cron jobs that call PHP scripts inside a gocryptfs directory.
Failed to open stream: Permission denied in ...
But these are much rarer than the crashes of the mariadb daemon.

Going back to 2.4.0 fixes both problems.

The gocryptfs mount is mounted through the root account using the -allow_other flag.
The system is Fedora 41 with the static binaries on ext4.
I'm not sure how to debug this further, since gocryptfs does not log anything in normal mode.

It looks somewhat similar to #892 but for me older versions are working.

Edit: Started the mount with --debug. Waiting now for a crash to happen.
Edit2: It seems (as mentioned in the issue above, too) as if --debug stops the crashes.

@rfjakob rfjakob added the bug label Jan 19, 2025
@rfjakob
Owner

rfjakob commented Jan 19, 2025

Ugh. Maybe: go-fuse was updated from v2.3.0 to v2.5.0: https://github.com/rfjakob/gocryptfs/compare/v2.4.0..v2.5.0#diff-33ef32bf6c23acb95f5902d7097b7a1d5128ca061167ec0716715b0b9eeaa5f6L7

Let me try to repro this.

@rfjakob
Owner

rfjakob commented Jan 19, 2025

And, what CPU do you have? Maybe f5007b2

@johnassel
Author

Yes, it is an AMD EPYC 9634

@rfjakob
Owner

rfjakob commented Jan 19, 2025

Oh, I missed Edit2.

Do you have the whole backtrace for Process 39501 (mariadbd) crashed in page_cur_tuple_insert or even the core file?
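On a systemd-based system such as Fedora, the backtrace and the core file can usually be pulled out with coredumpctl. A minimal sketch, assuming systemd-coredump recorded the crash; the helper names `core_file_name` and `collect_core` and the output file naming are made up for illustration:

```shell
#!/bin/sh
# Sketch: collect crash information for a process via systemd-coredump.
# Assumes coredumpctl is installed and the crash was recorded.

# core_file_name EXE PID: hypothetical naming scheme for the dumped core file.
core_file_name() {
    printf 'core.%s.%s\n' "$1" "$2"
}

# collect_core EXE PID: list crashes of EXE, show details for PID, save the core.
collect_core() {
    exe=$1
    pid=$2
    if ! command -v coredumpctl >/dev/null 2>&1; then
        echo "coredumpctl not found" >&2
        return 1
    fi
    coredumpctl list "$exe"    # all recorded crashes matching this name
    coredumpctl info "$pid"    # metadata plus per-thread stack traces
    coredumpctl dump "$pid" --output="$(core_file_name "$exe" "$pid")"
}

# Usage (run as root or as a user allowed to read system core dumps):
#   collect_core mariadbd 39501
```

The resulting core file can then be opened in gdb together with the matching mariadbd binary; line numbers only show up if debug symbols for that build are available.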

@johnassel
Author

johnassel commented Jan 19, 2025

You mean this one?

Process 1707 (mariadbd) of user 27 dumped core.
 
 Module libzstd.so.1 from rpm zstd-1.5.6-2.fc41.x86_64
 Module ha_rocksdb.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module libsnappy.so.1 from rpm snappy-1.2.1-2.fc41.x86_64
 Module provider_snappy.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module provider_lzo.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module liblzma.so.5 from rpm xz-5.6.2-2.fc41.x86_64
 Module provider_lzma.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module liblz4.so.1 from rpm lz4-1.10.0-1.fc41.x86_64
 Module provider_lz4.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module libbz2.so.1 from rpm bzip2-1.0.8-19.fc41.x86_64
 Module provider_bzip2.so from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Module libcap.so.2 from rpm libcap-2.70-4.fc41.x86_64
 Module libcrypto.so.3 from rpm openssl-3.2.2-11.fc41.x86_64
 Module libssl.so.3 from rpm openssl-3.2.2-11.fc41.x86_64
 Module libz.so.1 from rpm zlib-ng-2.2.3-1.fc41.x86_64
 Module libsystemd.so.0 from rpm systemd-256.11-1.fc41.x86_64
 Module libaio.so.1 from rpm libaio-0.3.111-20.fc41.x86_64
 Module libcrypt.so.2 from rpm libxcrypt-4.4.38-2.fc41.x86_64
 Module libpcre2-8.so.0 from rpm pcre2-10.44-1.fc41.1.x86_64
 Module mariadbd from rpm mariadb10.11-10.11.10-1.fc41.x86_64
 Stack trace of thread 6224:
 #0  0x00007f3a1e27f0f4 __pthread_kill_implementation (libc.so.6 + 0x730f4)
 #1  0x00007f3a1e225fde raise (libc.so.6 + 0x19fde)
 #2  0x00007f3a1e20d9d2 abort (libc.so.6 + 0x19d2)
 #3  0x0000556d2542fcd1 _ZN2ib5fatalD2Ev (mariadbd + 0x6acd1)
 #4  0x0000556d25465f39 _ZL18os_file_sync_posixi.isra.0.cold (mariadbd + 0xa0f39)
 #5  0x0000556d25b99778 _ZL9log_flushm (mariadbd + 0x7d4778)
 #6  0x0000556d25b9c444 _Z15log_write_up_tombPK19completion_callback (mariadbd + 0x7d7444)
 #7  0x0000556d25c24a5c _ZL23trx_flush_log_if_neededmP5trx_t.part.0.lto_priv.0 (mariadbd + 0x85fa5c)
 #8  0x0000556d25b4a997 _ZL15innobase_commitP10handlertonP3THDb.lto_priv.0 (mariadbd + 0x785997)
 #9  0x0000556d258b14fc _ZL18commit_one_phase_2P3THDbP9THD_TRANSb (mariadbd + 0x4ec4fc)
 #10 0x0000556d258b28cb _Z15ha_commit_transP3THDb (mariadbd + 0x4ed8cb)
 #11 0x0000556d2577fcc3 _Z17trans_commit_stmtP3THD (mariadbd + 0x3bacc3)
 #12 0x0000556d255fe3cc _Z21mysql_execute_commandP3THDb (mariadbd + 0x2393cc)
 #13 0x0000556d25603b9b _Z11mysql_parseP3THDPcjP12Parser_state (mariadbd + 0x23eb9b)
 #14 0x0000556d255f239c _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22d39c)
 #15 0x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #16 0x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #17 0x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #18 0x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #19 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #20 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1708:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x0000556d25d09ddd timer_handler (mariadbd + 0x944ddd)
 #30x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #40x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1707:
 #00x00007f3a1e2fe917 __select (libc.so.6 + 0xf2917)
 #10x0000556d254b5598 _ZL17close_connectionsv.lto_priv.0 (mariadbd + 0xf0598)
 #20x0000556d254c8bb8 _Z11mysqld_mainiPPc (mariadbd + 0x103bb8)
 #30x00007f3a1e20f248 __libc_start_call_main (libc.so.6 + 0x3248)
 #40x00007f3a1e20f30b __libc_start_main@@GLIBC_2.34 (libc.so.6 + 0x330b)
 #50x0000556d254a69a5 _start (mariadbd + 0xe19a5)
 
 Stack trace of thread 1711:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x00007f3a1e4416c0 _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (libstdc++.so.6 + 0x416c0)
 #30x00007f3a1c4ad231 _ZN7rocksdb14ThreadPoolImpl4Impl8BGThreadEm (ha_rocksdb.so + 0x2ad231)
 #40x00007f3a1c4b31ef _ZN7rocksdb14ThreadPoolImpl4Impl15BGThreadWrapperEPv (ha_rocksdb.so + 0x2b31ef)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1710:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x00007f3a1c4ea511 _ZN7rocksdb4port7CondVar4WaitEv (ha_rocksdb.so + 0x2ea511)
 #30x00007f3a1c42e61d _ZN7rocksdb19InstrumentedCondVar4WaitEv (ha_rocksdb.so + 0x22e61d)
 #40x00007f3a1c41aaf0 _ZN7rocksdb15DeleteScheduler20BackgroundEmptyTrashEv (ha_rocksdb.so + 0x21aaf0)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1744:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x00007f3a1c4ea59c _ZN7rocksdb4port7CondVar9TimedWaitEm (ha_rocksdb.so + 0x2ea59c)
 #30x00007f3a1c42cc61 _ZN7rocksdb19InstrumentedCondVar17TimedWaitInternalEm (ha_rocksdb.so + 0x22cc61)
 #40x00007f3a1c42e818 _ZN7rocksdb19InstrumentedCondVar9TimedWaitEm (ha_rocksdb.so + 0x22e818)
 #50x00007f3a1c3462eb _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN7rocksdb16RepeatableThreadC4ESt8functionIFvvEERKNSt7__cxx1112basic_stringIcSt11cha>
 #60x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #70x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #80x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1747:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x00007f3a1c2c6c47 _ZN7myrocks21Rdb_drop_index_thread3runEv (ha_rocksdb.so + 0xc6c47)
 #30x00007f3a1c2fcf3e _ZN7myrocks10Rdb_thread11thread_funcEPv (ha_rocksdb.so + 0xfcf3e)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1712:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x00007f3a1e4416c0 _ZNSt18condition_variable4waitERSt11unique_lockISt5mutexE (libstdc++.so.6 + 0x416c0)
 #30x00007f3a1c4ad231 _ZN7rocksdb14ThreadPoolImpl4Impl8BGThreadEm (ha_rocksdb.so + 0x2ad231)
 #40x00007f3a1c4b31ef _ZN7rocksdb14ThreadPoolImpl4Impl15BGThreadWrapperEPv (ha_rocksdb.so + 0x2b31ef)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1743:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x00007f3a1c4ea59c _ZN7rocksdb4port7CondVar9TimedWaitEm (ha_rocksdb.so + 0x2ea59c)
 #30x00007f3a1c42cc61 _ZN7rocksdb19InstrumentedCondVar17TimedWaitInternalEm (ha_rocksdb.so + 0x22cc61)
 #40x00007f3a1c42e818 _ZN7rocksdb19InstrumentedCondVar9TimedWaitEm (ha_rocksdb.so + 0x22e818)
 #50x00007f3a1c3462eb _ZNSt6thread11_State_implINS_8_InvokerISt5tupleIJZN7rocksdb16RepeatableThreadC4ESt8functionIFvvEERKNSt7__cxx1112basic_stringIcSt11cha>
 #60x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #70x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #80x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1751:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x0000556d25c5bc95 _ZL22buf_flush_page_cleanerv.lto_priv.0 (mariadbd + 0x896c95)
 #30x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1746:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x00007f3a1c2d0bae _ZN7myrocks21Rdb_background_thread3runEv (ha_rocksdb.so + 0xd0bae)
 #30x00007f3a1c2fcf3e _ZN7myrocks10Rdb_thread11thread_funcEPv (ha_rocksdb.so + 0xfcf3e)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 6865:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x0000556d259e66d9 net_real_write (mariadbd + 0x6216d9)
 #30x0000556d25507a47 _ZN8Protocol11net_send_okEP3THDjjyyPKcb (mariadbd + 0x142a47)
 #40x0000556d25508427 _ZN8Protocol13end_statementEv (mariadbd + 0x143427)
 #50x0000556d255f0d87 _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22bd87)
 #60x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #70x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #80x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #90x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #10 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #11 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1750:
 #00x00007f3a1e2f337d __poll (libc.so.6 + 0xe737d)
 #10x0000556d25c5410e _ZN12mem_pressure16pressure_routineEPS_ (mariadbd + 0x88f10e)
 #20x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #30x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #40x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1749:
 #00x00007f3a1e2fee9d syscall (libc.so.6 + 0xf2e9d)
 #10x0000556d25cac9c7 _ZN5tpool9aio_linux23getevent_thread_routineEPS0_ (mariadbd + 0x8e79c7)
 #20x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #30x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #40x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 8450:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c8b2 pthread_cond_clockwait@GLIBC_2.30 (libc.so.6 + 0x708b2)
 #20x0000556d25cace0c _ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE (mariadbd + 0x8e7e0c)
 #30x0000556d25cad153 _ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE (mariadbd + 0x8e8153)
 #40x0000556d25caf84e _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (mariadbd + 0x8ea84e)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1758:
 #00x00007f3a1e226d18 __sigtimedwait (libc.so.6 + 0x1ad18)
 #10x0000556d254bf65b signal_hand (mariadbd + 0xfa65b)
 #20x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #30x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #40x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 8509:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c8b2 pthread_cond_clockwait@GLIBC_2.30 (libc.so.6 + 0x708b2)
 #20x0000556d25cace0c _ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE (mariadbd + 0x8e7e0c)
 #30x0000556d25cad153 _ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE (mariadbd + 0x8e8153)
 #40x0000556d25caf84e _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (mariadbd + 0x8ea84e)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1755:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x0000556d255ec300 handle_manager (mariadbd + 0x227300)
 #30x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 8650:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c8b2 pthread_cond_clockwait@GLIBC_2.30 (libc.so.6 + 0x708b2)
 #20x0000556d25cace0c _ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE (mariadbd + 0x8e7e0c)
 #30x0000556d25cad153 _ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE (mariadbd + 0x8e8153)
 #40x0000556d25caf84e _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (mariadbd + 0x8ea84e)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 6236:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x0000556d259e66d9 net_real_write (mariadbd + 0x6216d9)
 #30x0000556d25507a47 _ZN8Protocol11net_send_okEP3THDjjyyPKcb (mariadbd + 0x142a47)
 #40x0000556d25508427 _ZN8Protocol13end_statementEv (mariadbd + 0x143427)
 #50x0000556d255f0d87 _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22bd87)
 #60x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #70x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #80x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #90x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #10 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #11 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 6219:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c239 pthread_cond_wait@@GLIBC_2.3.2 (libc.so.6 + 0x70239)
 #20x0000556d259e66d9 net_real_write (mariadbd + 0x6216d9)
 #30x0000556d25507a47 _ZN8Protocol11net_send_okEP3THDjjyyPKcb (mariadbd + 0x142a47)
 #40x0000556d25508427 _ZN8Protocol13end_statementEv (mariadbd + 0x143427)
 #50x0000556d255f0d87 _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22bd87)
 #60x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #70x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #80x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #90x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #10 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #11 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 6242:
 #00x00007f3a1e2fee9d syscall (libc.so.6 + 0xf2e9d)
 #10x0000556d25ba5b2b _ZN17group_commit_lock7acquireEmPK19completion_callback (mariadbd + 0x7e0b2b)
 #20x0000556d25b9c46a _Z15log_write_up_tombPK19completion_callback (mariadbd + 0x7d746a)
 #30x0000556d25c24a5c _ZL23trx_flush_log_if_neededmP5trx_t.part.0.lto_priv.0 (mariadbd + 0x85fa5c)
 #40x0000556d25b4a997 _ZL15innobase_commitP10handlertonP3THDb.lto_priv.0 (mariadbd + 0x785997)
 #50x0000556d258b14fc _ZL18commit_one_phase_2P3THDbP9THD_TRANSb (mariadbd + 0x4ec4fc)
 #60x0000556d258b28cb _Z15ha_commit_transP3THDb (mariadbd + 0x4ed8cb)
 #70x0000556d2577fcc3 _Z17trans_commit_stmtP3THD (mariadbd + 0x3bacc3)
 #80x0000556d255fe3cc _Z21mysql_execute_commandP3THDb (mariadbd + 0x2393cc)
 #90x0000556d25603b9b _Z11mysql_parseP3THDPcjP12Parser_state (mariadbd + 0x23eb9b)
 #10 0x0000556d255f239c _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22d39c)
 #11 0x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #12 0x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #13 0x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #14 0x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #15 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #16 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 8627:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c8b2 pthread_cond_clockwait@GLIBC_2.30 (libc.so.6 + 0x708b2)
 #20x0000556d25cace0c _ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE (mariadbd + 0x8e7e0c)
 #30x0000556d25cad153 _ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE (mariadbd + 0x8e8153)
 #40x0000556d25caf84e _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (mariadbd + 0x8ea84e)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 8652:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c8b2 pthread_cond_clockwait@GLIBC_2.30 (libc.so.6 + 0x708b2)
 #20x0000556d25cace0c _ZN5tpool19thread_pool_generic14wait_for_tasksERSt11unique_lockISt5mutexEPNS_11worker_dataE (mariadbd + 0x8e7e0c)
 #30x0000556d25cad153 _ZN5tpool19thread_pool_generic8get_taskEPNS_11worker_dataEPPNS_4taskE (mariadbd + 0x8e8153)
 #40x0000556d25caf84e _ZN5tpool19thread_pool_generic11worker_mainEPNS_11worker_dataE (mariadbd + 0x8ea84e)
 #50x00007f3a1e44b524 n/a (libstdc++.so.6 + 0x4b524)
 #60x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #70x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1748:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x00007f3a1c2d10ee _ZN7myrocks28Rdb_manual_compaction_thread3runEv (ha_rocksdb.so + 0xd10ee)
 #30x00007f3a1c2fcf3e _ZN7myrocks10Rdb_thread11thread_funcEPv (ha_rocksdb.so + 0xfcf3e)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 1754:
 #00x00007f3a1e2797e9 __futex_abstimed_wait_common (libc.so.6 + 0x6d7e9)
 #10x00007f3a1e27c5a2 pthread_cond_timedwait@@GLIBC_2.3.2 (libc.so.6 + 0x705a2)
 #20x0000556d25a3e189 my_service_thread_sleep (mariadbd + 0x679189)
 #30x0000556d25a46719 ma_checkpoint_background (mariadbd + 0x681719)
 #40x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #50x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 
 Stack trace of thread 6240:
 #00x00007f3a1e2fee9d syscall (libc.so.6 + 0xf2e9d)
 #10x0000556d25ba5b2b _ZN17group_commit_lock7acquireEmPK19completion_callback (mariadbd + 0x7e0b2b)
 #20x0000556d2540763f _Z27log_write_and_flush_preparev (mariadbd + 0x4263f)
 #30x0000556d2540eed4 _ZN5mtr_t11commit_fileER11fil_space_tPKc (mariadbd + 0x49ed4)
 #40x0000556d254400fa _ZN11fil_space_t6renameEPKcbb.cold (mariadbd + 0x7b0fa)
 #50x0000556d25c5ee21 _ZNK12dict_table_t17rename_tablespaceEN3st_4spanIKcEEb (mariadbd + 0x899e21)
 #60x0000556d25c6374f _Z26dict_table_rename_in_cacheP12dict_table_tN3st_4spanIKcEEb (mariadbd + 0x89e74f)
 #70x0000556d25bf1343 _Z26row_rename_table_for_mysqlPKcS0_P5trx_tb (mariadbd + 0x82c343)
 #80x0000556d25b782d1 _Z18commit_try_rebuildP18Alter_inplace_infoP23ha_innobase_inplace_ctxP5TABLEPKS3_bP5trx_tPKc (mariadbd + 0x7b32d1)
 #90x0000556d25b89707 _ZN11ha_innobase26commit_inplace_alter_tableEP5TABLEP18Alter_inplace_infob (mariadbd + 0x7c4707)
 #10 0x0000556d25e2ffc6 _ZL25mysql_inplace_alter_tableP3THDP10TABLE_LISTP5TABLES4_P18Alter_inplace_infoP11MDL_requestP16st_ddl_log_stateP20TRIGGER_RENAME_PAR>
 #11 0x0000556d256f1879 _Z17mysql_alter_tableP3THDPK25st_mysql_const_lex_stringS3_P22Table_specification_stP10TABLE_LISTP13Recreate_infoP10Alter_infojP8st_or>
 #12 0x0000556d256f230a _Z20mysql_recreate_tableP3THDP10TABLE_LISTP13Recreate_infob (mariadbd + 0x32d30a)
 #13 0x0000556d2577a700 _ZL20admin_recreate_tableP3THDP10TABLE_LISTP13Recreate_info.lto_priv.0 (mariadbd + 0x3b5700)
 #14 0x0000556d25df1948 _ZL17mysql_admin_tableP3THDP10TABLE_LISTP15st_ha_check_optPK25st_mysql_const_lex_string13thr_lock_typebbjPFiS0_S2_S4_EM7handlerFiS0_S>
 #15 0x0000556d2577f27e _ZN22Sql_cmd_optimize_table7executeEP3THD (mariadbd + 0x3ba27e)
 #16 0x0000556d255fe870 _Z21mysql_execute_commandP3THDb (mariadbd + 0x239870)
 #17 0x0000556d25603b9b _Z11mysql_parseP3THDPcjP12Parser_state (mariadbd + 0x23eb9b)
 #18 0x0000556d255f239c _Z16dispatch_command19enum_server_commandP3THDPcjb (mariadbd + 0x22d39c)
 #19 0x0000556d255f40d8 _Z10do_commandP3THDb (mariadbd + 0x22f0d8)
 #20 0x0000556d2577be20 _Z24do_handle_one_connectionP7CONNECTb (mariadbd + 0x3b6e20)
 #21 0x0000556d2577c2e6 handle_one_connection (mariadbd + 0x3b72e6)
 #22 0x0000556d25aab77a pfs_spawn_thread (mariadbd + 0x6e677a)
 #23 0x00007f3a1e27d148 start_thread (libc.so.6 + 0x71148)
 #24 0x00007f3a1e3010cc __clone3 (libc.so.6 + 0xf50cc)
 ELF object binary architecture: AMD x86-64

@rfjakob
Owner

rfjakob commented Jan 19, 2025

I was hoping for line numbers, but better than nothing.
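Even without line numbers, the mangled C++ symbols in such a trace can be made readable with c++filt from binutils. A small sketch, assuming c++filt is installed (the `demangle` wrapper is a made-up helper that passes names through unchanged when it is not):

```shell
#!/bin/sh
# demangle: read mangled C++ symbol names on stdin, one per line,
# and print them in readable form.
demangle() {
    if command -v c++filt >/dev/null 2>&1; then
        c++filt
    else
        cat    # fallback: leave the names as they are
    fi
}

# Three frames taken from the crashing thread in the trace.
printf '%s\n' _ZN2ib5fatalD2Ev _ZL9log_flushm _Z15ha_commit_transP3THDb | demangle
```

Read this way, the top of the crashing thread is the ib::fatal destructor called from the fsync wrapper during a redo log flush on commit.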

I failed to get mariadb to run on gocryptfs due to SELinux (do you have a hint besides disabling SELinux?). I am doing this ( https://optimizedbyotto.com/post/grokking-mariadb-test-run-mtr/ ):

sudo dnf install mariadb-test
cd /usr/share/mysql-test
MTR_PRINT_CORE=detailed ./mtr --parallel=auto --skip-test-list=unstable-tests.amd64 --big-test --vardir=/tmp/mysql-test.b/var

/tmp/mysql-test.b is a gocryptfs mount.

Let's see.

@johnassel
Author

For SELinux:
setenforce 0 on the console, or in /etc/selinux/config change
SELINUX=enforcing to SELINUX=disabled
and then reboot.
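As a rough sketch of the two options: `getenforce` only reports the current mode, while `setenforce 0` switches to permissive mode until the next reboot. The config edit can be scripted as below; `set_selinux_mode` is a hypothetical helper, not part of any SELinux tooling.

```shell
#!/bin/sh
# set_selinux_mode FILE MODE: rewrite the SELINUX= line of FILE to MODE
# (enforcing, permissive or disabled). Other lines, including
# SELINUXTYPE=, are left untouched.
set_selinux_mode() {
    file=$1
    mode=$2
    tmp=$(mktemp) || return 1
    if sed "s/^SELINUX=.*/SELINUX=$mode/" "$file" > "$tmp"; then
        cat "$tmp" > "$file"    # overwrite in place, keeping owner and mode
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Usage (as root):
#   getenforce                                       # show the current mode
#   setenforce 0                                     # permissive until reboot
#   set_selinux_mode /etc/selinux/config disabled    # persistent, needs a reboot
```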

@rfjakob
Owner

rfjakob commented Jan 19, 2025

Well. mariadb-test failed, but it also failed on plain ext4 (differently). So that's not a very good reproducer.

How do you trigger the crash?

@johnassel
Author

It seems very random. Mostly it occurs when one of my cronjobs runs and triggers a PHP script that does something with the DB. A lot of calls are OK, but maybe once an hour it crashes.
I tried to reproduce it by putting some load on the DB, but this didn't work.

@rfjakob
Owner

rfjakob commented Jan 19, 2025

Looking at os_file_sync_posix from the backtrace ( https://github.com/MariaDB/server/blob/3d0fb150289716ca75cd64d62823cf715ee47646/storage/innobase/os/os0file.cc#L883 ), you should have gotten a log entry like fsync() returned 123 wherever mariadb logs to. Can you check?
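One way to check is to search the usual log locations for that message. A minimal sketch; `find_fsync_errors` is a made-up helper, and the log path and systemd unit name in the usage comment are assumptions that depend on the installation:

```shell
#!/bin/sh
# find_fsync_errors [FILE...]: print lines (with line numbers) that mention
# a failed fsync(). Reads stdin when no file is given.
find_fsync_errors() {
    grep -n -F 'fsync() returned' "$@"
}

# Usage (path and unit name are assumptions):
#   find_fsync_errors /var/log/mariadb/mariadb.log
#   journalctl -u mariadb --no-pager | find_fsync_errors
```

grep exits non-zero when nothing matches, so the helper can also be used as a condition in a script.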

@rfjakob
Owner

rfjakob commented Jan 19, 2025

After dropping --big-test I get mariadb-test-run to pass on both tmpfs and gocryptfs-on-tmpfs, command line:

MTR_PRINT_CORE=detailed ./mariadb-test-run --parallel=auto --skip-test-list=unstable-tests.amd64 --vardir=/tmp/mysql-test.b/var

But I just realized I missed the critical ingredient of running gocryptfs as root with -allow_other ( #892 (comment) )
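For illustration, that setup could be exercised roughly as follows. This is only a sketch and not the reproducer used in this thread: the paths, the throwaway password and the unprivileged user name are placeholders, and `run` and `allow_other_smoke_test` are hypothetical helpers. With DRY_RUN=1 the commands are printed instead of executed.

```shell
#!/bin/sh
# Sketch: mount gocryptfs as root with -allow_other, then create a directory
# on the mount as an unprivileged user. All names below are placeholders.

# run CMD...: execute CMD, or only print it when DRY_RUN=1.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# allow_other_smoke_test CIPHERDIR MOUNTPOINT USER
allow_other_smoke_test() {
    cipher=$1
    mnt=$2
    user=$3
    run mkdir -p "$cipher" "$mnt" &&
    run gocryptfs -q -init -extpass 'echo test' "$cipher" &&
    run gocryptfs -q -extpass 'echo test' -allow_other "$cipher" "$mnt" &&
    run chmod 1777 "$mnt" &&
    run runuser -u "$user" -- mkdir "$mnt/dir-as-$user"
    status=$?
    run fusermount -u "$mnt"    # always try to unmount again
    return "$status"
}

# Usage (as root; a single attempt may well succeed, since the failure is intermittent):
#   allow_other_smoke_test /tmp/cipher /tmp/plain nobody
```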

@rfjakob
Owner

rfjakob commented Jan 19, 2025

That did the trick. I got a crash, and also some mkdir: Operation not permitted @glhughes123

@rfjakob
Owner

rfjakob commented Jan 19, 2025

Here we go. Same "Operation not permitted" issue.

2025-01-19 21:11:48 0 [ERROR] mariadbd: File './mysqld-bin.~rec~' not found (Errcode: 1 "Operation not permitted")
2025-01-19 21:11:48 0 [ERROR] MYSQL_BIN_LOG::open_purge_index_file failed to open register  file.
2025-01-19 21:11:48 0 [ERROR] MYSQL_BIN_LOG::open_index_file failed to sync the index file.
2025-01-19 21:11:48 0 [ERROR] Aborting

@johnassel
Author

Looking at os_file_sync_posix from the backtrace ( https://github.com/MariaDB/server/blob/3d0fb150289716ca75cd64d62823cf715ee47646/storage/innobase/os/os0file.cc#L883 ), you should have gotten a log entry like fsync() returned 123 wherever mariadb logs to. Can you check?

Hmm, unfortunately my mariadb did not log such a thing, neither in /var/log/mariadb/mariadb.log nor in journald.
But I think I found the coredump file if that helps.

@glhughes123

@johnassel if you need a workaround I have set up my gocryptfs mounts to use a script that calls runuser. This allows the system to mount as the owner (instead of root) with the allow_other flag set (so other users can read/write) and seems to avoid the permission errors for me.

For example:

[greg@selene:~] cat bin/mount_crypt_ares_backups.sh 
#!/bin/sh
# Run gocryptfs as the mount owner instead of root; "$@" forwards the mount arguments unchanged.
/usr/sbin/runuser -u greg -- /usr/bin/gocryptfs --passfile=/home/greg/.keys/ares.key "$@"

and the fstab entry looks like:

/home/greg/bin/mount_crypt_ares_backups.sh#/backups/ares/home/.greg.crypt /backups/ares/home/greg fuse rw,user,auto,allow_other,x-systemd.requires=/mnt/storage0 0 0

@johnassel
Author

Ah ok - my workaround is reverting to 2.4.0 for now.

@rfjakob
Owner

rfjakob commented Jan 21, 2025

Reproducer script: https://gist.github.com/rfjakob/411d5b77bb39322eee3aca769b1d9577
Bug bisected to 08b6ed1
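A bisection like this can be automated with `git bisect run`. A minimal sketch, assuming a git checkout and a test command that exits 0 on a good revision and non-zero on a bad one; `bisect_between` is a hypothetical helper, and `./repro.sh` in the usage comment is a placeholder for a script that builds the checked-out revision and runs a reproducer:

```shell
#!/bin/sh
# bisect_between GOOD BAD CMD...: bisect the current repository between a
# known-good and a known-bad revision and print the first bad commit hash.
# CMD must exit 0 for good revisions; exit codes 1-127 (except 125, which
# means "skip this revision") mark a revision as bad.
bisect_between() {
    good=$1
    bad=$2
    shift 2
    git bisect start "$bad" "$good" >/dev/null || return 1
    git bisect run "$@" >/dev/null 2>&1
    # After a finished bisection, refs/bisect/bad points at the first bad commit.
    first_bad=$(git rev-parse --verify refs/bisect/bad)
    git bisect reset >/dev/null 2>&1
    printf '%s\n' "$first_bad"
}

# Usage (inside a clone of the repository):
#   bisect_between v2.4.0 v2.5.0 ./repro.sh
```

For an intermittent failure, the test command has to retry often enough that a bad revision is not mistaken for a good one.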

@rfjakob
Owner

rfjakob commented Jan 21, 2025

Looks like golang/sys@d0df966 broke it.

rfjakob added a commit that referenced this issue Jan 21, 2025