<jemalloc>: Unsupported system page size with cockroach 23.1.5 #106745

Closed
fveauvy opened this issue Jul 13, 2023 · 7 comments
Labels
B-arch-arm Issues specific to ARM64 builds (linux or macos) B-os-linux Issues specific to the Linux OS (any distribution) C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. db-cy-23 O-community Originated from the community T-storage Storage Team X-blathers-triaged blathers was able to find an owner

Comments

@fveauvy

fveauvy commented Jul 13, 2023

Describe the problem

I'm getting an error when running CockroachDB v23.1.5 (linux-arm64) on an M1 Mac mini with an Asahi Linux kernel (6.3.0-asahi-10-1-ARCH).

Cockroach fails with:

<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
<jemalloc>: Unsupported system page size
runtime/cgo: malloc failed: Cannot allocate memory

Note that I don't get this error with the version 22.2.11 Docker image: cockroachdb/cockroach:arm64-v22.2.11
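
The message suggests that the kernel's page size is larger than the one this jemalloc build accepts. A minimal sketch for checking the host page size, assuming Python 3 is available on the machine. The function names are illustrative, and the 4 KiB baseline is an assumption about the failing build, not something this report confirms:

```python
import os

# Assumption: the failing binary's jemalloc was configured for 4 KiB pages.
ASSUMED_JEMALLOC_PAGE = 4096


def system_page_size() -> int:
    """Return the kernel page size in bytes, as reported by sysconf."""
    return os.sysconf("SC_PAGE_SIZE")


def exceeds_compiled_page(page_size: int,
                          compiled_page: int = ASSUMED_JEMALLOC_PAGE) -> bool:
    """True when the system page is larger than the assumed jemalloc page."""
    return page_size > compiled_page


if __name__ == "__main__":
    size = system_page_size()
    print(f"system page size: {size} bytes")
    if exceeds_compiled_page(size):
        print("larger than the assumed 4 KiB jemalloc page; "
              "the error above would be expected")
```

On a kernel with 16 KiB pages the first line should report 16384; the equivalent shell check is `getconf PAGESIZE`.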

To Reproduce

Using Asahi Linux on an Apple Silicon Mac:

  1. docker pull cockroachdb/cockroach:23.1.5
  2. docker volume create roach-single
  3. docker run -it -v "roach-single:/cockroach/cockroach-data" cockroachdb/cockroach:23.1.5 start-single-node --insecure
  4. You'll see the error mentioned above in the console output

Expected behavior

Since cockroach 22.2.11 works fine on linux/arm64 with Asahi Linux, I think 23.1.5 should run as well.

Additional data / screenshots

Here are some related links and issues:

[Screenshot: CleanShot 2023-07-13 at 11 52 13]

Additional context

We're migrating our CI/CD pipeline to GitHub Actions self-hosted runners. For various reasons, we want to use Linux on our available M1 Mac minis to run testing workflows that include CockroachDB services.

Jira issue: CRDB-29697

@fveauvy fveauvy added the C-bug Code not up to spec/doc, specs & docs deemed correct. Solution expected to change code/behavior. label Jul 13, 2023
@blathers-crl

blathers-crl bot commented Jul 13, 2023

Hello, I am Blathers. I am here to help you get the issue triaged.

Hoot - a bug! Though bugs are the bane of my existence, rest assured the wretched thing will get the best of care here.

I have CC'd a few people who may be able to assist you:

If we have not gotten back to your issue within a few business days, you can try the following:

  • Join our community slack channel and ask on #cockroachdb.
  • Try to find someone from here if you know they worked closely on the area and CC them.

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@blathers-crl blathers-crl bot added O-community Originated from the community X-blathers-triaged blathers was able to find an owner labels Jul 13, 2023
@RaduBerinde
Member

To fix this, we need to configure jemalloc to support a larger page size (--with-lg-page=16). However, in some cases that setting caused problems, at least a while back - #81581.
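
For reference, the value passed to `--with-lg-page` is a base-2 exponent, so 16 corresponds to 64 KiB and 14 to 16 KiB. A small sketch of that arithmetic follows; the helper names are hypothetical, and the acceptance rule (the system page must be no larger than the compiled page) reflects my understanding of jemalloc's startup check rather than anything stated in this thread:

```python
def page_size_for(lg_page: int) -> int:
    """Page size in bytes implied by a --with-lg-page exponent."""
    return 1 << lg_page


def min_lg_page(page_size: int) -> int:
    """Smallest exponent whose page size equals a power-of-two page_size."""
    if page_size <= 0 or page_size & (page_size - 1):
        raise ValueError("page size must be a positive power of two")
    return page_size.bit_length() - 1


def jemalloc_accepts(lg_page: int, system_page_size: int) -> bool:
    """Assumed rule: the system page must not exceed the compiled page."""
    return system_page_size <= page_size_for(lg_page)


if __name__ == "__main__":
    for lg in (12, 14, 16):
        ok = jemalloc_accepts(lg, 16 * 1024)
        print(f"lg-page={lg}: {page_size_for(lg)} bytes, "
              f"accepts 16 KiB pages: {ok}")
```

Under that assumption, a build with exponent 16 would also start on 16 KiB and 4 KiB kernels, while the 4 KiB setting (exponent 12) would not start on a 16 KiB kernel.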

@blathers-crl blathers-crl bot added the T-storage Storage Team label Jul 13, 2023
@knz knz added B-unsupported-arch Non-x86_64 architectures: PPC, MIPS, etc B-os-linux Issues specific to the Linux OS (any distribution) B-arch-arm Issues specific to ARM64 builds (linux or macos) and removed B-unsupported-arch Non-x86_64 architectures: PPC, MIPS, etc labels Jul 13, 2023
@RaduBerinde
Member

@fveauvy I will try to tweak the jemalloc build settings for ARM to support larger pages. I do want to point out though that for now ARM support is best-effort and we don't recommend using it for production. We have filed internal tickets to improve the documentation to be more clear about this.

@fveauvy
Author

fveauvy commented Jul 17, 2023

> @fveauvy I will try to tweak the jemalloc build settings for ARM to support larger pages. I do want to point out though that for now ARM support is best-effort and we don't recommend using it for production. We have filed internal tickets to improve the documentation to be more clear about this.

@RaduBerinde Duly noted. As I mentioned under "Additional context", we don't plan to use it in production, only in our unit/integration test environment with GitHub Actions self-hosted runners.

Thanks a lot for your input.

RaduBerinde added a commit to RaduBerinde/cockroach that referenced this issue Jul 17, 2023
We have had two reports of users not being able to run cockroach 23.1
on linux/arm64 with 16k pages (jemalloc does not support the page
size).  This change attempts to fix this by compiling jemalloc with
16K page support, only in the linux/arm64 configuration.

Epic: None
Informs: cockroachdb#106745
Release note: None
craig bot pushed a commit that referenced this issue Jul 17, 2023
106923: go.mod: bump Pebble to 809057a10ee4 r=RaduBerinde a=sumeerbhola

809057a1 sstable: avoid caching meta blocks
03c97cda db: do not cache compaction block reads
a89c926f internal/cache: move Alloc and Free into package-level functions
9976c78b manifest: reduce test size under race
10d9e4a0 lint: disable during race tests
7bb765ec db: smoother thresholds for the delete pacer
02413ad7 cmd: enable benchmarking shared storage including the secondary cache
4606eaf6 db,sstable: bug fixes for keys with obsolete bit
9d75815f objstorage: close the shared provider including the catalog & cache
840866b8 ingest: Use Overlaps() to calculate L0 overlaps in excise
2ad4e668 vfs: Deflake TestDiskHealthChecking_File*
5afd803f db: add some logging for the cleaner
88bbab59 db: Bring back call to deleteObsoleteFiles in Close()
5c2da530 ingest: don't assume flushable ingests are locally present
a9a079d4 sharedobjcat: allow addition/deletion of object in same batch
52a12cd5 internal/manifest: delete L0Sublevels_LargeImport
168079ac metamorphic: test shared storage including the secondary cache
fa85ec45 db: deflake TestCompactionPickerScores
ca11d1bb *: allow virtual sstables to truncate range keys/dels
678f7e48 internal/manifest: link issue to skipped test

Epic: None
Release note: None

106929: c-deps: support 16K pages for ARM64 r=RaduBerinde a=RaduBerinde

We have had two reports of users not being able to run cockroach 23.1 on linux/arm64 with 16k pages (jemalloc does not support the page size). This change attempts to fix this by compiling jemalloc with 16K page support, only in the linux/arm64 configuration.

Epic: None
Informs: #106745
Release note: None

Co-authored-by: sumeerbhola
Co-authored-by: Radu Berinde
blathers-crl bot pushed a commit that referenced this issue Jul 18, 2023
We have had two reports of users not being able to run cockroach 23.1
on linux/arm64 with 16k pages (jemalloc does not support the page
size).  This change attempts to fix this by compiling jemalloc with
16K page support, only in the linux/arm64 configuration.

Epic: None
Informs: #106745
Release note: None
@RaduBerinde
Member

@fveauvy Hopefully the next 23.1.x release works.

rickystewart added a commit to rickystewart/cockroach that referenced this issue Jul 18, 2023
Epic: none
Release note (build): On Linux/ARM64, use 16k page sizes in `jemalloc`
Part of: cockroachdb#106745
craig bot pushed a commit that referenced this issue Jul 19, 2023
107128: c-deps: update archived c-deps to pick up #106929 r=rail,RaduBerinde a=rickystewart

Epic: none
Release note (build): On Linux/ARM64, use 16k page sizes in `jemalloc`
Part of: #106745

107178: Revert "github-pull-request-make: temporary workaround" r=healthy-pod a=knz

Epic: CRDB-18499

This reverts commit da33ea2.
Not needed any more since is merged now.

Co-authored-by: Ricky Stewart
Co-authored-by: Raphael 'kena' Poss
@pawalt
Contributor

pawalt commented Dec 16, 2023

FYI, this is fixed for me on the most recent version of CRDB. I'm also on an Asahi kernel (6.5.0-asahi). This is the command that worked for me:

sudo docker run -it -v "roach-single:/cockroach/cockroach-data" cockroachdb/cockroach:v23.1.13 start-single-node --insecure

Thanks for the help @RaduBerinde

@RaduBerinde
Member

Thanks @pawalt! I will close this then.
