caffe on jetson tk1 #1861

Closed
kaaicheung opened this issue Feb 13, 2015 · 12 comments

@kaaicheung

I got the following error during make runtest:
F0213 10:25:10.921640 9387 db.hpp:109] Check failed: mdb_status == 0 (-30792 vs. 0) MDB_MAP_FULL: Environment mapsize limit reached
*** Check failure stack trace: ***
@ 0x44a85060 (unknown)
@ 0x44a84f5c (unknown)
@ 0x44a84b78 (unknown)
@ 0x44a86f98 (unknown)
@ 0x265ec2 caffe::db::LMDBTransaction::Put()
@ 0x21b932 caffe::DBTest_TestWrite_Test<>::TestBody()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()
@ 0x230568 testing::internal::HandleExceptionsInMethodIfSupported<>()

@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()
@   0x230568  testing::internal::HandleExceptionsInMethodIfSupported<>()

Aborted
make: *** [runtest] Error 134
I mostly followed the install instructions from http://petewarden.com/2014/10/25/how-to-run-the-caffe-deep-learning-vision-library-on-nvidias-jetson-mobile-gpu-board/ , except that I installed newer versions of some components.
Does anyone know how to solve this?
Many thanks in advance!

@erictzeng
Contributor

You could try enabling the DEBUG flag in Makefile.config and see if you get any more useful information.

Unfortunately, we have little experience running on this hardware, so we can't provide support here. You may have better luck asking on the caffe-users mailing list.

@jetsonhacks

A link to a write-up on an updated install: http://jetsonhacks.com/2015/01/17/nvidia-jetson-tk1-caffe-deep-learning-framework/

@kaaicheung
Author

Many thanks to jetsonhacks for providing a very nice and easy solution! Thanks to erictzeng for the advice too!

@ajschumacher
Contributor

@kaaicheung What solved the problem for you? @jetsonhacks do you have any thoughts? I'm building master at 543afd3 and getting the following from make runtest on my Jetson:

[----------] 5 tests from DBTest/1, where TypeParam = caffe::TypeLMDB
[ RUN      ] DBTest/1.TestWrite
F0614 22:42:15.186868 10357 db.hpp:109] Check failed: mdb_status == 0 (-30792 vs. 0) MDB_MAP_FULL: Environment mapsize limit reached
*** Check failure stack trace: ***
    @ 0x432cd060  (unknown)
    @ 0x432ccf5c  (unknown)
    @ 0x432ccb78  (unknown)
    @ 0x432cef98  (unknown)
    @ 0x43be9450  caffe::db::LMDBTransaction::Put()
    @   0x20c0d2  caffe::DBTest_TestWrite_Test<>::TestBody()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
    @   0x2b3700  testing::internal::HandleExceptionsInMethodIfSupported<>()
make: *** [runtest] Aborted

Things seem to be working otherwise; is the problem local to lmdb components?

@ajschumacher
Contributor

@erictzeng I think this issue is due to the Jetson being a 32-bit (ARM) device: the constant LMDB_MAP_SIZE in src/caffe/util/db.cpp is too large to fit in a 32-bit size_t. Here's the whole line:

const size_t LMDB_MAP_SIZE = 1099511627776;  // 1 TB

The solution suggested by Боголюбский Алексей of using 2^29 (536870912) instead works at least well enough to get all the tests to run successfully.

I'm not at all sure that this is the best solution; another value might be better, or there may be some other workaround for 32-bit systems. Maybe there's a way to add a setting in Makefile.config so that this can be adjusted without hand-editing the source.

Fixing this would be really great; small systems like the Jetson are super fun places to run Caffe and it would be nice if it installed easily there.

@erictzeng would you consider re-opening this issue and suggesting some preferable solution along the lines of the above?

I'll try to reply to the two threads on the email list where this has come up.

Thanks @erictzeng @kaaicheung @jetsonhacks!

@ajschumacher
Contributor

Argh; I'm afraid the "fix" was not as successful as I had hoped. While make runtest works just fine, I'm still unable to run the MNIST LeNet example using LMDB. If I switch over to LevelDB it seems to work though, so I have that workaround for the moment.

@jetsonhacks

@ajschumacher That was a good catch on the 32-bit issue. I guess there will always be a tension between an unconstrained 64-bit implementation and a 32-bit one with more limited resources.

@ajschumacher
Contributor

Thanks @jetsonhacks - credit due to Боголюбский on the mailing list.

I also wrote up more about my setup efforts here: http://planspace.org/20150614-the_nvidia_jetson_tk1_with_caffe_on_mnist/

@coreyt

coreyt commented Jul 11, 2015

To get the LMDB portion of the tests to work, make sure to update examples/mnist/convert_mnist_data.cpp as well:

examples/mnist/convert_mnist_data.cpp:89:56: warning: large integer implicitly truncated to unsigned type [-Woverflow]
     CHECK_EQ(mdb_env_set_mapsize(mdb_env, 1099511627776), MDB_SUCCESS)  // 1TB
                                                        ^

Here is the end of my run:

$ make runtest

[...]

[----------] Global test environment tear-down
[==========] 1209 tests from 210 test cases ran. (4425801 ms total)
[  PASSED  ] 1209 tests.

  YOU HAVE 2 DISABLED TESTS

@aseuteurideu

I tried changing LMDB_MAP_SIZE to 536870912, and on my second try to 268435456.
I am using create_imagenet.sh.
I don't think convert_mnist_data.cpp is connected to create_imagenet.sh, but on the second try I changed the value there to 268435456 as well.
Both attempts failed at the same point.
The error is below.
Any ideas?

E1209 10:04:34.680079 4314 convert_imageset.cpp:143] Processed 5475000 files.
E1209 10:04:37.770831 4314 convert_imageset.cpp:143] Processed 5476000 files.
F1209 10:04:39.165669 4314 db.hpp:109] Check failed: mdb_status == 0 (-30792 vs. 0) MDB_MAP_FULL: Environment mapsize limit reached
*** Check failure stack trace: ***
@ 0x7f9970dc2daa (unknown)
@ 0x7f9970dc2ce4 (unknown)
@ 0x7f9970dc26e6 (unknown)
@ 0x7f9970dc5687 (unknown)
@ 0x7f99711bc9c0 caffe::db::LMDBTransaction::Put()
@ 0x403802 main
@ 0x7f996ffd2ec5 (unknown)
@ 0x40431c (unknown)
@ (nil) (unknown)
Aborted (core dumped)

FYI, I am using Amazon Web Services with 64-bit Ubuntu 14.04. The image data is on one partition and the LMDB file is on another.

@aseuteurideu

An update on my problem:
I finally got it working.
I increased LMDB_MAP_SIZE to 2 TB (2^41 bytes, twice the original 1 TB).
Earlier I had forgotten to rebuild Caffe with make all, so my changes to the source had no effect (newbie's mistake, haha).

@jerpint

jerpint commented Jun 1, 2016

Hello, I am having a problem moving LMDB files generated on my Mac (64-bit) to the Jetson TX1 (32-bit).

After a full day of troubleshooting, I've come to the conclusion that LMDB files that work on my TX1 don't work on my Mac, and LMDB files that work on my Mac don't work on my TX1. I am creating the LMDB files on my Mac and using an external drive to bring them over to the TX1. Does anyone know how I could convert an LMDB file from the 64-bit format to the 32-bit format?

Everything else works smoothly.. for now ^^

Thanks

J
