getting corrupted double-linked list #634

Closed
josd opened this issue Nov 25, 2024 · 13 comments

josd commented Nov 25, 2024

We are getting a corrupted double-linked list:

cd /tmp
git clone https://github.com/eyereasoner/eye3
cd eye3
git checkout tags/v1.1.2
tpl -g main eye3.pl etc/lee.pl
:- op(1150,xfx,=>).

'urn:example:route'([[1,1],[9,8],[[[2,3],[4,5]],[[6,6],[8,8]]]],[[9,8],[9,7],[9,6],[9,5],[8,5],[7,5],[6,5],[5,5],[5,4],[5,3],[5,2],[4,2],[3,2],[2,2],[1,2],[1,1]]) => true.
corrupted double-linked list
Aborted

It also occurs in 7 of our other examples and test cases:

cd etc
./test
----------------
Running eye3 etc
eye3 v1.1.2
trealla v2.60.18
----------------

ackermann.pl            corrupted double-linked list
../eye3: line 4: 80437 Aborted                 tpl -g main ${EYE3} "$@"
    726 msec OK
acp.pl                      101 msec OK
bmt.pl                     1419 msec OK
complex.pl              corrupted double-linked list
../eye3: line 4: 80473 Aborted                 tpl -g main ${EYE3} "$@"
     85 msec OK
control.pl                   77 msec OK
dt.pl                      1694 msec OK
easter.pl               corrupted double-linked list
../eye3: line 4: 80509 Aborted                 tpl -g main ${EYE3} "$@"
     82 msec OK
eulers-identity.pl           77 msec OK
fibonacci.pl                 84 msec OK
fourcolor.pl                 84 msec OK
fuse.pl                      77 msec OK
good-cobbler.pl              75 msec OK
gps.pl                  corrupted double-linked list
../eye3: line 4: 80581 Aborted                 tpl -g main ${EYE3} "$@"
     80 msec OK
graph.pl                corrupted double-linked list
../eye3: line 4: 80593 Aborted                 tpl -g main ${EYE3} "$@"
     89 msec OK
hanoi.pl                     93 msec OK
lee.pl                  corrupted double-linked list
../eye3: line 4: 80617 Aborted                 tpl -g main ${EYE3} "$@"
     80 msec OK
mi.pl                        76 msec OK
polygon.pl                   72 msec OK
polynomial.pl                91 msec OK
sdcoding.pl             corrupted double-linked list
../eye3: line 4: 80665 Aborted                 tpl -g main ${EYE3} "$@"
     79 msec OK
socrates.pl                  76 msec OK
sudoku.pl               corrupted double-linked list
../eye3: line 4: 80689 Aborted                 tpl -g main ${EYE3} "$@"
   1017 msec OK
tak.pl                      879 msec OK
turing.pl                    93 msec OK
workplace-benchmark.pl     2443 msec OK
workplace.pl                 73 msec OK

10 sec 26 OK 0 FAILED

josd commented Nov 25, 2024

It seems to be related to format/2; things are back OK in https://github.com/eyereasoner/eye3/tree/v1.2.1

infradig commented Nov 25, 2024 via email

infradig commented

By the way, this didn't give an error for me...

$ git checkout tags/v1.1.2
$ tpl -g main eye3.pl etc/lee.pl
:- op(1150,xfx,=>).

'urn:example:route'([[1,1],[9,8],[[[2,3],[4,5]],[[6,6],[8,8]]]],[[9,8],[9,7],[9,6],[9,5],[8,5],[7,5],[6,5],[5,5],[5,4],[5,3],[5,2],[4,2],[3,2],[2,2],[1,2],[1,1]]) => true.
$

whereas running ./test did.

infradig commented

But valgrind shows something's up...

$ valgrind tpl -g main eye3.pl etc/lee.pl
==859502== Memcheck, a memory error detector
==859502== Copyright (C) 2002-2022, and GNU GPL'd, by Julian Seward et al.
==859502== Using Valgrind-3.22.0 and LibVEX; rerun with -h for copyright info
==859502== Command: tpl -g main eye3.pl etc/lee.pl
==859502== 
:- op(1150,xfx,=>).

'urn:example:route'([[1,1],[9,8],[[[2,3],[4,5]],[[6,6],[8,8]]]],[[9,8],[9,7],[9,6],[9,5],[8,5],[7,5],[6,5],[5,5],[5,4],[5,3],[5,2],[4,2],[3,2],[2,2],[1,2],[1,1]]) => true.
==859502== Invalid read of size 8
==859502==    at 0x215531: clear_clause (in /home/andrew/trealla/tpl)
==859502==    by 0x210FB7: module_destroy (in /home/andrew/trealla/tpl)
==859502==    by 0x22FF4A: pl_destroy.part.0 (in /home/andrew/trealla/tpl)
==859502==    by 0x119456: main (in /home/andrew/trealla/tpl)
==859502==  Address 0x656ebd0 is 0 bytes inside a block of size 30 free'd
==859502==    at 0x484988F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==859502==    by 0x239CF4: query_destroy (in /home/andrew/trealla/tpl)
==859502==    by 0x21F891: run (in /home/andrew/trealla/tpl)
==859502==    by 0x230497: pl_eval (in /home/andrew/trealla/tpl)
==859502==    by 0x118FEF: main (in /home/andrew/trealla/tpl)
==859502==  Block was alloc'd at
==859502==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==859502==    by 0x21CB8D: tokenize (in /home/andrew/trealla/tpl)
==859502==    by 0x21C4BE: tokenize (in /home/andrew/trealla/tpl)
==859502==    by 0x21C4BE: tokenize (in /home/andrew/trealla/tpl)
==859502==    by 0x21C4BE: tokenize (in /home/andrew/trealla/tpl)
==859502==    by 0x21C4BE: tokenize (in /home/andrew/trealla/tpl)
==859502==    by 0x20F311: load_fp (in /home/andrew/trealla/tpl)
==859502==    by 0x20FC8C: load_file (in /home/andrew/trealla/tpl)
==859502==    by 0x2306F5: pl_consult (in /home/andrew/trealla/tpl)
==859502==    by 0x11991A: main (in /home/andrew/trealla/tpl)
==859502== 
==859502== 
==859502== HEAP SUMMARY:
==859502==     in use at exit: 466 bytes in 24 blocks
==859502==   total heap usage: 22,967 allocs, 22,943 frees, 210,034,834 bytes allocated
==859502== 
==859502== LEAK SUMMARY:
==859502==    definitely lost: 466 bytes in 24 blocks
==859502==    indirectly lost: 0 bytes in 0 blocks
==859502==      possibly lost: 0 bytes in 0 blocks
==859502==    still reachable: 0 bytes in 0 blocks
==859502==         suppressed: 0 bytes in 0 blocks
==859502== Rerun with --leak-check=full to see details of leaked memory
==859502== 
==859502== For lists of detected and suppressed errors, rerun with: -s
==859502== ERROR SUMMARY: 9 errors from 1 contexts (suppressed: 0 from 0)
$ 

josd commented Nov 26, 2024

Good to see that, and it is back OK for eye3 v1.2.1:

rm -rf /tmp/eye3
cd /tmp
git clone https://github.com/eyereasoner/eye3
cd eye3
git checkout tags/v1.2.1
valgrind tpl -g main eye3.pl etc/lee.pl
==89139== Memcheck, a memory error detector
==89139== Copyright (C) 2002-2017, and GNU GPL'd, by Julian Seward et al.
==89139== Using Valgrind-3.18.1 and LibVEX; rerun with -h for copyright info
==89139== Command: tpl -g main eye3.pl etc/lee.pl
==89139==
:- op(1150,xfx,=>).

'urn:example:route'([[1,1],[9,8],[[[2,3],[4,5]],[[6,6],[8,8]]]],[[9,8],[9,7],[9,6],[9,5],[8,5],[7,5],[6,5],[5,5],[5,4],[5,3],[5,2],[4,2],[3,2],[2,2],[1,2],[1,1]]) => true.
==89139==
==89139== HEAP SUMMARY:
==89139==     in use at exit: 0 bytes in 0 blocks
==89139==   total heap usage: 12,841 allocs, 12,841 frees, 95,094,425 bytes allocated
==89139==
==89139== All heap blocks were freed -- no leaks are possible
==89139==
==89139== For lists of detected and suppressed errors, rerun with: -s
==89139== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

infradig commented

So you removed format and it's good now?

josd commented Nov 26, 2024

It is!

guregu commented Nov 26, 2024

Not sure if this is helpful, but another snippet that triggers a similar invalid read:

:- use_module(library(clpz)).

test(X) :-
    Y mod 100 #= 0,
    X #= Y.
?- test(X).
   clpz:(X mod 100#=0).
==2299== Invalid read of size 8
==2299==    at 0x2A6B70: __aarch64_ldadd8_acq_rel (in /Users/guregu/code/trealla/trealla/tpl)
==2299==    by 0x28264B: unshare_cell_ (internal.h:927)
==2299==    by 0x28B603: query_destroy (query.c:1808)
==2299==    by 0x26072F: run (parser.c:4174)
==2299==    by 0x27ED5F: pl_eval (prolog.c:145)
==2299==    by 0x118D1F: main (tpl.c:378)
==2299==  Address 0x643b640 is 0 bytes inside a block of size 41 free'd
==2299==    at 0x4887B60: free (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==2299==    by 0x28266B: unshare_cell_ (internal.h:928)
==2299==    by 0x28B603: query_destroy (query.c:1808)
==2299==    by 0x26072F: run (parser.c:4174)
==2299==    by 0x27ED5F: pl_eval (prolog.c:145)
==2299==    by 0x118D1F: main (tpl.c:378)
==2299==  Block was alloc'd at
==2299==    at 0x48850E8: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-arm64-linux.so)
==2299==    by 0x248ABF: make_string_internal (parser.c:122)
==2299==    by 0x25FC0B: tokenize (parser.c:4076)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299==    by 0x25D0AF: tokenize (parser.c:3635)
==2299== 

infradig commented

Also...

?- sub_atom('1234567890123456789',B,L,A,X),write(X),nl,fail;true.
...
==1005261== Invalid read of size 8
==1005261==    at 0x2447DA: unshare_cell_ (internal.h:927)
==1005261==    by 0x245F6F: unshare_cells (parser.c:299)
==1005261==    by 0x245FAD: clear_clause (parser.c:305)
==1005261==    by 0x24620E: parser_destroy (parser.c:339)
==1005261==    by 0x280FDC: pl_eval (prolog.c:147)
==1005261==    by 0x119FC2: main (tpl.c:378)
==1005261==  Address 0x6703fb0 is 0 bytes inside a block of size 36 free'd
==1005261==    at 0x484988F: free (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1005261==    by 0x2841D2: unshare_cell_ (internal.h:928)
==1005261==    by 0x28CBB3: query_destroy (query.c:1808)
==1005261==    by 0x25E744: run (parser.c:4174)
==1005261==    by 0x280F99: pl_eval (prolog.c:145)
==1005261==    by 0x119FC2: main (tpl.c:378)
==1005261==  Block was alloc'd at
==1005261==    at 0x4846828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1005261==    by 0x2456ED: make_string_internal (parser.c:122)
==1005261==    by 0x25DAE5: tokenize (parser.c:4076)
==1005261==    by 0x25AC2A: tokenize (parser.c:3635)
==1005261==    by 0x25E1AE: run (parser.c:4114)
==1005261==    by 0x280F99: pl_eval (prolog.c:145)
==1005261==    by 0x119FC2: main (tpl.c:378)
==1005261== 
   true.
?- 
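
Reading the traces above: the block is allocated while tokenizing, freed through unshare_cell_ during query_destroy, and then read again by unshare_cell_ during clear_clause / parser_destroy. One plausible reading is that two owners release a shared string but only one reference was ever counted. The following is a minimal C sketch of that ownership pattern done safely. It is not Trealla's code: `rc_string`, `rc_string_new`, `rc_string_share`, `rc_string_unshare`, `live_strings` and `demo_shared_string` are hypothetical names made up for illustration.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical reference-counted string; NOT Trealla's actual cell type. */
typedef struct {
    int refcnt;     /* number of live owners */
    size_t len;     /* length without the terminating NUL */
    char data[];    /* flexible array member holding the bytes */
} rc_string;

static int live_strings = 0;    /* bookkeeping so the demo can check for leaks */

rc_string *rc_string_new(const char *src)
{
    size_t len = strlen(src);
    rc_string *s = malloc(sizeof *s + len + 1);
    if (!s) return NULL;
    s->refcnt = 1;              /* the creator holds the first reference */
    s->len = len;
    memcpy(s->data, src, len + 1);
    live_strings++;
    return s;
}

/* Every additional owner must take its own reference. */
rc_string *rc_string_share(rc_string *s)
{
    s->refcnt++;
    return s;
}

/* Drop one reference; the block is freed only when the last owner lets go. */
void rc_string_unshare(rc_string *s)
{
    assert(s->refcnt > 0);      /* catches an unbalanced release in debug builds */
    if (--s->refcnt == 0) {
        live_strings--;
        free(s);
    }
}

/* Two owners (stand-ins for a parser and a query) release the same string. */
int demo_shared_string(void)
{
    rc_string *s = rc_string_new("1234567890123456789");
    if (!s) return -1;

    rc_string *parser_ref = s;                  /* parser keeps the initial reference */
    rc_string *query_ref = rc_string_share(s);  /* query takes its own: refcnt == 2 */

    rc_string_unshare(query_ref);               /* "query_destroy": 2 -> 1, still alive */
    int after_query = parser_ref->refcnt;       /* safe read, block not yet freed */
    rc_string_unshare(parser_ref);              /* "parser_destroy": 1 -> 0, freed once */

    return (after_query == 1 && live_strings == 0) ? 0 : 1;
}
```

If the `rc_string_share` call were omitted, the first release would free the block and the second would decrement a counter inside freed memory, which is the kind of access Memcheck reports as an invalid read inside a free'd block. Calling `demo_shared_string()` from a `main` and running it under valgrind should report no errors for the balanced version.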

guregu commented Nov 26, 2024

I found that turning on trace for the one I posted explodes partway through with some corrupted pointers, which might be a hint:

?- trace, test(X).
...
[0:0x6626000:1759:f14:fp15:cp3:sp493:hp241:tp9] CALL loader:strip_module(put_terminating(_1,from_to(inf,sup),fd_props([],[],[])),_487,_488)
==3137== Invalid read of size 8
==3137==    at 0x23F848: match_op_internal (module.c:1303)
==3137==    by 0x23FCE3: match_op (module.c:1384)
==3137==    by 0x271337: print_term_to_buf_ (print.c:996)
==3137==    by 0x27BE3B: print_term_to_buf (print.c:1491)
==3137==    by 0x27CB03: print_term_to_strbuf (print.c:1579)
==3137==    by 0x282AE3: trace_call (query.c:111)
==3137==    by 0x289D3F: start (query.c:1615)
==3137==    by 0x28A4BB: execute (query.c:1779)
==3137==    by 0x2605D3: run (parser.c:4161)
==3137==    by 0x27DD43: pl_eval (prolog.c:145)
==3137==    by 0x118D1F: main (tpl.c:378)
==3137==  Address 0x2068 is not stack'd, malloc'd or (recently) free'd

infradig added a commit that referenced this issue Nov 26, 2024

infradig commented

It seems that introduced a problem elsewhere.

infradig added a commit that referenced this issue Nov 26, 2024
infradig added a commit that referenced this issue Nov 26, 2024

infradig commented

I'm going with this fix. It just introduces a single small memory leak on initialization goals. I'll look into it again tomorrow.

guregu commented Nov 26, 2024

Thanks, it looks better now for sure. The trace issue still happens, but it may be unrelated.
