boomer selects suboptimal solution in simple 3-node problem #158

cmungall · 2021-02-05T17:35:40Z

For the text files, see #157.

Given:

Pr(A properSubClassOf C) = 0.99
Pr(A equiv B) = 0.95
Pr(B equiv C) = 0.95

(in each case, the only other possibility is siblingOf)

Note that each class is in a separate prefix space, so there is no penalty for equivalence between any pair of them.

Solutions:

1,2,3 : incoherent
1,2 : .99 * .95 * (1-.95) = 0.047
1,3 : .99 * .95 * (1-.95) = 0.047
2,3 : .95 * .95 * (1-0.99) = 0.009
1 : .99 * .05 * .05 = 0.002475
2 : .01 * .95 * .05 = 0.000475
3 : .01 * .95 * .05 = 0.000475
{} : .01 * .05 * .05 = 2.5e-05
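
The products above can be re-derived mechanically. The following is a minimal sketch, not boomer code: it multiplies p for each selected statement and 1 - p (the siblingOf alternative) for each omitted one. The names `p`, `joint` and `table` are made up for illustration, and the sketch does not test logical coherence, so it also prints a number for 1,2,3, which the list above marks as incoherent.

```python
from itertools import product

# Probability that each numbered statement holds; the alternative is siblingOf.
# 1: A properSubClassOf C, 2: A equiv B, 3: B equiv C
p = {1: 0.99, 2: 0.95, 3: 0.95}

def joint(selected):
    """Multiply p for selected statements and (1 - p) for the omitted ones."""
    result = 1.0
    for k, pk in p.items():
        result *= pk if k in selected else 1.0 - pk
    return result

# All 2^3 subsets of the three statements.
subsets = [
    frozenset(k for k, keep in zip(p, flags) if keep)
    for flags in product([True, False], repeat=len(p))
]
table = {s: joint(s) for s in subsets}

# Print from most to least probable (no coherence check is applied here).
for s, value in sorted(table.items(), key=lambda kv: -kv[1]):
    label = ",".join(map(str, sorted(s))) or "{}"
    print(f"{label:6s} {value:.6g}")
```

The value this gives for {1}, 0.002475, agrees with the "Most probable" figure in the boomer log quoted below.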

boomer generally selects {1}, depending on params, but never the optimal solution.

I am pretty sure I have not made a typo - I put each class in its own ID space, so it is not avoiding 2 or 3 (which would happen if A/B/C were in the same ID space)

boomer -p prefixes.yaml -w 100 -r 1000 -t ptable.tsv --ontology logical.omn 
...
2021.02.05 09:23:19:376 [zio-def...] [INFO ] org.monarchinitiative.boomer.Main.program:49 - Most probable: 0.0024750000000000015
...
$ more output.txt 
A:1 SiblingOf B:1               0.05
B:1 SiblingOf C:1               0.05
A:1 ProperSubClassOf C:1        (most probable) 0.99


cmungall · 2021-02-05T17:38:46Z

I can confirm it's not avoiding any collapses: if I reduce the ptable to omit 1, i.e.

A:1	B:1	0.0	0.0	0.95	0.05
B:1	C:1	0.0	0.0	0.95	0.05
A:1	C:1	0.99	0.0	0.01	0.0

then it correctly finds

B:1 EquivalentTo C:1    (most probable) 0.95
A:1 EquivalentTo B:1    (most probable) 0.95

balhoff · 2021-02-05T19:58:43Z

I think the issue here is the high number of "windows" requested (100). Input rows are sorted according to their best probability, then the list of rows is chunked into the given number of windows. Across each independent run, shuffling occurs within each window, but the windows stay in the same total order. So it will always first add A ProperSubClassOf C. If you use a window value of 1, the rows are completely randomized and it is able to find the best solution.

balhoff · 2021-02-05T22:42:48Z

See the logging at the beginning of a run (with 100 windows requested):

2021.02.05 14:32:54:070 [zio-def...] [INFO ] org.monarchinitiative.boomer.Boom.evaluate:30 - Bin size: 1; Most probable: 0.99
2021.02.05 14:32:54:091 [zio-def...] [INFO ] org.monarchinitiative.boomer.Boom.evaluate:30 - Bin size: 2; Most probable: 0.95
2021.02.05 14:32:54:095 [zio-def...] [INFO ] org.monarchinitiative.boomer.Boom.evaluate:33 - Max possible joint probability: -0.11263692462860261

The axioms in the first bin will always be added before proceeding to the next bin. Different runs will just shuffle the order of the two items in the second bin.
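
The effect of the binning can be illustrated with a small simulation. This is only a sketch under assumptions: the two bins are copied from the log above, `run_order` is a hypothetical helper rather than boomer's implementation, and the only behaviour modelled is "shuffle within a bin, keep the bins in order".

```python
import random

# Bins as reported in the log: one row with best probability 0.99,
# then two rows with best probability 0.95.
bins = [
    [("A:1", "ProperSubClassOf", "C:1")],
    [("A:1", "EquivalentTo", "B:1"), ("B:1", "EquivalentTo", "C:1")],
]

def run_order(bins, rng):
    """Shuffle the rows inside each bin, but keep the bins themselves in order."""
    order = []
    for b in bins:
        rows = list(b)
        rng.shuffle(rows)
        order.extend(rows)
    return order

# Which row gets added first across 200 simulated runs?
first_binned = {run_order(bins, random.Random(seed))[0] for seed in range(200)}

# With everything in a single bin (the "window value of 1" case), any row can come first.
single_bin = [[row for b in bins for row in b]]
first_single = {run_order(single_bin, random.Random(seed))[0] for seed in range(200)}

print(first_binned)
print(first_single)
```

In the binned case the first row is always A ProperSubClassOf C, whatever the seed; with a single bin the first row varies from run to run.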

cmungall · 2021-02-06T01:03:42Z

my ticket is in error... more later

balhoff · 2021-04-30T20:24:01Z

I think we cleared this up. "windows" may not be as obvious as they ought to be but I think the UI will continue to evolve.

cmungall · 2023-01-31T05:07:27Z

still an issue

A:1	B:1	0.0	0.0	0.95	0.05
B:1	C:1	0.0	0.0	0.95	0.05
A:1	C:1	0.99	0.0	0.01	0.0

running
boomer -t triangle.ptable.tsv -a triangle.owl -p prefixes.yaml -r 500 -w 1 -e 200 --output-internal-axioms true

yields

## SINGLETONS
Method: singletons
Score: -0.05129329438755058
Estimated probability: 1.0
Confidence: 1.0
Subsequent scores (max 10):

- [B:1](http://purl.obolibrary.org/obo/B_1) EquivalentTo [C:1](http://purl.obolibrary.org/obo/C_1)      (most probable) 0.95

and an incoherent output.ofn
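
As a sanity check on the numbers, the reported score and the "Max possible joint probability" logged earlier in this thread are both consistent with natural logarithms of products of the ptable probabilities. The sketch below only checks the arithmetic; whether boomer computes its scores exactly this way is an assumption.

```python
import math

reported_score = -0.05129329438755058  # "Score" in the output above
reported_max = -0.11263692462860261    # "Max possible joint probability" in the earlier log

# Score matches ln(0.95), i.e. the single listed axiom on its own.
score_matches = math.isclose(reported_score, math.log(0.95), rel_tol=1e-9)

# Max possible matches ln(0.99 * 0.95 * 0.95), the best value of every row.
max_matches = math.isclose(reported_max, math.log(0.99 * 0.95 * 0.95), rel_tol=1e-9)

print(score_matches, max_matches)
```

So the score is what one would get if only the B:1 EquivalentTo C:1 row contributed, although that reading of the output is a guess.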

balhoff mentioned this issue Feb 5, 2021

example where boomer fails #157

Closed

balhoff closed this as completed Apr 30, 2021

cmungall reopened this Jan 31, 2023
