[Bug]: Cosmovisor Upgrade Bug in v1.7.0 #22731

Open
1 task done
lucas2brh opened this issue Dec 3, 2024 · 7 comments
Labels: C:Cosmovisor (Issues and PR related to Cosmovisor), T:Bug

Comments

@lucas2brh

lucas2brh commented Dec 3, 2024

Is there an existing issue for this?

  • I have searched the existing issues

What happened?

Summary

Cosmovisor appears to have a bug in version v1.7.0. When using the add-upgrade command to schedule an upgrade, the binary is not replaced after reaching the expected height. This issue does not occur in version v1.6.0.

Observations and Configuration

  1. Upgrade Info
    The upgrade target is v0.12.1, and the expected height for the upgrade is 322000.
    {"name":"v0.12.1","time":"0001-01-01T00:00:00Z","height":322000}
  2. Cosmovisor Directory Structure
    The directory structure confirms that the upgrade binary for v0.12.1 is properly prepared:
    .story/story/cosmovisor/
    ├── upgrades/
    │   └── v0.12.1/
    │       └── bin/
    │           └── story
    └── current -> upgrades/v0.12.0.test3
  3. Prepared Binary
    The binary for the upgrade is located in the correct path and has appropriate permissions:
    .story/story/cosmovisor/upgrades/v0.12.1/bin/story
    -rwxr-xr-x. 1 ec2-user ec2-user 114919456 Dec 3 12:11 story
  4. Cosmovisor Version
    The active version of Cosmovisor is v1.7.0:
    cosmovisor version: v1.7.0
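
The checks listed above (plan name, directory layout, binary permissions) can be scripted before the upgrade height is reached. The following is a minimal sketch, not part of Cosmovisor: the function name `check_upgrade_layout` and its parameters are hypothetical, and it assumes the `upgrades/<name>/bin/<daemon>` layout shown in this report.

```python
import json
import os
from pathlib import Path


def check_upgrade_layout(cosmovisor_root, upgrade_info_path, daemon_name):
    """Return a list of problems; an empty list means the layout looks ready.

    Hypothetical helper: it only inspects files on disk and does not
    call Cosmovisor itself.
    """
    problems = []

    # Read the plan name from upgrade-info.json (e.g. "v0.12.1").
    info = json.loads(Path(upgrade_info_path).read_text())
    name = info.get("name", "")
    if not name:
        problems.append("upgrade-info.json has no upgrade name")
        return problems

    # Assumed layout: <root>/upgrades/<name>/bin/<daemon_name>
    binary = Path(cosmovisor_root) / "upgrades" / name / "bin" / daemon_name
    if not binary.is_file():
        problems.append(f"missing binary: {binary}")
    elif not os.access(binary, os.X_OK):
        problems.append(f"binary is not executable: {binary}")

    return problems
```

For the setup in this report, the call would be roughly `check_upgrade_layout(".story/story/cosmovisor", ".story/story/data/upgrade-info.json", "story")`. An empty result only rules out a misplaced or non-executable binary; it does not explain the failure reported here, where the layout was already correct.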

Expected Behavior

At the configured upgrade height (322000), Cosmovisor should replace the binary and start using the v0.12.1 binary.

Actual Behavior

The binary remains unchanged, and the process continues to use the v0.12.0.test3 binary.
Error logs:

Dec 03 12:47:26 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[1679574]: 24-12-03 12:47:26.908 INFO 👾 ABCI call: FinalizeBlock              height=322000 proposer=fe90140
Dec 03 12:47:26 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[1679574]: 24-12-03 12:47:26.908 ERRO UPGRADE "v0.12.1" NEEDED at height: 322000:  module=x/upgrade
Dec 03 12:47:26 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[1679574]: 24-12-03 12:47:26.908 ERRO Finalize req failed [BUG]                height=322000 err="module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 app.go:161 baseapp.go:706 abci.go:756 abci.go:884 cmt_abci.go:44 abci.go:99 local_client.go:185 app_conn.go:104 execution.go:224 execution.go:219 replay.go:534 replay.go:433 replay.go:274 setup.go:182 node.go:359 node.go:279 start.go:251 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
Dec 03 12:47:26 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[1679574]: 24-12-03 12:47:26.909 ERRO error in proxyAppConn.FinalizeBlock      module=consensus err="module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 app.go:161 baseapp.go:706 abci.go:756 abci.go:884 cmt_abci.go:44 abci.go:99 local_client.go:185 app_conn.go:104 execution.go:224 execution.go:219 replay.go:534 replay.go:433 replay.go:274 setup.go:182 node.go:359 node.go:279 start.go:251 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
Dec 03 12:47:26 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[1679574]: 24-12-03 12:47:26.909 ERRO !! Fatal error occurred, app died️ unexpectedly !! err="create comet node: create node: error during handshake: error on replay: module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 start.go:261 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
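
When triaging logs like those above, it can help to extract the plan name and height from the `UPGRADE "..." NEEDED at height` message and compare them with the scheduled upgrade. This is a small sketch under the assumption that the message format matches the lines quoted in this report; the helper name `parse_upgrade_needed` is hypothetical.

```python
import re

# Matches both the plain form (UPGRADE "v0.12.1" NEEDED ...) and the
# escaped form that appears inside err="..." fields (UPGRADE \"v0.12.1\" ...).
UPGRADE_NEEDED = re.compile(
    r'UPGRADE \\?"(?P<name>[^"\\]+)\\?" NEEDED at height: (?P<height>\d+)'
)


def parse_upgrade_needed(line):
    """Return (name, height) if the line reports a needed upgrade, else None."""
    match = UPGRADE_NEEDED.search(line)
    if match is None:
        return None
    return match.group("name"), int(match.group("height"))
```

Applied to the `module=x/upgrade` line above, the helper yields the name `v0.12.1` and the height `322000`, which match the scheduled plan.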

Additional Notes

  • The issue is not reproducible on v1.6.0.
  • The issue occurs when the upgrade height is far ahead of the current block, taking approximately 2 hours to reach.
  • Cosmovisor works as expected when the upgrade is scheduled at a relatively near height, such as 40,000 blocks.

Environment

  • Cosmovisor Version: v1.7.0

How to reproduce?

  1. Use Cosmovisor v1.7.0 with the add-upgrade command to schedule an upgrade.
  2. Prepare the directory structure and binaries as shown above.
  3. Set the upgrade height to a testable block.
    • Note: The issue occurs when the upgrade height is far ahead of the current block, taking approximately 2 hours to reach.
  4. Observe the behavior after the block is reached.
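
To pick an upgrade height for step 3 that is roughly 2 hours away, the target can be derived from the current height and an average block time. This is a sketch only: the function name `height_for_delay` is hypothetical, and the average block time is an assumption that has to be measured on the network in question; the report does not state it.

```python
import math


def height_for_delay(current_height, delay_seconds, avg_block_seconds):
    """Return the first height expected to be reached after delay_seconds.

    avg_block_seconds is an assumed average; real block times vary,
    so the result is only an estimate.
    """
    if avg_block_seconds <= 0:
        raise ValueError("avg_block_seconds must be positive")
    blocks_ahead = math.ceil(delay_seconds / avg_block_seconds)
    return current_height + blocks_ahead
```

For example, with an assumed 2-second block time, a 2-hour delay (7200 seconds) corresponds to 3600 blocks past the current height.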

Please investigate this issue and provide a fix or workaround. Let me know if further information is required.

@lucas2brh lucas2brh added the T:Bug label Dec 3, 2024
@github-project-automation github-project-automation bot moved this to 📋 Backlog in Cosmos-SDK Dec 3, 2024
@julienrbrt
Member

@akhilkumarpilli are you able to check this?

@julienrbrt julienrbrt added the C:Cosmovisor Issues and PR related to Cosmovisor label Dec 3, 2024
@dylanschultzie

Can confirm this bug.

@dasanchez

Hi! This may also happen without using an add-upgrade command (i.e. the upgrade is governance-gated).

@akhilkumarpilli akhilkumarpilli moved this from 📋 Backlog to 🤸‍♂️ In Progress in Cosmos-SDK Dec 12, 2024
@akhilkumarpilli
Contributor

Hey @lucas2brh, we attempted to reproduce the issue with simapp by testing an upgrade scheduled for a block that takes 5 hours to reach. However, cosmovisor functioned as expected and the binary was successfully replaced after the upgrade. Could you please share additional logs or details that might help us investigate further? Thanks!

@akhilkumarpilli akhilkumarpilli moved this from 🤸‍♂️ In Progress to 👀 Waiting / In review in Cosmos-SDK Dec 22, 2024
@lucas2brh
Author

lucas2brh commented Dec 30, 2024

Hi @akhilkumarpilli, thanks for your response.

I reproduced the issue with our Story client. FYI:

  1. Spin up the node with geth-0.10.1 and story-0.12.0.
  2. Schedule an upgrade with:
    {
      "name": "v0.12.1",
      "time": "0001-01-01T00:00:00Z",
      "height": 322000
    }
  3. Wait for the height to be reached, and you’ll encounter the error again.
Dec 26 03:00:25 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[2235044]: 24-12-26 03:00:25.841 ERRO UPGRADE "v0.12.1" NEEDED at height: 322000:  module=x/upgrade
Dec 26 03:00:25 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[2235044]: 24-12-26 03:00:25.841 ERRO Finalize req failed [BUG]                height=322000 err="module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 app.go:161 baseapp.go:706 abci.go:756 abci.go:884 cmt_abci.go:44 abci.go:99 local_client.go:185 app_conn.go:104 execution.go:224 execution.go:219 replay.go:534 replay.go:433 replay.go:274 setup.go:182 node.go:359 node.go:279 start.go:251 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
Dec 26 03:00:25 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[2235044]: 24-12-26 03:00:25.841 ERRO error in proxyAppConn.FinalizeBlock      module=consensus err="module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 app.go:161 baseapp.go:706 abci.go:756 abci.go:884 cmt_abci.go:44 abci.go:99 local_client.go:185 app_conn.go:104 execution.go:224 execution.go:219 replay.go:534 replay.go:433 replay.go:274 setup.go:182 node.go:359 node.go:279 start.go:251 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
Dec 26 03:00:25 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[2235044]: 24-12-26 03:00:25.841 ERRO !! Fatal error occurred, app died️ unexpectedly !! err="create comet node: create node: error during handshake: error on replay: module manager preblocker: UPGRADE \"v0.12.1\" NEEDED at height: 322000: " stacktrace="[errors.go:39 start.go:261 start.go:133 start.go:56 cmd.go:56 command.go:985 command.go:1117 command.go:1041 command.go:1034 cmd.go:34 main.go:10 proc.go:272 asm_amd64.s:1700]"
Dec 26 03:00:25 ip-172-31-1-73.us-west-1.compute.internal cosmovisor[2235036]: Error: exit status 1

Related

[ec2-user@ip-172-31-1-73 ~]$ cosmovisor version
cosmovisor version: v1.7.0
5:44AM INF running app args=["version"] module=cosmovisor path=/home/ec2-user/.story/story/cosmovisor/genesis/bin/story
5:44AM INF starting the batch watcher loop module=cosmovisor
Version       v0.12.0-stable
Git Commit    fcfb283
Git Timestamp 2024-10-25T04:32:37Z
[ec2-user@ip-172-31-1-73 ~]$ go version
go version go1.23.4 linux/amd64
[ec2-user@ip-172-31-1-73 ~]$ cat ~/.story/story/data/upgrade-info.json
{"name":"v0.12.1","time":"0001-01-01T00:00:00Z","height":322000}
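
The output above shows `upgrade-info.json` naming `v0.12.1`, while the directory listing earlier in this report shows `current` pointing at `upgrades/v0.12.0.test3`. A quick way to detect that mismatch is to compare the plan name with the target of the `current` symlink. The sketch below assumes that `current` points to `upgrades/<name>` after a successful switch; the helper name `current_matches_plan` is hypothetical.

```python
import json
import os
from pathlib import Path


def current_matches_plan(cosmovisor_root, upgrade_info_path):
    """Return True if the `current` symlink targets the planned upgrade name."""
    name = json.loads(Path(upgrade_info_path).read_text())["name"]

    current = Path(cosmovisor_root) / "current"
    if not current.is_symlink():
        return False

    # The link target may be relative (e.g. "upgrades/v0.12.1");
    # only its final path component is compared with the plan name.
    target = os.readlink(current)
    return Path(target).name == name
```

With the state described in this report, the check would return `False`, because the link still ends in `v0.12.0.test3` rather than `v0.12.1`.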

@akhilkumarpilli
Contributor

@lucas2brh That's strange. We will try to reproduce the issue with the Story client.

@akhilkumarpilli akhilkumarpilli moved this from 👀 Waiting / In review to 🤸‍♂️ In Progress in Cosmos-SDK Dec 30, 2024
@akhilkumarpilli
Contributor

@lucas2brh, would you be able to share any documentation or steps on scheduling an upgrade on-chain using story and geth clients?

@akhilkumarpilli akhilkumarpilli moved this from 🤸‍♂️ In Progress to 👀 Waiting / In review in Cosmos-SDK Jan 3, 2025
@akhilkumarpilli akhilkumarpilli removed their assignment Jan 6, 2025
Projects
Status: 👀 Waiting / In review

5 participants