Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor git-pull #3875

Draft
wants to merge 5 commits into
base: master
Choose a base branch
from
Draft

Conversation

raypappa
Copy link

@raypappa raypappa commented Dec 30, 2024

I noticed that git pull was error'ing out when trying to restart HA due to a lack of supervisor token. I opted to update the script to use bashio contexts and functions to smooth out the use of it.

Summary by CodeRabbit

  • Chores
    • Simplified Dockerfile by removing Home Assistant CLI setup.
    • Updated run.sh script to use the bashio library for improved configuration management.
    • Enhanced error handling and logging in the deployment script.
    • Improved maintainability and clarity in the Git synchronization logic.

There is a bug present causing git-pull to no longer be able to restart or validate the configuration after updating the git repo.

I suspect part of this is related bashio, possibly related to not using with-contenv.

In this commit/pr I'm making use of the addon support present in bashio to avoid direct use of jq, and api calls.

home-assistant#3714
…x a failure when dealing with checking if files matched
Copy link

@home-assistant home-assistant bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @raypappa

It seems you haven't yet signed a CLA. Please do so here.

Once you do that we will be able to review and accept this pull request.

Thanks!

@home-assistant home-assistant bot marked this pull request as draft December 30, 2024 03:07
@home-assistant
Copy link

Please take a look at the requested changes, and use the Ready for review button when you are done, thanks 👍

Learn more about our pull request process.

@raypappa raypappa marked this pull request as ready for review December 30, 2024 03:09
Copy link
Contributor

coderabbitai bot commented Dec 30, 2024

📝 Walkthrough

Walkthrough

The pull request involves modifications to the Dockerfile and run.sh script in the git_pull directory. The Dockerfile changes remove Home Assistant CLI-related setup, eliminating specific build arguments and CLI binary download commands. The run.sh script has been comprehensively refactored to use the bashio library, replacing jq configuration parsing with bashio::config calls, updating logging mechanisms, and improving error handling and control flow for Git repository synchronization.

Changes

File Change Summary
git_pull/Dockerfile - Removed BUILD_ARCH argument
- Removed CLI_VERSION argument
- Deleted Home Assistant CLI binary download command
git_pull/data/run.sh - Updated shebang to use bashio environment
- Replaced jq configuration parsing with bashio::config
- Updated logging to use bashio::log
- Refactored git-synchronize function with improved error handling
- Enhanced configuration change validation

Sequence Diagram

sequenceDiagram
    participant Script as run.sh
    participant Bashio as Bashio Library
    participant Git as Git Repository
    
    Script->>Bashio: Load configuration
    Bashio-->>Script: Configuration data
    
    Script->>Git: Check repository exists
    alt Repository does not exist
        Script->>Git: Clone repository
    else Repository exists
        Script->>Git: Synchronize changes
    end
    
    Script->>Bashio: Validate configuration
    alt Configuration valid
        Script->>Script: Restart if changes detected
    else Configuration invalid
        Script->>Bashio: Log error and exit
    end
Loading

Thank you for using CodeRabbit. We offer it for free to the OSS community and would appreciate your support in helping us grow. If you find it useful, would you consider giving us a shout-out on your favorite social media?

❤️ Share
🪧 Tips

Chat

There are 3 ways to chat with CodeRabbit:

  • Review comments: Directly reply to a review comment made by CodeRabbit. Example:
    • I pushed a fix in commit <commit_id>, please review it.
    • Generate unit testing code for this file.
    • Open a follow-up GitHub issue for this discussion.
  • Files and specific lines of code (under the "Files changed" tab): Tag @coderabbitai in a new review comment at the desired location with your query. Examples:
    • @coderabbitai generate unit testing code for this file.
    • @coderabbitai modularize this function.
  • PR comments: Tag @coderabbitai in a new PR comment to ask questions about the PR branch. For the best results, please provide a very specific query, as very limited context is provided in this mode. Examples:
    • @coderabbitai gather interesting stats about this repository and render them as a table. Additionally, render a pie chart showing the language distribution in the codebase.
    • @coderabbitai read src/utils.ts and generate unit testing code.
    • @coderabbitai read the files in the src/scheduler package and generate a class diagram using mermaid and a README in the markdown format.
    • @coderabbitai help me debug CodeRabbit configuration file.

Note: Be mindful of the bot's finite context window. It's strongly recommended to break down tasks such as reading entire modules into smaller chunks. For a focused discussion, use review comments to chat about specific files and their changes, instead of using the PR comments.

CodeRabbit Commands (Invoked using PR comments)

  • @coderabbitai pause to pause the reviews on a PR.
  • @coderabbitai resume to resume the paused reviews.
  • @coderabbitai review to trigger an incremental review. This is useful when automatic reviews are disabled for the repository.
  • @coderabbitai full review to do a full review from scratch and review all the files again.
  • @coderabbitai summary to regenerate the summary of the PR.
  • @coderabbitai generate docstrings to generate docstrings for this PR. (Beta)
  • @coderabbitai resolve resolve all the CodeRabbit review comments.
  • @coderabbitai configuration to show the current CodeRabbit configuration for the repository.
  • @coderabbitai help to get help.

Other keywords and placeholders

  • Add @coderabbitai ignore anywhere in the PR description to prevent this PR from being reviewed.
  • Add @coderabbitai summary to generate the high-level summary at a specific location in the PR description.
  • Add @coderabbitai or @coderabbitai title anywhere in the PR title to generate the title automatically.

CodeRabbit Configuration File (.coderabbit.yaml)

  • You can programmatically configure CodeRabbit by adding a .coderabbit.yaml file to the root of your repository.
  • Please see the configuration documentation for more information.
  • If your editor has YAML language server enabled, you can add the path at the top of this file to enable auto-completion and validation: # yaml-language-server: $schema=https://coderabbit.ai/integrations/schema.v2.json

Documentation and Community

  • Visit our Documentation for detailed information on how to use CodeRabbit.
  • Join our Discord Community to get help, request features, and share feedback.
  • Follow us on X/Twitter for updates and announcements.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🔭 Outside diff range comments (1)
git_pull/data/run.sh (1)

Add file permission restrictions to git credentials file

The verification shows that while the script properly sets secure permissions (600) for SSH-related files, it lacks similar protection for the git credentials file. The original review comment's concern is valid as the credentials file permissions are not explicitly set.

Recommended changes:

  • Add chmod 600 /tmp/git-credentials after storing the credentials to restrict read/write access to only the owner, similar to how SSH files are handled in the script.
🔗 Analysis chain

Line range hint 79-114: Review security implications of credential storage.

While storing credentials in /tmp/git-credentials is temporary, consider:

  1. The file permissions are not explicitly set
  2. Credentials are stored in plaintext

Consider adding file permission restrictions:

 # Use git commands to write the credentials to ~/.git-credentials
 bashio::log.info "[Info] Saving git credentials to /tmp/git-credentials"
 # shellcheck disable=SC2259
 git credential fill | git credential approve <<< "$cred_data"
+# Restrict file permissions
+chmod 600 /tmp/git-credentials

Let me verify the security implications by checking the codebase for any existing file permission handling.


Let me check if there are any file permission settings in the initialization or setup parts of the script.

🏁 Scripts executed

The following scripts were executed for the analysis:

Error: No shell scripts found


Script:

#!/bin/bash
# Check if there are any file permission settings for git-credentials
rg -A 2 -B 2 "git-credentials" 

# Look for chmod commands related to credentials
rg "chmod.*credentials"

# Check if there are any other credential helper configurations
rg "credential.helper"

Length of output: 1004


Script:

#!/bin/bash
# Check the entire run.sh file for any umask or permission related settings
rg -A 2 -B 2 "umask|chmod" git_pull/data/run.sh

# Check if there are any security-related initialization functions
ast-grep --pattern 'function init$$$' git_pull/data/run.sh

# Look for any file permission related settings in the script
rg "0[0-9]{3}" git_pull/data/run.sh

Length of output: 278

🧹 Nitpick comments (5)
git_pull/data/run.sh (5)

67-77: Enhance error handling in SSH connection check.

The SSH connection check could be more robust by:

  1. Adding explicit error message when SSH check fails
  2. Handling different SSH connection failure scenarios

Consider this improvement:

 if OUTPUT_CHECK=$(ssh -T -o "StrictHostKeyChecking=no" -o "BatchMode=yes" "$DOMAIN" 2>&1) || { [[ $DOMAIN = *"@github.com"* ]] && [[ $OUTPUT_CHECK = *"You've successfully authenticated"* ]]; }; then
     bashio::log.info "[Info] Valid SSH connection for $DOMAIN"
 else
-    bashio::log.warn "[Warn] No valid SSH connection for $DOMAIN"
+    bashio::log.warn "[Warn] SSH connection failed for $DOMAIN: $OUTPUT_CHECK"
     add-ssh-key
+    # Verify connection after adding key
+    if ! ssh -T -o "StrictHostKeyChecking=no" -o "BatchMode=yes" "$DOMAIN" 2>&1; then
+        bashio::log.error "[Error] SSH connection still failing after adding key"
+    fi
 fi

136-138: Optimize git fetch operation.

The current git fetch operation fetches all branches. For better performance, especially in repositories with many branches, consider fetching only the required branch.

-git fetch "$GIT_REMOTE" || bashio::exit.nok "[Error] Git fetch failed";
+git fetch "$GIT_REMOTE" "$GIT_BRANCH" || bashio::exit.nok "[Error] Git fetch failed";

191-192: Enhance changed files logging.

The current logging of changed files could be more readable for long lists of files.

-bashio::log.info "Changed Files: $CHANGED_FILES"
+bashio::log.info "Changed Files:"
+for file in $CHANGED_FILES; do
+    bashio::log.info "  - $file"
+done

Line range hint 235-246: Consider adding error recovery mechanism.

The main loop could benefit from error recovery logic to handle temporary failures.

 while true; do
+    # Add retry mechanism for temporary failures
+    RETRY_COUNT=0
+    MAX_RETRIES=3
+    while [ $RETRY_COUNT -lt $MAX_RETRIES ]; do
     check-ssh-key
     setup-user-password
-    if git-synchronize ; then
+        if git-synchronize ; then
         validate-config
+            break
+        else
+            RETRY_COUNT=$((RETRY_COUNT + 1))
+            bashio::log.warning "Attempt $RETRY_COUNT of $MAX_RETRIES failed"
+            sleep 10
+        fi
+    done
     fi
     # do we repeat?
     if [ ! "$REPEAT_ACTIVE" == "true" ]; then

Line range hint 1-246: Overall: Well-executed refactoring with good error handling.

The transition to bashio and improved error handling aligns well with the PR objectives. The script should now handle Home Assistant restarts properly. The code is more maintainable and better integrated with the Home Assistant ecosystem.

A few suggestions have been made for further improvements, but none are blocking issues.

Consider implementing the suggested improvements in a follow-up PR:

  1. Enhanced SSH error handling
  2. Secure credential storage
  3. Optimized git fetch
  4. Improved change logging
  5. Error recovery mechanism
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 91160e1 and 6ad3713.

📒 Files selected for processing (2)
  • git_pull/Dockerfile (0 hunks)
  • git_pull/data/run.sh (5 hunks)
💤 Files with no reviewable changes (1)
  • git_pull/Dockerfile
🧰 Additional context used
🪛 Shellcheck (0.10.0)
git_pull/data/run.sh

[error] 1-1: This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify.

(SC1008)

🔇 Additional comments (1)
git_pull/data/run.sh (1)

1-20: LGTM! Improved configuration management using bashio.

The transition from jq to bashio::config for configuration management is a good improvement. It provides better integration with Home Assistant's configuration system and more robust error handling.

🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify.

(SC1008)

… change suggested by the coderabbit to fetch the specific branch from git
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
git_pull/data/run.sh (1)

Line range hint 79-114: Enhance credential security

Storing credentials in plaintext in /tmp poses a security risk. Consider:

  1. Using memory-based storage (tmpfs)
  2. Implementing credential cleanup
+# Ensure /tmp/git-credentials is on tmpfs
+if ! mountpoint -q /tmp; then
+    bashio::log.warn "[Warn] /tmp is not a tmpfs mount point"
+fi

 git config --system credential.helper 'store --file=/tmp/git-credentials'
+
+# Schedule cleanup of credentials
+trap 'rm -f /tmp/git-credentials' EXIT
🧹 Nitpick comments (5)
git_pull/data/run.sh (5)

Line range hint 59-62: Improve error handling for file restoration

The silent failure of cp commands could lead to missing files without notification. Consider logging the status of these operations.

-cp "${BACKUP_LOCATION}" "!(*.yaml)" /config 2>/dev/null
+if ! cp "${BACKUP_LOCATION}" "!(*.yaml)" /config 2>/dev/null; then
+    bashio::log.info "[Info] No non-yaml files to restore from backup"
+fi

-cp "${BACKUP_LOCATION}/secrets.yaml" /config 2>/dev/null
+if ! cp "${BACKUP_LOCATION}/secrets.yaml" /config 2>/dev/null; then
+    bashio::log.info "[Info] No secrets.yaml to restore from backup"
+fi
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 43-43: Couldn't parse this brace group. Fix to allow more checks.

(SC1073)


65-77: Enhance domain extraction robustness

The current domain extraction might not handle all Git URL formats correctly. Consider using git-url-parse or a more robust parsing approach.

-IFS=':' read -ra GIT_URL_PARTS <<< "$REPOSITORY"
-DOMAIN="${GIT_URL_PARTS[0]}"
+DOMAIN=$(echo "$REPOSITORY" | sed -E 's#^(git@|https?://)([^/:]+).*$#\2#')

129-131: Remove unreachable code

The return statement after bashio::exit.nok is unreachable.

-        bashio::exit.nok "[Error] git origin does not match $REPOSITORY!";
-        return
+        bashio::exit.nok "[Error] git origin does not match $REPOSITORY!"

200-206: Improve error handling in file matching

Using set +e/-e for grep could be replaced with a more idiomatic approach.

-                    set +e
-                    restart_required_file=$(echo "${changed_file}" | grep "^${restart_ignored_file}")
-                    set -e
+                    restart_required_file=$(echo "${changed_file}" | grep -E "^${restart_ignored_file}" || true)

Line range hint 232-246: Add validation for repeat interval

Consider validating the repeat interval to ensure it's a positive number.

+# Validate repeat interval
+if [ "$REPEAT_ACTIVE" == "true" ] && ! [[ "$REPEAT_INTERVAL" =~ ^[0-9]+$ ]]; then
+    bashio::exit.nok "[Error] Repeat interval must be a positive number"
+fi

 while true; do
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 6ad3713 and 2378c9f.

📒 Files selected for processing (1)
  • git_pull/data/run.sh (5 hunks)
🧰 Additional context used
🪛 Shellcheck (0.10.0)
git_pull/data/run.sh

[error] 1-1: This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify.

(SC1008)

🔇 Additional comments (3)
git_pull/data/run.sh (3)

1-20: Well-structured configuration setup using bashio!

The transition to bashio::config from jq improves reliability and maintainability. The configuration setup properly handles all necessary parameters including arrays.

🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify.

(SC1008)


136-170: Well-implemented git synchronization logic!

The git synchronization implementation is robust with proper error handling and support for different sync strategies (pull/reset).


Line range hint 24-41: Consider security implications of disabled StrictHostKeyChecking

While disabling StrictHostKeyChecking makes deployment easier, it bypasses SSH host verification which could expose the system to MITM attacks. Consider:

  1. Limiting this to specific known hosts
  2. Adding key validation before writing to the file
🧰 Tools
🪛 Shellcheck (0.10.0)

[error] 1-1: This shebang was unrecognized. ShellCheck only supports sh/bash/dash/ksh/'busybox sh'. Add a 'shell' directive to specify.

(SC1008)

git_pull/data/run.sh Outdated Show resolved Hide resolved
@raypappa
Copy link
Author

raypappa commented Jan 5, 2025

I pushed a fix in commit 82be79c and e84beb4, please review it.

@raypappa
Copy link
Author

@coderabbitai review

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🔭 Outside diff range comments (1)
git_pull/data/run.sh (1)

Line range hint 79-115: Improve credential security.

Storing credentials in plaintext in /tmp/git-credentials poses a security risk.

Consider these improvements:

  1. Use memory-based storage (tmpfs) for credentials
  2. Add cleanup on script exit
  3. Set restrictive file permissions
+# At the start of the script
+trap 'rm -f /dev/shm/git-credentials' EXIT
+
 function setup-user-password {
 if [ -n "$DEPLOYMENT_USER" ]; then
     cd /config || return
     bashio::log.info "[Info] setting up credential.helper for user: ${DEPLOYMENT_USER}"
-    git config --system credential.helper 'store --file=/tmp/git-credentials'
+    git config --system credential.helper 'store --file=/dev/shm/git-credentials'
+    touch /dev/shm/git-credentials
+    chmod 600 /dev/shm/git-credentials
🧹 Nitpick comments (2)
git_pull/data/run.sh (2)

201-207: Improve error handling in file pattern matching.

The use of set +e and set -e around grep commands can be replaced with a more idiomatic approach.

Consider this improvement:

-                    set +e
-                    restart_required_file=$(echo "${changed_file}" | grep "^${restart_ignored_file}")
-                    set -e
+                    restart_required_file=$(echo "${changed_file}" | grep -E "^${restart_ignored_file}" || true)

Line range hint 1-246: Consider implementing logging and monitoring.

While the script is well-structured and properly uses bashio, consider these operational improvements:

  1. Add structured logging for better observability
  2. Implement metrics collection for sync failures and restarts
  3. Add health checks for the Git service

Example structured logging:

+function log_metric() {
+    local metric=$1
+    local value=$2
+    bashio::log.info "{\"metric\": \"$metric\", \"value\": $value, \"timestamp\": \"$(date -u +"%Y-%m-%dT%H:%M:%SZ")\"}"
+}
+
 function git-synchronize {
     # is /config a local git repo?
     if ! git rev-parse --is-inside-work-tree &>/dev/null; then
         bashio::log.warn "[Warn] Git repository doesn't exist"
         git-clone
+        log_metric "git_clone_count" 1
         return
     fi
📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 2378c9f and e84beb4.

📒 Files selected for processing (1)
  • git_pull/data/run.sh (5 hunks)
🔇 Additional comments (3)
git_pull/data/run.sh (3)

1-21: LGTM! Well-structured configuration setup.

The script properly uses bashio for configuration management and follows shell scripting best practices with appropriate shellcheck directives.


116-171: LGTM! Robust Git synchronization implementation.

The function properly handles:

  • Repository initialization
  • Remote URL validation
  • Branch switching
  • Different sync modes (pull/reset)
    with appropriate error handling using bashio.

Line range hint 234-246: LGTM! Well-structured main program loop.

The main program effectively:

  • Handles Git synchronization and config validation
  • Implements repeat functionality with proper interval handling
  • Uses appropriate error handling with bashio

Comment on lines 28 to 31

(
echo "Host *"
echo " StrictHostKeyChecking no"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🛠️ Refactor suggestion

Consider more secure SSH configuration.

Disabling StrictHostKeyChecking globally is a security risk as it makes the system vulnerable to man-in-the-middle attacks.

Consider this more secure approach:

-    (
-        echo "Host *"
-        echo "    StrictHostKeyChecking no"
-    ) > ~/.ssh/config
+    # Extract domain from repository URL
+    IFS=':' read -ra GIT_URL_PARTS <<< "$REPOSITORY"
+    DOMAIN="${GIT_URL_PARTS[0]#*@}"
+    (
+        echo "Host $DOMAIN"
+        echo "    StrictHostKeyChecking accept-new"
+    ) > ~/.ssh/config

Committable suggestion skipped: line range outside the PR's diff.

Copy link
Member

@agners agners left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like a very nice refactor, thanks for your PR!

Since we no longer use the CLI, please also remove CLI_VERSION from build.yaml.

And finally, to trigger an updated, please increment the version number in config.yaml and add a suitable CHANGELOG.md entry. I'd go for 7.15.0 or maybe even 8.0.0.

If you are at it, you may also consider bumping to the latest Alpine base image (3.21 at this time, see https://github.com/home-assistant/docker-base).

@home-assistant home-assistant bot marked this pull request as draft January 14, 2025 11:19
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants