Add support for arm64 #2589

Open
wants to merge 22 commits into base: main

Conversation

mathieu-benoit
Contributor

@mathieu-benoit mathieu-benoit commented Jun 15, 2024

This PR has not yet fully tested arm64 support, but it makes progress toward that goal.

Here is what this PR covers:

Here is what this PR is waiting for before moving from "Draft" to an official "Ready for review" state:

Here is what this PR does not cover and has not yet tested:

JFYI, unrelated to arm64 support, here is what else was done along the way:

  • adservice: eclipse-temurin:21.0.4_7 --> eclipse-temurin:21.0.5_11
  • cartservice: .NET 9.0.0 --> 9.0.1
  • checkoutservice, frontend, productcatalogservice, shippingservice: golang:1.23.2 --> golang:1.23.4
  • currencyservice, paymentservice: node:20.17.0-alpine --> node:20.18.1-alpine
  • emailservice: python:3.12.7-alpine --> python:3.12.8-alpine
  • loadgenerator: python:3.12.8-slim --> python:3.12.8-alpine
  • shoppingassistantservice: python:3.12.6-slim --> python:3.12.8-slim

@mathieu-benoit mathieu-benoit requested review from yoshi-approver and a team as code owners June 15, 2024 18:17
@mathieu-benoit mathieu-benoit marked this pull request as draft June 15, 2024 18:48
@bourgeoisor
Member

Hi @mathieu-benoit ! What do you think is the effort needed to make this work?

From comments here it looks like there's still a ton of want for this feature: #622

@mathieu-benoit
Contributor Author

mathieu-benoit commented Dec 30, 2024

> Hi @mathieu-benoit ! What do you think is the effort needed to make this work?
>
> From comments here it looks like there's still a ton of want for this feature: #622

@bourgeoisor and @NimJay, please could you approve the CI tests on this PR? I'd like to see what my latest changes will produce. Thanks!

@bourgeoisor bourgeoisor changed the title arm64 Add support for arm64 Jan 13, 2025
@bourgeoisor bourgeoisor mentioned this pull request Jan 13, 2025
@bourgeoisor
Member

@mathieu-benoit tests are passing. Thank you for your patience; getting back from the holidays! Feel free to DM me if I don't respond within a few days.

@mathieu-benoit mathieu-benoit marked this pull request as ready for review January 14, 2025 00:13
@mathieu-benoit
Contributor Author

mathieu-benoit commented Jan 14, 2025

@bourgeoisor, thanks!

I think this is now ready for your review.

Again, this PR does not publish the arm64 images, but it is at least supposed to make the containers able to run locally on an arm64 platform. If someone can test this locally on macOS, I think that would be very beneficial.

@vlsi

vlsi commented Jan 15, 2025

@mathieu-benoit, I've tested the branch on an Apple M1. The branch seems to fix the build issues. skaffold run executes, and the demo app seems to be running with OrbStack k8s:

[Screenshot: microservices-demo frontend page]

The resulting images seem to be ARM64 as well.

@mathieu-benoit
Contributor Author

Thanks @vlsi, for testing and following up, that's good news!

@bourgeoisor
Member

Hi @mathieu-benoit! I just tested this by running skaffold dev from my ARM MacBook, deploying to C4A nodes on GKE (ARM-based), and the two Python services fail with a "wrong architecture"-like error. Any idea what could have gone wrong? The Google Cloud project did not have any prior Online Boutique images.

~/w/microservices-demo-mathieu (arm64✔=) k get pods
NAME                                     READY   STATUS             RESTARTS        AGE
adservice-55d4f88848-z622l               1/1     Running            0               14m
cartservice-54d4fb9d68-hmdw5             1/1     Running            0               14m
checkoutservice-76bd6c7dd8-z76n2         1/1     Running            0               14m
currencyservice-5dffc78857-n8thw         1/1     Running            0               14m
emailservice-7d4576ccd5-jx2xf            0/1     CrashLoopBackOff   7 (3m21s ago)   14m
frontend-5b7fdc5b7d-4g5jh                1/1     Running            0               14m
paymentservice-cb89cfbbc-tdtw2           1/1     Running            0               14m
productcatalogservice-74899b6887-p4ptk   1/1     Running            0               14m
recommendationservice-58d97df5b8-fr54s   0/1     CrashLoopBackOff   7 (3m22s ago)   14m
redis-cart-7756c55f85-8242m              1/1     Running            0               14m
shippingservice-7dd5b87767-9dmhz         1/1     Running            0               14m

~/w/microservices-demo-mathieu (arm64✔=) k logs emailservice-7d4576ccd5-jx2xf
exec /usr/local/bin/python: exec format error

~/w/microservices-demo-mathieu (arm64✔=) k logs recommendationservice-58d97df5b8-fr54s
exec /usr/local/bin/python: exec format error
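
For anyone reproducing this: an "exec format error" generally means the binary inside the image targets a different CPU architecture than the node running it. Below is a minimal shell sketch for comparing the two sides. `to_oci_arch` is a hypothetical helper name, and the `kubectl` and `docker` lines are left as comments because they assume access to the target cluster and a locally available image (`IMAGE` is a placeholder).

```shell
# Map a machine name, as printed by `uname -m`, to the architecture name
# used by Docker and Kubernetes. `to_oci_arch` is a hypothetical helper.
to_oci_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo unknown ;;
  esac
}

# Architecture of the machine running this script.
to_oci_arch "$(uname -m)"

# Assumes kubectl points at the target cluster: architecture reported by each node.
# kubectl get nodes -o jsonpath='{.items[*].status.nodeInfo.architecture}'

# Assumes IMAGE names an image present in the local Docker daemon.
# docker image inspect --format '{{.Architecture}}' "$IMAGE"
```

If the node reports arm64 while the image reports amd64, the mismatch would explain the crash loop.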

@mathieu-benoit
Contributor Author

mathieu-benoit commented Jan 15, 2025

@vlsi, I'm guessing it's the same for you, with recommendationservice and emailservice in the same CrashLoopBackOff state?

@bourgeoisor, could you please deploy the loadgenerator too, to see whether it hits the same error? It's another Python app but without any apk step in its Dockerfile, so this would help narrow down whether the apk step is the cause.

@vlsi

vlsi commented Jan 15, 2025

Looks like the services are fine for me:

NAME                                     READY   STATUS    RESTARTS        AGE
adservice-5b779c9565-jrd7f               1/1     Running   0               8h
cartservice-6cb7856b7f-6b2hv             1/1     Running   0               8h
checkoutservice-55bfc5bcfb-6nqb5         1/1     Running   0               8h
currencyservice-5d5b997c67-g25jd         1/1     Running   2 (52m ago)     8h
emailservice-7b94467544-kj5n8            1/1     Running   0               8h
frontend-ffb8f574-dsps8                  1/1     Running   0               8h
loadgenerator-6c95876f4d-jwjk6           1/1     Running   0               8h
paymentservice-5cb7f77ff7-fkl88          1/1     Running   1 (4h22m ago)   8h
productcatalogservice-74cdb9fb45-cvz8n   1/1     Running   0               8h
recommendationservice-6d9874d78-sdvsf    1/1     Running   0               8h
redis-cart-59cd576876-n7hqk              1/1     Running   0               8h
shippingservice-74f86dd47f-xrs8g         1/1     Running   0               8h

Here's the output from the loadgenerator pod:

Type     Name                                                                          # reqs      # fails |    Avg     Min     Max    Med |   req/s  failures/s
--------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|-----------
GET      /                                                                               3161     6(0.19%) |     29       0    2410     25 |    0.00        0.00
GET      /cart                                                                           9221     9(0.10%) |     30       1   11247     23 |    0.20        0.00
POST     /cart                                                                           9099     3(0.03%) |     18       2   11826     13 |    0.30        0.00
POST     /cart/checkout                                                                  2992    16(0.53%) |     17       3    2504     13 |    0.20        0.00
GET      /product/0PUK6V6EV0                                                             4358     4(0.09%) |     27       4    7486     22 |    0.00        0.00
GET      /product/1YMWWN1N4O                                                             4439     3(0.07%) |     24       4    5573     22 |    0.10        0.00
GET      /product/2ZYFJ3GM2N                                                             4534     2(0.04%) |     32       4    8596     22 |    0.10        0.00
GET      /product/66VCHSJNUP                                                             4473     6(0.13%) |     27       1    5691     21 |    0.40        0.00
GET      /product/6E92ZMYYFZ                                                             4448     3(0.07%) |     29       4    8731     22 |    0.10        0.00
GET      /product/9SIQT8TOJO                                                             4386     1(0.02%) |     28       4    7325     22 |    0.50        0.00
GET      /product/L9ECAV7KIM                                                             4419     2(0.05%) |     25       1    6295     22 |    0.30        0.00
GET      /product/LS4PSXUNUM                                                             4427     3(0.07%) |     24       5    4947     21 |    0.20        0.00
GET      /product/OLJCESPC7Z                                                             4275     3(0.07%) |     29       2    9290     22 |    0.10        0.00
POST     /setCurrency                                                                    6178     6(0.10%) |     41       3   20729     26 |    0.20        0.00
--------|----------------------------------------------------------------------------|-------|-------------|-------|-------|-------|-------|--------|-----------
         Aggregated                                                                     70410    67(0.10%) |     27       0   20729     21 |    2.70        0.00

Previously, I was trying to adjust the SHAs since they referred to amd64-specific digests.

For instance, see https://hub.docker.com/layers/library/python/3.12.7-alpine/images/sha256-b83d5ec7274bee17d2f4bd0bfbb082f156241e4513f0a37c70500e1763b1d90d. The images are multi-arch ones, so we should use the multi-arch SHA rather than the amd64 SHA:

-FROM python:3.12.7-alpine@sha256:b83d5ec7274bee17d2f4bd0bfbb082f156241e4513f0a37c70500e1763b1d90d AS base
+FROM python:3.12.7-alpine@sha256:5049c050bdc68575a10bcb1885baa0689b6c15152d8a56a7e399fb49f783bf98 AS base

However, I reverted the changes when trying the PR branch.
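
One way to tell the two kinds of digest apart is to look at the manifest's media type: a multi-arch digest resolves to an image index (manifest list), while a platform-specific digest resolves to a single image manifest. The sketch below is a rough check along those lines, not a definitive tool. `is_multiarch_manifest` is a hypothetical helper name, it assumes the raw manifest carries a top-level `mediaType` field, and the `docker buildx imagetools inspect --raw` line is commented out because it assumes the buildx plugin and registry access (`<digest>` is a placeholder).

```shell
# Succeeds if the JSON manifest on stdin is a multi-arch index (manifest list).
# `is_multiarch_manifest` is a hypothetical helper name.
is_multiarch_manifest() {
  grep -qF \
    -e 'application/vnd.oci.image.index.v1+json' \
    -e 'application/vnd.docker.distribution.manifest.list.v2+json'
}

# Assumes Docker with the buildx plugin and network access to the registry.
# docker buildx imagetools inspect --raw "python:3.12.7-alpine@sha256:<digest>" \
#   | is_multiarch_manifest && echo "multi-arch" || echo "single platform"
```

A digest that fails this check pins the Dockerfile to one platform regardless of the build target.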

@vlsi

vlsi commented Jan 15, 2025

I just checked the python binary within the emailservice container, and it is indeed x86-64:

% file python3.12
python3.12: ELF 64-bit LSB pie executable, x86-64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-musl-x86_64.so.1, BuildID[sha1]=9e60ed8a434f8548b6287d5ee1b32b36d3c05662, stripped
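
Minimal images often lack the `file` utility, so here is a hedged alternative that reads the same information straight from the ELF header with `od`. `elf_machine` is a hypothetical helper name. It only distinguishes the two architectures discussed in this thread and does not validate the ELF magic bytes, so treat it as a sketch.

```shell
# Print the CPU architecture recorded in an ELF file's header.
# `elf_machine` is a hypothetical helper; it does not validate the ELF magic.
elf_machine() {
  # e_machine is a 2-byte field at offset 18; read it as two unsigned bytes.
  set -- $(od -An -tu1 -j18 -N2 "$1" 2>/dev/null)
  [ $# -eq 2 ] || { echo unknown; return 1; }
  # Combine the bytes as little-endian (true for x86-64 and aarch64 Linux binaries).
  case $(( $1 + $2 * 256 )) in
    62)  echo x86-64 ;;
    183) echo aarch64 ;;
    *)   echo other ;;
  esac
}

# Usage inside a container, with the path taken from the error message:
# elf_machine /usr/local/bin/python
```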

@mathieu-benoit
Contributor Author

mathieu-benoit commented Jan 15, 2025

> The images are multi-arch ones, so we should use the multi-arch SHA rather than the amd64 SHA

@vlsi, this 👆 is a really good point. I should update the Dockerfiles accordingly; a commit for that is coming soon, and it will be a good base for continuing the tests. I'll keep you posted when it's done. Thanks for your help and your patience! 😃

@mathieu-benoit
Contributor Author

@bourgeoisor, could you please approve the CI run for the latest commits? Thanks!

@vlsi

vlsi commented Jan 16, 2025

The build fails now:

#9 190.4       cc1: warning: command-line option '-std=c++14' is valid for C++/ObjC++ but not for C
#9 190.4       In file included from third_party/abseil-cpp/absl/base/internal/low_level_alloc.cc:26:
#9 190.4       third_party/abseil-cpp/absl/base/internal/direct_mmap.h:36:10: fatal error: linux/unistd.h: No such file or directory
#9 190.4          36 | #include <linux/unistd.h>
#9 190.4             |          ^~~~~~~~~~~~~~~~
...
#9 190.4 Failed to build grpcio
#9 190.9 ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (grpcio)
#9 ERROR: process "/bin/sh -c pip install -r requirements.txt" did not complete successfully: exit code: 1
...
190.9 ERROR: ERROR: Failed to build installable wheels for some pyproject.toml based projects (grpcio)
------
Dockerfile:25
--------------------
  23 |     # get packages
  24 |     COPY requirements.txt .
  25 | >>> RUN pip install -r requirements.txt
  26 |
  27 |     FROM base
--------------------
ERROR: failed to solve: process "/bin/sh -c pip install -r requirements.txt" did not complete successfully: exit code: 1
Building [recommendationservice]...
Target platforms: [linux/arm64]
Build [recommendationservice] was canceled
Building [shippingservice]...
Target platforms: [linux/arm64]
Build [shippingservice] was canceled
Building [productcatalogservice]...
Target platforms: [linux/arm64]
Build [productcatalogservice] was canceled
build [emailservice] failed: exit status 1. Docker build ran into internal error. Please retry.
If this keeps happening, please open an issue.

@vlsi

vlsi commented Jan 16, 2025

The following fixes the build to a certain degree:

diff --git a/src/emailservice/Dockerfile b/src/emailservice/Dockerfile
index 9105a5a5..3a69c2a3 100644
--- a/src/emailservice/Dockerfile
+++ b/src/emailservice/Dockerfile
@@ -17,7 +17,7 @@ FROM --platform=$BUILDPLATFORM python:3.12.8-alpine@sha256:54bec49592c8455de8d59
 FROM base AS builder

 RUN apk update \
-    && apk add --no-cache wget g++ \
+    && apk add --no-cache wget g++ linux-headers libstdc++ \
     && rm -rf /var/cache/apk/*

 # get packages
diff --git a/src/recommendationservice/Dockerfile b/src/recommendationservice/Dockerfile
index 72add8de..3653e7b6 100644
--- a/src/recommendationservice/Dockerfile
+++ b/src/recommendationservice/Dockerfile
@@ -20,8 +20,18 @@ RUN apk update \
     && apk add --no-cache \
         wget \
         g++ \
+        linux-headers \
+        libstdc++ \
     && rm -rf /var/cache/apk/*

However, both emailservice and recommendationservice still fail:

 - deployment/productcatalogservice is ready. [10/11 deployment(s) still pending]
 - deployment/checkoutservice is ready. [9/11 deployment(s) still pending]
 - deployment/currencyservice is ready. [8/11 deployment(s) still pending]
 - deployment/emailservice: container server terminated with exit code 1
    - pod/emailservice-744746f964-xm2cr: container server terminated with exit code 1
      > [emailservice-744746f964-xm2cr server] Traceback (most recent call last):
      > [emailservice-744746f964-xm2cr server]   File "/email_server/email_server.py", line 22, in <module>
      > [emailservice-744746f964-xm2cr server]     import grpc
      > [emailservice-744746f964-xm2cr server]   File "/usr/local/lib/python3.12/site-packages/grpc/__init__.py", line 22, in <module>
      > [emailservice-744746f964-xm2cr server]     from grpc import _compression
      > [emailservice-744746f964-xm2cr server]   File "/usr/local/lib/python3.12/site-packages/grpc/_compression.py", line 20, in <module>
      > [emailservice-744746f964-xm2cr server]     from grpc._cython import cygrpc
      > [emailservice-744746f964-xm2cr server] ImportError: Error loading shared library libstdc++.so.6: No such file or directory (needed by /usr/local/lib/python3.12/site-packages/grpc/_cython/cygrpc.cpython-312-aarch64-linux-musl.so)
 - deployment/emailservice failed. Error: container server terminated with exit code 1.

@vlsi

vlsi commented Jan 16, 2025

Here's the fix: linux-headers is needed at build time, and libstdc++ is needed at runtime.

diff --git a/src/emailservice/Dockerfile b/src/emailservice/Dockerfile
index 9105a5a5..6f6c2dcb 100644
--- a/src/emailservice/Dockerfile
+++ b/src/emailservice/Dockerfile
@@ -17,7 +17,7 @@ FROM --platform=$BUILDPLATFORM python:3.12.8-alpine@sha256:54bec49592c8455de8d59
 FROM base AS builder

 RUN apk update \
-    && apk add --no-cache wget g++ \
+    && apk add --no-cache wget g++ linux-headers \
     && rm -rf /var/cache/apk/*

 # get packages
@@ -30,6 +30,11 @@ ENV PYTHONUNBUFFERED=1
 # Enable Profiler
 ENV ENABLE_PROFILER=1

+RUN apk update \
+    && apk add --no-cache \
+        libstdc++  \
+    && rm -rf /var/cache/apk/*
+
 WORKDIR /email_server

 # Grab packages from builder
diff --git a/src/recommendationservice/Dockerfile b/src/recommendationservice/Dockerfile
index 72add8de..6c661479 100644
--- a/src/recommendationservice/Dockerfile
+++ b/src/recommendationservice/Dockerfile
@@ -20,6 +20,7 @@ RUN apk update \
     && apk add --no-cache \
         wget \
         g++ \
+        linux-headers \
     && rm -rf /var/cache/apk/*

 # get packages
@@ -30,6 +31,11 @@ FROM base
 # Enable unbuffered logging
 ENV PYTHONUNBUFFERED=1

+RUN apk update \
+    && apk add --no-cache \
+        libstdc++  \
+    && rm -rf /var/cache/apk/*
+
 # get packages
 WORKDIR /recommendationservice
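
A missing runtime library like this only shows up when the container starts, so a quick import check against the built image can surface it before deploying. The sketch below extracts library names from loader errors of the form seen in the traceback above. `missing_shared_libs` is a hypothetical helper name, and the `docker run` line is commented out because it assumes a locally built image (`IMAGE` is a placeholder) that contains `python`.

```shell
# Extract the names of shared libraries that the musl loader failed to find,
# reading error text on stdin. `missing_shared_libs` is a hypothetical helper.
missing_shared_libs() {
  sed -n 's/.*Error loading shared library \([^:]*\):.*/\1/p' | sort -u
}

# Assumes IMAGE is the freshly built image and that it contains `python`.
# docker run --rm --entrypoint python "$IMAGE" -c 'import grpc' 2>&1 | missing_shared_libs
```

Empty output suggests the import found every library it needed; any name printed is a candidate for the runtime stage's `apk add`.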

@mathieu-benoit
Contributor Author

mathieu-benoit commented Jan 17, 2025

Thanks @vlsi, this #2589 (comment) did it indeed, awesome!

JFYI, with the arm64 GH runner becoming available in public preview today, I was able to deploy the OnlineBoutique containers on both platforms, amd64 and arm64, with the associated GH runners: mathieu-benoit#2. All good now apparently!

@vlsi, could you please give it a try on your end to check that everything is working successfully?

Same for you @bourgeoisor, with skaffold dev targeting your GKE cluster with arm64 nodes?

@bourgeoisor
Member

For some reason I'm now unable to build the emailservice. The build keeps failing at exactly the same spot; I tested 3 times.

#10 8.836 Building wheels for collected packages: google-cloud-profiler, grpcio
#10 8.836   Building wheel for google-cloud-profiler (pyproject.toml): started
#10 10.42   Building wheel for google-cloud-profiler (pyproject.toml): finished with status 'done'
#10 10.42   Created wheel for google-cloud-profiler: filename=google_cloud_profiler-4.1.0-cp312-cp312-linux_aarch64.whl size=842450 sha256=b5343a25811245973cef0355bfe3d84812ae22c069f6b231efadd29b887d7a14
#10 10.42   Stored in directory: /root/.cache/pip/wheels/4c/e9/0e/051a26de1731259c679b0d9546e4a069b9e2adf536bb0566a2
#10 10.42   Building wheel for grpcio (pyproject.toml): started
#10 71.42   Building wheel for grpcio (pyproject.toml): still running...
ERROR: failed to receive status: rpc error: code = Unavailable desc = error reading from server: EOF
. . .
. . .
build [emailservice] failed: exit status 1. Docker build ran into internal error. Please retry.
If this keeps happening, please open an issue..
