Nonlinear equations work on CPU but not GPU #914

Open
nicholaskl97 opened this issue Nov 18, 2024 · 2 comments

nicholaskl97 commented Nov 18, 2024

Describe the bug 🐞

Some nonlinear equations (or boundary conditions) lead to NaNs after optimization on GPU. As shown in the MRE below, the equation u(x) ~ 0 works on GPU (and CPU, not shown), but the equation u(x)^2 ~ 0 only works on CPU, not GPU. The same is true for the boundary conditions u(0) ~ 0 and u(0)^2 ~ 0.

Expected behavior

If a PDESystem works on CPU, it should also work on GPU. Equations that are nonlinear in the dependent variable shouldn't result in NaN. Users should be able to specify the equivalent boundary conditions u(0) ~ 0 and u(0)^2 ~ 0 and get similar results.
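
To make the equivalence concrete, here is a sketch under the assumption that each equation or boundary condition contributes a mean-squared-residual term to the loss, as is usual for PINNs. The symbols $u_\theta$ (the network with parameters $\theta$) and $\mathcal{L}_1$, $\mathcal{L}_2$ (the boundary loss terms for u(0) ~ 0 and u(0)^2 ~ 0) are notation introduced here, not taken from NeuralPDE.

```math
\begin{aligned}
\mathcal{L}_1(\theta) &= u_\theta(0)^2, &
\nabla_\theta \mathcal{L}_1 &= 2\,u_\theta(0)\,\nabla_\theta u_\theta(0), \\
\mathcal{L}_2(\theta) &= \bigl(u_\theta(0)^2\bigr)^2 = u_\theta(0)^4, &
\nabla_\theta \mathcal{L}_2 &= 4\,u_\theta(0)^3\,\nabla_\theta u_\theta(0).
\end{aligned}
```

Both terms vanish exactly when $u_\theta(0) = 0$, and both gradients are finite whenever $u_\theta(0)$ and $\nabla_\theta u_\theta(0)$ are finite, so under this assumption nothing in the mathematics should produce NaN.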

Minimal Reproducible Example 👇

using NeuralPDE
using Lux, LuxCUDA, Optimization, OptimizationOptimisers, Random, ComponentArrays

Random.seed!(200)

@parameters x
@variables u(..)

eq1 = [u(x) ~ 0.0]
eq2 = [u(x)^2 ~ 0.0]

bc = [u(0.0) ~ 0.0]

domain = [x ∈ (-1.0, 1.0)]

# Define neural network discretization
dim_state = 1
dim_hidden = 10
dim_output = 1
chain = Chain(
    Dense(dim_state, dim_hidden, tanh),
    Dense(dim_hidden, dim_hidden, tanh),
    Dense(dim_hidden, dim_output)
)

ps = first(Lux.setup(Random.default_rng(), chain))
const gpud = gpu_device()
ps = ps |> ComponentArray |> gpud |> f64

strategy = QuasiRandomTraining(1000)

# Linear, on GPU: u(x) ~ 0
discretization1 = PhysicsInformedNN(chain, strategy; init_params = ps)
@named pde_system1 = PDESystem(eq1, bc, domain, [x], [u(x)])

prob1 = discretize(pde_system1, discretization1)
res1 = Optimization.solve(prob1, OptimizationOptimisers.Adam(); maxiters = 10)

@assert !any(isnan.(res1.u))

# Nonlinear, on GPU: u(x)^2 ~ 0
@named pde_system2 = PDESystem(eq2, bc, domain, [x], [u(x)])

prob2 = discretize(pde_system2, discretization1)
res2 = Optimization.solve(prob2, OptimizationOptimisers.Adam(); maxiters = 10)

@assert all(isnan.(res2.u))

# Nonlinear, on CPU: u(x)^2 ~ 0
discretization3 = PhysicsInformedNN(chain, strategy)
@named pde_system3 = PDESystem(eq2, bc, domain, [x], [u(x)])

prob3 = discretize(pde_system3, discretization3)
res3 = Optimization.solve(prob3, OptimizationOptimisers.Adam(); maxiters = 10)

@assert !any(isnan.(res3.u))

# Second MRE: the same comparison for the boundary conditions u(0) ~ 0 and u(0)^2 ~ 0
using NeuralPDE
using Lux, LuxCUDA, Optimization, OptimizationOptimisers, Random, ComponentArrays

Random.seed!(200)

@parameters x
@variables u(..)

eq = [u(x) ~ 0.0]

bc1 = [u(0.0) ~ 0.0]
bc2 = [u(0.0)^2 ~ 0.0]

domain = [x ∈ (-1.0, 1.0)]

# Define neural network discretization
dim_state = 1
dim_hidden = 10
dim_output = 1
chain = Chain(
    Dense(dim_state, dim_hidden, tanh),
    Dense(dim_hidden, dim_hidden, tanh),
    Dense(dim_hidden, dim_output)
)

ps = first(Lux.setup(Random.default_rng(), chain))
const gpud = gpu_device()
ps = ps |> ComponentArray |> gpud |> f64

strategy = QuasiRandomTraining(1000)

# u(0) ~ 0 works fine on GPU
discretization1 = PhysicsInformedNN(chain, strategy; init_params = ps)
@named pde_system1 = PDESystem(eq, bc1, domain, [x], [u(x)])

prob1 = discretize(pde_system1, discretization1)
res1 = Optimization.solve(prob1, OptimizationOptimisers.Adam(); maxiters = 10)

@assert !any(isnan.(res1.u))

# u(0)^2 ~ 0 yields NaN parameters on GPU
@named pde_system2 = PDESystem(eq, bc2, domain, [x], [u(x)])

prob2 = discretize(pde_system2, discretization1)
res2 = Optimization.solve(prob2, OptimizationOptimisers.Adam(); maxiters = 10)

@assert all(isnan.(res2.u))

# u(0)^2 ~ 0 works fine on CPU
discretization3 = PhysicsInformedNN(chain, strategy)
@named pde_system3 = PDESystem(eq, bc2, domain, [x], [u(x)])

prob3 = discretize(pde_system3, discretization3)
res3 = Optimization.solve(prob3, OptimizationOptimisers.Adam(); maxiters = 10)

@assert !any(isnan.(res3.u))

Environment (please complete the following information):

  • Output of using Pkg; Pkg.status()
Status `~/julia_projects/NeuralPDEtests/Project.toml`
  [b0b7db55] ComponentArrays v0.15.19
⌃ [b2108857] Lux v1.2.3
  [d0bbae9a] LuxCUDA v0.3.3
  [961ee093] ModelingToolkit v9.51.0
  [315f7962] NeuralPDE v5.17.0
  [7f7a1694] Optimization v4.0.5
  [42dfb2eb] OptimizationOptimisers v0.3.4
Info Packages marked with ⌃ have new versions available and may be upgradable.
  • Output of using Pkg; Pkg.status(; mode = PKGMODE_MANIFEST)
Status `~/julia_projects/NeuralPDEtests/Manifest.toml`
  [47edcb42] ADTypes v1.10.0
  [621f4979] AbstractFFTs v1.5.0
  [80f14c24] AbstractMCMC v5.6.0
  [1520ce14] AbstractTrees v0.4.5
  [7d9f7c33] Accessors v0.1.38
  [79e6a3ab] Adapt v4.1.1
  [0bf59076] AdvancedHMC v0.6.4
  [66dad0bd] AliasTables v1.1.3
  [dce04be8] ArgCheck v2.3.0
  [ec485272] ArnoldiMethod v0.4.0
  [4fba245c] ArrayInterface v7.17.1
  [4c555306] ArrayLayouts v1.10.4
  [a9b6321e] Atomix v0.1.0
  [13072b0f] AxisAlgorithms v1.1.0
  [39de3d68] AxisArrays v0.4.7
  [ab4f0b2a] BFloat16s v0.5.0
  [198e06fe] BangBang v0.4.3
  [9718e550] Baselet v0.1.1
  [e2ed5e7c] Bijections v0.1.9
  [62783981] BitTwiddlingConvenienceFunctions v0.1.6
  [8e7c35d0] BlockArrays v1.1.1
  [70df07ce] BracketingNonlinearSolve v1.1.0
  [fa961155] CEnum v0.5.0
  [2a0fbf3d] CPUSummary v0.2.6
  [00ebfdb7] CSTParser v3.4.3
  [052768ef] CUDA v5.5.2
  [1af6417a] CUDA_Runtime_Discovery v0.3.5
  [7057c7e9] Cassette v0.3.14
  [082447d4] ChainRules v1.72.1
  [d360d2e6] ChainRulesCore v1.25.0
  [fb6a15b2] CloseOpenIntervals v0.1.13
⌅ [3da002f7] ColorTypes v0.11.5
⌃ [5ae59095] Colors v0.12.11
  [861a8166] Combinatorics v1.0.2
  [a80b9123] CommonMark v0.8.15
  [38540f10] CommonSolve v0.2.4
  [bbf7d656] CommonSubexpressions v0.3.1
  [f70d9fcc] CommonWorldInvalidations v1.0.0
  [34da2185] Compat v4.16.0
  [b0b7db55] ComponentArrays v0.15.19
  [b152e2b5] CompositeTypes v0.1.4
  [a33af91c] CompositionsBase v0.1.2
  [2569d6c7] ConcreteStructs v0.2.3
  [88cd18e8] ConsoleProgressMonitor v0.1.2
  [187b0558] ConstructionBase v1.5.8
  [adafc99b] CpuId v0.3.1
  [a8cc5b0e] Crayons v4.1.1
  [667455a9] Cubature v1.5.1
  [9a962f9c] DataAPI v1.16.0
  [a93c6f00] DataFrames v1.7.0
  [864edb3b] DataStructures v0.18.20
  [e2d170a0] DataValueInterfaces v1.0.0
  [244e2a9f] DefineSingletons v0.1.2
  [8bb1440f] DelimitedFiles v1.9.1
  [2b5f629d] DiffEqBase v6.160.0
⌃ [459566f4] DiffEqCallbacks v4.1.0
  [77a26b50] DiffEqNoiseProcess v5.23.0
  [163ba53b] DiffResults v1.1.0
  [b552c78f] DiffRules v1.15.1
  [a0c0ee7d] DifferentiationInterface v0.6.22
  [8d63f2c5] DispatchDoctor v0.4.17
  [31c24e10] Distributions v0.25.113
  [ffbed154] DocStringExtensions v0.9.3
  [5b8099bc] DomainSets v0.7.14
  [7c1d4256] DynamicPolynomials v0.6.0
  [06fc5a27] DynamicQuantities v1.3.0
  [4e289a0a] EnumX v1.0.4
  [f151be2c] EnzymeCore v0.8.6
  [e2ba6199] ExprTools v0.1.10
⌅ [6b7a57c9] Expronicon v0.8.5
  [7a1cc6ca] FFTW v1.8.0
  [7034ab61] FastBroadcast v0.3.5
  [9aa1b823] FastClosures v0.3.2
  [29a986be] FastLapackInterface v2.0.4
  [a4df4552] FastPower v1.1.1
  [1a297f60] FillArrays v1.13.0
  [64ca27bc] FindFirstFunctions v1.4.1
  [6a86dc24] FiniteDiff v2.26.1
  [53c48c17] FixedPointNumbers v0.8.5
  [1fa38f19] Format v1.3.7
  [f6369f11] ForwardDiff v0.10.38
  [f62d2435] FunctionProperties v0.1.2
  [069b7b12] FunctionWrappers v1.1.3
  [77dc65aa] FunctionWrappersWrappers v0.1.3
⌅ [d9f16b24] Functors v0.4.12
⌅ [0c68f7d7] GPUArrays v10.3.1
⌅ [46192b85] GPUArraysCore v0.1.6
⌅ [61eb1bfa] GPUCompiler v0.27.8
  [c145ed77] GenericSchur v0.5.4
  [c27321d9] Glob v1.3.1
  [86223c79] Graphs v1.12.0
  [19dc6840] HCubature v1.7.0
  [3e5b6fbb] HostCPUFeatures v0.1.17
  [0e44f5e4] Hwloc v3.3.0
  [34004b35] HypergeometricFunctions v0.3.25
  [7869d1d1] IRTools v0.4.14
  [615f187c] IfElse v0.1.1
  [d25df0c9] Inflate v0.1.5
  [22cec73e] InitialValues v0.3.1
  [842dd82b] InlineStrings v1.4.2
  [505f98c9] InplaceOps v0.3.0
  [18e54dd8] IntegerMathUtils v0.1.2
  [de52edbc] Integrals v4.5.0
  [a98d9a8b] Interpolations v0.15.1
  [8197267c] IntervalSets v0.7.10
  [3587e190] InverseFunctions v0.1.17
  [41ab1584] InvertedIndices v1.3.0
  [92d709cd] IrrationalConstants v0.2.2
  [c8e1da08] IterTools v1.10.0
  [82899510] IteratorInterfaceExtensions v1.0.0
  [692b3bcd] JLLWrappers v1.6.1
  [98e50ef6] JuliaFormatter v1.0.62
  [ccbc3e58] JumpProcesses v9.14.0
  [ef3ab10e] KLU v0.6.0
  [63c18a36] KernelAbstractions v0.9.29
  [5ab0869b] KernelDensity v0.6.9
  [ba0b0d4f] Krylov v0.9.8
  [5be7bae1] LBFGSB v0.4.1
  [929cbde3] LLVM v9.1.3
  [8b046642] LLVMLoopInfo v1.0.0
  [b964fa9f] LaTeXStrings v1.4.0
  [23fbe1c1] Latexify v0.16.5
  [73f95e8e] LatticeRules v0.0.1
  [10f19ff3] LayoutPointers v0.1.17
  [5078a376] LazyArrays v2.2.2
  [1d6d02ad] LeftChildRightSiblingTrees v0.2.0
  [87fe0de2] LineSearch v0.1.4
  [d3d80556] LineSearches v7.3.0
  [7ed4a6bd] LinearSolve v2.37.0
  [6fdf6af0] LogDensityProblems v2.1.2
  [996a588d] LogDensityProblemsAD v1.13.0
  [2ab3a3ac] LogExpFunctions v0.3.28
  [e6f89c97] LoggingExtras v1.1.0
  [bdcacae8] LoopVectorization v0.12.171
⌃ [b2108857] Lux v1.2.3
  [d0bbae9a] LuxCUDA v0.3.3
⌃ [bb33d45b] LuxCore v1.1.0
⌃ [82251201] LuxLib v1.3.7
  [c7f686f2] MCMCChains v6.0.6
  [be115224] MCMCDiagnosticTools v0.3.12
⌃ [7e8f7934] MLDataDevices v1.5.3
  [e80e1ace] MLJModelInterface v1.11.0
  [d8e11817] MLStyle v0.4.17
  [1914dd2f] MacroTools v0.5.13
  [d125e4d3] ManualMemory v0.1.8
  [bb5d69b7] MaybeInplace v0.1.4
  [128add7d] MicroCollections v0.2.0
  [e1d29d7a] Missings v1.2.0
  [961ee093] ModelingToolkit v9.51.0
  [4886b29c] MonteCarloIntegration v0.2.0
  [0987c9cc] MonteCarloMeasurements v1.2.1
  [46d2c3a1] MuladdMacro v0.2.4
  [102ac46a] MultivariatePolynomials v0.5.7
  [d8a4904e] MutableArithmetics v1.6.0
  [d41bc354] NLSolversBase v7.8.3
  [872c559c] NNlib v0.9.24
  [5da4648a] NVTX v0.3.5
  [77ba4419] NaNMath v1.0.2
  [c020b1a1] NaturalSort v1.0.0
  [315f7962] NeuralPDE v5.17.0
  [8913a72c] NonlinearSolve v4.2.0
  [be0214bd] NonlinearSolveBase v1.3.3
  [5959db7a] NonlinearSolveFirstOrder v1.1.0
  [9a2c21bd] NonlinearSolveQuasiNewton v1.0.0
  [26075421] NonlinearSolveSpectralMethods v1.0.0
  [6fe1bfb0] OffsetArrays v1.14.1
  [429524aa] Optim v1.10.0
⌅ [3bd65402] Optimisers v0.3.4
  [7f7a1694] Optimization v4.0.5
  [bca83a33] OptimizationBase v2.4.0
  [42dfb2eb] OptimizationOptimisers v0.3.4
  [bac558e1] OrderedCollections v1.6.3
  [90014a1f] PDMats v0.11.31
  [d96e819e] Parameters v0.12.3
  [e409e4f3] PoissonRandom v0.4.4
  [f517fe37] Polyester v0.7.16
  [1d0040c9] PolyesterWeave v0.2.2
  [2dfb63ee] PooledArrays v1.4.3
  [85a6dd25] PositiveFactorizations v0.2.4
  [d236fae5] PreallocationTools v0.4.24
  [aea7be01] PrecompileTools v1.2.1
  [21216c6a] Preferences v1.4.3
  [08abe8d2] PrettyTables v2.4.0
  [27ebfcd6] Primes v0.5.6
  [33c8b6b6] ProgressLogging v0.1.4
  [92933f4c] ProgressMeter v1.10.2
  [43287f4e] PtrArrays v1.2.1
  [1fd47b50] QuadGK v2.11.1
  [8a4e6c94] QuasiMonteCarlo v0.3.3
  [74087812] Random123 v1.7.0
  [e6cf234a] RandomNumbers v1.6.0
  [b3c3ace0] RangeArrays v0.3.2
  [c84ed2f1] Ratios v0.4.5
  [c1ae055f] RealDot v0.1.0
  [3cdcf5f2] RecipesBase v1.3.4
  [731186ca] RecursiveArrayTools v3.27.3
  [f2c3362d] RecursiveFactorization v0.2.23
  [189a3867] Reexport v1.2.2
  [ae029012] Requires v1.3.0
  [ae5879a3] ResettableStacks v1.1.1
  [79098fc4] Rmath v0.8.0
  [7e49a35a] RuntimeGeneratedFunctions v0.5.13
  [94e857df] SIMDTypes v0.1.0
  [476501e8] SLEEFPirates v0.6.43
  [0bca4576] SciMLBase v2.62.0
  [19f34311] SciMLJacobianOperators v0.1.1
  [c0aeaf25] SciMLOperators v0.3.12
  [53ae85a6] SciMLStructures v1.5.0
  [30f210dd] ScientificTypesBase v3.0.0
  [6c6a2e73] Scratch v1.2.1
  [91c51154] SentinelArrays v1.4.7
  [efcf1570] Setfield v1.1.1
  [727e6d20] SimpleNonlinearSolve v2.0.0
  [699a6c99] SimpleTraits v0.9.4
  [ce78b400] SimpleUnPack v1.1.0
  [ed01d8cd] Sobol v1.5.0
  [a2af1166] SortingAlgorithms v1.2.1
  [9f842d2f] SparseConnectivityTracer v0.6.8
  [dc90abb0] SparseInverseSubset v0.1.2
  [0a514795] SparseMatrixColorings v0.4.10
  [e56a9233] Sparspak v0.3.9
  [276daf66] SpecialFunctions v2.4.0
  [171d559e] SplittablesBase v0.1.15
  [aedffcd0] Static v1.1.1
  [0d7ed370] StaticArrayInterface v1.8.0
  [90137ffa] StaticArrays v1.9.8
  [1e83bf80] StaticArraysCore v1.4.3
  [64bff920] StatisticalTraits v3.4.0
  [10745b16] Statistics v1.11.1
  [82ae8749] StatsAPI v1.7.0
  [2913bbd2] StatsBase v0.34.3
  [4c63d2b9] StatsFuns v1.3.2
  [7792a7ef] StrideArraysCore v0.5.7
  [892a3eda] StringManipulation v0.4.0
⌃ [09ab397b] StructArrays v0.6.18
  [2efcf032] SymbolicIndexingInterface v0.3.35
  [19f23fe9] SymbolicLimits v0.2.2
  [d1185830] SymbolicUtils v3.7.2
  [0c5d862f] Symbolics v6.21.0
  [3783bdb8] TableTraits v1.0.1
  [bd369af6] Tables v1.12.0
  [8ea1fca8] TermInterface v2.0.0
  [5d786b92] TerminalLoggers v0.1.7
  [1c621080] TestItems v1.0.0
  [8290d209] ThreadingUtilities v0.5.2
  [a759f4b9] TimerOutputs v0.5.25
  [0796e94c] Tokenize v0.5.29
  [28d57a85] Transducers v0.4.84
  [d5829a12] TriangularSolve v0.2.1
  [410a4b4d] Tricks v0.1.9
  [781d530d] TruncatedStacktraces v1.4.0
  [5c2747f8] URIs v1.5.1
  [3a884ed6] UnPack v1.0.2
  [1986cc42] Unitful v1.21.0
  [a7c27f48] Unityper v0.1.6
  [013be700] UnsafeAtomics v0.2.1
  [d80eeb9a] UnsafeAtomicsLLVM v0.2.1
  [3d5dd08c] VectorizationBase v0.21.71
  [d49dbf32] WeightInitializers v1.0.4
  [efce3f68] WoodburyMatrices v1.0.0
  [e88e6eb3] Zygote v0.6.73
  [700de1a5] ZygoteRules v0.2.5
  [02a925ec] cuDNN v1.4.0
  [4ee394cb] CUDA_Driver_jll v0.10.3+0
  [76a88914] CUDA_Runtime_jll v0.15.4+0
  [62b44479] CUDNN_jll v9.4.0+0
  [7bc98958] Cubature_jll v1.0.5+0
  [f5851436] FFTW_jll v3.3.10+1
  [e33a78d0] Hwloc_jll v2.11.2+1
  [1d5cc7b8] IntelOpenMP_jll v2024.2.1+0
  [9c1d0b0a] JuliaNVTXCallbacks_jll v0.2.1+0
  [dad2f222] LLVMExtra_jll v0.0.34+0
  [81d17ec3] L_BFGS_B_jll v3.0.1+0
  [856f044c] MKL_jll v2024.2.0+0
  [e98f9f5b] NVTX_jll v3.1.0+2
  [efe28fd5] OpenSpecFun_jll v0.5.5+0
  [f50d1b31] Rmath_jll v0.5.1+0
  [1e29f10c] demumble_jll v1.3.0+0
  [1317d2d5] oneTBB_jll v2021.12.0+0
  [0dad84c5] ArgTools v1.1.2
  [56f22d72] Artifacts v1.11.0
  [2a0f44e3] Base64 v1.11.0
  [ade2ca70] Dates v1.11.0
  [8ba89e20] Distributed v1.11.0
  [f43a241f] Downloads v1.6.0
  [7b1f6079] FileWatching v1.11.0
  [9fa8497b] Future v1.11.0
  [b77e0a4c] InteractiveUtils v1.11.0
  [4af54fe1] LazyArtifacts v1.11.0
  [b27032c2] LibCURL v0.6.4
  [76f85450] LibGit2 v1.11.0
  [8f399da3] Libdl v1.11.0
  [37e2e46d] LinearAlgebra v1.11.0
  [56ddb016] Logging v1.11.0
  [d6f4376e] Markdown v1.11.0
  [a63ad114] Mmap v1.11.0
  [ca575930] NetworkOptions v1.2.0
  [44cfe95a] Pkg v1.11.0
  [de0858da] Printf v1.11.0
  [9a3f8284] Random v1.11.0
  [ea8e919c] SHA v0.7.0
  [9e88b42a] Serialization v1.11.0
  [1a1011a3] SharedArrays v1.11.0
  [6462fe0b] Sockets v1.11.0
  [2f01184e] SparseArrays v1.11.0
  [4607b0f0] SuiteSparse
  [fa267f1f] TOML v1.0.3
  [a4e569a6] Tar v1.10.0
  [8dfed614] Test v1.11.0
  [cf7118a7] UUIDs v1.11.0
  [4ec0a83e] Unicode v1.11.0
  [e66e0078] CompilerSupportLibraries_jll v1.1.1+0
  [deac9b47] LibCURL_jll v8.6.0+0
  [e37daf67] LibGit2_jll v1.7.2+0
  [29816b5a] LibSSH2_jll v1.11.0+1
  [c8ffd9c3] MbedTLS_jll v2.28.6+0
  [14a3606d] MozillaCACerts_jll v2023.12.12
  [4536629a] OpenBLAS_jll v0.3.27+1
  [05823500] OpenLibm_jll v0.8.1+2
  [bea87d4a] SuiteSparse_jll v7.7.0+0
  [83775a58] Zlib_jll v1.2.13+1
  [8e850b90] libblastrampoline_jll v5.11.0+0
  [8e850ede] nghttp2_jll v1.59.0+0
  [3f19e933] p7zip_jll v17.4.0+2
Info Packages marked with ⌃ and ⌅ have new versions available. Those with ⌃ may be upgradable, but those with ⌅ are restricted by compatibility constraints from upgrading. To see why use `status --outdated -m`
  • Output of versioninfo()
Julia Version 1.11.1
Commit 8f5b7ca12ad (2024-10-16 10:53 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 20 × 12th Gen Intel(R) Core(TM) i9-12900H
  WORD_SIZE: 64
  LLVM: libLLVM-16.0.6 (ORCJIT, alderlake)
Threads: 20 default, 0 interactive, 10 GC (on 20 virtual cores)
Environment:
  JULIA_NUM_THREADS = auto
  JULIA_EDITOR = code

Additional context

@ChrisRackauckas, we spoke about this on a call on Thursday. It prevents GPU support for NeuralLyapunov in all but the simplest cases.

nicholaskl97 changed the title from "Barely nontrivial boundary conditions work on CPU but not GPU" to "Nonlinear equations work on CPU but not GPU" on Nov 27, 2024
@nicholaskl97
Author

I've done a bit more digging, and it's more complicated than just "nonlinear" vs. "linear". In particular, terms like u(x) * Dx(u(x)) or Dx(u(x)^2) don't cause an issue, but u(x)^2 * Dx(u(x)) and Dx(u(x)^3) do.
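
For anyone who wants to try these four terms, here is a minimal sketch of how they could be written down. It assumes that Dx means the first derivative with respect to x, i.e. `Differential(x)`; the names `terms_ok`, `terms_nan`, `eqs_ok`, and `eqs_nan` are made up for illustration.

```julia
using ModelingToolkit  # provides @parameters, @variables, Differential, and ~

@parameters x
@variables u(..)

# Assumption: Dx is the first derivative with respect to x.
Dx = Differential(x)

# Terms reported above to train without NaNs on GPU.
terms_ok = [u(x) * Dx(u(x)), Dx(u(x)^2)]

# Terms reported above to produce NaNs on GPU.
terms_nan = [u(x)^2 * Dx(u(x)), Dx(u(x)^3)]

# Wrap each term as a one-equation system `term ~ 0`, the form PDESystem takes.
eqs_ok = [[t ~ 0.0] for t in terms_ok]
eqs_nan = [[t ~ 0.0] for t in terms_nan]
```

Each entry can stand in for eq2 in the first MRE, for example `PDESystem(eqs_nan[1], bc, domain, [x], [u(x)])`, with the rest of the setup unchanged.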

@TusharNaugain

> I've done a bit more digging, and it's more complicated than just "nonlinear" vs. "linear". In particular, terms like u(x) * Dx(u(x)) or Dx(u(x)^2) don't cause an issue, but u(x)^2 * Dx(u(x)) and Dx(u(x)^3) do.

Bug: GPU Computation Failures with Certain Nonlinear Equations in NeuralPDE

Issue Description

Some nonlinear equations and boundary conditions yield NaN parameters after optimization on GPU, while the same problems train without NaNs on CPU. This limits GPU support for physics-informed neural network (PINN) problems beyond the simplest cases.

Specific Observations

  • Linear equations and some nonlinear terms, such as u(x) * Dx(u(x)) and Dx(u(x)^2), work correctly on GPU
  • Other nonlinear terms yield NaN parameters after GPU optimization
  • The failure does not track nonlinearity alone: Dx(u(x)^2) works, while u(x)^2 and Dx(u(x)^3) do not

Detailed Breakdown

Problematic operations include:

  • u(x)^2 terms
  • Boundary conditions of the form u(0)^2 ~ 0
  • Some derivative-nonlinearity combinations like u(x)^2 * Dx(u(x)) and Dx(u(x)^3)
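
When testing the operations listed above one at a time, it can help to record whether a run produced some NaN parameters or only NaN parameters. The helper below is a small sketch for that; `nan_summary` is a hypothetical name, not part of NeuralPDE or Optimization.

```julia
# Summarize NaN contamination in a parameter vector such as `res.u`.
# Returns whether any entry is NaN, whether all entries are NaN, and how many are.
nan_summary(v) = (any_nan = any(isnan, v), all_nan = all(isnan, v), n_nan = count(isnan, v))

# Example on a plain host vector:
nan_summary([1.0, NaN, 3.0])  # (any_nan = true, all_nan = false, n_nan = 1)
```

If reductions directly on the device array are not supported in a given setup, copy the parameters to the host before calling it.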

Minimal Reproducible Example

using NeuralPDE
using Lux, LuxCUDA, Optimization, OptimizationOptimisers, Random, ComponentArrays

Random.seed!(200)

@parameters x
@variables u(..)

# Problematic equation
eq = [u(x)^2 ~ 0.0]

bc = [u(0.0)^2 ~ 0.0]
domain = [x ∈ (-1.0, 1.0)]

# Neural network setup
chain = Chain(
    Dense(1, 10, tanh),
    Dense(10, 10, tanh),
    Dense(10, 1)
)

ps = first(Lux.setup(Random.default_rng(), chain))
const gpud = gpu_device()
ps = ps |> ComponentArray |> gpud |> f64

strategy = QuasiRandomTraining(1000)

# GPU computation fails
discretization = PhysicsInformedNN(chain, strategy; init_params = ps)
@named pde_system = PDESystem(eq, bc, domain, [x], [u(x)])

prob = discretize(pde_system, discretization)
res = Optimization.solve(prob, OptimizationOptimisers.Adam(); maxiters = 10)
