Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Method Initialization Improvements Discussion #418

Open
mabruzzo opened this issue Jun 20, 2024 · 1 comment
Open

Method Initialization Improvements Discussion #418

mabruzzo opened this issue Jun 20, 2024 · 1 comment

Comments

@mabruzzo
Copy link
Contributor

Enzo-E's foundations and Cello's overall architecture are fantastic! It does a lot of things extremely well, has a very modular design and provides a lot of conveniences! (everything @jobordner has done is truly fantastic!) I've only come to appreciate the overall design more with time (especially now that I've spent some time working with other codes).

Enzo-E's Method objects are arguably one of its most defining features. They have a very logical structure (initialization happens in the constructor, Block-data is updated by compute, timestep any requirements are computed/reported by timestep) that is extremely approachable (and you generally know where to look). It provides extremely useful conveniences (like the refresh-machinery). Plus it's easy to switch the machinery on and off.

In my opinion, the one area that needs improvement is Method-initialization.

About Method Initialization

Currently, there is a one-to-one mapping between the Method objects initialized by Enzo-E and the contents of the parameter files. In more detail, the precise Method classes as well as their explicit execution order are specified in the parameter-file. The logic that actually initializes the methods is performed by the Problem class.

Before I go further, let me emphasize that this is a very sensible choice (I probably would have done the same thing)! Plus, it has definitely served us well! Some of the benefits from this approach include:

  1. What you see (in the parameter file) is what you get in the simulation. There are basically no surprises about the order of operations.

    For example, imagine Method:list is assigned ["order_morton", "balance", "mhd_vlct", "grackle"]. The control flow executed each cycle looks like

 --------------      ---------      ----------      ---------
|              |    |         |    |          |    |         |
| order_morton | -> | balance | -> | mhd_vlct | -> | grackle |
|              |    |         |    |          |    |         |
 --------------      ---------      ----------      ---------
  1. You are able to freely adjust the order of operations at runtime. This is extremely useful for debugging. It's very easy to insert a method to show/save state at any point.
  2. The logic in the codebase for initializing methods is extremely straight-forward.
  3. This initialization-logic is consistent with the initialization logic for other entities in the codebase. (And we happen)

Importantly, the challenges with this approach are not easy to foresee, and only really become problematic "at scale" (i.e. as more Method classes are introduced and more people designing the Method classes).

Challenges with the current approach

In recent years, the following factors/challenges have become apparent:

  1. A number of different actions/physics need to be implemented by separate a sequence of Method instances. When you have a long list of methods (8? 10?) it is easy to forget about them.
  2. Relatedly, there are all sorts of ordering requirements and conventions for different methods
    • "order_morton" (or "order_hilbert") should come before "balance"
    • "balance" can't occur at the end of a cycle. It should generally come near the start
    • "order_morton" (or "order_hilbert") should come before "check"
    • soon-to-be-added "frame_transform" should be the last thing
    • "gravity" related:
      • I think it should come after "pm_deposit" but before "pm_update"
      • should come before your hydro method. But I think "pm_update" comes after
      • I think it should come before "background_acceleration"
      • are there any ordering requirements related to "comoving_expansion"? Does it come after "pm_update"
      • when should particle-creation methods happen? (after "pm_update"?)
    • "m1_closure" should come after hydro, but before "grackle"
    • When should "feedback" occur in relation to "star_maker", hydro, "m1_closure"
    • When should "accretion" occur in relation to "sink_maker", hydro?
    • "flux_correct" must follow hydro
  3. The system was designed with the expectation that we would know all of the fields before initializing the first Method instance. While the system was modified a few years back to let you specify the required fields in constructors, the modified system doesn't work particularly well.
    • It presents a challenge for refresh lists. Since methods like "grackle" usually get initialized after the hydro-solver, the hydro-solver can't know all of the required passive-scalar fields that must be refreshed
    • A similar issue arises for the flux-correction (see issue #?)
    • A method might like to preallocate scratch-space in the constructor, but may need to wait to lazily initialize this space until the first compute or timestep invocation (from a code-design perspective, that's kinda ugly. In the abs, we want to encourage people not to mutate state after initialization)
    • The "output" method would also like to know all of the fields when it is first initialized to report errors early about desired output fields (rather than the first time it writes an output)
    • We could also facilitate some useful optimizations if we knew the fields when we initialize the "physics" objects
  4. Ensuring proper scheduling of related methods
    • ordinary scheduling (like "order_morton" and "balance")
    • subcycling sets of methods together (like "m1_closure" and "grackle")
  5. We want to inject certain functionality that only needs to be injected in certain cases. If using dual-energy formalism, should synchronize energy after a "flux_correct". To implement divergence cleaning, we would also need to inject a step after a "flux_correct".
  6. The system doesn't work well with higher-order time integration. It essentially assumes that all methods as a whole use forward-euler time integration. Individual methods are free to implement higher-order time integrate (e.g. see mhd_vlct), but that doesn't map well across methods. We could accomplish this Involves lots of extra steps
  7. Swapping out equivalent methods (e.g. ppm for mhd_vlct... )
  8. The need for coordinated/consistent parameters for different methods. We have worked around this with Physics objects, but may not always be best solution.
  9. Methods are not very composable.
    • There have been times where it would be great to call one method inside another (note: this is probably not necessary if other issues are addressed).
    • It would also be nice to group various source-term methods together

Importantly, most of the burden for points 1 and 2 falls on the end-user. And it's really easy to screw this up without getting any indication from Enzo-E. As the codebase grows, it will be impossible to comprehensively check for this!

  • you also may need to inspect some internal-state of how a neighboring method is configured
  • (can't comprehensively check in constructors since subsequent methods not initialized yet).
  • Plus, you the list of explicit checks will just continue to grow with time

How do we address this?

Honestly, I don't have all of the answers (I'm making this issue to solicit feedback). But, I have some ideas for the first few steps.

As a first step, I propose that we decouple Method initialization from the Problem class:

  • make Problem store a callback function, provided by the enzo-layer, maybe with the signature. (we may be able to avoid passing Config&)

    typedef void application_setup_callback(Problem&, Config&);
  • we define a function with this signature in the Enzo-Layer (maybe called enzoe_setup_callback).

    • The current contents of Problem::initialize_method would be duplicated here
    • we would also extract the current contents of EnzoProblem::create_method_ (probably putting it in a helper function)
    • As the function initialized Method objects, it would register them with the Problem
  • we would register this callback inside EnzoProblem

  • define a new method of Problem, maybe called Problem::call_application_setup_callback(Config&) and have Simulation::initialize call this method (instead of calling Problem::initialize_method).

  • Aside: technically, we could make enzoe_setup_callback a method of EnzoProblem, but I think the separation might be good (since the logic will eventually start deviating from the basic formula that Problem's other initialize_* methods conform to)

After that, I have a few ideas (to be tackled in subsequent PRs):

  • we would modify the logic in enzoe_setup_callback in order to ensure that methods are initialized in the proper order.
    • We would eventually disregard the order of methods specified by the Method:list parameter and build in logic to initialize the methods in a pre-defined order
    • we could also add logic to introduce missing methods (like a missing pm_deposit or a missing flux_correct)
    • I have some thoughts about how to do this, but the detail's are beyond the scope of this issue
  • we would add logic to determine the list of required fields before calling the constructors of any Method instance (again, I have some thoughts on how to do this in a standardized manner)
  • we could eventually eliminate Problem::initialize_physics and make enzoe_setup_callback responsible for this (this would facilitate some EOS-related optimizations and simplifications)

I very much welcome feedback. I have other ideas too.

The goal here is to pursue a solution somewhere between what Athena++ does for initializing tasks and what Enzo-E currently does. (I would prefer something a little more readable than Athena++)

What do we lose?

We lose some clarity about the order of methods?. We can work around this by passing a command line argument to Enzo-E that tells it to just print the methods.

We lose some "flexibility" of arbitrary reordering. In really, the only case where we currently have that kind of flexibility for any production sims is "output" dumps. In science runs, you probably want to standardize the locations (put @ start of cycle or end). This is only really a loss for scheduling debugging methods at runtime.

  • If this is a really big concern, we could probably work around this! If we implement functionality to print the list of methods, (with associated ordering), we could add command-line args to inject these methods based on that list.
@jobordner
Copy link
Contributor

I need to spend more time organizing and consolidating my thoughts on this, it's something that I've been concerned with for years. Here's a dump of my current thinking.

Cello implements an abstract "sequence of methods" for operating on data, and Enzo-E's operational behavior in a simulation is almost entirely defined in terms of the method list in the parameter file. Over the years, the combination of introducing non-physics operations into the method component (checkpoint, ordering blocks, load-balancing, flux-correction, etc.), and increasing software entropy, have increased the difficulty in specifying the method list.

I think many of the issues you mention above can be effectively dealt with by simplifying the organization of Enzo-E methods to address the technical debt. Things like merging "pm_deposit" into "gravity", renaming "pm_update" as "move_particles", having "check" and "balance" call "order" themselves as-needed, etc. In general making Enzo-E methods more "physics"- or "function"- oriented and less "algorithm"-centric.

It will necessarily break backward-compatability, so it would be a good time to introduce and use versioning parameters, and keep careful track of method modifications and the associated version.

After cleaning up Enzo-E methods, then we can shift attention to the Cello layer to address any remaining issues.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants