Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cache parsed SQL in buildDiagram Function #1418

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

ahtrotta
Copy link
Contributor

@ahtrotta ahtrotta commented Oct 5, 2023

Fixes #1417

When loading a sequence diagram, the buildDiagram function is one of the most expensive function calls. When it uses the hash getter on an instance of Event, the buildStableHash function is called, which ultimately ends up parsing SQL if the event has any. It turns out that parsing SQL is a relatively expensive operation, so this can add up to a lot of computation if there are a lot of events with SQL in the sequence diagram. In this flame graph you can see that peg$parseRule accounts for 41% of buildDiagram:

image

This PR creates an LRU cache in the buildDiagram method, and then passes it into the getHashWithSqlCache method on the Event class so that we don't repeat the computation needlessly.

This will need to be split up into two PRs so that we can release a version of @appland/models that will get used in @appland/sequence-diagram.

@ahtrotta ahtrotta force-pushed the feat/cache-parsed-sql branch from e6b6fbc to 5d6f222 Compare October 5, 2023 20:32
@ahtrotta ahtrotta changed the title feat: Cache parsed SQL in Event class Cache parsed SQL in Event class Oct 5, 2023
@ahtrotta ahtrotta marked this pull request as ready for review October 5, 2023 21:28
@kgilpin
Copy link
Contributor

kgilpin commented Oct 5, 2023

Take a look at

const ASTBySQLString = new LRUCache<string, QueryAST | 'parse-error'>({ max: 1000 });
which also caches the parse tree.

@@ -19,6 +19,8 @@ function alias(obj, prop, alias) {
// This class supercedes `CallTree` and `CallNode`. Events are stored in a flat
// array and can also be traversed like a tree via `parent` and `children`.
export default class Event {
static parsedSqlCache = {};
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider a Map instead of Object.

Copy link
Contributor

@kgilpin kgilpin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s problematic to keep a cache of every SQL query in memory because tools that process many AppMaps will end up consuming a lot of memory doing this. That’s why the scanner uses an LRU cache.

Adding a cache to the Event will impact a lot of code in addition to the code we are trying to optimize.

Can we add the cache we need in a way that’s more specific to the use case. Eg in the DiagramComponent, or an LRU cache in buildDiagram.

@ahtrotta ahtrotta force-pushed the feat/cache-parsed-sql branch 2 times, most recently from 76ca47a to e6e3cdd Compare October 9, 2023 21:01
Copy link
Contributor

@kgilpin kgilpin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It’s problematic to keep a cache of every SQL query in memory because tools that process many AppMaps will end up consuming a lot of memory doing this. That’s why the scanner uses an LRU cache.

Adding a cache to the Event will impact a lot of code in addition to the code we are trying to optimize.

Can we add the cache we need in a way that’s more specific to the use case. Eg in the DiagramComponent, or an LRU cache in buildDiagram.

@ahtrotta
Copy link
Contributor Author

ahtrotta commented Oct 9, 2023

It’s problematic to keep a cache of every SQL query in memory because tools that process many AppMaps will end up consuming a lot of memory doing this. That’s why the scanner uses an LRU cache.

Adding a cache to the Event will impact a lot of code in addition to the code we are trying to optimize.

Can we add the cache we need in a way that’s more specific to the use case. Eg in the DiagramComponent, or an LRU cache in buildDiagram.

I just changed how this works, so now I'm creating an LRU cache in buildDiagram here and then passing the cache into the new getHashWithSqlCache method on the Event class that can be used instead of the hash getter. This won't impact any of the other tools.

@ahtrotta ahtrotta force-pushed the feat/cache-parsed-sql branch from e6e3cdd to a6b3766 Compare October 9, 2023 21:20
@ahtrotta ahtrotta changed the title Cache parsed SQL in Event class Cache parsed SQL in buildDiagram Function Oct 9, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve sequence diagram load time (@appland/models)
2 participants