-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature request: group modules in visualizaton using project structure #125
Comments
Hello @AlexandrHoroshih, thanks for opening that very detailed issue! This reminds me #33 that was initially thought by @pigoz, that is the Overall, I do agree that skott web application usefulness is quickly limited when the number of nodes grows hence it should provide ways to seemlessly group some of the nodes into namespaces, with custom rules that would fit based on the project size and structure. Just for you own information, note that there is already a way to do some sort of grouping using the API, by providing a custom dependency resolver, the default one being EcmaScriptDependencyResolver (resolving all JS/TS modules, one node = one module). The API allows you to hook into skott module and internal graph resolution. This is a feature that I implemented for Rush.js, a popular monorepo tool. You can see an example here of a custom resolver I wrote for it there: https://github.com/antoine-coulon/krush/blob/af6d8e661deb300f93733f1e9ef1b4c839ccb309/plugins/rush-plugin-skott/src/plugins/visualize/main.ts#L27 The idea is to provide an overview of monorepo graphs where nodes are monorepo projects (app or lib), in the same spirit as Nx graph. So you could imagine doing the same, where you build your own graph based on your own rules, and skott can then deal with the rest, such as visualization, cyclic dependencies, and all other features that relying on the emitted graph (this is a old screen shot) This is literally reducing the whole monorepo file tree around 2k files to few nodes only, enhancing a lot visualization. Nonetheless, I grant you the fact that it is not the simplest solution and we should probably provide a better and more straightforward and flexible way to do that. Specification I would say given this impacts all of the visualization modes (even though web application is by far the most used one), this should be a Let me just recap your suggestion Given the following file tree:
and the following definition: "groups": {
"features": "src/features/{name}/*",
"widgets": "src/widgets/{name}/*"
} you expect the following graph to be emitted: {
"features/feature-a": {...},
"features/feature-b": {...},
"widgets/widget-a": {...}
} That is 3 nodes, right? Of course all dependencies between internal modules of each group would still be captured. From the definition itself, do you imagine other cases where one would want Implementation At first glance I'd say that "groups" are like "views" in database in the sense that they expose a different graph shape but the internal graph stays untouched. That would allow all information such third-party, builtin modules, etc to still be collected in the background, but applying the "groups" filter on it, would emit another version of the graph in the same spirit as it's done there: https://github.com/antoine-coulon/krush/blob/af6d8e661deb300f93733f1e9ef1b4c839ccb309/plugins/rush-plugin-skott/src/plugins/visualize/graph.spec.ts#L229. The difference would be that instead of modifying the internal graph itself, we would emit both side-by-side as patching the main graph is not a good idea imho, we still need to keep the source of truth reachable so that we can still rely on it anytime. Consequently we could expose both graphs, one "full" containing everything and a "compacted" one: const { getStructure } = await skott(withGroupsConfig);
const { graph, compactedGraph } = getStructure();
assert.deepEqual(graph, {
"src/features/feature-a/index.ts": {...},
"src/features/feature-b/index.ts": {...},
});
assert.deepEqual(compactedGraph, {
"features/feature-a": {...},
"features/feature-b": {...},
}); In the end, all visualization modes could detect whether there are other "views" of the graph and decide to use them or not. One benefit of having both for the web application would also be to dynamically switch modes between "compacted" and "full", if you ever want to expand whatever is inside Recap:
What do you think of that? |
Thanks for a detailed reponse!
Yep, given that tree i would expect just 3 nodes
So far, not really 🤔 Approaches like Feature Sliced Design or similiar are introducing "internal" structure rules too, though But I think, that, given that the skott's graph is covering project as a whole, these internals, even if structured, are still not really relevant I think, that it is probably makes sense to add a possibility to "unwrap" a one specific group in the graph, so internal details are visible only when needed and only for one specific group, which is being investigated 🤔
Yep, this sound good to me 👍 |
Actually, there is one case 😅
which is in fact "multi-level" group, like |
But actually we cound try to group it like
though 🤔 Something like this should work, i think "groups": {
"teamAFeatures": "src/team-a/features/{name}/*",
"teamAWidgets": "src/team-a/widgets/{name}/*",
"teamBFeatures": "src/team-b/features/{name}/*",
"teamBWidgets": "src/team-b/widgets/{name}/*",
} |
Also, if combined with |
Actually these are interesting use cases. I think we can start by providing a straightforward way of grouping things as you initially mentioned and what I refer to "group by all first children sub-folders".
Exactly, I think one should aim for the smallest subset to observe, either using skott({
cwd: "src/team-a",
"groups": {
"features": "{cwd}/features",
"widgets": "{cwd}/widgets"
}
}) If one still wants to have the global picture but in a grouped way, then we could still allow this kind of configuration that you mentioned: "groups": {
"teamAFeatures": "src/team-a/features/{name}/*",
"teamAWidgets": "src/team-a/widgets/{name}/*",
"teamBFeatures": "src/team-b/features/{name}/*",
"teamBWidgets": "src/team-b/widgets/{name}/*",
} I think that this behavior of grouping can be generalized to work with any length of record so whether using the config above alone or combining it with other API options would be fine. But again, I think this "groups" feature should just apply a custom grouping to the "raw" graph without mutating it, so it's fine experimenting that and adding another property to the |
Would you be willing to implement that? Just for you to know, there is no time constraint. I will try to provide detailed hints tonight of what could be a first big-picture implementation of that feature |
Sure! |
Given the "group by all direct sub-folders starting from this one" definition, i think, something like that should work?
|
Yes, that definitely looks good to me!
That Groups API looks ok to me. I'm just thinking about the dynamic segments i.e: Consequently: "groups": {
"features": "src/features/*", // create groups from all direct children sub folders, `src/features/team-a` will be a group
"core": "src/core" // if no "*" wildcard provided, whole folder is a group
} For the edge case you mentioned, I think we should provide a constraint defining that no group definition should overlap, at least in the first iteration, because it will be hard to deal correctly with multiple versions of a same set of files under different groups and still be able to represent an unified version of the graph. In other words, you would end up with 2 nodes where 1 node is a subset of the other one and at first sight this would be too tedious to deal with that in all aspects. Usually achieving that is done by clustering, but in clustering only one version of the group is there at all time, sub groups are being merged in outer groups.
I initially named the thing "compacted" but when rethinking it, it might be a bit misleading, because we are talking about "groups" and the "groups API", and then we end up with a "compacted" property thing when defining group 🤔 My bad actually, I think this should be named
Yes, but I think for that we should favor "undefined" instead of "null", because the property won't be there when no "groups" are defined. Consequently you'd have interface SkottStructure {
// ... already existing props
graphWithGroups?: Graph;
} For the Web App, I fully agree with your vision:
Yes 🙂
And yes ✅ |
Note that you could split the work in two parts as Also, as we're talking about the "grouping" feature, it makes me realize that some of the features around "skott-webapp" will only work with graphs containing files. This means that if we turn on the "graphWithGroups" visualization mode, sections such as "File Explorer" and some of the sections in "Summary" become irrelevant. But this is great, because it's a subject that I needed to tackle for the new version of the web app to be compatible with monorepo visualization, with hundred/thousands of projects where "nodes" are packages and not files. I'll keep you updated |
Just to clarify: Basically expect(async () => await skott({
groups: {
"teamFeatures": "src/team/features/{name}",
"team": "src/team/{name}",
}
})).toThrowErrorMatchingInlineSnapshot(`Bad groups configuration: "team" and "teamFeatures" groups are overlapping with each other`) |
I thought about it a bit and I agree about Maybe it could just be an explicit configuration object? groups: {
features: {
parentPath: "src/features",
// modulePath == ["feature-name", ...]
getGroup: (modulePath) => modulePath[0],
},
core: "src/core" // whole folder is a group
} This way groups resolving is mostly moved to the userland, which is very flexible, so there is zero problems with handling custom project structures But probably less desirable, since it is not possible to do any kind-of-static analysis as it is possible with a string pattern E.g. with string pattern it is possible to do an ahead-of-time validation, that groups are not overlapping, but it will only work somewhere in the middle of the And since this function would be called on every module path, it would be a potential performance bottlneck, which will be out of So, looks like there is needs to be some string-template-pattern anyway 🤔 |
Hey again @AlexandrHoroshih! Having a configuration object with functions could be nice indeed to cover more edgy cases that would not make sense with a simple string literal, this would also follow the path I initiated with dependency resolvers where skott offers some sort of IoC for graph resolution. This would for sure delegate all the responsibility to the consumers and so the grouping would become their own responsibility (dealing with overlap etc) and it's pretty much ok with that.
Well yeah but grouping is like a lazy action, internal graph resolution construction will remain the same and once the user or any of the visualization modes use So if I sum up for me there would be 3 possible definitions, from what we've been discussing: const groupsConfig = {
"features-1": "src/core", // whole "core" folder is a group, and this group is named "features-1"
"features-2": "src/core/*", // all sub-folders (only first level such as `src/core/lol`) of "src/core" are grouped, for instance `features-2/lol` is a group. "/*" means all direct folders, unlike "/**", matching recursively all sub folders
"features-3": {
basePath: "src/feature",
// in features-3 configuration, groups would become constructed such as `features-3/group0`, `features`
group: () => (["group0", "group1/skott", "group2/is-an-awesome", "group3/library"]-
}
} Regarding the overlap checking we can naively check that all the static paths are not part of the same base path, in the above case, So
Yes, I would favor fail fast when some unexpected config is provided rather than silently patching things under the hood or still providing a graph that will most likely confuse and not help finding that the config is wrongly provided. |
Also as configuring groups manually (specifying keys of the record statically "features-1", "features-2" etc) can become tedious on such big modules, I'd then expect the config option to be a function where one can dynamically configure keys (group names) and their matching content. This would for instance allow some automatic grouping based on Nx, Rush, Turborepo configurations, but also some other rules that could be established dynamically depending on the needs. This is out of the scope of first iterations, I'm just thinking about it loudly 😄 |
Upvoting the configuration function option 🙂 |
Let's make it happen then! |
Hello @antoine-coulon ! I finally had found some time to work on this feature - here is a small PR draft, which is still Work in Progress though, so i don't want to target it to the main But in the process of working on it, I discovered a few interesting points that I'd like to discuss: The config API designCurrent implementationCurrent design is implemented as we discussed before, with an edge-case handled - in function API an option to return groups?: {
[groupName: string]:
| string // <- path pattern
| {
/**
* Scope of the group
*/
basePath: string;
/**
*
* @param modulePath path to the module
* @returns group name, to which the module belongs or null if it does not belong to the group
*/
getGroup: (modulePath: string) => string | null;
};
}; And generated graph looks something like that After trying out this approach at my work project i eventually moved to using only a function API: {
groups: {
app: {
basePath: 'client',
getGroup: (nodePath) => {
if (nodePath.includes('client/src/entrypoint')) {
return 'entrypoint';
}
if (nodePath.includes('client/src/internal')) {
return 'internal';
}
if (nodePath.includes('client/src/platform')) {
return 'platform'
}
if (nodePath.includes('client/src/product')) {
const match = nodePath.match(
/product\/(?<team>.*?)\/(?<entity>.*?)\/(?<name>.*?)\//,
);
if (match && match.groups) {
const { team, entity, name } = match.groups;
return `product/${team}/${entity}/${name}`;
}
}
return null;
},
}
} The "config" option just wasn't really useful to me in the end SuggestionI feel like that "function" option is in fact the most flexible and the least confusing option of all 🤔 Also, while working on the PR, i got better understanding of I suggest to simplify the API design and only accept single function, something like that: {
groups: (nodePath) => {
if (anyConditionToMatchPathToSomeGroup(nodePath)) return "groupName"
// Module did not matched any group and is discarded from the compact version of the graph
return null;
}
} ☝️ Also i think, that if we will decide to go with this API - it might need a better name, like "collapseRule" or "compactSomething" 🤔 Pros
Cons
The latter con is only a problem because there is currently no way to run While it is possible to describe groups in cli like I think, that Something like this: // project/scripts/build-skott-graph.mjs
import { Skott, runSkottInstance } from "skott"
const instance = new Skott({
...config
})
await runSkottInstance(instance, {
displayMode: "webapp",
...options
}) If you ok with that, i can file a separate feature-request issue (and make it happen, when i have time 😅) What do you think about this suggestions? |
Hello @AlexandrHoroshih, nice to see the work of this API going forward! I'm just back from holidays, so I'll be a bit more reactive from now on :) I checked your current PR and the generated graph looks definitely great, nice job! About what you mentioned Also as you rightly mentioned, static definitions are great and were meant to provide better DX and stricter by just being a description, but turns out that implementation-wise it gets harder (by looking at your PR we can already witness the added complexity to just validate paths) and also there will be rules that won't be able to be represented as pure strings and we don't want to end up providing some weird skottish patterns to provide all the cases required. Consequently, I would say I agree with your proposition of just exposing one entry function such as: {
groups: (nodePath) => {
if (anyConditionToMatchPathToSomeGroup(nodePath)) return "groupName"
// Module did not matched any group and is discarded from the compact version of the graph
return null;
}
} That just says given a function taking a nodePath, if a groupName (string) is returned from that function then include that nodePath in the groupName. Otherwise, just don't include that nodePath in a group. I feel like this is pretty straightforward, flexible and inverts the complexity by offering to the user basically all the joy of managing its own rules. One important note about the snippet above and things I saw in your PR:
That will ensure many things:
Given all these points, the following function const fullGraph = {
"src/features/a/index.js": {} // whatever inside there
};
const {groupedGraph} = this.getStructure(); // getStructure() is the one computing lazily the groupedGraph
// and basically, this is what happens in the group function
const apiConfig = {
..._,
groups: (nodePath) => {
// nodePath === "src/features/a/index.js"
}
}
function getStructure() {
let groupedGraph
// graphNodes could be created from Object.values(this.#projectGraph )
for (const node of graphNodes) {
const groupName = apiConfig.groups(node.id)
// add other information to the new node that will be added
if(groupName) // add all the node info to the group within "groupedGraph"
}
return {
...otherThings,
groupedGraph
}
} And so "groups" function needs to be called with each node from the graph while traversing it (seems like what you did a bit there https://github.com/AlexandrHoroshih/skott/pull/1/files#diff-7b7a6308c7431b730b864232c5ef51e67e376482a19da77140ca8f139cb07610R354 apart that you added it in the About the "groups" function I think instead of returning
So about the cons, this is an interesting point. The "skott-webapp" package is a standalone package that can be started independently from "skott" itself. The only requirement for "skott-webapp" is to have an http server that exposes a set of routes providing the expected data and serving static files, but it could be coming from anywhere. Turns out by default it comes from there
One interesting note is that I should probably expose a util function allowing to seamlessly create that web server so that it's easy run the web app and skott API on their own, without going through the CLI. About the CLI and CLI-based display modes Now this is where your statement would be correct, when it comes to using CLI based display modes "file-tree", "graph" etc, there is currently no solution to both use the API (benefiting from the dynamic nature of the config) and then to render things in the CLI. But given I have new use cases about the skott CLI coming from Effect friends, I'll probably manage to find a way to use both the API and the CLI-based display modes, without having to rely on skott's binary. Something like: import skott from "skott"
import { displayModes } from "skott/cli"
const skottResult = await skott();
// this will print the result in the CLI
displayModes.fileTree.render(skottResult); Summary
|
Thanks for the reply!
That would be really great! Such a feature, as I think, would simplify things a lot, because a path like
Got it 👍 I think i will split the work in two parts then (as you initially suggested actually) - first the |
Hi @antoine-coulon ! Here is the first part - the I also tried to use this API locally (by monkeypatching my What is interesting is that eventually i opened a few tabs with different groupedGraph renders at different levels of details: And since the project is quite large, i was switching tabs with different views to make more sense of it. Maybe it could lead to next iteration of this feature in |
Hello @AlexandrHoroshih, thanks for landing the PR! I just reviewed it. The grouped graph, looks great on the picture, it will definitely improve the visualization experience and help extracting useful information.
I'm not sure I understood that part, what do you exactly suggest? Having a set of groupedGraph and each graph would be uniquely identified (string identifier) that would follow different rules, instead to only one (as done in the PR)? If it is what you had in mind, I think it's a great idea :) If it was not that, then it probably just gave us an interesting idea! In the web application, you could then switch between all the different graphs (views). We could even imagine provide a way of creating groupedGraphs directly from the web application itself. |
Thanks!
Yeah, I was more or less thinking along those lines 😅 But I'm not sure that's the best way to go about it: I think that maybe in the future, using the grouping API, it will be possible to do some more advanced analysis that could show "harmful" groups of modules. |
@AlexandrHoroshih thanks for contributing, the PR just got merged and the 0.33.0 got released with the new feature!
Isn't it because you need to have a more precise definition of your groups instead of having huge top-level groups? |
I gave it a thought and i agree, that makes sense 🤔 I tried to do a deeper analysis and flow was kind of the same you described here:
|
Yeah definitely, to me it seems hard to get the right groups at first glance. It feels like knowing what groups you want goes by visualizing first and incrementally grouping into things that make sense for the current use case. Thing is, doing that back and forth does not really provide a great UX/DX as you always need to change the But I guess those will be done in next iterations, allowing the grouped graph to be displayed at first and allowing a switch between the "full" and "grouped" graph will already be good enough! |
Speaking of that 😅 Currently most of the Given that we are already discussing an arbitrary number of such graphs i feel like it would make sense to refactor export interface AppState {
data: DataState;
ui: UiState;
} it should expect something like that export interface AppState {
data: {
source: DataState, // data for original modules graph, always available
[key: string]?: DataState, // Any graph derived from source one - e.g. "grouped"
};
ui: UiState;
}
export interface UiState {
filters: {
// ...
};
network: {
currentTarget: string; // e.g. "source" by default
// ...
}
} So that it is possible to do something like that: subscription = appStore.store$
.pipe(
map(({ data, ui }) => data[ui.network.currentTarget]),
distinctUntilChanged(isEqual)
)
.subscribe((data) => {
const { graphNodes, graphEdges } = makeNodesAndEdges(
Object.values(data.graph),
{ entrypoint: data.entrypoint }
);
setNodesDataset(new DataSet(graphNodes));
setEdgesDataset(new DataSet(graphEdges));
}); and that <Menu.Dropdown>
<Menu.Label>Graph View</Menu.Label>
{Object.keys(state.data).map((name) => (
<Menu.Item key={name}
onClick={() => setNetworkCurrentTarget(name)}
>
{name}
</Menu.Item>
))}
</Menu.Dropdown> What do you think about this approach to implementation? |
I'm not sure if this is the right time to generalize yet, as having custom groups created on the fly will have other implications such as having custom UI states per custom group as well (so all UI state can't be at the top-level store, there might be shared UI state but also per-graph UI state in addition to data). So I might need to handle as well the entire store design to make that fit with multiple graphs at once. For now, I would suggest you to stick to what we had discussed in the first place, which is dealing with two statically known graphs
And given the "grouped" graph being defined or not, just display a UI switch to display the "source"/"full" or the "grouped" one. For now, let's keep things simple and keep the same UiState shared, that is if "circular dependencies" were checked before doing the graph switch, then it should be automatically rendered with the newly selected graph. Another notes:
As a side note, I'm totally fine with having hardcoded such as |
Hello and thanks for a great library!
Summary
Context
Our project is using approach similiar to Feature Sliced Design, where code is divided into layers of some entities, all of which have some sort of public API
Example:
Problem
But if i run
skott
against project entrypoint or any of the features index file - the generated visualization is very noisy, because all of the internal modules are exposed right awaySuggestion
Since all of these features, widgets, etc are essentialy "groups" of modules behind single entrypoint with a public API, i suggest to add an option to provide glob-like pattern, which would describe them:
The visualization then can use these groups, e.g. to skip the render of each and every internal module of
src/features/feature-a
and render it as the singlefeatures/feature-a
nodeThis should make the visualization:
Details
This feature would also allow to use
skott
with bigger projects.For example our project now has about 7000 modules in it and currently
skott
just can't handle it in any way, since web-app graph takes forever to load and svg export shows just something like "Maximum size exceeded"Visualization does work, if i choose an
src/widgets/{name}/index.ts
as an entrypoint, but result is still not really useful because of all of the internals being exposed:At the same time our project has only about a 200-300 such "services", "features" and "widgets", which graph rendering should handle much easier
Standard questions
Please answer these questions to help us investigate your issue more quickly:
The text was updated successfully, but these errors were encountered: