-
Notifications
You must be signed in to change notification settings - Fork 723
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add experimental monitor mode to BasicCrawler #2692
base: master
Are you sure you want to change the base?
Conversation
Fixes apify#2680 Add a new Monitor class to track and display time estimation and concurrency status in the CLI output at regular intervals. * **Monitor Class**: - Add `Monitor` class in `packages/core/src/monitor.ts`. - Include logic to write into the output and gather and calculate the monitor data. * **BasicCrawler Integration**: - Import `Monitor` class in `packages/basic-crawler/src/internals/basic-crawler.ts`. - Initialize and start the `Monitor` class in the `run` function. - Ensure monitor output and `log` output are written on separate lines. - Add `monitor` option to `BasicCrawlerOptions` interface. --- For more details, open the [Copilot Workspace session](https://copilot-workspace.githubnext.com/apify/crawlee/issues/2680?shareId=XXXX-XXXX-XXXX-XXXX).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks for the PR!
few nits to begin with, i will try it out later today
* If you encounter issues due to this change, please: | ||
* - report it to us: https://github.com/apify/crawlee | ||
* - set `requestLocking` to `false` in the `experiments` option of the crawler |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please remove this part, its wrong. also this should not be included in experiments
@@ -904,11 +912,15 @@ export class BasicCrawler<Context extends CrawlingContext = BasicCrawlingContext | |||
this.events.on(EventType.MIGRATING, boundPauseOnMigration); | |||
this.events.on(EventType.ABORTING, boundPauseOnMigration); | |||
|
|||
const monitor = this.experiments.monitor ? new Monitor(this.stats, this.log) : null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would keep this conditional, but not inside experiments, we use those for unstable features, I don't forsee any actual problems with it here.
@B4nan If you trying to run this code, You should expect something to go wrong because I'm not done yet (that's why I made this PR a draft), I should make the other log above this monitor like in Not even writing a test yet, should I write a unit or e2e test for it? which one or both? |
Co-authored-by: Martin Adámek <[email protected]>
Yes, we need some tests, both works fine, but I am not sure now how to implement checks for the E2E tests really.
That's a bit scary, opening a PR with code you haven't even tried...
We won't be merging this before it's ready, the fact that this is in progress PR is not relevant. |
…into add-monitor-mode
you will need to add some tests. maybe the e2e test could do the job here, we at least need to know it wont break anything, we can check how it works manually, but its important to verify no runtime failures can happen. |
Fixes #2680
Add a new Monitor class to track and display time estimation and concurrency status in the CLI output at regular intervals.
Monitor
class inpackages/core/src/monitor.ts
.Monitor
class inpackages/basic-crawler/src/internals/basic-crawler.ts
.Monitor
class in therun
function.log
output are written on separate lines.monitor
option toBasicCrawlerOptions
interface.