Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Drawing to a texture can be extremely slow #876

Open
JaniM opened this issue Jan 7, 2025 · 12 comments
Open

Drawing to a texture can be extremely slow #876

JaniM opened this issue Jan 7, 2025 · 12 comments

Comments

@JaniM
Copy link

JaniM commented Jan 7, 2025

I managed to reduce the issue to the following code. As is, it runs at 3fps on my machine. If I remove any of the three marked lines, I get 80fps+.

Tested against master and 0.4.13 on MacOS. This does NOT happen on 0.4.13, ie. this is a regression.

use macroquad::prelude::*;

#[macroquad::main("Letterbox")]
async fn main() {
    let width = screen_width();
    let height = screen_height();
    let target = render_target(width as u32, height as u32);

    let mut render_target_cam = Camera2D::from_display_rect(Rect::new(0., 0., width, height));
    render_target_cam.render_target = Some(target.clone());

    loop {
        // Removing this line fixes the problem.
        set_camera(&render_target_cam); // <--
        clear_background(BLACK);

        for _ in 0..1000 {
            let rect = Rect::new(10.0, 50.0, 100.0, 30.0);

            // Removing either of the following lines fixes the problem.
            draw_rectangle(rect.x + 100.0, rect.y, rect.w, rect.h, WHITE); // <--
            draw_text("wow", rect.x, rect.y, 24.0, WHITE); // <--
        }

        set_default_camera();
        clear_background(BLACK);
        draw_texture_ex(
            &target.texture,
            0.0,
            0.0,
            WHITE,
            DrawTextureParams {
                dest_size: Some(vec2(width, height)),
                flip_y: true, // Must flip y otherwise 'render_target' will be upside down
                ..Default::default()
            },
        );
        draw_fps();

        next_frame().await;
    }
}
@JaniM
Copy link
Author

JaniM commented Jan 7, 2025

@ollej on Discord noticed rearranging the code like follows works as we'd expect, with stable 80fps+.

use macroquad::prelude::*;

#[macroquad::main("Letterbox")]
async fn main() {
    let width = screen_width();
    let height = screen_height();
    let target = render_target(width as u32, height as u32);

    let mut render_target_cam = Camera2D::from_display_rect(Rect::new(0., 0., width, height));
    render_target_cam.render_target = Some(target.clone());

    loop {
        set_camera(&render_target_cam); // <--
        clear_background(BLACK);

        let rect = Rect::new(10.0, 50.0, 100.0, 30.0);
        for _ in 0..1000 {
            draw_rectangle(rect.x + 100.0, rect.y, rect.w, rect.h, WHITE); // <--
        }

        for _ in 0..1000 {
            draw_text("wow", rect.x, rect.y, 24.0, WHITE); // <--
        }

        set_default_camera();
        clear_background(BLACK);
        draw_texture_ex(
            &target.texture,
            0.0,
            0.0,
            WHITE,
            DrawTextureParams {
                dest_size: Some(vec2(width, height)),
                flip_y: true, // Must flip y otherwise 'render_target' will be upside down
                ..Default::default()
            },
        );
        draw_fps();

        next_frame().await;
    }
}

@historydev
Copy link

Because you are creating Rect 1000 times per frame.
Move Rect::new outside the game loop:


#[macroquad::main("Letterbox")]
async fn main() {
    let width = screen_width();
    let height = screen_height();
    let target = render_target(width as u32, height as u32);

    let mut render_target_cam = Camera2D::from_display_rect(Rect::new(0., 0., width, height));
    render_target_cam.render_target = Some(target.clone());

    let rect = Rect::new(10.0, 50.0, 100.0, 30.0);

    loop {
        set_camera(&render_target_cam); // <--
        clear_background(BLACK);


        for _ in 0..1000 {
            draw_rectangle(rect.x + 100.0, rect.y, rect.w, rect.h, WHITE); // <--
        }

        for _ in 0..1000 {
            draw_text("wow", rect.x, rect.y, 24.0, WHITE); // <--
        }

        set_default_camera();
        clear_background(BLACK);
        draw_texture_ex(
            &target.texture,
            0.0,
            0.0,
            WHITE,
            DrawTextureParams {
                dest_size: Some(vec2(width, height)),
                flip_y: true, // Must flip y otherwise 'render_target' will be upside down
                ..Default::default()
            },
        );

        draw_text(&get_fps().to_string(), screen_width() - 50., 30.0, 30.0, BLUE);

        next_frame().await;
    }
}

@JaniM
Copy link
Author

JaniM commented Jan 7, 2025

Because you are creating Rect 1000 times per frame. Move Rect::new outside the game loop:

  1. Rect is so cheap to create that it's not relevant.
  2. LLVM is smart enough to bring it out (or completely inline here).
  3. You do not need to touch the rect in my original reproduction to get rid of the perf issue.

@historydev
Copy link

Because you are creating Rect 1000 times per frame. Move Rect::new outside the game loop:

  1. Rect is so cheap to create that it's not relevant.
  2. LLVM is smart enough to bring it out (or completely inline here).
  3. You do not need to touch the rect in my original reproduction to get rid of the perf issue.

Your code gives me a stable 240 frames with vsync, without it - it gives more. That's why I physically can't test it and I have to look for the problem by eye.

But in 90% of cases such drops are human error, so I would check it a few more times.

@JaniM
Copy link
Author

JaniM commented Jan 7, 2025

Your code gives me a stable 240 frames with vsync, without it - it gives more. That's why I physically can't test it and I have to look for the problem by eye.

Based on your snippet not using draw_fps(), I assume you are using the released version of macroquad, which i explicitly said does not have this issue. You also based your snippet on the version I said works on master too.

@historydev
Copy link

Your code gives me a stable 240 frames with vsync, without it - it gives more. That's why I physically can't test it and I have to look for the problem by eye.

Based on your snippet not using draw_fps(), I assume you are using the released version of macroquad, which i explicitly said does not have this issue. You also based your snippet on the version I said works on master too.

Yes, I tested on 0.4.13, but you didn't specify the version on which this happens.

@KurlykovDanila
Copy link
Contributor

The problem is also reproduced on Linux(X11 Fedora 40 Gnome). Viewing the framegraph showed that most of the time is spent on the quad_gl draw method and waiting for the iris_dri.so driver to respond (as far as I know, this is the mesa opengl driver for intel). For some reason, it is in this situation that a huge wait for the driver to respond occurs. Whether this is a driver error or something else - I do not know. I am sharing the files with you. The "bad" file is a snapshot with the program from the example problem. The "good" file is a snapshot when I draw text or a rectangle in different cycles.
This is very bad (Note: the x-axis placement doesn't make sense (random placement). The y-axis is the nesting of functions):
bad
This is good(we can see that the rendering functions take up most of the program's time):
good

@BnDLett
Copy link

BnDLett commented Jan 19, 2025

I can't say that this would help the issue, but I'd like to point out that (from my understanding) it is REALLY REALLY bad practice to shadow a variable unless necessary (such as to change the type while retaining the variable name). Since you're not changing the type, it may be most beneficial to just allocate the rect variable outside of the main loop. Below is a reply that shows something similar. I really do doubt that not shadowing the variable would help performance, but it is good practice nonetheless.

@ollej on Discord noticed rearranging the code like follows works as we'd expect, with stable 80fps+.

use macroquad::prelude::*;

#[macroquad::main("Letterbox")]
async fn main() {
let width = screen_width();
let height = screen_height();
let target = render_target(width as u32, height as u32);

let mut render_target_cam = Camera2D::from_display_rect(Rect::new(0., 0., width, height));
render_target_cam.render_target = Some(target.clone());

loop {
    set_camera(&render_target_cam); // <--
    clear_background(BLACK);

    let rect = Rect::new(10.0, 50.0, 100.0, 30.0);
    for _ in 0..1000 {
        draw_rectangle(rect.x + 100.0, rect.y, rect.w, rect.h, WHITE); // <--
    }

    for _ in 0..1000 {
        draw_text("wow", rect.x, rect.y, 24.0, WHITE); // <--
    }

    set_default_camera();
    clear_background(BLACK);
    draw_texture_ex(
        &target.texture,
        0.0,
        0.0,
        WHITE,
        DrawTextureParams {
            dest_size: Some(vec2(width, height)),
            flip_y: true, // Must flip y otherwise 'render_target' will be upside down
            ..Default::default()
        },
    );
    draw_fps();

    next_frame().await;
}

}

@JaniM
Copy link
Author

JaniM commented Jan 19, 2025

The original code does not shadow any variables. It'd not be considered bad practice in rust anyhow.

@BnDLett
Copy link

BnDLett commented Jan 19, 2025

Sorry, you are right about the shadowing part. It was late when I was typing that. Nonetheless, the code can be improved (imo) by adjusting this segment specifically.

        for _ in 0..1000 {
            let rect = Rect::new(10.0, 50.0, 100.0, 30.0); // <- this can and should be initialized outside of the main loop

            // Removing either of the following lines fixes the problem.
            draw_rectangle(rect.x + 100.0, rect.y, rect.w, rect.h, WHITE);
            draw_text("wow", rect.x, rect.y, 24.0, WHITE);
        }

This is further justified by the way that you already do that with the width, height, and target. Additionally, here is a stackoverflow answer that further supports my point (of course, that's for the C/C++ compiler, so 🤷‍).

However, it's up to you on if you want to implement that. I'm just throwing my words of wisdom out here. I'm sure the compiler would recognize that and fix it for you -- but relying on the compiler to fix something that you know you can improve is just outright bad practice in my opinion.

If you do have your reasons for doing that, then it'd be highly helpful if you specified those reasons. :)

@KurlykovDanila
Copy link
Contributor

KurlykovDanila commented Jan 20, 2025

@BnDLett You are certainly right about shadowing, but the question was about something else, so your answer is not relevant.

@BnDLett
Copy link

BnDLett commented Jan 20, 2025

Certainly not relevant. As such, I rest my case.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants