Data Analysis: Do high emissions predict reporting non-compliance? #147

Draft: colton-lapp wants to merge 15 commits into main

colton-lapp (Collaborator) commented Jan 13, 2025

Description

This pull request creates:

  1. A Jupyter notebook file with data analysis findings
  2. A blog post describing the findings in the Jupyter notebook file

The blog post investigates a question raised in #114: does poor performance correlate with non-reporting? The short answer is no; I didn't find that pattern in the data.

The data analysis in the Jupyter notebook consists of the following steps:

  1. Create basic data visualizations showing the variables of interest and compliance trends over time
  2. Create lagged variables: emissions in the prior year, and the emissions trend from two years ago to one year ago
  3. Create graphs comparing mean/median GHG intensity in the prior year, and the GHG trend from two years ago to one year ago, against reporting compliance; these show essentially no difference
  4. Run a regression with a single control variable (square footage) to confirm there is no statistically significant relationship
  5. Run robustness checks by dropping outliers and dropping COVID-era data, then repeating steps 3-4; still no significant results
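
For reviewers who want the gist of steps 2 and 3 without opening the notebook, here is a rough TypeScript sketch of the idea. It is not the notebook's actual code. The column names come from the City of Chicago dataset query quoted in this PR; the interface names, the helper functions, and the choice to treat a `reporting_status` of "Not Submitted" as non-compliance are assumptions made for illustration.

```typescript
// Column names follow the Chicago Energy Benchmarking dataset query in this PR.
// The interfaces and helpers below are hypothetical, for illustration only.
interface BuildingYear {
  id: string;
  data_year: number;
  reporting_status: string;
  ghg_intensity_kg_co2e_sq_ft: number | null;
}

interface LaggedRow {
  id: string;
  data_year: number;
  reported: boolean;
  ghgLastYear: number | null; // GHG intensity in year t-1
  ghgTrend: number | null; // intensity(t-1) minus intensity(t-2)
}

// Step 2: attach prior-year intensity and the prior two-year trend to each row.
function addLags(rows: BuildingYear[]): LaggedRow[] {
  // Index intensity by "id|year" so earlier years can be looked up directly.
  const byKey: Record<string, number | null> = {};
  for (const r of rows) {
    byKey[r.id + "|" + r.data_year] = r.ghg_intensity_kg_co2e_sq_ft;
  }
  const lookup = (id: string, year: number): number | null => {
    const key = id + "|" + year;
    return key in byKey ? byKey[key] : null;
  };
  return rows.map((r) => {
    const prev1 = lookup(r.id, r.data_year - 1);
    const prev2 = lookup(r.id, r.data_year - 2);
    return {
      id: r.id,
      data_year: r.data_year,
      // Assumption: "Not Submitted" is the value that marks non-compliance.
      reported: r.reporting_status !== "Not Submitted",
      ghgLastYear: prev1,
      ghgTrend: prev1 !== null && prev2 !== null ? prev1 - prev2 : null,
    };
  });
}

function median(values: number[]): number | null {
  if (values.length === 0) return null;
  const sorted = values.slice().sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 === 1 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

// Step 3: compare median prior-year intensity for reporters vs. non-reporters.
function medianLastYearByCompliance(
  rows: LaggedRow[]
): { reported: number | null; notReported: number | null } {
  const pick = (flag: boolean): number[] =>
    rows
      .filter((r) => r.reported === flag && r.ghgLastYear !== null)
      .map((r) => r.ghgLastYear as number);
  return { reported: median(pick(true)), notReported: median(pick(false)) };
}
```

Rows without a prior-year observation get `null` lags and drop out of the comparison, which is one reason the group sizes in step 3 can be smaller than the raw row counts.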

These findings are then summarized in a new blog post.

A couple other notes:

  • I used the graphing package Plotly to make interactive HTML graphs and embedded them in my blog post. Because the HTML graphs let you hover over individual data points to display info, they take up a decent amount of space (between 1 and 20 MB). This also makes the Jupyter notebook file larger, but I tried to cut down the size by making some of the plots static image files
  • I don't know anything about JavaScript or HTML, so I relied on generative AI for much of the code that embeds and renders the interactive HTML graphs and fetches regression results from a JSON file. This could probably use some serious attention
  • I've introduced some new dependencies for data visualization and am not sure how this is managed in the project

This is my first time creating a PR for a public repo, and for this project specifically, so I'm happy to restructure any work or accept any feedback! I'm expecting some heavy feedback on the files committed (e.g. new packages used in requirements.txt, python virtual environment, directory structure).

Fixes #114

Testing Instructions

I would recommend pulling, running docker-compose up, and looking at my blog post. Additionally, check out the Jupyter notebook to verify that I'm analyzing the correct variables and haven't made data analysis mistakes. To see the interactive HTML graphs in the notebook, you have to view the Jupyter file in NBViewer, as they won't render on GitHub.

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

@colton-lapp colton-lapp self-assigned this Jan 13, 2025

netlify bot commented Jan 13, 2025

Deploy Preview for radiant-cucurucho-d09bae ready!

Name Link
🔨 Latest commit 4065b79
🔍 Latest deploy log https://app.netlify.com/sites/radiant-cucurucho-d09bae/deploys/6790771436d07d0008f2e3da
😎 Deploy Preview https://deploy-preview-147--radiant-cucurucho-d09bae.netlify.app

@colton-lapp colton-lapp added the enhancement (New feature or request) and data (Data updates & tweaks) labels Jan 13, 2025
<script lang="ts">
import { Component, Vue } from 'vue-property-decorator';
import NewTabIcon from '~/components/NewTabIcon.vue';
import { ComponentOptions } from 'vue';

Check warning

Code scanning / ESLint

Disallow unused variables Warning

'ComponentOptions' is defined but never used.

Copilot Autofix AI about 3 hours ago

Copilot could not generate an autofix suggestion

Copilot could not generate an autofix suggestion for this alert. Try pushing a new commit or if the problem persists contact support.

'https://data.cityofchicago.org/Environment-Sustainable-Development/Chicago-Energy-Benchmarking/xq83-jr8c/explore/query/SELECT%0A%20%20%60data_year%60%2C%0A%20%20%60id%60%2C%0A%20%20%60property_name%60%2C%0A%20%20%60reporting_status%60%2C%0A%20%20%60address%60%2C%0A%20%20%60zip_code%60%2C%0A%20%20%60chicago_energy_rating%60%2C%0A%20%20%60exempt_from_chicago_energy_rating%60%2C%0A%20%20%60community_area%60%2C%0A%20%20%60primary_property_type%60%2C%0A%20%20%60gross_floor_area_buildings_sq_ft%60%2C%0A%20%20%60year_built%60%2C%0A%20%20%60of_buildings%60%2C%0A%20%20%60water_use_kgal%60%2C%0A%20%20%60energy_star_score%60%2C%0A%20%20%60electricity_use_kbtu%60%2C%0A%20%20%60natural_gas_use_kbtu%60%2C%0A%20%20%60district_steam_use_kbtu%60%2C%0A%20%20%60district_chilled_water_use_kbtu%60%2C%0A%20%20%60all_other_fuel_use_kbtu%60%2C%0A%20%20%60site_eui_kbtu_sq_ft%60%2C%0A%20%20%60source_eui_kbtu_sq_ft%60%2C%0A%20%20%60weather_normalized_site_eui_kbtu_sq_ft%60%2C%0A%20%20%60weather_normalized_source_eui_kbtu_sq_ft%60%2C%0A%20%20%60total_ghg_emissions_metric_tons_co2e%60%2C%0A%20%20%60ghg_intensity_kg_co2e_sq_ft%60%2C%0A%20%20%60latitude%60%2C%0A%20%20%60longitude%60%2C%0A%20%20%60location%60%2C%0A%20%20%60row_id%60%2C%0A%20%20%60%3A%40computed_region_43wa_7qmu%60%2C%0A%20%20%60%3A%40computed_region_vrxf_vc4k%60%2C%0A%20%20%60%3A%40computed_region_6mkv_f3dw%60%2C%0A%20%20%60%3A%40computed_region_bdys_3d7i%60%2C%0A%20%20%60%3A%40computed_region_awaf_s7ux%60%0AWHERE%0A%20%20%28%60data_year%60%20IN%20%28%222019%22%2C%20%222020%22%2C%20%222021%22%2C%20%222022%22%2C%20%222018%22%29%29%0A%20%20AND%20caseless_one_of%28%60reporting_status%60%2C%20%22Not%20Submitted%22%29/page/filter';

// New properties
results: any = null; // Holds fetched JSON data

Check warning

Code scanning / ESLint

Disallow the `any` type Warning

Unexpected any. Specify a different type.
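
One possible way to address this warning is to describe the fetched JSON with an interface and validate it before assigning it to `results`. The sketch below is only illustrative: the shape of the regression JSON file is not shown in this PR, so `RegressionTerm`, `RegressionResults`, and their fields are hypothetical and would need to match the real file.

```typescript
// Hypothetical shape of one regression term. The real JSON keys are not shown
// in this PR, so these field names are assumptions.
interface RegressionTerm {
  name: string;
  coefficient: number;
  pValue: number;
}

interface RegressionResults {
  terms: RegressionTerm[];
}

// Type guard: narrows an unknown parsed value instead of trusting it as `any`.
function isRegressionResults(value: unknown): value is RegressionResults {
  if (typeof value !== "object" || value === null) return false;
  const terms = (value as { terms?: unknown }).terms;
  return (
    Array.isArray(terms) &&
    terms.every(
      (t) =>
        typeof t === "object" &&
        t !== null &&
        typeof (t as RegressionTerm).name === "string" &&
        typeof (t as RegressionTerm).coefficient === "number" &&
        typeof (t as RegressionTerm).pValue === "number"
    )
  );
}

// Returns null when the JSON parses but does not match the expected shape.
// JSON.parse itself still throws on malformed input.
function parseResults(json: string): RegressionResults | null {
  const parsed: unknown = JSON.parse(json);
  return isRegressionResults(parsed) ? parsed : null;
}
```

With something like this in place, the property could be declared as `results: RegressionResults | null = null;` and the template would get type checking on the fields it reads.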

}

// Meta information for the page
metaInfo(): any {

Check warning

Code scanning / ESLint

Disallow the `any` type Warning

Unexpected any. Specify a different type.

@colton-lapp colton-lapp requested a review from vkoves January 15, 2025 22:59
colton-lapp (Collaborator, Author) commented

I've reduced the file size of the graphs to a combined 2 MB. I did this by dropping some of the data displayed on hover. We could reduce the file size even more by not displaying every single observation in the scatterplots and only displaying a handful of the buildings that have typical emissions (it's hard to tell them apart anyway). We could also just convert the graphs to static PNG files. Let me know what you think is best.
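
To make the thinning idea concrete, here is a rough sketch of one way it could work: keep every point outside a "typical" intensity range and only every `step`-th point inside it. This is not what the notebook currently does; `ScatterPoint`, `thinTypicalPoints`, and the `low`/`high`/`step` parameters are hypothetical names, and the cutoffs would need to be chosen from the data.

```typescript
// Hypothetical point shape for a scatterplot of GHG intensity.
interface ScatterPoint {
  id: string;
  ghgIntensity: number;
}

// Keep every point outside [low, high], and every `step`-th point inside it.
// The selection is deterministic, so the plot looks the same on every build.
function thinTypicalPoints(
  points: ScatterPoint[],
  low: number,
  high: number,
  step: number
): ScatterPoint[] {
  if (!Number.isInteger(step) || step < 1) {
    throw new Error("step must be a positive integer");
  }
  let typicalSeen = 0;
  return points.filter((p) => {
    const isTypical = p.ghgIntensity >= low && p.ghgIntensity <= high;
    if (!isTypical) return true; // always keep unusual buildings
    const keep = typicalSeen % step === 0;
    typicalSeen += 1;
    return keep;
  });
}
```

The trade-off is that hover info would no longer be available for the dropped buildings, so this only makes sense if the dense middle of the plot is there for shape rather than lookup.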

vkoves (Owner) commented Jan 20, 2025

@colton-lapp - I meant to comment when I pushed up my fixes - I've added some date stamps to the blog posts and reordered them so yours comes first (since it's newer). I'm fine with that file size, but it looks like there are some responsiveness issues with the graphs - if you can fix those, I'm good with it, but otherwise we could move to images. Here's an example:

Desktop (shows scrollbar): [Screenshot from 2025-01-20 16-37-50]
Mobile (cut-off): [Screenshot from 2025-01-20 16-38-02]

Also, is there a way to note dependencies for your Jupyter notebook? I tried running it locally but had to manually install dependencies like plotly, which aren't in our requirements.txt. Maybe you can add some instructions at the start of your notebook and maybe a requirements.txt file? I don't know what's typical there.

Labels: data (Data updates & tweaks), enhancement (New feature or request)
Projects: None yet
Development: Successfully merging this pull request may close these issues:
See If There's A Correlation Between Poor Performance & Not Reporting
2 participants