Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove old device_connections in worker #1807

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

nshoes
Copy link
Contributor

@nshoes nshoes commented Jan 22, 2025

Defaults to 14 days if env var DEVICE_CONNECTION_MAX_AGE_DAYS isn't set.

days_ago = DateTime.shift(DateTime.utc_now(), day: -interval)

DeviceConnection
|> where([d], d.last_seen_at < ^days_ago)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@joshk Since there are no timestamps on device_connections is last_seen_at the right column to query against?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you can actually do this by id, as UUID v7s have the timestamp at the front, but I think a few other things pop out at me first.

We shouldn't delete connections which are still active (where state == connected).

And it would also be good to keep at least one device connection row per device. This current query could remove all device connections for a device if it hasn't reconnected recently.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we need limit and batching protection? Only delete X amount at a time and if there is more remaining, delete the next batch (or reschedule job)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think if this is run hourly then we will be fine, its the first one which we might want to run manually with some batching.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could argue that batching should be the default, in case truncation is enabled for an installation that has a large dataset already, but I don't think anyone is using it yet, or at least to the extremes as SmartRent.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Batching by default is right here, I'll implement.

@@ -40,9 +40,6 @@ defmodule Mix.Tasks.NervesHub.Gen.Devices do
org_id: org.id,
product_id: product.id,
identifier: "generated-#{i}",
connection_status: :connected,
connection_established_at: DateTime.now!("Etc/UTC"),
connection_last_seen_at: DateTime.now!("Etc/UTC"),
connection_metadata: %{
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

since we have metadata on DeviceConnection, I wonder if we remove this field too? (we just need to make sure we save the geo data to the DeviceConnection

@nshoes nshoes force-pushed the remove-old-device-connections-worker branch from 87be70f to ab30c3d Compare January 23, 2025 01:18
|> join(:inner, [dc], d in Device, on: dc.device_id == d.id)
|> where([dc, _d], dc.last_seen_at < ^days_ago)
|> where([dc, _d], dc.status != :connected)
|> where([dc, d], dc.id != d.latest_connection_id)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that is sooooo smart!!!!!

@@ -468,7 +470,7 @@ defmodule NervesHubWeb.Live.Devices.Show do
end)
end

defp device_connection(%{device_connections: [connection]}), do: connection
defp device_connection(%{latest_connection: latest_connection}), do: latest_connection
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is always going to return something, so I think we can remove this line and the line below

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How will it always return something? device.latest_connection_id will be nil until the device connects to the socket, so there is an in-between period. Should we backfill? Even then, a device could still not have a device_connection, so maybe that's a bad idea.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants