Create an analyzer that checks for simple, ignorable non-text changes #175
Comments
At this week's analyst meeting, CAPTCHAs came up as another constantly changing thing that is hopefully easy to identify. Also:
More far out:
We should probably turn this issue into an umbrella/epic issue for all these different ideas and pieces of work.
From some BLM examples @jschell42 sent me:
There’s definitely an interesting thing here I wasn’t thinking about before… we could make a big split in prioritization based simply on textual (+ images and such) content changes. I can see some super-useful annotation data we could display for analysts (especially in their sheets) like:
Some diffs for examples:
This issue has been automatically marked as stale because it has not had recent activity. It will be closed in seven days if no further activity occurs. If it should not be closed, please comment! Thank you for your contributions.
Another example of something that should really be totally ignored: https://monitoring.envirodatagov.org/page/c4328d30-cada-452f-8642-4bff721f5fc2/9a448c37-9285-4107-9ffd-ea72214561a4..a8fab661-07bb-4409-92f7-f73deadf4e29 (change to class attribute)
As a first test of all the things needed to automatically rate a change’s significance and priority, let’s start with something simple that looks for changes that we can pretty confidently say aren’t meaningful:
- Quote-style changes (`'` → `’`)
- Attributes other than `title`, `alt`, `href`, or `src` (any others?) are not important

Example: https://monitoring.envirodatagov.org/page/b2b0b8cb-5e9b-4178-91c0-b8cb4466d2bd/b76dd1ab-a7aa-41d6-89f3-c45117a80dc5..2b55beed-db97-4249-b30a-600f61d94eb5
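
To make the idea concrete, here is a rough sketch of how the checks listed above might work, using only Python's standard library. It is not the project's actual analyzer: the names `SIGNIFICANT_ATTRIBUTES`, `significant_tokens`, and `is_ignorable_change` are hypothetical, and collapsing whitespace is an extra assumption on my part. Each version of the page is reduced to a list of tokens that keeps only text and the attributes named above, with typographic quotes straightened, and a change counts as ignorable when the two token lists match.

```python
from html.parser import HTMLParser

# Assumption: only these attributes matter to analysts (taken from the list above).
SIGNIFICANT_ATTRIBUTES = {"title", "alt", "href", "src"}

# Map typographic quotes to ASCII so that ' → ’ does not register as a change.
QUOTE_TABLE = str.maketrans(
    {"\u2018": "'", "\u2019": "'", "\u201c": '"', "\u201d": '"'}
)


def normalize_text(text):
    """Straighten quotes and collapse whitespace runs (whitespace is an assumption)."""
    return " ".join(text.translate(QUOTE_TABLE).split())


class SignificantTokens(HTMLParser):
    """Collects tags, significant attributes, and normalized text as tokens."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tokens = []

    def handle_starttag(self, tag, attrs):
        # Drop every attribute that is not in the significant set (class, id, ...).
        kept = sorted(
            (name, normalize_text(value or ""))
            for name, value in attrs
            if name in SIGNIFICANT_ATTRIBUTES
        )
        self.tokens.append(("start", tag, tuple(kept)))

    def handle_endtag(self, tag):
        self.tokens.append(("end", tag))

    def handle_data(self, data):
        text = normalize_text(data)
        if text:  # skip whitespace-only text nodes
            self.tokens.append(("text", text))


def significant_tokens(html):
    parser = SignificantTokens()
    parser.feed(html)
    parser.close()
    return parser.tokens


def is_ignorable_change(old_html, new_html):
    """True when the raw HTML differs but nothing significant changed."""
    return old_html != new_html and (
        significant_tokens(old_html) == significant_tokens(new_html)
    )


old = '<p class="a">It\'s <a href="/x" class="b">here</a></p>'
new = '<p class="z">It\u2019s <a href="/x" id="k">here</a></p>'
print(is_ignorable_change(old, new))  # True: only quotes and class/id changed
```

This treats the contents of `<script>` and `<style>` as ordinary text, so a real analyzer would probably want to decide separately how those should be handled.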
This is an easy analysis to do (and covers a lot of the kinds of changes I think we see), so it’s a good way to make sure we’ve built out: