This repository has been archived by the owner on Jan 10, 2025. It is now read-only.
As we move to a different storage system and a different way of representing operations on a dataset, we will need a more robust schema. Currently, the very simple schema we have is:
Project: Contains multiple datasets
Dataset: Represents the full dataset as a set of summary data and multiple Columns and MetaColumns
Column: Represents a column in the original dataset; has a name and a list of unique entries
MetaColumn: A simple way of treating two columns as one; this ultimately gets merged into a single column when we run the code output
Entry: A unique entry in a column, which has a value and the number of times it occurs in that column
Mapping: A collection of entries for a specific column that will be mapped to another value
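For discussion, the current schema listed above could be sketched roughly as the Python dataclasses below. This is only an illustration and not the actual smooshr code: all field names, the `from_values` helper, and the `merge` method are hypothetical, and the choice to sum counts when merging a MetaColumn is an assumption about how the merge might work.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class Entry:
    """A unique value in a column and how many times it occurs there."""
    value: str
    count: int


@dataclass
class Column:
    """A column of the original dataset: a name plus its unique entries."""
    name: str
    entries: list[Entry] = field(default_factory=list)

    @classmethod
    def from_values(cls, name: str, values: list[str]) -> Column:
        # Hypothetical helper: build the unique entries from raw cell values.
        counts = Counter(values)
        return cls(name, [Entry(v, n) for v, n in counts.items()])


@dataclass
class MetaColumn:
    """Several columns treated as one."""
    name: str
    columns: list[Column]

    def merge(self) -> Column:
        # Assumption: merging sums the counts of entries with equal values.
        totals: Counter = Counter()
        for column in self.columns:
            for entry in column.entries:
                totals[entry.value] += entry.count
        return Column(self.name, [Entry(v, n) for v, n in totals.items()])


@dataclass
class Mapping:
    """Entry values of one column that should all map to a single new value."""
    column_name: str
    entry_values: list[str]
    target: str


@dataclass
class Dataset:
    """Summary data plus the columns and meta columns of one dataset."""
    name: str
    summary: dict
    columns: list[Column] = field(default_factory=list)
    meta_columns: list[MetaColumn] = field(default_factory=list)


@dataclass
class Project:
    """A project groups multiple datasets."""
    name: str
    datasets: list[Dataset] = field(default_factory=list)
```

A short usage example, again with made-up data:

```python
col_a = Column.from_values("species_a", ["oak", "oak", "elm"])
col_b = Column.from_values("species_b", ["oak", "ash"])
merged = MetaColumn("species", [col_a, col_b]).merge()
```

Writing the schema out this way makes the open questions easier to see, for example that a Mapping refers to its column only by name and that nothing records the operations applied to a dataset.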
We probably want to rethink this schema to make it a lot more robust to the other tasks we want to run in smooshr.