[Go] Schema inference on RecordFromJSON and TableFromJSON functions
#30
Comments
Hi @agchang, thanks for opening this issue. I think this feature very well may make sense to implement, and we would welcome your contribution if you decide to do so!

I'll write down a few of my thoughts because something like this will generally involve some tradeoffs: Unlike in CSV, where changing the number of columns between rows is invalid, JSON allows changes to the "schema" element-by-element. This can mean adding/removing a field between rows or even having entirely disjoint sets of fields.
If we want to go with the latter approach, my recommendation would be to focus on a dedicated implementation of the "first-pass" which infers an Arrow schema from JSON. We can then just use the output of this function as input to the existing ones:

func InferSchemaFromJSON(r io.Reader) (*arrow.Schema, error) { ... } // This needs to be implemented

func main() {
	jsonBlob := `{ ... }`
	schema, err := InferSchemaFromJSON(strings.NewReader(jsonBlob))
	if err != nil {
		log.Fatal(err)
	}
	table, err := TableFromJSON(memory.DefaultAllocator, schema, []string{jsonBlob})
	if err != nil {
		log.Fatal(err)
	}
	// do table stuff
}
Describe the enhancement requested
I am interested in support for schema inference in the RecordFromJSON and TableFromJSON functions, as these currently require an arrow.Schema up front. I can try to contribute this if people think it makes sense. I noticed that for CSV there is NewInferringReader, which just assumes the types of the first row.
Component(s)
Go