The task is to recreate the original Parquet file that we used in notebooks for data science and model training. Parquet is a compressed, columnar format that produces smaller files and loads faster.
One option is a Python script that queries BaseX and saves the resulting table in Parquet format.
The original Parquet file included all harmonized attributes from the BioSample XML, plus a set of selected (non-attribute) properties:
primaryId
sraId
sampleName
ownerAbbr
ownerName
taxonomyId
taxonomyName
organismName
status
statusDate
model
package
packageName
title
accession
submissionDate
lastUpdate
publicationDate
dnaSource
entrezTarget
entrezLabel
entrezValue
paragraph
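
As a starting point, here is a minimal sketch of the flattening step: turning BioSample XML into one row per sample, keeping only harmonized attributes plus a few of the properties listed above. The element and attribute names in `SAMPLE_XML` are assumptions about what the BaseX query returns and should be checked against the real data. The helper names (`biosample_to_row`, `rows_from_xml`, `write_parquet`) are hypothetical, and the BaseX query itself is not shown.

```python
import xml.etree.ElementTree as ET

# Assumed shape of the XML returned by the BaseX query (illustrative only).
SAMPLE_XML = """<BioSampleSet>
  <BioSample accession="SAMN00000001" submission_date="2008-04-04"
             last_update="2019-06-20" publication_date="2008-04-04">
    <Description>
      <Title>Example sample</Title>
      <Organism taxonomy_id="9606" taxonomy_name="Homo sapiens"/>
    </Description>
    <Attributes>
      <Attribute attribute_name="isolation source"
                 harmonized_name="isolation_source">soil</Attribute>
      <Attribute attribute_name="custom note">ignored</Attribute>
    </Attributes>
  </BioSample>
</BioSampleSet>"""


def biosample_to_row(sample):
    """Flatten one BioSample element into a dict (one table row)."""
    row = {
        "accession": sample.get("accession"),
        "submissionDate": sample.get("submission_date"),
        "lastUpdate": sample.get("last_update"),
        "publicationDate": sample.get("publication_date"),
        "title": sample.findtext("Description/Title"),
    }
    organism = sample.find("Description/Organism")
    if organism is not None:
        row["taxonomyId"] = organism.get("taxonomy_id")
        row["taxonomyName"] = organism.get("taxonomy_name")
    # Keep only attributes that carry a harmonized name.
    for attr in sample.iterfind("Attributes/Attribute"):
        name = attr.get("harmonized_name")
        if name:
            row[name] = (attr.text or "").strip()
    return row


def rows_from_xml(xml_text):
    """Parse an XML string and return one dict per BioSample element."""
    root = ET.fromstring(xml_text)
    return [biosample_to_row(s) for s in root.iter("BioSample")]


def write_parquet(rows, path):
    """Write rows to a Parquet file; requires pandas plus pyarrow or fastparquet."""
    import pandas as pd  # imported lazily so the parsing code runs without it

    pd.DataFrame(rows).to_parquet(path, index=False)


rows = rows_from_xml(SAMPLE_XML)
```

In a real script, the string passed to `rows_from_xml` would come from the BaseX query, and `write_parquet(rows, "biosample.parquet")` (the file name is a placeholder) would produce the final file. The remaining properties in the list would be added to `biosample_to_row` the same way, once their locations in the XML are confirmed.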
@turbomam @wdduncan @hrshdhgd