emtsv does not handle CoNLL-U comments well. If the input is a TSV file, one of two things happens, depending on its columns:
If the file only has the form column, comments (lines starting with "# ") are treated as tokens, and each one is analyzed as a single "word" token.
If the file has other columns (e.g. form anas lemma xpostag, to which I want to add upostag feats), only the new header is returned and the text is not analyzed.
Expected behavior: comments should be kept in the text and returned as-is, and they should not prevent emtsv from analyzing the text (as happens in the second case).
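The expected behavior can be illustrated with a minimal Python sketch. This is not emtsv or xtsv code: the function name `pass_through_comments`, its parameters, and the `analyse_token` callback are hypothetical, and it assumes the first line is the header, that comments start with "# ", and that sentences are separated by empty lines.

```python
from typing import Callable, Iterable, Iterator, List


def pass_through_comments(
    lines: Iterable[str],
    added_columns: List[str],
    analyse_token: Callable[[List[str]], List[str]],
) -> Iterator[str]:
    """Hypothetical helper, not part of emtsv/xtsv.

    Extends the header with the new column names, analyzes token lines,
    and returns comment lines and empty lines unchanged.
    """
    it = iter(lines)
    first = next(it, None)
    if first is None:
        return  # empty input: nothing to emit

    # The first line is assumed to be the TSV header.
    header = first.rstrip("\n").split("\t")
    yield "\t".join(header + added_columns)

    for raw in it:
        line = raw.rstrip("\n")
        # Comments ("# ...") and sentence separators are kept as-is.
        if line == "" or line.startswith("# "):
            yield line
            continue
        fields = line.split("\t")
        # analyse_token returns the values for the added columns.
        yield "\t".join(fields + analyse_token(fields))


if __name__ == "__main__":
    sample = ["form\n", "# text = A kutya ugat.\n", "A\n", "kutya\n", "\n"]
    # Toy "analysis": upper-case the form column.
    for out in pass_through_comments(sample, ["upper"], lambda f: [f[0].upper()]):
        print(out)
```

With this approach a comment never reaches the analyzer, so it can neither be tokenized as a word nor interrupt the processing of the lines that follow it.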
Specifying this in the docs is OK, but changing the default in xtsv requires a new major version, at least in xtsv. These breaking changes should be committed in batches to minimise disruption. (We have others in mind.)