
Consider spans in output #35

Open
lizgzil opened this issue Apr 29, 2020 · 5 comments
@lizgzil (Contributor) commented Apr 29, 2020

The split_parser, split, and parser commands output tokens and the predictions for those tokens.

It may be worth considering a different type of output that gives the span (start and end offsets) of each reference/token rather than the tokens themselves.
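For illustration, here is a minimal sketch of what converting token predictions into span-based output could look like, assuming each token is a contiguous substring of the original text. The helper name and output fields are hypothetical, not part of the codebase:

```python
def tokens_to_spans(text, tokens, labels):
    """Turn (token, label) predictions into (start, end, label) spans.

    Hypothetical helper: it locates each token in the original text so
    the output carries character offsets instead of the token strings.
    """
    spans = []
    cursor = 0
    for token, label in zip(tokens, labels):
        # Search from the cursor so repeated tokens map to the right position.
        start = text.index(token, cursor)
        end = start + len(token)
        spans.append({"start": start, "end": end, "label": label})
        cursor = end
    return spans


spans = tokens_to_spans(
    "Smith J. 2019.",
    ["Smith", "J.", "2019."],
    ["author", "author", "year"],
)
# Each span points back into the original text, e.g. the first one is
# {"start": 0, "end": 5, "label": "author"}
```

A downstream consumer can then recover the surface form with `text[span["start"]:span["end"]]`, which also makes merging adjacent tokens of the same label trivial.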

@lizgzil lizgzil changed the title Consider output type Consider spans in output Apr 29, 2020
@nsorros commented Apr 30, 2020

I am not sure how controversial this would be, but it would definitely eliminate the need to merge tokens afterwards, as the algorithm would extract the start and end for each component in a QA fashion.

@ivyleavedtoadflax (Contributor) commented Apr 30, 2020

I thought of these outputs as placeholders. None of those scripts is suitable for production because they instantiate the model every time they make a prediction, so their utility is somewhat limited. That said, I think I implemented an --output flag which will dump the output to a JSON file.

@lizgzil (Contributor, Author) commented May 1, 2020

@ivyleavedtoadflax ok that makes sense re outputs.

In terms of the instantiation of the model, is it not true that

splitter_parser = SplitParser(config_file=MULTITASK_CFG)

instantiates the model and then you could do

reference_predictions = splitter_parser.split_parse(text)

as many times as you wanted without having to reinstantiate the model?
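The instantiate-once, predict-many pattern being described can be sketched as follows. This uses a dummy stand-in for SplitParser (the real class loads an actual model from the config file), so the class body here is illustrative only:

```python
class SplitParser:
    """Dummy stand-in for the real SplitParser, to illustrate the pattern."""

    load_count = 0  # tracks how many times the (expensive) model was built

    def __init__(self, config_file):
        # Expensive setup (loading the model) happens once, here.
        SplitParser.load_count += 1
        self.config_file = config_file

    def split_parse(self, text):
        # Cheap per-call prediction; the model is not reloaded.
        return [(token, "token") for token in text.split()]


splitter_parser = SplitParser(config_file="multitask.cfg")  # instantiate once
for doc in ("reference one", "reference two"):
    reference_predictions = splitter_parser.split_parse(doc)  # reuse freely
```

However many documents are fed through the loop, `__init__` (and hence the model load) runs exactly once.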

@nsorros commented May 1, 2020

> @ivyleavedtoadflax ok that makes sense re outputs.
>
> In terms of the instantiation of the model, is it not true that
>
> splitter_parser = SplitParser(config_file=MULTITASK_CFG)
>
> instantiates the model and then you could do
>
> reference_predictions = splitter_parser.split_parse(text)
>
> as many times as you wanted without having to reinstantiate the model?

Even though it is unrelated to this issue, I am almost 100% sure you are right. @ivyleavedtoadflax can confirm.

@ivyleavedtoadflax (Contributor)

Yup exactly right @lizgzil. That's not how I had done it in the split, parse, split_parse commands, which is why they are no good for prod.

@nsorros nsorros removed their assignment Feb 21, 2023