Grammar railroad diagram #344

Open
mingodad opened this issue Sep 23, 2024 · 8 comments

@mingodad

I've just added this project's grammar to https://mingodad.github.io/parsertl-playground/playground/ , a Yacc/Lex-compatible online editor/tester (select Protocompile parser from Examples, then click Parse to see a parse tree for the content in Input Source). It also generates an EBNF understood by (IPV4) https://rr.red-dove.com/ui or (IPV6) https://www.bottlecaps.de/rr/ui to generate a nice navigable railroad diagram (see the instructions below at the top).

I hope it can help develop/debug/document this project's grammar.

//
// EBNF to be viewed at
//    (IPV6) https://www.bottlecaps.de/rr/ui
//    (IPV4) https://rr.red-dove.com/ui
//
// Copy and paste this at one of the urls shown above in the 'Edit Grammar' tab
// then click the 'View Diagram' tab.
//

file::=
	  syntaxDecl
	| editionDecl
	| fileBody
	| syntaxDecl fileBody
	| editionDecl fileBody
	| /*%empty*/

fileBody::=
	  semicolons fileElements

fileElements::=
	  fileElements fileElement
	| fileElement

fileElement::=
	  importDecl
	| packageDecl
	| optionDecl
	| messageDecl
	| enumDecl
	| extensionDecl
	| serviceDecl

semicolonList::=
	  ';'
	| semicolonList ';'

semicolons::=
	  semicolonList
	| /*%empty*/

semicolon::=
	  ';'
	| /*%empty*/

syntaxDecl::=
	  _SYNTAX '=' stringLit ';'

editionDecl::=
	  _EDITION '=' stringLit ';'

importDecl::=
	  _IMPORT stringLit semicolons
	| _IMPORT _WEAK stringLit semicolons
	| _IMPORT _PUBLIC stringLit semicolons

packageDecl::=
	  _PACKAGE qualifiedIdentifier semicolons

qualifiedIdentifier::=
	  identifier
	| qualifiedIdentifier '.' identifier

qualifiedIdentifierDot::=
	  qualifiedIdentifierFinal
	| qualifiedIdentifierLeading qualifiedIdentifierFinal

qualifiedIdentifierLeading::=
	  qualifiedIdentifierEntry
	| qualifiedIdentifierLeading qualifiedIdentifierEntry

qualifiedIdentifierFinal::=
	  identifier
	| qualifiedIdentifierEntry

qualifiedIdentifierEntry::=
	  identifier '.'

msgElementIdent::=
	  msgElementName
	| msgElementIdent '.' identifier

extElementIdent::=
	  extElementName
	| extElementIdent '.' identifier

oneofElementIdent::=
	  oneofElementName
	| oneofElementIdent '.' identifier

notGroupElementIdent::=
	  notGroupElementName
	| notGroupElementIdent '.' identifier

mtdElementIdent::=
	  mtdElementIdentFinal
	| mtdElementIdentLeading mtdElementIdentFinal

mtdElementIdentLeading::=
	  mtdElementIdentEntry
	| mtdElementIdentLeading mtdElementIdentEntry

mtdElementIdentFinal::=
	  mtdElementName
	| mtdElementIdentEntry

mtdElementIdentEntry::=
	  mtdElementName '.'

oneofOptionDecl::=
	  _OPTION optionName '=' optionValue semicolon

optionDecl::=
	  _OPTION optionName '=' optionValue semicolons

optionNamePart::=
	  identifier
	| extensionName

optionNameEntry::=
	  optionNamePart '.'

optionNameFinal::=
	  optionNamePart
	| optionNameEntry

optionNameLeading::=
	  optionNameEntry
	| optionNameLeading optionNameEntry

optionName::=
	  optionNameFinal
	| optionNameLeading optionNameFinal

extensionName::=
	  '(' typeName ')'

optionValue::=
	  scalarValue
	| messageLiteralWithBraces

scalarValue::=
	  stringLit
	| numLit
	| specialFloatLit
	| identifier

numLit::=
	  _FLOAT_LIT
	| '-' _FLOAT_LIT
	| _INT_LIT
	| '-' _INT_LIT

specialFloatLit::=
	  '-' _INF
	| '-' _NAN

stringLit::=
	  _STRING_LIT
	| stringLit _STRING_LIT

messageLiteralWithBraces::=
	  '{' messageTextFormat '}'
	| '{' '}'

messageTextFormat::=
	  messageLiteralFields

messageLiteralFields::=
	  messageLiteralFieldEntry
	| messageLiteralFieldEntry messageLiteralFields

messageLiteralFieldEntry::=
	  messageLiteralField
	| messageLiteralField ','
	| messageLiteralField ';'

messageLiteralField::=
	  messageLiteralFieldName ':' fieldValue
	| messageLiteralFieldName messageValue

messageLiteralFieldName::=
	  identifier
	| '[' qualifiedIdentifierDot ']'
	| '[' qualifiedIdentifierDot '/' qualifiedIdentifierDot ']'

fieldValue::=
	  fieldScalarValue
	| messageLiteral
	| listLiteral

fieldScalarValue::=
	  stringLit
	| numLit
	| '-' identifier
	| identifier

messageValue::=
	  messageLiteral
	| listOfMessagesLiteral

messageLiteral::=
	  messageLiteralWithBraces
	| '<' messageTextFormat '>'
	| '<' '>'

listLiteral::=
	  '[' listElements ']'
	| '[' ']'

listElements::=
	  listElement
	| listElements ',' listElement

listElement::=
	  fieldScalarValue
	| messageLiteral

listOfMessagesLiteral::=
	  '[' messageLiterals ']'
	| '[' ']'

messageLiterals::=
	  messageLiteral
	| messageLiterals ',' messageLiteral

typeName::=
	  qualifiedIdentifierDot
	| '.' qualifiedIdentifierDot

msgElementTypeIdent::=
	  msgElementIdent
	| '.' qualifiedIdentifier

extElementTypeIdent::=
	  extElementIdent
	| '.' qualifiedIdentifier

oneofElementTypeIdent::=
	  oneofElementIdent
	| '.' qualifiedIdentifier

notGroupElementTypeIdent::=
	  notGroupElementIdent
	| '.' qualifiedIdentifier

mtdElementTypeIdent::=
	  mtdElementIdent
	| '.' qualifiedIdentifierDot

fieldCardinality::=
	  _REQUIRED
	| _OPTIONAL
	| _REPEATED

compactOptions::=
	  '[' compactOptionDecls ']'
	| '[' ']'

compactOptionDecls::=
	  compactOptionFinal
	| compactOptionLeadingDecls compactOptionFinal

compactOptionLeadingDecls::=
	  compactOptionEntry
	| compactOptionLeadingDecls compactOptionEntry

compactOptionFinal::=
	  compactOption
	| compactOptionEntry

compactOptionEntry::=
	  compactOption ','

compactOption::=
	  optionName '=' optionValue
	| optionName

groupDecl::=
	  fieldCardinality _GROUP identifier '=' _INT_LIT '{' messageBody '}'
	| fieldCardinality _GROUP identifier '=' _INT_LIT compactOptions '{' messageBody '}'

messageGroupDecl::=
	  fieldCardinality _GROUP identifier '=' _INT_LIT '{' messageBody '}' semicolons
	| fieldCardinality _GROUP identifier '=' _INT_LIT compactOptions '{' messageBody '}' semicolons
	| fieldCardinality _GROUP identifier '{' messageBody '}' semicolons
	| fieldCardinality _GROUP identifier compactOptions '{' messageBody '}' semicolons

oneofDecl::=
	  _ONEOF identifier '{' oneofBody '}' semicolons

oneofBody::=
	  /*%empty*/
	| oneofElements

oneofElements::=
	  oneofElements oneofElement
	| oneofElement

oneofElement::=
	  oneofOptionDecl
	| oneofFieldDecl
	| oneofGroupDecl

oneofFieldDecl::=
	  oneofElementTypeIdent identifier '=' _INT_LIT semicolon
	| oneofElementTypeIdent identifier '=' _INT_LIT compactOptions semicolon
	| oneofElementTypeIdent identifier semicolon
	| oneofElementTypeIdent identifier compactOptions semicolon

oneofGroupDecl::=
	  _GROUP identifier '=' _INT_LIT '{' messageBody '}'
	| _GROUP identifier '=' _INT_LIT compactOptions '{' messageBody '}'
	| _GROUP identifier '{' messageBody '}'
	| _GROUP identifier compactOptions '{' messageBody '}'

mapFieldDecl::=
	  mapType identifier '=' _INT_LIT semicolons
	| mapType identifier '=' _INT_LIT compactOptions semicolons
	| mapType identifier semicolons
	| mapType identifier compactOptions semicolons

mapType::=
	  _MAP '<' mapKeyType ',' typeName '>'

mapKeyType::=
	  _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING

extensionRangeDecl::=
	  _EXTENSIONS tagRanges ';' semicolons
	| _EXTENSIONS tagRanges compactOptions semicolons

tagRanges::=
	  tagRange
	| tagRanges ',' tagRange

tagRange::=
	  _INT_LIT
	| _INT_LIT _TO _INT_LIT
	| _INT_LIT _TO _MAX

enumValueRanges::=
	  enumValueRange
	| enumValueRanges ',' enumValueRange

enumValueRange::=
	  enumValueNumber
	| enumValueNumber _TO enumValueNumber
	| enumValueNumber _TO _MAX

enumValueNumber::=
	  _INT_LIT
	| '-' _INT_LIT

msgReserved::=
	  _RESERVED tagRanges ';' semicolons
	| reservedNames

enumReserved::=
	  _RESERVED enumValueRanges ';' semicolons
	| reservedNames

reservedNames::=
	  _RESERVED fieldNameStrings semicolons
	| _RESERVED fieldNameIdents semicolons

fieldNameStrings::=
	  stringLit
	| fieldNameStrings ',' stringLit

fieldNameIdents::=
	  identifier
	| fieldNameIdents ',' identifier

enumDecl::=
	  _ENUM identifier '{' enumBody '}' semicolons

enumBody::=
	  semicolons
	| semicolons enumElements

enumElements::=
	  enumElements enumElement
	| enumElement

enumElement::=
	  optionDecl
	| enumValueDecl
	| enumReserved

enumValueDecl::=
	  enumValueName '=' enumValueNumber semicolons
	| enumValueName '=' enumValueNumber compactOptions semicolons

messageDecl::=
	  _MESSAGE identifier '{' messageBody '}' semicolons

messageBody::=
	  semicolons
	| semicolons messageElements

messageElements::=
	  messageElements messageElement
	| messageElement

messageElement::=
	  messageFieldDecl
	| enumDecl
	| messageDecl
	| extensionDecl
	| extensionRangeDecl
	| messageGroupDecl
	| optionDecl
	| oneofDecl
	| mapFieldDecl
	| msgReserved

messageFieldDecl::=
	  fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT semicolons
	| fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT compactOptions semicolons
	| msgElementTypeIdent identifier '=' _INT_LIT semicolons
	| msgElementTypeIdent identifier '=' _INT_LIT compactOptions semicolons
	| fieldCardinality notGroupElementTypeIdent identifier semicolons
	| fieldCardinality notGroupElementTypeIdent identifier compactOptions semicolons
	| msgElementTypeIdent identifier semicolons
	| msgElementTypeIdent identifier compactOptions semicolons

extensionDecl::=
	  _EXTEND typeName '{' extensionBody '}' semicolons

extensionBody::=
	  /*%empty*/
	| extensionElements

extensionElements::=
	  extensionElements extensionElement
	| extensionElement

extensionElement::=
	  extensionFieldDecl
	| groupDecl

extensionFieldDecl::=
	  fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT semicolon
	| fieldCardinality notGroupElementTypeIdent identifier '=' _INT_LIT compactOptions semicolon
	| extElementTypeIdent identifier '=' _INT_LIT semicolon
	| extElementTypeIdent identifier '=' _INT_LIT compactOptions semicolon

serviceDecl::=
	  _SERVICE identifier '{' serviceBody '}' semicolons

serviceBody::=
	  semicolons
	| semicolons serviceElements

serviceElements::=
	  serviceElements serviceElement
	| serviceElement

serviceElement::=
	  optionDecl
	| methodDecl

methodDecl::=
	  _RPC identifier methodMessageType _RETURNS methodMessageType semicolons
	| _RPC identifier methodMessageType _RETURNS methodMessageType '{' methodBody '}' semicolons

methodMessageType::=
	  '(' _STREAM typeName ')'
	| '(' mtdElementTypeIdent ')'

methodBody::=
	  semicolons
	| semicolons methodElements

methodElements::=
	  methodElements methodElement
	| methodElement

methodElement::=
	  optionDecl

msgElementName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _MAP
	| _TO
	| _MAX
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

extElementName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _OPTION
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _RESERVED
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

enumValueName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _REPEATED
	| _OPTIONAL
	| _REQUIRED
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _GROUP
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

oneofElementName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _RESERVED
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

notGroupElementName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _OPTION
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _REPEATED
	| _OPTIONAL
	| _REQUIRED
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _RESERVED
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

mtdElementName::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _OPTION
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _REPEATED
	| _OPTIONAL
	| _REQUIRED
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _GROUP
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _RESERVED
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _RETURNS

identifier::=
	  _NAME
	| _SYNTAX
	| _EDITION
	| _IMPORT
	| _WEAK
	| _PUBLIC
	| _PACKAGE
	| _OPTION
	| _TRUE
	| _FALSE
	| _INF
	| _NAN
	| _REPEATED
	| _OPTIONAL
	| _REQUIRED
	| _DOUBLE
	| _FLOAT
	| _INT32
	| _INT64
	| _UINT32
	| _UINT64
	| _SINT32
	| _SINT64
	| _FIXED32
	| _FIXED64
	| _SFIXED32
	| _SFIXED64
	| _BOOL
	| _STRING
	| _BYTES
	| _GROUP
	| _ONEOF
	| _MAP
	| _EXTENSIONS
	| _TO
	| _MAX
	| _RESERVED
	| _ENUM
	| _MESSAGE
	| _EXTEND
	| _SERVICE
	| _RPC
	| _STREAM
	| _RETURNS

//Tokens

_BOOL ::= "bool"
_BYTES ::= "bytes"
_DOUBLE ::= "double"
_EDITION ::= "edition"
_ENUM ::= "enum"
_EXTEND ::= "extend"
_EXTENSIONS ::= "extensions"
_FALSE ::= "false"
_FIXED32 ::= "fixed32"
_FIXED64 ::= "fixed64"
_FLOAT ::= "float"
_GROUP ::= "group"
_IMPORT ::= "import"
_INF ::= "inf"
_INT32 ::= "int32"
_INT64 ::= "int64"
_MAP ::= "map"
_MAX ::= "max"
_MESSAGE ::= "message"
_NAN ::= "nan"
_ONEOF ::= "oneof"
_OPTIONAL ::= "optional"
_OPTION ::= "option"
_PACKAGE ::= "package"
_PUBLIC ::= "public"
_REPEATED ::= "repeated"
_REQUIRED ::= "required"
_RESERVED ::= "reserved"
_RETURNS ::= "returns"
_RPC ::= "rpc"
_SERVICE ::= "service"
_SFIXED32 ::= "sfixed32"
_SFIXED64 ::= "sfixed64"
_SINT32 ::= "sint32"
_SINT64 ::= "sint64"
_STREAM ::= "stream"
_STRING ::= "string"
_SYNTAX ::= "syntax"
_TO ::= "to"
_TRUE ::= "true"
_UINT32 ::= "uint32"
_UINT64 ::= "uint64"
_WEAK ::= "weak"
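
One detail the grammar above makes visible is that keywords are not reserved: `identifier` accepts every keyword token, while the context-specific name rules leave some of them out. The Python sketch below transcribes two of those rules by hand and computes what each one excludes. The set and function names are mine, not part of protocompile, and reading the exclusions as "keywords that would be ambiguous at that position" is my interpretation rather than something the grammar states.

```python
# Keywords accepted by the `identifier` rule (every alternative except _NAME).
IDENTIFIER_KEYWORDS = {
    "syntax", "edition", "import", "weak", "public", "package", "option",
    "true", "false", "inf", "nan", "repeated", "optional", "required",
    "double", "float", "int32", "int64", "uint32", "uint64", "sint32",
    "sint64", "fixed32", "fixed64", "sfixed32", "sfixed64", "bool",
    "string", "bytes", "group", "oneof", "map", "extensions", "to", "max",
    "reserved", "enum", "message", "extend", "service", "rpc", "stream",
    "returns",
}

# Keywords accepted by `msgElementName` (hand-transcribed from the rule).
MSG_ELEMENT_NAME_KEYWORDS = {
    "syntax", "edition", "import", "weak", "public", "package",
    "true", "false", "inf", "nan", "double", "float",
    "int32", "int64", "uint32", "uint64", "sint32", "sint64",
    "fixed32", "fixed64", "sfixed32", "sfixed64",
    "bool", "string", "bytes", "map", "to", "max",
    "service", "rpc", "stream", "returns",
}

# `mtdElementName` lists the same keywords as `identifier`, minus _STREAM.
MTD_ELEMENT_NAME_KEYWORDS = IDENTIFIER_KEYWORDS - {"stream"}


def excluded_keywords(rule_keywords):
    """Return the keywords that `identifier` accepts but the given rule does not."""
    return sorted(IDENTIFIER_KEYWORDS - rule_keywords)


print(excluded_keywords(MSG_ELEMENT_NAME_KEYWORDS))
print(excluded_keywords(MTD_ELEMENT_NAME_KEYWORDS))
```

The first list contains the keywords that can themselves start a message element or a field cardinality (for example `option`, `enum`, `repeated`), and the second contains only `stream`, which `methodMessageType` handles in its own alternative.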
@jhump
Member

jhump commented Oct 30, 2024

@mingodad, FWIW, we've written a specification for the Protobuf language, and that also includes a grammar that is intended to be human-readable: https://protobuf.com/docs/language-spec

@mingodad
Author

@jhump thank you for the reply!
It would be nice to have the full grammar in one place there too (instead of, or in addition to, the piecewise presentation that's there now)!

@jhump
Member

jhump commented Oct 30, 2024

The site that hosts the language spec also contains some other parser examples, including an ANTLR grammar configuration: https://protobuf.com/docs/examples

@mingodad
Author

Thank you again!
I see now that I've used the grammar from https://github.com/bufbuild/protocompile/blob/v0.2.0/parser/proto.y on my Yacc/Lex playground.
The last released version is https://github.com/bufbuild/protocompile/releases/tag/v0.14.1 so the parsers/grammars you have on those links seem to be outdated.

@jhump
Member

jhump commented Nov 1, 2024

@mingodad, thanks for pointing that out. I've updated the site so it now points to the latest release instead (0.14.1).

@mingodad
Author

mingodad commented Nov 1, 2024

Hello @jhump, I'm not sure that's right, because when comparing the content of the links there is no difference.
Does that mean that the version has changed but the grammar/parser remains the same?

@jhump
Member

jhump commented Nov 4, 2024

There are ~1400 lines changed, just in the file proto.y, between the two versions:
https://github.com/bufbuild/protocompile/compare/v0.2.0..v0.14.1#diff-cbe13169fa724e83181b277eea5d1e7c0b7b7647441f3330ddb73abf446f0aee

If you are referring to changes between that latest version and the railroad diagram you created above, it appears you are correct that there is no difference. It looks like you made that diagram based on the latest version, not based on the very old version that was previously linked from the protobuf.com page.

FWIW, the grammar in the protobuf.com site, and in the other example grammars, is much more accurate. The proto.y grammar in protocompile is very lenient and allows several things that the actual protobuf language forbids. This is done specifically so that we can get an AST for a file, even when the source is invalid; this was intended to aid the implementation of a language server (see LSP), where it is common for files to be incomplete/in the process of being edited, and thus to contain transient syntax errors. Note that we are in the process of implementing a new parser (which will temporarily live in an experimental folder, until it's mature enough to replace the existing parser). That one will be even more lenient, specifically for better error recovery and diagnostics for a language server.
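
To make the leniency concrete: in the proto.y-derived grammar at the top of this thread, `messageFieldDecl` has alternatives with no `'=' _INT_LIT` part and ends in `semicolons` (zero or more), whereas the language spec requires a field number and a semicolon. The Python sketch below only illustrates that difference with two regular expressions. It is not how protocompile parses, it handles nothing beyond simple `type name = number;` lines, and all names in it are hypothetical.

```python
import re

# Simplified token shapes; real Protobuf lexing is richer than this.
IDENT = r"[A-Za-z_][A-Za-z0-9_]*"
TYPE = rf"\.?{IDENT}(?:\.{IDENT})*"
INT = r"[0-9]+"

# Strict: the field must carry "= <number>" and end with one semicolon.
STRICT_FIELD = re.compile(rf"^\s*{TYPE}\s+{IDENT}\s*=\s*{INT}\s*;\s*$")

# Lenient: "= <number>" is optional and any number of semicolons may follow.
LENIENT_FIELD = re.compile(rf"^\s*{TYPE}\s+{IDENT}\s*(?:=\s*{INT}\s*)?;*\s*$")


def check(line):
    """Return (accepted_by_strict, accepted_by_lenient) for one source line."""
    return bool(STRICT_FIELD.match(line)), bool(LENIENT_FIELD.match(line))


for line in ["string name = 1;", "string name;", "string name", "string = ;"]:
    print(repr(line), check(line))
```

A half-typed field such as `string name` is rejected by the strict pattern but still yields a match in the lenient one, which is the property a language server wants while a file is being edited.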

@mingodad
Author

mingodad commented Nov 5, 2024

@jhump thank you again for the reply!
It seems that you are right about the grammar version that I used (somehow I got confused).

I've now taken the spec from https://github.com/bufbuild/protobuf-language-spec/blob/main/language-spec.md?plain=1 , manually separated the EBNF parts, and converted them to an EBNF understood by https://github.com/GuntherRademacher/rr (see the instructions below at the top).
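
The manual separation step could probably be scripted. The Python sketch below assumes that the Markdown source marks its grammar fragments with fenced blocks labelled `ebnf`; that labelling, the function name, and the sample text are assumptions made for illustration, and the conversion to the rr dialect is not attempted here.

```python
import re

FENCE = "`" * 3  # three backticks, built this way to keep the sample readable

# A fenced block whose info string is exactly "ebnf"; group 1 is its body.
EBNF_BLOCK = re.compile(
    rf"^{FENCE}ebnf[ \t]*\n(.*?)^{FENCE}[ \t]*$",
    re.DOTALL | re.MULTILINE,
)


def extract_ebnf(markdown):
    """Concatenate the bodies of all fenced blocks labelled `ebnf`."""
    return "\n".join(m.group(1).rstrip() for m in EBNF_BLOCK.finditer(markdown))


# Made-up sample input: two grammar blocks with prose and another block between.
sample = "\n".join([
    "# Title",
    FENCE + "ebnf",
    "A = B .",
    FENCE,
    "Some prose.",
    FENCE + "txt",
    "not grammar",
    FENCE,
    FENCE + "ebnf",
    "B = C .",
    FENCE,
    "",
])

print(extract_ebnf(sample))
```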

//
// EBNF to be viewed at
//    (IPV6) https://www.bottlecaps.de/rr/ui
//    (IPV4) https://rr.red-dove.com/ui
//
// Copy and paste this at one of the urls shown above in the 'Edit Grammar' tab
// then click the 'View Diagram' tab.
//
//From: https://github.com/bufbuild/protobuf-language-spec/blob/0df02baa0678241fca35704da220ae699de79144/language-spec.md?plain=1

start ::= File //to make navigation easier

//Whitespace and Comments
whitespace ::= " " | "\n" | "\r" | "\t" | "\f" | "\v"
comment    ::= line_comment | block_comment

line_comment       ::= "//" ( [^#x0A#x00] )*
block_comment      ::= "/*"  block_comment_rest
block_comment_rest ::= "*" block_comment_tail |
                     [^*#x00] block_comment_rest
block_comment_tail ::= "/" |
                     "*" block_comment_tail |
                     [^*/#x00] block_comment_rest

//Character Classes
letter        ::= [A-Za-z_]
decimal_digit ::= [0-9]
octal_digit   ::= [0-7]
hex_digit     ::= [0-9A-Fa-f]

byte_order_mark ::= "\xEF\xBB\xBF"

//Identifiers and Keywords
identifier ::= letter ( letter | decimal_digit )*

syntax   ::= "syntax"
float    ::= "float"
oneof      ::= "oneof"
import   ::= "import"
double   ::= "double"
map        ::= "map"
weak     ::= "weak"
int32    ::= "int32"
extensions ::= "extensions"
public   ::= "public"
int64    ::= "int64"
to         ::= "to"
package  ::= "package"
uint32   ::= "uint32"
max        ::= "max"
option   ::= "option"
uint64   ::= "uint64"
reserved   ::= "reserved"
inf      ::= "inf"
sint32   ::= "sint32"
enum       ::= "enum"
repeated ::= "repeated"
sint64   ::= "sint64"
message    ::= "message"
optional ::= "optional"
fixed32  ::= "fixed32"
extend     ::= "extend"
required ::= "required"
fixed64  ::= "fixed64"
service    ::= "service"
bool     ::= "bool"
sfixed32 ::= "sfixed32"
rpc        ::= "rpc"
string   ::= "string"
sfixed64 ::= "sfixed64"
stream     ::= "stream"
bytes    ::= "bytes"
group    ::= "group"
returns    ::= "returns"

//Numeric Literals
numeric_literal ::= "."? decimal_digit digit_point_or_exp*

digit_point_or_exp ::= "." | decimal_digit | ( "e" | "E" ) ( "+" | "-" )? | letter
int_literal ::= decimal_literal | octal_literal | hex_literal

decimal_literal ::= "0" | [1-9] decimal_digits?
octal_literal   ::= "0" octal_digits
hex_literal     ::= "0" ( "x" | "X" ) hex_digits
decimal_digits  ::= decimal_digit decimal_digit*
octal_digits    ::= octal_digit octal_digit*
hex_digits      ::= hex_digit hex_digit*

float_literal ::= decimal_digits "." decimal_digits? decimal_exponent? |
                decimal_digits decimal_exponent |
                "." decimal_digits decimal_exponent?

decimal_exponent  ::= ( "e" | "E" ) ( "+" | "-" )? decimal_digits

//String Literals
string_literal ::= single_quoted_string_literal | double_quoted_string_literal

single_quoted_string_literal ::= "'" ( [^#x0A#x00'\] | rune_escape_seq )* "'"
double_quoted_string_literal ::= '"' ( [^#x0A#x00"\] | rune_escape_seq )* '"'

rune_escape_seq    ::= simple_escape_seq | hex_escape_seq | octal_escape_seq | unicode_escape_seq
simple_escape_seq  ::= '\' ( "a" | "b" | "f" | "n" | "r" | "t" | "v" | '\' | "'" | '"' | "?" )
hex_escape_seq     ::= '\' ( "x" | "X" ) hex_digit hex_digit?
octal_escape_seq   ::= '\' octal_digit ( octal_digit octal_digit? )?
unicode_escape_seq ::= "\u" hex_digit hex_digit hex_digit hex_digit |
                     "\U" hex_digit hex_digit hex_digit hex_digit
                             hex_digit hex_digit hex_digit hex_digit

//Punctuation and Operators
semicolon ::= ";"
colon     ::= ":"
l_paren   ::= "("
l_bracket ::= "["
comma     ::= ","
equals    ::= "="
r_paren   ::= ")"
r_bracket ::= "]"
dot       ::= "."
minus     ::= "-"
l_brace   ::= "{"
l_angle   ::= "<"
slash     ::= "/"
plus      ::= "+"
r_brace   ::= "}"
r_angle   ::= ">"

//Syntax
//Source File Organization
File ::= byte_order_mark? SyntaxDecl? FileElement*

FileElement ::= ImportDecl |
              PackageDecl |
              OptionDecl |
              MessageDecl |
              EnumDecl |
              ExtensionDecl |
              ServiceDecl |
              EmptyDecl

//Declaration Types
EmptyDecl ::= semicolon

//Syntax Declaration
SyntaxDecl ::=  syntax equals SyntaxLevel semicolon

SyntaxLevel ::= StringLiteral

StringLiteral ::= string_literal string_literal*

PackageDecl ::= package PackageName semicolon

PackageName ::= QualifiedIdentifier

//Imports
ImportDecl ::= import ( weak | public )? ImportedFileName semicolon

ImportedFileName ::= StringLiteral

//Type References
TypeName ::= dot? QualifiedIdentifier

QualifiedIdentifier ::= identifier ( dot identifier )*

//Options
OptionDecl ::= option OptionName equals OptionValue semicolon

CompactOptions ::= l_bracket CompactOption ( comma CompactOption )* r_bracket

CompactOption  ::= OptionName equals OptionValue

//Option Names
OptionName ::= ( SimpleName | ExtensionName ) ( dot OptionName )?

SimpleName    ::= identifier
ExtensionName ::= l_paren TypeName r_paren

//Option Values
OptionValue ::= ScalarValue | MessageLiteralWithBraces

ScalarValue  ::= StringLiteral | UintLiteral | IntLiteral | FloatLiteral | identifier
UintLiteral  ::= plus? int_literal
IntLiteral   ::= minus int_literal
FloatLiteral ::= ( minus | plus )? ( float_literal | inf )

MessageLiteralWithBraces ::= l_brace MessageTextFormat r_brace

//Protobuf Text Format
MessageTextFormat ::= ( MessageLiteralField ( comma | semicolon )? )*

MessageLiteralField ::= MessageLiteralFieldName colon Value |
                      MessageLiteralFieldName MessageValue

MessageLiteralFieldName ::= FieldName |
                          l_bracket SpecialFieldName r_bracket
SpecialFieldName        ::= ExtensionFieldName | TypeURL
ExtensionFieldName      ::= QualifiedIdentifier
TypeURL                 ::= QualifiedIdentifier slash QualifiedIdentifier

Value          ::= ScalarValue | MessageLiteral | ListLiteral
MessageValue   ::= MessageLiteral | ListOfMessagesLiteral
MessageLiteral ::= MessageLiteralWithBraces |
                 l_angle MessageTextFormat r_angle

ListLiteral ::= l_bracket ( ListElement ( comma ListElement )* )? r_bracket
ListElement ::= ScalarValue | MessageLiteral

ListOfMessagesLiteral ::= l_bracket ( MessageLiteral ( comma MessageLiteral )* )? r_bracket

//Messages
MessageDecl ::= message MessageName l_brace MessageElement* r_brace

MessageName    ::= identifier
MessageElement ::= FieldDecl |
                 MapFieldDecl |
                 GroupDecl |
                 OneofDecl |
                 OptionDecl |
                 ExtensionRangeDecl |
                 MessageReservedDecl |
                 MessageDecl |
                 EnumDecl |
                 ExtensionDecl |
                 EmptyDecl

//Fields
FieldDecl ::= FieldCardinality? TypeName FieldName equals FieldNumber
            CompactOptions? semicolon

FieldCardinality ::= required | optional | repeated
FieldName        ::= identifier
FieldNumber      ::= int_literal

//Maps
MapFieldDecl ::= MapType FieldName equals FieldNumber CompactOptions? semicolon

MapType      ::= map l_angle MapKeyType comma MapValueType r_angle
MapKeyType   ::= int32   | int64   | uint32   | uint64   | sint32 | sint64 |
               fixed32 | fixed64 | sfixed32 | sfixed64 | bool   | string
MapValueType ::= TypeName

//Groups
GroupDecl ::= FieldCardinality? group FieldName equals FieldNumber
            CompactOptions? l_brace MessageElement* r_brace

//Oneofs
OneofDecl ::= oneof OneofName l_brace OneofElement* r_brace

OneofName    ::= identifier
OneofElement ::= OptionDecl |
               OneofFieldDecl |
               OneofGroupDecl

OneofFieldDecl ::= TypeName FieldName equals FieldNumber
                 CompactOptions? semicolon

OneofGroupDecl ::= group FieldName equals FieldNumber
                 CompactOptions? l_brace MessageElement* r_brace

//Extension Ranges
ExtensionRangeDecl ::= extensions TagRanges CompactOptions? semicolon

TagRanges     ::= TagRange ( comma TagRange )*
TagRange      ::= TagRangeStart ( to TagRangeEnd )?
TagRangeStart ::= FieldNumber
TagRangeEnd   ::= FieldNumber | max

//Reserved Names and Numbers
MessageReservedDecl ::= reserved ( TagRanges | Names ) semicolon

Names ::= StringLiteral ( comma StringLiteral )*

//Enums
EnumDecl ::= enum EnumName l_brace EnumElement* r_brace

EnumName    ::= identifier
EnumElement ::= OptionDecl |
              EnumValueDecl |
              EnumReservedDecl |
              EmptyDecl

//Enum Values
EnumValueDecl ::= EnumValueName equals EnumValueNumber CompactOptions? semicolon

EnumValueName   ::= identifier
EnumValueNumber ::= minus? int_literal

//Reserved Names and Numbers
EnumReservedDecl ::= reserved ( EnumValueRanges | Names ) semicolon

EnumValueRanges     ::= EnumValueRange ( comma EnumValueRange )*
EnumValueRange      ::= EnumValueRangeStart ( to EnumValueRangeEnd )?
EnumValueRangeStart ::= EnumValueNumber
EnumValueRangeEnd   ::= EnumValueNumber | max

//Extensions
ExtensionDecl ::= extend ExtendedMessage l_brace ExtensionElement* r_brace

ExtendedMessage  ::= TypeName
ExtensionElement ::= FieldDecl |
                   GroupDecl

//Services
ServiceDecl ::= service ServiceName l_brace ServiceElement* r_brace

ServiceName    ::= identifier
ServiceElement ::= OptionDecl |
                 MethodDecl |
                 EmptyDecl

//Methods
MethodDecl ::= rpc MethodName InputType returns OutputType semicolon |
             rpc MethodName InputType returns OutputType l_brace MethodElement* r_brace

MethodName    ::= identifier
InputType     ::= MessageType
OutputType    ::= MessageType
MethodElement ::= OptionDecl |
                EmptyDecl

MessageType ::= l_paren stream? TypeName r_paren
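
The `int_literal` and `float_literal` rules in the grammar above translate fairly directly into regular expressions, which can be handy for spot-checking inputs before pasting them into a parser. The Python sketch below is a hand translation of just those rules; the names are hypothetical and it ignores the looser `numeric_literal` rule as well as signs, which the grammar handles separately.

```python
import re

DEC_DIGITS = r"[0-9]+"                  # decimal_digits
EXPONENT = rf"[eE][+-]?{DEC_DIGITS}"    # decimal_exponent

# int_literal ::= decimal_literal | octal_literal | hex_literal
INT_LITERAL = re.compile(r"0[xX][0-9A-Fa-f]+|0[0-7]+|0|[1-9][0-9]*")

# float_literal: the three alternatives, in the order the grammar lists them
FLOAT_LITERAL = re.compile(
    rf"{DEC_DIGITS}\.(?:{DEC_DIGITS})?(?:{EXPONENT})?"
    rf"|{DEC_DIGITS}{EXPONENT}"
    rf"|\.{DEC_DIGITS}(?:{EXPONENT})?"
)


def classify(text):
    """Return "int", "float", or "invalid" for an unsigned numeric literal."""
    if INT_LITERAL.fullmatch(text):
        return "int"
    if FLOAT_LITERAL.fullmatch(text):
        return "float"
    return "invalid"


for text in ["0", "42", "0x1F", "017", "08", "1.5", "1.", ".5", "1e10", "1e"]:
    print(text, classify(text))
```

Note that `08` comes out as invalid under these rules: it is not a valid octal literal, and it has neither a dot nor an exponent to make it a float.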
