Slang alpha release

A syntax analysis API and unified grammar for all versions of Solidity. We are looking for feedback!

It’s been a while since we announced that we would start working on our Solidity compiler project, Slang, and a lot has happened since then. What started as a vision in need of funding and a team has materialized into key engineering hires who have spent the last year on research and development. We’re excited to finally start sharing some of the progress.

Slang objectives

The team started out with the following objectives:

  • Produce modular reusable infrastructure to make it more accessible for the Ethereum ecosystem to build Solidity tooling of any kind, increasing the quality and availability of developer tooling over time.
  • Match the language behavior of solc, which is the de facto language definition, for every language version (74 in total!) back to Solidity 0.4.11.
  • Allow Slang to be integrated from multiple programming languages and environments (e.g. aim for WASM compatibility).
  • Ensure continued compatibility with future versions of Solidity, regardless of how much the language may change.
  • Provide a built-in language server for all language versions.
  • Create a language specification website to act as the definitive guide to all versions of Solidity.

Slang architecture

A particular challenge is the rapid evolution of Solidity, and the need to support each released version for as long as that version is in use on Ethereum chains. Rather than producing a separate version of Slang for each version of the language, Slang treats language versions as a first-class concept and lets you provide the language version as a parameter. Since one library covers all versions of the language, there’s no need to change version dependencies in your tools or download multiple library versions. This has the additional benefit of exposing the versioning in the language model itself, so we can, for example, generate documentation showing the differences between versions.
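To make "language version as a parameter" concrete, here is a minimal Python sketch. It is not Slang's API or data model: `VersionedItem`, `statements_for`, and `diff_versions` are hypothetical names, and the handful of version boundaries (such as `throw` being removed in 0.5.0 and `unchecked` blocks arriving in 0.8.0) are illustrative examples rather than an extract from the real specification.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple

Version = Tuple[int, int, int]


def parse_version(text: str) -> Version:
    """Turn "0.4.21" into (0, 4, 21) so versions compare numerically."""
    major, minor, patch = (int(part) for part in text.split("."))
    return (major, minor, patch)


@dataclass(frozen=True)
class VersionedItem:
    """Hypothetical language item tagged with the versions it exists in."""
    name: str
    enabled_from: Optional[Version] = None   # first version that has it
    disabled_from: Optional[Version] = None  # first version without it

    def is_enabled(self, version: Version) -> bool:
        if self.enabled_from is not None and version < self.enabled_from:
            return False
        if self.disabled_from is not None and version >= self.disabled_from:
            return False
        return True


# Illustrative data only; a real specification would cover the whole language.
STATEMENTS = [
    VersionedItem("IfStatement"),
    VersionedItem("ThrowStatement", disabled_from=(0, 5, 0)),
    VersionedItem("EmitStatement", enabled_from=(0, 4, 21)),
    VersionedItem("TryStatement", enabled_from=(0, 6, 0)),
    VersionedItem("UncheckedBlock", enabled_from=(0, 8, 0)),
]


def statements_for(version_text: str) -> List[str]:
    """One definition, queried with the language version as a parameter."""
    version = parse_version(version_text)
    return [item.name for item in STATEMENTS if item.is_enabled(version)]


def diff_versions(old: str, new: str) -> Tuple[List[str], List[str]]:
    """Return (added, removed) item names between two language versions."""
    before, after = set(statements_for(old)), set(statements_for(new))
    return sorted(after - before), sorted(before - after)
```

Because the version ranges live in the data, the same table can answer both "what is valid in this version?" and "what changed between these two versions?", which is the kind of query that documentation generation needs.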

We decided to build Slang on an architecture around a declarative single source of truth and code generation. This means that:

  • The definition of the language is purely data — there is no code involved. This is not unusual when defining the syntax of a language in tools such as Bison and ANTLR, but we intend to take this much further. We think we can remain completely declarative for all aspects of the language definition, including binding, type checking, refactoring, control flow analysis, and more.
  • The specification of the language is defined as a single (virtual) document.
  • We don’t manually write parsing code, binding analysers, type checkers, etc. We write code generators that generate everything from the declarative specification. We also intend to generate inputs to other tools, such as Tree-sitter and TextMate grammars. The input specification contains some markdown prose, but most of the documentation is generated from it.
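The "single source of truth plus code generation" idea can be sketched in a few lines of Python. Everything below is an assumption made for illustration: the `SPEC` layout, the production names, and both generator functions are hypothetical and do not reflect Slang's real specification format or generators. The point is only that two different outputs are derived from the same data, with no hand-written grammar in either.

```python
import re
from typing import Dict, List

# Hypothetical declarative spec: pure data, no parsing logic.
SPEC: List[Dict] = [
    {"name": "BreakStatement",
     "sequence": [("keyword", "break"), ("token", ";")]},
    {"name": "ContinueStatement",
     "sequence": [("keyword", "continue"), ("token", ";")]},
    {"name": "ReturnStatement",
     "sequence": [("keyword", "return"), ("optional", "Expression"), ("token", ";")]},
]


def generate_ebnf(spec: List[Dict]) -> str:
    """Generator 1: EBNF-style text suitable for reference documentation."""
    lines = []
    for production in spec:
        parts = []
        for kind, value in production["sequence"]:
            if kind in ("keyword", "token"):
                parts.append(f'"{value}"')
            elif kind == "optional":
                parts.append(f"[ {value} ]")
            else:
                raise ValueError(f"unknown element kind: {kind}")
        lines.append(f'{production["name"]} = {" ".join(parts)} ;')
    return "\n".join(lines)


def generate_keyword_regex(spec: List[Dict]) -> str:
    """Generator 2: a keyword pattern of the sort a syntax highlighter uses."""
    keywords = sorted({value
                       for production in spec
                       for kind, value in production["sequence"]
                       if kind == "keyword"})
    return r"\b(?:" + "|".join(re.escape(k) for k in keywords) + r")\b"


if __name__ == "__main__":
    print(generate_ebnf(SPEC))
    print(generate_keyword_regex(SPEC))
```

Adding a production to `SPEC` updates both outputs at once, which is the property the declarative approach is after.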

For the compilers and languages people out there, you can think of the underlying technology of Slang as being similar in spirit to The Spoofax Language Workbench. We use some of their research results, such as Scope Graphs. In contrast to Spoofax, Slang:

  • uses a declarative, single specification
  • has language versioning as a first-class concept
  • is built for the world of LSPs
  • produces APIs for many languages
  • produces user documentation such as a language specification
  • is written in Rust

What’s in this release?

This first alpha release is intended to be a complete and correct parser with respect to Solidity versions 0.4.11 to 0.8.19. It provides a fail-fast parser, without error recovery, that produces a stable Concrete Syntax Tree.
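As a rough illustration of what "fail-fast" and "Concrete Syntax Tree" mean here, the following Python sketch parses a toy language made up of only `break;` and `continue;` statements. It is not the Slang parser or its API; the `Kind` enum, `Node`, `Token`, `ParseError`, `parse`, and `unparse` are all hypothetical. It shows two properties: the tree keeps every character of the input, including whitespace, and the first error aborts parsing without producing a tree.

```python
import re
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Tuple, Union


class Kind(Enum):
    SOURCE_UNIT = auto()
    BREAK_STATEMENT = auto()
    CONTINUE_STATEMENT = auto()
    KEYWORD = auto()
    SEMICOLON = auto()
    WHITESPACE = auto()


@dataclass
class Token:
    kind: Kind
    text: str


@dataclass
class Node:
    kind: Kind
    children: List[Union["Node", Token]] = field(default_factory=list)


class ParseError(Exception):
    def __init__(self, offset: int, message: str):
        super().__init__(f"{message} at offset {offset}")
        self.offset = offset


TOKEN_RE = re.compile(r"(?P<ws>\s+)|(?P<kw>(?:break|continue)\b)|(?P<semi>;)")
GROUP_KINDS = {"ws": Kind.WHITESPACE, "kw": Kind.KEYWORD, "semi": Kind.SEMICOLON}


def tokenize(source: str) -> List[Tuple[int, Token]]:
    tokens, pos = [], 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise ParseError(pos, "unexpected character")
        tokens.append((pos, Token(GROUP_KINDS[match.lastgroup], match.group())))
        pos = match.end()
    return tokens


def parse(source: str) -> Node:
    """Return a CST for valid input; raise on the first error (no recovery)."""
    root = Node(Kind.SOURCE_UNIT)
    statement = None  # statement currently being built, if any
    for offset, token in tokenize(source):
        if token.kind is Kind.WHITESPACE:
            target = statement if statement is not None else root
            target.children.append(token)  # trivia is kept, not discarded
        elif token.kind is Kind.KEYWORD:
            if statement is not None:
                raise ParseError(offset, "expected ';'")
            kind = (Kind.BREAK_STATEMENT if token.text == "break"
                    else Kind.CONTINUE_STATEMENT)
            statement = Node(kind, [token])
        else:  # semicolon closes the current statement
            if statement is None:
                raise ParseError(offset, "expected statement")
            statement.children.append(token)
            root.children.append(statement)
            statement = None
    if statement is not None:
        raise ParseError(len(source), "expected ';'")
    return root


def unparse(node: Union[Node, Token]) -> str:
    """Concatenate all leaves; for a CST this reproduces the input exactly."""
    if isinstance(node, Token):
        return node.text
    return "".join(unparse(child) for child in node.children)
```

For any input that `parse` accepts, `unparse(parse(source)) == source` holds, which is what lets tools such as formatters and refactoring engines work on a CST without losing layout.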

This release consists of the following resources:

Notable omissions in this release

  • The parser produces a tree only from syntactically valid inputs.
  • The language reference documentation site includes the grammar but is missing most explanatory prose.
  • Performance isn’t yet a goal: this release is not as fast as we want Slang to be.

We think this release is a good match for tools requiring syntactic analysis of correct Solidity code, especially where you want to cover all in-use versions of Solidity with a single tool.

Next steps

The immediate items on our roadmap are:

  • an initial implementation of error recovery
  • binding analysis using scope graphs, i.e. def-use information
  • domain-typed AST wrappers around the enum-typed CST
  • initial prose descriptions for the language reference
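To illustrate the "domain-typed AST wrappers around the enum-typed CST" item above, here is a small Python sketch. It is a guess at the general shape of the idea, not the planned Slang design: `Kind`, `CstNode`, and `ReturnStatement` are hypothetical names. The generic CST node carries only an enum tag and children, while the wrapper checks the tag once and then exposes named, typed accessors.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import List, Optional


class Kind(Enum):
    RETURN_STATEMENT = auto()
    RETURN_KEYWORD = auto()
    EXPRESSION = auto()
    SEMICOLON = auto()


@dataclass
class CstNode:
    """Enum-typed CST node: uniform shape, tagged with a kind."""
    kind: Kind
    text: str = ""
    children: List["CstNode"] = field(default_factory=list)


class ReturnStatement:
    """Hypothetical domain-typed view over a RETURN_STATEMENT CST node."""

    def __init__(self, node: CstNode):
        if node.kind is not Kind.RETURN_STATEMENT:
            raise TypeError(f"expected RETURN_STATEMENT, got {node.kind.name}")
        self._node = node  # the wrapper borrows the CST; it does not copy it

    @property
    def expression(self) -> Optional[CstNode]:
        """The returned expression, or None for a bare `return;`."""
        for child in self._node.children:
            if child.kind is Kind.EXPRESSION:
                return child
        return None


if __name__ == "__main__":
    cst = CstNode(Kind.RETURN_STATEMENT, children=[
        CstNode(Kind.RETURN_KEYWORD, "return"),
        CstNode(Kind.EXPRESSION, "x"),
        CstNode(Kind.SEMICOLON, ";"),
    ])
    print(ReturnStatement(cst).expression.text)
```

With a wrapper like this, tool authors can write `statement.expression` instead of searching children by kind, while the underlying CST stays available for anything that needs exact source fidelity.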

We intend to iterate rapidly on error recovery. Currently we fail on the first error and don’t produce a tree for invalid input. Our eventual goal is to produce best-in-class automatic correction, diagnostic messages, and suggestions, using automated methods in conjunction with Datalog patterns embedded in the specification for edge cases. Elm-level quality is our benchmark.

In this initial release error messages look like this:

But our goal is, over time, to get to error messages that look like this:

Get involved

Our main objective behind this release is to get real-world usage of the API so that we can iterate based on user suggestions. We would love feedback on, and improvements to, the reference documentation and the parser API, and especially bug reports about any corners of Solidity that the specification does not yet cover properly. Contributions to the language description are also very welcome.

Please reach out if you’re interested in helping with initial feedback, either via the Slang GitHub repo or the Slang Telegram channel.

Thank you!
Slang Team at Nomic Foundation.