How to write your own Solidity linter using Slang
… in 20 lines of code!
Slang is Nomic Foundation’s modular set of compiler APIs empowering the next generation of Solidity code analysis and developer tooling. It’s written in Rust and distributed in multiple languages. It’s currently in alpha stage and in active development, but already useful for many things! Check out the initial alpha release announcement to learn more.
In this guide we will show how you can use Slang to write a simple linter for Solidity in just 20 lines of code. To pick a simple yet real-life example, we will write our own version of the solhint avoid-tx-origin rule, which warns whenever tx.origin is used in the code.
Let’s get started!
The official Solidity documentation includes an example that illustrates why using tx.origin for authorization is a bad idea:
// SPDX-License-Identifier: GPL-3.0
pragma solidity >=0.7.0 <0.9.0;

// THIS CONTRACT CONTAINS A BUG - DO NOT USE
contract TxUserWallet {
    address owner;

    constructor() {
        owner = msg.sender;
    }

    function transferTo(address payable dest, uint amount) public {
        // THE BUG IS RIGHT HERE:
        // You must use 'msg.sender' instead of 'tx.origin'.
        require(tx.origin == owner);
        dest.transfer(amount);
    }
}
We want our linter to report concisely where the offending code is:
example.sol:15:13: warning: avoid using `tx.origin`
Rather than manually walking the syntax tree to find source code patterns, as most linters do, we can use Slang’s tree query language. It lets us concisely specify the pattern we are looking for, and the query engine performs all the heavy lifting for us!
To give you a sense of where we’re going, here’s a preview:
// Query that we’ll use to find the `tx.origin` expression
const query = Query.parse(`
  @txorigin [MemberAccessExpression
    [Expression ["tx"]]
    ["origin"]
  ]
`);

// Parse the source code:
const parser = Parser.create("0.8.22");
const output = parser.parse(NonterminalKind.SourceUnit, contents);

// Query the CST and print the results:
const cursor = output.createTreeCursor();
for (const match of cursor.query([query])) {
  const txorigin = match.captures["txorigin"]![0]!;
  const { line, column } = txorigin.textOffset;
  console.warn(`${filePath}:${line + 1}:${column + 1}: warning: avoid using \`tx.origin\``);
}
This prints exactly what we want! Let’s dive in and learn how to implement this step-by-step.
Installation
First, we need to install Slang. The compiler is written in Rust and distributed both as a Rust package and an NPM package with TypeScript definitions. In this guide, we will use the latter.
Let’s open a terminal and create a new project:
mkdir my-awesome-linter/
cd my-awesome-linter
npm init
npm install @nomicfoundation/slang@0.18
Setting up TypeScript
We will use TypeScript to write our linter. Let’s install it and create a tsconfig.json file:
npm install --save-dev typescript@5
npm install --save-dev @types/node@22
npx tsc --init
Parsing the Solidity code
To analyze the code, we need to parse it into a concrete syntax tree (CST). A CST can represent incomplete or invalid code, and is a good starting point for writing a linter.
Let’s start by writing a simple index.mts that reads the contents of a file, specified as the first command line argument:
import fs from 'node:fs';
const filePath = process.argv[2];
const contents = fs.readFileSync(filePath, 'utf8');
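Optionally, we can fail early with a usage hint when no argument is given. This is just a small sketch; the compiled file name in the message is a placeholder:
// Optional: bail out with a usage hint if no file path was passed.
// The "index.mjs" name below is hypothetical and depends on your build setup.
if (!filePath) {
  console.error("usage: node index.mjs <file.sol>");
  process.exit(1);
}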
Supporting multiple versions of Solidity
The Solidity language has changed quite a lot over time; however, Slang is able to parse all versions of Solidity that are in use today, which we consider to be 0.4.11 and later.
Let’s say that we want to be source-compatible with code that’s expected to work with Solidity 0.8.22. First, we construct an instance of the Parser class, which is the main entry point for parsing Solidity code:
import { Parser } from "@nomicfoundation/slang/parser";
const parser = Parser.create("0.8.22");
Parsing different language constructs
To parse the file using Slang, we’ll use the parser.parse() method, which takes a NonterminalKind as its first argument, allowing us to specify which language construct to parse. Since we want to parse the entire file, we’ll use NonterminalKind.SourceUnit.
import { NonterminalKind } from "@nomicfoundation/slang/cst";
const output = parser.parse(NonterminalKind.SourceUnit, contents);
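The same parser can target other constructs, too. As an illustration, assuming the grammar exposes a ContractDefinition nonterminal (see the NonterminalKind enum for the full list), we could parse a lone contract snippet instead of a whole file:
// Hypothetical example: parse a single contract definition rather than a full source unit.
const contractOutput = parser.parse(
  NonterminalKind.ContractDefinition,
  "contract Counter { uint256 count; }",
);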
Inspecting the parse output
The parse function returns a ParseOutput object, which contains the root of the CST (.tree) and a list of parse errors (.errors), if there are any.
To inspect the CST, we could print it to the console to see what it looks like:
console.log(output.tree.toJson());
// Should print something like:
// { "kind":"SourceUnit", "children": [...] }
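A real linter should also surface parse errors before querying the tree. Here is a minimal sketch; it assumes each entry in output.errors exposes a human-readable message (check the ParseError API for the exact shape):
// Report any parse errors up front.
// Assumption: each error has a `message` property; consult the Slang docs for the exact API.
for (const error of output.errors) {
  console.error(`${filePath}: parse error: ${error.message}`);
}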
Matching specific patterns of code
We have just parsed the Solidity code into a structured representation that we can now analyze.
To analyze the CST we will use Slang’s tree query language, which was designed specifically for tasks like ours, and is a great alternative to analyzing the tree manually, due to its brevity and declarative nature.
Tree queries are instances of the Query class, created by parsing a query string. A query matches specific CST patterns and optionally binds captures to the matched nodes. The syntax is described in the Tree Query Language reference.
Without going too much into the details of this query, we want to match the tx.origin expression, which is a MemberAccessExpression with the tx identifier as the left-hand side and the origin identifier as the right-hand side:
import { Query } from "@nomicfoundation/slang/cst";
const query = Query.parse(`
  @txorigin [MemberAccessExpression
    [Expression ["tx"]]
    ["origin"]
  ]
`);
There’s a lot to unpack here! Let’s break it down:
- Tree nodes are enclosed in square brackets [].
- The first name inside the square brackets matches the given node’s NonterminalKind.
- After it comes the list of child nodes we expect to match.
- @-prefixed names before nodes are captures, which are used to refer to specific nodes of the matched pattern.
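The same shape adapts easily to other checks. As a purely illustrative example (not part of the original solhint rule), a hypothetical query flagging block.timestamp would look almost identical:
// Hypothetical extra rule: flag uses of `block.timestamp`.
const timestampQuery = Query.parse(`
  @timestamp [MemberAccessExpression
    [Expression ["block"]]
    ["timestamp"]
  ]
`);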
Running the queries
Queries are executed using the Cursor class, which is another way to traverse the syntax tree, so we need to instantiate one that starts at the root of the tree:
const cursor = output.createTreeCursor();
// This is a shorthand for:
// output.tree.asNonterminalNode()!.createCursor({ utf8: 0, utf16: 0, line: 0, column: 0 });
While it’s possible to run multiple different queries concurrently using the same cursor, we will only run one in our case. To access the matched query results, we pass the queries to cursor.query() and iterate over the matches; under the hood, this calls next() repeatedly until it returns null:
for (const match of cursor.query([query])) {
  // ... do something with the matched tree fragment
}
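If you prefer driving the iterator by hand, the loop above can also be written out explicitly. This sketch simply follows the description above and assumes next() returns null once the matches are exhausted:
// Equivalent manual iteration over the query matches.
const matches = cursor.query([query]);
let match = matches.next();
while (match !== null) {
  // ... do something with the matched tree fragment
  match = matches.next();
}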
Now, for each query result, we can use the captures we defined in the query to access the nodes we are interested in.
Each cursor points to a single node, but a capture can return multiple cursors, depending on the query. In our case, @txorigin will return an array of one Cursor pointing to a MemberAccessExpression node.
Let’s inspect the JSON representation of the matched node pointed to by a Cursor:
const txorigin = match.captures["txorigin"]![0]!;
console.log(txorigin.node.toJson());
// Should print our matched node:
// { "kind": "MemberAccessExpression", "children": [...] }
Reporting the findings
The only thing left to do is to report our findings to the user.
Because our queries give us back a Cursor pointing to the offending node, we can use its .textOffset property to map back to its position in the source code. This property contains .line and .column fields, which are exactly what we need. Keep in mind that Slang uses 0-based indexing, while error reports and editors typically use the more natural 1-based indexing, so we need to add 1 to these offsets when displaying them.
With that, we can print a warning message informing the user where the offending code is:
const txorigin = match.captures["txorigin"]![0]!;
const { line, column } = txorigin.textOffset;
console.warn(`${filePath}:${line + 1}:${column + 1}: warning: avoid using \`tx.origin\``);
To access the full span of the node, we could use the textRange property on the cursor, which returns the start and the end offsets of the node in the source code.
We could get even more creative and plug this information into a custom formatter of our choice, but for now this will suffice.
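For example, here is a rough sketch of how the full span could be included in the warning, assuming the start and end offsets returned by textRange expose the same line and column fields we used above:
// Sketch: include the full span of the offending expression in the warning.
// Assumes textRange.start / textRange.end expose `line` and `column` like textOffset does.
const { start, end } = txorigin.textRange;
console.warn(
  `${filePath}:${start.line + 1}:${start.column + 1}-${end.line + 1}:${end.column + 1}: warning: avoid using \`tx.origin\``,
);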
Putting it all together
Here’s the complete code for our linter:
import fs from "node:fs";
import { NonterminalKind, Query } from "@nomicfoundation/slang/cst";
import { Parser } from "@nomicfoundation/slang/parser";
const filePath = process.argv[2];
const contents = fs.readFileSync(filePath, "utf8");
const parser = Parser.create("0.8.22");
const output = parser.parse(NonterminalKind.SourceUnit, contents);
const query = Query.parse(`
  @txorigin [MemberAccessExpression
    [Expression ["tx"]]
    ["origin"]
  ]
`);
const cursor = output.createTreeCursor();
for (const match of cursor.query([query])) {
  const txorigin = match.captures["txorigin"]![0]!;
  console.log(txorigin.node.toJson());
  const { line, column } = txorigin.textOffset;
  console.warn(`${filePath}:${line + 1}:${column + 1}: warning: avoid using \`tx.origin\``);
}
If we don’t count the empty lines, the code is indeed 20 lines long! 🎉
Conclusion
In this guide, we’ve demonstrated how to create a simple linter for Solidity using Slang, implementing our own version of the avoid-tx-origin rule from solhint in just 20 lines of code.
We covered the essentials of parsing Solidity code, identifying specific code patterns, and reporting findings to users in a clear and straightforward manner.
We hope that this guide has inspired you to write your own linters or any other tools that operate on Solidity code using Slang!
If you have any questions or feedback, feel free to reach out to us on GitHub, or check out Slang’s documentation.