SQLMesh Sushi: An End-To-End Data Pipeline Tutorial
Get hands-on with SQLMesh for scalable data transformation.
Welcome to a step-by-step tutorial on using SQLMesh with its official Sushi example project. We will start by setting up SQLMesh on your local machine, then walk through two versions of the Sushi project – a simple version and a moderate one – to demonstrate core SQLMesh features. By the end, you’ll know how to build and run both pipelines and understand key concepts like incremental models, signals, tests, audits, cron scheduling, linting, external models, and macros.
Prerequisites and Environment Setup
Before diving in, make sure you have the following:
Python 3.8+ installed on your system (Python 3.11 recommended).
Git installed for cloning repositories.
A terminal/command-line interface to run commands.
1. Clone the SQLMesh examples repository: The Sushi example is part of the sqlmesh-examples repository. In your terminal, navigate to a directory where you want to keep the project, then clone the repo:
git clone https://github.com/TobikoData/sqlmesh-examples.git
This will create a folder named sqlmesh-examples containing the example projects.
2. Create a Python virtual environment: It’s best practice to use a virtual environment to isolate project dependencies. Change into the new sqlmesh-examples directory, then create and activate a virtual environment:
cd sqlmesh-examples
python3 -m venv .venv
source .venv/bin/activate
After activation, your shell prompt should indicate you’re in the virtual environment (e.g. a (.venv) prefix).
3. Install SQLMesh and dependencies: With the virtual environment active, install SQLMesh. You can install just the core library, or include extras like the web UI and Jupyter support for notebooks:
Minimal install: pip install sqlmesh – installs SQLMesh CLI and core functionality.
Full install (recommended): pip install "sqlmesh[web]" notebook – includes the SQLMesh web UI and Jupyter notebook support.
This tutorial will use the web UI for visualizing the pipeline, so the full install is recommended.
4. Verify the installation: Check that the sqlmesh command is available. For example, run:
sqlmesh --version
You should see SQLMesh print its installed version number. You are now ready to explore the Sushi example project!
Project Overview: The Sushi Example
SQLMesh’s Sushi example simulates a small data warehouse for a fictional sushi restaurant. It comes in two flavors (pun intended): a simple project and a moderate project, each demonstrating different features and complexity levels. Both projects use the same underlying “sushi” dataset (orders, customers, items, etc.), but the models and SQLMesh features used differ:
Simple project (1_simple/): Contains four models of type VIEW and one SEED model. All transformations here are defined as views, which are recomputed from scratch on each run, making it straightforward to understand basic SQLMesh behavior. The seed model provides some static seed data to bootstrap the pipeline.
Moderate project (2_moderate/): Contains five models of type INCREMENTAL_BY_TIME_RANGE (incremental models), one FULL model, one VIEW, and one SEED. This version introduces incremental processing – only new data is processed after each run – and other advanced features. For example, one model (customer_revenue_lifetime) demonstrates a more complex incremental calculation (like computing customer lifetime value).
What is a SEED model? In SQLMesh, a seed model is a special kind of model backed by a static CSV data file included in the project. When you define a model with kind SEED, SQLMesh will create a physical table in your local DuckDB (or warehouse) that contains the CSV’s data, and you can treat it like any other source table in your SQL. Seed models are great for reference data or lookup tables that change infrequently (e.g. a list of holidays). In the Sushi projects, the seed model provides initial data (like a small set of sushi orders or customers) so that the pipeline can run without needing an external data source.
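For reference, a seed model is declared just like any other model, with its kind pointing at the CSV file. A minimal sketch (the model name and path here are illustrative, not necessarily those used in the Sushi projects):

MODEL (
  name sushi.waiter_names,
  kind SEED (
    path '../seeds/waiter_names.csv'
  )
);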
Both the simple and moderate projects use DuckDB as the SQL engine by default. DuckDB is an in-process SQL database, so you don’t need to set up any external database – everything will run locally. The example repository even comes with a DuckDB file pre-populated with a small amount of initial data for the sushi restaurant, so the projects will run out-of-the-box.
Project Structure
Let’s take a quick look at the structure of the Sushi example projects. Inside the sqlmesh-examples/001_sushi directory, you’ll find subfolders for each version:
1_simple/ – the simple Sushi project.
2_moderate/ – the moderate Sushi project.
sushi-overview.ipynb – a Jupyter Notebook providing an overview of the projects (optional reading).
Each project (simple or moderate) is a self-contained SQLMesh project. Inside each, you should see:
A models/ directory containing SQL model files (.sql files). This includes all model definitions (views, incremental models, etc.). Models are typically namespaced by schema (e.g., sushi.orders.sql).
A seeds/ directory containing CSV files for any seed models (e.g., an orders.csv or similar used by a SEED model).
A tests/ directory (especially in the moderate project) containing YAML files for unit tests of models.
An audits/ directory for any audit SQL queries (data quality checks).
A signals/ directory for custom signal definitions (if any are used).
A config.yaml (or config.py) – the project configuration file, defining the default target engine (DuckDB) and other settings (like linting rules or default environments); a minimal sketch of such a config follows this list.
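For orientation, a minimal config.yaml for a DuckDB-backed project looks roughly like this (a sketch – the gateway name and database filename are illustrative, and the example projects ship with their own configuration):

gateways:
  local:
    connection:
      type: duckdb
      database: local.db

default_gateway: local

model_defaults:
  dialect: duckdb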
Now that we understand the context, let's dive into running the simple project first to get a feel for SQLMesh’s workflow.
Running the Simple Sushi Pipeline (Project 1_simple)
The simple Sushi project is an ideal starting point for SQLMesh. All models here are views, meaning they are rebuilt fully on each execution, and there’s one seed for initial data. This allows us to focus on the basic SQLMesh workflow without worrying about incremental logic yet.
Step 1: Navigate to the project directory
From the repository root (sqlmesh-examples), change into the simple project folder inside 001_sushi:
cd 001_sushi/1_simple
This directory is now our SQLMesh project root for the simple pipeline. (Each SQLMesh project is self-contained; by changing directories, we ensure the CLI commands apply to the correct project.)
Step 2: Initialize the project environment (if needed)
SQLMesh uses the concept of environments (like git branches for data). Running sqlmesh plan with no arguments targets the prod environment, while sqlmesh plan dev (or any other name) creates an isolated development environment on the fly. You don’t need to manually initialize anything for the example – the project already includes configuration to use DuckDB and the seed data.
However, it’s good to know that if you ever needed to explicitly initialize a new project, you could run sqlmesh init duckdb to scaffold one (not needed here since the example is pre-made).
Step 3: Plan the changes (create a plan)
Run the SQLMesh plan command to evaluate the project models and create an execution plan:
sqlmesh plan
The plan command is central in SQLMesh’s development workflow. It will do the following for our project:
Parse and validate all model SQL – SQLMesh builds a DAG (Directed Acyclic Graph) of model dependencies by analyzing the SELECT statements. Because our models are views referencing each other and the seed, SQLMesh will infer how data flows from one model to another.
Apply linting rules – If the project has the linter enabled, sqlmesh plan will check each model for any SQL style or semantic issues. (We’ll discuss the linter shortly.)
Run unit tests – If any tests are defined in the tests/ folder, SQLMesh executes them to verify that model outputs match expectations. In the simple project, we may or may not have test files; we will see the output to confirm.
Prepare an execution plan – Because this is the first time we are running the project, SQLMesh will treat all models as new and plan to create their tables in DuckDB. The plan will include backfilling any models that require it. (In a simple all-views project, “backfill” just means it will compute all views once since they have no prior state.)
After you run sqlmesh plan, you should see output describing the plan. For example, SQLMesh will list each model and indicate that it’s a new model to be created. It might look something like:
New models:
  sushi.customers
  sushi.orders
  sushi.order_items
  ...
Models needing backfill (missing dates):
  ...
At the end, SQLMesh will prompt you to apply the plan and backfill the tables. Go ahead and confirm by typing y and pressing Enter. Applying the plan means SQLMesh will execute the SQL to create or update the model tables in DuckDB.
What just happened? By planning and applying, SQLMesh created the physical objects in DuckDB corresponding to the definitions in our project. Because most models are of kind VIEW (recreated from their query on each run), SQLMesh created a view for each of them, and the seed model’s CSV file was loaded into a table.
Step 4: Verify the results
At this point, our simple Sushi data pipeline has run. We can verify that data is present and the models produced outputs:
Using the SQLMesh UI: If you installed the web UI, run sqlmesh ui from the project directory. This launches the SQLMesh web application in your browser. In the UI, you can explore the project’s DAG (a graph of model dependencies) and even preview the tables. For example, you should see nodes for sushi.orders, sushi.order_items, etc., and you can click on them to see their SQL and sample data. This visual DAG is helpful to confirm that upstream models (like the seed) connect properly to downstream models.
Using CLI or DuckDB directly: You can also query the DuckDB database directly to see the results. By default, SQLMesh uses a local DuckDB file (often named local.db or similar in the project directory). Launch DuckDB’s shell, or use sqlmesh fetchdf with a SQL query to pull results into a dataframe. For instance:
sqlmesh fetchdf 'select * from raw.order_items'
At this stage, you have successfully set up SQLMesh and executed the simple sushi pipeline. All models were built fresh using the seed data. This covered the basics of planning and applying a SQLMesh project, and you saw how SQLMesh infers the dependency graph and builds everything in order.
Before moving to the moderate project, let’s introduce some of the SQLMesh features that will become more important as our pipeline grows:
Incremental Models: Models that only process new or changed data since the last run, instead of recomputing everything.
Cron Scheduling: Each model in SQLMesh has a cron schedule indicating how often it should be refreshed (daily by default). In our simple project, models likely use the default @daily cron, meaning the pipeline is intended to run daily.
Testing and Audits: SQLMesh can validate your data pipeline through unit tests (which run on plan) and data audits (which run on each apply). We’ll see these in action in the moderate project.
Macros: Reusable SQL or Python snippets to avoid repetition in queries.
Signals: Custom conditions that must be met before a scheduled run executes (advanced scheduling control).
Linter: A built-in SQL linter to enforce best practices and catch common errors in SQL code before execution.
External Models: References to tables outside of SQLMesh’s control, so that the system knows their schema and can incorporate them without managing their data.
The simple project doesn’t heavily use these advanced features (since it’s mostly straightforward views), so now we’ll upgrade to the moderate project which showcases many of them.
The Moderate Sushi Project – Advanced Features in Action
The moderate Sushi project (2_moderate/ directory) builds upon the simple one with more models and complexity. Key differences include:
Several models are defined as incremental (specifically, INCREMENTAL_BY_TIME_RANGE models) to process data in time-based partitions instead of full refresh.
There is at least one FULL model (which is recomputed in full each time it runs, rather than processed incrementally).
The project likely includes tests and audits to ensure data quality, as well as possibly a signal or two to illustrate custom scheduling logic.
More use of macros or advanced SQL features to handle calculations like customer lifetime revenue.
Let’s repeat a similar process for the moderate project, and along the way, explain the new concepts:
Step 1: Switch to the moderate project directory
Exit the UI (if open) and navigate to the moderate project:
cd ../2_moderate
Now you’re in sqlmesh-examples/001_sushi/2_moderate, the moderate project’s root.
Step 2: Plan and apply the moderate project
Run sqlmesh plan again in this directory:
sqlmesh plan
SQLMesh will parse all the models in 2_moderate/models. This project has more models and different kinds, so let’s highlight what happens:
Incremental models detected: You will see in the model definitions (open the .sql files in models/) that many are declared with MODEL ( ..., kind INCREMENTAL_BY_TIME_RANGE(...), cron '@daily', ... ). Incremental models in SQLMesh are powerful – instead of recomputing the entire dataset each day, they only process the new time “partition.” Each incremental model specifies a time column that partitions the data (e.g. order_date or similar) and a lookback window if needed. For example, a model might be defined as:
MODEL (
  name sushi.sales_by_day,
  kind INCREMENTAL_BY_TIME_RANGE (
    time_column date,
    lookback 7
  ),
  cron '@daily'
);

SELECT ... FROM sushi.orders WHERE date BETWEEN @start_ds AND @end_ds;
The cron '@daily' means this model is intended to run daily, processing one day’s worth of data each run. The special macro variables @start_ds and @end_ds are used in the WHERE clause – SQLMesh automatically replaces these with the start and end of the time interval it’s currently processing. In other words, if the model runs for 2025-05-17, @start_ds might be '2025-05-17' and @end_ds '2025-05-17' (inclusive boundaries) for that interval. SQLMesh will ensure the query only pulls data for that interval and will prevent data leakage by also wrapping the query with its own time filter.
When sqlmesh plan runs for the first time here, since we have no prior state, it will plan to backfill all incremental models from their start date. Often, incremental models have a start property in the model config (default is yesterday if not specified). If our sushi data covers, say, the past month, SQLMesh will propose to run intervals for each day in that range to catch up the data. This means the initial plan for incremental models may include multiple intervals to backfill. The plan summary will indicate something like "Backfill 30 days for model X" if applicable.
Understanding incremental in SQLMesh: By default, whenever you create an incremental model, SQLMesh is cautious about changes that could invalidate historical data. If you modify an incremental model’s logic, SQLMesh will classify the change as breaking or non-breaking. A breaking change forces a full refresh (which for incremental means backfilling from scratch unless marked as forward-only). However, if a change is marked as forward-only (meaning we accept that old data stays as is, and only new data will use the new logic), SQLMesh can allow schema changes without full backfill. In our example, since it’s the first run, we don’t have that scenario yet, but keep this in mind as a powerful feature for very large tables where a full refresh is impractical.
Audits and tests during planning: The moderate project likely has audit SQL and unit tests:
Tests: SQLMesh will run any tests defined in tests/ YAML files during the plan. Tests in SQLMesh are essentially data assertions on models, defined by specifying input sample data and expected output for a model. For example, a test might feed a small set of orders into an upstream model and expect a certain aggregation result in a downstream model. If any test fails (meaning a model’s output didn’t match the expected values), the plan will not be applied, alerting you to a potential logic issue. Tests are a way to protect your project from regressions by verifying model logic continuously.
Audits: Audits are SQL queries that run after model execution to validate data post-build. They are defined in .sql files in an audits/ directory or inline in model definitions. Audits typically assert some condition like “no nulls in primary key column” or “sum of payments equals sum of orders” – if an audit query returns any row, it means bad data was found. By default, if any audit fails (returns data), SQLMesh will halt the plan application to prevent possibly invalid data from propagating. (This behavior can be relaxed with non-blocking audits if desired.) During sqlmesh plan, after building models, you’ll see output from any audits. If all audits pass (return zero rows), the plan continues; if not, you get an error and the opportunity to fix the data or model.
Given this is our first run, tests and audits should ideally pass. If they do, the plan will complete and ask for confirmation to apply.
Confirm with “y” to apply the plan. Applying will execute all the backfill queries. This could take a bit longer than the simple project since multiple days of incremental data might be processed. However, since the dataset is small (fictional sushi data), it should finish quickly.
Step 3: Understand the output and new tables
After applying, the moderate pipeline’s tables are materialized in DuckDB. Let’s break down some of the new features we just exercised:
Incremental model behavior: Only the needed data intervals were processed. If the sushi data had, say, data from Jan 1–7, 2025, and today is May 17, 2025, SQLMesh would backfill from Jan 1 through Jan 7 (assuming the model’s start is Jan 1). If tomorrow you add new data for Jan 8 and run another plan, SQLMesh would only process Jan 8 for that model, because it tracks that Jan 1–7 are already done. SQLMesh tracks the “intervals” of data processed for incremental models, so it knows what’s up-to-date. This tracking is stored in snapshots/metadata so even if you restart, SQLMesh remembers what data each model has covered.
Cron scheduling and sqlmesh run: So far, we have manually invoked sqlmesh plan to plan and apply changes. In a production scenario, you might schedule sqlmesh run to execute the DAG on a schedule. The built-in SQLMesh scheduler (triggered via sqlmesh run) uses each model’s cron to decide if it’s time to run that model. It ensures upstream dependencies are up-to-date and then runs any due intervals. In our development case, we manually triggered the initial build. If we want to simulate daily runs, we could call sqlmesh run now or in the future; for now, planning and applying suffices to get data in.
Signals: The moderate project may not explicitly use signals, but let’s explain them since they are a powerful feature. A signal in SQLMesh is a user-defined condition that must be true for a model’s run to execute. The scheduler by default considers two things: (1) the cron schedule and (2) whether upstream models have new data ready. Signals allow adding a third criterion – for example, “only run this model if an external file arrived” or “if some custom logic says it’s a trading day.” Signals are defined as Python functions in a signals/ directory with the @signal decorator. They take a batch of time intervals and return True/False or a subset of intervals to indicate readiness. For instance, the docs show a random_signal that flips a coin and only lets the run proceed if the random number exceeds a threshold. To use a signal, you attach it in the model’s metadata, e.g. signals [random_signal(threshold := 0.5)]. In summary, signals are advanced and not required in our simple runs, but they showcase how SQLMesh can incorporate custom triggers beyond just cron scheduling (especially useful if waiting on external events).
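To make that concrete, here is a minimal signal sketch modeled on the random_signal example from the SQLMesh docs (the Sushi projects may not ship such a file; the module name is illustrative):

# signals/example_signals.py
import random
import typing as t

from sqlmesh import signal, DatetimeRanges


@signal()
def random_signal(batch: DatetimeRanges, threshold: float) -> t.Union[bool, DatetimeRanges]:
    # Let the scheduled intervals run only if a random draw exceeds the threshold.
    return random.random() > threshold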
Linter output: If the project has a linter configured (check config.yaml for a linter: section), you might have noticed warnings or errors before the plan executed. By default, the linter is off (enabled: false) unless specified. If turned on, it enforces rules like “no SELECT *” or “all models must have an owner tag”. For example, the SQLMesh docs show a NoMissingOwner rule that requires each model to declare an owner attribute (to indicate who is responsible for it); if a project violated such a rule, running sqlmesh plan or sqlmesh lint would report an error naming the offending model.
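To turn the linter on, you add a linter section to config.yaml – a minimal sketch (specific rules are selected with additional keys described in the linter documentation):

linter:
  enabled: true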
In a development workflow, you can run sqlmesh lint directly at any time to quickly check for lint issues without planning. Linting helps maintain SQL style and catch mistakes early (for example, a rule can warn if you reference a column that doesn’t exist in a source table).
Now that the moderate pipeline has been built, let’s explore some outputs and verify everything is working:
Step 4: Exploring the Moderate Pipeline Outputs
Open the SQLMesh UI again with sqlmesh ui (make sure you’re in the 2_moderate directory when you run this, so it opens the moderate project).
In the UI:
Navigate to the DAG view to see the graph of models. You’ll notice it’s larger than the simple project’s DAG. The seed model (probably something like sushi.seed_orders) will be at the base. Downstream of it, any incremental staging models (e.g., sushi.daily_orders) feed into further aggregates (like sushi.customer_revenue_lifetime). The DAG might also show the FULL model and VIEW model present in the project.
Click on one of the incremental models in the DAG. The UI will show the SQL definition, which should include the INCREMENTAL_BY_TIME_RANGE config. It also shows the time intervals that have been processed for that model – essentially the partition history. Initially, this should cover all intervals from the model’s start up to the present (because we just backfilled). If you were to add new data and run again, you’d see additional intervals appear.
Check the data itself: use the UI’s data preview feature on a model like customer_revenue_lifetime. It should show each customer’s lifetime revenue as of the latest date. If the pipeline logic is correct, this number should accumulate over time per customer.
To further test incrementality, let’s simulate a new day of data:
Simulating new data arrivals: The example repository includes a Python helper to add data for new dates. In the project folder, find the Python script (it might be named something like sushi_data.py or helpers.py). The repository’s documentation notes that each project includes a helper you can call to insert new rows – for instance, a command to add a new day of orders. You can run it as:
python sushi_data.py add_orders --date 2025-05-18
(The exact command may vary; consult the project’s README or the helper’s usage. The moderate example likely simulates adding orders for the next day.)
After adding new data (e.g., orders for May 18, 2025), run:
sqlmesh run
This will invoke SQLMesh’s built-in scheduler to evaluate any missing intervals for the current environment. Because we added orders for a new date, the incremental model covering daily orders will detect that the interval 2025-05-18 is not processed yet and will execute just that interval. Models downstream (like the full or view models that depend on up-to-date data) will also run as needed (the scheduler knows their cron and that upstream changed). The sqlmesh run command essentially automates what we did manually with planning for new data scenarios, following the cron schedule and dependencies.
After sqlmesh run, check the customer_revenue_lifetime again – each customer’s total should have increased by whatever they spent on the new date’s orders, reflecting that incremental update.
Step 5: Data Quality with Tests and Audits
Finally, let’s discuss how tests and audits in the moderate project help ensure our pipeline’s quality:
Unit Tests: If you open any YAML files in 2_moderate/tests/ (their names start with test_), you’ll see definitions of tests. For example, a test might be named test_customer_revenue_lifetime and contain a mapping of inputs and an expected output. A simplified example could be:
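# tests/test_customer_revenue_lifetime.yaml – a hypothetical illustration; the real
# column names and values in the project will differ.
test_customer_revenue_lifetime:
  model: sushi.customer_revenue_lifetime
  inputs:
    sushi.orders:
      rows:
        - customer_id: 1
          amount: 10.0
          event_date: 2025-01-01
        - customer_id: 1
          amount: 5.0
          event_date: 2025-01-02
  outputs:
    query:
      rows:
        - customer_id: 1
          lifetime_revenue: 15.0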
Audits: Check the audits/ directory or any AUDIT definitions. For instance, there might be an audit like assert_positive_order_ids that checks no order ID is negative:
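-- audits/assert_positive_order_ids.sql – this mirrors the standard audit example from
-- the SQLMesh docs; the exact column name in the project may differ.
AUDIT (
  name assert_positive_order_ids
);

SELECT *
FROM @this_model
WHERE id < 0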
Non-blocking audits: Sometimes you want an audit to warn rather than fail the pipeline. SQLMesh allows audits to be marked as non-blocking, meaning if they find bad data, it will log a warning but still let the plan proceed. In our example, critical audits (like uniqueness of primary keys) are likely blocking (the default). You can adjust this by naming conventions (audits ending in _non_blocking) or config if needed.
Step 6: Macros and Reusable Logic
The moderate project may use macros to avoid duplicating logic across models. SQLMesh supports two kinds of macros: SQLMesh macros (Python-based) and Jinja macros (SQL templating):
SQLMesh (Python) macros: These are Python functions defined in .py files under a macros/ directory in the project. Using the @macro() decorator, you can create a function that returns a string or SQL expression. This returned SQL can be inlined into your model queries. Python macros are powerful because they can include complex logic using Python, but still integrate into SQL builds. For example, you might have a macro to generate a CASE statement for many status codes. The benefit is you write it once in Python and reuse it in multiple models. In a model SQL, you call it like @macro_name(args) and SQLMesh will substitute the result. Note: Python macro functions must reside in the macros/ folder and are automatically picked up by SQLMesh.
Jinja macros: If you are familiar with dbt or Jinja templating, SQLMesh supports Jinja too. Jinja macros are defined in .sql files in the macros/ directory using {% macro %} blocks, and Jinja expressions can be used inside a model query when it is wrapped in SQLMesh’s Jinja block markers. Under the hood, Jinja macros are less powerful than Python macros (they’re just text templating), but they can be handy for simple templating. SQLMesh supports Jinja largely for compatibility, and encourages using its Python macro system for more complex tasks.
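As a sketch, a simple Jinja macro would live in a .sql file in the macros/ directory (the macro name and logic here are illustrative):

{% macro between_where(column_name, low_val, high_val) %}
  {{ column_name }} BETWEEN {{ low_val }} AND {{ high_val }}
{% endmacro %}

A Jinja-enabled model query would then call it with {{ between_where(...) }}.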
Check the 2_moderate/models/ files – do you see any @ macros being called? If yes, there should be a corresponding macro definition. For example, if a model uses @between_where(date, start_val, end_val), the macros/ directory might have:
from sqlmesh import macro

@macro()
def between_where(evaluator, column_name, low_val, high_val):
    # Return a SQL snippet that checks if column_name is between low and high (inclusive)
    return f"{column_name} BETWEEN {low_val} AND {high_val}"
Then in SQL, writing @between_where(order_date, '2025-01-01', '2025-01-31') would expand to order_date BETWEEN '2025-01-01' AND '2025-01-31'. This is a trivial example – more useful macros might compute dynamic date ranges, pivot data, generate repetitive SQL, etc. The key point is that macros let you avoid writing the same SQL logic in multiple places.
If no custom macros are in the moderate project, it likely still uses macro variables like @start_ds / @end_ds (which we saw) or others. Macro variables are predefined by SQLMesh (like execution date, environment name, etc.) and can also be user-defined global variables for templating.
Step 7: External Models (Working with external data)
Lastly, consider external models – these are used when your SQLMesh project needs to reference a table that SQLMesh does not manage. For example, imagine the sushi project needed data from an external marketing database’s table marketing.coupons. You wouldn’t want SQLMesh to treat it as a normal model (with an upstream query), because the data comes from outside. Instead, you declare an external model so SQLMesh knows about its schema but won’t attempt to create or modify it.
In SQLMesh, an external model is basically just a schema definition (a list of columns) for a table the project reads but does not build. External models are declared in a YAML file in the project (external_models.yaml), which you can write by hand or, more commonly, auto-generate with a CLI command.
For instance, if our sushi project had references to external_db.external_table, we could run:
sqlmesh create_external_models
This command scans your project for any external table references and fetches their column info from the warehouse, writing it to an external_models.yaml file. SQLMesh then knows the external table’s schema and can validate models that use it. External models are never executed by SQLMesh – they have no SQL query of their own. They are placeholders to integrate outside data. SQLMesh will not know if the external data changes or is deleted; it assumes those tables exist with the given schema. It’s up to the external system to update that data.
In our moderate sushi example, we might not have any external models (since everything is self-contained with seeds). But it’s important to know about this feature as you scale up. If you see an external_models.yaml in the project, open it – you’ll find YAML entries listing table names and their columns, which is how SQLMesh tracks externals.
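For illustration, entries in external_models.yaml follow this shape (the table and column names below are made up):

- name: external_db.external_table
  columns:
    id: INT
    name: TEXT
    created_at: TIMESTAMP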
Recap and Next Steps
Congratulations! You have:
Set up SQLMesh in a local environment and cloned the official examples.
Run through the Simple Sushi project, seeing how SQLMesh plans and applies a straightforward DAG of view models.
Upgraded to the Moderate Sushi project to explore incremental models and other advanced features like tests, audits, and scheduling.
Used the SQLMesh web UI to visualize the project’s DAG and inspect model outputs.
Learned how incremental processing works in SQLMesh – only new partitions of data are processed, guided by the model’s time column and cron schedule.
Seen how to incorporate data quality checks via tests (pre-execution) and audits (post-execution) to catch issues early and build trust in the pipeline.
Understood the role of the SQLMesh linter in enforcing SQL best practices and preventing common errors (and how to enable it in the config).
Touched on advanced topics: signals for custom run conditions, external models for integrating external sources, and macros to simplify and reuse SQL logic across models.
For a beginner to intermediate data engineer, you’ve now experienced the end-to-end workflow of SQLMesh: from development (plan, test, lint) to execution (apply, run) to maintenance (adding new data, incremental updates, audits). The Sushi example is a small project, but the concepts scale to large projects with dozens or hundreds of models. SQLMesh’s focus on environment isolation, automated dependency tracking, and incremental efficiency means you can confidently make changes and understand their impact before deploying to production.
Where to go from here? You can try the complex version of the Sushi project (if provided in the examples repository) for even more features, or start building your own SQLMesh project. The official SQLMesh documentation (which we referenced throughout) is an excellent resource for deeper dives into each feature. Happy data engineering with SQLMesh!