Building Your First Domain-Specific Language: A Practical Guide in Python and Scala
How a small, focused language can eliminate boilerplate, reduce bugs, and make your team faster — with working examples you can build this afternoon.
This is a hands-on companion to the previous post about Hulu’s BeaconSpec DSL. There, we explored why Hulu built a domain-specific language for their data pipeline. Here, we’ll build something similar ourselves — in both Python and Scala — and explore what each language brings to DSL design.
The Case for a Tiny Language
Imagine you’re on a data engineering team. Every week, someone needs to define a new metric: “count playback starts by video ID,” “sum ad impressions by partner,” “average session length by device type.” Each metric follows the same pattern, but every time, someone writes 80 lines of Java or Python boilerplate — data loading, grouping, aggregation, output formatting — just to express what could be a three-line specification.
This is exactly the situation the Hulu data team faced in the early 2010s. They were running 150–175 MapReduce jobs per hour, and each new metric required hand-written Java. Their answer was BeaconSpec, a domain-specific language that let engineers write this:
basefact playback_start from playback/start {
    dimension video.id as video_id;
    dimension contentPartner.id as content_partner_id;
    fact sum(count.count) as total_count;
}
Instead of hundreds of lines of MapReduce code.
The result? Fewer bugs, faster onboarding, better monitoring, and a pipeline that was dramatically easier to reason about. The DSL didn’t replace Java — it replaced the repetitive parts of Java.
This tutorial will show you how to build something similar — a minimal DSL for data metric definitions — in both Python and Scala. Along the way, we’ll explore why the two languages lead to fundamentally different approaches, and when each shines.
What Exactly Is a DSL?
A domain-specific language is a programming language built for one job. You already use several:
| DSL | Domain | What It Replaces |
|---|---|---|
| SQL | Relational data queries | Imperative loops over files |
| CSS | Visual styling | Programmatic pixel manipulation |
| Regular expressions | Text pattern matching | Nested if/else string parsing |
| Terraform HCL | Infrastructure provisioning | Manual cloud console clicks |
| Makefile syntax | Build automation | Shell script spaghetti |
DSLs come in two flavors:
- External DSLs have their own syntax, parser, and (often) compiler. SQL and regex are external DSLs. BeaconSpec was too — it used JFlex and CUP to parse `.spec` files and generate Java code.
- Internal DSLs (also called embedded DSLs) piggyback on an existing language’s syntax. They look like a new language but are actually valid code in the host language. Think of Ruby’s RSpec, Kotlin’s Gradle scripts, or Scala’s sbt build definitions.
Internal DSLs are where we’ll start, because they’re dramatically simpler to build. No parser. No lexer. No compiler. Just clever API design.
Our Target: A Metric Definition DSL
We want users to be able to define data metrics declaratively. Here’s what we’d like the end result to feel like:
metric "playback_starts" from "playback/start" {
dimension "video_id" from "video.id"
dimension "partner_id" from "contentPartner.id"
aggregate sum of "count" as "total_plays"
}
We can’t get exactly this syntax in an internal DSL (it’s not valid Python or Scala), but we can get remarkably close. Let’s see how.
Part 1: The Python Approach — Builder Pattern with Context Managers
Python doesn’t have Scala’s syntactic flexibility, but it has two powerful tools for building DSLs: context managers (with blocks) and method chaining. Let’s use both.
Step 1: Define the Data Model
Start with simple data classes that represent what a metric specification is:
from dataclasses import dataclass, field


@dataclass
class Dimension:
    name: str
    source_field: str


@dataclass
class Aggregation:
    function: str  # "sum", "count", "avg", etc.
    source_field: str
    output_name: str


@dataclass
class MetricSpec:
    name: str
    source: str
    dimensions: list[Dimension] = field(default_factory=list)
    aggregations: list[Aggregation] = field(default_factory=list)
Nothing fancy here — just plain data containers. This is the semantic model: the structured representation of what the user wants to express.
Step 2: Build the DSL Layer
Now, the fun part. We create a builder that feels like a mini-language:
class MetricBuilder:
    """A small DSL for defining data metrics."""

    def __init__(self, name: str):
        self._spec = MetricSpec(name=name, source="")

    def from_source(self, source: str) -> "MetricBuilder":
        self._spec.source = source
        return self

    def dimension(self, name: str, source_field: str) -> "MetricBuilder":
        self._spec.dimensions.append(Dimension(name=name, source_field=source_field))
        return self

    def aggregate(self, func: str, source_field: str, as_name: str) -> "MetricBuilder":
        self._spec.aggregations.append(
            Aggregation(function=func, source_field=source_field, output_name=as_name)
        )
        return self

    def build(self) -> MetricSpec:
        if not self._spec.source:
            raise ValueError(f"Metric '{self._spec.name}' has no source defined")
        if not self._spec.aggregations:
            raise ValueError(f"Metric '{self._spec.name}' has no aggregations")
        return self._spec


def metric(name: str) -> MetricBuilder:
    """Entry point for the DSL."""
    return MetricBuilder(name)
Step 3: Use It
spec = (
    metric("playback_starts")
    .from_source("playback/start")
    .dimension("video_id", source_field="video.id")
    .dimension("partner_id", source_field="contentPartner.id")
    .aggregate("sum", source_field="count", as_name="total_plays")
    .build()
)
print(spec)
# MetricSpec(name='playback_starts', source='playback/start',
# dimensions=[Dimension(name='video_id', source_field='video.id'), ...],
# aggregations=[Aggregation(function='sum', source_field='count', output_name='total_plays')])
That reads pretty well! But we can go further with a context-manager variant that collects definitions:
class MetricRegistry:
    """Collects metric definitions in a context block."""

    def __init__(self):
        self.specs: list[MetricSpec] = []

    def __enter__(self):
        return self

    def __exit__(self, *args):
        pass

    def define(self, name: str) -> MetricBuilder:
        builder = MetricBuilder(name)
        # We'll capture on build — override build to auto-register
        original_build = builder.build

        def registering_build():
            spec = original_build()
            self.specs.append(spec)
            return spec

        builder.build = registering_build
        return builder


# Usage
with MetricRegistry() as registry:
    (registry.define("playback_starts")
        .from_source("playback/start")
        .dimension("video_id", source_field="video.id")
        .aggregate("sum", source_field="count", as_name="total_plays")
        .build())

    (registry.define("ad_impressions")
        .from_source("ads/impression")
        .dimension("campaign_id", source_field="campaign.id")
        .aggregate("count", source_field="*", as_name="impression_count")
        .build())

print(f"Registered {len(registry.specs)} metrics")
What Makes This “DSL-ish”
Even though it’s plain Python, notice how the code reads:
metric("playback_starts").from_source("playback/start")— reads almost like English.- Method chaining creates a fluent interface where each call returns
self. - The
build()method acts as a terminator that validates and produces the final object. - The registry context manager gives us a block-scoped collection mechanism.
The key principle: a good internal DSL hides the machinery and surfaces the domain concepts.
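Here is the payoff of keeping a clean semantic model: once a MetricSpec exists, executing it is a separate, small concern. The sketch below (our own illustration, not part of any real pipeline) interprets a spec against plain in-memory rows; the run_metric helper and the sample events are invented for the example:

from collections import defaultdict

def run_metric(spec: MetricSpec, rows: list[dict]) -> dict:
    """Group rows by the spec's dimensions, then apply its aggregations."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for row in rows:
        key = tuple(row.get(d.source_field) for d in spec.dimensions)
        groups[key].append(row)

    results = {}
    for key, group in groups.items():
        out = {}
        for agg in spec.aggregations:
            values = [r.get(agg.source_field, 0) for r in group]
            if agg.function == "sum":
                out[agg.output_name] = sum(values)
            elif agg.function == "count":
                out[agg.output_name] = len(group)
            elif agg.function == "avg":
                out[agg.output_name] = sum(values) / len(values)
        results[key] = out
    return results

# Hypothetical playback events, keyed by the raw source fields
events = [
    {"video.id": "v1", "contentPartner.id": "p1", "count": 1},
    {"video.id": "v1", "contentPartner.id": "p1", "count": 1},
    {"video.id": "v2", "contentPartner.id": "p2", "count": 1},
]
print(run_metric(spec, events))
# {('v1', 'p1'): {'total_plays': 2}, ('v2', 'p2'): {'total_plays': 1}}

Swap this interpreter for a pandas groupby or a SQL generator and the DSL layer itself doesn't change at all.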
Part 2: The Scala Approach — Where DSLs Feel Native
Scala was practically designed for internal DSLs. Several language features combine to make DSL construction feel effortless:
- Infix notation — `a method b` instead of `a.method(b)`
- Curly braces for block arguments — `metric("x") { ... }` is just a function call
- Implicit conversions (Scala 2) / extension methods and `given` (Scala 3) — add methods to existing types
- Operator overloading — define `|`, `~>`, or any symbol as a method
- By-name parameters — delay evaluation for block-style APIs
Let’s build the same metric DSL:
Step 1: The Data Model
case class Dimension(name: String, sourceField: String)

case class Aggregation(function: String, sourceField: String, outputName: String)

case class MetricSpec(
  name: String,
  source: String,
  dimensions: List[Dimension] = List.empty,
  aggregations: List[Aggregation] = List.empty
)
Scala’s case classes give us immutable data containers with structural equality — a natural fit for specifications.
Step 2: The DSL Builder
class MetricBuilder(name: String):
  private var _source: String = ""
  private val _dimensions = collection.mutable.ListBuffer[Dimension]()
  private val _aggregations = collection.mutable.ListBuffer[Aggregation]()

  def from(source: String): MetricBuilder =
    _source = source
    this

  def dimension(name: String, from: String): MetricBuilder =
    _dimensions += Dimension(name, from)
    this

  def sum(field: String, as: String): MetricBuilder =
    _aggregations += Aggregation("sum", field, as)
    this

  def count(field: String, as: String): MetricBuilder =
    _aggregations += Aggregation("count", field, as)
    this

  def avg(field: String, as: String): MetricBuilder =
    _aggregations += Aggregation("avg", field, as)
    this

  def build(): MetricSpec =
    require(_source.nonEmpty, s"Metric '$name' needs a source")
    require(_aggregations.nonEmpty, s"Metric '$name' needs at least one aggregation")
    MetricSpec(name, _source, _dimensions.toList, _aggregations.toList)

def metric(name: String): MetricBuilder = MetricBuilder(name)

(Note that `as` is only a soft keyword in Scala 3, so it works fine as an ordinary parameter name here — which is what gives us `sum("count", as = "total_plays")` at the call site.)
Step 3: Use It
val spec = metric("playback_starts")
  .from("playback/start")
  .dimension("video_id", from = "video.id")
  .dimension("partner_id", from = "contentPartner.id")
  .sum("count", as = "total_plays")
  .build()
println(spec)
// MetricSpec(playback_starts, playback/start,
// List(Dimension(video_id,video.id), Dimension(partner_id,contentPartner.id)),
// List(Aggregation(sum,count,total_plays)))
Already very clean. But Scala lets us push further.
Step 4: The Block-Style DSL
Using by-name parameters and a mutable context, we can create a block syntax that looks almost like a dedicated language:
import scala.collection.mutable.ListBuffer

class MetricContext(val name: String, val source: String):
  val dimensions: ListBuffer[Dimension] = ListBuffer.empty
  val aggregations: ListBuffer[Aggregation] = ListBuffer.empty

  def dimension(name: String, from: String): Unit =
    dimensions += Dimension(name, from)

  def sum(field: String, as: String): Unit =
    aggregations += Aggregation("sum", field, as)

  def count(field: String, as: String): Unit =
    aggregations += Aggregation("count", field, as)

  def toSpec: MetricSpec =
    MetricSpec(name, source, dimensions.toList, aggregations.toList)

def metric(name: String, source: String)(body: MetricContext ?=> Unit): MetricSpec =
  val ctx = MetricContext(name, source)
  body(using ctx)
  ctx.toSpec

// Helpers to access the context implicitly
def dimension(name: String, from: String)(using ctx: MetricContext): Unit =
  ctx.dimension(name, from)

def sum(field: String, as: String)(using ctx: MetricContext): Unit =
  ctx.sum(field, as)

def count(field: String, as: String)(using ctx: MetricContext): Unit =
  ctx.count(field, as)
Now look at the usage:
val playbackMetric = metric("playback_starts", "playback/start"):
  dimension("video_id", from = "video.id")
  dimension("partner_id", from = "contentPartner.id")
  sum("count", as = "total_plays")

val adMetric = metric("ad_impressions", "ads/impression"):
  dimension("campaign_id", from = "campaign.id")
  count("*", as = "impression_count")
Compare that to the original BeaconSpec syntax at the top of this post. It’s remarkably close — and it’s real, compilable Scala. No parser needed. No code generation. The Scala compiler itself validates the structure.
Part 3: Python vs. Scala — An Honest Comparison
Having built the same DSL in both languages, here’s what stands out:
Syntax Expressiveness
| Feature | Python | Scala |
|---|---|---|
| Method chaining | Works well | Works well |
| Named parameters as keywords | source_field="video.id" | from = "video.id" (reads like English) |
| Block-scoped definitions | with blocks (limited) | Curly braces / significant indentation (natural) |
| Eliminating dots and parens | Not possible | Infix notation: a method b |
| Implicit context passing | Not built-in (use thread-locals or globals) | Context parameters (using/given) |
| Operator overloading | Supported but discouraged culturally | Idiomatic and widely used |
Winner: Scala. It’s not close. Scala’s syntax was designed to bend. Python’s was designed to be uniform.
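That said, Python can approximate implicit context passing with the standard library's contextvars module; it works, but the machinery stays visible. A rough sketch (the defining helper is our own invention, building on the MetricBuilder from Part 1):

from contextvars import ContextVar
from contextlib import contextmanager

# Holds the builder currently being defined (None outside a defining block)
_current_metric: ContextVar["MetricBuilder | None"] = ContextVar("current_metric", default=None)

@contextmanager
def defining(name: str):
    """Make a builder implicitly available to bare dimension() calls."""
    builder = MetricBuilder(name)
    token = _current_metric.set(builder)
    try:
        yield builder
    finally:
        _current_metric.reset(token)

def dimension(name: str, source_field: str) -> None:
    """Forward to whichever builder the enclosing block established."""
    _current_metric.get().dimension(name, source_field)

# The context variable plays the role of Scala's `using` clause:
with defining("playback_starts") as b:
    b.from_source("playback/start")
    dimension("video_id", source_field="video.id")

It does the job, but compare it with Scala's `(using ctx: MetricContext)` parameters, where the compiler threads the context for you and rejects calls made outside a metric block.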
Ease of Implementation
| Aspect | Python | Scala |
|---|---|---|
| Lines of code for basic DSL | ~40 | ~40 |
| Learning curve for DSL author | Low | Medium (need to understand implicits/givens) |
| Learning curve for DSL user | Very low | Low-Medium |
| Debugging DSL code | Straightforward stack traces | Can be confusing with implicits |
Winner: Python. The builder pattern is a well-understood idiom. Anyone reading the Python version immediately knows what’s happening. The Scala version with context parameters requires more Scala-specific knowledge.
Validation and Safety
| Aspect | Python | Scala |
|---|---|---|
| Compile-time type checking | None (runtime only) | Full type safety |
| Catching missing fields | Runtime ValueError from build() | Possible at compile time (e.g., phantom types) |
| IDE auto-completion | Good | Excellent (types guide suggestions) |
| Refactoring safety | Low | High |
Winner: Scala. In a large codebase where dozens of engineers write metric definitions, compile-time validation catches errors before they reach production. Python’s build() validation only fires at runtime.
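If you want some of that safety in Python, you can approximate it by splitting the builder into one class per state, so a static checker such as mypy rejects incomplete definitions before they run. A sketch of the idea, with invented class names (NeedsSource, NeedsAggregation, Ready) rather than the MetricBuilder from Part 1:

from dataclasses import dataclass

@dataclass
class NeedsSource:
    name: str

    def from_source(self, source: str) -> "NeedsAggregation":
        return NeedsAggregation(self.name, source)

@dataclass
class NeedsAggregation:
    name: str
    source: str

    def aggregate(self, func: str, source_field: str, as_name: str) -> "Ready":
        spec = MetricSpec(name=self.name, source=self.source)
        spec.aggregations.append(Aggregation(func, source_field, as_name))
        return Ready(spec)

@dataclass
class Ready:
    spec: MetricSpec

    def dimension(self, name: str, source_field: str) -> "Ready":
        self.spec.dimensions.append(Dimension(name, source_field))
        return self

    def build(self) -> MetricSpec:
        return self.spec

# NeedsSource("playback_starts").build() is now a type error:
# build() only exists once a source and an aggregation have been supplied.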
When to Choose Each
Choose Python when:
- Your team already works in Python
- The DSL users are data scientists or analysts (familiar with Python, less so with Scala)
- You want rapid iteration and don’t need compile-time guarantees
- The DSL is “glue” between other Python tools (pandas, Airflow, dbt)
Choose Scala when:
- You’re in a JVM ecosystem (Spark, Kafka, Flink)
- Type safety matters — you want the compiler to reject bad metric definitions
- You want the DSL to look truly native, not like a builder pattern
- You’re already using Scala for the execution layer (as Hulu was with MapReduce)
Part 4: Going Beyond — From Internal to External DSL
Everything above has been internal DSLs — valid Python or Scala code. But what if you want your own syntax entirely? That’s when you build an external DSL, like Hulu’s BeaconSpec.
Here’s a minimal external DSL parser in Python using just the standard library. We’ll parse a simplified version of the BeaconSpec syntax:
import re
from dataclasses import dataclass, field


@dataclass
class ParsedMetric:
    name: str
    source: str
    dimensions: list[tuple[str, str]] = field(default_factory=list)
    aggregations: list[tuple[str, str, str]] = field(default_factory=list)


def parse_metric_spec(text: str) -> list[ParsedMetric]:
    """A minimal parser for a BeaconSpec-like language."""
    metrics = []
    # Match each metric block
    block_pattern = r'metric\s+(\w+)\s+from\s+"([^"]+)"\s*\{([^}]*)\}'
    for match in re.finditer(block_pattern, text, re.DOTALL):
        name, source, body = match.groups()
        m = ParsedMetric(name=name, source=source)
        for line in body.strip().split("\n"):
            line = line.strip().rstrip(";")
            if not line:
                continue
            # Parse dimension lines
            dim_match = re.match(r'dimension\s+"(\w+)"\s+from\s+"([^"]+)"', line)
            if dim_match:
                m.dimensions.append((dim_match.group(1), dim_match.group(2)))
                continue
            # Parse aggregation lines
            agg_match = re.match(r'(sum|count|avg)\s+"([^"]+)"\s+as\s+"(\w+)"', line)
            if agg_match:
                m.aggregations.append(
                    (agg_match.group(1), agg_match.group(2), agg_match.group(3))
                )
        metrics.append(m)
    return metrics
# Try it out
spec_text = '''
metric playback_starts from "playback/start" {
    dimension "video_id" from "video.id";
    dimension "partner_id" from "contentPartner.id";
    sum "count" as "total_plays";
}

metric ad_impressions from "ads/impression" {
    dimension "campaign_id" from "campaign.id";
    count "*" as "impression_count";
}
'''

for m in parse_metric_spec(spec_text):
    print(f"{m.name}: {len(m.dimensions)} dimensions, {len(m.aggregations)} aggregations")
This is deliberately minimal — a regex-based parser for a simple grammar. For anything more complex, you’d reach for proper parsing tools:
| Tool | Language | Approach |
|---|---|---|
| Lark | Python | EBNF grammar, generates parse tree |
| PLY | Python | Lex/Yacc style, like JFlex/CUP |
| ANTLR | JVM + others | Grammar-based, generates lexer + parser |
| Scala Parser Combinators | Scala | Composable parsers in pure Scala |
| FastParse | Scala | High-performance parser combinators |
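To give a flavor of the first row, here is a sketch of the same mini-language as a Lark grammar. This assumes `pip install lark`; the grammar itself is ours, written for this post, and it reuses the spec_text from the regex example:

from lark import Lark

GRAMMAR = r"""
    spec: metric+
    metric: "metric" CNAME "from" ESCAPED_STRING "{" (dimension | aggregation)* "}"
    dimension: "dimension" ESCAPED_STRING "from" ESCAPED_STRING ";"
    aggregation: AGG_FUNC ESCAPED_STRING "as" ESCAPED_STRING ";"
    AGG_FUNC: "sum" | "count" | "avg"

    %import common.CNAME
    %import common.ESCAPED_STRING
    %import common.WS
    %ignore WS
"""

parser = Lark(GRAMMAR, start="spec")
tree = parser.parse(spec_text)  # spec_text from the regex example above
print(tree.pretty())            # a labeled parse tree

Unlike the regex version, a grammar like this rejects malformed input with a line and column number instead of silently skipping it.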
The Hulu team used JFlex (lexer generator) and CUP (parser generator) for BeaconSpec — essentially the Java equivalent of Lex and Yacc. This gave them a full compiler pipeline: .spec file in, generated Java MapReduce code out.
Part 5: A Quick Scala Parser Combinator Example
Scala’s parser combinators deserve a mention because they blur the line between internal and external DSLs. You write a parser that looks like a grammar definition, but it’s valid Scala:
// Using the scala-parser-combinators library
import scala.util.parsing.combinator._

case class DimensionDecl(name: String, source: String)
case class AggregationDecl(func: String, field: String, alias: String)
case class MetricDecl(name: String, source: String,
                      dims: List[DimensionDecl], aggs: List[AggregationDecl])

object MetricParser extends RegexParsers:
  def identifier: Parser[String] = """[a-zA-Z_]\w*""".r
  def quoted: Parser[String] = "\"" ~> """[^"]+""".r <~ "\""

  def dimension: Parser[DimensionDecl] =
    "dimension" ~> quoted ~ ("from" ~> quoted) <~ ";" ^^ {
      case name ~ source => DimensionDecl(name, source)
    }

  def aggregation: Parser[AggregationDecl] =
    ("sum" | "count" | "avg") ~ quoted ~ ("as" ~> quoted) <~ ";" ^^ {
      case func ~ field ~ alias => AggregationDecl(func, field, alias)
    }

  def metricBlock: Parser[MetricDecl] =
    "metric" ~> identifier ~ ("from" ~> quoted) ~ ("{" ~> rep(dimension | aggregation) <~ "}") ^^ {
      case name ~ source ~ decls =>
        val dims = decls.collect { case d: DimensionDecl => d }
        val aggs = decls.collect { case a: AggregationDecl => a }
        MetricDecl(name, source, dims, aggs)
    }

  def spec: Parser[List[MetricDecl]] = rep(metricBlock)

  def parseSpec(input: String): List[MetricDecl] =
    parseAll(spec, input) match
      case Success(result, _) => result
      case failure: NoSuccess => throw RuntimeException(s"Parse error: ${failure.msg}")
This parses the exact same external syntax as the Python regex parser, but with proper grammar rules, error reporting, and composability. The ~>, <~, ~, ^^, and rep combinators are all just Scala methods — this is itself an internal DSL for writing parsers!
Principles for Your Own DSL
Whether you go internal or external, Python or Scala, keep these principles in mind:
1. Start with the Usage, Not the Implementation
Write out how you want the DSL to look before writing any implementation code. Show it to your teammates. If they can read it without explanation, you’re on the right track.
2. Model the Domain, Not the Technology
Your DSL’s vocabulary should come from the problem domain (“metric,” “dimension,” “aggregate”) not from the implementation (“mapper,” “reducer,” “partition key”). This is what made BeaconSpec powerful — engineers thought in terms of beacons and facts, not in terms of MapReduce shuffles.
3. Keep It Minimal
The best DSLs are small. If your DSL needs conditionals, loops, and variable declarations, you’re probably building a general-purpose language by accident. Stop and reconsider.
4. Validate Early and Clearly
Whether at compile time (Scala) or at build() time (Python), your DSL should produce clear error messages in domain terms:
Error: Metric 'playback_starts' has no aggregations defined.
Did you forget to add a sum() or count()?
Not:
IndexError: list index out of range
5. Plan for What Comes After Parsing
A DSL is only as useful as what happens after parsing. BeaconSpec generated Java code. Your DSL might generate SQL, Spark jobs, Airflow DAGs, API configurations, or monitoring dashboards. Design the semantic model (those data classes) with the output stage in mind.
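As a concrete illustration, here is a sketch of a generator that walks the Python MetricSpec from Part 1 and emits SQL. The beacon-path-to-table-name convention is invented for the example:

def to_sql(spec: MetricSpec) -> str:
    """Render a MetricSpec as a GROUP BY query."""
    dims = [f"{d.source_field} AS {d.name}" for d in spec.dimensions]
    aggs = [f"{a.function.upper()}({a.source_field}) AS {a.output_name}"
            for a in spec.aggregations]
    table = spec.source.replace("/", "_")  # invented convention: beacon path -> table name
    query = [f"SELECT {', '.join(dims + aggs)}", f"FROM {table}"]
    if dims:
        query.append(f"GROUP BY {', '.join(d.name for d in spec.dimensions)}")
    return "\n".join(query)

print(to_sql(spec))
# SELECT video.id AS video_id, contentPartner.id AS partner_id, SUM(count) AS total_plays
# FROM playback_start
# GROUP BY video_id, partner_id

The same walk over the semantic model could just as easily emit a Spark job or an Airflow task definition.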
Wrapping Up
A DSL doesn’t have to be a huge investment. An internal DSL in Python or Scala can be built in an afternoon and can immediately start paying dividends by making repetitive specifications more readable, more maintainable, and less error-prone.
The key insight from Hulu’s BeaconSpec — and from the broader DSL tradition — is that the right abstraction at the right layer can transform an entire system. By letting engineers declare what they wanted measured rather than writing how to measure it, the team didn’t just write less code. They built a foundation for automatic validation, consistent monitoring, and faster debugging across hundreds of concurrent jobs.
Start small. Pick one repetitive pattern in your codebase. Write out what you wish the specification looked like. Then build the thinnest possible layer — a builder in Python, a block DSL in Scala — that makes that wish a reality.
This post is part of a series based on Monitoring the Data Pipeline at Hulu, presented at Hadoop Summit 2014. See also: BeaconSpec on Medium · Slides on SlideShare