Trim modes & literals

ERB's whitespace handling and its literal escapes are where most of the byte-for-byte subtlety lives. go-ruby-erb/erb follows MRI's ERB::Compiler scanner exactly, so the emitted source — and therefore the rendered output — matches the reference.

Tag kinds

| Tag | Meaning | Emitted as |
| --- | --- | --- |
| <% code %> | run Ruby, emit nothing | the bare statement |
| <%= expr %> | evaluate and insert | _erbout.<<(( expr ).to_s) |
| <%# comment %> | comment, discarded | nothing |
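
The mapping in the table can be sketched as a small Go function. This is only an illustration: emitTag and its kind argument are hypothetical names, not part of the go-ruby-erb/erb API, and the sketch passes tag content through unchanged.

```go
package main

import "fmt"

// emitTag is a hypothetical helper (not the library's API) that maps one
// scanned tag to the Ruby source listed in the "Emitted as" column.
// kind is "" for <% %>, "=" for <%= %>, and "#" for <%# %>.
func emitTag(kind, content string) string {
	switch kind {
	case "=":
		// Expression tag: append the stringified value to the output buffer.
		return "_erbout.<<((" + content + ").to_s)"
	case "#":
		// Comment tag: contributes nothing to the compiled source.
		return ""
	default:
		// Code tag: the statement is passed through as written.
		return content
	}
}

func main() {
	fmt.Println(emitTag("=", " expr ")) // prints: _erbout.<<(( expr ).to_s)
	fmt.Printf("%q\n", emitTag("", " code "))
	fmt.Printf("%q\n", emitTag("#", " comment "))
}
```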

Literals: <%% and %%>

In text, <%% is a literal <%; inside a tag, %%> is a literal %>. The scanner unescapes both the way MRI does:

| In the template | Renders as |
| --- | --- |
| <%% | <% |
| %%> | %> |

So Rate: <%%= n %> renders the literal text Rate: <%= n %> rather than treating it as a tag.
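
To make the two escapes concrete, here is a minimal Go sketch of a scanner that splits a template into text and tag segments while resolving <%% and %%>. The names segment and scanSegments are hypothetical, and the sketch ignores trim modes and tag kinds; it is not the library's scanner.

```go
package main

import (
	"fmt"
	"strings"
)

// segment is a hypothetical type: one run of literal text or one tag body.
type segment struct {
	Tag  bool   // true for tag content, false for literal text
	Body string // content with <%% and %%> already unescaped
}

// scanSegments is a simplified illustration, not the library's scanner.
// In text, "<%%" becomes a literal "<%"; inside a tag, "%%>" becomes "%>".
func scanSegments(tmpl string) []segment {
	var segs []segment
	var cur strings.Builder
	inTag := false
	flush := func() {
		if cur.Len() > 0 {
			segs = append(segs, segment{Tag: inTag, Body: cur.String()})
			cur.Reset()
		}
	}
	for i := 0; i < len(tmpl); {
		rest := tmpl[i:]
		switch {
		case !inTag && strings.HasPrefix(rest, "<%%"):
			cur.WriteString("<%") // escaped opener stays in the text run
			i += 3
		case !inTag && strings.HasPrefix(rest, "<%"):
			flush() // close the text run, then enter the tag
			inTag = true
			i += 2
		case inTag && strings.HasPrefix(rest, "%%>"):
			cur.WriteString("%>") // escaped closer stays in the tag body
			i += 3
		case inTag && strings.HasPrefix(rest, "%>"):
			flush() // close the tag body, then return to text
			inTag = false
			i += 2
		default:
			cur.WriteByte(tmpl[i])
			i++
		}
	}
	flush()
	return segs
}

func main() {
	for _, s := range scanSegments("Rate: <%%= n %> and <%= m %>") {
		fmt.Printf("tag=%v body=%q\n", s.Tag, s.Body)
	}
}
```

The first segment comes back as plain text containing <%= n %>, while the second is a real tag whose body is "= m ".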

Trim modes

The TrimMode option is MRI's trim_mode string. Each character turns on one rule; they combine (e.g. "<>-").

| Mode | Rule |
| --- | --- |
| - | Explicit trim. -%> strips the immediately-following newline; <%- strips leading whitespace before the tag. The trim is opt-in per tag. |
| > | Strips the newline after a tag that ends its line. |
| <> | Strips the newline only when the tag both starts and ends the line. |
| % | A line whose first non-blank char is % is a code line (% code is equivalent to <% code %>); %% at line start is an escaped literal %. |
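
The difference between ">" and "<>" is easiest to see on single lines. The following Go sketch applies either rule to one template line; trimTagNewline is a hypothetical helper, and it assumes each tag opens and closes on the same line. It illustrates the rules in the table rather than reproducing the library's scanner.

```go
package main

import (
	"fmt"
	"strings"
)

// trimTagNewline is a hypothetical, line-at-a-time illustration of the ">"
// and "<>" rules. line includes its trailing "\n", if it has one.
func trimTagNewline(line, mode string) string {
	// A line "ends with a tag" when %> is directly followed by the newline.
	// The escaped closer %%> is literal text, so it does not count.
	endsWithTag := strings.HasSuffix(line, "%>\n") && !strings.HasSuffix(line, "%%>\n")
	// A line "starts with a tag" when <% is its first two bytes.
	// The escaped opener <%% is literal text, so it does not count.
	startsWithTag := strings.HasPrefix(line, "<%") && !strings.HasPrefix(line, "<%%")

	switch mode {
	case ">":
		if endsWithTag {
			return strings.TrimSuffix(line, "\n")
		}
	case "<>":
		if startsWithTag && endsWithTag {
			return strings.TrimSuffix(line, "\n")
		}
	}
	return line
}

func main() {
	for _, mode := range []string{">", "<>"} {
		fmt.Printf("%s: %q\n", mode, trimTagNewline("<% if x %>\n", mode))
		fmt.Printf("%s: %q\n", mode, trimTagNewline("  <li><%= i %>\n", mode))
	}
}
```

With ">" both lines lose their newline; with "<>" only the first does, because the second line starts with text rather than a tag.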

Combinations

Modes compose. "-" is the most common: it leaves normal newlines alone and only trims where you write -%>, which is what most templates want. "%>" or "%<>" combine the percent-line syntax with an automatic end-of-line trim.

Example

src, _, _ := erb.Compile(
    "<ul>\n<% items.each do |i| -%>\n  <li><%= i %></li>\n<% end -%>\n</ul>\n",
    erb.Options{TrimMode: "-"})

The -%> on the each and end lines strips the newline that would otherwise follow each control tag, so the rendered list has no blank lines between items — byte-for-byte what MRI produces for the same template and trim mode.
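
One way to picture the "-" rule is as a rewrite of the template into an equivalent one with plain delimiters. The Go sketch below does that with string replacement. applyExplicitTrim is a hypothetical helper, not how the library implements trimming. It ignores tag context (it would also rewrite a -%> that appears in literal text), and it assumes, following MRI's behavior as I understand it, that a mid-line <%- or a -%> without a following newline acts like a plain delimiter.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// leadingOpen matches spaces or tabs at the start of a line followed by <%-.
var leadingOpen = regexp.MustCompile(`(?m)^[ \t]*<%-`)

// applyExplicitTrim is a hypothetical, context-free illustration of the "-"
// trim mode: it returns a template with the trims already applied and the
// trim markers replaced by plain <% and %>.
func applyExplicitTrim(tmpl string) string {
	// <%- at the start of a line swallows the indentation before it.
	s := leadingOpen.ReplaceAllString(tmpl, "<%")
	// Assumption: a mid-line <%- behaves like a plain <%.
	s = strings.ReplaceAll(s, "<%-", "<%")
	// -%> swallows the newline that immediately follows it.
	s = strings.ReplaceAll(s, "-%>\n", "%>")
	// Assumption: a -%> with no newline after it behaves like a plain %>.
	s = strings.ReplaceAll(s, "-%>", "%>")
	return s
}

func main() {
	tmpl := "<ul>\n<% items.each do |i| -%>\n  <li><%= i %></li>\n<% end -%>\n</ul>\n"
	fmt.Printf("%q\n", applyExplicitTrim(tmpl))
}
```

For the template in the example, the newlines after the each and end tags disappear while the newline after each <li> line is kept.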

Binary-exact text encoding

Each literal run between tags is emitted via Ruby's String#dump on the binary string, not via naive quoting. That is what makes embedded quotes, newlines, control bytes, and multi-byte UTF-8 round-trip exactly:

| Literal run | Emitted |
| --- | --- |
| héllo | "h\xC3\xA9llo" |
| a line with " and a tab | the String#dump escaping MRI uses |

Because the encoding is byte-level and matches String#dump, the compiled source is identical to MRI's down to the escape sequences — which is exactly what the differential oracle checks.
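
As a rough illustration of byte-level dumping, the Go sketch below escapes a string the way Ruby's String#dump treats a binary string, to the best of my understanding: quotes and backslashes are backslash-escaped, common control bytes get their named escapes, # is escaped before {, $ or @, and every other non-printable or non-ASCII byte becomes \xHH. dumpBytes is a hypothetical name and this is not the library's implementation.

```go
package main

import (
	"fmt"
	"strings"
)

// namedEscapes lists the control bytes that get a short escape.
var namedEscapes = map[byte]string{
	'\n': `\n`, '\r': `\r`, '\t': `\t`, '\f': `\f`,
	'\v': `\v`, '\b': `\b`, '\a': `\a`, 0x1b: `\e`,
}

// dumpBytes is a hypothetical sketch of String#dump applied to a binary
// string: it works byte by byte and never decodes UTF-8.
func dumpBytes(s string) string {
	var b strings.Builder
	b.WriteByte('"')
	for i := 0; i < len(s); i++ {
		c := s[i]
		esc, named := namedEscapes[c]
		switch {
		case c == '"' || c == '\\':
			b.WriteByte('\\')
			b.WriteByte(c)
		case c == '#' && i+1 < len(s) && strings.IndexByte("{$@", s[i+1]) >= 0:
			// Prevent the output from being read as interpolation.
			b.WriteString(`\#`)
		case named:
			b.WriteString(esc)
		case c >= 0x20 && c <= 0x7e:
			b.WriteByte(c) // printable ASCII is copied as-is
		default:
			fmt.Fprintf(&b, `\x%02X`, c) // everything else as uppercase hex
		}
	}
	b.WriteByte('"')
	return b.String()
}

func main() {
	fmt.Println(dumpBytes("héllo")) // prints: "h\xC3\xA9llo"
	fmt.Println(dumpBytes("say \"hi\"\tnow\n"))
}
```

Because the function never decodes UTF-8, the two bytes of é come out as \xC3\xA9, matching the first row of the table.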

Magic comments

A leading <%# coding: … %> or <%# frozen_string_literal: … %> — including the emacs -*- coding: … -*- form — is detected and reflected in the emitted prefix:

| In the template | Emitted prefix |
| --- | --- |
| (none) | #coding:UTF-8 |
| <%# coding: us-ascii %> | #coding:us-ascii |
| <%# frozen_string_literal: true %> | #frozen-string-literal:true |

The magicComment return value of Compile reports this line on its own, and src already carries it as its first line.