Thought Leadership

The ultimate LookML style guide

DAS42
October 15, 2019
Featured image for “The ultimate LookML style guide”

Creating a thoughtful and straight-forward LookML Style Guide for your organization gives structure to your Looker project and allows users and developers to navigate the LookML with ease long after its inception.

As DAS42’s Joey Bryan points out in a previous blog post, Looker is great for establishing data democracy across all levels of your organization. But how do you ensure that a sustainable Looker Project will transition with ease through new users and developers, allows for changes in data sources, and is adaptable and easy to understand?

As with any construction project, success starts with the foundation. In Looker, continuous adherence to a LookML Style Guide is that foundation. Creating a thoughtful, straight-forward LookML Style Guide gives structure to the project. It allows users and developers to navigate the LookML with ease and understand the LookML long after its inception.

At DAS42, our LookML Style Guide has been tried and perfected over many successful Looker Projects. We’ve refined our LookML Style Guide into best practices, and we are proud to utilize them in all Looker Projects we develop.

Projects & organization

Stick to creating one Project. Only create more than one if your business requires complete code isolation for different developer teams. Each Looker Project is a completely separate set of code and resides in a separate git repository. LookML files from one Project can’t be referenced in another Project.

Models

It is preferred to have as few Models as possible, as most Projects only need one Model. Use multiple Models to delineate data access boundaries. For very large Projects with many Explores, create multiple Models and use model sets to selectively grant access to specific Models and limit what business users see in the Explores menu.

For example, if financial data needs to be isolated from all teams except for the Finance team, you can create a finance specific Model and use the “model sets permissions” feature to only give the finance team access to that Model.

View file names

Name View files using a scoped namespace. The form should be type.view_name where type is one of base, derived, schema_name.

  • View files should be named in lower case with an underscore separating words, as in SQL. This naming convention mirrors that of database layout and groups similar View files together in the View file list.
  • Names should be standardized to remove any extraneous characters, which sometimes appear when auto-generating a table name. Base Views denote View files that should only be extended and do not reference either a database table or derived table.

For example, for a database table in the public schema called public.data, the View file should be named public.data.

Include statements

Include only the necessary View files in a Model through include statements.

  • By using the scoped naming convention above, you can write include statements based on database schema as: include: “schema_name.*.view”
  • Remove any default references of include: “*.view.”
  • If you add tables from a new schema or with a new prefix, Looker will not automatically include those files until you add a specific include declaration in the Model file.
  • If a View extends another View file, you will need to explicitly include the base View in the View that is extending it.

Extends

Common dimensions repeated throughout different tables should be factored out and placed into a base View file that can then be extended in other Views. Factoring these common dimensions into a base View allows the Looker Developer to create derived dimensions of these common dimensions and have the derived dimensions defined in one place.

Some common use cases of base Views:

  • Product dimensions which appear in a denormalized orders table as well as a denormalized products fact table Derived Dimensions: Product category rollup.
  • Geographic or demographic dimensions that apply to multiple tables and Views (e.g. countries, regions)
  • Derived Dimensions: Country to region mapping.
  • Common metadata parameters generated by an ETL tool or SaaS application (e.g. Segment context data, Fivetran metadata columns like _fivetran_synced)
  • Derived Dimensions: Mapping multiple mobile device types to a mobile platform rollup.
  • Homepage and documentation
  • Projects should have a Homepage intended for business users and/or a Documentation page intended for LookML developers.
  • The Homepage should list references to training material (e.g. decks, videos), admin contacts, help resources internal to the client (e.g. slack channel, wiki, Google Group), and links to Learning Looker on VimeoLooker Exploring Data tutorial, Looker Browsing Data tutorial.
  • Developer Documentation should be high-level and generally describe the differences between multiple Models if more than one exist, the purpose of each Explore, and the the purpose of key derived tables.

Git

Looker uses git as a version control system for LookML code. Git is a tool used by software developers to track code changes, allow many developers to work on the same codebase, and integrate changes into a single source of truth.

Looker uses git as a version control system for LookML code. Git is a tool used by software developers to track code changes, allow many developers to work on the same codebase, and integrate changes into a single source of truth.

Commit messages

Commit messages should always take the active voice in present-tense. Along with helpful specific commit messages, it should be clear who changed what and why.

For example, instead of saying ”fixed foo” say “fixing foo” or “fixes foo.” If the Project uses a ticketing system and your LookML updates relates to a ticket, put the ticket number in the commit message, e.g. “[PROJ-123] fixes bug in the flux capacitor.”

Commit Frequently, Push Often

Create small change sets by committing frequently. Small change sets are less risky, more clear upon later review, and keep commit messages relevant. Additionally, pushing often allows developers to integrate and build on changes more quickly, creating faster iteration cycles.

Use pull requests

Pull Requests are supported for a number of different repo hosting providers including Github, Gitlab, and BitBucket.

  • Set up Looker to use the Pull Request workflow following this Discourse article.
  • If you are unfamiliar with how to use Pull Requests, see this Github documentation.

Views

Delete Auto-generated Cruft

When leveraging Looker auto-generated Views from database tables, start by deleting auto-generated measures (e.g. count) and field sets.

When creating a new View—especially if auto-generated—start by hiding all dimensions and measures. Only unhide dimensions and measures when it’s clear they are useful (e.g. when a business user requests something specific).

View File Layout

  • Primary key dimensions, if any, should be declared first.
  • Separate long SQL statements into multiple lines.
  • Use a big ###MEASURES### comment to clearly denote where the measures start. This is useful when scanning the file.

Dimension field ordering guidelines

Use the following ordering of declarations when defining a dimension. The hidden: declaration can be omitted if the dimension is not hidden.

dimension: my_dimension {

    hidden: yes

    group_label: “Cool Dimensions”

    label: “My Dimension”

    description: “This is a description block.”

    type: string

    sql: ${TABLE}.my_dimension

}

Dimension naming guidelines

Follow SQL naming conventions: all lowercase names with a single underscore to separate words. If the database uses a different naming convention, use the dimension name to create SQL standard names.

  • Name yesno type fields as a question where the answer is either “yes” or “no” so that it is clear to the user what a “yes” or “no” indicates.

For example, a yesno dimension with “yes” indicating a customer returned an item could be named is_order_returned.

  • For date dimension_groups, avoid words like date or time in the name. Looker automatically includes the timeframe at the end of the dimension group name, so this would create duplication.

For example, for a dimension group called order_date, Looker creates the dimensions order_date_date, order_date_month, etc. Simply use order as the dimension group name, which becomes order_date, order_month, etc.

Measure Naming Guidelines

  • Name aggregate measures by explicitly stating the aggregate type and field.

For example, total_[FIELD] for sum, average_[FIELD] for averages, unique_[FIELD] for count distincts.

  • Name ratios descriptively, explicitly stating what is measured in the ratio.

For example, items_per_order is more clear than items_percent.

Every column should be a dimension

Every column should be an untransformed dimension first. Then, build derived dimensions and measures using those base dimensions. This provides an additional layer of abstraction over the database.

For example, if a column name changes, updating the SQL: parameter in the original dimension would flow through to the rest of the uses of that column.

Measures should only reference dimensions or measures

Measures should only reference dimensions or other measures. Do not use measures that access ${TABLE} columns directly.

Joins

  • Avoid joining on concatenated primary keys defined in Looker, as this may slow down queries.
  • Use the fields directly in the join predicate or create a derived table that acts as a key/mapping table.
  • Use sql_on instead of foreign_key.
  • Use raw date dimensions for date Joins. This prevents any issues with Looker casting date values to other data types or converting time zones unexpectedly.

View labels

Utilize view labels so that only one View appears to the business user. Splitting the Explore menu based on which View a dimension/measure comes from can be confusing to business users and may not align with how they think about their data. 

Use group_label to Group Related Dimensions and Measures

In addition to liberally hiding fields and using the fields: [] parameter in the join block, group labels lend a logical grouping that helps users navigate the data.

  • Group dimensions with the business entity they relate to. Common groups used are “Product Dimensions,” “Customer Dimensions,” and “Order Dimensions”.
  • Group measures by their related KPI. Some common measure groups are “Revenue Measures,” “Order Measures,” and “Customer Measures”. Commonly, there are other groups like “Per Order Measures” or “Per User Measures” that would have average order value or spend per user measures, respectively.

Reports, dashboards, & visualizations

Link from a Dashboard to Another Dashboard

Linking from one dashboard to another is a good tactic to use when clients have dashboards with many (>15) reports. The general design pattern is to create one dashboard with high level KPIs and then a drill down dashboard that is filterable by product, client, region, etc. See this Discourse article for how to implement this.

Development space

Create a collaborative Space reserved for Look and Dashboard development. Once Dashboards or Looks are production ready, they can be moved out of the “development” space into the “production” user-facing space.

Y-axis labeling

Remove the axis names for single Y-Axis charts where the measure units is included in the chart title.

For example, a chart titled “Total Spend” doesn’t need a Y-Axis named “Spend.” If you format the ticks to include “$”, then it’s clear that spend is measured in dollars.

X-axis labeling

When the axis is a time dimension, you don’t need an x-axis label. It’s clear that the x-axis is time-based given the tick labels. The grain of the data (e.g. daily, weekly) should be explicit in the chart title.

Disable full field names

Do not include full field names in any chart, table, or visualization. It’s often repetitive and confusing to the user.

General best practices

Avoid derived tables

Transformations can generally happen on the fly and codified LookML is more business user-friendly than raw SQL. Additionally, persistent derived tables (PDTs) add complexity in scheduling and dependency management.

Keep data at the lowest level of granularity

It is best to limit data transformation logic to accomplish light-denormalization.

This can drastically improve performance, especially if data is distributed or partitioned based on the expected query predicates. On large datasets, queries of this type can be performant. Keeping data at the lowest level of granularity allows for maximum flexibility and keeps complexity relatively low. The goal of denormalization is to create Looker queries of the form:

select {dimensions}, {sum or count aggregates} from {single table} where {predicate} group by {dimensions}

Keep the number of explores to a minimum

Limit the number of Explores to create clear expectations of what users can find in each one and how they are different. Create Explores that map to business entities.

For example, an eCommerce site might have Explores for Orders, Users, Shipments, and Products. A SaaS might have Explores around Users, Visits, Pages, etc.

Number models/explores with labels to order them in the menu

Add labels to the Model and/or Explores so that the they are ordered in the dropdown menu according to what is most important.

For example, if you label a Model “1. My Important Model” that Model and all of its Explores will appear at the top of the Explore menu. If you label an Explore “1. My Cool Explore”, that Explore will appear at the top of the group of Explores within a Model.

Limit the number of views

Keeping a project limited to only the Views necessary to model the data helps developers move quickly and decisively.

Avoid table calculations

If a table calculation will be used more than once, it is generally better to make it a measure. Table calculations tightly couple business logic to an individual report. Other users cannot easily reuse that metric in other reports, and business users find them confusing to edit/create since they look like “code.” They also have limited flexibility, and any change to a table calculation requires editing each individual Look that uses it instead of making a simple LookML change.

Keep formatting clean

  • Delete any extra line breaks. Code blocks should be separated by a single, empty line.
  • Order dimension and measures alphabetically within their group to make it easy to scan and find a specific field.
  • During development, it is common to experiment by creating tests and commenting code in and out. However, before pushing updates, remove any unnecessary code.
  • In general, there should be minimal comments in production. Only comment things that are not immediately clear or deviate from general expectations (e.g. weird join relationships).
  • Try to make the code itself as clear as possible; make heavy use of labels and descriptions. This provides instruction for the business user and documents the logical meaning of dimensions and measures.

…but where to start?

This LookML Style Guide is certainly not exhaustive. Instead of the end all be all, this Guide should act as a base as you continue to build LookML practices that best fit your company; Looker developers should use this Ultimate LookML Style Guide as a foundation. Need help getting your Looker Project started? Reach out to info@DAS42.com to learn why we are a trusted Looker Consulting Partner.

Looker’s online Discourse Community and Help Center are also great places to look for further best practices. Give these posts a try:

Best Practice: LookML Dos and Don’ts

Best Practice: Writing Sustainable, Maintainable LookML

Best Practice: Create a Positive Experience for Looker Users

Fabio’s Opinionated LookML & SQL Style Guide / Ruleset