
Thoughts on Marketo Summit 2017 and Project Orion

by | 13.Jun.17

Marketo Project Orion

As a Marketo service partner, we always attend the annual Marketo Summit, and I’ve learned two keys for success when it comes to getting the most out of your trip:

  • Wear comfortable shoes
  • If anyone on the Marketo Product team is giving a talk, you go to that talk. Sorry if you were interested in anything else scheduled at that time

Product talks are often scheduled at odd times (early in the morning, the last talk of the day, the last day of Summit), are poorly attended, and often draw few questions. They are also probably the single most important part of a Marketo conference. This year, three talks (Big Data Architecture and Performance Roadmap, Marketo API Office Hours, and Marketo Product Roadmap and Customer Love 2017) offered great insight into the future of Marketo through the rest of 2017 and into 2018. Since product talks can sometimes be too technical for a general audience, let’s distill some answers to the biggest question attendees have: what’s next for Marketo?

2016 vs. Beyond: Project Orion

Many of you will remember the launch of Project Orion, Marketo’s move to a big data architecture, as one of the company’s highlights of 2016. But what did that actually mean?

The best way to explain Orion is to think of the core of Marketo as four parts:

  • Intake: the points of data or activity that Marketo ingests
  • Storage: writing those points of data and activity to the database
  • Evaluation: deciding, based on incoming data, what should be done
  • Execution: carrying out the actions Marketo needs to take

So, let’s say you have a campaign with the following:

  • Smart List: Trigger: Visits Web Page
  • Flow: Add +10 to Lead Score, Send Email
  • Schedule: Only allow record to flow through once

Now, let’s break that down a bit. When a record visits the web page, there are several things that happen:

  • Intake: Munchkin notes that the web page has been visited and calls Marketo to let it know
  • Storage: The activity of the web page being visited is written to the record’s Activity Log
  • Evaluation: Based on this activity, Marketo needs to check all triggers and see if this activity qualifies (this, incidentally, is why you should try to limit your number of triggers!). Once it sees the qualifying criteria, it also needs to check whether the record has run through the trigger campaign before, since we set this campaign to flow through only once.
  • Execution: Marketo now needs to use the MLM engine to add 10 points to the Lead Score and send the email.
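The four steps above can be sketched in code. This is a minimal, hypothetical model (the function and class names are mine, not Marketo’s actual internals) of the trigger campaign described above, including the run-once schedule:

```python
# Hypothetical sketch of Marketo's four-stage flow for the example campaign.
# Names and structure are illustrative, not Marketo's real implementation.

from dataclasses import dataclass, field

@dataclass
class Record:
    name: str
    lead_score: int = 0
    activity_log: list = field(default_factory=list)
    campaigns_run: set = field(default_factory=set)

def intake(record, activity):
    """Munchkin (or another source) reports an activity to Marketo."""
    storage(record, activity)
    evaluation(record, activity)

def storage(record, activity):
    """Write the activity to the record's Activity Log."""
    record.activity_log.append(activity)

def evaluation(record, activity):
    """Check the activity against triggers; honor the run-once schedule."""
    if activity == "Visits Web Page" and "web-visit-campaign" not in record.campaigns_run:
        record.campaigns_run.add("web-visit-campaign")
        execution(record)

def execution(record):
    """Carry out the flow steps: +10 to Lead Score, send the email."""
    record.lead_score += 10
    record.activity_log.append("Send Email")

chip = Record("Chip Dipson")
intake(chip, "Visits Web Page")
intake(chip, "Visits Web Page")  # second visit: run-once schedule blocks the flow
print(chip.lead_score)           # 10, not 20
```

Note how the second visit is still written to the Activity Log at the Storage stage, but the Evaluation stage stops it from reaching Execution.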

Now, let’s add a second layer of complexity: not all executed activities are created equal! Certain things are more important for Marketo to do first, even if other activities came before them. This is why sometimes you’ll see activities seemingly “out of order,” like an email’s open being recorded before its “Email is Delivered” activity is written. Marketo gives different priorities to different actions, so if there are many different activities going on simultaneously (such as what you would see in your Campaign Queue), it knows what to do first.

For instance, sending an email or an alert has the highest priority of anything in Marketo. So if five records are having their scores changed by one trigger and a sixth needs an email sent due to another trigger, the record awaiting the email jumps to the front of the line, even if its trigger fired after the score changes were queued.

This makes sense; one is more timely than the other. This is also why, when you scale to very large Marketo instances, you’ll want to separate high-priority activities from low-priority ones. In other words, have the trigger campaign handle only the high-priority task of sending the email, and use a batch campaign to deal with the score. Otherwise, your important tasks can get stuck behind less important ones before Marketo can complete them and move on to the next item.
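This “jump the line” behavior is essentially a priority queue. Here is a toy illustration (the priority values are assumed for the example, not Marketo’s real ones) of why a send-email step is served before score changes that arrived earlier:

```python
# Toy priority queue: "Send Email" outranks "Change Score" regardless of
# arrival order. Priority numbers are made up for illustration.

import heapq

PRIORITY = {"Send Email": 0, "Change Score": 5}  # lower number = served first

queue = []
for seq, action in enumerate(["Change Score"] * 5 + ["Send Email"]):
    # seq preserves arrival order among actions with equal priority
    heapq.heappush(queue, (PRIORITY[action], seq, action))

order = [heapq.heappop(queue)[2] for _ in range(len(queue))]
print(order)  # "Send Email" comes out first, even though it arrived last
```

The five score changes arrived first, but the email is popped first because its priority value wins the comparison.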

Now, how does all that relate to Orion? In short, Orion overhauled the way Marketo records and stores activity and the way it determines the priority of an action. Before Orion, the performance bottleneck was at the Intake and Storage stages of Marketo’s process.

For instance, if you look at this chart, you’ll notice a few things like:

  • Web tracking data being stored as activity, but also jumping directly into evaluation
  • Multiple sources, such as Marketo’s MLM engine and web services, accessing the database storing activity independently of each other
  • The use of MySQL database storage, which excels at reading data and executing simple queries quickly*

Block diagram illustrating Marketo's original activity architecture described in four stages from the left: Intake, Storage, Evaluation and Execution

The Orion architecture addresses some of the problems with this setup by organizing all of Marketo’s activity sources (the MLM engine, web tracking, and web services) behind a single service that writes activity into HBase, which excels at writing data and executing complex queries quickly.

Block diagram illustrating Marketo's new Orion activity architecture described in four stages from the left: Intake, Storage, Evaluation and Execution

This brought some key improvements to the platform, including:

  • Faster import of records and activities into Marketo
  • Less reliance on cached Smart Lists
  • Traffic volume cap on Munchkin web page visits lifted
  • Better performance for Web Page Activity and Company Web Activity reports

Additionally, there have been changes specifically around Munchkin tracking to support this faster intake: as some of you may remember, the ability to view and manipulate anonymous records in Marketo was removed in 2016, because those records added intake data that went unused the majority of the time.

Likewise, the upcoming change in Marketo’s API from IDs to marketoGUID is also influenced by Orion: previously, every record inside Marketo, whether anonymous or known, was sequentially assigned a number. So, you could have a setup like this:

  • ID 1: Chip Dipson
  • ID 2: Anonymous activity, never shows up again
  • ID 3: Anonymous activity (Dip Dobson’s phone)
  • ID 4: Anonymous activity, persistent
  • ID 5: Dip Dobson, cookied on a laptop (which one day may or may not merge with ID 3)

As a result, your list of known-record IDs would jump from 1 to 5, and as you can imagine, once you had enough anonymous activity and had been running Marketo long enough, you’d be writing some pretty large ID numbers for everyone. marketoGUID solves this problem by assigning each record a 128-bit identifier (rendered as a 36-character string) and dropping anonymous activity from the system after 90 days with no action. Basically, you’re going from recording ID 3 to recording GUID dff23271-f996-47d7-984f-f2676861b5fa, meaning much larger websites with higher traffic can be supported.
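You can see the difference with Python’s standard `uuid` module. A GUID/UUID is a 128-bit value conventionally rendered as a 36-character string (32 hex digits plus 4 hyphens); `uuid4()` below is just an illustration of the format, not Marketo’s actual generation scheme:

```python
# Sequential integer IDs vs. a marketoGUID-style identifier.
# uuid4() is illustrative only; Marketo's generation scheme may differ.

import uuid

sequential_ids = [1, 2, 3, 4, 5]  # known and anonymous records interleaved
guid = uuid.uuid4()

print(len(str(guid)))                # 36 characters: 32 hex digits + 4 hyphens
print(guid.int.bit_length() <= 128)  # True: the underlying value fits in 128 bits
```

Because GUIDs are drawn from a 128-bit space rather than counted upward, there is no ever-growing sequence to exhaust, no matter how much anonymous traffic flows through.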

However, the speed improvements Orion brings have come with a tradeoff: with records and activities now recorded faster, the bottleneck has moved to the other end of the equation, evaluation and execution. To use Ajay Awatramani’s example, think of Marketo as a train station:

  • Your activities are people buying train tickets
  • Your trigger/batch Smart List evaluations are the turnstiles bringing people with tickets into the station
  • Your flow and campaign execution, however, are just a few train terminals and trains, so your ticketed passengers are backed up!

As a result, activities like sending emails or changing data get backed up: too many records are queued in campaigns, and there isn’t enough bandwidth to execute all the flows.

Marketo plans to solve this with elastically scalable campaign execution, which will handle heavy loads by adjusting the number of “terminals” and “trains” as needed, on a phased rollout starting in late 2017.
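The train-station math is simple to sketch. This back-of-the-envelope model (all numbers are made up for illustration) shows how a backlog forms when arrivals outpace execution capacity, and disappears when more “trains” (workers) are added:

```python
# Back-of-the-envelope model of the execution bottleneck.
# Arrival and processing rates are invented for illustration.

def backlog(arrival_rate, workers, per_worker_rate, minutes):
    """Activities still queued after `minutes`, given arrivals vs. capacity."""
    capacity = workers * per_worker_rate
    return max(0, (arrival_rate - capacity) * minutes)

# 1,000 activities/min arriving; each execution worker clears 200/min
print(backlog(1000, 2, 200, 60))  # 2 workers: 36,000 activities queued after an hour
print(backlog(1000, 5, 200, 60))  # 5 workers: capacity matches arrivals, no backlog
```

Elastic scaling amounts to raising `workers` when the queue grows, which is exactly the “add more terminals and trains” fix in the analogy.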

*Technical nerds note: Yes, I am aware the difference between RDBMS and NoSQL is a little more nuanced than this, but the point of this article isn’t to go over database structure.
