When we create content as marketers, our goal is to get our content viewed as widely as possible (in most cases)—and as a result, having search engines pick up content is usually seen as a good thing.
However, crawling the Internet for every single link can have some downsides: you may have late-stage content exposed, or worse, prospect and customer data exposed. Here’s how you can protect yourself from the latter when using Marketo.
How does all this work?
As a search engine crawls the Internet, it tries to index as many links as possible and explore all variants of a page, including any parameters (the key-value pairs that appear after a ? in the URL). It’s in a search engine’s interest to record any parameters it can find on a URL, since this may retrieve specific pages that help the user find what they’re looking for more easily. For example, take a look at this Old Navy URL for “Graphic Faux-Leather Sunglasses Case”:
In this case, we have three parameters:
pid: The product ID, which tells the site which specific product page to load (220808) and which product variant to show (102).
cid: The Adobe Analytics Campaign ID, which is generated from in-site browsing.
pcid: The cookie value that identifies you as a unique visitor.
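To make the parameter breakdown above concrete, here is a minimal Python sketch that pulls the key-value pairs out of a URL. Only the pid value comes from this article; the cid and pcid values are made-up placeholders, and splitting pid into a six-digit product ID plus a three-digit variant is an assumption based on the description above, not a documented Old Navy format.

```python
from urllib.parse import urlsplit, parse_qs

# Only pid=220808102 appears in the article; the cid and pcid values
# below are hypothetical placeholders used purely for illustration.
url = (
    "https://oldnavy.gap.com/browse/product.do"
    "?pid=220808102&cid=EXAMPLE_CID&pcid=EXAMPLE_PCID"
)

def query_params(url):
    """Return the URL's query string as a dict of name -> first value."""
    parsed = parse_qs(urlsplit(url).query)
    return {name: values[0] for name, values in parsed.items()}

params = query_params(url)
print(params)
# {'pid': '220808102', 'cid': 'EXAMPLE_CID', 'pcid': 'EXAMPLE_PCID'}

# Assumption: the first six digits are the product, the rest the variant.
product_id, variant = params["pid"][:6], params["pid"][6:]
print(product_id, variant)  # 220808 102
```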
This makes sense for tracking user behavior, and if you look at the site’s Tealium/Adobe Analytics setup, you can see how all this data is fed back to Old Navy. They use those parameters to personalize the experience and record internal tracking data. However, if you search Google for “Old Navy Graphic Faux-Leather Sunglasses Case”, the URL that shows up is only https://oldnavy.gap.com/browse/product.do?pid=220808102. This retrieves the correct product page, but leaves out the campaign details that could skew the data. Imagine if pcid was in the URL Google displayed: every single person who clicked on that URL from Google would be set as the same person!
However, that can (and does) happen with parameters if you’re not careful. If you look at the flowchart from our article on email clicks, you’ll notice on the final step that the URL given to a person via email will, if it’s being tracked, end with a mkt_tok parameter. That parameter allows the web visit to be tied back to a specific person, effectively connecting the person who opened and clicked on the email with any subsequent web traffic. However, as long as the mkt_tok parameter is included in the URL, it will continue to be associated with the record it was originally generated for. This is why it can be problematic to forward emails meant for clients to people within your company, for instance: your internal team may become cookied as your prospect.
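One small precaution when passing a tracked link around internally is to drop the mkt_tok parameter first. The following is a rough Python sketch of that idea; the function name strip_params and the example.com URL are hypothetical, and this is not a Marketo feature, just standard URL handling.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

def strip_params(url, blocked=("mkt_tok",)):
    """Return the URL with any blocked query parameters removed."""
    parts = urlsplit(url)
    kept = [
        (name, value)
        for name, value in parse_qsl(parts.query, keep_blank_values=True)
        if name not in blocked
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))

# Hypothetical tracked link; the token value is a placeholder.
tracked = "https://example.com/page.html?utm_source=email&mkt_tok=abc123"
print(strip_params(tracked))
# https://example.com/page.html?utm_source=email
```

Everything except the blocked parameter is kept, so other tracking you do want (such as UTM values) survives.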
More problematically, though, once this mkt_tok value is associated with a given web browser, it can be used to pull that record’s data in insidious ways. For instance, if you do a Google search for “unsubscribe mkt_tok”, you’ll find quite a few results from Marketo customers that expose user data in the unsubscribe form because the mkt_tok parameter is indexed.
For example, looking at one Marketo customer, we can see that a value is pre-populated on their unsubscribe page because Google has kept the mkt_tok parameter:
Where this gets truly concerning, though, is that once a mkt_tok value is populated to a device, any prefill on file for that mkt_tok value can be seen. After doing a quick search for any other Marketo landing pages from that same company, I was able to pull pretty sensitive data just from loading a page with a form:
This is incredibly problematic from a privacy and compliance standpoint. Fixing this issue does require a bit of web work, but can—and should—be solved by your team.
Fixing the issue
First, you’ll want to see if your website domain has been set up with Google Search Console and Bing Webmaster Tools. Generally, there are a few different ways to tell, so check to see if you have any of the following:
Visit MXToolbox and enter your domain. See if any values come back with google-site-verification= in them. Bing does a similar DNS-level authentication where a string is assigned to a CNAME that resolves to the value verify.bing.com, but you will need to view your DNS records to confirm this.
Check (or have your web team check!) the root folder of your domain to see if there are any verification files, such as an HTML file that starts with google (for example, google374[…].html) or BingSiteAuth.xml for Bing.
Check the meta tags of your corporate website to see if you find any values such as <meta name="google-site-verification" content="[…]" /> or <meta name="msvalidate.01" content="[…]" />
If you use Google Analytics or Google Tag Manager, you may be authorized to access Google Search Console through those properties.
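If you would rather not read through the page source by hand, the meta tag check above can be sketched in a few lines of Python. This is a minimal example that scans an HTML string you have already saved or downloaded; the class and function names are hypothetical, and the content values in the sample are placeholders.

```python
from html.parser import HTMLParser

# Meta tag names mentioned above, mapped to the tool they belong to.
VERIFICATION_NAMES = {
    "google-site-verification": "Google Search Console",
    "msvalidate.01": "Bing Webmaster Tools",
}

class VerificationTagFinder(HTMLParser):
    """Collects verification meta tags found while parsing HTML."""

    def __init__(self):
        super().__init__()
        self.found = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in VERIFICATION_NAMES:
            self.found[VERIFICATION_NAMES[name]] = attributes.get("content")

def find_verification_tags(html):
    finder = VerificationTagFinder()
    finder.feed(html)
    return finder.found

# Sample markup with placeholder content values.
sample = """
<html><head>
<meta name="google-site-verification" content="PLACEHOLDER_1" />
<meta name="msvalidate.01" content="PLACEHOLDER_2">
</head><body></body></html>
"""
print(find_verification_tags(sample))
# {'Google Search Console': 'PLACEHOLDER_1', 'Bing Webmaster Tools': 'PLACEHOLDER_2'}
```

An empty result only means no meta tag was found on that page; the site may still be verified through DNS, a file upload, or Google Analytics/Tag Manager.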
If you find evidence that Google Search Console or Bing Webmaster Tools has been set up, you’ll want to find out whose account these are registered under (likely someone in digital marketing or IT) and gain access. If you can’t find any evidence or are locked out of the accounts which have access to your site, you’ll need to create a Google/Microsoft account and sign up for their tools. Moz has a great article explaining how to set up and verify your property in Google Search Console, and Bing has step-by-step instructions available on their site.
Once you’re successfully inside each platform, you’ll want to do the following:
Google Search Console
1. Visit https://www.google.com/webmasters/tools/crawl-url-parameters?hl=en. Select your website from the property dropdown.
2. If you have had your Search Console site set up for a while, you will see a list of parameters Google has detected in your URLs. If you see mkt_tok, you’ll want to click “Edit” next to the parameter name:
Otherwise, you’ll need to click the “Add parameter” button:
3. If you clicked “Add parameter,” enter mkt_tok in the Parameter (case-sensitive) field. If you clicked “Edit,” make sure that Parameter: mkt_tok is at the top of your pop-up modal.
4. You should see a dropdown asking “Does this parameter change page content seen by the user?”
In this case, you need to select “Yes: Changes, reorders or narrows page content.” You may be tempted to select “No: Doesn’t affect page content (ex: tracks usage)”, but that option actually tells Google to index one representative mkt_tok value it finds, and in this case we want to make sure no mkt_tok values are indexed whatsoever.
5. Under “How does this parameter affect page content?,” select Other, then click “No URLs” from the list of options.
The “No URLs” setting means that Google will ignore, and not index, any URLs that it finds with the mkt_tok parameter in them. This does not mean that your pages will no longer be indexed overall. Instead, think of it like this. If Google finds the following two pages:
Google will know to index the first, but to not index the second.
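To illustrate the distinction, here is a small Python sketch that applies the same rule to a pair of URLs. Both example.com URLs and the token value are hypothetical stand-ins, and the function only mimics the outcome of the “No URLs” setting; it is not how Google implements it.

```python
from urllib.parse import urlsplit, parse_qs

def has_blocked_param(url, blocked=("mkt_tok",)):
    """True if the URL's query string contains any blocked parameter."""
    query = parse_qs(urlsplit(url).query, keep_blank_values=True)
    return any(name in query for name in blocked)

# Hypothetical pair of URLs for the same landing page.
pages = [
    "https://example.com/unsubscribe.html",
    "https://example.com/unsubscribe.html?mkt_tok=abc123",
]

for page in pages:
    verdict = "skip" if has_blocked_param(page) else "index"
    print(verdict, page)
# index https://example.com/unsubscribe.html
# skip https://example.com/unsubscribe.html?mkt_tok=abc123
```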
6. Click Save. If you have any other parameters that should be removed beyond mkt_tok (common examples being campaignID/cid, any URL parameters that you use to pass info to forms, fbclid/gclid in some cases, and other ad trackers; note that UTM parameters are already excluded), repeat steps 1 through 5.
Bing Webmaster Tools
1. Select your site from My Sites.
2. On the left navigation, find Configure My Site -> Ignore URL Parameters.
3. Enter mkt_tok under “Which parameter would you like Bing to ignore?” and hit Submit.
Add any additional parameters you need to and submit those as well.
Finally, to prevent this issue from recurring in the future, you’ll want to make sure your robots.txt file for your website includes the following line:
This tells other search engines to never index any URLs that have mkt_tok anywhere in their path. If you do not have a robots.txt file set up for your Marketo landing pages, you can either upload one to your Marketo instance or use your corporate website’s robots.txt. Either way, set up the redirect by going to Admin -> Landing Pages -> Rules and clicking New -> New Redirect Rule. From here, click “Use non-Marketo Landing Page”. You can redirect your robots.txt to either the full path of the robots.txt file you uploaded or your corporate robots.txt.
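As a rough way to sanity-check a wildcard robots.txt rule before you publish it, the Python sketch below tests URLs against a Disallow pattern. The pattern /*mkt_tok= is an assumed example, not necessarily the exact line this article recommends, and the matcher is a simplified take on the * and $ wildcard syntax that major search engines support. Python’s built-in urllib.robotparser does not handle wildcards, which is why the matching is written by hand here.

```python
import re
from urllib.parse import urlsplit

def rule_to_regex(rule):
    """Translate a Disallow pattern ('*' wildcard, optional '$' end anchor)."""
    anchored = rule.endswith("$")
    if anchored:
        rule = rule[:-1]
    body = ".*".join(re.escape(part) for part in rule.split("*"))
    return re.compile("^" + body + ("$" if anchored else ""))

def is_disallowed(url, rules):
    """True if the URL's path plus query string matches any Disallow rule."""
    parts = urlsplit(url)
    target = parts.path or "/"
    if parts.query:
        target += "?" + parts.query
    return any(rule_to_regex(rule).match(target) for rule in rules)

# Assumed example rule; adjust it to match your own robots.txt.
rules = ["/*mkt_tok="]

print(is_disallowed("https://example.com/unsubscribe.html?mkt_tok=abc123", rules))  # True
print(is_disallowed("https://example.com/unsubscribe.html", rules))  # False
```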
As marketers get more and more involved in the digital sphere and are increasingly becoming the data stewards within companies, it’s important to get in front of potential privacy and compliance issues such as this—with the side effect of not skewing your Marketo or web traffic data with inaccurate parameters. Curious to know if your company is affected by this issue? Feel free to reach out for an engagement.
About the Author: Courtney Grimes