# Statsdcc

Statsdcc is a Statsd-compatible high performance multi-threaded network daemon written in C++. It aggregates stats and sends the results to backends, especially Graphite. We are proud to announce that we are opensourcing it today. Check out the code at https://github.com/wayfair/statsdcc.

At Wayfair we’re big believers in “measure anything, measure everything,” as the “Statsd is reborn in node.js at Etsy” announcement put it. We do application performance monitoring with the opensource tools Graphite, Grafana, the ELK stack (Elastic Search/Logstash/Kibana), and some homegrown tools. Until recently we had been using Flickr/Etsy’s 2nd-generation, node.js-based Statsd to collect metrics for Graphite. As the volume of these metrics increased, we noticed inconsistencies in the data, and realized that some metrics were being dropped. Long story short, we tried some architectural changes, scaling Statsd and Carbon horizontally (details below), but as the operational complexity of that increased, we began to wonder why we needed so many boxes. We found a bottleneck in the way Statsd buffers and flushes data to Carbon, and we decided we needed a different version.

Alternatives:

There are already quite a few alternative Statsd implementations available, but none of them came close to meeting all of our needs. Brubeck by GitHub is one that we found interesting, because it promised high throughput. Unfortunately, it was released after we had Statsdcc implemented and were ready to put it into production. At that point, we had no reason to take Brubeck and extend it to support the features we needed. However, we borrowed from Brubeck the idea of integrating a webserver to view application health. Statsdcc and Brubeck try to solve similar problems. I would recommend checking out all these implementations and picking the one that best fits your needs.

TL;DR

If you’re interested in what we tried before starting to hack the C++, read on.

Attempts at Horizontal Scaling with Statsd:

Statsd performs aggregations on incoming metrics and sends the aggregates to a Carbon process, which in turn saves received metrics to a Whisper database.

To scale, we use multiple Statsd/Carbon chains. Each chain goes to a different disk.  Proxy daemons hash metric names to determine which chain to use. Which proxy daemon is chosen depends on round robin DNS. Consistent hashing ensures metric names are well balanced.

The diagram below depicts the architecture.

Issues:

A year ago we noticed that a certain set of metrics were being dropped, resulting in inconsistent monitoring data. We realized that this was due to a maxing out of UDP receive buffers on Statsd. So we tried adding more Statsd processes with increased UDP buffer sizes.

However, adding a new process is complicated. When a new Statsd instance is added, consistent hashing by a reverse proxy will re-route some metrics to the new process, resulting in duplicate files on different Carbon nodes for the same metric – one for the old data and one for the new data. To save space, and for Graphite to show all data, the old Whisper data files should be merged into the new ones.
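To make the re-routing concrete, here is a minimal sketch of a consistent-hash ring. It is not the proxy's actual code: the FNV-1a hash, the chain names, and the 100 points per node are all assumptions made for illustration.

```javascript
// FNV-1a 32-bit string hash (assumed; any stable string hash would do here).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// Place several points ("virtual nodes") per chain on the ring, sorted by hash.
function buildRing(nodes, pointsPerNode = 100) {
  const ring = [];
  for (const node of nodes) {
    for (let i = 0; i < pointsPerNode; i++) {
      ring.push({ point: fnv1a(node + '#' + i), node });
    }
  }
  ring.sort((a, b) =>
    a.point - b.point || (a.node < b.node ? -1 : a.node > b.node ? 1 : 0));
  return ring;
}

// A metric belongs to the first ring point at or after its hash, wrapping around.
function lookup(ring, metricName) {
  const h = fnv1a(metricName);
  let lo = 0;
  let hi = ring.length;
  while (lo < hi) {
    const mid = (lo + hi) >> 1;
    if (ring[mid].point < h) lo = mid + 1; else hi = mid;
  }
  return ring[lo % ring.length].node;
}

// Hypothetical chains: three existing ones, then a fourth is added.
const before = buildRing(['chain-a', 'chain-b', 'chain-c']);
const after = buildRing(['chain-a', 'chain-b', 'chain-c', 'chain-d']);

let moved = 0;
for (let i = 0; i < 1000; i++) {
  const name = 'app.metric.' + i;
  if (lookup(before, name) !== lookup(after, name)) moved++;
}
console.log(moved + ' of 1000 metric names changed chains');
```

Only some metric names move, and every name that moves lands on the new chain. Those are exactly the metrics whose older Whisper files are left behind on the old chain and need merging.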

In the end we were unhappy with how much traffic an individual node could handle. We discovered that the problem was a design decision in Statsd, where the same thread is responsible for both buffering incoming metrics and performing aggregations on them at every flush interval. When computing aggregations, the thread stops listening for incoming metrics, which are stored in the UDP buffer. As the rate of metrics increases, the UDP buffer overflows and drops metrics. We use single-threaded, event-looping frameworks in a few places (Node.js-based daemons for a couple of things, Python-based gunicorn+gevent for several), and we have seen this type of problem before. The event loops don’t help you when you have a blocking IO operation that can bring processing to a halt. Sometimes we work around or solve such problems within the event-loop paradigm, and sometimes we take a completely different approach.
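The effect is easy to reproduce in a toy model. The sketch below is a simulation, not Statsd code, and every number in it (arrival rate, drain rate, buffer size, flush length) is made up for illustration. It counts how many packets a fixed-size receive buffer drops when the single reader stops reading for a few ticks at every flush.

```javascript
// Toy model of one thread that both reads the socket and runs the flush.
function simulateDrops({ ticks, arrivalsPerTick, drainPerTick, bufferSize, flushEvery, flushTicks }) {
  let buffered = 0;
  let dropped = 0;
  for (let t = 0; t < ticks; t++) {
    // Packets keep arriving whether or not the process is reading.
    const accepted = Math.min(arrivalsPerTick, bufferSize - buffered);
    dropped += arrivalsPerTick - accepted;
    buffered += accepted;
    // While aggregating for a flush, the thread reads nothing from the buffer.
    const flushing = t % flushEvery < flushTicks;
    if (!flushing) buffered -= Math.min(buffered, drainPerTick);
  }
  return dropped;
}

const base = { ticks: 20, arrivalsPerTick: 100, drainPerTick: 150, bufferSize: 200, flushEvery: 10 };
const withBlockingFlush = simulateDrops({ ...base, flushTicks: 3 });
const withoutBlocking = simulateDrops({ ...base, flushTicks: 0 });
console.log({ withBlockingFlush, withoutBlocking });
```

In this model the reader is faster than the incoming traffic, so nothing is lost as long as it never pauses. The drops come entirely from the ticks in which the flush holds the thread.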

After finding the actual root cause, we decided to rewrite Statsd as a multi-threaded application with a focus on effective use of socket-IO and CPU cycles.

Statsdcc:

Statsdcc is an alternative implementation for Statsd written in C++ for high performance. In Statsdcc, one or more server threads actively listen for incoming metrics. Server threads distribute incoming metrics among multiple workers using the formula worker = hash(metric name) % #workers. Worker threads read from their dedicated queues and update their ledgers until signaled to flush by a clock thread. Upon receiving this signal, the worker threads hand off their ledgers to short-lived flush threads, and continue with new ledgers until the next signal. To avoid lock contention and to pass metrics faster between server and worker threads, boost’s lock-free queues are used.
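The routing formula and the ledger hand-off can be sketched as follows. This is an illustration in JavaScript rather than Statsdcc's C++ source: the FNV-1a hash, the `Worker` class, and the counter-only ledger are assumptions, and the threads and lock-free queues are left out entirely.

```javascript
// FNV-1a 32-bit string hash (an assumed stand-in for the real hash function).
function fnv1a(str) {
  let h = 0x811c9dc5;
  for (let i = 0; i < str.length; i++) {
    h ^= str.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h >>> 0;
}

// worker = hash(metric name) % #workers
function pickWorker(metricName, workerCount) {
  return fnv1a(metricName) % workerCount;
}

class Worker {
  constructor() {
    this.ledger = new Map(); // metric name -> running total (counters only)
  }
  record(name, value) {
    this.ledger.set(name, (this.ledger.get(name) || 0) + value);
  }
  // On the flush signal: hand off the full ledger and continue with a new one.
  swapLedger() {
    const full = this.ledger;
    this.ledger = new Map();
    return full;
  }
}

const workers = [new Worker(), new Worker(), new Worker(), new Worker()];
function route(name, value) {
  workers[pickWorker(name, workers.length)].record(name, value);
}

route('checkout.orders', 1);
route('checkout.orders', 2);
const target = workers[pickWorker('checkout.orders', workers.length)];
const flushed = target.swapLedger();
console.log(flushed.get('checkout.orders')); // prints 3
console.log(target.ledger.size);             // prints 0
```

Because a given metric name always hashes to the same worker, each ledger is only ever touched by one worker, which is why the workers can aggregate without coordinating with each other.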

We have not gotten rid of consistent hashing as we did not want to lose the ability to scale horizontally. However, to solve the scaling problem in our previous architecture, where adding a new process required cleanup on the Carbon end, we moved consistent hashing from proxies to aggregators. The proxies distribute the incoming metrics among multiple aggregators using the formula aggregator = hash(metric name) % #aggregators. Each aggregator then sends the metric aggregation to its respective Carbon process by using the consistent hash. The difference from the previous architecture is that the Carbon process has more TCP connections open, one with each aggregator. However, unlike Statsd, instead of reopening connection on each flush, Statsdcc reuses established TCP connections, thereby avoiding the overhead of a TCP handshake. The diagram below describes the current architecture.

Statsdcc can handle up to 10 times more load (up to 400,000 metrics/sec) than Etsy’s Statsd. Only one instance of the Statsdcc aggregator handles all our production traffic, in contrast to the previous 12 Statsd instances. Statsdcc has been used in production for about 7 months. We hope more people will find Statsdcc as useful as we have at Wayfair.

# Tungsten in the news

There’s a great interview with our own Matt DeGennaro by Paul Krill of Infoworld that came out a few days ago. The topic is Tungsten.js, our awesome framework that ‘lights up’ the DOM with fast, virtual-DOM-based updates, React-style, and can be integrated with Backbone.js and pretty much whatever other framework you want. It’s spiffy, it has a logo, we do it github-first, and we’re getting a lot of mileage out of it at Wayfair. Matt mentions the templating aspect of our composite system: we use server-side PHP, including Mustache templates, and then our client-side pages, also including Mustache templates as needed, get dynamic updates via Tungsten.js. That works great for us because Mustache has implementations in both Javascript and PHP, among many other languages.

What’s that you say? The PHP implementation of Mustache is not fast enough for you? Well, we’ve got you covered! Adam Baratz just put up a blog post yesterday on a server-side optimization that’s been working well for us. We use John Boehr’s excellent PHP mustache extension, which is written in C++, and is much faster than vanilla PHP Mustache. Inspired by another snippet of PHP/Mustache code, we’ve even added lambdas to that, as Adam explains. I had to do a double-take the first time he explained that to me. As far as I can tell, the PHP community, of all groups of web programmers, is the least likely to care about lambdas in particular, and any kind of functional programming in general. And yet, we’re finding lambdas very useful for our globalization efforts, and we’re starting to use them for other things as well.

We’re still working on a date, but Adam, Matt and Andrew Rota will be giving a talk on all of this at the Boston Web Performance Meetup, hosted at Wayfair, in the near future.

# Wayfair Labs in the news

Scott Kirsner has a terrific piece about tech talent wars in Boston, that was in Beta Boston on Friday, and then in the print edition of the Boston Globe on Sunday, October 26th. It features Wayfair Labs, which is our hiring and onboarding program for level 1 engineers in most of the department (a few specialized roles excepted). I am the director of it, so if you have any questions, please reach out.

# Rendering Mustache templates with PHP

For the past couple years, Wayfair’s front-end stack has relied heavily on Mustache templates. They’ve let our growing front-end team focus on the front-end. They allow us to share more code between server and client as we push towards a Tungsten-powered future.

Anyone who’s seen a Mustache template knows that they’re pretty simple to write. Rendering them can be another story. We began using a pure PHP implementation. This got us off the ground, but as we expanded our use of templates, we ran into the unfortunate truth that such a library could never be faster than the pure PHP pages we were hoping to replace. Yet, we wanted to make it work, for the mentioned organizational and architectural reasons.

To understand why rendering Mustache in PHP is slow, first you have to understand what goes into rendering Mustache. Consider a canonical template/data example:

Hello {{name}}
You have just won {{value}} dollars!
{{#in_ca}}
Well, {{taxed_value}} dollars, after taxes.
{{/in_ca}}

{
"name": "Chris",
"value": 10000,
"taxed_value": 10000 - (10000 * 0.4),
"in_ca": true
}

This template must be tokenized, then each token must be rendered. Mustache is simple enough that tokenizing mostly amounts to splitting on curlies. Not a big deal for PHP. The real trick is in replacing {{name}} with the correct content. This is fine when you have a flat set of key/value pairs. Consider this example:

{{inner}}
{{#outer}}{{inner}}{{/outer}}

{
"inner": 1,
"outer": {
"inner": 2
}
}

The output should be “12”. When rendering the {{#outer}} section, the renderer must know which “inner” to display. This is typically implemented by turning the data hash into a stack. When entering/exiting sections, data is pushed/popped. To get a value, start at the top and descend until you find a match.
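The push/pop/lookup behavior can be sketched in a few lines. This is an illustration of the general technique in JavaScript, not the code of any particular Mustache library, and the `ContextStack` name is made up.

```javascript
// A context stack: the innermost section's data sits on top.
class ContextStack {
  constructor(root) {
    this.frames = [root];
  }
  push(frame) {
    this.frames.push(frame);
  }
  pop() {
    return this.frames.pop();
  }
  // Start at the top of the stack and descend until a frame has the key.
  lookup(name) {
    for (let i = this.frames.length - 1; i >= 0; i--) {
      const frame = this.frames[i];
      if (frame !== null && typeof frame === 'object' &&
          Object.prototype.hasOwnProperty.call(frame, name)) {
        return frame[name];
      }
    }
    return undefined;
  }
}

const data = { inner: 1, outer: { inner: 2 } };
const stack = new ContextStack(data);

let out = String(stack.lookup('inner')); // {{inner}} at the top level
stack.push(stack.lookup('outer'));       // entering {{#outer}}
out += String(stack.lookup('inner'));    // {{inner}} inside the section
stack.pop();                             // leaving {{/outer}}
console.log(out); // prints "12"
```

Every variable tag triggers one of these walks, so a template with deep nesting and many tags does a lot of them per render.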

This is an easy operation to describe, but it makes for some slow PHP. It was the major performance bottleneck with the Mustache implementation we first used, and it’s an issue with another popular implementation.

Bearing this in mind, we sought a more radical solution. Enter php-mustache, a C++ implementation of Mustache as a PHP extension. C++ is much better than PHP at traversing stacks. Witness this before/after from when we first deployed php-mustache:

This chart shows the render time for the product grid on our browse pages (for example, Beds). It’s a complex mustache template with a lot of data and a lot of nesting. The X-axis is clock time, the Y-axis is render time in milliseconds.

This kind of lift allowed us to justify making Mustache a standard instead of an occasional tool. And, courtesy of the open source world, we didn’t even have to write it. However, it became something of a double-edged sword. As Wayfair operates stores in multiple countries, we have to localize a lot of strings. We started handling this by building all of them in PHP and loading them into the template. This led to some thick code in some cases, which occasionally created friction around using Mustache. The typical i18n solution for Mustache involves lambdas, which unfortunately were not implemented in php-mustache… until now! If you’re a performance-minded Mustache user, we hope you’ll check it out.

# PDO and MSSQL

When you write your first web application, chances are you’re going to query a database. When you write it in PHP, chances are it’ll look like this:

$mysqli = new mysqli("example.com", "user", "password", "database");
$result = $mysqli->query("SELECT * FROM product");
$row = $result->fetch_assoc();

Before long, you have to start handling user input, which means escaping:

$mysqli = new mysqli("example.com", "user", "password", "database");
$result = $mysqli->query("SELECT * FROM product WHERE name = '" . mysqli_real_escape_string($mysqli, $product_name) . "'");
$row = $result->fetch_assoc();

As your application grows, you start writing code like this a lot. You may start encapsulating it in a DAO, but they do little besides erect walls around this chimeric code. “Okay,” you say. “This is fine, because it’s only me. I’m a Responsible Engineer and I don’t have to sugar-coat things for myself.” But soon, this project is going gangbusters. You’ve got a team, and then a large one, and now there’s no rug large enough under which you can hide this mess. And woe unto you should you decide you need connection pooling or any other resource management.

One solution to this problem is an ORM. But, some people prefer having their database interactions more “managed” than “abstracted away.” Instead your code could look more like this:

$pdo = new PDO("mysql:host=example.com;dbname=database", "user", "password");
$statement = $pdo->prepare("SELECT * FROM product WHERE name = :name");
$statement->bindParam(":name", $product_name);
$statement->execute();
$row = $statement->fetch(PDO::FETCH_ASSOC);

A little more verbose, yes, but also easier to read and less error-prone. This is PDO. It’s a PHP extension that provides a vendor-agnostic interface to various relational databases. It pairs a well-structured API for performing queries with a series of different database drivers.

When Wayfair began adopting PDO, our database access was relatively managed. An in-house library managed connections over the course of a request, but building queries involved a whole lot of string concatenation. Complex queries would get unwieldy. Engineers with prior PDO experience wanted to know why we weren’t using it. However, to convince engineers new to PDO that it would make their lives easier, it had to be as low friction as the existing library and produce output in the same format.

Simplifying PDO syntax was the easy part. Technically, the example given is shy on error handling. The PDO constructor can throw exceptions. Related functions return a boolean value, indicating whether they succeeded. So a “correct” PDO example would look like this:

$pdo = null;
try {
    $pdo = new PDO("mysql:host=example.com;dbname=database", "user", "password");
} catch (Exception $e) {
    // logging, etc., if you want to note when you were unable to get a connection
}

$statement = false;  // PDO::prepare() will return false if there’s an error
if ($pdo) {
    $statement = $pdo->prepare("SELECT * FROM product WHERE name = :name");
}

$row = null;
if ($statement) {
    $statement->bindParam(":name", $product_name);
    if ($statement->execute()) {
        $row = $statement->fetch(PDO::FETCH_ASSOC);
    }
}

// now, do something with $row

Awesome, I know, right? Sure, the PDO API is, on the whole, “nicer,” but no one’s going to want to deal with it if they’re forced to jump through these kinds of hoops. And who could blame you? At Wayfair, we place a lot of value on developer ergonomics. These are problems we strive to solve well when rolling out new internal tools. We landed on a slight extension to PDO that would yield this syntax:

$statement = PDO::new_statement("PT", "SELECT * FROM product WHERE name = :name"); // the first argument refers to the desired host/database
$statement->bindParam(":name", $product_name);
$statement->execute();
$row = $statement->fetch(); // PDO::FETCH_ASSOC is now the default fetch style

We pulled all the boilerplate into a factory function. It does the necessary error handling and reporting. If everything succeeds, it’ll return a standard-issue PDO statement object. If there are errors, it will return a special object which acts like a statement that’s failed, but will return empty result sets if asked. We felt comfortable that this would remove most of the friction around using PDO while preserving the underlying interface. Anyone who wants finer-grained control can still utilize the stock API.

The trickier problem was “make output the same.” While PDO looks the same with each driver, the drivers don’t necessarily behave the same. The documentation isn’t always clear about these differences. We needed to do a fair amount of testing and source code reading to suss out the effects.

While my examples have used MySQL, Wayfair is an MSSQL shop. We had been using the mssql extension. It uses a C API called DBLIB to talk to the server. Microsoft doesn’t maintain an open source version. FreeTDS is the commonly-used free implementation.

One of the PDO drivers also uses DBLIB, but it returns column data differently. Instead of returning strings as strings and ints as ints, the PDO DBLIB driver returns everything as a string. We had to patch it to use the expected data types. To be able to differentiate between quoting strings as VARCHAR vs. NVARCHAR, we also added a parameter type. We also added support for setting connection timeouts (PDO defines a PDO::ATTR_TIMEOUT constant, but it has no effect with the DBLIB driver).

Another reason we were first attracted to PDO was for prepared statements. Since MSSQL supports them, it seemed like this could be an opportunity for a performance gain. However, after digging into the driver internals, we found that the DBLIB driver only emulates them.

Microsoft has an ODBC driver for Linux. We tested it in conjunction with PDO’s ODBC driver, but found the two to be incompatible. We were able to get it working with the plain odbc extension, but (amazingly) found prepared statements to be slower than regular queries. Since using prepared statements would’ve necessitated a nontrivial change in coding style, we decided against investigating the speed difference.

We’re currently working on deploying SQL Relay. Preliminary tests have proven out that it reduces network load without adding much overhead. It has a PDO driver, so we’ll be able to swap it into our stack without having to change how queries are made.

# Tungsten.js: UI Framework with Virtual DOM + Mustache Templates

Performance is top priority here at Wayfair, because improved performance means an improved customer experience. A significant piece of web performance is the time it takes to render, or generate, the markup for a page. Over the last several months we’ve worked hard to improve the render performance on our customer facing sites, and ensure it’s easier for our engineers to write code that results in page renders with optimal performance.

We had been using Backbone.js and Mustache templates for our JavaScript modules at Wayfair for some time, but last year we realized that our front-end performance needed an upgrade. We identified two areas for improvement: efficiency of client-side DOM updates and abstracting DOM manipulation away from engineers.

The first issue was a result of the standard render implementation in Backbone. By default, the Backbone render implementation does nothing. It is up to developers to implement the render function as they see fit. A common implementation of this (and the example given in Backbone Docs) looks something like this:

render: function() {
  this.$el.html(this.template(this.model.attributes));
}

The problem with this implementation is two-fold: first, the entire view is unnecessarily re-rendered with jQuery’s $().html() whenever render is called, and second, the render method always manipulates the DOM regardless of whether the data changed, so engineers must be explicit about when render is called to avoid unnecessary expensive DOM updates. The solution to both of these problems is a mix of only calling render when the engineer is sure the entire view needs to be re-rendered and then writing low-level DOM manipulation code when only portions of the view need to be updated, or the update needs to be more precise (e.g., changing a single class on an element in the view). All of this means that engineers have to be very aware of the state of the DOM at all times, and have to be aware of the performance consequences of any DOM manipulations. This makes for view modules that are hard to reason about and include low-level DOM manipulation code.

To address both of these problems, we investigated front-end frameworks that would abstract the DOM from the developer while also providing high-performance updates. The primary library we looked at was React.js, a UI library open-sourced by Facebook that utilizes a one-way data flow with virtual DOM rendering to support high-performance client-side DOM updates. We really liked React.js, but encountered one major issue: the lack of support for templates which enabled high-performance server-side rendering.

On modern web sites and applications, HTML rendering occurs at two points: once on page load when the DOM is built from HTML delivered from the server, and again (0 to many times) when JavaScript updates the DOM after page load, usually as a result of the user interacting with the page. The initial rendering happens on the server with a multi-page site like Wayfair, and we’ve put a lot of work into making sure it’s as fast as it can be. HTML markup is written in Mustache templates and rendered via a C++ mustache renderer implemented as an extension for PHP. This gives us server-side rendering at speeds faster even than native PHP views.

Since server-side rendering is an important part of our web site, we were glad that React.js comes with this feature out of the box. Unfortunately while server-side rendering is available with React.js, it’s significantly slower than our existing C++ Mustache setup. In addition to performance, rendering React.js on the server would have required Node.js servers to supplement our PHP servers. This new requirement for UI rendering would have introduced complexity as well as a new single point of failure into our server stack. For these reasons, as well as the fact that we already had existing Mustache templates we wished to reuse, we decided React.js wasn’t a good fit.

Where do we go from here? We liked many of the concepts React.js introduced us to, such as reactive data-driven views and virtual DOM rendering, but we didn’t want our choice of a front-end framework to dictate our server-side technologies, and dictate a replacement of Mustache rendering via C++ and PHP. So, after some investigation of what else was available, we decided to take the concepts we liked from React.js and implement them ourselves with features that made sense for our tech stack.

Earlier this year, we wrote Tungsten.js, a modular web UI framework that leverages shared Mustache templates to enable high-performance rendering on both server and client. A few weeks ago we announced that we were open sourcing Tungsten.js, and today we’re excited to announce that primary development on Tungsten.js will be “GitHub first,” and all new updates to the framework can be found on our GitHub repo: https://github.com/wayfair/tungstenjs.

Tungsten.js is the bridge we built between Mustache templates, virtual-DOM, and Backbone.js. It uses the Ractive compiler to pre-compile Mustache templates to functions that return virtual DOM objects. It uses the virtual-DOM diff/patch library to make intelligent updates to the DOM. And it uses Backbone.js views, models, and collections as the developer-facing API. At least, it uses all these libraries for us here at Wayfair. Tungsten.js emphasizes modularity above all else. Any one of these layers in Tungsten.js can be swapped out for a similar library paired with an adaptor. Backbone could be swapped out for Ampersand. virtual-DOM could be swapped out for another implementation. Mustache could be swapped out for Handlebars, Jade, or even JSX. So, more generally, Tungsten.js is a bridge between any combination of markup notation (templates), a UI updating mechanism, and a view layer for developers.
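For readers new to the idea, here is a deliberately tiny sketch of what a virtual DOM diff does. It is not the virtual-DOM library’s API and not Tungsten.js code: the `h` and `diff` helpers and the patch shapes are invented for this example, and real implementations also handle keys, reordering, and applying the patches to the actual DOM.

```javascript
// A virtual node is a plain object { tag, props, children }; text is a string.
function h(tag, props = {}, children = []) {
  return { tag, props, children };
}

// Walk two virtual trees in parallel and collect the differences.
// `path` is the list of child indexes leading from the root to a node.
function diff(oldNode, newNode, path = [], patches = []) {
  if (oldNode === undefined) {
    patches.push({ type: 'insert', path, node: newNode });
  } else if (newNode === undefined) {
    patches.push({ type: 'remove', path });
  } else if (typeof oldNode === 'string' || typeof newNode === 'string') {
    if (oldNode !== newNode) patches.push({ type: 'replace', path, node: newNode });
  } else if (oldNode.tag !== newNode.tag) {
    patches.push({ type: 'replace', path, node: newNode });
  } else {
    const names = new Set([...Object.keys(oldNode.props), ...Object.keys(newNode.props)]);
    for (const name of names) {
      if (oldNode.props[name] !== newNode.props[name]) {
        patches.push({ type: 'setProp', path, name, value: newNode.props[name] });
      }
    }
    const count = Math.max(oldNode.children.length, newNode.children.length);
    for (let i = 0; i < count; i++) {
      diff(oldNode.children[i], newNode.children[i], path.concat(i), patches);
    }
  }
  return patches;
}

// Two renders of the same list; only the class on the second item differs.
const previous = h('ul', {}, [
  h('li', { class: 'item' }, ['Beds']),
  h('li', { class: 'item' }, ['Sofas']),
]);
const next = h('ul', {}, [
  h('li', { class: 'item' }, ['Beds']),
  h('li', { class: 'item selected' }, ['Sofas']),
]);

const patches = diff(previous, next);
console.log(JSON.stringify(patches));
```

The diff yields a single `setProp` patch for the second list item, so only that one attribute would need to change in the real DOM. Compare that with replacing the whole view’s HTML on every render.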

We don’t expect Tungsten.js to be the best fit for everyone, but we think it fits a common set of use cases very well. We’ve been using it on customer-facing pages in production for a while here at Wayfair and so far we’ve been very happy with it. Our full-stack engineers frequently tell us they far prefer using Tungsten to vanilla Backbone.js + jQuery, and we’ve improved client-side performance now that DOM manipulation is abstracted away from developers. And while we weren’t trying to be the “fastest” front-end framework around, it turns out that when we re-implemented Ryan Florence’s DBMonster demo from React.js Conf in Tungsten.js, the browser’s frame rate ends up being, give or take, at the same level as both React and Ember with Glimmer.

Here at Wayfair we have a saying that “we’re never done”. That’s certainly the case with Tungsten.js, which we’re constantly improving. We have a lot of ideas for Tungsten.js in the coming months, so watch the repo for updates. And of course we welcome contributions!

A green highlight, indicating that Google thinks we want it to pass page rank, would be a problem. If you don’t like the idea of green being a problem, the default yellow and green colors are configurable to whatever you want.

If your promotions people are working with a blogger who forgets to do that, or misspells ‘nofollow,’ or anything along those lines, it’s on you to get that cleaned up in a hurry. It’s suboptimal to have to ‘view-source’ on every page or write your own crawler: enter the browser-based ‘no-follow’ extension. There have been a few of these for different browsers over the years, but none did exactly what we wanted, so we rolled our own.

The features of ours that we like are:

• Configurable list of domains whose reputation you are trying to defend.
• Click-button activation/deactivation on pages, which is persistent.
• Aggressive defense against misspellings, bad formatting, etc.

The configurable list is important, because if you’re looking over a page that links to one of your sites, and it links to several other sites, it’s best if you don’t have to puzzle over which links you care about.

The persistent flagging of pages that you care about is important, because if you’re engaged in a promotional activity with a site, odds are someone at your company is going to be on that site from time to time. Tell your colleague to enable the plugin and to be on the lookout for green links, and you’ve got a visual cue that’s hard to miss, for problems that might arise.

The defense against misspellings, special characters, and the like, is for this scenario. Brian, head of Wayfair SEO: “Hey Bob, did you put ‘nofollow’ on those links?” Bob: “Yup”. Brian: “kthxbye”. But in fact, although Bob is telling the truth, he actually put smart quotes, rather than ascii quotes, around ‘nofollow,’ so Google will not recognize the instruction. It’s funny: browsers do a great job of supplying missing closing tags, guessing common spelling mistakes, etc., because their mission in life is to paint the page, regardless of the foibles and carelessness of web page authors. Nationwide proofreaders’ strike? No problem, browsers will more or less read your mind! But Google’s mission in life is to crawl the web and pass page rank, so you have to tell it very clearly not to do that.
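As a rough illustration of that kind of defensive check, here is a sketch that classifies the raw value of a link’s rel attribute. It is not the extension’s actual logic: the `classifyRel` name, the three result labels, and the near-miss pattern are assumptions made for this example.

```javascript
// Classify a link's rel attribute value as it appears in the parsed page.
function classifyRel(rel) {
  if (rel == null) return 'follow';
  // A correct instruction is a whitespace-separated token equal to "nofollow".
  const tokens = rel.trim().toLowerCase().split(/\s+/);
  if (tokens.includes('nofollow')) return 'nofollow';
  // Drop everything except ASCII letters (smart quotes, hyphens, stray
  // punctuation), then look for something that was probably meant as nofollow.
  const letters = rel.toLowerCase().replace(/[^a-z]/g, '');
  if (/no?f+ol+ow/.test(letters)) return 'suspicious';
  return 'follow';
}

console.log(classifyRel('nofollow'));          // prints "nofollow"
console.log(classifyRel('nofollow noopener')); // prints "nofollow"
console.log(classifyRel('\u201Cnofollow\u201D')); // smart quotes: prints "suspicious"
console.log(classifyRel('no-follow'));         // prints "suspicious"
console.log(classifyRel('noopener'));          // prints "follow"
```

The smart-quote case is Bob’s: when the quotes are not ASCII, they end up inside the attribute value, so the value is no longer the bare token a crawler expects.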

And of course, when all your links are clean, you can have a luau party in your Tiki hut, like our SEO team:

# TechJam 2015

Come hang out with us tomorrow, June 11th, from 4 pm to 9 pm, at TechJam. Not to be too transparent, but we’re hiring! We will be at booth 43, bostontechjam@wayfair.com, #btj2015. Steve Conine, Wayfair Founder and CTO, and I will be there, along with a bunch of our colleagues in Wayfair engineering.

We will have a Money Booth where people can enter by checking in on Facebook or tagging Wayfair in a picture on Instagram/Twitter, #wayfairbtj2015. The person who grabs the most money will win that amount in Wayfair Bucks.

We will also have some Wayfair swag at the booth.

# vim emacs talk by Aaron Bieber

I can’t believe I’m writing a post about vim and emacs in the year 2015! But our very own Aaron Bieber just spoke at the vim meetup on how he’s been secretly using emacs all the time for a few months, and is now coming out of the closet as an emacs user. Vim vs. emacs is an eternal holy war, and pretty much the opposite of a topic that I would normally want to write about. But Aaron is the opposite of a holy warrior, as anyone at Wayfair Engineering can tell you. Here’s the announcement of the talk: http://www.meetup.com/The-Boston-Vim-Meetup/events/222395931/, here’s his personal blog post on the topic: http://blog.aaronbieber.com/blog/2015/01/11/learning-to-love-emacs/, and here’s the video: https://www.youtube.com/watch?v=JWD1Fpdd4Pc, with cool jazz!

Evil mode is what makes this possible, of course. I used to be Aaron’s manager, and as someone who has used his .vimrc / .vim-folder setup, after watching this talk I’m at least going to try his .emacs file, as an adjunct to my crusty old pseudo-Python-IDE thing. As an engineering manager, the key comment to me was the thing about how ctags (for tab completion) works just as well in both environments. As long as people are using something that helps them save time that would otherwise be spent on meaningless drudgery, to each his own!

# Announcing Tungstenjs

Matt DeGennaro and Andrew Rota of our Javascript team recently spoke at the BostonJS meetup on a library we have written called Tungstenjs, which we have opensourced today. It takes the fast-virtual-dom-update idea from React.js and makes it usable with other frameworks, including Backbone.js. It ships with a Backbone adapter. There’s a server-side component too, using npm and Mustache templating, but perhaps I should just let the ‘readme’ tell you: https://github.com/wayfair/tungstenjs. We’ve been using it on Wayfair, and it’s awesome.