Recommendations with simple correlation metrics on implied preference data

When you sit down to write a recommendations system, there are quite a few well-practiced techniques you can use, but it’s difficult to know in advance how well they will work when applied to your data.  Thanks to the Netflix prize, which was initiated in 2006 and awarded in 2009, a lot has been written on recommender systems for the Netflix data set.  If you happen to have a product catalogue similar to Netflix’s (those movies from the 60s are still being viewed and rated), and your users happen to have scored it with a 5-point explicit ratings system, there are some awesome advanced techniques and frameworks that you can take for a spin.  Does that sound like you? Show of hands?  I didn’t think so.  Our data is certainly nothing like that.

Continue reading

Puppet provider for python library installation

We run a Python/Tornado-based recommendations service behind the scenes at Wayfair.  As part of our code deployments, we need to install various third-party libraries to our Tornado servers. The Python tools that do this kind of thing are a bit half-baked, so we paper over their inadequacies with Puppet.

A while back a fellow named Richard Crowley wrote a puppet-pip provider, which seems to have been folded into Puppet 2.7, or replaced by a module in Puppet 2.7, or something like that. So in a sense his little project is dead. But Karthick on our team has resurrected a fork of it, a hybrid provider using subcommands of setuptools (easy_install) and pip for different aspects of installation, version checking and uninstallation. We call it easypip (easypip.rb), and the forked project containing it is now up on GitHub.  Enjoy!

January 2012 Site Performance Report

It’s a new year and it’s time for another report on how fast (or slow) our sites are. If you are new here, our previous report looked at the average load time for our four major types of pages, as well as the 95th percentile load time.  The inspiration for this type of post came from our friends at Etsy.

Continue reading
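
For readers new to the two statistics mentioned above, here is a minimal Python sketch of how an average and a 95th percentile can be computed from a list of page load times. The sample numbers are made up, and the nearest-rank percentile method is an assumption chosen for simplicity; it is not necessarily how the figures in the report are produced.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: the smallest sample such that at least
    pct percent of all samples are less than or equal to it."""
    if not values:
        raise ValueError("values must be non-empty")
    ordered = sorted(values)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Hypothetical load times in seconds for one page type (illustrative only).
load_times = [1.2, 0.9, 1.5, 2.1, 0.8, 1.1, 6.4, 1.3, 1.0, 1.4]

average_load = mean(load_times)
p95_load = percentile_nearest_rank(load_times, 95)
```

With this sample the single slow request dominates the 95th percentile while only nudging the average, which is why it is useful to look at both.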

MySQL Virtual Directories for Pure-FTPd

Here at Wayfair, we have thousands of suppliers we work with in order to provide our products to our customers. To automate the bulk of these interactions, we use Electronic Data Interchange (EDI), so that we can trade documents back and forth. FTP is still one of the predominant methods for transferring these documents, so we have had to build a robust FTP solution to handle this traffic.

Continue reading

SVN Commit Hooks For a Better Codebase

As we have mentioned before, the main source control system we use at Wayfair is SVN, with TortoiseSVN as our client. One of the things we love about SVN is the ability to add commit hooks, or checks that run when someone tries to commit a file to source control. By having a few key checks we can prevent bugs, ensure consistent coding practices, and generally have a cleaner codebase.
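
To make the idea concrete, here is a minimal sketch of a pre-commit hook written in Python. It is an illustration, not one of our actual hooks: the specific check (rejecting files that still contain merge conflict markers) and all function names are assumptions. It relies on the standard `svnlook changed` and `svnlook cat` subcommands, and on the fact that Subversion passes the repository path and transaction name to a pre-commit hook and rejects the commit when the hook exits non-zero.

```python
import subprocess
import sys

# Lines starting with these strings are left behind by an unresolved merge.
CONFLICT_MARKERS = ("<<<<<<< ", ">>>>>>> ")


def find_conflict_markers(text):
    """Return the 1-based line numbers that start with a conflict marker."""
    return [
        number
        for number, line in enumerate(text.splitlines(), start=1)
        if line.startswith(CONFLICT_MARKERS)
    ]


def svnlook(subcommand, repos, txn, *args):
    """Run an svnlook subcommand against the pending transaction."""
    output = subprocess.check_output(
        ["svnlook", subcommand, "-t", txn, repos, *args]
    )
    return output.decode("utf-8", errors="replace")


def changed_files(repos, txn):
    """Paths of files added or updated in the transaction."""
    paths = []
    for line in svnlook("changed", repos, txn).splitlines():
        status, path = line[:4].strip(), line[4:]
        # Skip deletions, property-only changes and directories.
        if status and status[0] in "AU" and not path.endswith("/"):
            paths.append(path)
    return paths


def main(argv):
    """Hook entry point; a real hook script would end with
    sys.exit(main(sys.argv))."""
    repos, txn = argv[1], argv[2]
    failed = False
    for path in changed_files(repos, txn):
        content = svnlook("cat", repos, txn, path)
        for number in find_conflict_markers(content):
            # Anything written to stderr is shown to the committer.
            sys.stderr.write(f"{path}:{number}: unresolved conflict marker\n")
            failed = True
    return 1 if failed else 0
```

Keeping the check itself (`find_conflict_markers`) separate from the `svnlook` plumbing makes it easy to unit test without a repository.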

Continue reading

FreeBSD ZFS Boot Environments

FreeBSD has spent the last few years implementing awesome Solaris-based features such as DTrace and ZFS. Here at Wayfair, we use FreeBSD on our servers and our development environments. As a security engineer for Wayfair, I’m tasked with many fun projects. I get to test out different configurations and applications in FreeBSD. Naturally, I set up a FreeBSD 8-STABLE VM to do my testing.

ZFS gives you a lot of flexibility and stability. I decided to go with ZFS on my VM so that I could emulate OpenIndiana’s boot environment (BE) system. For an example of how boot environments work, please refer to the OpenSolaris documentation.

Continue reading

Progressive Enhancement For a Faster Site

Progressive Enhancement is often described as an alternate approach to “Graceful Degradation” – it encourages focusing on the most basic functionality first and then building out from there.  It also forms the core of the Yahoo! Graded Browser Support model, which we use as a guide for our own rules around browser support.  This is an important topic, but it has been covered fairly extensively in other articles, so I’m not going to dive into it too much here.  Instead I am going to talk about specific progressive enhancement techniques we use at Wayfair to improve site performance.

Continue reading

Switching from Classic ASP to PHP

One of the big changes at Wayfair recently was moving all of our storefront code (well, almost all…we’re still working on our sessioned code) from Classic ASP (VBScript) to PHP.  The company was started in 2002 and at that time ASP was a common technology on the web, and one that our founders were familiar with.  After 8 years of working with it, we had pushed it to the limits and decided we’d get more benefit out of moving to a new technology.

Continue reading

Wayfair is Recognized for Technology Achievement by MassTLC

We certainly take technology seriously around here and work quite hard at trying to get it right, so it definitely means a lot when we’re publicly recognized for these efforts. I was therefore quite proud last Thursday night when Wayfair was honored by the Massachusetts Technology Leadership Council (MassTLC) by being named “2011 Company of the Year” among privately held companies.

The MassTLC is an organization focused on fostering entrepreneurship, encouraging the innovative use of technology and recognizing the successes of companies that develop and deploy technology across industry sectors. Wayfair was up against four other great companies as finalists for this award: Acquia, Endeca, Karmaloop and Kiva Systems. Thank you, MassTLC, for this great honor.

Wayfair Code Deployment, Part 3

If you haven’t already read the overview of our deployment system or the architecture of our deployment server, you might want to check those posts out first.

In this article I will discuss how we deploy code in a unique way that gives us all the flexibility of the symlink method without requiring a symlink.

Our choice of how to set up our deployment system was driven in part by the need to roll deployments forward and back as quickly as possible. We also needed a way to do this atomically, so that no errors occurred while the code was physically being copied to the webservers.
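
For context, here is a minimal Python sketch of the conventional symlink method that this post uses as its point of comparison. It is not the approach described in the rest of the article, and the `current` link, the release directory names and the `switch_release` helper are all hypothetical. The idea is that each release lives in its own directory and a `current` symlink is repointed with a single rename, which is atomic on POSIX filesystems, so rolling forward and rolling back are the same cheap operation.

```python
import os
import tempfile


def switch_release(current_link, release_dir):
    """Atomically repoint current_link at release_dir (POSIX only)."""
    tmp_link = current_link + ".tmp"
    # Clear any leftover temporary link from an interrupted deploy.
    if os.path.lexists(tmp_link):
        os.remove(tmp_link)
    os.symlink(release_dir, tmp_link)
    # Renaming over the old link replaces it in one step, so readers
    # see either the old release or the new one, never a missing link.
    os.replace(tmp_link, current_link)


# Demo in a throwaway directory with two empty "releases".
root = tempfile.mkdtemp()
release_1 = os.path.join(root, "release-1")
release_2 = os.path.join(root, "release-2")
os.mkdir(release_1)
os.mkdir(release_2)
current = os.path.join(root, "current")

switch_release(current, release_1)   # initial deploy
switch_release(current, release_2)   # roll forward
rolled_forward_to = os.readlink(current)
switch_release(current, release_1)   # roll back
rolled_back_to = os.readlink(current)
```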

Continue reading