Céondo's Blog - Embrace Constraints To Evolve
Selection of posts with tag A/B Testing. www.ceondo.com/ecte/feed/

Normal, Bold, Italic, Bold and Italic, on which one do you click? www.ceondo.com/ecte/2010/09/suprising-ab-testing-results 2010-09-17 13:12:14 GMT

Suppose I tell you that on my website I have a download link which can be either:

  • normal;
  • italic;
  • bold;
  • or bold and italic.

The link looks like the one in the screenshot below:

Download link of Indefero

Which one do you think gets the most clicks? I must say, I would have put:

  • bold and bold italic basically the same, with the best conversion rate;
  • then italic and normal.

So, just for fun, I ran the test on the download page of Indefero. After a week, the results were not really as expected, so I let the test run until my confidence percentage was stable. Here are the results:

Normal better than bold

Yes, bold and italic is the best converter, and by a large margin at 9% better, but what surprised me is that the bold link is not statistically significantly better than the normal link!
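The post does not say how the confidence percentage was computed. One common way to put a number on "statistically significantly better" is a two-proportion z-test, sketched below in Python. The function name and all visitor and conversion counts are hypothetical and are not the Indefero data.

```python
import math


def two_proportion_confidence(conv_a, n_a, conv_b, n_b):
    """Compare two variants; return (z, confidence that their rates differ).

    conv_* is the number of conversions, n_* the number of visitors.
    Assumes at least one conversion and one non-conversion overall.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    # Pooled rate under the null hypothesis that both variants convert equally.
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided confidence from the standard normal distribution.
    confidence = math.erf(abs(z) / math.sqrt(2))
    return z, confidence


# Hypothetical counts: 1000 visitors per variant.
z, confidence = two_proportion_confidence(100, 1000, 150, 1000)
print(f"z = {z:.2f}, confidence = {confidence:.1%}")
```

With small traffic the confidence moves a lot from day to day, which is consistent with waiting until it stabilises before reading the result.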

So, my best judgment was basically wrong. What a blow, especially for something as simple as the font style of a link. This small experiment has changed a lot the way I think about improving my software. For my scientific work, I always use data; for design, I often trust my feelings. I was wrong, terribly wrong.

Now, the problem is that I cannot test everything because I am not Google and I do not have thousands of visitors a day. But at least, I can test the key points in my application, that is, where actions and conversions are performed.

Sign Up or Sign up here ? www.ceondo.com/ecte/2009/09/sign-up-here 2009-09-16 07:48:54 GMT

As you already know, I am doing A/B testing work on the InDefero website. First, I increased by 50% the rate at which visitors reach the sign up page. Then, I decided to test the wording of the sign up page. This was just a small experiment to do some A/A testing at the same time.

The results are once again interesting.

» More to read.

First results of unique visitor tracking, the bots and crawlers are here www.ceondo.com/ecte/2009/08/results-unique-visitor-tracking-bot-crawler 2009-08-26 08:17:21 GMT

So the unique visitor tracking test is running. At the moment of writing, I have 159 unique visitors in my visitor table. From an excerpt of the results shown below, it is clear that I need to flag the bots and crawlers and exclude them from the page tracking.

Part of the visitor table

id | User agent
82 | Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10_5_7; en-us) AppleWeb [...]
83 | Mozilla/5.0 (Windows; U; Windows NT 6.1; en-US) AppleWebKit/530.5 [...]
84 | msnbot/2.0b (+http://search.msn.com/msnbot.htm)
85 | DoCoMo/2.0 N905i(c100;TB;W24H16) (compatible; Googlebot-Mobile/2.1; [...]
86 | Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.5; en-US; rv:1.9.1.2) Gec[...]

Part of the log table

id  | Visitor | Page
341 |      83 | /
342 |      83 | /tour.html
343 |      56 | /refund/
344 |      56 | /privacy/
345 |      84 | /robots.txt
346 |      84 | /
347 |      85 | /robots.txt
348 |      85 | /
349 |      84 | /refund/

The good thing is that the robots and crawlers are good Internet citizens: as you can see for the MSN bot with the id 84, they always request the robots.txt file first. This means that one can directly flag a new visitor as a bot if its first action is to grab the robots.txt file.
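As a rough illustration of this rule, here is a minimal Python sketch that flags every visitor whose first logged request is /robots.txt. The function name is hypothetical, and the rows are the log excerpt above treated as if it were the complete log, which it is not.

```python
def flag_bots(log_rows):
    """Return the ids of visitors whose first request was /robots.txt.

    log_rows is an iterable of (log_id, visitor_id, page) tuples in
    chronological order.
    """
    first_page = {}
    for _log_id, visitor_id, page in log_rows:
        # setdefault keeps only the first page seen for each visitor.
        first_page.setdefault(visitor_id, page)
    return {visitor for visitor, page in first_page.items() if page == "/robots.txt"}


# The log excerpt from the post, treated as the whole log for the example.
log = [
    (341, 83, "/"),
    (342, 83, "/tour.html"),
    (343, 56, "/refund/"),
    (344, 56, "/privacy/"),
    (345, 84, "/robots.txt"),
    (346, 84, "/"),
    (347, 85, "/robots.txt"),
    (348, 85, "/"),
    (349, 84, "/refund/"),
]
print(sorted(flag_bots(log)))  # prints [84, 85]
```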

Now, this will kick out most of the bots and crawlers, but not ones like this:

70 | Mozilla/5.0

A very minimal user agent string.

303 | 70 | /doc.html//?_SERVER[DOCUMENT_ROOT]=http://www.[...]
304 | 70 | /
305 | 70 | //?_SERVER[DOCUMENT_ROOT]=http://www.[...]
306 | 70 | /doc.html//?_SERVER[DOCUMENT_ROOT]=http://www.[...]
307 | 70 | /  
308 | 70 | /doc.html//?_SERVER[DOCUMENT_ROOT]=http://www.[...]
309 | 70 | //?_SERVER[DOCUMENT_ROOT]=http://www.[...]
310 | 70 | /

And it is looking to trash my site. I am already not logging the visitors without a user agent string, but it looks like I will need to use the heuristics of AWStats to mark more of the visitors as bots.

What to do next?

  • Add a field in the visitor table to mark a visitor as bot.
  • Mark a visitor doing the first request against /robots.txt as crawler/bot.
  • Do not log the requests of the bots.
  • Merge the AWStats robots definition as a simpler regex/substring matching to catch the robots.
  • Add small heuristics for the stupid security scanners. One could perform a small check on the request string to mark them and drop the corresponding logs.
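The last two items of the list could look roughly like the Python sketch below: a substring match on the user agent and a small check on the request string. Both marker lists are made up for the example; they are not the AWStats robots definition, and the function names are hypothetical.

```python
# Hypothetical substrings, NOT the AWStats list: just enough for the
# user agents shown in the visitor table excerpt.
BOT_UA_MARKERS = ("bot", "crawler", "spider")

# Hypothetical markers for the remote file inclusion probes seen in the log.
SCANNER_MARKERS = ("_server[", "=http://")


def looks_like_bot(user_agent):
    """True when the user agent is empty or contains a known bot marker."""
    ua = user_agent.strip().lower()
    if not ua:
        return True
    return any(marker in ua for marker in BOT_UA_MARKERS)


def looks_like_scanner(request_path):
    """True when the request string contains a typical probe pattern."""
    path = request_path.lower()
    return any(marker in path for marker in SCANNER_MARKERS)


print(looks_like_bot("msnbot/2.0b (+http://search.msn.com/msnbot.htm)"))
print(looks_like_bot("Mozilla/5.0"))
print(looks_like_scanner("//?_SERVER[DOCUMENT_ROOT]=http://www."))
```

The bare "Mozilla/5.0" visitor passes the user agent check, so only the request string check catches it. This is why both heuristics are listed.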

I am going to work on that this afternoon and will report the results.

First approach to unique visitor tracking www.ceondo.com/ecte/2009/08/unique-visitor-tracking 2009-08-25 16:15:27 GMT

In my previous post, I wrote about unique user session tracking. Here is what I ended up creating to implement it in practice. This approach is being tested by tracking the unique visitors on www.indefero.net. I will then cross-check the results with the Google Analytics data of the account to assess the quality of the idea.

Database storage

The storage is composed of 2 tables, one for the visitors and one for the logs. The visitor table is needed as the goal is to track the unique visitors in real time. To reduce the need to look up data in this visitor table, information is cached using Memcached.

The visitor table stores:

  • IP address;
  • User agent;
  • Cookie value;
  • Creation time stamp;
  • Last seen date, updated at most once every 30 minutes.

The log table stores:

  • visitor (foreign key to the visitor table);
  • page seen;
  • time stamp.
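For illustration, the two tables could be declared as in the Python sketch below. It uses SQLite only to keep the example self-contained; the post does not name the database used, and every table, column and function name here is an assumption.

```python
import sqlite3

# Assumed names and types; time stamps are stored as Unix seconds.
SCHEMA = """
CREATE TABLE visitor (
    id         INTEGER PRIMARY KEY,
    ip         TEXT NOT NULL,
    user_agent TEXT NOT NULL,
    cookie     TEXT,
    created_at INTEGER NOT NULL,
    last_seen  INTEGER NOT NULL
);
CREATE TABLE log (
    id         INTEGER PRIMARY KEY,
    visitor_id INTEGER NOT NULL REFERENCES visitor(id),
    page       TEXT NOT NULL,
    seen_at    INTEGER NOT NULL
);
"""

LAST_SEEN_GRANULARITY = 30 * 60  # seconds


def open_db(path=":memory:"):
    """Open a database and create both tables."""
    db = sqlite3.connect(path)
    db.executescript(SCHEMA)
    return db


def touch_visitor(db, visitor_id, now):
    """Update last_seen only if the stored value is at least 30 minutes old.

    Returns True when a row was actually written.
    """
    cursor = db.execute(
        "UPDATE visitor SET last_seen = ? WHERE id = ? AND last_seen <= ?",
        (now, visitor_id, now - LAST_SEEN_GRANULARITY),
    )
    return cursor.rowcount == 1
```

Putting the 30 minute condition in the UPDATE itself keeps the "at most once every 30 minutes" rule in one place, at the cost of one query per hit when no cache sits in front of the table.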

Logging Procedure

To find a visitor in the visitor table, I first search by cookie and, if that is not available, by user agent/IP address combination. The real trick is the handling of the missing cookie. In my case, I log just before sending the response, which means that for a new visitor or a visitor without a cookie, I have a new cookie at hand. When doing the check for the visitor in the table, if the user agent/IP matches but not the cookie, I update the cookie in the table. This is because I have no idea whether the visitor will now accept the cookie or not. This could be a performance problem.
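The lookup order described above can be sketched in Python as follows. This is an in-memory stand-in for the visitor table, not the code running on the site; the class and method names are hypothetical, and issuing a fresh cookie whenever the presented cookie is missing or unknown is an assumption about a case the text leaves open.

```python
import uuid
from dataclasses import dataclass


@dataclass
class Visitor:
    id: int
    ip: str
    user_agent: str
    cookie: str


class VisitorTable:
    """In-memory stand-in for the visitor table (hypothetical)."""

    def __init__(self):
        self.rows = []

    def find_or_create(self, ip, user_agent, cookie=None):
        """Return (visitor, cookie_to_send_with_the_response)."""
        # 1. Search by cookie first.
        if cookie is not None:
            for visitor in self.rows:
                if visitor.cookie == cookie:
                    return visitor, cookie
        # No usable cookie: a new one will be sent with the response.
        new_cookie = uuid.uuid4().hex
        # 2. Fall back on the user agent/IP address combination and
        #    store the new cookie, not knowing if it will be accepted.
        for visitor in self.rows:
            if visitor.ip == ip and visitor.user_agent == user_agent:
                visitor.cookie = new_cookie
                return visitor, new_cookie
        # 3. Unknown visitor: create the row.
        visitor = Visitor(len(self.rows) + 1, ip, user_agent, new_cookie)
        self.rows.append(visitor)
        return visitor, new_cookie
```

A visitor who refuses cookies triggers a cookie update on every hit in this sketch, which is one way the performance concern mentioned above could show up.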

Basically, I first perform a cookie check and then I fall back on the user agent/IP address combination. This is running at the moment on indefero.net (only the presentation website, not the hosted forges) and I will compare the results with the Google Analytics results in 24 or 48h. What is already better than GA is that I can see the bots. Maybe I should add a bot flag in the visitor table to easily exclude them when doing reports.

50% Boost of Conversion Rate with A/B Testing www.ceondo.com/ecte/2009/08/ab-testing-boost-conversion 2009-08-25 09:02:25 GMT

On the InDefero website, you can see at the top a series of links to access the different parts of the website. In my case, I wanted to improve the percentage of people accessing the page with the plans.

Before my tests, my data from Google Analytics gave me for the month of July:

  • Home page: 3,849 page views.
  • Plans: 824 page views (21.41% of the home page views).

I read everywhere that just changing a link on your website could drive the conversion up, so I decided to follow the same approach and used the base case plus 2 alternatives for my link to the plan page from the homepage:

  1. Free Hosting: base case, for 50% of the visitors.
  2. Pricing and Signup: first alternative, for 25% of the visitors (coming from the GitHub page).
  3. See Plans & Pricing: second alternative, for 25% of the visitors (coming from a blog post about 37signals).
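The post does not describe how visitors were assigned to the three wordings. One possible way to get a stable 50/25/25 split, sketched here in Python, is to hash a visitor key (for instance the tracking cookie) into 100 buckets; the function name and the key format are assumptions.

```python
import hashlib

# Link wordings and their share of the visitors, in percent.
VARIANTS = [
    ("Free Hosting", 50),
    ("Pricing and Signup", 25),
    ("See Plans & Pricing", 25),
]


def pick_variant(visitor_key, variants=VARIANTS):
    """Map a visitor key to a variant; the same key always gets the same one."""
    digest = hashlib.sha256(visitor_key.encode("utf-8")).digest()
    # Reduce the hash to a bucket between 0 and 99.
    bucket = int.from_bytes(digest[:8], "big") % 100
    upper = 0
    for label, weight in variants:
        upper += weight
        if bucket < upper:
            return label
    raise ValueError("variant weights must sum to 100")


print(pick_variant("visitor-83"))
```

Hashing instead of drawing a random number on each request means a returning visitor keeps seeing the same link text without any extra storage.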

And the results are really nice!

» More to read.

How to track unique user sessions www.ceondo.com/ecte/2009/08/track-unique-user-sessions 2009-08-21 12:18:16 GMT

Goal of the day (or maybe months): 300% increase of my conversion rates.

How to do that: Split testing.

What is needed: Track the unique user sessions of the website in real time.

So, how do you track the unique visitors on your website? I must say, it looks like black magic. I took the time to read the code of AWStats but was not able to understand it, both because my Perl fluency is far in the past and because the code is written entirely with speed in mind, not ease of understanding.

So, the Wikipedia page on web analytics provides this information:

[A unique user is] an IP address plus a further identifier. Sites may use User Agent, Cookie and/or Registration ID.

Good, so, it means that if I want to track my users, I need to use the IP address (easy), a user agent (easy), a cookie (not so easy) or a registration id (not possible in my case).

Why is it not easy to track with a cookie?

Cookies are optional. With Firefox, I have an extension to disable all the cookies except for the websites I trust.

So, if you consider that you need a distinct pair (cookie, ip) to have a unique user, then each page I access on your website will count as a new unique user.

A possible solution not tested yet

Yes, I need to test it, and the way to do so is to implement it and compare with what Google Analytics gives me.

A unique user session is a combination of:

  • a unique IP address
  • a unique user agent
  • an optional unique cookie
  • all that active within 30 minutes

This approach means that if I do not have a unique cookie and if I have a set of users coming from the same connection with the same browser, they will be counted as a single unique user.

Is it a problem? Not really. Why? Because the goal is to perform split testing, so the goal is more to have the minimum number of unique users and to be able to mark at least 50% of them for the split test. So as long as I can get a good fraction of the users with the cookie, I will be happy.

Here are more ideas to explore the tracking without cookies.

Implementation

I am a PHP shop, but you can do it in any language. What you need is simply a database and a fast in-memory storage (APC or memcached).

The fast memory storage is there to avoid hitting the database at each request, and the database is of course there to get a bit of persistence. The memory storage expires the value after your desired session time (30 minutes), which automatically takes care of the active session length handling.

The workflow is as follows for a non-cookied visit:

  1. The user accesses the website for the first time (or without a cookie).
  2. Check the combination of IP + user agent in the memory store.
    1. If available, update the last seen time stamp and try to cookie it.
    2. If not available, add the IP + user agent pair in the memory store, the database and cookie it.

For a user with a cookie:

  1. Check the combination of cookie + IP + user agent in the memory store.
  2. Update the last seen value.
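Both workflows can be sketched together in Python. The small TTLStore class below stands in for APC or memcached, and a plain list stands in for the database; all names are hypothetical. Treating a cookied visitor whose entry has expired as a new session is an assumption, and sending the cookie is left to the caller.

```python
import time

SESSION_TTL = 30 * 60  # session length in seconds


class TTLStore:
    """Tiny in-process stand-in for APC or memcached with expiring keys."""

    def __init__(self, ttl=SESSION_TTL, clock=time.time):
        self.ttl = ttl
        self.clock = clock
        self._data = {}

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Expired entries are dropped, which ends the session.
            del self._data[key]
            return None
        return value

    def set(self, key, value):
        self._data[key] = (value, self.clock() + self.ttl)


def track(store, database, ip, user_agent, cookie=None):
    """Record a hit; return True when it starts a new unique session."""
    # Cookied visits are keyed on cookie + IP + user agent, others on the pair.
    key = (ip, user_agent, cookie) if cookie else (ip, user_agent)
    now = store.clock()
    is_new = store.get(key) is None
    if is_new:
        # Only new sessions reach the (stand-in) database.
        database.append((key, now))
    # Storing the last seen time also renews the 30 minute expiry.
    store.set(key, now)
    return is_new
```

Because every hit renews the expiry, a session ends 30 minutes after the last request rather than 30 minutes after the first one.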

Speed consideration

The tracking must be performed in real time. This is why it is not possible to use the referrer information to follow the path of the user and to dissociate the users accessing the website with the same IP/Agent. Anyway, it looks like no single solution will be optimal, only something like an adaptive algorithm which can give a probability of "uniqueness" of a hit based on compounded methods.

Céondo Marketing Experiments www.ceondo.com/ecte/2009/08/ceondo-marketing-experiments 2009-08-21 12:15:50 GMT

To improve the conversion rate of InDefero, I decided to develop some software to perform A/B testing on the InDefero homepage. You will be able to follow these experiments here. The A/B testing details will be tagged accordingly.

The more general remarks about marketing will use the Marketing label. I will try to make regular "state of my marketing" entries to explain my status and feed you with interesting details.