Goal of the day (or maybe months): a 300% increase in my conversion rates.
How to do that: Split testing.
What is needed: Track the unique user sessions of the website in real time.
So, how do you track the unique visitors on your website? I must say, it looks like black magic. I took the time to read the code of AWStats but was not able to understand it: my fluent Perl days are far in the past, and the code is written entirely with speed in mind rather than ease of understanding.
So, the Wikipedia page on web analytics provides this definition:
[A unique user is] an IP address plus a further identifier. Sites may use User Agent, Cookie and/or Registration ID.
Good, so, it means that if I want to track my users, I need to use the IP address (easy), a user agent (easy), a cookie (not so easy) or a registration id (not possible in my case).
Why is it not easy to track with a cookie?
Cookies are optional. In Firefox, I use an extension that blocks all cookies except for the websites I trust.
So, if you consider that you need a distinct pair (cookie, ip) to have a unique user, then each page I access on your website will count as a new unique user.
A possible solution not tested yet
Yes, I need to test it, and the way to do that is to implement it and compare the results with what Google Analytics gives me.
A unique user session is a combination of:
- a unique IP address
- a unique user agent
- an optional unique cookie
- all of the above seen active within the last 30 minutes
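A minimal sketch of how such a combination could be turned into a single lookup key. It is in Python rather than PHP only to keep it short; the name `session_key`, the separator and the choice of SHA-1 are my assumptions for illustration, not something taken from AWStats or Google Analytics. The 30 minute window is not handled here, it would come from the expiry of whatever store holds the key.

```python
import hashlib


def session_key(ip, user_agent, cookie_id=None):
    """Build a fixed-length key from IP, user agent and an optional cookie."""
    # The cookie is optional: visitors without one fall back to IP + user agent.
    parts = [ip, user_agent, cookie_id or ""]
    # Join with a control character that should not appear in any of the fields,
    # so that ("ab", "c") and ("a", "bc") cannot produce the same raw string.
    raw = "\x1f".join(parts)
    return hashlib.sha1(raw.encode("utf-8")).hexdigest()
```

Two visitors behind the same connection with the same browser and no cookie get the same key, which is exactly the limitation of this definition.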
This approach means that if I do not have a unique cookie and if I have a set of users coming from the same connection with the same browser, they will all be counted as a single unique user.
Is it a problem? Not really. Why? Because the goal is to perform split testing, so what matters is to have at least a minimum count of unique users and to be able to mark at least 50% of them for the split test. So as long as I can get a good fraction of the users with the cookie, I will be happy.
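To make "mark 50% of them" concrete, here is one possible way to do it, as a sketch only: hash the visitor identifier (typically the cookie value) and use one bit of the hash to pick a variant. The function name `variant_for`, the `test_name` parameter and the use of MD5 are assumptions for illustration; any stable hash would do.

```python
import hashlib


def variant_for(visitor_id, test_name="homepage"):
    """Deterministically assign a visitor to variant "A" or "B", roughly 50/50."""
    # Mixing in the test name keeps assignments independent between tests.
    digest = hashlib.md5(f"{test_name}:{visitor_id}".encode("utf-8")).digest()
    # The parity of the first byte splits visitors into two halves.
    return "A" if digest[0] % 2 == 0 else "B"
```

Because the assignment depends only on the identifier, a returning visitor with the same cookie always sees the same variant without anything extra being stored.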
I am a PHP shop, but you can do it in any language. All you need is a database and a fast in-memory store (APC or memcached).
The in-memory store avoids hitting the database on every request, and the database is of course there for a bit of persistence. The memory store expires each value after your desired session time (30 minutes), which automatically takes care of the active session length.
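To show what the expiry buys you, here is a small in-process stand-in for APC or memcached, written in Python. It is only a sketch of the behaviour: the class name `ExpiringStore` and its `clock` parameter are made up for this example, and a real deployment would rely on the expiry time of the actual cache instead.

```python
import time


class ExpiringStore:
    """In-process stand-in for APC or memcached: values vanish after ttl seconds."""

    def __init__(self, ttl=30 * 60, clock=time.time):
        self.ttl = ttl        # session length in seconds (30 minutes by default)
        self.clock = clock    # injectable clock, handy for testing
        self._data = {}       # key -> (value, expires_at)

    def set(self, key, value):
        # Writing a key again pushes its expiry back, so the window slides
        # with each hit instead of being fixed at the first visit.
        self._data[key] = (value, self.clock() + self.ttl)

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self.clock() >= expires_at:
            # Idle for the whole window: the session is over.
            del self._data[key]
            return None
        return value
```

A key that is written on every hit stays alive as long as the visitor is active, and disappears on its own 30 minutes after the last hit.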
The workflow is as follows for a visit without a cookie:
- The user accesses the website for the first time (or without a cookie).
- Check the combination of IP + user agent in the memory store.
- If available, update the last seen time stamp and try to cookie it.
- If not available, add the IP + user agent pair in the memory store, the database and cookie it.
For a user with a cookie:
- Check the combination of cookie + IP + user agent in the memory store.
- Update the last seen value.
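The two workflows above can be sketched as a single function. This is untested against real traffic and rests on several assumptions of mine that the steps above leave open: a plain dict and a list stand in for the memory store and the database, a freshly issued cookie is registered right away so the next hit is not counted twice, and a cookie that is no longer in the memory store starts a new session. The name `track_hit` is hypothetical.

```python
import uuid

SESSION_TTL = 30 * 60  # seconds; the 30 minute window from the text


def track_hit(ip, user_agent, cookie_id, store, db, now):
    """Handle one request and return (cookie_to_send, is_new_session).

    store: dict standing in for the memory store, key -> last seen timestamp
    db:    list standing in for the database, one row per new session
    now:   current time in seconds
    """
    # Emulate the memory store's expiry: forget keys idle for SESSION_TTL or more.
    for key in [k for k, seen in store.items() if now - seen >= SESSION_TTL]:
        del store[key]

    if cookie_id is None:
        # Visit without a cookie: look up the IP + user agent pair.
        pair = (ip, user_agent)
        is_new = pair not in store
        store[pair] = now  # add the pair or update its last seen time
        # Always try to set a cookie. Assumption: the new cookie key is
        # registered immediately so the next, cookied hit is not counted again.
        cookie_id = uuid.uuid4().hex
        store[(cookie_id, ip, user_agent)] = now
    else:
        # Visit with a cookie: look up cookie + IP + user agent.
        key = (cookie_id, ip, user_agent)
        # Assumption: an unknown or expired cookie key starts a new session.
        is_new = key not in store
        store[key] = now  # update the last seen time

    if is_new:
        db.append({"cookie": cookie_id, "ip": ip,
                   "user_agent": user_agent, "first_seen": now})
    return cookie_id, is_new
```

A visitor who refuses cookies is offered a new one on every hit but is still counted once, because the IP + user agent pair stays in the store for the length of the session.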
The tracking must be performed in real time. This is why it is not possible to use the referrer information to follow the path of the user and to tell apart users accessing the website with the same IP/agent. In any case, it looks like no single method will be optimal; what is needed is something like an adaptive algorithm that gives a probability of "uniqueness" for each hit, based on a combination of methods.