Wednesday, June 23, 2010
As a way to combat attrition, we looked at moving to a 'one-click' model similar to some other vendors out there. The ideal is a three-step click -> confirm -> download process. We wanted to keep the 'confirm' step so that people didn't accidentally buy something, but otherwise we wanted music in the user's hands as quickly as possible. Technically, this is actually not that hard to implement. We need to set up some kind of registration process (with a prompt to register if they have not already done so when they click to buy) which collects personal and billing information. We then store that info permanently and reference it when the user has the impulse to buy. In this way the user avoids having to re-enter their information every time. Naturally, if we store all that info we also need to let the user manage it, so some kind of account management interface is also needed.
What else? Well, we then change all of our 'add to cart' buttons to something like 'buy now' or 'download now' (the jury's out on which is better psychologically). When the user clicks the button they are prompted to log in (once per visit), register (once ever), and confirm (every time) instead of starting down the old process. Ideally, the prompt is some kind of pop-up so the page context is maintained. After confirmation, the pop-up presents the download link (or a download manager for multi-track purchases like albums) and the user's impulse is satisfied!
So, why haven't we already done this since we've already figured it all out? Aside from some possible patent issues (I can't believe someone can actually patent a 'one-click purchase' but that's another story) the main stumbling block is transaction costs. Yes, that's right. The cost to process a transaction is so big that we hope a user waits until they have a full shopping cart before buying. Our take on a $0.99 track is quite small (labels and publishers get most of it) so even a $0.25 transaction fee is deadly to our bottom line if they only buy one track. It's not so bad if they have $5 or $10 worth of music in their cart though. Thankfully, PayPal recently released a new micropayment transaction plan that goes a long way to helping us sell individual tracks. It prices transactions at $0.05 + 5%, which brings a $0.99 track transaction down to around $0.10 from close to $0.29.
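To make the arithmetic concrete, here is a quick sketch of the $0.05 + 5% micropayment pricing quoted above. The function name and the ten-track comparison are mine, purely for illustration; they are not taken from PayPal's documentation.

```python
def micropayment_fee(amount):
    """Fee under the $0.05 + 5% micropayment pricing quoted in the post."""
    return 0.05 + 0.05 * amount


track_price = 0.99

# One track bought on its own.
single_fee = micropayment_fee(track_price)

# Ten tracks bought one at a time vs. all together in a single cart.
ten_separate = 10 * micropayment_fee(track_price)
ten_in_cart = micropayment_fee(10 * track_price)

print(f"single track fee:      ${single_fee:.4f}")
print(f"10 separate purchases: ${ten_separate:.3f}")
print(f"10 tracks in one cart: ${ten_in_cart:.3f}")
```

The fixed $0.05 portion is what impulse purchases multiply: ten one-track sales pay it ten times, while a ten-track cart pays it once.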
Now suddenly the idea of capturing those impulse purchases is starting to make financial sense! Further analysis comparing the potential lost impulse revenue vs. the higher accumulated transaction costs for more, smaller transactions is needed but my gut tells me that we'd likely come out on top if we caught more sales. I'll try and post some of our analysis as a follow-up if I can.
UPDATE: I had this post in my drafts waiting to go when I read a compelling post by another blogger today. The post was titled "How Paypal can help save media - and itself" and it makes a very solid argument for further refining the one-click purchase process I described above by removing the need to register or even log in! PayPal could offer a cross-site solution similar to Facebook's 'Like' buttons that would provide safe, secure, convenient purchases. How great would that be!
Let's just hope the transaction prices are reasonable...
Thursday, June 03, 2010
So, what do I mean when I say we turned our queries 'upside down'? Traditionally, with relational databases, SELECT statements with JOINs and WHERE clauses are used to filter and locate the correct records. This works great for small to medium data sets with simple, coarse-grained filters (i.e. WHERE x = y) but, in our experience, starts to fail spectacularly when you move to large data sets with high granularity. Here's our story of how we evolved beyond the middling to the big time.
First, let's set the stage. Our metadata database is composed of the following entity tiers:
- Labels - a handful
- Sub Labels - thousands
- Artists - tens of thousands
- Albums - hundreds of thousands
- Tracks - millions
Filtering at the uppermost tier is quite easy and efficient using good old-fashioned WHERE clauses, since there are only a few Labels and SQL is pretty good at handling two or three WHERE conditions even with fairly sizable tables. However, it completely falls apart if we try to filter at a lower tier, say in the middle at the Artist level. If we wanted to launch a music store with a small number of artists we could use the WHERE approach, but what if we had 200 artists to include while excluding all the others? The initial approach would be to create a big WHERE IN clause with a big comma-delimited list of artists. This might work for a very low traffic situation but wouldn't handle scale very well. Besides, what if we then wanted to add another 100 artists? The query duration would continue to increase dramatically with every additional condition. Even getting creative with stored procedures, views, functions, etc. would not save us.
So what's the solution to adding more conditions without increasing query time dramatically? The solution is to embed filter information within the record itself. What do I mean by this? Well, we add a new column to all the entity tables with a value that describes which music stores the content is available in. Yes, we are talking about maintaining millions and millions of records, one for each entity. It's a big job, no doubt, since we release music stores constantly and for each one we need to update all of the entities. Clearly, careful management is required, but now we are doing most of the work up front, before a user hits our site, as opposed to when a user is browsing the catalog and expecting the site to refresh instantly.
We have found that bitwise operations are extremely fast for this type of approach, but they have a limitation: with one bit per site in a 32-bit integer, we max out at 32 sites. That might be OK for some, but at any given time we have over 100 live sites, so we've developed an approach that uses a string value to represent larger numbers, which are then translated into operable values. There's probably room for optimization here, which we will look to include in a future iteration.
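Here's a minimal sketch of the general idea, not our production code: one bit per store, with the mask serialized as a hex string so it can grow past 32 bits. The function names, the hex encoding, and the store IDs are all assumptions made for illustration.

```python
def stores_to_mask(store_ids):
    """Build an integer with one bit set per (0-based) store ID."""
    mask = 0
    for store_id in store_ids:
        mask |= 1 << store_id
    return mask


def encode_mask(mask):
    """Serialize the mask as a hex string (an assumed storage format)."""
    return format(mask, "x")


def decode_mask(text):
    """Translate the stored string back into an operable integer."""
    return int(text, 16)


def is_available(encoded, store_id):
    """True if the bit for store_id is set in the encoded mask."""
    return bool((decode_mask(encoded) >> store_id) & 1)


# A track available in stores 3, 40 and 117 (well beyond 32 bits).
track_mask = encode_mask(stores_to_mask([3, 40, 117]))

print(track_mask)
print(is_available(track_mask, 117))
print(is_available(track_mask, 5))
```

Python integers are arbitrary precision, which makes the sketch short; in a database you would need to pick a representation the engine can actually operate on.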
By embedding filter data within the record all of our queries' WHERE conditions are now very short and consistent no matter what level of filtering may be required for a given site; whether it's 100 or 500 artists the WHERE clause is exactly the same. This means that SQL sees fairly flat query durations across the board with all tiers of entity filtering.
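To illustrate what a fixed-shape WHERE clause looks like, here's a small sketch using Python's built-in sqlite3 module. The table, the `store_mask` column, and the artist names are hypothetical, and the mask here is a plain integer, so this toy version only covers a handful of stores.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE artists (id INTEGER PRIMARY KEY, name TEXT, store_mask INTEGER)"
)
conn.executemany(
    "INSERT INTO artists (name, store_mask) VALUES (?, ?)",
    [
        ("Artist A", 0b001),  # store 0 only
        ("Artist B", 0b011),  # stores 0 and 1
        ("Artist C", 0b100),  # store 2 only
    ],
)


def artists_for_store(store_id):
    """Same query text for every store; only the bound bit changes."""
    store_bit = 1 << store_id
    rows = conn.execute(
        "SELECT name FROM artists WHERE (store_mask & ?) != 0 ORDER BY name",
        (store_bit,),
    )
    return [name for (name,) in rows]


print(artists_for_store(0))
print(artists_for_store(2))
```

Whether a store carries 3 artists or 500, the query text is identical; the work of deciding who belongs where was done when the masks were written.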
The real beauty of this approach is that we can now filter at even the Track-level with an amazingly fine-grain control over what content is available on which site. Track-level filtering would have been impossible previously.
Now we have a solution that optimizes filter queries across the enterprise as well as opening up new opportunities for filtering that did not exist before. THAT's the power of heavy writes and I encourage you to investigate them for yourselves.
Wednesday, February 17, 2010
- Identify a possible market opportunity (sometimes called the ‘problem’)
- Research the market
- Identify the ideal customer
- Identify the ideal user (if the user and customer are not the same)
- Identify market size/opportunity
- Identify potential competitors
- Talk to potential customers if possible
- Talk to as many domain experts as possible to ‘test the water’
- Be careful not to give away too much!
- Brainstorm over results and refine the analysis accordingly
- Determine a clear revenue model
- Important enough to be its own step!
- Succinctly describe the refined opportunity
- If it’s still not clear, go back to the research drawing board – step #2
- Be aggressive about focusing on the core opportunity. More features can always be added later.
- Design a product based on the succinct description
- Review the design with key stakeholders
- If there are any revelations go back to the drawing board – step #2
- Develop the product
- Review the developed product (ideally often during development)
- If there are any revelations go back to the drawing board – step #2 or step #5
- Launch “soft” with little fanfare (until confidence in performance is high)
- Review performance and tweak as necessary
- Let the fanfare begin
Friday, January 29, 2010
Check out this video for a demonstration: http://vimeo.com/6949998
Tuesday, September 15, 2009
Probably the coolest takeaway is the three core foundations of successful motivation: mastery, autonomy, and purpose.
Mastery would be the drive to improve and become the best at what one does. Autonomy would be the feeling of empowerment and control over what one does. Purpose would be the feeling of being a part of something bigger.
I've seen all of these things work well in the past but the genius is putting them all together to drive motivation across the whole team. Dan Pink says that incentives don't work as expected and I believe him. From my experience the most motivated teams have everyone feeling engaged, improving their skills, and helping the company kick some ass.
Thanks, Dan, for clarifying what I've observed in a way that lets me take it and purposefully work at implementing it.
Geez, TED is a cool site.
Friday, December 19, 2008
I bought it from MemorySuppliers.com which seems to have some good deals on memory of all kinds. If you are looking for such things then check them out at memorysuppliers.com.
Wednesday, March 26, 2008
As part of my continuing effort to make this blog so esoteric that nobody ever reads it, I'm going to post some thoughts on Error Handling and Event Logging. I felt it was a good time to post this as I've been discussing it a bunch with the development team lately, and from those discussions we've worked out a couple of methodologies that I thought were worth sharing. I'm going to say up front that I recognize that it's very likely that this has all been discussed and written about in numerous programming books, so it's probably not super original stuff. Nonetheless, I'm going to take a moment to describe how we are approaching the problem anyway.
The "Caller Responsibility" Error Handling Practice
Simply put, "He who starts the ball rolling catches any error thrown back."
The idea behind this best practice is that the process/method/procedure/whatever that started the workflow which ultimately generated an error is best suited to understand the context of the error in terms of the greater operation of the application. For example, if we have an application with two independent services that form a workflow then we cannot give either service responsibility for handling an error since neither understands the other service (they are "independent"). Instead, the process that governs the workflow and calls each service is the one that knows how errors in either should be handled.
This concept is important in SOA applications where abstraction between services is key to layering functional blocks. Each service only reports the error back, cleans up any internal state inconsistencies that may result, and then trusts the caller to react to the error in an appropriate way. I've found this approach helpful when breaking up work within teams by assigning distinct services to groups of individuals. Since error handling is distinct it's quite easy for different teams to plug into a bigger picture of error handling by trusting someone else to handle it.
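A minimal sketch of the practice, assuming two made-up services and a workflow that calls them. All of the names here (`ServiceError`, `inventory_service`, `billing_service`, `purchase_workflow`) are hypothetical and only meant to show the shape of the idea.

```python
class ServiceError(Exception):
    """Raised by a service to report a failure back to its caller."""


def inventory_service(track_id):
    # Knows nothing about billing; just reports its own errors.
    if track_id < 0:
        raise ServiceError(f"unknown track {track_id}")
    return {"track_id": track_id, "price": 0.99}


def billing_service(user, amount):
    # Knows nothing about inventory; just reports its own errors.
    if amount <= 0:
        raise ServiceError(f"invalid amount {amount}")
    return f"charged {user} ${amount:.2f}"


def purchase_workflow(user, track_id):
    """The caller started the ball rolling, so it catches what comes back."""
    try:
        track = inventory_service(track_id)
        return billing_service(user, track["price"])
    except ServiceError as err:
        # Only the workflow has the context to decide what a failure means.
        return f"purchase failed for {user}: {err}"


print(purchase_workflow("alice", 42))
print(purchase_workflow("alice", -1))
```

Neither service contains any recovery logic of its own; each one raises and trusts the workflow to react.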
The next thing I wanted to write about today was…
The "Reporter On Scene" Event Logging Practice
The idea here is: don't even bother logging exceptions/events if you are not going to capture enough information to do anything useful later on when you are trying to figure out what the heck happened. It turns out that newspaper reporters have pondered and solved this problem for us by reporting on the "Six W's". They are: Who?, What?, Where?, When?, How?, and Why?. We can use the answers to provide a complete picture of any event to the poor sucker reading the log at 3am when something breaks.
What exactly do we mean by all of this? Let's use an exception event as an example:
"Who" is the component reporting the error. Usually, this is the website and class name (or page name).
"What" are exception details such as error description and stack trace.
"Where" is the method and line number in the source code.
"When" is the time that the exception occurred (not the time it was reported).
"How" is additional context provided by the caller (see "Caller Responsibility" above) to explain the workflow that led to the exception.
"Why" is probably the most difficult to automate but helps describe why the error happened. Often this is accomplished by providing additional supporting information such as the values of related variables and objects.
When the event log provides details that address each of these questions it is immensely useful for analyzing failure after the fact. Give it a try!
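For the curious, here's a rough sketch of what a "Six W's" log record could look like in Python. The `build_event` helper, the field layout, and the example values are assumptions for illustration only; this isn't our actual logging code.

```python
import datetime
import json
import traceback


def build_event(who, exc, when, how, why):
    """Assemble a log record that answers each of the Six W's."""
    frames = traceback.extract_tb(exc.__traceback__)
    innermost = frames[-1] if frames else None
    return {
        "who": who,
        "what": {
            "type": type(exc).__name__,
            "message": str(exc),
            "stack": traceback.format_exception(type(exc), exc, exc.__traceback__),
        },
        "where": (
            {"method": innermost.name, "line": innermost.lineno}
            if innermost
            else None
        ),
        "when": when.isoformat(),
        "how": how,
        "why": why,
    }


def parse_price(text):
    return float(text)


raw_price = "abc"
try:
    parse_price(raw_price)
except ValueError as exc:
    # Capture the time as close to the failure as possible, not at report time.
    occurred_at = datetime.datetime.now(datetime.timezone.utc)
    event = build_event(
        who="store-site/CheckoutPage",
        exc=exc,
        when=occurred_at,
        how="checkout workflow was pricing the cart",
        why={"raw_price": raw_price},
    )

print(json.dumps(event, indent=2))
```

The "how" and "why" fields are filled in by the caller that caught the exception, since that is the only place the workflow context is known.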