Tuesday, February 5, 2013

Four Modeling Don'ts

Four Modeling Don'ts by M.M.Herlihy

Don't forget.
Pitfall #1: Not Balancing Math and Business—Analysts need to take time to understand the key performance drivers of a business, Herlihy explains, so they can build a model that logically reflects the firm’s market situation and goals. For example, response is dependent on deliverability, but deliverability should be a pre-select before the model is developed and not a variable in the model. In addition, analysts need to be on the lookout for data fields that contain ill-defined content, such as a field that once represented survey responders but in recent years indicated which customers ordered gift packaging on orders.

Pitfall #2: Believing in Quick Fixes—Automated modeling solutions have mass appeal, but can produce lackluster to disastrous results when not handled properly. Even with a “black box” tool, says Herlihy, you still need to know how to interpret the results, run through the appropriate iterations, validate the model and tune the software and settings. Bad models are worse than no models.

Pitfall #3: Not Modeling the Right Thing—In many cases, businesses must drive contradictory behaviors to attract the right prospects and encourage the most profitable customer activity. Herlihy gives the example of credit lenders, who tend to get the highest responses to offers from individuals who need credit but aren’t likely to pass the criteria for credit approval. Conversely, the individuals who are most likely to get approved are the least likely to respond, because they don’t need credit. Modeling for just response or just persistency (conversion/payment) won’t achieve lasting business success; rather, the right thing to do is to develop a balanced model construct that looks for consumers who score well on both behaviors.
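One way to picture a "balanced" construct is to score each prospect on both behaviors and rank on a combination of the two. The sketch below is only an illustration, not Herlihy's method: the `Prospect` class, the two probability fields, and the choice of multiplying the scores are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Prospect:
    name: str
    p_response: float  # hypothetical response-model score, 0..1
    p_approval: float  # hypothetical approval/persistency score, 0..1


def balanced_score(p: Prospect) -> float:
    # Assumption: the product rewards prospects who do well on BOTH
    # behaviors and penalizes anyone who is weak on either one.
    return p.p_response * p.p_approval


def rank_prospects(prospects):
    # Highest balanced score first.
    return sorted(prospects, key=balanced_score, reverse=True)


prospects = [
    Prospect("needs credit, unlikely to be approved", 0.50, 0.10),
    Prospect("easily approved, unlikely to respond", 0.05, 0.90),
    Prospect("moderate on both", 0.30, 0.60),
]

ranked = rank_prospects(prospects)
for p in ranked:
    print(f"{balanced_score(p):.3f}  {p.name}")
```

With these made-up scores, the prospect who is moderate on both behaviors ranks above the one with the best response score and the one with the best approval score.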

Pitfall #4: Inconsistency in Data Storage and Deployment—The wrong data can easily be pulled into a model when files are scored with inconsistent coding schemes. Some businesses use a scale of 1 to 10, with 1 being the worst score and 10 being the best; other firms reverse the scale. In addition, computer programmers often use 0 through 9 for data storage because it takes up a consistent number of bytes, says Herlihy. Analysts who don’t study the database closely enough to find out which tagging methodologies are in use (especially analysts who are new and inexperienced) are likely to pull low-performing deciles by mistake and build the wrong model. The best practice is for a business to stick to one tagging methodology.
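A small sketch of what sticking to one tagging methodology could look like in practice: every stored decile code is converted to a single convention before anything is pulled. The scheme names and the target convention (rank 1 = best, rank 10 = worst) are assumptions made up for this example, as is the direction of the 0 through 9 scale.

```python
# Each hypothetical scheme: (lowest code, highest code, is the high code best?)
SCHEMES = {
    "1_to_10_high_best": (1, 10, True),
    "1_to_10_low_best": (1, 10, False),
    "0_to_9_low_best": (0, 9, False),  # assumed direction for the 0-9 scale
}


def to_rank(code: int, scheme: str) -> int:
    """Convert a stored decile code to a common rank: 1 = best, 10 = worst."""
    low, high, high_is_best = SCHEMES[scheme]
    if not low <= code <= high:
        # An out-of-range code usually means the file uses another scheme.
        raise ValueError(f"code {code} is not valid for scheme {scheme}")
    return high - code + 1 if high_is_best else code - low + 1


print(to_rank(10, "1_to_10_high_best"))  # best decile on a "10 is best" file
print(to_rank(1, "1_to_10_low_best"))    # best decile on a "1 is best" file
print(to_rank(0, "0_to_9_low_best"))     # best decile on a 0-9 file
```

All three calls return rank 1, so a "top two deciles" pull can be written once against the common rank instead of once per file layout.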

A final word of advice from Herlihy: “If you don’t have at least 1,000 of whatever you are trying to model, it is best to collect more data before you invest in the model.”
***********
Why 0-9 for Scoring:
http://multichannelmerchant.com/lists/archive/data_storage_062507/
"

If you're using 0-9 as your stored decile values, with 0 being the highest score and 9 being the lowest, and you want the top two deciles pulled, records would be pulled from deciles 0-1.
But if you're using 1-10, with 1 being the highest score and 10 being the lowest, there's a good chance the 10 will be mistaken for a 0, meaning you would actually have the best names and the worst names pulled instead.
Say you pulled 100,000 records from each of deciles 0 (with potential revenue of $87,000) and 1 (with a probable $65,250 coming in) for a mailing; the revenue would total $152,250. But if you're using 1-10 as your decile scores, and the bottom decile is mistaken for the top, the potential loss in revenue could be critical."
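The arithmetic and the mix-up in the quoted example can be checked with a few lines of Python. The two revenue figures are the ones quoted above; the `misread` function is a hypothetical stand-in for whatever step turns a stored 10 into a 0, since the quote does not say how that happens.

```python
# Revenue figures for deciles 0 and 1, taken from the quoted example.
REVENUE_BY_DECILE = {0: 87_000, 1: 65_250}


def mailing_revenue(deciles):
    """Total expected revenue for the deciles that were mailed."""
    return sum(REVENUE_BY_DECILE[d] for d in deciles)


total = mailing_revenue([0, 1])


def misread(code):
    """Hypothetical reader bug: a stored 10 is taken for a 0."""
    return 0 if code == 10 else code


# File stored as 1-10, with 1 the best decile and 10 the worst.
stored = range(1, 11)
seen_to_true = {misread(c): c for c in stored}

# "Top two deciles" means the two lowest codes as the reader sees them.
pulled_as_seen = sorted(seen_to_true)[:2]
true_pulled = [seen_to_true[s] for s in pulled_as_seen]

print(total)        # 152250
print(true_pulled)  # [10, 1]
```

The pull that was meant to return the two best deciles returns the worst decile (10) alongside the best (1), which is the failure the quote warns about.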
