A while back I responded to a post on an ASUG forum asking how others handle refreshing their development client and data for development and testing. It’s been a while and I couldn’t find the original post, but a recent CIO Insight article reminded me of what I had told the poster. The question came up because the client’s consultant had told them it is a best practice (I hate the mis-use of this term) to refresh their development client from production regularly. My point at the time was that this is NOT a best practice and is actually far from ideal, because you lose change transport links, modification data, and version history (there are ways to restore the modification data, but not the version history). My other comment was: why would you want to do this anyway? A production environment for a large company can easily be upwards of 900 GB; I can’t imagine copying that amount of data, yet some sites do this on a regular basis. Does it make sense? You weigh in.
What started me on this posting was the CIO Insight article “Real Data Rampant During Development“; the study it reports shows that by using real data in the development and testing process, customers expose themselves to any number of data breaches. I understand the reasons for copying production data into a development and testing environment, but in my opinion it is overkill and can actually lead to quality problems. Let me explain.
Why do I think it’s overkill – I mentioned previously that copying 900+ GB of data from a production client takes additional disk capacity, but also consider that in order to make this happen I need a production client snapshot and a recovery process. That involves my NetWeaver administrator and also affects anyone wanting to use the system while I perform the recovery. In addition, when I perform the recovery, any interim work I may have in my destination client is wiped clean. So there is a cost in both money and resource time. Last point: is the time investment really worth it? How many times is all this data actually used?
Can lead to quality problems – I have heard the argument several times in discussions with developers and analysts that having real production data is the only way to ensure their application works. Really? My argument is that if the application was designed properly, and the requirements and specifications are documented, there is no reason you need real production data; any nuances that could be represented in that data should already have been taken into account during the specification and design process and incorporated into a test plan. Is making this data available only compensating for a problem in the development process? The reason I say it can lead to quality problems is that I have heard more than once, “it worked OK in development, we tested it against production data,” and this is such a cop-out. As an analyst, after having analyzed the way the system works (and doesn’t work), you need tests specifically designed to interrogate that behavior. If a particular case isn’t represented in the productive data set and the test is never performed, there is no way to ensure the quality of the application. As a side issue, creating the data you need is part of an integration test, which exercises a whole scenario or process, as opposed to a unit test, which exercises an isolated application or function.
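To make the idea of spec-driven test data a little more concrete, here is a minimal sketch in Python. It is purely illustrative: the original discussion is about SAP systems, and the pricing rule, function name, and boundary values below are invented for the example. The point is that the test cases are derived from the documented specification (boundaries, invalid input) rather than from whatever records happen to exist in a copied production client.

```python
# Illustrative only: a hypothetical pricing rule and a unit test whose data is
# constructed from the specification, not copied from production.
import unittest


def net_price(list_price: float, quantity: int, customer_group: str) -> float:
    """Hypothetical spec: 10% discount for group 'A' on orders of 100+ units;
    quantity must be positive; the result can never be negative."""
    if quantity <= 0:
        raise ValueError("quantity must be positive")
    discount = 0.10 if customer_group == "A" and quantity >= 100 else 0.0
    return max(list_price * quantity * (1 - discount), 0.0)


class NetPriceTest(unittest.TestCase):
    # Each case targets a boundary or rule named in the spec.
    def test_discount_applies_exactly_at_boundary(self):
        self.assertAlmostEqual(net_price(10.0, 100, "A"), 900.0)

    def test_no_discount_just_below_boundary(self):
        self.assertAlmostEqual(net_price(10.0, 99, "A"), 990.0)

    def test_no_discount_for_other_groups(self):
        self.assertAlmostEqual(net_price(10.0, 100, "B"), 1000.0)

    def test_invalid_quantity_is_rejected(self):
        with self.assertRaises(ValueError):
            net_price(10.0, 0, "A")


if __name__ == "__main__":
    unittest.main()
```

If a “quantity of exactly 100” order never happened to exist in production, a test suite built by copying production data would simply never exercise that boundary; a test designed from the specification catches it every time.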
I would like to hear what you think. Is copying productive data justified? Is it worth the effort, and is the payback worth the potential data breach by anyone who has access to a development or QA environment?
[polldaddy poll=1920779]
Later….