Transcript Slide 1
Data Minimisation
Managing Data Growth While Containing Cost and Carbon Footprint
Ken Hall, Dimension Data
Friday, July 17, 2015

Agenda
• Introductions
• Today’s data management challenges
• Energy efficiency in the data centre
• What is Data Minimisation?
• Online Active Archiving
• Backup Data De-Duplication
• Data Minimisation effects
• Developing the business case
• Questions & Answers

Dimension Data - ‘Data Centre & Storage Solutions’
Network Integration Security Microsoft Solutions Infrastructure Managed Services Microsoft Solutions Application Integration Customer Interactive Solutions
Data Centre & Storage Solutions – Availability, Compliance & Optimisation
• Storage Solutions – SAN, NAS, CAS
• Virtualisation Solutions – DR, Server & Desktop Consolidation
• Backup, Recovery & Archiving Solutions
• Data Centre Environmentals – Power, Cooling & Rack Solutions
Key Technology Partners
• APC, Cisco, EMC, HDS, HP, IBM, Microsoft, NetApp, Quantum, Symantec, Sun

The Digital Universe is Rapidly Expanding
Amount of Digital Information Created and Replicated Each Year
[Chart: exabytes per year from 2006 onwards, rising from 173 exabytes to 1,773 exabytes]
Ten-fold growth in five years!
Source: IDC White Paper, "The Diverse and Exploding Digital Universe," March 2008

Typical DD Customer – Exponential Data Growth
• Annual Compound Data Growth of 65%
• Daily Incremental and Weekly Full
• 2 Week Retention on Disk (3 Fulls - 10 Incr)
• 4 Week Retention on Tape
• 12 Monthlies on Tape kept indefinitely

• Having to squeeze more into Backup Window
• B2D Requirement Growing Rapidly
• Backup Media Server/s Under Pressure
• Network Bandwidth Constraints
• Tape Infrastructure & Handling Costs Increasing

Coping with Information Growth in Today’s Economy
In 2009, IT budgets are flat or declining*
• Escalating costs for primary storage
• Difficulty meeting backup and recovery windows
• Ensuring high availability of information
• Providing timely access to historical information
*“Global purchases of IT goods and services… will equal $1.66 trillion in 2009, declining by 3 percent after an 8 percent rise in 2008.” Global IT Market Outlook: 2009, Forrester Research, January 12, 2009

Data Center Energy Use is Doubling
Comparison of Projected Electricity Use, 2007 to 2011
[Chart: Annual Electricity Use (billion kWh/year), 2000 to 2011; series: Historical energy use, State of the art scenario]
• IT energy use has doubled since 2000 and will likely double again by 2011
• Energy operating costs will soon exceed the cost of purchase for servers
• Existing conservation technologies can reduce consumption to 2002 levels
Source: EPA report to Congress, 2007

Available Capabilities for Energy Efficiency
Improve Efficiency – Reduce Energy Consumption
REDUCE CAPACITY
• Snaps
• Clones
• Compression
• De-duplication
• Archiving
INCREASE UTILIZATION
• Server virtualisation
• Data migration
• Storage consolidation
• Virtual Provisioning
• Flash drives
• Optimisation algorithms
• Automated discovery
• Document management
• Storage tiering
• Virtual LUNS
• File and e-mail tiering
• Storage virtualisation
• Large-capacity drives
• Replication across storage tiers

How can
we...
• Manage exponential data growth, while...
• Improving access to organisational data
• Containing data management and infrastructure costs
• Reducing the data centre’s carbon footprint...
Implement a Data Minimisation Strategy
• Online archiving of e-mail and file systems
• Backup with data de-duplication

Data Minimisation Elements
New Technologies and Services are Enablers
Primary Storage / Archive / Backup
• Identify candidates for archiving
• Classify and move
• Establish SLAs based on information class
• Retention and compliance
• Tier backup infrastructure
• Data reduction
• Optimise media: B2D, VTL, de-dupe and tape
• Universal access
• Simplify management
• Address security issues
• Simplify management

Data Minimisation – How it works
1. Archive the inactive data before you perform the backup process
• Identify inactive data based on policies
• Automate the movement of the data to a lower cost storage tier or dedicated archive platform, leaving stubs behind
• Items are retrieved from the online archive on user demand
• Back up the archive infrequently or never
2.
Back up the remaining data using resource-efficient data de-duplication
• Rapid ‘Full Backups’ - only the ‘sub-file’ changes are sent and stored on disk
• Minimal Bandwidth – only a fraction of the typical 200% is sent over the wire
• Minimal Storage Consumption – only unique ‘sub-file’ blocks are stored
• Protect more, with less, for longer

Today: Energy-Efficient Storage Design
1 TB Data on Different Capacity/Performance Drives
[Chart: annual energy to hold 1 TB on six drive types (73 GB Flash drive, 15K 73 GB, 15K 146 GB, 10K 300 GB, 7.2K 500 GB, 7.2K 1 TB); labelled values 6,096, 3,790, 3,048, 1,434, 787 and 393 kWh/yr; labelled reductions 50%, 73%, 87% and 94%; annotations "38% Less Energy" and "30x IOPS"]
CONSUME LESS ENERGY BY CAPACITY

File System Archiving
• Extract inactive, final-form data to an archive
• Enhance performance of production applications
• Reduce size of backup datasets
• Free up expensive Tier 1 disk
• Store archived data on high-density, low-cost, energy-efficient storage
[Diagram: Before, a full backup of 10 TB of production data on primary storage. After, 6 TB of inactive data is extracted to an always-available active archive on secondary storage, leaving 4 TB of active data in production, a 4 TB backup of active data only, and reclaimed primary storage]

E-Mail Archiving
Mail archival automatically creates shortcuts to archived messages and attachments, and deletes the original attachments from the e-mail server
Space saved on e-mail server is typically 60–80%
[Diagram: Message Server, E-mail Archive Server and User’s Inbox, each showing Messages 1 to 3 (Jan. 1 and Feb. 1, 2008) with attachments replaced by shortcuts]
Definition of De-duplication
“The process of detecting and identifying the unique data segments within a given set of information, enabling the elimination of redundancy when stored or moved.”
[Diagram: Data Set 1, Data Set 2 and Data Set 3 pass through de-duplication. Before: total segments = 39. After: unique segments = 6]

Data De-duplication: How it Works
[Diagram: First Instance (May 2007) with segments A B C D; Duplicate Instance (May 2007) with segments A B C D; Modified Instance (June 2008) with segments E B C D]
• First instance: only unique data segments are backed up (A B C D)
• Duplicate instance: data already backed up, so only a unique ID pointer is stored (20 bytes)
• Modified instance: new data segment identified and backed up (E)
• Unique data stored on disk (A B C D E), available for immediate recovery

Key Point – Data Minimisation requires a platform that doesn’t need to be backed up!

Customer Archival Requirements
[Chart axes: Archiving Functionality, Management Efficiency]
Active Archiving – WORM DISK
WORM delivers unique features for online archives
• Location independence
• Self-healing and management
• Guaranteed authenticity
• Single-instancing
Online Archiving – Tier 3 Disk
Tier 3 Disk with SATA and NAS with ATA
Offline Archiving – Tape
Tape is best suited for offline archives

Data Minimisation Strategy - How it all fits together
[Diagram labels: Tier 1 Primary Storage; Tier 2 Secondary Storage; Tier 3 Archive, long-term retention on disk, 80% of data, no management required; Tier 4 Backup to disk (De-Dupe), quick recovery, de-duped data, 20% data backup; Tier 5 Legacy long-term retention on tape, optional; automated movement relative to age; static data growth]

Quantified Results – Reduce Tier 1/2 with Archiving
• Major reduction in expensive Tier 1/2 Storage
• Tier 3 Archive storage minimised due to single instancing & compression
• 73% reduction in power and cooling requirements for archived data

Quantified Results – The Data Minimisation Leverage
• Good Tier 4 Savings with Archiving or De-Duplication
• Excellent results by combining Archiving & Backup Data
De-Duplication
• 6x reduction in power and cooling requirements for B2D storage

Quantified Results – Less Tape Infrastructure
• Associated reduction in Tape Library Slots, Drives, Management & Handling
• Power of combining Archiving & De-Duplication – 560 fewer LTO4 tapes in Year 3
• Tape could be removed altogether – Offsite Replication & Disk Spin-Down

Data management cost comparison – Data Minimisation
[Chart: New Data Management Annual Costs, $0 to $3,000,000; Old Cost against New Cost; Present State and Years 1 to 5]
Significant Reduction of Backup Infrastructure and Tape Management
• Tape Drive, Tape Licences, Slots, Library, Backup Server, Tape Media, Offsite Storage & Recall Costs, Admin Costs
© Copyright Dimension Data 2000 - 2006 17 July 2015

Data Minimisation Assessment – Business Case
• Current backup minimisation methods give you more efficient backups
• However, they don't fix the cause of the problem, which is data growth
• A combination of data archival, backup de-duplication and compression represents the most effective way to contain data within your environment
• Helps quantify business case for archiving (or other appropriate solution)
• Workshop to identify costs/issues
© Copyright Dimension Data 2000 - 2008 17 July 2015

Data Minimisation – Input Variables

Data Minimisation – Graphical View

Data Minimisation – Graphical View (Cont.)

Data minimisation strategy achieved by...
• Archiving over 70% of data to a protected environment, which removed the need for that data to be backed up
• Minimised the impact of data backup via de-duplication and compression (reduction in data volume and backup data by 80%)
• Minimised the impact of VMware on the environment through de-duplication
• Contained Tier 1 disk growth and spend
• Provided the most storage-efficient backup method possible today
• Estimated savings to be over 5 million dollars in 5 years.

[Chart: Estimated Infrastructure Run Rate, 2006 to 2016; series: Power (kW), Cooling (Tons), Footprint (sq. ft.), Equipment (Units); cost axis in $]
[Chart: cost in "K$", $0 to $4,500, Year 1 to Year 3]

"K$"              Year 1   Year 2   Year 3    Total
Cost BAU            $708   $1,410   $2,107   $4,226
Cost Optimized      $278     $560     $840   $1,678
Savings             $430     $850   $1,267   $2,548

‘My initial Sync took 12 hours now I backup in 50 mins’ – Dimension Data Customer

Questions & Answers
Friday, July 17, 2015
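
Worked example for the "Data De-duplication: How it Works" slide. The slide says that unique segments are stored once and that a repeat costs only a 20-byte unique ID pointer. The sketch below illustrates that idea under stated assumptions: fixed 4-byte segments and a SHA-1 digest (which happens to be 20 bytes) as the pointer. The slide does not name a hash, a segment size, or any product API, so `SegmentStore`, `SEGMENT_SIZE` and the method names are hypothetical and not a description of any vendor's implementation.

```python
import hashlib

SEGMENT_SIZE = 4  # assumption: tiny fixed-size segments, chosen for readability


class SegmentStore:
    """Toy de-duplicating store: each unique segment is kept exactly once."""

    def __init__(self):
        # Maps a 20-byte SHA-1 digest (the "unique ID pointer") to segment bytes.
        self.segments = {}

    def backup(self, data: bytes) -> list:
        """Store any unseen segments and return the list of pointers."""
        recipe = []
        for i in range(0, len(data), SEGMENT_SIZE):
            segment = data[i:i + SEGMENT_SIZE]
            pointer = hashlib.sha1(segment).digest()
            # Only a segment that has never been seen consumes new storage.
            self.segments.setdefault(pointer, segment)
            recipe.append(pointer)
        return recipe

    def restore(self, recipe: list) -> bytes:
        """Rebuild the original data from its pointers."""
        return b"".join(self.segments[p] for p in recipe)


store = SegmentStore()

first = store.backup(b"AAAABBBBCCCCDDDD")      # first instance: A B C D
stored_after_first = len(store.segments)

duplicate = store.backup(b"AAAABBBBCCCCDDDD")  # duplicate instance: A B C D
stored_after_duplicate = len(store.segments)

modified = store.backup(b"EEEEBBBBCCCCDDDD")   # modified instance: E B C D
stored_after_modified = len(store.segments)

print(stored_after_first, stored_after_duplicate, stored_after_modified)
```

In this sketch the duplicate backup adds no segments and the modified backup adds only segment E, mirroring the three instances on the slide. Production systems differ in ways the sketch ignores, for example in how segment boundaries are chosen and how hash collisions are handled.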