Transcript Slide 1
Lineage: a necessity or an exaggerated benefit
April 16th, 2014
Saad Yacu
© Allstate Insurance Company Proprietary and Confidential

Agenda
• Allstate at a Glance
• Introduction
• What is Lineage
• Lineage Benefits
• Types of Lineage
• Presenting Lineage
• Building Lineage
• Integrating With Glossary
• Lineage Design Questions
• Final Thoughts
• Questions

Allstate at a glance
• The Allstate Corporation (NYSE: ALL) is the nation's largest publicly held personal lines insurer, serving approximately 16 million households through its Allstate, Encompass, Esurance and Answer Financial brand names and Allstate Financial business segment.
• Allstate branded insurance products (auto, home, life and retirement) and services are offered through Allstate agencies, independent agencies, and Allstate exclusive financial representatives, as well as via www.allstate.com, www.allstate.com/financial and 1-800 Allstate®, and are widely known through the slogan "You're In Good Hands With Allstate®."

What is the presentation about?
• Explain what lineage means to the technology and business users
• Explain lineage concepts
• Not based on any specific vendor implementation

What is Lineage
According to techopedia.com:
Data lineage is generally defined as a kind of data life cycle that includes the data's origins and where it moves over time. This term can also describe what happens to data as it goes through diverse processes.
Posted by: Cory Janssen

Lineage Benefits
• Understand where the data is
• Understand how the data moves
• Understand what happens to the data as it moves
• Impact analysis
• Dependency analysis
• Pictorial view of the whole process
[Diagram: example lineage across Front End, Back End, and Reports, showing Field 1 to Field 4, Operation 1, Operation 2, and Report 1 to Report 3]

Types of Lineage
• Technical Lineage
  • Traces the data as it moves through the physical columns
• Business Lineage
  • Provides a business-friendly view of how attributes move across the various applications
• System Lineage
  • Provides a high-level view of how data moves between systems
• Process Lineage
  • Provides a view of the various business processes acting on the data
[Diagrams: one example per lineage type; labels include Front End, Back End, Reports, Warehouse, System 1 to System 4, Process 1 to Process 4, DATA 1 to DATA 3, and the address fields Street, City, Zip, and Full Address]

Presenting Lineage
• Graphical
• Textual

Building Lineage
• As Built
• As Designed
• Hybrid

Building Lineage – As Built
Lineage is built from the ETL graphs that move and transform the data.
Pros
• Most accurate form of lineage, as it represents what the ETL is actually doing to the data
• Most third-party tools can generate this lineage, especially from their own ETL graphs
• Most metadata tools can read ETL graph metadata from other vendors to generate one lineage map
Cons
• Not easy to traverse the lineage of data flowing through non-ETL applications, such as programming code
• Not easy to understand data moving through disconnected services such as Web Services or Message Queues

Building Lineage – As Designed
Lineage is generated from the mapping design documents. Lineage is created by “Stitching” the same column from the different mapping documents to get a holistic picture as the data moves between columns.
Pros
• Lineage can be provided for any system, not necessarily an ETL process
• Lineage can be customized to satisfy the required level of detail
Cons
• Lineage might not reflect how the data movement was actually implemented
• Lineage will not automatically update as processes change
• Manual process that is expensive and requires discipline that is difficult to maintain

Building Lineage – Hybrid
Lineage is generated mainly from the “As Built” lineage, with “As Designed” lineage completing the flow in the missing sections.
Pros
• Most complete system lineage view, as it shows the end-to-end data movement
• Many vendors now allow for “Patching” the As-Built lineage with the As-Designed lineage
Cons
• Not very easy to implement
• Some lineage sections have to be manually maintained

Lineage Landscape: What should the lineage cover?
• Reports & Report Fields
• Database Tables & Columns
• Database Views, Materialized Views
• Database Packages, Functions, Triggers, and Stored Procedures
• Flat Files
• Big Data Stores
• Applications and Systems
• Hierarchical Structure Elements like XML, JSON, BSON, Avro
• Legacy Copybook Files & Fields
• ETL Transformations and Graphs
• Programming modules – COBOL, Java, .NET
• Messaging Services & Message Queues
• Web services

Enterprise Business Glossary: What does the lineage not cover?
• Business name
• Business definition
• Specific notes about usage
• Classification
• Sensitivity
• Stewards/owners/custodians
• Auditing information
• Operational Information
• Quality Information
• Super/Sub types
• Related items/fields
• Other implementations

Integrating Lineage With Business Glossary
• Lineage Without Glossary Integration
• Lineage With Glossary Integration

Lineage Design Questions
• Versioning
• Variation by Context
• Keeping Current
• Identifying Breakage
• Variation Between Design & Build
• Too Detailed or Not Detailed Enough
[Diagrams: small lineage maps of Front End, Back End, and Reports illustrating the questions above; panel labels include Before 2011, After 2011, Home, Auto, As Designed, and As Built]

Final Thoughts
• Lineage is an important and necessary item in the suite of data management & data governance utilities
• To add context and value, specifically for business users, lineage should be tightly coupled with the enterprise business glossary
• Properly built lineage is a huge asset for improving data quality in the enterprise, as it gives insight into what is happening to the data as it moves between the different systems
• Lineage helps enterprises understand where the data is, and hence is a helpful utility for identifying the locations that hold sensitive data which needs to be secured

Questions
Saad Yacu
[email protected]
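
Illustration: stitching and traversing lineage
The “Stitching” idea from the As Designed slide, together with the impact analysis and dependency analysis benefits listed earlier, can be sketched in a few lines of Python. This is only a minimal sketch and is not taken from the presentation or from any vendor tool: the mapping documents, system names, and column names below are hypothetical, and the helper names stitch and reachable are invented for the example. It assumes each mapping document is a list of (source column, target column) pairs and that the same column carries the same name in every document.

```python
from collections import defaultdict, deque

# Hypothetical mapping documents (assumption): each row maps a source column
# to a target column. All system and column names are invented.
MAPPING_DOCS = {
    "front_end_to_back_end": [
        ("front_end.street", "back_end.street"),
        ("front_end.city", "back_end.city"),
        ("front_end.zip", "back_end.zip"),
    ],
    "back_end_to_warehouse": [
        ("back_end.street", "warehouse.full_address"),
        ("back_end.city", "warehouse.full_address"),
        ("back_end.zip", "warehouse.full_address"),
    ],
    "warehouse_to_reports": [
        ("warehouse.full_address", "report_1.address"),
    ],
}


def stitch(mapping_docs):
    """Join all mapping documents on shared column names into two edge maps."""
    downstream = defaultdict(set)  # column -> columns it feeds
    upstream = defaultdict(set)    # column -> columns it is fed by
    for rows in mapping_docs.values():
        for source, target in rows:
            downstream[source].add(target)
            upstream[target].add(source)
    return downstream, upstream


def reachable(start, edges):
    """Breadth-first walk returning every column reachable from start."""
    seen = set()
    queue = deque([start])
    while queue:
        column = queue.popleft()
        for neighbor in edges.get(column, ()):
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen


downstream, upstream = stitch(MAPPING_DOCS)

# Impact analysis: what is affected if front_end.zip changes?
impact = reachable("front_end.zip", downstream)

# Dependency analysis: what does report_1.address depend on?
dependencies = reachable("report_1.address", upstream)

print(sorted(impact))
print(sorted(dependencies))
```

Keeping separate downstream and upstream maps lets the same traversal answer both questions; a hybrid approach could, under the same assumptions, add rows extracted from ETL metadata to the same structure.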