Transcript Document
Cost Estimation

“The most unsuccessful three years in the education of cost estimators appears to be fifth-grade arithmetic.”
» Norman R. Augustine

Goal
• The cost estimation community is working to improve estimations so that sophisticated organizations can produce products within 5% of the estimated cost (instead of within 10%).
• The typical software organization struggles to avoid estimates that are incorrect by 100%.
• What does “incorrect by 100%” mean?

Requirements for Estimations
• What are cost estimates used for?
• What are the important characteristics of a cost estimate?

Requirements for Estimation
• Timely
  – It is of limited value to provide the estimate after the project is complete
  – How early?
    • Before implementation
    • Before complete design
    • Maybe before complete requirements
• Accurate
  – How accurate is enough?
    • Enough for planning, bidding, scheduling

Scenario
• A typical software development shop. A boss is speaking with the lead developer.
• (I need two volunteers to play these parts.)

Scenario
• What’s up with the boss?
• What does the boss think about the employee?

Problem with Scenario 1
• The boss wasn’t asking for an estimate. The boss was asking for a plan to hit the target.
• Your boss may or may not know the difference.

Scenario 2
• What’s the difference?

Definitions
• Estimate
• Target
• Commitment
• Plan

Estimate
• 1. A tentative evaluation or rough calculation.
• 2. A preliminary calculation of the cost of a project.
• 3. A judgment based upon one’s impressions; opinion
» American Heritage Dictionary, 2nd Ed. 1985

Targets
• “We need to have the prototype ready by the end of the semester.”
• “These functions should be implemented by August 30 when the contract ends.”
• “The cost of the project is limited to $250,000, because that’s the maximum budget.”
• “We have to ship 7.0 by second quarter next year, because I have a reunion to attend in July.”

Commitment
• Target: description of a desirable business objective
• Commitment: a promise to deliver defined functionality at a specific level of quality by a certain date
• Plan: a sequence of steps to achieve a goal (may include a schedule)

Terms
• Note that target, estimate, and commitment are not the same concept, and the dates given for these may differ.

Scenario 3
• Suppose I give an estimate of 90 days.
• What does this mean?
[Chart “Usual”: Probability (y-axis) against Schedule (or cost) (x-axis)]
• This says there is a 100% probability of delivering on this schedule.
• In order for the number to have value, we need to know what the variance is. How likely are we to hit this estimate?
• (Usually, this is a target, not an estimate.)

Scenario 3: Bell curve
• What does it mean?
• Is this more accurate?
• What assumptions does this make?
[Chart “Common Assumption”: Probability (y-axis) against Schedule (or cost) (x-axis)]

Scenario 3: Realistic
• What does this mean?
• Why is it shaped that way?
[Chart “Realistic”: Probability (y-axis) against Schedule (or cost) (x-axis)]

Quiz Answers
1. Surface temperature of the Sun: 10,000 F / 6,000 C
2. Latitude of Shanghai: 31 degrees North
3. Area of Asian continent: 17,139,000 square miles / 44,390,000 sq km
4. Birth year of Alexander the Great: 356 BC
5. Total value of U.S. currency in circulation in 2004: $719,900,000,000 in U.S. dollars ($720 billion)
6. Total volume of the Great Lakes: 1.8*10^23 U.S. gallons / 6.8*10^23 liters
7. Worldwide box office receipts for the movie Titanic as of 2006: $1.835 billion
8. Total length of the coastline of the Pacific Ocean: 84,300 miles / 135,663 km
9. Number of books published in U.S. since 1776: 22 million
10. Weight of heaviest blue whale on record: 380,000 pounds / 179,000 kg

Scores
• How many with 10 correct? 9? 8? 7? …

Math of the expected distribution
• If we have a 90% probability for any single answer, then:
  – Probability of getting all 10 correct: .9^10 = 34.9%
  – Probability of getting 9 correct: (.9^9*.1)*10 = 38.7%
  – Probability of getting 8 correct: .9^8*.1^2*45 = 19.4%
• For any given combination of 8 correct and 2 wrong, the probability is .9^8*.1^2. But there are 45 different ways to put two wrong in a list of 10.
  – You can put the first wrong answer in any one of the first 9 places.
  – If you put it in the first spot, there are 9 places to put the second.
  – If you put it in the second spot, there are 8 places for the second.
  – And so on.
• Adding the sum to itself in reverse order:
    9 + 8 + 7 + … + 1 = x
    1 + 2 + 3 + … + 9 = x
    ---------------------
    9*10 = 2x, so x = 45
• Conclusion: with 90% confidence, you have a 93% chance of getting 8 or more correct.

What we expect at 90% confidence
[Chart: % correct (y-axis) against Questions Correct, 10 down to 0 (x-axis); labels: 90% Confidence, Usual, Class, Historical data, Us last year]

Questions:
• Did you feel pressure to make your ranges wider? Or narrower? (Why?)
• Where did the pressure come from?
• Is estimating the volume of the Great Lakes anything like estimating software?
• Is estimating the volume of the Great Lakes anything like estimating the impact of new programming tools on productivity, the productivity of an unidentified person, or the cost of developing software with no specification?

Accuracy and the cost of inaccuracy
• What is the cost of overestimating?
  – Parkinson’s Law: work expands to fill the time available
  – Goldratt’s Syndrome: People procrastinate until the last moment to start

Accuracy and the cost of inaccuracy
• What is the cost of underestimating?
  – Reduced effectiveness of project plans
  – Reduced chance of on-time completion
  – Poor technical approaches: Not enough time in requirements and design
  – Destructive late project dynamics
    • More status meetings
    • Interim releases
    • Fixing problems from workarounds

Cost Estimation 2

Recall cost estimation:
• Sophisticated organizations: within 10%
• Typical software organization: >100%
• Estimates need to be timely and accurate
• Estimate, Target, Commitment, Plan
• Costs associated with overestimates
• Costs associated with underestimates

How are we doing?
• KSLOC is 1,000 lines of source code
• MSLOC is 1,000,000 lines of source code
• With your partner, what does this graph say?
[Chart “Project Outcomes by Project Size”: % complete (y-axis) against Size (x-axis: 1 KSLOC, 10 KSLOC, 100 KSLOC, 1 MSLOC, 10 MSLOC); series: Early, On Time, Late, Failed]

Benefits of Accurate Estimates
• Improved status visibility
  – Track progress by comparing actual to planned
  – Ability to make a plan
• Higher quality
  – Less stress on developers
  – Schedule pressure can increase defect rate by 400% (Jones 1994)
• Better coordination with non-software functions
  – Testing, documentation, marketing, training, support
  – Better estimation: tighter coordination
• Better budgeting
  – obvious
• Increased credibility for team
  – Not unusual for
    • Team to estimate
    • Others (managers, marketers, sales staff) turn it into optimistic business target
    • Developers overrun
    • Others blame team
• Early risk information
  – If target and estimate don’t match, then opportunity to:
    • Fix problem (reassign resources)
    • Re-scope
    • Cancel

Approaches to arriving at a number
• Count
  – If you want to know how many people in the room, count them
  – Usually not possible (e.g., how many people are on Earth?)
• Compute
  – Find some approach
  – E.g., Count the number of jelly beans in 1” and use that to compute the total number in the jar
• Judge
  – a.k.a. “guess”

Counting vs. Estimating
• Similarities:
  – Both arrive at a number representing some real value
  – Both are subject to error
• Differences:
  – Estimating implies imprecise knowledge

Proxy
• A value that is used to represent some other value
• Example: Estimate the weight of the people in the airplane from the number of people in the airplane (requires that we also know the average weight of people)

1 minute drill
• What is it that we want to estimate in software?

Things we want to estimate in software
• Cost
• Resources
• Revenue

1 minute drill: What are proxies for these?
• Cost
• Resources
• Revenue

Proxies in software estimation
• Cost
  – Program size
  – Program complexity
  – Development time
• Resources
  – Program size
  – Program complexity
  – Number of users
• Revenue
  – Number of customers

Sources of uncertainty
• Inaccurate information about project
• Inaccurate information about ability of project team
• Too much chaos in project
• Inaccuracies in estimation process

Group Review Estimates
• Individually:
  – Read the assignment for the movie rental program.
  – Predict the cost (time) to develop the code.
• Put the time estimate on the card and turn it in.

Group Review Estimate: Team
• In groups of 4: Compare estimates.
  – Discuss the differences enough to understand the sources of the differences.
  – Work until you reach consensus on the high and low ends of the estimation ranges
• You cannot just “average” the estimates.
• You must reach consensus on the estimate. Discuss until you get buy-in from the entire group.
• Turn in the results of this exercise.

Wideband Delphi
1. Estimators prepare initial estimates
2. The estimators meet with a coordinator to discuss estimation issues
3. Estimators give their estimates to the coordinator anonymously
4. The estimates are summarized on an iteration form
5. Estimators meet to discuss differences
6. Estimators vote to accept the average. If anyone votes “no”, return to step 2

Wideband Delphi
• Votes and estimates are anonymous
• Reduces political pressure
• Coordinator must prevent dominant personalities from controlling discussions
• (Frequently, the most reserved person has the best insights)

Results of Wideband Delphi
• Estimation error cut by 40% compared to initial group average
• Accuracy improves in 80% of the cases
• Useful for early estimates, particularly with unfamiliar systems
• Not so useful for detailed estimates

LOC, SLOC, KSLOC, MSLOC
• Lines of code
• Standard measure of size
• Often a measure of cost (i.e., time)
• What assumptions does this make?

Function Points
• Synthetic measure of program size used to estimate size early in the project
• Easier (than lines of code) to calculate from requirements
• Standards at the International Function Point Users Group (IFPUG) www.ifpug.org

FP Rules: #FPs depends on:
• External Inputs
  – Data entering the system
  – Screens, forms, dialogs, controls
  – User or other program adds, deletes, modifies data
  – Any input that requires processing logic
• External Outputs
  – Derived data leaving the system
  – Screens, reports, dialog boxes, control signals generated for end user or other program
• External Queries
  – Base data leaving the system
  – I/O combinations in which an input results in a simple output
  – Queries retrieve data with no formatting. Output is formatted
• Internal Logical Files
  – Data maintained within the application
  – Major logical groups of end-user data completely controlled by the program
  – Might be a flat file, a database table, or a collection of other data
• External Interface Files
  – Data maintained outside the application
  – Files controlled by other programs

Function Points Complexity
• The complexity of each function point depends on:
  – Record Element Types (RETs)
    • a user recognizable subgroup of data elements within an ILF or EIF
  – Data Element Types (DETs)
    • a unique user recognizable, non-recursive field
  – File Types Referenced (FTRs)
    • a file type referenced by a transaction. An FTR must also be an internal logical file or external interface file

Complexity Table - External Inputs (EI)

Complexity Table - External Outputs (EO) & Inquiries (EQ)

Complexity Table - Internal Logical File (ILF) & External Interface File (EIF)

FP Rules: Complexity Multipliers
                            Low Complexity   Medium Complexity   High Complexity
External Inputs                    3                 4                  6
External Outputs                   4                 5                  7
External Queries                   3                 4                  6
Internal Logical Files             4                10                 15
External Interface Files           5                 7                 10

Value Adjustment Factor
• It is based on 14 general system characteristics that rate the functionality of the application
  – Performance, End-user efficiency, Reusability…
• Each characteristic is assigned a degree of influence on a scale from 0 to 5
• A formula is then used to account for these characteristics

LOC vs FP (Boehm 2000, Stutzke 2005)
Language    LOC per FP
Ada              50
C               128
C#               55
C++              55
Java             55
Assembly        213
Perl             20
VB               32

FP results
• Certified counters vary by 10%
• Untrained counters vary by much more
• The multipliers may or may not be useful (some research indicates unadjusted FPs are more closely correlated with effort)
• The LOC have on average a range of 3x with respect to FPs

Function Point Counting Exercise
• In your teams, compute the FPs for the voting system and answer the questions at the end of the exercise –
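
The function point arithmetic from the preceding slides can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the weights and the LOC-per-FP figures are copied from the "Complexity Multipliers" and "LOC vs FP" tables, while the names (`WEIGHTS`, `LOC_PER_FP`, `unadjusted_fp`, `value_adjustment_factor`, `estimated_loc`) and the example counts are hypothetical and not taken from the voting system exercise. The slides only say that "a formula" is used for the Value Adjustment Factor; the sketch assumes the commonly published IFPUG form, 0.65 + 0.01 × (sum of the 14 degrees of influence).

```python
# Complexity multipliers as listed in the "Complexity Multipliers" table.
WEIGHTS = {
    "EI":  {"low": 3, "medium": 4,  "high": 6},   # External Inputs
    "EO":  {"low": 4, "medium": 5,  "high": 7},   # External Outputs
    "EQ":  {"low": 3, "medium": 4,  "high": 6},   # External Queries
    "ILF": {"low": 4, "medium": 10, "high": 15},  # Internal Logical Files
    "EIF": {"low": 5, "medium": 7,  "high": 10},  # External Interface Files
}

# LOC per function point as listed in the "LOC vs FP" table.
LOC_PER_FP = {
    "Ada": 50, "C": 128, "C#": 55, "C++": 55,
    "Java": 55, "Assembly": 213, "Perl": 20, "VB": 32,
}


def unadjusted_fp(counts):
    """Sum weight * count over all (component, complexity) pairs."""
    return sum(WEIGHTS[comp][level] * n for (comp, level), n in counts.items())


def value_adjustment_factor(degrees_of_influence):
    """ASSUMPTION: commonly published IFPUG form, not stated in the slides."""
    if len(degrees_of_influence) != 14:
        raise ValueError("expected 14 general system characteristics")
    if any(not 0 <= d <= 5 for d in degrees_of_influence):
        raise ValueError("each degree of influence must be in 0..5")
    return 0.65 + 0.01 * sum(degrees_of_influence)


def estimated_loc(function_points, language):
    """Convert function points to a rough LOC figure for one language."""
    return function_points * LOC_PER_FP[language]


# Hypothetical counts for illustration only.
example_counts = {
    ("EI", "low"): 6,
    ("EO", "medium"): 4,
    ("EQ", "low"): 3,
    ("ILF", "medium"): 2,
    ("EIF", "low"): 1,
}

ufp = unadjusted_fp(example_counts)          # 18 + 20 + 9 + 20 + 5 = 72
vaf = value_adjustment_factor([3] * 14)      # roughly 1.07
print(ufp, round(ufp * vaf, 2), estimated_loc(ufp, "Java"))
```

Because the "FP results" slide notes that unadjusted FPs may correlate more closely with effort, the sketch keeps the adjustment step separate so it can simply be skipped.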