This article discusses the scope of mathematical programming projects using the example of several successful open source/free software projects. In the author’s experience, it is common to encounter rather optimistic expectations of the cost, schedule, and risk level of mathematical research and development and programming projects; the two categories, mathematical research and development and mathematical programming, are heavily blurred together today, especially for practical and applied projects. This article provides a rough picture of the scope of mathematical programming projects based on actual historical data rather than popular culture, anecdote, or folk wisdom.
There are strong practical reasons for applied mathematical research and development and programming projects. Many potentially valuable projects exist. These projects often suffer from the “cure for cancer” problem. With several hundred thousand people each year in the United States alone succumbing to cancer, there is little question that there is a large market for a cure for cancer. The problem is that we do not know how to cure cancer. Similarly, successful mathematical research and development and programming projects offer everything from profitable investment advice to speech recognition for mobile devices and household appliances to working fusion reactors and other new energy sources. Indeed, a cure for cancer is something that mathematical methods may offer someday through molecular modeling or other quantitative approaches. Given the large potential markets for successful mathematical projects, it is common to encounter individuals, organizations, and companies with great interest in specific, usually practical, mathematical projects. These projects are highly unlikely to succeed without accurate ideas about their scope.
The Scope of Some Successful Free Open-Source Mathematical Software Projects
Program | Lines of Code | Core Lines of Code | Calendar Duration | Number of Contributors |
FFMPEG 0.6.1 Video Encoder | 373,742 | 368,457 | at least 2004-2011 | 50 |
x264 h.264 Video Encoder (x264-snapshot-20110204-2245) | 67,986 | 62,968 | at least 2004-2011 | 18 |
Independent JPEG Group JPEG encoder/decoder v8c | 61,102 | 52,304 | at least 2000-2011 | 13 |
Open CV 2.2.0 Computer Vision Library | 884,808 | 396,399 | at least 1999-2011 | 80 |
Insight Toolkit 3.20.0 Image Segmentation and Registration Toolkit | 698,143 | 685,466 | at least 1999-2011 | at least 14 |
Pythia/Lund Monte Carlo 8.145 Particle Physics Event Simulation (C++ version) | 141,353 | 46,258 | 1977-2011 | 5 |
Pythia/Lund Monte Carlo 6.327 Particle Physics Event Simulation (last FORTRAN version) | 60,455 | 60,455 | 1977-1996 | 5 |
EGS (Electron Gamma Shower) | 38,921 | 31,151 | 1950’s-2011 | unknown |
LAPACK 3.3.0 Linear Algebra Library | 459,993 | 458,645 | at least 1970’s to present | many contributors (probably over 100) |
AESCRYPT Encryption/Decryption Utility | 4,331 | 4,286 | 2001-2009 | at least 2 |
GNU Privacy Guard (GNUpg) v. 1.4.11 | 148,374 | 120,441 | at least 1998 to 2008 | 47 |
Octave 3.2.4 Numerical Programming Tool | 539,233 | 453,160 | at least 1980’s to present | many contributors (probably over 100) |
Notes
The free, open-source CLOC (Count Lines of Code) utility was used to count the number of lines of code in each project. CLOC lists the number of lines of code in each programming language in the project, such as C, C++, Bourne Shell, HTML, and so forth. CLOC does not count blank lines or comment lines. Some projects include sizable amounts of installation code (in the Unix Bourne Shell, for example), HTML documentation, and so forth, which are counted in the total number of lines of code reported by CLOC. The actual mathematical code is usually implemented in a few languages such as C, C++, FORTRAN, or MATLAB. The term “Core Lines of Code” refers to the lines of code in these languages, as reported by CLOC, which are presumed to contain the actual mathematical software.
In general, open source projects provide a wealth of detailed information that is difficult or impossible to acquire for many commercial proprietary projects. In particular, one can see the source code, count the lines of code or other measures of size and scope, and often read comments, change logs, logs of version control systems, and so forth. Nearly all open source projects give a list of contributors somewhere in the documentation and provide rough information on the calendar duration of the project. There is usually precise information on releases and release dates. Unfortunately, it is difficult to get a reasonably exact measure of the actual effort expended on the project. Most open source projects do not publish information on actual hours worked or dollars expended, even if records exist. Several of the examples were fully or partially funded either by government funding agencies (e.g. the National Library of Medicine for the Insight Toolkit) or private sources (e.g. Intel for OpenCV), so such detailed information may be available in some cases.
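The split between total and core lines of code can be sketched as follows. This is a minimal Python illustration, not the author's procedure: the per-language counts are invented, the language labels are assumed to follow a CLOC-style report, and the exact set of languages treated as "core" (including whether header files count) is an assumption.

```python
# Assumed set of "core" languages; the article names C, C++, FORTRAN and MATLAB
# only as examples, so this exact set (and the header entry) is hypothetical.
CORE_LANGUAGES = {"C", "C++", "C/C++ Header", "Fortran 77", "Fortran 90", "MATLAB"}


def core_lines_of_code(counts, core_languages=CORE_LANGUAGES):
    """Return (total, core) lines of code from a language -> line-count mapping.

    The counts are assumed to exclude blank and comment lines already,
    as a CLOC-style report does.
    """
    total = sum(counts.values())
    core = sum(n for language, n in counts.items() if language in core_languages)
    return total, core


# Invented counts for illustration only; not taken from any project in the table.
example_counts = {
    "C": 40000,
    "C/C++ Header": 5000,
    "Bourne Shell": 3000,
    "HTML": 2000,
}

total, core = core_lines_of_code(example_counts)
print(f"total lines: {total}, core lines: {core}")
```

Installation scripts and HTML documentation inflate the total but not the core figure, which is why the two columns differ so much for some projects in the table.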
The Examples
The examples were selected as successful free open-source projects widely used within their field or application, with a quality comparable to or superior to good commercial software products. Several, such as FFMPEG and x264, are highly applied and used in the everyday world. Several, such as the Pythia/Lund Monte Carlo, are primarily scientific research tools. Some, such as Octave and LAPACK, span both worlds.
FFMPEG is a widely used open source audio/video encoding utility and collection of libraries. FFMPEG can encode and decode a wide range of audio and video formats and compression schemes, including h.264. It incorporates a number of other utilities and libraries. x264 is a widely used open source h.264 video encoder. The Independent JPEG Group distributes a widely used open source JPEG image encoder and decoder. Open CV is a widely used computer vision library incorporating many of the current state-of-the-art computer vision algorithms; it is used in research and in a few commercial products. The Insight Toolkit is a toolkit of image segmentation and registration algorithms, somewhat similar to Open CV in practice, geared toward medical imaging.
The Pythia/Lund Monte Carlo is a widely used program for simulating the formation of jets of subatomic particles and other processes in experimental and theoretical particle physics, for example at the Large Hadron Collider (LHC) at CERN. Two versions, the original FORTRAN version and the more recent rewrite in C++, are listed. Electron Gamma Shower or EGS is a widely used program for simulating the interactions of electrons and photons (gamma rays and x-rays) with matter. It was originally developed for nuclear and particle physics at the Stanford Linear Accelerator Center (SLAC), but is now widely used for medical radiation studies. LAPACK is a widely used FORTRAN library of linear algebra and other basic numerical algorithms; it is often found in other programs as well. AESCRYPT is a free open-source implementation of the Advanced Encryption Standard (AES) for data encryption. GNU Privacy Guard (GNUpg) is a free, open-source implementation of the OpenPGP encryption standard. Octave is a free, open-source numerical programming tool that is largely compatible with MATLAB. Octave has been discussed in previous articles by this author, starting with Octave: An Alternative to the High Cost of MATLAB.
Actual Effort Estimation with Basic COCOMO
The Constructive Cost Model (COCOMO) is a software cost estimation model developed by Barry Boehm. Basic COCOMO is the original, very simple cost estimation model published by Boehm in his 1981 book Software Engineering Economics. It gives a simple, crude estimate of the effort in man-months as a function of the number of lines of code in a project. The following table gives the estimated effort in man-months/man-years from applying the “organic” Basic COCOMO model to the number of lines of code in each mathematical open source project in this article:
Program | Basic COCOMO Man-Months | Basic COCOMO Man-Years |
FFMPEG 0.6.1 | 1,204 | 100 |
x264 | 201.5 | 16.75 |
IJG v8c | 179.8 | 15 |
Open CV 2.2.0 | 2,982 | 248.5 |
Insight Toolkit | 2,324 | 193.7 |
Pythia/Lund 8.145 | 443 | 36 |
Pythia/Lund 6.327 | 178 | 14.8 |
EGS | 112 | 9.3 |
LAPACK 3.3.0 | 1,637 | 136.4 |
AESCRYPT | 11 | 0.9 |
GNU Privacy Guard (GNUpg) 1.4.11 | 456 | 38 |
Octave 3.2.4 | 1,771 | 147.6 |
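For reference, the “organic” Basic COCOMO effort equation behind these figures is shown below. The man-years column appears to be the man-months figure divided by 12; that is an inference from the table values, not something the article states.

$$
E = a \cdot (\mathrm{KLOC})^{b}, \qquad a = 2.4, \quad b = 1.05
$$

For example, x264 at 67,986 lines gives $E = 2.4 \times 67.986^{1.05} \approx 201.5$ man-months, or roughly 16.8 man-years.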
The following Octave/MATLAB function was used to compute the estimated man-months using the Basic COCOMO “organic” model:
function [man_months, dev_time, people_required] = cocomo(kloc, type)
% [man_months, dev_time, people_required] = cocomo(kloc [, type])
%
% kloc (thousands of lines of code)
% type (type of project: organic, semi-detached, embedded)
%
  if nargin < 2
    type = 'organic';  % default project type
  end
  c = 2.5;
  if strcmp(type, 'organic')
    a = 2.4; b = 1.05; d = 0.38;
  end
  if strcmp(type, 'semi') % semi-detached
    a = 3.0; b = 1.12; d = 0.35;
  end
  if strcmp(type, 'embedded')
    a = 3.6; b = 1.2; d = 0.32;
  end
  man_months = a*(kloc)^b;                  % estimated effort
  dev_time = c*(man_months)^d;              % estimated schedule in months
  people_required = man_months / dev_time;  % average staffing
end
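As a cross-check, the same calculation can be sketched in Python. Python is used here only for illustration; the article's own function is the Octave/MATLAB one above. The coefficients and the name `cocomo` mirror that function, while the dictionary layout and the choice of sample rows are assumptions made for this sketch.

```python
# Basic COCOMO coefficients (a, b, d) per project type, copied from the
# Octave/MATLAB function; the dictionary layout is a choice made for this sketch.
COEFFICIENTS = {
    "organic": (2.4, 1.05, 0.38),
    "semi": (3.0, 1.12, 0.35),       # semi-detached
    "embedded": (3.6, 1.2, 0.32),
}


def cocomo(kloc, project_type="organic"):
    """Return (man_months, dev_time, people_required) for kloc thousand lines."""
    a, b, d = COEFFICIENTS[project_type]
    c = 2.5
    man_months = a * kloc ** b                 # estimated effort
    dev_time = c * man_months ** d             # estimated schedule in months
    people_required = man_months / dev_time    # average staffing
    return man_months, dev_time, people_required


# Total line counts taken from the first table in the article.
for name, lines in [("x264", 67986), ("EGS", 38921), ("AESCRYPT", 4331)]:
    man_months, dev_time, people = cocomo(lines / 1000.0)
    print(f"{name}: {man_months:.1f} man-months, {man_months / 12:.1f} man-years")
```

The printed man-months for these three rows should land close to the corresponding table entries, since the table was computed from the total line counts.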
Conclusion
While this data sample is clearly limited and a larger study is desirable, it should nonetheless be evident that successful mathematical programming projects are usually substantial. Even the smallest project on the list, the AESCRYPT encryption utility, probably took several man-months to fully develop; Basic COCOMO would estimate almost one man-year. Thus, expectations of a few weeks are generally unrealistic. Indeed, expectations of three calendar months, a fiscal quarter, the current fetish of American business, are usually unrealistic. On the other hand, expectations ranging from six months to several years may be realistic depending on the specific project.
Partly because of heavy government funding of mathematical research and development, there are many open-source, free mathematical programming projects available. This provides an excellent database of information on the size and scope of such projects, something often difficult to find for business applications where most products and projects are proprietary. Anyone contemplating such a mathematical project is well advised to examine similar open source projects, if they exist, to determine the size and scope to the extent possible. Unfortunately, open source projects usually can give only a rough measure of the actual effort (mythical man-months) used in the project. The Basic COCOMO model can provide a very rough way of estimating the actual effort of the open source project from the lines of code, but clearly a more direct way of measuring the actual effort is needed.
© 2011 John F. McGowan
About the Author
John F. McGowan, Ph.D. is a software developer, research scientist, and consultant. He works primarily in the area of complex algorithms that embody advanced mathematical and logical concepts, including speech recognition and video compression technologies. He has extensive experience developing software in C, C++, Visual Basic, Mathematica, MATLAB, and many other programming languages. He is probably best known for his AVI Overview, an Internet FAQ (Frequently Asked Questions) on the Microsoft AVI (Audio Video Interleave) file format. He has worked as a contractor at NASA Ames Research Center involved in the research and development of image and video processing algorithms and technology. He has published articles on the origin and evolution of life, the exploration of Mars (anticipating the discovery of methane on Mars), and cheap access to space. He has a Ph.D. in physics from the University of Illinois at Urbana-Champaign and a B.S. in physics from the California Institute of Technology (Caltech). He can be reached at [email protected].