Department of Accounting & Law
Acc 522. Statistical Methods
for Business Decisions
Fall, 2003
“Statistics are to baseball what a
flaky crust is to Mom’s apple pie.”
--Harry Reasoner
“He uses statistics as a drunken
man uses lamp-posts – for support rather than illumination.”
-- Andrew Lang
“Utility is our national
shibboleth: the savior of the American businessman is fact and his uterine
half-brother, statistics.”
--Edward Dahlberg
“There is a certain embarrassment
about being a storyteller in these times when stories are considered not quite
as satisfying as statements and statements not quite as satisfying as
statistics; but in the long run, a people is known, not by its statements or
its statistics, but by the stories it tells.”
-- Flannery O’Connor
Welcome to the exciting world of
exploratory data analysis, traditional as well as Bayesian statistics, and
datamining. The emphasis in the course will be on the use of statistics and the
powerful graphics provided by the object-oriented language S-Plus for the
analysis and visualisation of data of special interest to auditors, information
system auditors, computer security professionals, and other professionals
involved in data warehousing
and webmining.
The course is very fast paced,
and rather formal in terms of statistical as well as programming constructs
used. It is therefore important that you keep up with the class at all times
and not be left behind. Should you need help, seek it immediately. I am here to
help you learn.
Use the wonderful facilities in
the Arthur Andersen Laboratory for Accounting Information Systems. Enjoy!
Semester: Fall, 2003
Time: Th 5:45 – 7:05 PM
Room: BA 214 & Accounting Systems Lab
Instructor: Jagdish S. Gangolly
Office: BA 365C
Phone: (518) 442-4949
Fax: (707) 897-0601
Office Hours: Th 4:45 – 5:45 PM, or by appointment
Instructor Homepage: http://www.albany.edu/acc/gangolly
Course Homepage: http://www.albany.edu/acc/courses/acc522.fall2003
Newsgroup: sunya.class.acc522
Class Conduct:
The course consists of lectures,
solution of problems, discussion of homework and book assignments. You are
expected to do the readings well ahead of the class. Class time is to be used
for the clarification of any doubts that you may have. Do not expect to merely
listen to the instructor and gain knowledge. Applied statistics is a practical
field backed by robust theory. A good understanding of the theory and its use
in practice is essential to excel in the field. This is a hands-on course, and
you are required to demonstrate competence in the topics covered in order to
receive an acceptable grade. I shall
be giving occasional homework assignments. I shall also be calling upon some of
you to come to the board and discuss
problems from the textbooks, other sources, or the assigned homework.
Software:
I shall be using the S-Plus system running under Windows 2000. When
using the Windows version in class, I shall mostly be using command line mode
and not the Windows user interface, except for some graphics.
Newsgroup/e-mail:
We shall be using the class newsgroup (sunya.class.acc522) extensively for
making announcements regarding tests, homework, quizzes, added links to this
course homepage, etc. In fact, the newsgroup will be the primary means of
communication between us outside of the class. You should post to the newsgroup
all your questions and doubts for clarification. You are strongly encouraged to
answer queries posted by others, and such responses will count towards class
participation points for grading. You should communicate with me via e-mail
only for individual problems and questions.
The Graduate Laboratory for Accounting Information Systems Access:
As a graduate student in the Department, you have access to the Arthur Andersen
Laboratory. You will need to get the password to enter the lab from
Ms. Lisa Scholz. Contact her in BA 365 as soon as possible. Should you have special
requirements for software (DBMS servers) or hardware (Windows 2000 Servers) for
your projects, let me know, and arrangements will be made for your access. You
can obtain a free copy of the S-Plus software by filling in the form at http://www.albany.edu/its/software/SWRequestForm.html
Course Objectives:
· Understanding of exploratory data analysis
· Understanding of the language S-Plus
· Understanding of basic traditional statistics
· Understanding of Bayesian analysis and networks
· Understanding of the basics of multivariate methods
Catalog Description:
Extensive coverage of sampling techniques for
decision making. Includes simple random sampling, stratified sampling, cluster
sampling, treating unequal clusters, area sampling, imperfect frames,
questionnaire design, and field operations.
Prerequisite: Msi 220 or Mat 108 or equivalent.
An Honest Description:
Gangolly:
Data acquisition and preprocessing for statistical analysis. Exploratory descriptive
data analysis using the language S-Plus. Basic graphics commands in S-Plus,
including trellis graphics. Descriptive data exploration and statistical
modeling. Data preprocessing for Datamining & Concept
Description. Data Cleaning, Data Integration & Transformation; Concept
hierarchy generation. Association Rules in Large Databases. Classification
& Prediction. Multivariate Methods: Clustering & other multivariate
statistical methods. Fundamentals of
Probability & Introduction to Bayesian Decision Theory: Probabilities:
joint, conditional & marginal; Bayes’ Theorem and Likelihood ratio.
Nomenclature of decision trees (or Bayesian Networks or Influence
Diagrams). The construction of trees,
the method of folding back, and the conversion of given probabilities to usable
probability metrics.
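As a rough illustration of the probability topics listed above (joint, conditional and marginal probabilities, Bayes’ Theorem, and the likelihood ratio), here is a minimal Python sketch of a single Bayesian update. It is not part of the course materials, which use S-Plus. The function names and the audit-style numbers are hypothetical, chosen only for this example.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """Bayes' Theorem for a binary hypothesis H and observed evidence E."""
    joint_h = prior * p_e_given_h                    # joint P(E and H)
    joint_not_h = (1.0 - prior) * p_e_given_not_h    # joint P(E and not H)
    marginal_e = joint_h + joint_not_h               # marginal P(E)
    return joint_h / marginal_e                      # conditional P(H | E)


def posterior_from_likelihood_ratio(prior, likelihood_ratio):
    """Same update in odds form: posterior odds = prior odds * likelihood ratio."""
    prior_odds = prior / (1.0 - prior)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1.0 + posterior_odds)


# Hypothetical numbers (assumptions for illustration only):
# H = "the account balance is misstated", E = "the audit test flags the account".
prior = 0.10             # P(H) before running the test
p_flag_if_error = 0.90   # P(E | H)
p_flag_if_ok = 0.20      # P(E | not H)

post = posterior(prior, p_flag_if_error, p_flag_if_ok)
lr = p_flag_if_error / p_flag_if_ok                  # likelihood ratio
post_lr = posterior_from_likelihood_ratio(prior, lr)

print(f"P(H | E) via Bayes' Theorem:    {post:.3f}")
print(f"P(H | E) via likelihood ratio: {post_lr:.3f}")
```

Both routes give the same posterior. The odds form is convenient when the same prior is to be updated with several independent pieces of evidence, since the likelihood ratios simply multiply.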
· The Basics of S and S-Plus (Statistics and Computing)
Andreas Krause, Melvin Olson
3rd edition (2002)
Springer Verlag
ISBN: 0-387-95456-2
· Visual Data Mining: Techniques and Tools for Data Visualization and Mining
Tom Soukup, Ian Davidson
John Wiley Publishing, Inc. (2002)
ISBN: 0-471-14999-3
I shall also be placing materials
on reserve in the library and/or providing links on this course page as the
semester progresses.
The classes will consist of
lectures, solution of problems, discussion of papers and programming exercises.
I shall be dividing the class into groups of three, balanced in terms of
skills in accounting, programming, facility with computers, mathematical
maturity, needs of the projects selected, and other such attributes. The groups
will work throughout the semester on three substantial homework projects, with
group members taking turns as the lead on each assignment.
The final course grade is dependent on the following factors:
· 100 points: Test (In class, open book/notes. Details will be announced in the class and updated here)
· 100 points: Group Project & written report
· 0 - 50 points: Pop-quizzes, when given
· 25 points: Class participation
· 225 - 275 points: Total points (max)
The final course grade is strictly relative, based on the total points
scored.
Grades, once assigned, cannot
be changed except in case of errors in grading. Under no circumstances
is it possible to do extra credit work to improve the grade.
Jagdish S. Gangolly is
currently an Associate Professor of Accounting and of Management Science
& Information Systems, Director of Graduate Accounting Programs in the
School of Business, and the Interim Director of the Ph.D. Program in
Information Science at the School of Information Science & Policy at the
State University of New York at Albany. He is also an affiliate and advisor at
the Institute for Informatics, Logic & Security Studies at SUNY Albany. He
holds a Bachelor's degree with a major in Mathematical Statistics, a master's
degree with a major in Operations Research, and a Ph.D. degree in Business
Administration (Accounting). He is also a Certified Internal Auditor. He has
previously taught at the University of Pittsburgh, University of Kansas,
Claremont McKenna College & the Claremont Graduate School, and California
State University at Fullerton. He has worked in senior executive positions in
management services in the pulp & paper industry as well as in soft-drink
franchising in India. His papers have appeared in Journal of Accounting
Research, Auditing: A Journal of Practice & Theory, Journal of
the Operational Research Society, Critical Perspectives on Accounting,
Expert Systems with Applications: An International Journal, Artificial Intelligence in Accounting
& Auditing, International Journal of Digital Accounting Research, and the
New Review of Applied Expert Systems & Emerging Technologies. In 1989,
he was the guest editor of Advances in Accounting; and he currently
serves on the editorial boards of the American Accounting Association journals Issues
in Accounting Education and the Journal of Emerging Technologies in
Accounting, the International Journal of Digital Accounting Research,
and is an Associate editor of the e-Services Journal. He also serves on
the E-Commerce Curriculum Committee of the International Federation
for Information Processing (IFIP). His current research activities are
primarily in the areas of conceptual information organisation, markup
languages supporting electronic commerce, and the formal specification of control
in accounting information systems. He also has a collateral research interest
in the relationships between Accounting and Legal Philosophy.
Tentative Schedule
September 11, 2003
Theme: Data preprocessing for Statistical Analysis
& Introduction to S-Plus
Powerpoint
Topics: Unix shell scripting, tr, join; ftp, character
coding, text processing.
Readings: Unix man pages for tr, join, ftp; basic use of
emacs or vi editors, etc.
To Do: Homework 1.
(Due October 1, 2003)
September 18, 2003
Powerpoint
Theme: Data & Graphics in S-Plus I
Topics: Matrices, Frames, Arrays, and their manipulation;
Basic graphics commands in S-Plus; Trellis graphics
Readings: KO: Ch 3 – 6.
September 25, 2003
Powerpoint
Theme: Data & Graphics in S-Plus II
Topics: Matrices, Frames, Arrays, and their manipulation;
Basic graphics commands in S-Plus; Trellis graphics
Readings: KO: Ch 3 – 6.
October 2, 2003
Theme: Descriptive Data Exploration & Statistical
Modeling
Topics: Use of S-Plus Graphics; Regression.
Readings: KO: Ch 7 – 8.
To Do: Homework 2 (Due October 31, 2003)
October 9, 2003
Theme: Fundamentals of Probability & Introduction to
Bayesian Decision Theory I
Topics: Probabilities: joint, conditional & marginal;
Bayes’ Theorem and Likelihood ratio. Bayesian analysis
Readings: Readings on Probability Concepts and Decision Trees
to be distributed.
October 16, 2003
Theme: Fundamentals of Probability & Introduction to
Bayesian Decision Theory II
Topics: Probabilities: joint, conditional & marginal;
Bayes’ Theorem and Likelihood ratio. Bayesian analysis
Readings: Readings on Probability Concepts and Decision Trees
to be distributed.
October 23, 2003
Theme: Fundamentals of Probability & Introduction to
Bayesian Decision Theory III
Topics: Probabilities: joint, conditional & marginal;
Bayes’ Theorem and Likelihood ratio. Bayesian analysis
Readings: Readings on Probability Concepts and Decision Trees
to be distributed.
October 30, 2003
Theme: Multivariate Methods I: Clustering
Topics: Hierarchical & Partitioning methods for
Clustering, and their use in S-Plus.
Readings: Clustering in S-Plus online manuals. Additional
readings to be distributed.
November 6, 2003
Theme: Multivariate Methods II: Other Methods
Topics: Multi-Dimensional Scaling, etc.
Readings: S-Plus online manuals.
November 13, 2003
Theme: Multivariate Methods III: Other Methods
Topics: Multi-Dimensional Scaling, etc.
Readings: S-Plus online manuals.
TEST (80 minutes)
Theme: Auditing Applications
Readings: Papers to be assigned
Theme: Project Presentations
Updated on September 8, 2003
by Jagdish S. Gangolly (j.gangolly@albany.edu)