Department of Accounting & Law

State University of New York at Albany

 

 

 

Acc 522. Statistical Methods for Business Decisions

Fall, 2003

J Gangolly

 

 

 

 

 

“Statistics are to baseball what a flaky crust is to Mom’s apple pie.”

--Harry Reasoner

 

 

“He uses statistics as a drunken man uses lamp-posts—for support rather than illumination.”

-- Andrew Lang

 

"Utility is our national shibboleth: the savior of the American businessman is fact and his uterine half-brother, statistics.”

--Edward Dahlberg

 

“There is a certain embarrassment about being a storyteller in these times when stories are considered not quite as satisfying as statements and statements not quite as satisfying as statistics; but in the long run, a people is known, not by its statements or its statistics, but by the stories it tells.”

-- Flannery O’Connor

 

Welcome

 

 

Welcome to the exciting world of exploratory data analysis, traditional as well as Bayesian statistics, and datamining. The emphasis in the course will be on the use of statistics and the powerful graphics provided by the object-oriented language S-Plus for the analysis and visualisation of data of special interest to auditors, information system auditors, computer security professionals, and other professionals involved in data warehousing and webmining.

 

The course is very fast paced, and rather formal in terms of the statistical as well as programming constructs used. It is therefore important that you keep up with the class at all times and not be left behind. Should you need help, seek it immediately. I am here to help you learn.

 

Use the wonderful facilities in the Arthur Andersen Laboratory for Accounting Information Systems. Enjoy!

 

Administrivia

Semester: Fall, 2003
Time:
Th 5:45 — 7:05 PM
Room:
BA 214 & Accounting Systems Lab
Instructor:
Jagdish S. Gangolly   
Office:
BA 365C
Phone:
(518) 442-4949
Fax:
(707) 897-0601
Office Hours: Th 4:45 – 5:45 PM, or by appointment
Instructor Homepage: http://www.albany.edu/acc/gangolly

Course Homepage: http://www.albany.edu/acc/courses/acc522.fall2003
Newsgroup:
sunya.class.acc522

 

 

Class Conduct:

The course consists of lectures, problem solving, and discussion of homework and book assignments. You are expected to do the readings well ahead of the class. Class time is to be used for the clarification of any doubts that you may have. Do not expect to merely listen to the instructor and gain knowledge. Applied statistics is a practical field backed by robust theory. A good understanding of the theory and its use in practice is essential to excel in the field. This is a hands-on course, and you are required to demonstrate competence in the topics covered in order to receive an acceptable grade. I shall be giving occasional homework assignments. I shall also be calling upon some of you to come to the board and discuss problems from the textbooks, other sources, or assigned homework.

Software:
I shall be using the S-Plus system running under Windows 2000. When using the Windows version in class, I shall mostly be using command line mode and not the Windows user interface, except for some graphics.

Newsgroup/e-mail:
We shall be using the class newsgroup (sunya.class.acc522) extensively for making announcements regarding tests, homework, quizzes, added links to this course homepage, etc. In fact, the newsgroup will be the primary means of communication between us outside of the class. You should post to the newsgroup all your questions and doubts for clarification. You are strongly encouraged to answer queries posted by others, and such responses will count towards class participation points for grading. You should communicate with me via e-mail only for individual problems and questions.


Graduate Laboratory for Accounting Information Systems Access:
As a graduate student in the Department, you have access to the Arthur Andersen Laboratory. You will need to get the password to enter the lab from Ms. Lisa Scholz. Contact her in BA 365 as soon as possible. Should you have special requirements for software (DBMS servers) or hardware (Windows 2000 Servers) for your projects, let me know, and arrangements will be made for your access. You can obtain a free copy of the S-Plus software by filling in the form at http://www.albany.edu/its/software/SWRequestForm.html



 

Course Objectives:

·            Understanding of exploratory data analysis

·            Understanding of the language S-Plus

·            Understanding of basic traditional statistics

·            Understanding of Bayesian analysis and networks

·            Understanding of the basics of multivariate methods

 

 

Catalog Description:

Extensive coverage of sampling techniques for decision making. Includes simple random sampling, stratified sampling, cluster sampling, treating unequal clusters, area sampling, imperfect frames, questionnaire design, and field operations.
Prerequisite: Msi 220 or Mat 108 or equivalent.

 

 

 

An Honest Description:

Gangolly:
Data acquisition and preprocessing for statistical analysis. Exploratory descriptive data analysis using the language S-Plus. Basic graphics commands in S-Plus including trellis graphics. Descriptive data exploration and statistical modeling. Data preprocessing for Datamining & Concept Description. Data Cleaning, Data Integration & Transformation; Concept hierarchy generation. Association Rules in Large Databases. Classification & Prediction. Multivariate Methods: Clustering & other multivariate statistical methods. Fundamentals of Probability & Introduction to Bayesian Decision Theory: Probabilities: joint, conditional & marginal; Bayes’ Theorem and Likelihood ratio. Nomenclature of decision trees (or Bayesian Networks or Influence Diagrams). The construction of trees, the method of folding back, and the conversion of given probabilities to usable probability metrics.
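As a rough illustration of the method of folding back mentioned above, the following is a minimal sketch in Python (not S-Plus, which the course itself uses). The tree, the option labels, and all payoffs and probabilities are hypothetical numbers invented for the example. Chance nodes are replaced by their expected value, and decision nodes by the value of their best option, working from the leaves back to the root.

```python
def fold_back(node):
    """Return the value of a decision-tree node by folding back from the leaves."""
    kind = node["kind"]
    if kind == "payoff":
        # Leaf: the payoff is the value.
        return node["value"]
    if kind == "chance":
        # Chance node: expected value over its branches.
        return sum(p * fold_back(child) for p, child in node["branches"])
    if kind == "decision":
        # Decision node: value of the best available option.
        return max(fold_back(child) for _label, child in node["options"])
    raise ValueError(f"unknown node kind: {kind!r}")


def best_option(node):
    """Return the label of the option with the highest folded-back value."""
    label, _child = max(node["options"], key=lambda option: fold_back(option[1]))
    return label


def payoff(value):
    """Helper to build a leaf node."""
    return {"kind": "payoff", "value": value}


# Hypothetical audit decision: payoffs are negative costs (in thousands).
tree = {
    "kind": "decision",
    "options": [
        ("extend testing", {
            "kind": "chance",
            "branches": [(0.3, payoff(-20.0)), (0.7, payoff(-5.0))],
        }),
        ("accept as is", {
            "kind": "chance",
            "branches": [(0.3, payoff(-50.0)), (0.7, payoff(0.0))],
        }),
    ],
}

value = fold_back(tree)
choice = best_option(tree)
print(choice, round(value, 2))
```

With these made-up numbers, extending testing has an expected cost of about 9.5 against about 15 for accepting as is, so folding back selects the first option.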

 

Textbooks and Readings:

·       The Basics of S and S-Plus (Statistics and Computing)
Andreas Krause, Melvin Olson
3rd edition (2002)
Springer Verlag
ISBN: 0-387-95456-2

·       Visual Data Mining: Techniques and Tools for Data Visualization and Mining
Tom Soukup, Ian Davidson

John Wiley Publishing, Inc. (2002)
ISBN: 0-471-14999-3

 

I shall also be placing materials on reserve in the library and/or providing links on this course page as the semester progresses.

 

 

Requirements

The classes will consist of lectures, solution of problems, discussion of papers, and programming exercises. I shall be dividing the class into groups of 3 each, balanced in terms of skills in accounting, programming, facility with computers, mathematical maturity, needs of the projects selected, and other such attributes. The groups will work throughout the semester on three substantial homework projects, with each group member taking a turn as the lead on an assignment.

 

Grading

The final course grade is dependent on the following factors:

·       100 points: Test (In class open book/notes. Details will be announced in the class and updated here)

·       100 Points: Group Project & written report

·       0 - 50 points: Pop-quizzes, when given

·       25 points: Class participation

·       225 - 275 points: Total points (max)

The final course grade is strictly relative, based on the total points scored.

Grades, once assigned, cannot be changed except in case of errors in grading. Under no circumstances is it possible to do extra credit work to improve the grade.

 

 

About the Instructor:

Jagdish S. Gangolly is currently an Associate Professor of Accounting and of Management Science & Information Systems, Director of Graduate Accounting Programs in the School of Business, and the Interim Director of the Ph.D. Program in Information Science at the School of Information Science & Policy at the State University of New York at Albany. He is also an affiliate and advisor at the Institute for Informatics, Logic & Security Studies at SUNY Albany. He holds a Bachelor's degree with a major in Mathematical Statistics, a master's degree with a major in Operations Research, and a Ph.D. degree in Business Administration (Accounting). He is also a Certified Internal Auditor. He has previously taught at the University of Pittsburgh, University of Kansas, Claremont McKenna College & the Claremont Graduate School, and California State University at Fullerton. He has worked in senior executive positions in management services in the pulp & paper industry as well as in soft-drink franchising in India. His papers have appeared in Journal of Accounting Research, Auditing: Journal of Practice & Theory, Journal of the Operational Research Society, Critical Perspectives on Accounting, Expert Systems with Applications: An International Journal, Artificial Intelligence in Accounting & Auditing, International Journal of Digital Accounting Research, and the New Review of Applied Expert Systems & Emerging Technologies. In 1989, he was the guest editor of Advances in Accounting. He currently serves on the editorial boards of the American Accounting Association journals Issues in Accounting Education and the Journal of Emerging Technologies in Accounting, and of the International Journal of Digital Accounting Research, and is an Associate Editor of the e-Services Journal. He also serves on the E-Commerce Curriculum Committee of the International Federation for Information Processing (IFIP).
His current research activities are primarily in the areas of conceptual information organisation, markup languages supporting electronic commerce, and the formal specification of control in accounting information systems. He also has collateral research interest in the relationships between Accounting and Legal Philosophy.


 


 

Tentative Schedule

 

September 11, 2003

Theme: Data preprocessing for Statistical Analysis & Introduction to S-Plus

Topics: Unix shell scripting, tr, join; ftp, character coding, text processing.

Readings: Unix man pages for tr, join, ftp; basic use of emacs or vi editors, etc.

To Do: Homework 1. (Due October 1, 2003)

 

September 18, 2003

Theme: Data  & Graphics in S-Plus I

Topics:  Matrices, Frames, Arrays, and their manipulation; Basic graphics commands in S-Plus; Trellis graphics

Readings: KO: Ch 3 – 6.

 

September 25, 2003

Theme: Data  & Graphics in S-Plus II

Topics:  Matrices, Frames, Arrays, and their manipulation; Basic graphics commands in S-Plus; Trellis graphics

Readings: KO: Ch 3 – 6.

 

October 2, 2003

Theme: Descriptive Data Exploration & Statistical Modeling

Topics: Use of S-Plus Graphics; Regression.

Readings: KO: Ch 7 – 8.

To Do: Homework 2 (Due October 31, 2003)

 

October 9, 2003

Theme: Fundamentals of Probability & Introduction to Bayesian Decision Theory I

Topics: Probabilities: joint, conditional & marginal; Bayes’ Theorem and Likelihood ratio. Bayesian analysis

Readings: Readings on Probability Concepts and Decision Trees to be distributed.
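As a small worked illustration of Bayes' Theorem and the likelihood ratio listed in the topics above, here is a minimal sketch in Python (the course itself uses S-Plus). The function names and the audit probabilities are hypothetical, chosen only to make the arithmetic concrete.

```python
def posterior(prior, p_e_given_h, p_e_given_not_h):
    """P(H | E) for a binary hypothesis H, by Bayes' Theorem."""
    joint_h = prior * p_e_given_h                  # joint: P(H and E)
    joint_not_h = (1.0 - prior) * p_e_given_not_h  # joint: P(not H and E)
    marginal_e = joint_h + joint_not_h             # marginal: P(E)
    return joint_h / marginal_e                    # conditional: P(H | E)


def likelihood_ratio(p_e_given_h, p_e_given_not_h):
    """How much more likely the evidence is under H than under not H."""
    return p_e_given_h / p_e_given_not_h


# Hypothetical numbers: H = "the account is misstated",
# E = "an analytical procedure flags the account".
prior = 0.05
p_flag_given_misstated = 0.90
p_flag_given_clean = 0.10

post = posterior(prior, p_flag_given_misstated, p_flag_given_clean)
lr = likelihood_ratio(p_flag_given_misstated, p_flag_given_clean)

# Odds form of Bayes' Theorem: posterior odds = prior odds * likelihood ratio.
prior_odds = prior / (1.0 - prior)
posterior_odds = prior_odds * lr
post_from_odds = posterior_odds / (1.0 + posterior_odds)

print(round(post, 4), round(lr, 2), round(post_from_odds, 4))
```

Both routes give the same posterior: the probability form works through the joint and marginal probabilities, while the odds form simply scales the prior odds by the likelihood ratio.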

 

 

October 16, 2003

Theme: Fundamentals of Probability & Introduction to Bayesian Decision Theory II

Topics: Probabilities: joint, conditional & marginal; Bayes’ Theorem and Likelihood ratio. Bayesian analysis

Readings: Readings on Probability Concepts and Decision Trees to be distributed.

 

October 23, 2003

Theme: Fundamentals of Probability & Introduction to Bayesian Decision Theory III

Topics: Probabilities: joint, conditional & marginal; Bayes’ Theorem and Likelihood ratio. Bayesian analysis

Readings: Readings on Probability Concepts and Decision Trees to be distributed.

 

October 30, 2003

Theme: Multivariate Methods I: Clustering

Topics: Hierarchical & Partitioning methods for Clustering, and their use in S-Plus.

Readings: Clustering in S-Plus online manuals. Additional readings to be distributed.

 

November 6, 2003

Theme: Multivariate Methods II: Other Methods

Topics:  Multi-Dimensional Scaling, etc.

Readings: S-Plus online manuals.

 

 

November 13, 2003

Theme: Multivariate Methods III: Other Methods

Topics:  Multi-Dimensional Scaling, etc.

Readings: S-Plus online manuals.

 

November 20, 2003

TEST (80 minutes)

Theme: Auditing Applications

Readings: Papers to be assigned

 

November 27, 2003

No Class (Thanksgiving)

 

December 4, 2003

Theme: Project Presentations

 

 

Updated on September 8, 2003 by Jagdish S. Gangolly (j.gangolly@albany.edu)