© TLTP Tutorials - Datahandler Guide

© - This guide is copyright of the TLTP History Courseware Consortium

Starting the Data Handler & Choosing a Dataset
Exploring Individual Level Data

3.1 Selecting Variables from a Dataset
3.2 Viewing and Sorting Individual Level Data
3.3 Creating Frequency Tables
3.4 Creating Contingency Tables
3.5 Filtering Data
Exploring Aggregate Level Data

4.1 Selecting Variables from a Dataset
4.2 Viewing Aggregate Level Data
4.3 Graphing Aggregate Level Data
4.4 Working with a Range of Data
4.5 Creating Weighted Indices
Leaving the Data Handler

Introduction

The Data Handler is a tool which allows primary statistical data to be explored and manipulated. It is launched from within a tutorial and allows various simple statistical operations to be performed on historical datasets. For example, data can be sorted, cross tabulated or graphed.
1.1 Types of Data You Can Analyse
The Data Handler works with two types of data:
- Individual data: This is made up of records that list information about individual entities (e.g. each person in an Urban Directory or each person admitted to a workhouse, or each crime tried in court).
- Aggregate data: This is data where all the individual entities have been counted and exist only as numbers, e.g. population figures or numbers of factories.
The Data Handler treats these types of data differently, since the operations appropriate for the two types of data are different. This will become apparent as you read through this guide, which is split up into two main sections to deal with each type of data separately.
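The difference between the two types of data can be sketched in a few lines of Python. This is only an illustration and is not part of the Data Handler: the records, the field names and the name admissions_per_year are all invented for the example.

```python
from collections import Counter

# Hypothetical individual level records: one entry per person admitted.
admissions = [
    {"year": 1841, "sex": "F", "age": 34},
    {"year": 1841, "sex": "M", "age": 61},
    {"year": 1842, "sex": "F", "age": 8},
]

# Aggregate level data: the individuals are counted and only the totals remain.
admissions_per_year = Counter(record["year"] for record in admissions)

print(dict(admissions_per_year))  # {1841: 2, 1842: 1}
```

Once the data has been aggregated in this way, the details of each individual (here sex and age) can no longer be recovered, which is why the two types of data call for different operations.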
This guide will help you get up and running with the Data Handler. All you need to know is how to use the "mouse." The rest is explained on screen in the form of sequential steps. You will find that there is "fly-by" help on all icons and boxes in the Data Handler.
Starting the Data Handler - Choosing a Dataset

As you work through a tutorial you will find 'button' links in the text to datasets, such as:
Births, Marriages and Burials
When you click on such a button in a tutorial the Data Handler will load together with the specific dataset.
You will then see the Main Data Handler screen, shown in figure 1 below.

Figure 1: Main Data Handler Screen - Choosing a Dataset

You will see that there are step by step instructions on screen.
Step 1 gives you the opportunity to open a new dataset, but the particular dataset you have launched will be displayed in the "Dataset Collection" box. In the diagram above this dataset is called "All Cities."
Step 2 shows you the tables contained within the particular dataset collection. Sometimes there may be several tables included - as in the example above, which contains tables for different cities - but often there is only one table. If there are several tables, step 2 allows you to select the one you are interested in.
Once you have selected a table, the fields or variables which make up that table are displayed in step 3. Step 3 allows you to select the variables which you would like to work with. The number of variables you can select, and the operations you can perform on them are dependent on the type of data being dealt with.
We will now look at what you can do with the different types of data, beginning with individual level data (aggregate data is dealt with in section 4).
Exploring Individual Level Data

3.1 Selecting Variables from a Dataset
When working with individual level data, you can only select one or two variables to work with. This is done in step 3, where the screen displays two boxes - "Variable 1" and "Variable 2," as shown in figure 2.

Figure 2: Main Data Handler screen: Selecting variables - Individual data

Underneath "Variable 1" a list of all the variables or fields in the dataset you have selected is displayed. If you just want to select one variable to work with, click on that variable in this list. The selected variable will then be displayed in the "Variable 1" box. If you want to work with two variables, select a second variable from the list underneath "Variable 2." In the example in figure 2, the variables Sex and Punishment have been selected.
If you want to alter your selection, choose the Clear Selection button, which wipes the current selection and allows you to select new variables.
NB - You need only select variables if you want to graph the data. If you just want to have a look at the data, using View/Sort, then all variables will be displayed anyway, so there is no need to select one or two variables.
3.2 Viewing and Sorting Individual Level Data
Step 4 on the main Data Handler screen provides you with the View/Sort button, which allows you to examine the data in tabular form to gain a feel for the character of the data. The View/Sort screen is shown below:

Figure 3: The View/Sort Screen

In View/Sort mode you can view and scroll around the entire dataset as a "table" using the vertical and horizontal scroll bars to the right and at the bottom of the screen, or you can select just one or more variables to view.
You can sort the data, for example by year or surname, and can search for a particular variable value.
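For readers who find it helpful to see the idea in code, sorting and searching can be sketched as follows. The records and field names are hypothetical, and the sketch says nothing about how the Data Handler itself carries out these operations.

```python
# Hypothetical records; the field names are illustrative only.
records = [
    {"surname": "Smith", "year": 1851, "occupation": "Weaver"},
    {"surname": "Adams", "year": 1849, "occupation": "Clerk"},
    {"surname": "Jones", "year": 1851, "occupation": "Weaver"},
]

# Sort by year, then by surname within each year.
by_year = sorted(records, key=lambda r: (r["year"], r["surname"]))

# Search for every record holding a particular value of one variable.
weavers = [r for r in records if r["occupation"] == "Weaver"]

print([r["surname"] for r in by_year])  # ['Adams', 'Jones', 'Smith']
print([r["surname"] for r in weavers])  # ['Smith', 'Jones']
```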
The View/Sort screen does not allow you to count up variables to see how many times they occur or to crosstabulate one variable with another - for example, age with sex. To perform these kinds of operation, you need to close the View/Sort screen (click on the red cross button shown below).

You can then use the Graph/Table button to create either a frequency table or contingency table. These are described below.
3.3 Creating Frequency Tables
A frequency table is the type of table you get if you choose one variable. It is simply a list of all values of a particular field and how many times each value appears. For example, a frequency table of occupations will show you how common or uncommon each occupation is.
To create a frequency table, select one variable from the Main Data Handler screen, as shown in Figure 2. Then select Graph/Table. The resulting screen will show the frequency table on the left and a graph on the right. Figure 4 below shows a typical frequency table/graph. In this example, the data relates to inmates in a workhouse and the variable selected is "Age Category."

Figure 4: The Table/Graph Screen (frequencies)

The Functions box at the top of the screen allows you to alter the appearance of the table you have created. You can sort the variable selected (alphabetical/ascending/descending) or add percentages to give the relative importance of each value as well as the absolute value. Choosing "percentages" will add a new column ("% Count") to the table showing the percentage values.
The Graph types box allows you to choose a different type of graph. The default is a vertical bar chart, but you can change this to a horizontal bar chart, pie chart, line graph or scatter graph.
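The calculation behind a frequency table with percentages can be sketched in Python as below. The age categories are made-up values, and the names counts and frequency_table are chosen for the example only; the percentage column corresponds to the "% Count" column described above.

```python
from collections import Counter

# Hypothetical "Age Category" values, one per workhouse inmate.
age_categories = [
    "Adult", "Child", "Adult", "Elderly",
    "Adult", "Child", "Adult", "Elderly",
]

counts = Counter(age_categories)
total = sum(counts.values())

# Sort descending by count and add a percentage column.
frequency_table = [
    (value, count, 100 * count / total)
    for value, count in counts.most_common()
]

for value, count, percent in frequency_table:
    print(f"{value:<10}{count:>5}{percent:>8.1f}")
```

The percentages always sum to 100, so they show the relative importance of each value whatever the size of the dataset.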
3.4 Creating Contingency Tables
A contingency table is the type of table you get if you choose two variables, to test whether or not there is a relationship between the two variables you choose. The values of two fields are counted rather than one - for example occupation by sex, age by wealth etc.
To create a contingency table (also referred to as a crosstabulation) select two variables from the Main Data Handler screen, as shown in Figure 2. Then choose Graph/Table. The example screen below shows some data from inmates in a workhouse once more, with the two variables "Sex" and "Age Category" chosen:

Figure 5: The Graph/Table Screen: (Contingency tables)

The contingency table is displayed on the left and the graph on the right.
The functions and graph types available for contingency tables are different from those available for frequency tables, reflecting the different nature of the data. The only Function available is the percentage function, whilst the Graph Types consist of Stacked %, Bar (vertical), Bar (horizontal), Line and Scatter. In the above example, a stacked percentage has been chosen, since where there are many values this type of chart can look clearer than the default, which is a bar chart.
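A contingency table counts combinations of two values rather than single values. The sketch below shows one way of doing this in Python with invented data. It assumes that the percentage function works row by row, giving the share of each sex within an age category; the guide does not say which way the Data Handler calculates its percentages.

```python
from collections import Counter

# Hypothetical (Sex, Age Category) pairs, one per inmate.
inmates = [
    ("F", "Adult"), ("M", "Adult"), ("F", "Child"),
    ("M", "Elderly"), ("F", "Adult"), ("M", "Child"),
]

# Count each combination of the two variables.
cell_counts = Counter(inmates)
sexes = sorted({sex for sex, _ in inmates})
ages = sorted({age for _, age in inmates})

# Rows are age categories, columns are sexes.
table = {age: {sex: cell_counts[(sex, age)] for sex in sexes} for age in ages}

# Row percentages (an assumption): the share of each sex within an age category.
row_percent = {
    age: {sex: 100 * n / sum(row.values()) for sex, n in row.items()}
    for age, row in table.items()
}

print(table["Adult"])  # {'F': 2, 'M': 1}
```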
3.5 Filtering Data
You may find that you do not always want to work with every value in a particular field. For example, if you have a dataset with a "Year" field which provides annual statistics from 1700-1900, but you are only interested in graphing 1700-1750, the filter function allows you to "filter" out the values you do not want and lets you specify the dates you are interested in.
Sometimes you may find that you have to "filter" the data, because you have chosen a field with so many values that the Data Handler is unable to deal with them all. If this happens a warning sign will appear stating that "you are trying to graph too many categories. Please use a filter to reduce the number."
To filter data, click on the filter button from the Main Data Handler screen. The Filter dialogue box, pictured below, will then appear. Follow the steps shown.

Figure 6: The Filter Dialogue Box
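In outline, filtering means keeping only those records whose value falls within the limits you specify. A minimal Python sketch, using an invented annual series and a hypothetical helper called filter_years, might look like this:

```python
# Hypothetical annual series with a "year" field; the burial figures are invented.
rows = [{"year": year, "burials": 100 + year % 7} for year in range(1700, 1901)]

def filter_years(rows, start, end):
    """Keep only the rows whose year lies between start and end inclusive."""
    return [row for row in rows if start <= row["year"] <= end]

subset = filter_years(rows, 1700, 1750)
print(len(rows), len(subset))  # 201 51
```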
Exploring Aggregate Level Data

4.1 Selecting Variables from a Dataset
Selecting variables for this type of data is exactly the same as for individual data, except that when working with aggregate level data you can choose to work with as many variables as you like. From the Main Data Handler screen, the variables you select from the left-most box in Step 3 are displayed on the right in the Variables Selected box, as shown below.

Figure 7: Main Data Handler Screen: Selecting variables - Aggregate data

4.2 Viewing Aggregate Level Data
Viewing this type of data using the View/Sort option is exactly the same as for individual data, as described in section 3.2. Sorting the data will be less useful than with individual data, but viewing the overall table structure can help in giving a feel for the kind of data/values which are being worked with. This is also the only way to see the data in table format, since no table is shown when the data are graphed.
NB - to remind yourself of the difference between individual and aggregate data, see section 1.1.
4.3 Graphing Aggregate Level Data
To graph the variable or variables you have selected, choose the Graph option from the Main Data Handler screen. The resultant graph will look something like the one shown in figure 8 below.

Figure 8: The Graph Screen - Aggregate Data

The Functions box at the top of the screen allows you to apply basic statistical operations to your graph. The options are:
- Moving Average: This function is only appropriate for a line graph. It creates a new, smoother line graph based on averages. Choosing this option gives the dialogue box below. Enter the number of points each average should be based upon. For example, if you enter the number 2, the Data Handler will create a new line graph in which the first point is the average of the first two points in the original graph, the second point is the average of the third and fourth original points, the third point is the average of the fifth and sixth original points, and so on.
- MinMax: When applied to a line graph, this draws horizontal lines to indicate where the minimum and maximum values lie.
- Standard Deviation: A measure of dispersion, showing how widely the values are spread around their mean.
- Best Fit: When applied to a line graph, this draws in the line of best fit.
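The four functions listed above can be sketched in Python as follows. The data and the helper name block_averages are invented for the illustration. The averaging follows the description in this guide, in which successive groups of points do not overlap. Two further assumptions are made: the standard deviation is taken over the whole series (the population form), and the line of best fit is an ordinary least-squares line. The guide does not say which methods the Data Handler uses.

```python
import statistics

# Hypothetical yearly values for one aggregate variable.
years = [1700, 1701, 1702, 1703, 1704, 1705]
values = [10.0, 14.0, 12.0, 18.0, 20.0, 16.0]

def block_averages(series, n):
    """Average each successive group of n points, as the guide describes."""
    return [
        sum(series[i:i + n]) / n
        for i in range(0, len(series) - n + 1, n)
    ]

smoothed = block_averages(values, 2)        # [12.0, 15.0, 18.0]
lowest, highest = min(values), max(values)  # 10.0 and 20.0
spread = statistics.pstdev(values)          # population standard deviation (assumed)

# Least-squares line of best fit (assumed): value = intercept + slope * year.
mean_x = statistics.mean(years)
mean_y = statistics.mean(values)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(years, values))
    / sum((x - mean_x) ** 2 for x in years)
)
intercept = mean_y - slope * mean_x

print(smoothed, lowest, highest)
```

With six original points and averages based on 2, the smoothed line has only three points, which is why the new graph looks less jagged than the original.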
The Graph Types box allows you to choose a different graph type from the default, which is a line graph. The other types of graph available are:
Log/Lin (a graph based around the logarithm values), Area/Percentage, Scatter, Bar (vertical) and Bar (horizontal).
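To see why a graph of logarithm values can be useful, consider the sketch below. The population figures are invented and the choice of base 10 is an assumption, since the guide does not say which base the Data Handler uses. A series that grows by the same proportion at each step rises in equal steps once logarithms are taken, so it appears as a straight line on a Log/Lin graph.

```python
import math

# Hypothetical population figures growing by a constant 50% per step.
population = [1000, 1500, 2250, 3375]

# Base-10 logarithms of the values (the base is an assumption).
log_values = [math.log10(v) for v in population]

# Constant proportional growth gives equal steps between successive logs.
steps = [b - a for a, b in zip(log_values, log_values[1:])]
print([round(s, 4) for s in steps])  # [0.1761, 0.1761, 0.1761]
```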
4.4 Working With a Range of Data
When creating a graph of aggregate data, you may not wish to graph all the values. For example, your dataset may cover 1700-1800, but you may only be interested in 1700-1750. To choose start and end dates, select the Range function from the Main Data Handler screen, as shown below:

Figure 9: Main Data Handler Screen - the Range Function

Choosing the Range function gives the dialogue box below, which allows you to select your range of values:

Figure 10: The Range Dialogue Box

4.5 Creating Weighted Indices
In graph mode, one choice is to construct a weighted index. This option is relevant particularly for production data, where such indices play an important part in macroeconomic debates over the extent of growth in the Industrial Revolution. For example, the Hoffmann index attributes different weights to industries according to their importance in overall industrial production. The module may also be useful for other types of "bread basket" data, e.g. prices or wages.
To create a weighted index, first select Weighted Index from the main Data Handler screen. The dialogue box shown in figure 11 will appear:

Figure 11: Weighted Index Screen

Once you have chosen your weightings and selected Create Index, a graph based on your weightings, similar to the one in figure 12 below, will be created.

Figure 12: Graph based around weighted index
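The general idea of a weighted index can be sketched in Python as below. This is an illustration under stated assumptions and not a description of the Data Handler's own calculation: the industries, index values and weights are invented, and the index for each year is assumed to be a simple weighted average of the component series.

```python
# Hypothetical output indices (base year = 100) for two industries.
series = {
    "cotton": {1770: 100.0, 1780: 150.0, 1790: 300.0},
    "iron": {1770: 100.0, 1780: 120.0, 1790: 180.0},
}

# Hypothetical weights reflecting each industry's share of total production.
weights = {"cotton": 0.6, "iron": 0.4}

def weighted_index(series, weights):
    """Weighted average of the component indices for each year."""
    total_weight = sum(weights.values())
    years = sorted(next(iter(series.values())))
    return {
        year: sum(weights[name] * series[name][year] for name in series) / total_weight
        for year in years
    }

index = weighted_index(series, weights)
print(index)
```

Changing the weights changes the shape of the resulting index, which is why the choice of weightings matters so much in debates over the rate of industrial growth.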

Selecting Growth Rates from the graph screen allows you to calculate the annual growth rate between selected years. Choose the desired years, click on Calculate, and the annual growth rate is shown at the bottom of the dialogue box.

Figure 13: Calculating Growth Rates
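The guide does not state the formula used for the annual growth rate. One common definition is the compound annual rate, sketched below with a hypothetical function named annual_growth_rate; treat it as an assumption and not as the Data Handler's own method.

```python
def annual_growth_rate(start_value, end_value, start_year, end_year):
    """Compound annual growth rate between two years, as a percentage."""
    periods = end_year - start_year
    if periods <= 0 or start_value <= 0:
        raise ValueError("need end_year > start_year and a positive start value")
    return ((end_value / start_value) ** (1 / periods) - 1) * 100

# A hypothetical index that doubles over ten years grows by about 7.2% a year.
rate = annual_growth_rate(100.0, 200.0, 1780, 1790)
print(round(rate, 2))  # 7.18
```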
Leaving the Data Handler

No matter which screen you are looking at in the Data Handler, clicking on the red cross will close that screen and take you back a level. Choosing Close from the Main Data Handler screen will take you back to the point in the tutorial at which the Data Handler was launched.
