The PathFinder Science Network

So, now that I have my data, what do I do with it?

Research Methods for Green Clean - Example

The following example was conducted for Jackson County, MO. Because of the nature of the dry cleaning industry in the area (many companies use a satellite network), the search of a neighboring county returned no dry cleaners for the entire county. A county search that returns zero facilities is still a valid result and should be submitted like any other data collection.

The following Excel table was copied from the report generated on the EPA's AirData site. The input variables were Jackson County, MO for location, tetrachloroethylene for pollutant, decimal notation of emissions, and 50 records per page for report format. All other options were left at the site default.

Figures 8 and 9: The initial dataset copied into Excel contains 41 rows and 10 columns, since all perc-utilizing industries are present. The final dataset has been reduced to dry cleaners only: 31 rows and 6 columns. The final data product is summed at the bottom (see highlighted region).

At this point, the data acquisition, manipulation, and calculation for the inventory assessment are complete. The final results to be submitted to the final inventory are provided below. Remember: the original work involved looking at a neighboring county in the Kansas City area that did not have any facilities.

First Query:
  • State: Kansas
  • County: Johnson
  • Number of Dry Cleaning Facilities: 0
  • Pollutant Emitted (lb/year): 0
  • % Pollutant Emitted (in this county): 0
Second Query:
  • State: Missouri
  • County: Jackson
  • Number of Dry Cleaning Facilities: 31
  • Pollutant Emitted (lb/year): 271,351
  • % Pollutant Emitted: 98.99%

Conclusions: From this data, we can conclude that 271,351 lb of perchloroethylene were emitted into the atmosphere by dry cleaning facilities in Jackson County, MO alone. If all 31 dry cleaning facilities had used "green" operations, 271,351 lb of perchloroethylene would not have been emitted into the environment. This total represents a 98.99% reduction in all perc emissions from this county.
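
The reduction from the initial dataset to the dry-cleaner-only total can also be sketched in code. The following Python example is only an illustration under stated assumptions: the record layout, the "industry" labels, and every emission value are made up, and they do not come from the AirData report.

```python
# Hypothetical records standing in for rows of the AirData report.
# Facility names, industry labels, and emission values are invented.
records = [
    {"facility": "Cleaner A", "industry": "dry cleaning", "perc_lb_per_year": 1200.0},
    {"facility": "Cleaner B", "industry": "dry cleaning", "perc_lb_per_year": 800.0},
    {"facility": "Plant C", "industry": "metal finishing", "perc_lb_per_year": 500.0},
]

def summarize_dry_cleaners(rows):
    """Return (facility count, total lb/year, percent of all perc) for dry cleaners."""
    cleaners = [r for r in rows if r["industry"] == "dry cleaning"]
    cleaner_total = sum(r["perc_lb_per_year"] for r in cleaners)
    county_total = sum(r["perc_lb_per_year"] for r in rows)
    # Guard against a county with no perc emissions at all.
    percent = 100.0 * cleaner_total / county_total if county_total else 0.0
    return len(cleaners), cleaner_total, percent

count, total, percent = summarize_dry_cleaners(records)
print(f"{count} dry cleaners, {total:,.0f} lb/year, {percent:.2f}% of county perc")
```

A county with no facilities, like the first query above, simply yields zero for all three values.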

Analysis suggestions

The following are some suggestions for ways to look at your data. These are the tools of science that we use to look for patterns in data.

Means and Extremes

The methods and number of steps used in gathering scientific knowledge may vary from one investigator to the next, but scientific methods usually involve the alternation of two types of activities, the observational and the explanatory. So far we have been collecting data about the use and type of chemicals in the dry cleaning industry, but we have not yet participated in the explanatory part of science. We can continue to explore the data we have collected to increase the accuracy of our observations. However, the data you downloaded does not really mean very much yet, because we do not know whether this data is typical, high, or low, or how it compares to "normal". We are not even quite sure what "normal" means.
Statistics refers to a set of procedures and rules for reducing our large data sets to manageable proportions and allowing us to take the next step and draw conclusions from our data. Our purpose is to increase the meaning of our observations reflected in our data so we will employ descriptive statistics and use spreadsheets. The most common numerical way to look at data is by means and extremes. The mean is the sum of the data values divided by the number of data points. In everyday language, this is the average.

See if you can calculate a mean for the data that you have. Also see if you can find the extremes (the lowest and highest values) in your data set.
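
A spreadsheet can do these calculations for you, and so can a few lines of code. The following Python sketch shows the idea; the emission values are hypothetical and are used only to make the arithmetic easy to follow.

```python
from statistics import mean

# Hypothetical perc emissions (lb/year) for five facilities; not real data.
emissions = [1200.0, 800.0, 9500.0, 3100.0, 400.0]

average = mean(emissions)   # sum of the values divided by the number of values
lowest = min(emissions)     # low extreme
highest = max(emissions)    # high extreme

print(f"mean = {average:.1f}, min = {lowest:.1f}, max = {highest:.1f}")
```

Notice that one large value (9500) pulls the mean well above most of the other values, which is one reason to look at the extremes as well as the mean.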

Visualizing Data

Graphs are one way to visualize data and to help the researcher look for patterns. A graph is used to show the relationships in data collected from an experiment. Graphs must be constructed accurately and according to accepted rules. Usually, a graph shows the relationship between two kinds of data. These data are called variables. Time is a very common independent variable. Independent variables are plotted on the horizontal axis, the x axis. In the graph below we explore the relationship between percent lichen coverage and tardigrade density. In this graph Percent Lichen Coverage is the independent variable.

The dependent variable is sometimes referred to as the outcome variable. The dependent data is plotted on the vertical axis, the y axis.

Remember when you make graphs:
1) Select scales for the horizontal and vertical axes which will reflect the precision of the measurements. Display the data in a proportional way. Remember, each square on the graph is equal to an assigned quantity but the scale of either axis may be changed if the graph is too compact or needs to be expanded.
2) It is important to label both the vertical and horizontal axes with the variables being graphed and also to indicate the units being used.

Spreadsheets will offer you graphing options for your data but it is very important that you understand the graph you have made and that the graph accurately represents your data.


While bar graphs are interesting and a good way to visualize data, they have some problems. This graph does not allow us to really explore the relationship between stomata counts and distance from the Kansas border. (Why not?)

Line graphs show the relationship between two kinds of data in which the independent variable is continuous. After the proper points are plotted on the graph, they should be connected by a line. To learn more about graphing and how to make various kinds of graphs, the DIGSTATS site should be helpful. The following is a line graph showing percent lichen coverage and tardigrade density and represents the same data presented in the bar graph.


Making a Box Plot

John Tukey developed a technique which gives greater prominence to the dispersion, the spread of the data. This method is known as a boxplot, or a box-and-whisker plot. The following is a boxplot of the data represented in the bar graph and in the line graph.
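
A boxplot is drawn from five numbers: the minimum, the lower quartile, the median, the upper quartile, and the maximum. The following Python sketch computes them using one common convention (quartiles taken as the medians of the lower and upper halves, leaving out the middle value when the count is odd). Other conventions exist and can give slightly different quartiles. The function name and the sample values are hypothetical.

```python
from statistics import median

def five_number_summary(values):
    """Return (min, lower quartile, median, upper quartile, max)."""
    data = sorted(values)
    if len(data) < 2:
        raise ValueError("need at least two values")
    half = len(data) // 2
    lower = data[:half]    # values below the median position
    upper = data[-half:]   # values above the median position
    return data[0], median(lower), median(data), median(upper), data[-1]

# Hypothetical perc emissions (lb/year); not real data.
summary = five_number_summary([1200.0, 800.0, 9500.0, 3100.0, 400.0])
print(summary)
```

The box spans the lower and upper quartiles, the line inside the box marks the median, and the whiskers reach toward the minimum and maximum.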

Using Geographic Information Systems for Analysis

A geographic information system (GIS) is a computer-based tool for mapping and analyzing things that exist and events that happen on earth. The data that we have collected as a part of this project is well suited to GIS technology because it has a critical geographic dimension. GIS integrates common database operations such as query and statistical analysis with the unique visualization and geographic analysis benefits offered by maps. These abilities distinguish GIS from other types of analysis.
Mapmaking and geographic analysis are not new, but a GIS performs these tasks better and faster than the old manual methods. Before GIS technology, only a few people had the skills necessary to use geographic information to help with decision making and problem solving. A GIS stores information about the world as a collection of thematic layers that can be linked together by geography. This simple but extremely powerful and versatile concept has proven useful for solving many real-world problems, from tracking delivery vehicles, to recording details of planning applications, to modeling global atmospheric circulation.


Geographic information systems work with two fundamentally different types of geographic models - the "vector" model and the "raster" model. In the vector model, information about points, lines, and polygons is encoded and stored as a collection of x,y coordinates. The location of a point feature, such as a bore hole, can be described by a single x,y coordinate. Linear features, such as roads and rivers, can be stored as a collection of point coordinates. Polygonal features, such as sales territories and river catchments, can be stored as a closed loop of coordinates.
The vector model is extremely useful for describing discrete features, but less useful for describing continuously varying features such as soil type or accessibility costs for hospitals. The raster model has evolved to model such continuous features. A raster image comprises a collection of grid cells rather like a scanned map or picture. Both the vector and raster models for storing geographic data have unique advantages and disadvantages. Modern GISs are able to handle both models.
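
The difference between the two models is easier to see in a small example. The following Python sketch is an illustration only: the coordinates, the grid values, and the helper function names are all hypothetical, and a real GIS stores far more detail than this.

```python
# Vector model: features stored as x,y coordinates (all values hypothetical).
bore_hole = (3.0, 4.0)                          # point: a single x,y pair
road = [(0.0, 0.0), (2.0, 1.0), (5.0, 1.0)]     # line: an ordered list of points
territory = [(0.0, 0.0), (4.0, 0.0), (4.0, 3.0), (0.0, 3.0), (0.0, 0.0)]  # polygon: closed loop

def is_closed(ring):
    """A polygon ring is closed when its first and last coordinates match."""
    return len(ring) >= 4 and ring[0] == ring[-1]

def ring_area(ring):
    """Area of a simple closed ring, computed with the shoelace formula."""
    twice_area = sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in zip(ring, ring[1:]))
    return abs(twice_area) / 2.0

# Raster model: a grid of cells, each holding one value (here a made-up soil code).
soil_grid = [
    [1, 1, 2],
    [1, 2, 2],
    [3, 3, 2],
]

def cell_value(grid, row, col):
    """Look up the value stored in one grid cell."""
    return grid[row][col]

print(is_closed(territory), ring_area(territory), cell_value(soil_grid, 2, 0))
```

The vector features describe exact shapes with a handful of coordinates, while the raster grid records a value for every cell, which suits data that varies continuously across an area.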

I would like to work with the data in a map-based format.

Using Systems Thinking and Modeling to work with data.

Models are an important part of the explanatory part of science. You have seen several models in the background material of Green Clean (for example, how dry cleaning happens). Science is a practical study of what can be observed and, from that, the prediction of what will be observed. Models support moving beyond assimilating content to actually building understanding and effectively sharing this understanding with others. Using modeling software for analysis will build your capacity for evaluating your models' congruence with reality and seeing complex interdependent relationships. Modeling is another tool of the practicing scientist. The structure of virtually any system can be represented using just a few simple symbols! Sophisticated mathematics is not required to capture sophisticated relationships. Once a model is constructed, simulations provide the opportunity to test the theories, observe results, and modify assumptions, thereby increasing your understanding of how things really work and how to make them work better.
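
To give a feel for what a simulation does, here is a very small model written in Python rather than in modeling software. It is only a sketch under assumed rules: a single "stock" gains a fixed inflow each year and loses a fixed fraction of what it held. The function name, the rules, and all of the numbers are hypothetical and are not measurements of perc in the environment.

```python
def simulate_stock(initial, inflow_per_year, loss_fraction, years):
    """Step a single stock forward one year at a time.

    Each year the stock gains a fixed inflow and loses a fixed fraction
    of whatever was present at the start of that year.
    """
    stock = initial
    history = [stock]
    for _ in range(years):
        stock = stock + inflow_per_year - loss_fraction * stock
        history.append(stock)
    return history

# Hypothetical numbers chosen only to show the behavior, not measured values.
baseline = simulate_stock(initial=0.0, inflow_per_year=100.0, loss_fraction=0.5, years=3)
no_emissions = simulate_stock(initial=0.0, inflow_per_year=0.0, loss_fraction=0.5, years=3)

print(baseline)
print(no_emissions)
```

Changing one assumption (here, setting the inflow to zero) and running the model again is the "modify assumptions, observe results" loop described above.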

As you explore your data, a number of questions will no doubt come to mind. Many of them begin with "Why", which is good because it means you are ready to really begin the explanatory part of science. This process begins with establishing and refining your questions as a research question. If you do not have much experience with this process, the Guided Research will help you begin working in the explanatory part of science. If you are ready to jump in on your own, go ahead and begin your work. If you need some helpful suggestions for how to proceed, or if you are ready to Publish My Research, this area will help you share the information you develop.

© 1996-2006 PathFinder Science