
DR-NTU (Data) User Guides and Policies

TwoRavens: Tabular Data Exploration

Please note: The TwoRavens project has a more recently published user guide on their site.

Exploring and Analyzing Tabular files in Dataverse
On the files tab, click on the “Explore” button to initiate TwoRavens Data Exploration and Analysis Tool.

 

Selection/Left Panel:
The left panel contains two sets of buttons: (1) Original Data and Subset Data; and (2) Variables, Models, and Summary.

 

Original Data and Subset Data:
When TwoRavens is initiated, you begin with the original dataset, and thus the Original Data button is selected. You will not be able to select the Subset Data until you have subsetted the data using the subset and select features in the right panel. After you have selected a subset of your data, you may toggle between that subset and the original data. If you wish to select a different subset, you may do so, but note that only one subset is supported at a time.

 

Variables, Models, and Summary:
Each of these tabs displays something different in the left panel when selected. The Variables tab shows a list of all the variables. Hover over a variable name to see that variable’s summary statistics. The first three variables are selected by default and displayed in the center panel, but you may add or remove variables by clicking on their names in the Variables tab.

The Models tab displays a list of Zelig models that are supported by TwoRavens. A brief model description is visible when hovering on the model name. Depending on the level of measurement of the dependent variable (continuous, ordinal, dichotomous, etc.), particular models may or may not be appropriate.

Note that to estimate a model, you must select one from the list. Currently, please use only Ordinary Least Squares (ls), as other models are not yet available.

The Summary tab shows summary statistics when you hover over a pebble. When the pointer leaves the pebble, the left panel returns to whichever tab was selected last. So, if you wish to keep the Summary tab displayed regardless of where the pointer is, click on Summary before hovering over a pebble. Otherwise, if Variables or Models was the last tab selected, the panel returns to Variables or Models once the pointer is no longer hovering over a pebble.

 

Modeling/Center Panel:
The center panel displays a graphical representation of variable relations and denotes variables that have been tagged with certain properties. Variables may be tagged as either a dependent variable, a time series variable, or a cross sectional variable. Each of these is accomplished by clicking on the appropriate button at the top of the screen, and then clicking on a pebble in the center panel. You’ll notice that when a variable is tagged with a property, the fill color becomes white, and the outline (or stroke) of the pebble turns the color of the property’s button. Note that to estimate a model, the dependent variable must be selected.

Variable relations are specified by point-click-drag from one pebble to the other. When a path between pebbles has been specified, it is visually presented with a dotted arrow line and may be removed by pushing the delete key on your keyboard.

 

Results/Right Panel:
This panel lets you specify a subset of the data on which to estimate the model, select a subset to see updated summary statistics and distributions, and set covariate values for the Zelig simulations (this is Zelig’s setx function).

 

Subset
To subset the data, click on the subset button and highlight (or brush) the portion of the distribution that you wish to use. You may remove the selection by clicking anywhere on the plot that is outside of the selected area. Or, if you wish to move the selected region, click inside the selection and move it to the left or right. If no region is selected, then by default the full range of values is used. If more than one variable is selected for subsetting, only the overlapping region is used in the model. If there are no overlapping regions (i.e., if the subset would contain no data), then only the first variable is used. Notice that the range (or extent) of the selection for each variable is displayed below.
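The overlap rule described above can be sketched in a few lines of Python. This is only an illustration of the stated behavior, not TwoRavens' actual code: the function names, the row format (a list of dictionaries), and the choice of inclusive range bounds are all assumptions made for the example.

```python
from typing import Dict, List, Tuple

Row = Dict[str, float]
Range = Tuple[float, float]


def in_range(value: float, bounds: Range) -> bool:
    """True if value lies inside the brushed range (inclusive bounds assumed)."""
    low, high = bounds
    return low <= value <= high


def subset_rows(rows: List[Row], brushes: Dict[str, Range]) -> List[Row]:
    """Keep the rows that fall inside every brushed range.

    brushes maps a variable name to its selected (low, high) range,
    in the order the variables were brushed.
    """
    if not brushes:
        # No region selected: the full range of values is used.
        return list(rows)

    overlap = [
        row for row in rows
        if all(in_range(row[var], bounds) for var, bounds in brushes.items())
    ]
    if overlap:
        return overlap

    # No overlapping region: fall back to the first brushed variable only.
    first_var, first_bounds = next(iter(brushes.items()))
    return [row for row in rows if in_range(row[first_var], first_bounds)]
```

For example, brushing both an `age` range and an `income` range keeps only the rows that satisfy both; if no row does, the result is the rows inside the `age` range alone.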

With a region selected, you have two options. First, you may click the Estimate button to estimate a model using only the specified subset. Second, you may click the Select button. This will not estimate a model, but it will subset the data and return new summary statistics and plots for the subset. You may wish to use this feature to see how a subset will change the Set Covariate (Zelig’s setx) defaults, for example. After selecting a subset, you may toggle back and forth between the subsetted data and the original data by activating the appropriate button in the left panel.

 

Set Covariates
The Set Covariates button plots the distribution of each pebble with an additional axis that contains two sliders, each of which defaults to the variable’s mean. This is TwoRavens’ equivalent of Zelig’s setx function. Move these sliders to the left or right to set your covariates at the desired value prior to estimation. Notice that the selected values appear below each plot.

After clicking the Estimate button, the model will be estimated and, upon completion, results appear in the Results tab. The results include figures produced by Zelig (and eventually the equation that has been estimated, the R code used to estimate the model, and a results table).
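To make the idea of setting a covariate more concrete, here is a minimal Python sketch of a one-covariate least squares fit evaluated at a chosen covariate value. It is not Zelig or TwoRavens code: the function names and the sample data are hypothetical, and it returns only a point prediction, whereas Zelig produces simulated quantities of interest.

```python
from statistics import mean
from typing import List, Tuple


def fit_simple_ols(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Fit y = intercept + slope * x by ordinary least squares."""
    x_bar, y_bar = mean(x), mean(y)
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def predict(intercept: float, slope: float, x_value: float) -> float:
    """Expected value of y when the covariate is set to x_value."""
    return intercept + slope * x_value


# Hypothetical data, for illustration only.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]

intercept, slope = fit_simple_ols(x, y)

# A slider that is left untouched corresponds to the covariate's mean.
default_setting = mean(x)
at_default = predict(intercept, slope, default_setting)

# Moving the slider corresponds to evaluating the model at another value.
at_chosen = predict(intercept, slope, 4.0)
```

Comparing `at_default` with `at_chosen` shows how the expected outcome shifts when a covariate is moved away from its mean.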

 

Additional Buttons:
Estimate
This executes the specified statistical model. The “Estimate” button is highlighted in blue while the process is running and turns green upon completion. Note: you cannot estimate without first selecting a dependent variable and a model.

 

Force
The Force button allows you to control the layout of the pebbles. To use this feature, first make sure none of the pebbles are highlighted. If one is, simply click on it to remove the highlighting. Second, press and hold the control key. Third, while holding down the control key, click the Force button. Fourth, continue to hold the control key and click on a pebble. You may now release the control key. Click on a pebble and drag it around on your screen.

 

Reset
This is your start over button. Clicking this is equivalent to reloading the Web page or re-initiating TwoRavens.

WorldMap: Geospatial Data Exploration

WorldMap
WorldMap is open source software developed by the Center for Geographic Analysis (CGA) at Harvard that helps researchers visualize and explore their data in maps. The WorldMap and Dataverse collaboration allows researchers to upload shapefiles to Dataverse for long-term storage, receive a persistent identifier (a DOI), and easily move into WorldMap to interact with the data.

Note: WorldMap hosts their own user guide that covers some of the same material as this page.


What is Geoconnect?
Geoconnect is a platform that integrates Dataverse and WorldMap, allowing researchers to visualize their geospatial data. Geoconnect can be used to create maps of shapefiles or of tabular files containing geospatial information. Geoconnect is an optional component of Dataverse, so if you are interested in this feature but don’t see it in the installation of Dataverse you are using, you should contact the support team for that installation and ask them to enable the Geoconnect feature.

If a data file’s owner has created a map of that data using Geoconnect, you can view the map by clicking the “Explore” button. If the data is in the form of a shapefile, the button takes you right to the map. If it’s a tabular file, the Explore button will be a dropdown, and you’ll need to select “WorldMap”.

 

Uploading Shapefiles to Dataverse
To get started, you will need to create a dataset in Dataverse. For more detailed instructions on creating a dataset, read the Dataset + File Management portion of this user guide.

Dataverse recognizes ZIP files that contain the components of a shapefile and will ingest them as a ZIP. Once you have uploaded a ZIP file containing a shapefile’s components, a Map Data button will appear next to the file in the dataset.


Mapping shapefiles with Geoconnect
Geoconnect is capable of mapping shapefiles which are uploaded to Dataverse in .zip format. Specifically, Dataverse recognizes a zipped shapefile by:

  1. Examining the contents of the .zip file
  2. Checking for the existence of four similarly named files with the following extensions: .dbf, .prj, .shp, .shx
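If you want to check a .zip file before uploading it, the two steps above can be approximated with Python's standard `zipfile` module. This is a rough sketch of the check as described here, not Dataverse's actual implementation; the function name `find_shapefile_sets` is hypothetical, and treating "similarly named" as "same path and base name, ignoring extension case" is an assumption.

```python
import io
import zipfile
from pathlib import PurePosixPath
from typing import Dict, List, Set

REQUIRED_EXTENSIONS = {".dbf", ".prj", ".shp", ".shx"}


def find_shapefile_sets(zip_source) -> List[str]:
    """Return the base names that have all four required shapefile parts.

    zip_source may be a file path or a binary file-like object.
    """
    parts_by_stem: Dict[str, Set[str]] = {}
    with zipfile.ZipFile(zip_source) as archive:
        # Step 1: examine the contents of the .zip file.
        for name in archive.namelist():
            path = PurePosixPath(name)
            suffix = path.suffix.lower()
            if suffix in REQUIRED_EXTENSIONS:
                stem = str(path.with_suffix(""))
                parts_by_stem.setdefault(stem, set()).add(suffix)
    # Step 2: keep only names for which all four extensions are present.
    return sorted(
        stem for stem, found in parts_by_stem.items()
        if found == REQUIRED_EXTENSIONS
    )


# Build a small in-memory archive to try the check on.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    for ext in (".dbf", ".prj", ".shp", ".shx"):
        archive.writestr("roads" + ext, b"")
    archive.writestr("readme.txt", b"notes")
demo.seek(0)

print(find_shapefile_sets(demo))  # prints ['roads']
```

An empty result means that no complete set of the four files was found in the archive.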

Once you have uploaded your .zip shapefile, a Map Data button will appear next to the file in the dataset. In order to use this button, you’ll need to publish your dataset. Once your dataset has been published, you can click on the Map Data button to be brought to Geoconnect, the portal between Dataverse and WorldMap that will allow you to create your map.

To get started with visualizing your shapefile, click on the blue “Visualize on WorldMap” button in Geoconnect. It may take up to 45 seconds for the data to be sent to WorldMap and then back to Geoconnect.

Once this process has finished, you will be taken to a new page where you can style your map through Attribute, Classification Method, Number of Intervals, and Colors. Clicking “Apply Changes” will send your map to both Dataverse and WorldMap, creating a preview of your map that will be visible on your file page and your dataset page.

Clicking “View on WorldMap” will open WorldMap in a new tab, allowing you to see how your map will be displayed there.

You can delete your map with the “Delete” button. If you decide to delete the map, it will no longer appear on WorldMap, and your dataset in Dataverse will no longer display the map preview.

When you’re satisfied with your map, you may click “Return to the Dataverse” to go back to Dataverse.

In the future, to replace your shapefile’s map with a new one, simply click the Map Data button on the dataset or file page to return to the Geoconnect edit map page.


Mapping tabular files with Geoconnect
Geoconnect can map tabular files that contain geospatial information such as latitude/longitude coordinates, census tracts, zip codes, Boston election wards, etc.

Preparing a tabular file to be mapped
1. Ingest

Geospatial tabular files need a bit of preparation in Dataverse before they can be mapped in Geoconnect. When you upload your file, Dataverse will take about ten seconds to ingest it. During the ingest process it will identify the file as tabular data.

2. Tag as Geospatial

Next, you’ll need to let Dataverse know that your tabular file contains geospatial data. Select your file, click the “Edit Files” button, and select “Tags” from the dropdown menu. This will take you to the Edit Tags menu (pictured below). Under the “Tabular Data Tags” dropdown, select “Geospatial”. Then click “Save Changes”.

3. Publish & Map Data

At this point, a “Map Data” button will appear next to your file. Publish this new version of your dataset to activate this button.

Creating the map
If your tabular file contains latitude and longitude columns, then the process is simple: those columns may be directly mapped. Otherwise, you will need to use a spatial join. Spatial joins tell WorldMap how to read your tabular data file in order to create a map that accurately represents it.

To carry out a spatial join, you’ll manually connect

  • Geospatial column(s) from your Dataverse tabular file
    • e.g., a census tract column from your table

with

  • A WorldMap “target layer” that contains the same geospatial information
    • e.g., WorldMap’s “target layer” containing census tract parameters

The following screenshots illustrate the mapping process:

1. Once you’ve pressed the “Map Data” button, you’re brought to this page:

2. Choose a Geospatial Data Type

3. Choose a column from your file to match the WorldMap Layer you selected

4. Choose from the list of WorldMap Layers available for the Geospatial Data Type you selected

5. Submit the data for mapping!

6. View Results

At this point you will be presented with a basic map that can be styled to your specifications. The example pictured below includes an error message: some of the rows could not be matched properly. In this case, you could still go forward with your map, but without the information from the unmatched rows.
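Conceptually, the join matches each row of your table to a feature in the target layer by a shared key, and rows whose key has no counterpart are left out of the map. The Python sketch below illustrates that idea only. It does not reflect how WorldMap or Geoconnect implement the join: the function name, the dictionary-based target layer, and the sample tract identifiers are all hypothetical.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def join_to_target_layer(
    rows: List[Row],
    join_column: str,
    target_layer: Dict[str, str],
) -> Tuple[List[Row], List[Row]]:
    """Split rows into those matched to a target-layer feature and the rest.

    target_layer maps a key (e.g., a census tract ID) to a geometry,
    represented here as a plain string for illustration.
    """
    matched: List[Row] = []
    unmatched: List[Row] = []
    for row in rows:
        key = str(row.get(join_column, "")).strip()
        geometry = target_layer.get(key)
        if geometry is None:
            unmatched.append(row)  # no matching feature: row is not mapped
        else:
            matched.append({**row, "geometry": geometry})
    return matched, unmatched


# Hypothetical table rows and target layer, for illustration only.
table = [
    {"tract": "000100", "population": "4200"},
    {"tract": "000200", "population": "3100"},
    {"tract": "999999", "population": "50"},
]
layer = {"000100": "POLYGON A", "000200": "POLYGON B"}

matched, unmatched = join_to_target_layer(table, "tract", layer)
print(len(matched), "rows matched;", len(unmatched), "unmatched")
```

Inspecting the unmatched rows before mapping can reveal formatting differences, such as missing leading zeros, between your column and the target layer.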

Finalizing your map
Now that you have created your map:

  • It exists on the WorldMap platform and may be viewed there – with all of WorldMap’s capabilities.
  • Dataverse will contain a preview of the map and links to the larger version on WorldMap.

The map editor (pictured above) provides a set of options you can use to style your map. Clicking “Apply Changes” saves the current version of your map to Dataverse and WorldMap. The “Return to the Dataverse” button brings you back to Dataverse. “View on WorldMap” takes you to the map’s page on WorldMap, which offers additional views and options.

If you’d like to make further changes to your map in the future, you can return to the editor by clicking the “Map Data” button on your file.

 

Removing your map
You can delete your map at any time. If you are on Dataverse, click “Map Data” and click the “Delete Map” button on the upper right. This completely removes the map and underlying data from the WorldMap platform.