rDNA - A Package to Control Discourse Network Analyzer from R


Description
 

rDNA is an R package that integrates the content analysis software Discourse Network Analyzer (DNA) into the statistical programming environment R.

Functionality

As of version 1.29, rDNA provides seven R functions:

  1. dna.init initializes the connection between R and DNA.
  2. dna.gui starts the user interface of DNA directly from R.
  3. dna.network can import several kinds of networks directly from a .dna file into R (equivalent to the network export function in DNA).
  4. dna.attributes can import the attributes of persons or organizations from a .dna file into R (equivalent to the actor attribute export in DNA).
  5. dna.timeseries computes statement frequency time series statistics for all actors (equivalent to the time series statistics export in DNA).
  6. dna.density computes the weighted or binary within- and between-block density of matrices. Blocks are given as an optional argument.
  7. dna.categories returns a list of all categories in the .dna file as a character vector.
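
To make the notion of within- and between-block density in item 6 more concrete, here is a minimal base-R sketch. It does not call dna.density and is not rDNA's implementation: the helper name block_density, its arguments, and the toy matrix are assumptions made purely for illustration. The actual arguments and behavior of dna.density may differ (see ?dna.density).

```r
# Hypothetical helper (not part of rDNA): mean tie value within and between blocks.
# m:      square numeric matrix of tie weights
# blocks: vector giving the block of each row/column, in matrix order
block_density <- function(m, blocks, binary = FALSE) {
  stopifnot(nrow(m) == ncol(m), length(blocks) == nrow(m))
  blocks <- as.character(blocks)
  if (binary) m <- (m > 0) * 1          # dichotomize weighted ties
  diag(m) <- NA                          # ignore self-ties
  groups <- unique(blocks)
  out <- matrix(NA_real_, length(groups), length(groups),
                dimnames = list(groups, groups))
  for (a in groups) {
    for (b in groups) {
      out[a, b] <- mean(m[blocks == a, blocks == b], na.rm = TRUE)
    }
  }
  out
}

# Toy weighted network with two assumed blocks of actors.
m <- matrix(c(0, 2, 0, 0,
              2, 0, 1, 0,
              0, 1, 0, 3,
              0, 0, 3, 0), nrow = 4, byrow = TRUE,
            dimnames = list(c("A", "B", "C", "D"), c("A", "B", "C", "D")))
blocks <- c("gov", "gov", "ngo", "ngo")

block_density(m, blocks)                 # weighted densities
block_density(m, blocks, binary = TRUE)  # binary densities
```

The diagonal of the result holds the within-block densities and the off-diagonal cells hold the between-block densities.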


Motivation

R is becoming increasingly popular as a platform for social network analysis (see, for example, the statnet project). The focus of DNA is the extraction of network data from manually encoded textual discourses. Bringing together the data-generating software and the methods for processing these data therefore seems quite natural.

Before the development of rDNA, the easiest way to get DNA data into R was to export a network to a CSV file and then import this CSV file into R using the read.table() command. This can quickly become tedious because every time even minor parts of the analysis are changed, the data have to be re-exported from DNA and re-imported into R. With rDNA, in contrast, the analyst can pull the desired network data directly from a .dna file without going through the whole user interface and export options of DNA. If desired, rDNA can also start the coding window of DNA from within R in order to provide an interactive content analysis solution for R (similar to the RQDA project, but with an additional focus on network analysis).
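
For comparison, the manual CSV route described above might look roughly like the following sketch. The file layout (semicolon-separated, actor names in the first column and in the header row) is an assumption made for illustration, not a description of DNA's actual export format. The toy file is written to a temporary location so that the example is self-contained.

```r
# Write a toy adjacency matrix to a temporary CSV file
# (a stand-in for a network exported from DNA; the layout is assumed).
csv_file <- tempfile(fileext = ".csv")
writeLines(c(";A;B;C",
             "A;0;1;0",
             "B;1;0;2",
             "C;0;2;0"), csv_file)

# Re-import it: the first column holds the row names,
# the first line holds the column names.
exported <- as.matrix(read.table(csv_file, header = TRUE, sep = ";",
                                 row.names = 1))
exported
```

Every change to the export settings would require repeating both steps, which is the overhead that dna.network() avoids.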


Examples

The following R code is an introductory session which briefly demonstrates how rDNA works:

# download files and initialize DNA:
> library(rDNA)  #load the rDNA package
> download.file("http://philipleifeld.de/cms/upload/Downloads/dna-1.29.jar",
   destfile="dna-1.29.jar", mode="wb")  #download DNA
> download.file("http://philipleifeld.de/cms/upload/Downloads/sample.dna",
   destfile="sample.dna", mode="wb")  #download sample file
> dna.init("dna-1.29.jar")  #connect R to DNA

# plot a congruence network using the statnet package:
> congruence <- dna.network("sample.dna", exclude.categories=
   "There should be legislation to regulate emissions.")  #create a congruence network
> library(statnet)  #load the statnet package for network analysis
> congruence.nw <- network(congruence)  #create network object
> plot(congruence.nw, displaylabels=TRUE, label.cex=0.6, pad=0.8)  #visualize the network

# do a hierarchical cluster analysis with an affiliation network:
> affiliation.yes <- dna.network("sample.dna", algorithm="affiliation", agreement="yes",
   include.isolates=TRUE)  #export positive statements first;
   #include isolates because both matrices shall have the same dimensions
> affiliation.no <- dna.network("sample.dna", algorithm="affiliation", agreement="no",
   include.isolates=TRUE)  #then export negative statements
> affiliation <- cbind(affiliation.yes, affiliation.no)  #merge the two datasets
> affiliation <- affiliation[rowSums(affiliation) > 0,]  #remove isolates
> distances <- dist(affiliation, method="binary")  #create a dissimilarity matrix
> clustering <- hclust(distances)  #hierarchical clustering
> plot(clustering)  #show a dendrogram of the cluster structure

# open the GUI of DNA in order to manually work on the data
> dna.gui()
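
The clustering steps in the session above can also be tried without DNA or a .dna file. The following sketch uses two small hand-made actor-by-concept matrices in place of the dna.network() output. All data and object names in it are assumptions for illustration only.

```r
# Toy actor-by-concept matrices (assumed data, not taken from sample.dna).
actors <- c("A", "B", "C", "D")
yes <- matrix(c(1, 0,
                1, 0,
                0, 1,
                0, 0), nrow = 4, byrow = TRUE,
              dimnames = list(actors, c("c1.yes", "c2.yes")))
no  <- matrix(c(0, 1,
                0, 1,
                1, 0,
                0, 0), nrow = 4, byrow = TRUE,
              dimnames = list(actors, c("c1.no", "c2.no")))

toy <- cbind(yes, no)                      # merge agreement and disagreement
toy <- toy[rowSums(toy) > 0, ]             # drop actor D, who made no statements

# "binary" distance: among the columns where at least one of two actors
# has an entry, the share of columns where only one of them has an entry.
toy.dist <- dist(toy, method = "binary")
toy.clust <- hclust(toy.dist)              # hierarchical clustering
cutree(toy.clust, k = 2)                   # A and B share a cluster; C is separate
```

Actors A and B make identical statements and therefore have distance 0, while C takes the opposite position on both concepts and has distance 1 to each of them.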


Installation

The easy way: rDNA is on CRAN, which means it can be directly installed from R. To do this, simply enter the command install.packages("rDNA") in R, followed by library(rDNA) to load the package. This may not (yet) work on MacOS.

The hard way: Alternatively, you can download a copy of rDNA and install it manually: rDNA_1.29.tar.gz [version 1.29, October 16, 2011]. First, make sure that the rJava package is installed on your system. If not, enter install.packages("rJava") in your R console. Next, download the rDNA package to an empty folder on your hard drive. From that folder, call the following command in the terminal (the console of your operating system, not the R console): R CMD INSTALL rDNA_1.29.tar.gz. Now you are ready to use rDNA. Start R and type library(rDNA) to load the package.

Depending on your configuration, you may have to tell R where to install the package by typing the following command instead: R CMD INSTALL -l <folder> rDNA_1.29.tar.gz (replace <folder> with the path of your R package folder, for example something like C:\Program Files\R\R-2.12.0\library\ on Windows XP or something like /home/user/R/i686-pc-linux-gnu-library/2.11/ on Linux). On most operating systems, -l <folder> can be omitted. If your operating system does not find the R command, you may have to enter the full path, e.g., "C:\Program Files\R\R-2.12.0\bin\R.exe" instead of "R".
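
As an alternative to the terminal, a source package can usually also be installed from the R console with install.packages(), using repos = NULL and type = "source". The sketch below wraps this in two helper functions. The names missing_packages and install_rdna_source are assumptions made for illustration and are not part of rDNA, and the installation call is left commented out so that nothing is installed by accident.

```r
# Hypothetical helper: which of the given packages are not yet installed?
missing_packages <- function(pkgs) {
  setdiff(pkgs, rownames(installed.packages()))
}

# Hypothetical helper: install rJava if needed, then rDNA from a local tarball.
install_rdna_source <- function(tarball = "rDNA_1.29.tar.gz") {
  stopifnot(file.exists(tarball))
  need <- missing_packages("rJava")
  if (length(need) > 0) install.packages(need)
  install.packages(tarball, repos = NULL, type = "source")
}

# Run from the folder that contains the downloaded tarball:
# install_rdna_source()
```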

A binary version (that is, already compiled for your convenience) for Windows (x86 32 bit) is available here: rDNA_1.29.zip.

Make sure you have the latest version of Discourse Network Analyzer. rDNA 1.29 requires version 1.29 of DNA.

Warning: Some of the commands may not work properly on MacOS. However, the most important method, dna.network(), should work without any problem.


Update from an old version of rDNA

> help(package = "rDNA") #find out which version is currently installed
> remove.packages("rDNA") #remove old version
> install.packages("rDNA") #get the latest version from CRAN
> help(package = "rDNA") #let's see if this version was really the latest release
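
If only the version number is of interest, base R's packageVersion() returns it directly, and the result can be compared against a required version. The helper name has_version below is an assumption made for illustration and is not part of rDNA.

```r
# Hypothetical helper: is a package installed in at least the required version?
has_version <- function(pkg, required) {
  pkg %in% rownames(installed.packages()) &&
    packageVersion(pkg) >= required
}

has_version("rDNA", "1.29")  # TRUE only if rDNA 1.29 or newer is installed
```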


Getting help

For each command of the rDNA package, help is available from within R. Simply type one of the following commands:

> ?rDNA #get basic instructions for the package
> ?dna.init #get help for the dna.init() function
> ?dna.network #get help for the dna.network() function, etc.
> help(package="rDNA") #display summary information for the package
> citation("rDNA") #display information on how to cite the package

Additional help is available from the reference manual for rDNA.

You may also consider joining the DNA-help mailing list. Feel free to post questions or bug reports to the mailing list if your question is not covered elsewhere.
