All Presidential State of the Union Addresses from 1790 to 2006, organized by the similarity of terms in their text, with colors showing four communities, or classes, of Addresses. This is an example of applying social network analysis to text similarity. The most central Addresses were Truman’s in 1946 and Taft’s in 1912, as shown by node size, which is proportional to betweenness centrality. Node color shows which years clustered into communities of similar Addresses. Interested in this type of research? Join us this Friday morning! Register for the Gephi workshop here.


Step #1: Calculate document similarity
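
The post does not say which similarity measure was used for Step #1. As a minimal sketch, assuming cosine similarity over raw term counts (one common choice, not necessarily the one used here), the pairwise scores could be computed like this. The function names and the `docs` dictionary (label to full text) are hypothetical.

```python
import math
import re
from collections import Counter
from itertools import combinations


def term_counts(text):
    """Lowercase the text and count each word-like token."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def cosine_similarity(a, b):
    """Cosine similarity between two term-count vectors (0.0 to 1.0)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def pairwise_similarity(docs):
    """Map each unordered pair of labels to its similarity score.

    `docs` is an assumed input shape: {label: full text of the Address}.
    """
    vectors = {label: term_counts(text) for label, text in docs.items()}
    return {
        (x, y): cosine_similarity(vectors[x], vectors[y])
        for x, y in combinations(sorted(vectors), 2)
    }
```

Each pair of Addresses then gets a score between 0 and 1, which can serve as the edge weight in the network.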

Step #2: Create network visualization (above)

  • Gephi software
  • Filters: Edge Weight (document similarity): >=.3 (30%), Node Degree: >=2
  • Layout: Force Atlas 2, Noverlap
  • Node Sizes: Proportional to Betweenness Centrality
  • Node Color: 4 Modularity Class Communities
  • Labels: Year
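
The two filters listed above were applied inside Gephi, but the same thresholds can be sketched in code before import. This is a rough illustration only: it assumes the edge-weight filter runs first and the degree filter is a single pass over the remaining edges, and the function names are hypothetical. The CSV uses the Source, Target, Weight column layout that Gephi's spreadsheet importer reads as an edge list.

```python
import csv
from collections import Counter


def filter_edges(similarities, min_weight=0.3, min_degree=2):
    """Keep edges at or above min_weight, then drop edges that touch
    a node with fewer than min_degree remaining edges (single pass)."""
    kept = [(a, b, w) for (a, b), w in similarities.items() if w >= min_weight]
    degree = Counter()
    for a, b, _ in kept:
        degree[a] += 1
        degree[b] += 1
    return [
        (a, b, w)
        for a, b, w in kept
        if degree[a] >= min_degree and degree[b] >= min_degree
    ]


def write_edge_csv(edges, handle):
    """Write the edges as a Source,Target,Weight table."""
    writer = csv.writer(handle)
    writer.writerow(["Source", "Target", "Weight"])
    writer.writerows(edges)
```

Layout, betweenness centrality, and modularity classes would still be computed in Gephi itself, as in the settings listed above.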

Data Source:

State of the Union Addresses: Networking Similarities
