**The package smglr was renamed to graphlayouts**

I academically grew up among graph drawers, that is, computer scientists and mathematicians interested in deriving two-dimensional depictions of graphs. One may disparagingly call it pixel science, yet a lot of hard theoretical work goes into producing pretty graph layouts. Although I am by no means an expert in this field, I have learned a thing or two about the subject. As such, I have always been surprised that one of the (potentially) best algorithms is not implemented in R. This post is about my humble attempt to change this.

*If you read this and say: Hey! there is already a package for that! please do let me know.*

```
#used libraries
library(tidyverse) # for data wrangling
library(igraph) # for network data structures and tools
library(ggraph) # for prettier network visualizations
library(igraphdata) # some network data
library(patchwork) # combine ggplot objects
```

# Graph layouts in `igraph`

The R package `igraph` comes with a lot of built-in layout algorithms. Just type `layout_` in RStudio and you will get overwhelmed by the possibilities. As a minor side note:
If you ever struggle with anything in igraph, consult the excellent tutorial from Katherine Ognyanova.

I usually have mixed feelings about using R to draw my networks and mostly resort to dedicated
software such as visone, mainly because I feel that the
algorithms in igraph tend not to be nice, even with the `layout_nicely()` function.

Consider a typical benchmark graph for graph drawing, which can be downloaded here.

```
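# read the edge list (file name as downloaded) and build an undirected graph
el <- read_delim("power-1138-bus.mtx", delim = " ", col_names = FALSE)
g <- graph_from_data_frame(el, directed = FALSE)
# drop multi-edges and self-loops
g <- igraph::simplify(g)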
```

Let’s see what `igraph` thinks a nice layout looks like.

```
par(mar=c(0,0,0,0))
plot(g,layout=layout_nicely,vertex.size=0.5,vertex.label=NA)
```

I know, “beauty lies in the eyes of the beholder”, but I personally do not think that this is particularly nice. Below, you see a collection of layouts, produced by different algorithms.

```
par(mfrow=c(2,2),mar=c(0,0,0,0))
plot(g,layout=layout_with_drl,vertex.size=0.5,vertex.label=NA)
plot(g,layout=layout_with_lgl,vertex.size=0.5,vertex.label=NA)
plot(g,layout=layout_with_fr,vertex.size=0.5,vertex.label=NA)
plot(g,layout=layout_with_mds,vertex.size=0.5,vertex.label=NA)
```

Notice the big differences. Personally, I would prefer `layout_with_lgl` (top right).
Below is a bigger version drawn with `ggraph`.

```
ggraph(g,layout="lgl")+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph()
```

You will notice that this layout looks different from the one above.
This is because the algorithm underlying `layout_with_lgl` is non-deterministic, meaning
that it produces different pictures in consecutive runs. In fact, most of the other
layout algorithms have this (annoying?) feature. More than once I have found myself
laying out the network over and over again until I was satisfied.
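One common way to tame this randomness is to fix R's random seed before computing the layout. The following is a minimal sketch, assuming that igraph's randomized layouts draw from R's random number generator (so that `set.seed()` applies). The Zachary karate club graph, the name `g_demo`, and the seed value are arbitrary choices for illustration.

```r
library(igraph)

g_demo <- make_graph("Zachary")  # small example graph, chosen for illustration

set.seed(42)                     # arbitrary seed
xy1 <- layout_with_fr(g_demo)

set.seed(42)                     # same seed again
xy2 <- layout_with_fr(g_demo)

# with identical seeds, both runs should give the same coordinates
isTRUE(all.equal(xy1, xy2))
```

This makes a particular picture reproducible, but it does not make the algorithm itself deterministic: a different seed still gives a different layout.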

# Stress majorization

The first thing I learned from my graph drawing peers was to minimize stress. Not necessarily
in the sense of work (which doesn’t work anyway as a PhD student), but in
graph layout. *Stress majorization* is actually an optimization strategy used in multidimensional scaling, where the goal is to minimize the so-called stress function defined as
\[
\sigma(X)=\sum_{i<j} w_{ij}(\delta_{ij}-d_{ij})^2,
\]
where \(w_{ij} \geq 0\) is a weight for the pair of points \((i,j)\), \(d_{ij}\) is
the geodesic distance between \(i\) and \(j\), and \(\delta_{ij}\) is the Euclidean distance
between the coordinates \(X_i\) and \(X_j\). By minimizing stress, we thus seek Cartesian coordinates
for each node such that the Euclidean distances are as close as possible to the geodesic distances.
If you are interested in more technical details, please see the original contribution by
Gansner et al.
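To make the definition concrete, here is a small sketch that evaluates the stress of a given layout. The helper name `stress` is made up for this example, the weights are set to \(w_{ij} = d_{ij}^{-2}\) (a common choice, but an assumption here), and the graph is assumed to be connected so that all geodesic distances are finite.

```r
library(igraph)

# stress of a layout X (n x 2 matrix) for a connected graph g,
# using the weights w_ij = d_ij^(-2) (an assumption, see above)
stress <- function(g, X) {
  D <- distances(g)                 # geodesic distances d_ij
  delta <- as.matrix(dist(X))       # Euclidean distances delta_ij
  idx <- upper.tri(D)               # count each pair i < j once
  W <- 1 / D[idx]^2
  sum(W * (delta[idx] - D[idx])^2)
}

# a path on 5 nodes drawn on a straight line with unit spacing:
# all Euclidean distances equal the geodesic ones, so the stress is zero
p <- make_ring(5, circular = FALSE)
stress(p, cbind(1:5, 0))

# folding the path back on itself distorts the distances,
# so the stress becomes positive
stress(p, cbind(c(1, 2, 3, 2, 1), c(0, 0, 0, 1, 1)))
```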

# Implementation with `Rcpp` and the `smglr` package

I implemented stress majorization with `Rcpp`. While the code is not that involved, it is still a bit lengthy.
I created a very rudimentary R package containing the stress majorization graph layout algorithm, which
is available on GitHub.
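To give an idea of what such an implementation does, here is a pure-R sketch of the localized majorization update described by Gansner et al. It is not the `Rcpp` code from the package: the function name `stress_sketch`, the circular starting layout, the weights \(w_{ij} = d_{ij}^{-2}\), and the stopping rule are all assumptions made for illustration, and the graph is assumed to be connected with at least two nodes.

```r
library(igraph)

# illustrative sketch only, not the package implementation
stress_sketch <- function(g, iter = 200, tol = 1e-4) {
  D <- distances(g)                       # geodesic distances d_ij
  n <- nrow(D)
  W <- 1 / D^2                            # weights w_ij = d_ij^(-2)
  diag(W) <- 0                            # a node exerts no pull on itself
  ang <- 2 * pi * seq_len(n) / n
  X <- cbind(cos(ang), sin(ang))          # deterministic start on a circle
  for (k in seq_len(iter)) {
    X_old <- X
    for (i in seq_len(n)) {
      dx <- X[i, 1] - X[, 1]              # X_i - X_j, first coordinate
      dy <- X[i, 2] - X[, 2]              # X_i - X_j, second coordinate
      len <- sqrt(dx^2 + dy^2)
      inv <- ifelse(len > 1e-12, 1 / len, 0)  # guard against coincident points
      # every other node j "votes" for a position at distance d_ij from itself;
      # the new position of i is the weighted average of these votes
      X[i, 1] <- sum(W[i, ] * (X[, 1] + D[i, ] * dx * inv)) / sum(W[i, ])
      X[i, 2] <- sum(W[i, ] * (X[, 2] + D[i, ] * dy * inv)) / sum(W[i, ])
    }
    if (max(abs(X - X_old)) < tol) break  # stop once the layout settles
  }
  X
}

g_demo <- make_graph("Zachary")           # small example graph
xy <- stress_sketch(g_demo)
plot(g_demo, layout = xy, vertex.size = 4, vertex.label = NA)
```

Because the starting layout is fixed and no random numbers are involved, repeated calls return the same coordinates. The double loop over nodes is also a hint as to why a compiled implementation is attractive here.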

```
# devtools::install_github("schochastics/smglr")
library(smglr)
```

So what does our benchmark network look like using stress majorization?

```
l <- stress_majorization(g)
ggraph(g,layout="manual",node.positions=data.frame(x=l[,1],y=l[,2]))+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph()
```

In my opinion, this definitely looks better than any of the previous layouts.
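As noted at the top, the package has since been renamed to `graphlayouts`. The following is a minimal sketch of the equivalent call, assuming a current CRAN release of `graphlayouts`, where the layout function is called `layout_with_stress()`. The small example graph and the name `g_demo` are only for illustration.

```r
# install.packages("graphlayouts")
library(igraph)
library(graphlayouts)

g_demo <- make_graph("Zachary")   # small example graph for illustration
xy <- layout_with_stress(g_demo)  # n x 2 matrix of coordinates
plot(g_demo, layout = xy, vertex.size = 4, vertex.label = NA)
```

To my knowledge, recent versions of `ggraph` also accept `layout = "stress"` directly, which avoids passing the coordinates by hand.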

# More examples

Here are two more examples to convince you of stress-based layouts (always the plot on the right).

```
# preferential attachment
pa <- sample_pa(1000,1,1,directed = F)
ggraph(pa)+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph() -> p1
l <- stress_majorization(pa)
ggraph(pa,layout="manual",node.positions=data.frame(x=l[,1],y=l[,2]))+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph()-> p2
p1+p2
```

```
# yeast protein interactions from igraphdata (only biggest component)
data(yeast)
comps <- components(yeast)
bcomp <- which.max(comps$csize)
yeast <- induced_subgraph(yeast,comps$membership==bcomp)
ggraph(yeast)+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph() -> p1
l <- stress_majorization(yeast)
ggraph(yeast,layout="manual",node.positions=data.frame(x=l[,1],y=l[,2]))+
geom_edge_link(width=0.2,colour="grey")+
geom_node_point(col="black",size=0.3)+
theme_graph()-> p2
p1+p2
```

# Caveats

Stress majorization produces nice layouts, is deterministic, and is easy to implement.
The downside is that it is rather slow for large networks (I also partially blame my
implementation for that). But there is a way out of that problem. Former colleagues of
mine published a sparse stress model
which allows stress-based layouts for really large graphs. The Java code can be found on
GitHub. Also, keep an eye out for
an R package called `visone3`, which will, among other things, also allow for stress-based layouts.